The babel protocol normally sends all its messages as multicast packets,
but the protocol specification allows most messages to be sent as either
unicast or multicast, and the two can be mixed freely. In particular, the
babeld implementation can be configured to unicast updates to all peers
instead of sending them as unicast.
Daniel discovered that this can cause problems with the packet counter
checks in the MAC extension due to packet reordering. This happens on WiFi
networks where clients have power save enabled (which is quite common in
infrastructure networks): in this case, the access point will buffer all
multicast traffic and only send it out along with its beacons, leading to a
maximum buffering in default Linux-based access point configuration of up
to 200 ms.
This means that a Babel sender that mixes unicast and multicast messages
can have the unicast messages overtake the multicast messages because of
this buffering; when authentication is enabled, this causes the receiver to
discard the multicast message when it does arrive because it now has a
packet counter value less than the unicast message that arrived before it.
Daniel observed that this happens frequently enough that Babel ceases to
work entirely when runner over a WiFi network.
The issue has been described in draft-ietf-babel-mac-relaxed, which is
currently pending RFC publication. That also describes two mitigation
mechanisms: Keeping separate PC counters for unicast and multicast, and
using a reorder window for PC values. This patch implements the former as
that is the simplest, and resolves the particular issue seen on WiFi.
Thanks to Daniel Gröber for the bugreport.
Minor changes from committer.
The patch implements an IPv4 via IPv6 extension (RFC 9229) to the Babel
routing protocol (RFC 8966) that allows annoncing routes to an IPv4
prefix with an IPv6 next hop, which makes it possible for IPv4 traffic
to flow through interfaces that have not been assigned an IPv4 address.
The implementation is compatible with the current Babeld version.
Thanks to Toke Høiland-Jørgensen for early review on this work.
Minor changes from committer.
Instead of propagating interface updates as they are loaded from kernel,
they are enqueued and all the notifications are called from a
protocol-specific event. This change allows to break the locking loop
between protocols and interfaces.
Anyway, this change is based on v2 branch to keep the changes between v2
and v3 smaller.
The interface list must be flushed when device protocol is stopped. This
was done in a hardcoded specific hook inside generic protocol routines.
The cleanup hook was originally used for table reference counting late
cleanup, yet it can be also simply used for prettier interface list flush.
There ware missing dependencies for proto-build.c generation, which
sometimes lead to failed builds, and ignores changes in the set of
built protocols. Fix that, and also improve formatting of proto-build.c
When creating a new babel_source object we initialise the seqno to 0. The
caller will update the source object with the right metric and seqno value,
for both newly created and old source objects. However if we initialise the
source object seqno to 0 that may actually turn out to be a valid (higher)
seqno than the one in the routing table, because of seqno wrapping. In this
case the source metric will not be set properly, which breaks feasibility
tracking for subsequent updates.
To fix this, add a new initial_seqno argument to babel_get_source() which
is used when allocating a new object, and set that to the seqno value of
the update we're sending.
Thanks to Juliusz Chroboczek for the bugreport.
Juliusz noticed there were a couple of places we were doing straight
inequality comparisons of seqnos in Babel. This is wrong because seqnos can
wrap: so we need to use the modulo-64k comparison function for these cases
as well.
Introduce a strict-inequality version of the modulo-comparison for this
purpose.
Instead of calling custom hooks from object locks, we use standard event
sending mechanism to inform protocols about object lock changes. This is
a backport from version 3 where these events are passed across threads.
This implementation of object locks doesn't use mutexes to lock the
whole data structure. In version 3, this data structure may get accessed
from multiple threads and must be protected by mutex.
Instead of calling custom hooks from object locks, we use standard event
sending mechanism to inform protocols about object lock changes. As
event sending is lockless, the unlocking protocol simply enqueues the
appropriate event to the given loop when the locking is done.
This reverts commit 7144c9ca46.
The onlink attribute implementation collides with the nexthop attribute
behavior in v3; keeping it aside until finding out how to reimplement it
correctly.
For active sessions, ignore received packets with zero local id and
mismatched remote id. That forces a session timeout instead of an
immediate session restart. It makes BFD sessions more resilient to
packet spoofing.
Thanks to André Grüneberg for the suggestion.
Protocols receive if_notify() announcements that are filtered according
to their VRF setting, but during reconfiguration, they access iface_list
directly and forgot to check VRF setting here, which leads to all
interfaces be addedd.
Fix this issue for Babel, OSPF, RAdv and RIP protocols.
Thanks to Marcel Menzel for the bugreport.