If the protocol supports route refresh on export, we keep the stop-start
method of route refeed. This applies for BGP with ERR or with export
table on, for OSPF, Babel, RIP or Pipe.
For BGP without ERR or for future selective ROA reloads, we're adding an
auxiliary export request, doing the refeed while the main export request
is running, somehow resembling the original method of BIRD 2 refeed.
There is also a refeed request queue to keep track of different refeed
requests.
Memory allocation is a fragile part of BIRD and we need checking that
everybody is using the resource pools in an appropriate way. To assure
this, all the resource pools are associated with locking domains and
every resource manipulation is thoroughly checked whether the
appropriate locking domain is locked.
With transitive resource manipulation like resource dumping or mass free
operations, domains are locked and unlocked on the go, thus we require
pool domains to have higher order than their parent to allow for this
transitive operations.
Adding pool locking revealed some cases of insecure memory manipulation
and this commit fixes that as well.
When several BGPs requested a BFD session in short time, chances were
that the second BGP would file a request while the pickup routine was
still running and it would get enqueued into the waiting list instead of
being picked up.
Fixed this by enforcing pickup loop restart when new requests got added,
and also by atomically moving the unpicked requests to a temporary list
to announce admin down before actually being added into the wait list.
Now sk_open() requires an explicit IO loop to open the socket in. Also
specific functions for socket RX pause / resume are added to allow for
BGP corking.
And last but not least, socket reloop is now synchronous to resolve
weird cases of the target loop stopping before actually picking up the
relooped socket. Now the caller must ensure that both loops are locked
while relooping, and this way all sockets always have their respective
loop.
If there are lots of loops in a single thread and only some of the loops
are actually active, the other loops are now kept aside and not checked
until they actually get some timers, events or active sockets.
This should help with extreme loads like 100k tables and protocols.
Also ping and loop pickup mechanism was allowing subtle race
conditions. Now properly handling collisions between loop ping and pickup.
Repeated pipe refeed should not end route refresh as the prune routine
may start pruning otherwise valid routes.
The same applies for BGP repeated route refresh.
Instead of propagating interface updates as they are loaded from kernel,
they are enqueued and all the notifications are called from a
protocol-specific event. This change allows to break the locking loop
between protocols and interfaces.
Anyway, this change is based on v2 branch to keep the changes between v2
and v3 smaller.
Instead of calling custom hooks from object locks, we use standard event
sending mechanism to inform protocols about object lock changes. This is
a backport from version 3 where these events are passed across threads.
This implementation of object locks doesn't use mutexes to lock the
whole data structure. In version 3, this data structure may get accessed
from multiple threads and must be protected by mutex.
Instead of calling custom hooks from object locks, we use standard event
sending mechanism to inform protocols about object lock changes. As
event sending is lockless, the unlocking protocol simply enqueues the
appropriate event to the given loop when the locking is done.
The effective keepalive time now scales relative to the negotiated
hold time, to maintain proportion between the keepalive time and the
hold time. This avoids issues when both keepalive and hold times
were configured, the hold time was negotiated to a smaller value,
but the keepalive time stayed the same.
Add new options 'min hold time' and 'min keepalive time', which reject
session attempts with too small hold time.
Improve validation of config options an their documentation.
Thanks to Alexander Zubkov and Sergei Goriunov for suggestions.
Add BGP channel option 'next hop prefer global' that modifies BGP
recursive next hop resolution to use global next hop IPv6 address instead
of link-local next hop IPv6 address for immediate next hop of received
routes.
In principle, the channel list is a list of parent struct proto and can
contain general structures of type struct channel, That is useful e.g.
for adding MPLS channels to BGP.
Had to fix route source locking inside BGP export table as we need to
keep the route sources properly allocated until even last BGP pending
update is sent out, therefore the export table printout is accurate.
For BGP LLGR purposes, there was an API allowing a protocol to directly
modify their stale routes in table before flushing them. This API was
called by the table prune routine which violates the future locking
requirements.
Instead of this, BGP now requests a special route export and reimports
these routes into the table, allowing for asynchronous execution without
locking the table on export.