Fix issue with missing AF cap (e.g. IPv4 unicast when no capabilities
are announced).
Add Linpool save/restore action similar to bgp_create_update().
Based on patch from Michal Zagorski <mzagorsk@akamai.com> co-authored
with Pawel Maslanka <pmaslank@akamai.com>. Thanks!
After converting BFD to the new IO loop system, reconfiguration never
really worked. Sadly, we missed this case in our testing suite so it
passed under the radar for quite a while.
Thanks to Andrei Dinu <andrei.dinu@digitalit.ro> for reporting and
isolating this issue.
When an OPEN message without capability options was parsed, the remote
role field was not initialized with the proper (non-zero) default value,
so it was interpreted as if 'provider' was announced.
Thanks to Mikhail Grishin for the bugreport.
The BMP protocol needs OPEN messages of established BGP sessions to
construct appropriate Peer Up messages. Instead of saving them internally
we use OPEN messages stored in BGP instances. This allows BMP instances
to be restarted or enabled later.
Because of this change, we can simplify BMP data structures. No need to
keep track of BGP sessions when we are not started. We have to iterate
over all (established) BGP sessions when the BMP session is established.
This is just a scaffolding now, but some kind of iteration would be
necessary anyway.
Also, the commit cleans up handling of msg/msg_length arguments to be
body/body_length consistently in both rx/tx and peer_up/peer_down calls.
For whatever reason, parser allocated a symbol for every parsed keyword
in each scope. That wasted time and memory. The effect is worsened with
recent changes allowing local scopes, so keywords often promote soft
scopes (with no symbols) to real scopes.
Do not allocate a symbol for a keyword. Take care of keywords that could
be promoted to symbols (kw_sym) and do it explicitly.
Memory allocation is a fragile part of BIRD and we need checking that
everybody is using the resource pools in an appropriate way. To assure
this, all the resource pools are associated with locking domains and
every resource manipulation is thoroughly checked whether the
appropriate locking domain is locked.
With transitive resource manipulation like resource dumping or mass free
operations, domains are locked and unlocked on the go, thus we require
pool domains to have higher order than their parent to allow for this
transitive operations.
Adding pool locking revealed some cases of insecure memory manipulation
and this commit fixes that as well.
Hooks called from BGP to BMP should not log warning when BMP is not
connected, that is not an error (and we do not want to flood logs with
a ton of messages).
Blocked sk_send() should not log warning, that is expected situation.
Error during sk_send() is handled in error hook anyway.
Replace broken TCP connection management with a simple state machine.
Handle failed attempts properly with a timeout, detect and handle TCP
connection close and try to reconnect after that. Remove useless
'station_connected' flag.
Keep open messages saved even after the BMP session establishment,
so they can be used after BMP session flaps.
Use proper log messages for session events.
Use local variable to refence relevant instance instead of using global
instance ptr. Also, use 'p' variable instead of 'bmp' so we can use
common macros like TRACE().
Most error handling code was was for cases that cannot happen,
or they would be code bugs (and should use ASSERT()). Keep error
handling for just for I/O errors, like in rest of BIRD.
Initial implementation of a basic subset of the BMP (BGP Monitoring
Protocol, RFC 7854) from Akamai team. Submitted for further review
and improvement.
When several BGPs requested a BFD session in short time, chances were
that the second BGP would file a request while the pickup routine was
still running and it would get enqueued into the waiting list instead of
being picked up.
Fixed this by enforcing pickup loop restart when new requests got added,
and also by atomically moving the unpicked requests to a temporary list
to announce admin down before actually being added into the wait list.
Now sk_open() requires an explicit IO loop to open the socket in. Also
specific functions for socket RX pause / resume are added to allow for
BGP corking.
And last but not least, socket reloop is now synchronous to resolve
weird cases of the target loop stopping before actually picking up the
relooped socket. Now the caller must ensure that both loops are locked
while relooping, and this way all sockets always have their respective
loop.
If there are lots of loops in a single thread and only some of the loops
are actually active, the other loops are now kept aside and not checked
until they actually get some timers, events or active sockets.
This should help with extreme loads like 100k tables and protocols.
Also ping and loop pickup mechanism was allowing subtle race
conditions. Now properly handling collisions between loop ping and pickup.
Repeated pipe refeed should not end route refresh as the prune routine
may start pruning otherwise valid routes.
The same applies for BGP repeated route refresh.
The import table feed wasn't resetting the table-specific route values
like REF_FILTERED and thus made the route look like filtered even though
it should have been re-evaluated as accepted.