The original implementation for BIRD 3 was rooted in my first
attempts at multithreading and it had several flaws, mostly
incomprehensible notification and request pickup routines.
Also convert it to a double-loop architecture where one of the
loops (low-latency) solely runs BFD socket communication, whereas
the other one does all the other shenanigans.
Channels that are down keep pointers to routing tables but do not keep
them locked. This was considered safe because the existence of tables
depends on them being configured. But when a table was removed during
reconfiguration, a channel that had gone down kept a dangling pointer
until the channel itself was removed.
This could be triggered by 'show protocols all' and similar commands.
Change locking so that a channel keeps its table locked for its whole
existence. The same change is already in BIRD 3.
Note that this is somewhat conceptually problematic as downed channels
are not supposed to keep resources. Also, other objects in specialized
channels (igp_table, base_table in bgp_channel, mpls_domain / mpls_range
in mpls_channel) are still locked only when the channel is active, but
for them it is easier to keep track that they are not accessed when
they are deconfigured.
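The locking change can be illustrated by a minimal sketch. All names here (struct table, struct channel, table_lock and so on) are hypothetical stand-ins, not BIRD's actual structures or API; the sketch only assumes a plain use counter on the table. The lock is taken when the channel is created and released when the channel is removed, independently of whether the channel is up.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not BIRD's actual structures. */
struct table {
  int use_count;   /* number of holders keeping the table alive */
  int deleted;     /* set when the table is removed from configuration */
  int freed;       /* set where real code would release the table */
};

struct channel {
  struct table *table;
  int up;          /* channel state; does not affect the lock */
};

static void table_lock(struct table *t)
{
  t->use_count++;
}

static void table_unlock(struct table *t)
{
  if (--t->use_count == 0 && t->deleted)
    t->freed = 1;
}

/* Removing the table from configuration frees it only if nobody holds it. */
static void table_deconfigure(struct table *t)
{
  t->deleted = 1;
  if (t->use_count == 0)
    t->freed = 1;
}

/* The lock is taken at channel creation, even though the channel is down. */
static void channel_init(struct channel *c, struct table *t)
{
  c->table = t;
  c->up = 0;
  table_lock(t);
}

/* The lock is released only when the channel itself is removed. */
static void channel_remove(struct channel *c)
{
  table_unlock(c->table);
  c->table = NULL;
}
```

With this ordering, a table removed during reconfiguration stays allocated until the last channel pointing at it is removed, so a downed channel never holds a dangling pointer.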
When exchanging routes in the BGP export table, we forgot to update
the src when ADD-PATH is off. This led to falsely claiming another
origin of that route in the export table dump and also to holding protocols
in the flush state because their srcs were kept in the export tables.
In some edge cases, the dynamic BGP starts but has not yet picked up
the socket from the peer when it gets shut down, typically on
a complete shutdown. Fix this by just closing the socket instead of
asserting that it has already been picked up.
The Babel seqno wraps around when reaching its maximum value (UINT16_MAX).
When comparing seqnos, this has to be taken into account. Therefore,
plain number comparisons do not work.
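A minimal sketch of a wraparound-aware comparison follows. The function names are hypothetical and this is not necessarily the code used in BIRD. It assumes the usual modulo-2^16 ordering, where a is older than b when the forward distance from a to b is nonzero and less than 2^15.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero when seqno a is strictly older
 * than seqno b in arithmetic modulo 2^16. The cast to uint16_t performs
 * the modulo reduction after integer promotion.
 */
static inline int seqno_lt(uint16_t a, uint16_t b)
{
  uint16_t dist = (uint16_t)(b - a);
  return dist != 0 && dist < 0x8000;
}

/* Hypothetical helper: a is the same as or newer than b. */
static inline int seqno_ge(uint16_t a, uint16_t b)
{
  return (uint16_t)(a - b) < 0x8000;
}
```

For example, seqno_lt(65535, 0) holds because 0 directly follows 65535 after the wrap, whereas the plain comparison 65535 < 0 is false.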
We missed that the protocol spawner violates the prescribed
locking order. When the rtable level is locked, no new protocol can be
started, thus we need to:
* create the protocol from a clean mainloop context
* in the protocol start hook, take the socket
Testsuite: cf-bgp-autopeer
Fixes: #136
Thanks to Job Snijders <job@fastly.com> for reporting:
https://trubka.network.cz/pipermail/bird-users/2024-December/017980.html
The resource dumping routines needed to be updated in v3 to use the new
API introduced in v2.
Conflicts:
filter/f-util.c
filter/filter.c
lib/birdlib.h
lib/event.c
lib/mempool.c
lib/resource.c
lib/resource.h
lib/slab.c
lib/timer.c
nest/config.Y
nest/iface.c
nest/iface.h
nest/locks.c
nest/neighbor.c
nest/proto.c
nest/route.h
nest/rt-attr.c
nest/rt-table.c
proto/bfd/bfd.c
proto/bmp/bmp.c
sysdep/unix/io.c
sysdep/unix/krt.c
sysdep/unix/main.c
sysdep/unix/unix.h
Implement several options (min/max graceful restart time, min/max long
lived stale time) to override graceful restart and long-lived graceful
restart timer values, as suggested by RFC 9494.
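One way such min/max options could be applied is sketched below. This is an assumption about the semantics, not the actual BIRD code: the struct and function names are hypothetical, and the sketch assumes that a timer value (for example one advertised by a neighbor) is simply clamped into the configured range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical configured bounds for one timer, in seconds. */
struct timer_limits {
  uint32_t min_time;
  uint32_t max_time;
};

/* Clamp a timer value into the range [min_time, max_time]. */
static uint32_t timer_clamp(uint32_t value, const struct timer_limits *lim)
{
  assert(lim->min_time <= lim->max_time);

  if (value < lim->min_time)
    return lim->min_time;
  if (value > lim->max_time)
    return lim->max_time;
  return value;
}
```

The same helper would serve both the graceful restart time and the long-lived stale time, each with its own pair of limits.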