This basically means that:
* there are some more levels of indirection and asynchronicity, mostly
in cleanup procedures, requiring correct lock ordering
* all the internal table operations (prune, next hop update) are done
without blocking the other parts of BIRD
* the protocols may get their own loops very soon
This commit prevents use-after-free of routes belonging to protocols
which have been already destroyed, delaying also all the protocols'
shutdown until all of their routes have been finally propagated through
all the pipes down to the appropriate exports.
The use-after-free was somehow hypothetic yet theoretically possible in
rare conditions, when one BGP protocol authors a lot of routes and the
user deletes that protocol by reconfiguring in the same time as next hop
update is requested, causing rte_better() to be called on a
not-yet-pruned network prefix while the owner protocol has been already
freed.
In parallel execution environments, this would happen an inter-thread
use-after-free, causing possible heisenbugs or other nasty problems.
There is a simple universal IO loop, taking care of events, timers and
sockets. Primarily, one instance of a protocol should use exactly one IO
loop to do all its work, as is now done in BFD.
Contrary to previous versions, the loop is now launched and cleaned by
the nest/proto.c code, allowing for a protocol to just request its own
loop by setting the loop's lock order in config higher than the_bird.
It is not supported nor checked if any protocol changed the requested
lock order in reconfigure. No protocol should do it at all.
* internal tables are now more standalone, having their own import and
export hooks
* route refresh/reload uses stale counter instead of stale flag,
allowing to drop walking the table at the beginning
* route modify (by BGP LLGR) is now done by a special refeed hook,
reimporting the modified routes directly without filters
Channels have now included rt_import_req and rt_export_req to hook into
the table instead of just one list node. This will (in future) allow for:
* channel import and export bound to different tables
* more efficient pipe code (dropping most of the channel code)
* conversion of 'show route' to a special kind of export
* temporary static routes from CLI
The import / export states are also updated to the new algorithms.
In general, events are code handling some some condition, which is
scheduled when such condition happened and executed independently from
I/O loop. Work-events are a subgroup of events that are scheduled
repeatedly until some (often significant) work is done (e.g. feeding
routes to protocol). All scheduled events are executed during each
I/O loop iteration.
Separate work-events from regular events to a separate queue and
rate limit their execution to a fixed number per I/O loop iteration.
That should prevent excess latency when many work-events are
scheduled at one time (e.g. simultaneous reload of many BGP sessions).
If there are roa_check() calls in channel filters, then the channel
subscribes to ROA table notifications, which are sent when ROA tables
are updated (subject to settle time) and trigger channel reload or
refeed.
The patch add support for per-channel debug flags, currently just
'states', 'routes', and 'filters'. Flag 'states' is used for channel
state changes, remaining two for routes passed through the channel.
The per-protocol debug flags 'routes'/'filters' still enable reporting
of routes for all channels, to keep existing behavior.
The patch causes minor changes in some log messages.
When config structures are copied due to template application,
we need to reset list node structure before calling add_tail().
Thanks to Mikael Magnusson for patches.
Most commands like 'show ospf neighbors' fail when protocol is not
specified and there are multiple instances of given protocol type.
This is annoying in BIRD 2, as many protocols have IPv4 and IPv6
instances. The patch changes that by showing output from all protocol
instances of appropriate type.
Note that the patch also removes terminating cli_msg() call from these
commands and moves it to the common iterating code.
Channel currently does not have independent pool and uses protocol pool,
which is freed when protocol changes state to down, while channel is
still in flushing. Move some some cleanup code to channel_do_flush()
so it is done before freeing of protocol pool.
Use a hierarchical bitmap in a routing table to assign ids to routes, and
then use bitmaps (indexed by route id) in channels to keep track whether
routes were exported. This avoids unreliable and inefficient re-evaluation
of filters for old routes in order to determine whether they were exported.
The patch implements optional internal export table to a channel and
hooks it to BGP so it can be used as Adj-RIB-Out. When enabled, all
exported (post-filtered) routes are stored there. An export table can be
examined using e.g. 'show route export table bgp1.ipv4'.
Several BGP channel options (including 'next hop self') could be
reconfigured without session reset, with just route refeed/refresh.
The patch improves reconfiguration code to do it that way.
Protocol can have specified VRF, in such case it is restricted to a set
of ifaces associated with the VRF, otherwise it can use all interfaces.
The patch allows to specify VRF as 'default', in which case it is
restricted to a set of iface not associated with any VRF.
When 'graceful down' command is entered, protocols are shut down
with regard to graceful restart. Namely Kernel protocol does
not remove routes and BGP protocol does not send notification,
just closes the connection.
Support for dynamically spawning BGP protocols for incoming connections.
Use 'neighbor range' to specify range of valid neighbor addresses, then
incoming connections from these addresses spawn new BGP instances.
This protocol is highly experimental and nobody should use it in
production. Anyway it may help you getting some insight into what eats
so much time in filter processing.
The patch d506263d... blocked adding channel during reconfiguration,
that broke protocols which use the same functiona also during init.
This patch fixes that.