One of previous workarounds for phantom route avoidance breaks export
counters by expanding sending of spurious withdraws, which are send when
we are not sure whether we have advertised that routes in the past.
If not, then export counter is decreased, but it was not increased
before, so it overflows under zero.
The patch fixes that by sendung spurious withdraws, but not counting them
on export counter. That may lead to error in the other direction, but that
happens only as a race condition (i.e., in normal operation filters
return proper values about old route export state).
The earlier fix loosen conditions for not running filters on old
route when deciding about route propagation to a protocol to avoid
issues with ghost routes in some race conditions.
Unfortunately, the fix also caused back-propagation of withdraws. For
regular updates, back-propagation is prevented in import_control hooks,
but these are not called on withdraws. For them, import_control hooks
are called on old routes instead, changing (old, NULL) notification
to (NULL, NULL), which is ignored. By not calling export processing
in some cases, the withdraw is not ignored and is back-propagated.
This patch fixes that by contract conditions so the earlier fix is not
applied to back-propagated updates.
The patch implements optional internal import table to a channel and
hooks it to BGP so it can be used as Adj-RIB-In. When enabled, all
received (pre-filtered) routes are stored there and import filters can
be re-evaluated without explicit route refresh. An import table can be
examined using e.g. 'show route import table bgp1.ipv4'.
Once upon a time, far far away, there were the old Bird developers
discussing what direction of route flow shall be called import and
export. They decided to say "import to protocol" and "export to table"
when speaking about a protocol. When speaking about a table, they
spoke about "importing to table" and "exporting to protocol".
The latter terminology was adopted in configuration, then also the
bird CLI in commit ea2ae6dd0 started to use it (in year 2009). Now
it's 2018 and the terminology is the latter. Import is from protocol to
table, export is from table to protocol. Anyway, there was still an
import_control hook which executed right before route export.
One thing is funny. There are two commits in April 1999 with just two
minutes between them. The older announces the final settlement
on config terminology, the newer uses the other definition. Let's see
their commit messages as the git-log tool shows them (the newer first):
commit 9e0e485e50
Author: Martin Mares <mj@ucw.cz>
Date: Mon Apr 5 20:17:59 1999 +0000
Added some new protocol hooks (look at the comments for better explanation):
make_tmp_attrs Convert inline attributes to ea_list
store_tmp_attrs Convert ea_list to inline attributes
import_control Pre-import decisions
commit 5056c559c4
Author: Martin Mares <mj@ucw.cz>
Date: Mon Apr 5 20:15:31 1999 +0000
Changed syntax of attaching filters to protocols to hopefully the final
version:
EXPORT <filter-spec> for outbound routes (i.e., those announced
by BIRD to the rest of the world).
IMPORT <filter-spec> for inbound routes (i.e., those imported
by BIRD from the rest of the world).
where <filter-spec> is one of:
ALL pass all routes
NONE drop all routes
FILTER <name> use named filter
FILTER { <filter> } use explicitly defined filter
For all protocols, the default is IMPORT ALL, EXPORT NONE. This includes
the kernel protocol, so that you need to add EXPORT ALL to get the previous
configuration of kernel syncer (as usually, see doc/bird.conf.example for
a bird.conf example :)).
Let's say RIP to this almost 19-years-old inconsistency. For now, if you
import a route, it is always from protocol to table. If you export a
route, it is always from table to protocol.
And they lived happily ever after.
The new MRT protocol is responsible for periodic RIB table dumps in the
MRT format (RFC 6396). Also the existing code for BGP4MP MRT dumps is
refactored and splitted between BGP to MRT protocols, will be more
integrated into MRT in the future.
Example:
protocol mrt {
table "*";
filename "%N_%F_%T.mrt";
period 60;
}
It is partially based on the old MRT code from Pavel Tvrdik.
If export filter is changed during reconfiguration and a route disappears
between reconfiguration and refeed (e.g., if the route is a static route
also removed during the reconfiguration), the route is not withdrawn.
The issue was fixed for regular channels by an earlier patch. This patch
fixes the issue for channels in RA_ACCEPTED mode (first-pass-the-filter),
used by BGP with 'secondary' option.
If export filter is changed during reconfiguration and a route disappears
between reconfiguration and refeed (e.g., if the route is a static route
also removed during the reconfiguration), the route is not withdrawn.
The patch fixes that by adding tx reconfiguration timestamp.
This is a fundamental change of an original (1999) concept of route
processing inside BIRD. During import/export, there was a temporary
ea_list created which was to be used instead of the another one inside
the route itself.
This led to some confusion, quirks, and strange filter code that handled
extended route attributes. Dropping it now.
The protocol interface has changed in an uniform way -- the
`struct ea_list *attrs` argument has been removed from store_tmp_attrs(),
import_control(), rt_notify() and get_route_info().
This patch adds support for source-specific IPv6 routes to BIRD core.
This is based on Dean Luga's original patch, with the review comments
addressed. SADR support is added to network address parsing in confbase.Y
and to the kernel protocol on Linux.
Currently there is no way to mix source-specific and non-source-specific
routes (i.e., SADR tables cannot be connected to non-SADR tables).
Thanks to Toke Hoiland-Jorgensen for the original patch.
Minor changes by Ondrej Santiago Zajicek.
A filter should log messages only if executed explicitly (e.g., during
route export or route import). When a filter is executed for technical
reasons (e.g., to establish whether a route was exported before), it
should run silently.
Add proper support for per-nexthop onlink flag in routes to handle next
hop addresses that are not covered by interface IP ranges. Supported by
kernel and static protocols.
Thanks to Vincent Bernat for the idea.
Some code cleanup, multiple bugfixes, allows to specify also channel
for 'show route export'. Interesting how such apparenty simple thing
like show route cmd has plenty of ugly corner cases.
Basic support for SAFI 4 and 128 (MPLS labeled IP and VPN) for IPv4 and
IPv6. Should work for route reflector, but does not properly handle
originating routes with next hop self.
Based on patches from Jan Matejka.
Anyway, Bird is now capable to insert both MPLS routes and MPLS encap
routes into kernel.
It was (among others) needed to define platform-specific AF_MPLS to 28
as this constant has been assigned in the linux kernel.
No support for BSD now, it may be added in the future.
Dropped struct mpnh and mpnh_*()
Now struct nexthop exists, nexthop_*(), and also included struct nexthop
into struct rta.
Also converted RTD_DEVICE and RTD_ROUTER to RTD_UNICAST. If it is needed
to distinguish between these two cases, RTD_DEVICE is equivalent to
IPA_ZERO(a->nh.gw), RTD_ROUTER is then IPA_NONZERO(a->nh.gw).
From now on, we also explicitely want C99 compatible compiler. We assume
that this 20-year norm should be known almost everywhere.
Kernel protocol calls rt_export_merged(), which used @rte_update_pool for
temporary allocations, supposing it is called from other functions from
rt-table.c that handles locking and flushing of the linpool. Therefore,
linpool was not flushed properly and memory leaked.
Add linpool argument to rt_export_merged() and use @krt_filter_lp when
called from kernel protocol.
Thanks to Justin Cattle and Alexander Frolkin for the bugreport.
(Commit squashed and updated by Ondrej Zajicek)
Many protocols do almost the same when creating a rte_update request
before calling rte_update2(). This commit should simplify the protocol
side of the route-creation routine.
The patch adds support for channels, structures connecting protocols and
tables and handling most interactions between them. The documentation is
missing yet.
Explicit setting of AF_INET(6|) in IP socket creation. BFD set to listen
on v6, without setting the V6ONLY flag to catch both v4 and v6 traffic.
Squashing and minor changes by Ondrej Santiago Zajicek
Symbol lookup by cf_find_symbol() not only did the lookup but also added
new void symbols allocated from cfg_mem linpool, which gets broken when
lookups are done outside of config parsing, which may lead to crashes
during reconfiguration.
The patch separates lookup-only cf_find_symbol() and config-modifying
cf_get_symbol(), while the later is called only during parsing. Also
new_config and cfg_mem global variables are NULLed outside of parsing.
New data types net_addr and variants (in lib/net.h) describing
network addresses (prefix/pxlen). Modifications of FIB structures
to handle these data types and changing everything to use these
data types instead of prefix/pxlen pairs where possible.
The commit is WiP, some protocols are not yet updated (BGP, Kernel),
and the code contains some temporary scaffolding.
Comments are welcome.
The new RIP implementation fixes plenty of old bugs and also adds support
for many new features: ECMP support, link state support, BFD support,
configurable split horizon and more. Most options are now per-interface.
In some circumstances during reconfiguration, routes propagated by pipes
to other tables may hang there even after the primary routes are removed.
There is already a workaround for this issue in the code which removes
these stale routes by flush process when source protocols are shut down.
This patch is a cleaner fix and allows to simplify the flush process
When route was propagated to another rtable through a pipe and then the
pipe was reconfigured softly in such a way that any subsequent route
updates are filtered, then the source protocol shutdown didn't clean up
the route in the second rtable which caused stale routes and potential
crashes.
Temporary dummy routes created by a kernel protocol during routing table
scan get mixed with real routes propagated from another kernel protocol
through a pipe.
The RAdv protocol could be configured to change its behavior based on
availability of routes, e.g., do not announce router lifetime when a
default route is not available.
When 'import keep rejected' protocol option is activated, routes
rejected by the import filter are kept in the routing table, but they
are hidden and not propagated to other protocols. It is possible to
examine them using 'show route rejected'.
Allows to send and receive multiple routes for one network by one BGP
session. Also contains necessary core changes to support this (routing
tables accepting several routes for one network from one protocol).
It needs some more cleanup before merging to the master branch.
When a protocol went down, all its routes were flushed in one step, that
may block BIRD for too much time. The patch fixes that by limiting
maximum number of routes flushed in one step.
The nest-protocol interaction is changed to better handle multitable
protocols. Multitable protocols now declare that by 'multitable' field,
which tells nest that a protocol handles things related to proto-rtable
interaction (table locking, announce hook adding, reconfiguration of
filters) itself.
Filters and stats are moved to announce hooks, a protocol could have
different filters and stats to different tables.
The patch is based on one from Alexander V. Chernikov, thanks.
Hostcache is a structure for monitoring changes in a routing table that
is used for routes with dynamic/recursive next hops. This is needed for
proper iBGP next hop handling.
In usual configuration, such export is already restricted
with the aid of the direct protocol but there are some
races that can circumvent it. This makes it harder to
break kernel device routes. Also adds an option to
disable this restriction.
- BSD kernel syncer is now self-conscious and can learn alien routes
- important bugfix in BSD kernel syncer (crash after protocol restart)
- many minor changes and bugfixes in kernel syncers and neighbor cache
- direct protocol does not generate host and link local routes
- min_scope check is removed, all routes have SCOPE_UNIVERSE by default
- also fixes some remaining compiler warnings
It seems that by adding one pipe-specific exception to route
announcement code and by adding one argument to rt_notify() callback i
could completely eliminate the need for the phantom protocol instance
and therefore make the code more straightforward. It will also fix some
minor bugs (like ignoring debug flag changes from the command line).
When uncofiguring the pipe and the peer table, the peer table was
unlocked when pipe protocol state changed to down/flushing and not to
down/hungry. This leads to the removal of the peer table before
the routes from the pipe were flushed.
The fix leads to adding some pipe-specific hacks to the nest,
but this seems inevitable.
If protocol announces a route, route is accepted by import filter to
routing table, and later it announces replacement of that route that is
rejected by import filter, old route remains in routing table.