Many protocols do almost the same when creating a rte_update request
before calling rte_update2(). This commit should simplify the protocol
side of the route-creation routine.
In BIRD, RX has lower priority than TX with the exception of RX from
control socket. The patch replaces heuristic based on socket type with
explicit mark and uses it for both control socket and BGP session waiting
to be established.
This should avoid an issue when during heavy load, outgoing connection
could connect (TX event), send open, but then failed to receive OPEN /
establish in time, not sending notifications between and therefore
got hold timer expired error from the neighbor immediately after it
finally established the connection.
When a kernel route changed, function krt_learn_scan() noticed that and
replaced the route in internal kernel FIB, but after that, function
krt_learn_prune() failed to propagate the new route to the nest, because
it confused the new route with the (removed) old best route and decided
that the best route did not changed.
Wow, the original code (and the bug) is almost 17 years old.
The patch adds support for channels, structures connecting protocols and
tables and handling most interactions between them. The documentation is
missing yet.
Explicit setting of AF_INET(6|) in IP socket creation. BFD set to listen
on v6, without setting the V6ONLY flag to catch both v4 and v6 traffic.
Squashing and minor changes by Ondrej Santiago Zajicek
Since 2.6.19, the netlink API defines RTA_TABLE routing attribute to
allow 32-bit routing table IDs. Using this attribute to index routing
tables at Linux, instead of 8-bit rtm_table field.
Symbol lookup by cf_find_symbol() not only did the lookup but also added
new void symbols allocated from cfg_mem linpool, which gets broken when
lookups are done outside of config parsing, which may lead to crashes
during reconfiguration.
The patch separates lookup-only cf_find_symbol() and config-modifying
cf_get_symbol(), while the later is called only during parsing. Also
new_config and cfg_mem global variables are NULLed outside of parsing.
New data types net_addr and variants (in lib/net.h) describing
network addresses (prefix/pxlen). Modifications of FIB structures
to handle these data types and changing everything to use these
data types instead of prefix/pxlen pairs where possible.
The commit is WiP, some protocols are not yet updated (BGP, Kernel),
and the code contains some temporary scaffolding.
Comments are welcome.
If the number of sockets is too much for select(), we should at least
handle it with proper error messages and reject new sockets instead of
breaking the event loop.
Thanks to Alexander V. Chernikov for the patch.
When a new route was imported from kernel and chosen as preferred, then
the old best route was propagated as a withdraw to the kernel protocol.
Under some circumstances such withdraw propagated to the BSD kernel could
remove the new alien route and thus reverting the import.
When an interface goes down, (Linux) kernel removes routes pointing to
that ifacem but does not send withdraws for them. We rescan the
kernel table to ensure synchronization.
Thanks to Alexander Demenshin for the bugreport.
Although RFC 3542 allows both cases, Theo de Raadt thinks
he knows better, and msg_controllen without last padding
fails on OpenBSD.
Thanks to Job Snijders for the bugreport.
I/O:
- BSD: specify src addr on IP sockets by IP_HDRINCL
- BSD: specify src addr on UDP sockets by IP_SENDSRCADDR
- Linux: specify src addr on IP/UDP sockets by IP_PKTINFO
- IPv6: specify src addr on IP/UDP sockets by IPV6_PKTINFO
- Alternative SKF_BIND flag for binding to IP address
- Allows IP/UDP sockets without tx_hook, on these
sockets a packet is discarded when TX queue is full
- Use consistently SOL_ for socket layer values.
OSPF:
- Packet src addr is always explicitly set
- Support for secondary addresses in BSD
- Dynamic RX/TX buffers
- Fixes some minor buffer overruns
- Interface option 'tx length'
- Names for vlink pseudoifaces (vlinkX)
- Vlinks use separate socket for TX
- Vlinks do not use fixed associated iface
- Fixes TTL for direct unicast packets
- Fixes DONTROUTE for OSPF sockets
- Use ifa->ifname instead of ifa->iface->name
RIP sockets for multiple ifaces collided, because we cannot bind to
a specific iface on BSD. Workarounded by SO_REUSEPORT.
Thanks to Eugene M. Zheganin for the bugreport.
The bug caused that krt_prefsrc attribute was not processed when a route
received from a kernel protocol was exported to another kernel protocol.
Thanks to Sergey Popovich for a bugreport.
Implemented eval command can be used to evaluate expressions.
The patch also documents echo command and allows to use log classes
instead of integer as a mask for echo.
Interfaces for OSPF and RIP could be configured to use (and request)
TTL 255 for traffic to direct neighbors.
Thanks to Simon Dickhoven for the original patch for RIPng.
Implements support for IPv6 traffic class, sets higher priority for OSPF
and RIP outgoing packets by default and allows to configure ToS/DS/TClass
IP header field and the local priority of outgoing packets.
Several new configure command variants:
configure undo - undo last reconfiguration
configure timeout - configure with scheduled undo if not confirmed in timeout
configure confirm - confirm last configuration
configure check - just parse and validate config file
When 'import keep rejected' protocol option is activated, routes
rejected by the import filter are kept in the routing table, but they
are hidden and not propagated to other protocols. It is possible to
examine them using 'show route rejected'.
Allows to send and receive multiple routes for one network by one BGP
session. Also contains necessary core changes to support this (routing
tables accepting several routes for one network from one protocol).
It needs some more cleanup before merging to the master branch.
- ROA tables, which are used as a basic part for RPKI.
- Commands for examining and modifying ROA tables.
- Filter operators based on ROA tables consistent with RFC 6483.