Most protocols in IPv6 mode use link-local source addresses and expect
that there is one on each active interface. The old code depended on
assumption that if there is some IPv6 address on iface, there is also an
IPv6 link-local address on that iface (added by kernel when the iface
went up). Unfortunately, that is not generally true, as a configured
global address sometimes ceases to be tentative (finishes DOD) before
a link-local address on the same iface. In such case a protocol iface
(namely RAdv and Babel) is activated, but fails to found link-local
address and stays in failed state.
The patch fixes that by tracking 'primary' IPv6 link-local address,
sending iface restart notifications when it changes and making
protocols ignore iface-up notifications when no such address is
selected for an iface.
The patch implements Default Router Preferences and More-Specific Routes
(RFC 4191) for RAdv protocol, allowing to announce router preference and
more specific routes in router advertisements. Routes can be exported to
RAdv like to regular routing protocols.
Some cleanups, bugfixes and other changes done by Ondrej Zajicek.
Add basic VRF (virtual routing and forwarding) support. Protocols can be
associated with VRFs, such protocols will be restricted to interfaces
assigned to the VRF (as reported by Linux kernel) and will use sockets
bound to the VRF. E.g., different multihop BGP instances can use diffent
kernel routing tables to handle BGP TCP connections.
The VRF support is preliminary, currently there are several limitations:
- Recent Linux kernels (4.11) do not handle correctly sockets bound
to interaces that are part of VRF, so most protocols other than multihop
BGP do not work. This will be fixed by future kernel versions.
- Neighbor cache ignores VRFs. Breaks config with the same prefix on
local interfaces in different VRFs. Not much problem as single hop
protocols do not work anyways.
- Olock code ignores VRFs. Breaks config with multiple BGP peers with the
same IP address in different VRFs.
- Incoming BGP connections are not dispatched according to VRFs.
Breaks config with multiple BGP peers with the same IP address in
different VRFs. Perhaps we would need some kernel API to read VRF of
incoming connection? Or probably use multiple listening sockets in
int-new branch.
- We should handle master VRF interface up/down events and perhaps
disable associated protocols when VRF goes down. Or at least disable
associated interfaces.
- Also we should check if the master iface is really VRF iface and
not some other kind of master iface.
- BFD session request dispatch should be aware of VRFs.
- Perhaps kernel protocol should read default kernel table ID from VRF
iface so it is not necessary to configure it.
- Perhaps we should have per-VRF default table.
Keep a cache of all the relevant prefixes we send out. When a prefix
appears, insert it into the cache. If it dies, keep it there for a
while, marked as dead.
Send out the dead prefixes with zero lifetime.
Adapt the naming conventions to be a bit closer to the other protocols.
proto_radv -> radv_proto
struct radv_proto *ra -> struct radv_proto *p
struct proto *p -> struct proto *P
I/O:
- BSD: specify src addr on IP sockets by IP_HDRINCL
- BSD: specify src addr on UDP sockets by IP_SENDSRCADDR
- Linux: specify src addr on IP/UDP sockets by IP_PKTINFO
- IPv6: specify src addr on IP/UDP sockets by IPV6_PKTINFO
- Alternative SKF_BIND flag for binding to IP address
- Allows IP/UDP sockets without tx_hook, on these
sockets a packet is discarded when TX queue is full
- Use consistently SOL_ for socket layer values.
OSPF:
- Packet src addr is always explicitly set
- Support for secondary addresses in BSD
- Dynamic RX/TX buffers
- Fixes some minor buffer overruns
- Interface option 'tx length'
- Names for vlink pseudoifaces (vlinkX)
- Vlinks use separate socket for TX
- Vlinks do not use fixed associated iface
- Fixes TTL for direct unicast packets
- Fixes DONTROUTE for OSPF sockets
- Use ifa->ifname instead of ifa->iface->name
The RAdv protocol could be configured to change its behavior based on
availability of routes, e.g., do not announce router lifetime when a
default route is not available.