Previously, the Babel protocol would never use prefix compression on outgoing
updates (but would parse it on incoming ones). This adds compression of IPv6
addresses of outgoing updates.
The compression only works to the extent that the FIB is walked in lexicographic
order; i.e. a prefix is only compressed if it shares bytes with the previous
prefix in the same packet.
Thanks to Toke Høiland-Jørgensen <toke@toke.dk> for the patch.
This adds support for dual-stack v4/v6 operation to the Babel protocol.
Routing messages will be exchanged over IPv6, but IPv4 routes can be
carried in the messages being exchanged. This matches how the reference
Babel implementation (babeld) works.
The nexthop address for v4 can be configured per interface, and will
default to the first available IPv4 address on the given interface. For
symmetry, a configuration option to configure the IPv6 nexthop address
is also added.
Thanks to Toke Høiland-Jørgensen <toke@toke.dk> for the patch.
Covers IPv4/VPNv4 routes with IPv6 next hop (RFC 5549), IPv6 routes with
IPv4 next hop (RFC 4798) and VPNv6 routes with IPv4 next hop (RFC 4659).
Unfortunately it also makes next hop hooks more messy.
Each BGP channel now could have two IGP tables, one for IPv4 next hops,
the other for IPv6 next hops.
Basic support for SAFI 4 and 128 (MPLS labeled IP and VPN) for IPv4 and
IPv6. Should work for route reflector, but does not properly handle
originating routes with next hop self.
Based on patches from Jan Matejka.
When a BGP session with ADD_PATH is restarted and the neighbor do not
announce ADD_PATH capability during reconnect, the accept_ra_types is
still set to RA_ANY.
Thanks to Lennert Buytenhek for the bugreport
The patch fixes several bugs introduced in previous changes, simplifies
the protocol by handing routes uniformly, introduces asynchronous route
processing to avoid issues with separate notifications for each next-hop
in ECMP routes, and makes reconfiguration faster by avoiding quadratic
complexity.
During reconfiguration, old and new filter expressions in static routes
are compared using i_same() function. When filter expressions contain
function calls, it is necessary that old filter expressions are the
second argument in i_same(), as it is internally modified by i_same().
Otherwise pointers to old (and freed) data appear in the config
structure.
Thanks to Lennert Buytenhek for tracking and reporting the bug.
Dropped struct mpnh and mpnh_*()
Now struct nexthop exists, nexthop_*(), and also included struct nexthop
into struct rta.
Also converted RTD_DEVICE and RTD_ROUTER to RTD_UNICAST. If it is needed
to distinguish between these two cases, RTD_DEVICE is equivalent to
IPA_ZERO(a->nh.gw), RTD_ROUTER is then IPA_NONZERO(a->nh.gw).
From now on, we also explicitely want C99 compatible compiler. We assume
that this 20-year norm should be known almost everywhere.
The variable nfa is not cleaned before each loop iteration and can have
a wrong value of nfa.nhs_reuse from the previous step.
Thanks to Bernardo Figueiredo for the bugreport and analysis.
Stubnet nodes in OSPF FIB were removed during rt_sync(), but the pointer
remained in top_hash_entry.nf, so net-summary LSA origination was
confused, reported 'LSA ID collision' and net-summary LSAs were not
originated properly.
Thanks to Naveen Chowdary Yerramneni for bugreport and analysis.
Prefix and bucket tables are initialized when entering established state
but not explicitly freed when leaving it (that is handled by protocol
restart). With graceful restart, BGP may enter and leave established
state multiple times without hard protocol restart causing memory leak.
Commit 3c09af41... changed behavior of int_set_add() from prepend to
append, which makes more sense for community list, but prepend must be
used for cluster list. Add int_set_prepend() and use it in cluster list
handling code.
- Unit Testing Framework (BirdTest)
- Integration of BirdTest into the BIRD build system
- Tests for several BIRD modules
Based on squashed Pavel Tvrdik's int-test branch, updated for
current int-new branch.
Implement BFD authentication (part of RFC 5880). Supports plaintext
passwords and cryptographic MD5 / SHA-1 authentication.
Based on former commit from Pavel Tvrdik
Add support for large communities (draft-ietf-idr-large-community),
96bit alternative to RFC 1997 communities.
Thanks to Matt Griswold for the original patch.
It is possible that sockets_add() are called between sockets_prepare()
and sockets_fire() during poll loop in birdloop_main(), so we need to
use loop->poll_fd.used instead of loop->sock_num to find the last field.
An interface reconfiguration may change both the hello and update
intervals. An update interval change is immediately put into effect,
while a hello interval change is not. This also updates the hello
interval immediately (if the new interval is shorter than the old one),
and sends a hello to notify peers of the change.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
We do not need to maintain feasibility distances for our own router
ID (we ignore the updates anyway). Not doing so makes the routes be
garbage collected sooner when export filters change.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
When a route becomes infeasible it should not be kept as selected; this
is forbidden by section 3.6 of the RFC and prevents subsequent updates
from the same router ID from replacing it.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
This makes BIRD send a wildcard retraction on all interfaces before
shutting down and right after starting up. This helps ensure that
neighbours will discard the announced routes as soon as possible,
rather than only after the normal timeout procedures.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
An update with wildcard AE and infinite metric should be treated as a
global retraction of all prefixes announced by that neighbour, per
section 4.4.9 of the RFC. In addition, router ID and seqno in retraction
updates should be ignored. This reworks the handling of retractions and
adjusts the parser to handle all this correctly.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Intervals are carried as 16-bit centisecond values, but kept internally
in 16-bit second values, which causes a potential for overflow. This adds
some checks to make sure this does not happen.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Although RFC 4271 does not forbid empty path segments, they are useless
and some implementations consider them invalid. It is clarified in RFC 7606,
specifying that AS_PATH with empty segment is considered malformed.
Also removed the lib-dir merging with sysdep. Updated #include's
accordingly.
Fixed make doc on recent Debian together with moving generated doc into
objdir.
Moved Makefile.in into root dir
Retired all.o and birdlib.a
Linking the final binaries directly from all the .o files.
This patch implements the IPv6 subset of the Babel routing protocol.
Based on the patch from Toke Hoiland-Jorgensen, with some heavy
modifications and bugfixes.
Thanks to Toke Hoiland-Jorgensen for the original patch.
Add code for manipulation with TCP-MD5 keys in the IPsec SA/SP database
at FreeBSD systems. Now, BGP MD5 authentication (RFC 2385) keys are
handled automatically on both Linux and FreeBSD.
Based on patches from Pavel Tvrdik.
Many protocols do almost the same when creating a rte_update request
before calling rte_update2(). This commit should simplify the protocol
side of the route-creation routine.
In BIRD, RX has lower priority than TX with the exception of RX from
control socket. The patch replaces heuristic based on socket type with
explicit mark and uses it for both control socket and BGP session waiting
to be established.
This should avoid an issue when during heavy load, outgoing connection
could connect (TX event), send open, but then failed to receive OPEN /
establish in time, not sending notifications between and therefore
got hold timer expired error from the neighbor immediately after it
finally established the connection.
After restart, LSAs locally originated by the previous instance are
received from neighbors. They are installed to LSA db and flushed. If
export of a route triggers origination of a new external LSA before flush
of the received one is complete, the check in ospf_originate_lsa() causes
origination to fail (because en->nf is NULL for the old LSA and non-NULL
for the new LSA). The patch fixes this by updating the en->nf for LSAs
being flushed (as is already done for empty ones). Generally, en->nf
field deserves some better description in the code.
Thanks to Jigar Mehta for analyzing the problem.
When a BGP session was established by an outgoing connection with
Graceful Restart behavior negotiated, a pending incoming connection in
OpenSent state, and another incoming connection was received, then the
outgoing connection (and whole BGP session) was closed, but the old
incoming connection was just overwritten by the new one. That later
caused a crash when the hold timer from the old connection fired.
The patch adds support for channels, structures connecting protocols and
tables and handling most interactions between them. The documentation is
missing yet.
Explicit setting of AF_INET(6|) in IP socket creation. BFD set to listen
on v6, without setting the V6ONLY flag to catch both v4 and v6 traffic.
Squashing and minor changes by Ondrej Santiago Zajicek
New data types net_addr and variants (in lib/net.h) describing
network addresses (prefix/pxlen). Modifications of FIB structures
to handle these data types and changing everything to use these
data types instead of prefix/pxlen pairs where possible.
The commit is WiP, some protocols are not yet updated (BGP, Kernel),
and the code contains some temporary scaffolding.
Comments are welcome.
The new RIP implementation fixes plenty of old bugs and also adds support
for many new features: ECMP support, link state support, BFD support,
configurable split horizon and more. Most options are now per-interface.
The patch adds suport for specifying route attributes together with
static routes, e.g.:
route 10.1.1.0/24 via 10.0.0.1 { krt_advmss = 1200; ospf_metric1 = 100; };
New LSA checksumming code separates generic Fletcher-16 and OSPF-specific
code and avoids back and forth endianity conversions, making it much more
readable and also several times faster.
Prior to this patch, BIRD validates the OSPF LSA checksum by calculating
a new checksum and comparing it with the checksum in the header. Due to
the specifics of the Fletcher checksum used in OSPF, this is not
necessarily correct as the checkbytes in the header may be calculated via
a different means and end up with a different value that is nonetheless
still correct.
The documented means of validating the checksum as specified in RFC 905
B.4 is to calculate c0 and c1 from the unchanged contents of the packet,
which must result in a zero value to be considered valid.
Thanks to Chris Boot for the patch.
Under some circumstances and heavy load, TX could be postponed
until the session fails with hold timer expired.
Thanks to Javor Kliachev for making the bug reproductible.
Permit specifying neighbor address, AS number and port independently.
Add 'interface' parameter for specifying interface for link-local
sessions independently.
Thanks to Alexander V. Chernikov for the original patch.
Fixes cases where the same network or external route are propagated by
several OSPF routes and some other corner cases in next hop construction
and ECMP. Allows to specify whether external routes should be merged.
Thanks to Peter Christensen for the original patch.
Stack variable may be used unitialized and that would lead to spurious
rta_free(), which may cause crash. The bug was introduced in 1.4.1 from
merging add-path branch.
Thanks to Peter Andreev for reporting it and Alexander V. Chernikov for
resolving it.
When a BFD session is removed while being scheduled for notification,
the session stays in notify list and is removed twice, which leads to
a strange crash after a while.
The old static route was not removed when the nexthop changed and the
new one was not viable (no neighbor).
Thanks to Pierluigi Rolando for the original patch.
I/O:
- BSD: specify src addr on IP sockets by IP_HDRINCL
- BSD: specify src addr on UDP sockets by IP_SENDSRCADDR
- Linux: specify src addr on IP/UDP sockets by IP_PKTINFO
- IPv6: specify src addr on IP/UDP sockets by IPV6_PKTINFO
- Alternative SKF_BIND flag for binding to IP address
- Allows IP/UDP sockets without tx_hook, on these
sockets a packet is discarded when TX queue is full
- Use consistently SOL_ for socket layer values.
OSPF:
- Packet src addr is always explicitly set
- Support for secondary addresses in BSD
- Dynamic RX/TX buffers
- Fixes some minor buffer overruns
- Interface option 'tx length'
- Names for vlink pseudoifaces (vlinkX)
- Vlinks use separate socket for TX
- Vlinks do not use fixed associated iface
- Fixes TTL for direct unicast packets
- Fixes DONTROUTE for OSPF sockets
- Use ifa->ifname instead of ifa->iface->name
This is more consistent with common usage and also with the behavior of
other implementations (Cisco, Juniper).
Also changes the default for gw mode to be based solely on
direct/multihop.
Neighbor events related to received route next hops got mixed up with
sticky neighbor node for an IP of the BGP peer. If a neighbor for a next
hop disappears, BGP session is shut down.
Interfaces for OSPF and RIP could be configured to use (and request)
TTL 255 for traffic to direct neighbors.
Thanks to Simon Dickhoven for the original patch for RIPng.
Implements support for IPv6 traffic class, sets higher priority for OSPF
and RIP outgoing packets by default and allows to configure ToS/DS/TClass
IP header field and the local priority of outgoing packets.
BIRD used zero netmask in hello packets on all PtP links, not just on
unnumbered ones. This patch fixes it and adds option 'ptp netmask'
for overriding the default behavior.
Thanks to Alexander V. Chernikov for the original patch.
Configured NBMA neighbors in OSPFv3 should be link-local addresses, old
behavior was to silently ignore global ones. The patch allows BIRD to
accept global ones, but adds a warning and a documentation notice.
Thanks to Wilco Baan Hofman for the bugreport.
The RAdv protocol could be configured to change its behavior based on
availability of routes, e.g., do not announce router lifetime when a
default route is not available.
Router ID could be automatically determined based of subset of
ifaces/addresses specified by 'router id from' option. The patch also
does some minor changes related to router ID reconfiguration.
Thanks to Alexander V. Chernikov for most of the work.
Although it is a slight deviation from the standard, it has no ill
consequences for OSPFv2 and the change fixes a compatibility issue
with some broken implementations.
When 'import keep rejected' protocol option is activated, routes
rejected by the import filter are kept in the routing table, but they
are hidden and not propagated to other protocols. It is possible to
examine them using 'show route rejected'.
Allows to send and receive multiple routes for one network by one BGP
session. Also contains necessary core changes to support this (routing
tables accepting several routes for one network from one protocol).
It needs some more cleanup before merging to the master branch.
The nest-protocol interaction is changed to better handle multitable
protocols. Multitable protocols now declare that by 'multitable' field,
which tells nest that a protocol handles things related to proto-rtable
interaction (table locking, announce hook adding, reconfiguration of
filters) itself.
Filters and stats are moved to announce hooks, a protocol could have
different filters and stats to different tables.
The patch is based on one from Alexander V. Chernikov, thanks.
Hostcache is a structure for monitoring changes in a routing table that
is used for routes with dynamic/recursive next hops. This is needed for
proper iBGP next hop handling.
Now it shows a distance, option to change showing reachable/all network
nodes and better handling of AS-external LSAs in multiple areas. The
command 'show ospf topology' was changed to not show stubnets in both
OSPFv2 and OSPFv3 (previously it displayed stubnets in OSPFv2).
A very tricky bug. OSPF on NBMA interfaces probably never really worked.
When a packet was sent to multiple destinations, the checksum was
calculated multiple times from a packet with already filled checksum
field (from previous calculation). Therefore, many packets were sent
with an invalid checksum.
When device protocol goes down, interfaces should be flushed
asynchronously (in the same way like routes from protocols are flushed),
when protocol goes to DOWN/HUNGRY.
This fixes the problem with static routes staying in kernel routing
table after BIRD shutdown.
- BSD kernel syncer is now self-conscious and can learn alien routes
- important bugfix in BSD kernel syncer (crash after protocol restart)
- many minor changes and bugfixes in kernel syncers and neighbor cache
- direct protocol does not generate host and link local routes
- min_scope check is removed, all routes have SCOPE_UNIVERSE by default
- also fixes some remaining compiler warnings
It seems that by adding one pipe-specific exception to route
announcement code and by adding one argument to rt_notify() callback i
could completely eliminate the need for the phantom protocol instance
and therefore make the code more straightforward. It will also fix some
minor bugs (like ignoring debug flag changes from the command line).
When uncofiguring the pipe and the peer table, the peer table was
unlocked when pipe protocol state changed to down/flushing and not to
down/hungry. This leads to the removal of the peer table before
the routes from the pipe were flushed.
The fix leads to adding some pipe-specific hacks to the nest,
but this seems inevitable.
Process well-known communities before the export filter (old behavior is
to process these attributes after, which does not allow to send route
with such community) and just for routes received from other BGP
protocols. Also fixes a bug in next_hop check.
There is no reak callback scheduler and previous behavior causes
bad things during hard congestion (like BGP hold timeouts).
Smart callback scheduler is still missing, but main loop was
changed such that it first processes all tx callbacks (which
are fast enough) (but max 4* per socket) + rx callbacks for CLI,
and in the second phase it processes one rx callback per
socket up to four sockets (as rx callback can be slow when
there are too many protocols, because route redistribution
is done synchronously inside rx callback). If there is event
callback ready, second phase is skipped in 90% of iterations
(to speed up CLI during congestion).
Mixing ip_addr and u32 does bad things on Ultrasparc.
Although both have the same size. Fascinating.
It was not catched by compiler because of varargs.
Although standard says that if we receive AS_PATH_CONFED_*
(and we are not a part of a confederation) segment, we should
drop session, nobody does that and it is unwise to do that.
Now we drop session just in case that peer ASN is in
AS_PATH_CONFED_* segment (to detect peer that considers BIRD
as a part of its confederation).
Allows to add more interface patterns to one common 'options'
section like:
interface "eth3", "eth4" { options common to eth3 and eth4 };
Also removes undocumented and unnecessary ability to specify
more interface patterns with different 'options' sections:
interface "eth3" { options ... }, "eth4" { options ... };
Cryptographic authentication in OSPF is defective by
design - there might be several packets independently
sent to the network (for example HELLO, LSUPD and LSACK)
where they might be reordered and that causes crypt.
sequence number error.
That can be workarounded by not incresing sequence number
too often. Now we update it only when last packet was sent
before at least one second. This can constitute a risk of
replay attacks, but RFC supposes something similar (like time
in seconds used as CSN).
If a DBDES packet from a master to a slave is lost, then the old code
does not retransmit it and instead send a next one with the same
sequence number. That leads to silent desynchronization of LSA
databases.
AS4 optional attribute errors were handled by session
drop (according to BGP RFC). This patch implements
error handling according to new BGP AS4 draft (*)
- ignoring invalid AS4 optional attributes.
(*) http://www.ietf.org/internet-drafts/draft-chen-rfc4893bis-02.txt
This patch extends the length for attributes from 1024 to 2048
(because both AS_PATH and AS4_PATH attributes take 2+4 B per AS).
If there is not enough space for attributes, Bird skips that
route group. Old behavior (skipping remaining attributes)
leads to skipping required attributes and session drop.
When OSPF neighbor state drops down to EXSTART,
clear LSA request and retransmit lists, as specified
by RFC. I hope that this will prevent oscillations
between EXSTART and LOADING states, which sometimes
happened.
It also contains related fix from Yury Shevchuk that
properly resets DB summary list iterator.
When capability related error is received, next connect will be
without capabilities. Also cease error subcodes descriptions
(according to [RFC4486]) are added.
BGP keeps its copy of configuration ptr and didn't update it during
reconfiguration. But old configuration is freed during reconfiguration.
That leads to unnecessary reset of BGP connection during reconfiguration
(old conf is corrupted and therefore different) and possibly other strange
behavior.
Fixes two race conditions causing crash of Bird, several unhandled
cases during BGP initialization, and some other bugs. Also changes
handling of startup delay to be more useful and implement
reporting of last error in 'show protocols' command.
values for MD5 password ID changed during reconfigure, Second
bug is that BIRD chooses password in first-fit manner, but RFC
says that it should use the one with the latest generate-from.
It also modifies the syntax for multiple passwords.
Now it is possible to just add more 'password' statements
to the interface section and it is not needed to use
'passwords' section. Old syntax can be used too.
RFC says that only connections in OpenConfirm and Established state
should participate in connection collision detection.
The current implementation leads to race condition when both sides
are trying to connect at the almost same time, then both sides
receive OPEN message by different connections at the almost same
time and close the other connection. Both connections are
closed and the both sides end in start/idle or start/active
state.
Two new CLI commands for OSPF giving nice informative (and still machine
parsable) representation of OSPF network graph (based on datas from the
LSA database).
The first command (show ospf topology) shows routers, networks and stub
networks, The second command (show ospf state) shows also external
routes and area-external networks and routers propagated by given area
boundary router.
The code generating LSAs for PTP OSPF links is buggy. The old behavior
is that it generates PTP link if there is a full/ptp neighbor and stub
link if there isn't. According to RFC 2328, the correct behavior is to
generate stub link in both cases (in the first case together with PTP
link).
And because of buggy detection of unnumbered networks, for numbered
networks the code creates stub links with 0.0.0.0/32.
- Old MED handling was completely different from behavior
specified in RFCs - for example they havn't been propagated
to neighboring areas.
- Update tie-breaking according to RFC 4271.
- Change default value for 'default bgp_med' configuration
option according to RFC 4271.