0
0
mirror of https://gitlab.nic.cz/labs/bird.git synced 2024-12-23 18:21:54 +00:00
Commit Graph

4459 Commits

Author SHA1 Message Date
Maria Matejka
71b434a987 Merge commit 'f0507f05ce57398e135651896dace4cb68eeed54' into thread-next 2022-08-02 22:08:59 +02:00
Maria Matejka
0072d11f34 Merge branch 'ballygarvan' into HEAD
Replacing the old 3.0-alpha0 cork mechanism with another one inside the
routing table. This version should be simpler and also quite clear what
it does, why and when.
2022-08-02 17:58:14 +02:00
Maria Matejka
2e95d269d6 Revert "Split route table event into separate events"
This reverts commit 445eeaf3df.
2022-08-02 17:55:50 +02:00
Maria Matejka
de5b884280 Revert "Table cork: Stop creating updates when there are too many pending."
This reverts commit 3b20722a1f.
2022-08-02 17:55:47 +02:00
Maria Matejka
db9153e216 Also next hop update routines are corking themselves when congestion is detected 2022-08-02 17:51:58 +02:00
Maria Matejka
449cd471ad BGP: respecting table cork 2022-08-02 17:34:38 +02:00
Maria Matejka
f8500b5943 Route table cork: Indicate whether the export queues are congested.
These routines detect the export congestion (as defined by configurable
thresholds) and propagate the state to readers. There are no readers for
now, they will be added in following commits.
2022-08-02 17:34:38 +02:00
Maria Matejka
058ed71139 Introducing basic RCU primitives for lock-less shared data structures 2022-08-02 10:00:21 +02:00
Maria Matejka
f1d6c66a78 Fixed main birdloop init in unit tests
Some unit tests weren't initializing the birdloop, trying to write the
birdloop ping into stdin. Fixed this and also forced stdin close on
startup of every test just to be sure that CI and local build behave the
same in this. (CI was failing on this while local build not.)
2022-08-01 15:17:41 +02:00
Maria Matejka
f60f7dfdee Sending an event must also ping the target IO loop 2022-07-28 19:52:19 +02:00
Maria Matejka
e858dce757 Moved the thread starting code to IO loop code 2022-07-28 19:49:03 +02:00
Maria Matejka
00f6b162c9 Merge commit '03bf6b90' into thread-next 2022-07-28 19:22:58 +02:00
Maria Matejka
03bf6b9087 Revert "Adding a generic cork mechanism for events"
This reverts commit 6e841b3153.
2022-07-28 19:22:48 +02:00
Ondrej Zajicek
082905a833 Merge branch 'master' into backport 2022-07-27 00:47:24 +02:00
Ondrej Zajicek
ddb1bdf281 Netlink: Restrict route replace for IPv6
Seems like the previous patch was too optimistic, as route replace is
still broken even in Linux 4.19 LTS (but fixed in Linux 5.10 LTS) for:

  ip route add 2001:db8::/32 via fe80::1 dev eth0
  ip route replace 2001:db8::/32 dev eth0

It ends with two routes instead of just the second.

The issue is limited to direct and special type (e.g. unreachable)
routes, the patch restricts route replace for cases when the new route
is a regular route (with a next hop address).
2022-07-26 18:45:20 +02:00
Ondrej Zajicek
722daa9500 Netlink: Simplify handling of IPv6 ECMP routes
When IPv6 ECMP support first appeared in Linux kernel, it used different
API than IPv4 ECMP. Individual next hops were updated and announced
separately, instead of using RTA_MULTIPATH as in IPv4. This has several
drawbacks and requires complex code to merge received notifications to
one multipath route.

When Linux came with IPv6 RTA_MULTIPATH support, the initial versions
were somewhat buggy, so we kept using the old API for updates (splitting
multipath routes to sequences of route updates), while accepting both
old-style routes and RTA_MULTIPATH routes in scans / notifications.

As IPv6 RTA_MULTIPATH support is here for a long time, this patch fully
switches Netlink to the IPv6 RTA_MULTIPATH API and removes old complex
code for handling individual next hop announces.

The required Linux version is at least 4.11 for reliable operation.

Thanks to Daniel Gröber for the original patch.
2022-07-25 00:11:40 +02:00
Ondrej Zajicek
2e484f8d29 Merge branch 'master' into backport 2022-07-24 20:08:02 +02:00
Ondrej Zajicek
534d0a4b44 KRT: Scan routing tables separetely on linux to avoid congestion
Remove compile-time sysdep option CONFIG_ALL_TABLES_AT_ONCE, replace it
with runtime ability to run either separate table scans or shared scan.

On Linux, use separate table scans by default when the netlink socket
option NETLINK_GET_STRICT_CHK is available, but retreat to shared scan
when it fails.

Running separate table scans has advantages where some routing tables are
managed independently, e.g. when multiple routing daemons are running on
the same machine, as kernel routing table modification performance is
significantly reduced when the table is modified while it is being
scanned.

Thanks Daniel Gröber for the original patch and Toke Høiland-Jørgensen
for suggestions.
2022-07-24 02:15:20 +02:00
Maria Matejka
432dfe3b9b Fixed a rarely used part of Babel: comparing two routes in table by their metric 2022-07-22 15:48:20 +02:00
Maria Matejka
4d48ede51d Revert "Export table: Delay freeing of old stored route."
This reverts commit cee0cd148c.
This change is not needed in version 2 and the surrounding code has
disappeared mostly in version 3.
2022-07-22 15:37:21 +02:00
Maria Matejka
e91754f5b9 Event lists rewritten to a single linked list
In multithreaded environment, we need to pass messages between workers.
This is done by queuing events to their respective queues. The
double-linked list is not really useful for that as it needs locking
everywhere.

This commit rewrites the event subsystem to use a single-linked list
where events are enqueued by a single atomic instruction and the queue
is processed after atomically moving the whole queue aside.
2022-07-18 13:28:35 +02:00
Maria Matejka
08c8484608 Merge commit '94eb0858' into thread-next 2022-07-18 12:33:00 +02:00
Maria Matejka
4b6f5ee870 Merge commit 'a4451535' into thread-next 2022-07-18 11:11:46 +02:00
Maria Matejka
9901ca6fb3 Fixed an annoying warning in ea_get_storage() 2022-07-18 10:56:20 +02:00
Maria Matejka
812edb85e1 Fixing build issues caused by a nonportable Makefile rule 2022-07-18 10:26:55 +02:00
Maria Matejka
636ab95f44 Merge commit 'a845651b' into thread-next 2022-07-18 10:19:59 +02:00
Maria Matejka
05673b16a8 Merge commit 'c70b3198' into thread-next [lots of conflicts]
There were more conflicts that I'd like to see, most notably in route
export. If a bisect identifies this commit with something related, it
may be simply true that this commit introduces that bug. Let's hope it
doesn't happen.
2022-07-15 14:57:02 +02:00
Maria Matejka
1c2851ecfa Fixed invalid routes handling
The invalid routes were filtered out before they could ever get
exported, yet some of the routines need them available, e.g. for
display or import reload.

Now the invalid routes are properly exported and dropped in channel
export routines instead.
2022-07-14 12:13:18 +02:00
Maria Matejka
239edf8d31 Merge branch 'backport' into thread-next 2022-07-13 14:46:36 +02:00
Maria Matejka
68a2c9d4c9 Merge commit '2e5bfeb73ac25e236a24b6c1a88d0f2221ca303f' into thread-next 2022-07-13 14:14:37 +02:00
Maria Matejka
af0d5ec279 Merge commit 'd429bc5c841a8e9d4c81786973edfa56d20a407e' into thread-next 2022-07-13 12:54:20 +02:00
Maria Matejka
5be34f5ab4 Merge commit '7e9cede1fd1878fb4c00e793bccd0ca6c18ad452' into thread-next 2022-07-13 12:02:34 +02:00
Maria Matejka
4ec443b5c2 Fixed bug in repeated show route command
Introduced by 13ef5e53dd, the CLI was not
properly cleaned up when the command finished, causing BIRD to not parse
any other command after "show route".
2022-07-13 11:24:09 +02:00
Maria Matejka
4f16270dd9 Merge commit 'f18968f5' into thread-next 2022-07-12 15:05:04 +02:00
Ondrej Zajicek
971721c9b5 BGP: Minor improvements to BGP roles
Add support for bgp_otc in filters and warning for configuration
inside confederations.
2022-07-12 15:03:17 +02:00
Maria Matejka
2a27564423 Merge commit '1df20989' into thread-next 2022-07-12 14:46:17 +02:00
Maria Matejka
1df20989c1 Revert "Special table hooks rectified."
This reverts commit 44f26c49f9.
2022-07-12 14:46:06 +02:00
Maria Matejka
bc2ce4aaa8 Removing the rte_modify API
For BGP LLGR purposes, there was an API allowing a protocol to directly
modify their stale routes in table before flushing them. This API was
called by the table prune routine which violates the future locking
requirements.

Instead of this, BGP now requests a special route export and reimports
these routes into the table, allowing for asynchronous execution without
locking the table on export.
2022-07-12 14:45:27 +02:00
Maria Matejka
080cbd1219 Route refresh in tables uses a stale counter.
Until now, we were marking routes as REF_STALE and REF_DISCARD to
cleanup old routes after route refresh. This needed a synchronous route
table walk at both beginning and the end of route refresh routine,
marking the routes by the flags.

We avoid these walks by using a stale counter. Every route contains:
  u8 stale_cycle;
Every import hook contains:
  u8 stale_set;
  u8 stale_valid;
  u8 stale_pruned;
  u8 stale_pruning;

In base_state, stale_set == stale_valid == stale_pruned == stale_pruning
and all routes' stale_cycle also have the same value.

The route refresh looks like follows:
+ ----------- + --------- + ----------- + ------------- + ------------ +
|             | stale_set | stale_valid | stale_pruning | stale_pruned |
| Base        |     x     |      x      |        x      |       x      |
| Begin       |    x+1    |      x      |        x      |       x      |
  ... now routes are being inserted with stale_cycle == (x+1)
| End         |    x+1    |     x+1     |        x      |       x      |
  ... now table pruning routine is scheduled
| Prune begin |    x+1    |     x+1     |       x+1     |       x      |
  ... now routes with stale_cycle not between stale_set and stale_valid
      are deleted
| Prune end   |    x+1    |     x+1     |       x+1     |      x+1     |
+ ----------- + --------- + ----------- + ------------- + ------------ +

The pruning routine is asynchronous and may have high latency in
high-load environments. Therefore, multiple route refresh requests may
happen before the pruning routine starts, leading to this situation:

| Prune begin |    x+k    |     x+k     |    x -> x+k   |       x      |
  ... or even
| Prune begin |   x+k+1   |     x+k     |    x -> x+k   |       x      |
  ... if the prune event starts while another route refresh is running.

In such a case, the pruning routine still deletes routes not fitting
between stale_set and and stale_valid, effectively pruning the remnants
of all unpruned route refreshes from before:

| Prune end   |    x+k    |     x+k     |       x+k     |      x+k     |

In extremely rare cases, there may happen too many route refreshes
before any route prune routine finishes. If the difference between
stale_valid and stale_pruned becomes more than 128 when requesting for
another route refresh, the routine walks the table synchronously and
resets all the stale values to a base state, while logging a warning.
2022-07-12 12:22:41 +02:00
Eugene Bogomazov
c73b5d2d3d BGP: Implement BGP roles
Implement BGP roles as described in RFC 9234. It is  a mechanism for
route leak prevention and automatic route filtering based on common BGP
topology relationships. It defines role capability (controlled by 'local
role' option) and OTC route attribute, which is used for automatic route
filtering and leak detection.

Minor changes done by commiter.
2022-07-11 17:25:54 +02:00
Maria Matejka
4ef2262bd5 There are now no internal tables at all. 2022-07-11 17:08:59 +02:00
Maria Matejka
9efaf6bafe Dropped the internal kernel protocol table for learnt routes.
The learnt routes are now pushed all into the connected table, not only
the best one. This shouldn't do any damage in well managed setups, yet
it should be noted that it is a change of behavior.

If anybody misses a feature which they implemented by misusing this
internal learn table, let us know, we'll consider implementing it in a
better way.
2022-07-11 17:04:52 +02:00
Maria Matejka
6b0368cc2c Export tables merged with BGP prefix hash
Until now, if export table was enabled, Nest was storing exactly the
route before rt_notify() was called on it. This was quite sloppy and
spooky and it also wasn't reflecting the changes BGP does before
sending. And as BGP is storing the routes to be sent anyway, we are
simply keeping the already-sent routes in there to better rule out
unneeded reexports.

Some of the route attributes (IGP metric, preference) make no sense in
BGP, therefore these will be probably replaced by something sensible.
Also the nexthop shown in the short output is the BGP nexthop.
2022-07-11 16:07:09 +02:00
Maria Matejka
d5e3272f3d Hash: iterable now per partes by an iterator
It's now possible to pause iteration through hash. This requires
struct hash_iterator to be allocated somewhere handy.

The iteration itself is surrounded by HASH_WALK_ITER and
HASH_WALK_ITER_END. Call HASH_WALK_ITER_PUT to ask for pausing; it may
still do some more iterations until it comes to a suitable pausing
point. The iterator must be initalized to an empty structure. No cleanup
is needed if iteration is abandoned inbetween.
2022-07-11 16:07:09 +02:00
Maria Matejka
b06911f6ef Do not try to check flowspec validity for piped routes 2022-07-11 16:07:09 +02:00
Maria Matejka
61842ff315 Fixed bad import table attributes freeing 2022-07-11 16:07:09 +02:00
Maria Matejka
fd72c69678 Attribute lists split to storage headers and data to save BGP memory 2022-07-11 16:07:09 +02:00
Maria Matejka
dc720a085f Show route uses the export request also for one-net queries 2022-07-11 16:07:09 +02:00
Maria Matejka
b5c8fce284 Added forgotten route source locking in flowspec validation 2022-07-11 13:04:01 +02:00
Maria Matejka
2e5bfeb73a Merge remote-tracking branch 'origin/master' into backport 2022-07-11 11:08:10 +02:00