0
0
mirror of https://gitlab.nic.cz/labs/bird.git synced 2024-11-10 13:18:42 +00:00
Commit Graph

807 Commits

Author SHA1 Message Date
Maria Matejka
b50224a003 Merge branch 'master' into HEAD 2021-12-01 13:04:52 +01:00
Maria Matejka
55ee9961e0 Fix of shutdown: premature log cleanup led to use-after-free 2021-12-01 13:00:54 +01:00
Maria Matejka
bb63e99d78 Page allocator moved from pools to IO loops.
The resource pool system is highly hierarchical and keeping spare pages
in pools leads to unnecessarily complex memory management.

Loops have a flat hiearchy, at least for now, and it is therefore much
easier to keep care of pages, especially in cases of excessive virtual memory
fragmentation.
2021-12-01 13:00:54 +01:00
Maria Matejka
385b3ea395 For safer memory allocations, resources are bound to loops.
Also all loops have their basic resource pool for allocations which are
auto-freed when the loop is stopping.
2021-11-30 21:38:25 +01:00
Maria Matejka
644e9ca94e Directly mapped pages are kept for future use if temporarily not needed 2021-11-24 19:42:52 +00:00
Maria Matejka
821344c781 Stored pages release routine 2021-11-23 11:13:11 +00:00
Maria Matejka
1b39473993 Introducing basic RCU primitives for lock-less shared data structures 2021-11-22 19:05:44 +01:00
Maria Matejka
794a4eefa1 Keeping un-unmmappable pages until they can be reused
On Linux, munmap() may fail with ENOMEM when virtual memory is too
fragmented. Working this around by just keeping such blocks for future
use.
2021-11-22 19:05:44 +01:00
Maria Matejka
878eeec12b Routing tables now have their own loops.
This basically means that:
* there are some more levels of indirection and asynchronicity, mostly
  in cleanup procedures, requiring correct lock ordering
* all the internal table operations (prune, next hop update) are done
  without blocking the other parts of BIRD
* the protocols may get their own loops very soon
2021-11-22 19:05:44 +01:00
Maria Matejka
f0507f05ce Route sources have an explicit owner
This commit prevents use-after-free of routes belonging to protocols
which have been already destroyed, delaying also all the protocols'
shutdown until all of their routes have been finally propagated through
all the pipes down to the appropriate exports.

The use-after-free was somehow hypothetic yet theoretically possible in
rare conditions, when one BGP protocol authors a lot of routes and the
user deletes that protocol by reconfiguring in the same time as next hop
update is requested, causing rte_better() to be called on a
not-yet-pruned network prefix while the owner protocol has been already
freed.

In parallel execution environments, this would happen an inter-thread
use-after-free, causing possible heisenbugs or other nasty problems.
2021-11-22 19:05:44 +01:00
Maria Matejka
6e841b3153 Adding a generic cork mechanism for events 2021-11-22 19:05:43 +01:00
Maria Matejka
94eb0858c2 Converting the former BFD loop to a universal IO loop and protocol loop.
There is a simple universal IO loop, taking care of events, timers and
sockets. Primarily, one instance of a protocol should use exactly one IO
loop to do all its work, as is now done in BFD.

Contrary to previous versions, the loop is now launched and cleaned by
the nest/proto.c code, allowing for a protocol to just request its own
loop by setting the loop's lock order in config higher than the_bird.

It is not supported nor checked if any protocol changed the requested
lock order in reconfigure. No protocol should do it at all.
2021-11-22 19:05:43 +01:00
Maria Matejka
c84ed60371 Moved BFD IO loop out of BFD as we want to use it as socket-io coroutine 2021-11-22 19:05:43 +01:00
Maria Matejka
a4451535c6 Unified time for whole BIRD
In previous versions, every thread used its own time structures,
effectively leading to different time in every thread and strange
logging messages.

The time processing code now uses global atomic variables to keep
current time available for fast concurrent reading and safe updates.
2021-11-22 19:05:43 +01:00
Maria Matejka
a2af807357 Debug messages with timestamps.
On most of current hardware, getting monotonic clock is fast enough to
get it and write for each debug message.
2021-11-22 19:05:43 +01:00
Maria Matejka
df3264f51f Lock position checking allows for safe lock unions 2021-11-22 19:05:43 +01:00
Maria Matejka
1289c1c5ee Coroutines: A simple and lightweight parallel execution framework. 2021-11-22 19:05:43 +01:00
Maria Matejka
1db83a507a Locking subsystem: Just a global BIRD lock to begin with. 2021-11-22 19:05:43 +01:00
Maria Matejka
b5061659d3 POSIX threads and thread-local storage is needed for concurrent execution 2021-11-22 19:05:43 +01:00
Maria Matejka
44f26c49f9 Special table hooks rectified.
* internal tables are now more standalone, having their own import and
  export hooks
* route refresh/reload uses stale counter instead of stale flag,
  allowing to drop walking the table at the beginning
* route modify (by BGP LLGR) is now done by a special refeed hook,
  reimporting the modified routes directly without filters
2021-11-22 19:05:43 +01:00
Maria Matejka
f81702b7e4 Table import and export are now explicit hooks.
Channels have now included rt_import_req and rt_export_req to hook into
the table instead of just one list node. This will (in future) allow for:

* channel import and export bound to different tables
* more efficient pipe code (dropping most of the channel code)
* conversion of 'show route' to a special kind of export
* temporary static routes from CLI

The import / export states are also updated to the new algorithms.
2021-11-22 18:33:53 +01:00
Maria Matejka
6d87cf4be7 Kernel routes are flushed on shutdown by kernel scan, not by table scan 2021-11-09 19:20:41 +01:00
Maria Matejka
0767a0c288 Secondary and merged exports get a whole feed instead of traversing the table structures directly 2021-11-09 19:20:41 +01:00
Maria Matejka
69d1ffde4c Split route data structure to storage (ro) / manipulation (rw) structures.
Routes are now allocated only when they are just to be inserted to the
table. Updating a route needs a locally allocated route structure.
Ownership of the attributes is also now not transfered from protocols to
tables and vice versa but just borrowed which should be easier to handle
in a multithreaded environment.
2021-11-09 19:20:41 +01:00
Maria Matejka
541881bedf RIP fixup + dropping the tmp_attrs mechanism as obsolete 2021-10-13 19:09:04 +02:00
Maria Matejka
e42eedb912 Kernel: Convert the rte-local attributes to extended attributes and flags to pflags 2021-10-13 19:09:04 +02:00
Maria Matejka
3660f19dd5 Dropping the RTS_DUMMY temporary route storage.
Kernel route sync is done by other ways now and this code is not used
currently.
2021-10-13 19:09:04 +02:00
Maria Matejka
5cff1d5f02 Route: moved rte_src pointer from rta to rte
It is an auxiliary key in the routing table, not a route attribute.
2021-10-13 19:09:04 +02:00
Maria Matejka
eb937358c0 Preference moved to RTA and set explicitly in protocols 2021-10-13 19:09:04 +02:00
Maria Matejka
d5a32563df Preexport: No route modification, no linpool needed 2021-10-13 19:09:04 +02:00
Maria Matejka
6cd3771378 Multipage allocation
We can also quite simply allocate bigger blocks. Anyway, we need these
blocks to be aligned to their size which needs one mmap() two times
bigger and then two munmap()s returning the unaligned parts.

The user can specify -B <N> on startup when <N> is the exponent of 2,
setting the block size to 2^N. On most systems, N is 12, anyway if you
know that your configuration is going to eat gigabytes of RAM, you are
almost forced to raise your block size as you may easily get into memory
fragmentation issues or you have to raise your maximum mapping count,
e.g. "sysctl vm.max_map_count=(number)".
2021-10-13 19:01:22 +02:00
Maria Matejka
3a31c3aad6 CLI socket accept() may also fail and should produce some message, not a coredump. 2021-10-13 19:00:36 +02:00
Maria Matejka
7f0e598208 Bound allocated pages to resource pools with page caches to avoid unnecessary syscalls 2021-09-10 18:13:50 +02:00
Maria Matejka
227e2d5541 Debug output uses local buffer to avoid clashes between threads. 2021-09-10 17:37:46 +02:00
Ondrej Zajicek (work)
47d92d8f9d Nest: Clean up main channel handling
Remove assumption that main channel is the only channel.
2021-09-10 17:32:05 +02:00
Ondrej Zajicek (work)
f761be6b30 Nest: Clean up main channel handling
Remove assumption that main channel is the only channel.
2021-06-17 16:56:51 +02:00
Ondrej Zajicek (work)
e5724f71d2 sysdep: Add wrapper to get random bytes - update
Simplify the code and fix an issue with getentropy() return value.
2021-06-06 16:26:06 +02:00
Toke Høiland-Jørgensen
c48ebde5ce sysdep: Add wrapper to get random bytes
Add a wrapper function in sysdep to get random bytes, and required checks
in configure.ac to select how to do it. The configure script tries, in
order, getrandom(), getentropy() and reading from /dev/urandom.
2021-06-06 16:26:06 +02:00
Toke Høiland-Jørgensen
b17adf0735 BSD: Propagate OS-level IFF_MULTICAST to internal IF_MULTICAST flag
The BSD code did not propagate the OS-level IFF_MULTICAST flag to the
Bird-internal IF_MULTICAST flag, which causes problems with Wireguard
interfaces on FreeBSD. The Linux sysdep code does propagate the flag
already, so just copy over the same check and flag update.
2021-05-10 19:49:43 +02:00
Stefan Haller
a7c9515ebc BSD: Fix invalid pointer derefence in logging code
For logging purposes a stack allocated net_addr struct was passed by
value as vararg (instead of the expected pointer). This resulted in
a segfault when the specific error condition got logged.
2021-04-19 15:06:42 +02:00
Ondrej Zajicek (work)
a2277975d7 Unix: Expand accepted ranges of iproute2 constants
We support 32bit table and realm/flow ids, we should also accept them as
constants.

Thanks to Patrick Hemmer for the bugreport.
2021-04-07 16:14:20 +02:00
Maria Matejka
ff397df7ed Routing table is now a resource allocated from its own pool
This also fixes memory leaks from import/export tables being never
cleaned up and freed.
2021-03-30 21:56:08 +02:00
Maria Matejka
886dd92eee Slab: head now uses bitmask for used/free nodes info instead of lists
From now, there are no auxiliary pointers stored in the free slab nodes.
This led to strange debugging problems if use-after-free happened in
slab-allocated structures, especially if the structure's first member is
a next pointer.

This also reduces the memory needed by 1 pointer per allocated object.
OTOH, we now rely on pages being aligned to their size's multiple, which
is quite common anyway.
2021-03-25 16:47:48 +01:00
Ondrej Zajicek (work)
82f19ba95e NEWS and version update 2021-03-18 20:18:38 +01:00
Ondrej Zajicek (work)
7be3af7fa6 Rate-limit scheduling of work-events
In general, events are code handling some some condition, which is
scheduled when such condition happened and executed independently from
I/O loop. Work-events are a subgroup of events that are scheduled
repeatedly until some (often significant) work is done (e.g. feeding
routes to protocol). All scheduled events are executed during each
I/O loop iteration.

Separate work-events from regular events to a separate queue and
rate limit their execution to a fixed number per I/O loop iteration.
That should prevent excess latency when many work-events are
scheduled at one time (e.g. simultaneous reload of many BGP sessions).
2021-03-12 15:35:56 +01:00
Vincent Bernat
714238716e BGP: Add support for BGP hostname capability
This is an implementation of draft-walton-bgp-hostname-capability-02.
It is implemented since quite some time for FRR and in datacenter, this
gives a nice output to avoid using IP addresses.

It is disabled by default. The hostname is retrieved from uname(2) and
can be overriden with "hostname" option. The domain name is never set
nor displayed.

Minor changes by committer.
2021-02-10 16:53:57 +01:00
Ondrej Zajicek (work)
df83f62697 Netlink: Ignore dead routes
With net.ipv4.conf.XXX.ignore_routes_with_linkdown sysctl, a user can
ensure the kernel does not use a route whose target interface is down.
Such route is marked with a 'dead' / RTNH_F_DEAD flag.

Ignore these routes or multipath nexthops during scan.

Thanks to Vincent Bernat for the original patch.
2021-01-14 02:01:07 +01:00
Ondrej Zajicek (work)
2a8cc7259e Kernel: Do not check templates
So one can define kernel protocol template without channels.
For other protocols, it is either irrelevant or already done.

Thanks to Clemens Schrimpe for the bugreport.
2021-01-07 01:56:00 +01:00
Ondrej Zajicek (work)
21f9acd2a0 Kernel: Fix handling of krt_realm with ECMP routes
For ECMP routes, RTA_FLOW attribute must be set per-nexthop, not
per-route. Our corresponding krt_realm attribute is per-route.

Thanks to Mikhail Petrov for the bugreport.
2021-01-06 05:25:59 +01:00
Ondrej Zajicek (work)
62d57b9bdf Log: Fix locking during log reconfiguration
The log subsystem should be locked earlier, as default_log_list() may
internally manipulate with the current_log_list (if it is also a default
log list).
2020-11-25 15:15:13 +01:00