mirror of https://git.FreeBSD.org/src.git synced 2026-06-02 11:24:32 +00:00
Commit Graph

309661 Commits

Author SHA1 Message Date
Konstantin Belousov 963a92d63b amd64: explain in more details why the slop is needed
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2026-05-31 22:18:46 +03:00
Anaelle Cazuc 03c69dd901 pmc: add sapphire rapids model
This commit adds the sapphire rapids CPU model to hwpmc_intel.c,
allowing hwpmc to be used on this CPU family.

Reviewed by:	mhorne
MFC after:	3 days
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D57263
2026-05-31 14:50:20 -03:00
Konstantin Belousov 510ee6698d linux_ntsync: linux compat shim for ntsync(9)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D57038
2026-05-31 20:17:07 +03:00
Konstantin Belousov d0ea3aff90 ntsync: add kinfo reporting
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D57038
2026-05-31 20:14:47 +03:00
Konstantin Belousov 0ac9aac81c ntsync: install headers for userspace consumption
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D57038
2026-05-31 20:14:47 +03:00
Konstantin Belousov 03ca6dbdb8 ntsync(4)
The driver implements the ntsync interface as specified in the Linux
7.0-rc3 document Documentation/userspace-api/ntsync.rst.  Only the
documentation and the userspace tests (Linux'
tools/testing/selftests/drivers/ntsync/ntsync.c) were used for
reference.  When the documentation contradicted the tests, the behavior
of the tests was implemented.

One quirk is that the Linux API needs to return an error from ioctl() and
also copy out the modified ioctl() argument.  Our generic ioctl() is not
flexible enough to implement this, so the ntsync_ioctl_copyout() hack
copies out the ioctl parameter directly from the ioctl method, instead of
relying on the ioctl infrastructure.

The FreeBSD port of the tests, which can be compiled on both FreeBSD and
Linux, is available at https://github.com/kostikbel/freebsd-ntsync-test.
The Linux binary compiled with the Linux test harness cannot be run
under the linuxolator due to unimplemented syscalls, but the shims in
freebsd-ntsync-test can be compiled on Linux, and the resulting Linux/glibc
binary can be run on the linuxolator to test Linux compat.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D57038
2026-05-31 20:10:46 +03:00
Ahmad Khalifa aef014de3f Revert "edk2: enable static asserts for *INT64 alignment"
This fails when using WITH_BEARSSL. It seems like we build the EFI bits
of libsecureboot (which is really just part of libsa in this case), even
when building the BIOS loader. Revert for now to unbreak the build.

This reverts commit 2fa4bdd7f9.

Reported by: freebsd@walstatt-de.de
2026-05-31 15:12:34 +03:00
Ahmad Khalifa 23996d940a stand/efi/Makefile: fix build order
Move liblua32efi and ficl32efi before .WAIT, otherwise there's a race
between the interpreter and the loader being built.

Reported by:	kbowling
Discussed with:	kevans
Fixes:		d15cc7625d
2026-05-31 14:48:05 +03:00
Joshua Rogers 8809ea46f1 ukbd: fix SET_REPORT wValue always using report ID 0 for LED output
ukbd_set_leds_callback() built the SET_REPORT control request with
USETW2(req.wValue, UHID_OUTPUT_REPORT, 0) before the loop that
determines the actual HID report ID from sc_id_numlock,
sc_id_scrolllock, or sc_id_capslock.  The data payload was already
correctly prefixed with the real report ID when id != 0, but the
control request's wValue told the device to set report ID 0, which
does not exist on devices that use non-zero report IDs for LED output.

Apple Internal Keyboard / Trackpad (0x05ac:0x0274) uses report ID 1
for LED output.  The mismatch caused the device to STALL every
SET_REPORT request, so the capslock LED could never be updated.

Move the USETW2 call to after the LED-detection loop so that wValue
carries the correct report ID.
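
The ordering issue can be sketched outside the kernel.  The helpers
below are hypothetical stand-ins, not the driver's code: they only show
that wValue of a HID SET_REPORT request carries the report type in its
high byte and the report ID in its low byte, so the ID has to be known
before wValue is built.

```c
#include <assert.h>
#include <stdint.h>

/* HID class report type for output reports (0x02 in the HID spec). */
#define HID_REPORT_TYPE_OUTPUT	0x02

/*
 * Hypothetical helper: build the 16-bit wValue of a SET_REPORT request.
 * The high byte is the report type, the low byte is the report ID.
 */
uint16_t
set_report_wvalue(uint8_t report_type, uint8_t report_id)
{
	return ((uint16_t)(((uint16_t)report_type << 8) | report_id));
}

/*
 * Hypothetical helper: return the first non-zero report ID among the
 * LED usages, or 0 when the device does not use report IDs.
 */
uint8_t
led_report_id(uint8_t id_numlock, uint8_t id_scrolllock, uint8_t id_capslock)
{
	if (id_numlock != 0)
		return (id_numlock);
	if (id_scrolllock != 0)
		return (id_scrolllock);
	return (id_capslock);
}

/* Determine the report ID first, then encode it into wValue. */
uint16_t
led_set_report_wvalue(uint8_t id_numlock, uint8_t id_scrolllock,
    uint8_t id_capslock)
{
	uint8_t id;

	id = led_report_id(id_numlock, id_scrolllock, id_capslock);
	return (set_report_wvalue(HID_REPORT_TYPE_OUTPUT, id));
}
```

With a capslock report ID of 1, led_set_report_wvalue(0, 0, 1) yields
0x0201, whereas building wValue before the ID is known would always
yield 0x0200.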

Signed-off-by:	Joshua Rogers <Joshua@Joshua.Hu>
Reviewed by:	wulf
MFC after:	1 week
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2210
2026-05-31 14:29:15 +03:00
Bjoern A. Zeeb b53eab3229 LinuxKPI: idr: use macros for idr lock operations
Our idr implementation uses a mtx lock, which has already caused
problems in the past (613723bac2).
To make the problem easier to tackle, start by factoring out all
operations on idr->lock into macros, as we have often done in other
parts of the code.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Reviewed by:	wulf, emaste
Differential Revision: https://reviews.freebsd.org/D55392
2026-05-30 21:45:47 +00:00
Bjoern A. Zeeb d07460f194 LinuxKPI: 802.11 suspend/resume: fix the is_pci_dev check
Shortly before I committed the works from a year ago, jhb added a
function ("is_pci_device") so that the check against the devclass
does not have to be coded in every driver.  Use this instead in main
(and stable/15 in case the works get MFCed).

At the same time this fixes the check (the old one was wrong): we
attach to the LinuxKPI 802.11 driver, e.g., iwlwifi, and thus the
identify bus function needs to check that the parent of the parent,
not just the parent, is of the devclass "pci".  That was the
first error.  The second (and this is why it worked) was that we
checked for == instead of !=, so the wrong check became true again.

Discussed with:	jhb
Fixes:		11d69a4558 ("LinuxKPI: 802.11: add support for s/r")
MFC after:	3 days
X-MFC after:	ffcf5e3566 ("pci: Add is_pci_device helper function")
Sponsored by:	The FreeBSD Foundation
2026-05-30 21:33:51 +00:00
Bjoern A. Zeeb 49b413c4b0 rtwn/usb: add ID for D-Link DWA-121 rev B1 to rtwn RTL8188EU
Add the device ID to the usbdevs table in order to be able to use
it in the rtwn/usb driver for the RTL8188EU attachment.

(I adjusted the name to B1 compared to the original submission)

PR:		291839
MFC after:	3 days
2026-05-30 21:29:09 +00:00
Pawel Biernacki a64148e21b linux: Add support for PR_SET_VMA to prctl(2)
Implement dummy support for PR_SET_VMA with PR_SET_VMA_ANON_NAME in
prctl(2).  This prevents applications from receiving EINVAL when
attempting to name anonymous memory regions.
2026-05-30 19:52:58 +00:00
Aymeric Wibo eda74fe479 rand(3): Normalize function ordering
Align ordering between NAME & SYNOPSIS sections.

Obtained from:	https://github.com/apple-oss-distributions/libc
Sponsored by:	Klara, Inc.
2026-05-30 20:07:29 +01:00
Faraz Vahedi c115aad996 assert.3: Update as per C23
Signed-off-by:	Faraz Vahedi <kfv@kfv.io>
Reviewed by:	fuz
MFC after:	1 month
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2203
2026-05-30 15:43:52 +02:00
Faraz Vahedi 0fe73dcf7c libc: Add <assert.h> C23 feature test macro
Signed-off-by:	Faraz Vahedi <kfv@kfv.io>
Reviewed by:	fuz
MFC after:	1 month
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2203
2026-05-30 15:43:52 +02:00
Faraz Vahedi 867b51452e libc: Add variadic assert in accordance with C23
Signed-off-by:	Faraz Vahedi <kfv@kfv.io>
Reviewed by:	fuz
MFC after:	1 month
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2203
2026-05-30 15:43:51 +02:00
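
As background for the variadic assert commit above: C23 defines assert
as a macro taking "...", so an expression that contains a top-level
comma, such as a compound literal, no longer splits into several macro
arguments.  The sketch below illustrates the technique with a
hypothetical CHECK() macro; it is not the libc implementation.

```c
#include <assert.h>
#include <stdlib.h>

struct point {
	int	x;
	int	y;
};

/*
 * Hypothetical macro, not the libc one: accept any number of macro
 * arguments and join them back into a single expression.
 */
#define CHECK(...)	((__VA_ARGS__) ? (void)0 : abort())

int
point_x_is_one(void)
{
	/*
	 * The comma inside the braces is not protected by parentheses, so
	 * a one-parameter macro would see two arguments here.
	 */
	CHECK((struct point){ 1, 2 }.x == 1);
	return (1);
}
```

A macro declared with a single named parameter would reject the CHECK()
line at preprocessing time, which is the situation the variadic form
avoids.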
Faraz Vahedi 157c184689 assert.h: Remove leading tabs for whitespace consistency
Signed-off-by:	Faraz Vahedi <kfv@kfv.io>
Reviewed by:	fuz
MFC after:	1 month
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2203
2026-05-30 15:43:51 +02:00
Faraz Vahedi c5c7d18d01 libc: Restrict the static_assert macro to pre-C23 modes
Signed-off-by:	Faraz Vahedi <kfv@kfv.io>
Reviewed by:	fuz
MFC after:	1 month
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2203
2026-05-30 15:43:51 +02:00
Faraz Vahedi 64502126e1 mdmfs: Use standard bool definition
Include `<stdbool.h>` instead of defining a local bool enum.
This avoids duplicating a standard type name and keeps the
source compatible with headers that provide bool as a macro,
or, in the case of C23, with compilers that provide it as a keyword.

Signed-off-by:	Faraz Vahedi <kfv@kfv.io>
Reviewed by:	fuz
MFC after:	1 month
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2203
2026-05-30 15:43:51 +02:00
Faraz Vahedi 60c11e7c54 rpcsvc: Remove obsolete bool definition from yp_prot.h
`yp_prot.h` has carried a SunRPC-era typedef of `bool` guarded by
`BOOL_DEFINED`, but the header itself does not use it. The YP/RPC
interfaces use `bool_t` for protocol booleans.

Defining `bool` in a public header collides with modern C headers
that provide `bool` as a macro or keyword, such as `<stdbool.h>`
and C23-aware assert handling. Drop the compatibility typedef and
leave `bool` definition to the consumer's language mode.

Signed-off-by:	Faraz Vahedi <kfv@kfv.io>
Reviewed by:	fuz
MFC after:	1 month
Pull Request:	https://github.com/freebsd/freebsd-src/pull/2203
2026-05-30 15:43:51 +02:00
Konstantin Belousov 201090678e imgact_elf: add sysctl kern.elfXX.phnums for the number of program headers
that are accepted in the activated image or interpreter.

Requested by:	jhb
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D57328
2026-05-30 15:56:40 +03:00
Ahmad Khalifa 2fa4bdd7f9 edk2: enable static asserts for *INT64 alignment
The ia32 loader is now built with -malign-double, so these should pass.

Differential Revision:	https://reviews.freebsd.org/D55386
2026-05-30 05:40:58 +03:00
Ahmad Khalifa d15cc7625d stand: compile ia32 EFI loader with -malign-double
The UEFI spec says:
> Structures are aligned on boundaries equal to the largest internal
> datum of the structure and internal data are implicitly padded to
> achieve natural alignment.

Unlike the old Intel EFI toolkit, the EDK2 headers expect ia32 builds to
use -malign-double to achieve this.

Make EFI versions of libsa32, liblua32, and ficl32, the difference
being that they are compiled with -malign-double.
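
The effect of the flag can be observed with a small layout probe.  This
is a sketch with a hypothetical structure, not one taken from the EDK2
headers.  Under the default i386 System V ABI a 64-bit member inside a
structure is aligned to 4 bytes, so the offset below is 4; with
-malign-double, or on amd64, it is 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure: a 32-bit field followed by a 64-bit field. */
struct efi_like {
	uint32_t	tag;
	uint64_t	value;
};

/* Offset of the 64-bit member as laid out by the current compiler flags. */
size_t
value_offset(void)
{
	return (offsetof(struct efi_like, value));
}

/* Non-zero when the 64-bit member sits on an 8-byte boundary. */
int
value_is_naturally_aligned(void)
{
	return (value_offset() % sizeof(uint64_t) == 0);
}
```

Code compiled with one layout and linked against code compiled with the
other would disagree about where "value" lives, which is presumably why
separate EFI builds of the libraries are made.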

Differential Revision:	https://reviews.freebsd.org/D55385
2026-05-30 05:40:39 +03:00
Mark Johnston f048a1a1de tests/ipsec: Run in parallel
Use execenv=jail to enable this.

MFC after:	1 week
2026-05-30 01:16:51 +00:00
Olivier Cochard e492ad08fc netlink/route: extend pre-2.6.19 Linux compat shim to del/getroute
Commit f34aca55ad ("netlink/route: provide pre-2.6.19 Linux compat shim",
2024-06) fixed the partial fix for net/bird2 on the netlink path by mapping the
legacy 8-bit struct rtmsg::rtm_table field onto the modern 32-bit RTA_TABLE
attribute when the latter is absent.

That fix, however, was only applied to rtnl_handle_newroute.  The two sibling
handlers, rtnl_handle_delroute and rtnl_handle_getroute, were left looking at
attrs.rta_table directly.  They are reachable from exactly the same client
(bird, in its netlink scan path), so any FIB number that fits in 8 bits
silently maps to RT_TABLE_UNSPEC in those handlers.
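
The fallback described above can be sketched as a single helper shared
by all three handlers.  The structure and function names below are
hypothetical and much simpler than the kernel's parsed attribute
structures; in particular, how the kernel detects an absent RTA_TABLE
is an assumption here, modelled as an explicit flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a parsed route request. */
struct route_req {
	uint8_t		rtm_table;	/* legacy 8-bit field of struct rtmsg */
	bool		has_rta_table;	/* was an RTA_TABLE attribute sent? */
	uint32_t	rta_table;	/* 32-bit RTA_TABLE attribute value */
};

/*
 * Pick the table (FIB) number: prefer RTA_TABLE when the client sent
 * it, otherwise fall back to the legacy rtm_table field.
 */
uint32_t
route_req_table(const struct route_req *req)
{
	if (req->has_rta_table)
		return (req->rta_table);
	return (req->rtm_table);
}
```

A legacy client that only fills in rtm_table = 3 then resolves to table
3 in every handler that goes through the helper, instead of only in the
one that adds routes.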

Reviewed by:	melifaro (previous version)
Approved by:	emaste
MFC after:	1 week
Sponsored by:	Netflix
2026-05-30 01:23:12 +02:00
Ed Maste 96dbc9a8de netlink: Check permissions for interface flag changes
Reviewed by:	pouria, melifaro
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D57332
2026-05-29 19:11:21 -04:00
Ed Maste 9ddb6064f8 netlink: Use early exit pattern in _nl_modify_ifp_generic
No functional change.

Reviewed by:	pouria, melifaro
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D57349
2026-05-29 19:11:21 -04:00
Stefan Eßer 11f23d7c07 tools/test/stress2/misc: Fix and enable new tests
The previously committed versions of these tests failed to prevent
duplicate file names in the list of files to process, leading to
missing files when a "mv" command tried to operate on a file that
had already been renamed.

The test for filenames containing UTF-16 surrogate pairs stays
disabled, since the required kernel changes have not been committed
yet.
2026-05-30 01:10:35 +02:00
Ed Maste 692b0ef150 syscalls.master: Allow clock_nanosleep in capability mode
It is akin to nanosleep(2) and does not access global namespaces.
It should be permitted in capability mode.
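
For reference, a relative sleep through this interface looks like the
sketch below.  The sleep_ms() wrapper is hypothetical, and the example
does not enter capability mode itself; it only shows the kind of call
that the change allows there.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <time.h>

/*
 * Sleep for the given number of milliseconds on the monotonic clock.
 * Returns 0 on success or an error number; clock_nanosleep() reports
 * errors through its return value rather than errno.
 */
int
sleep_ms(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	/* flags == 0 requests a relative sleep; remaining time is ignored. */
	return (clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL));
}
```

Like nanosleep(2), the call only takes a clock ID and time values, so
it names no file, process, or other global object.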

Reviewed by: vangyzen
Fixes: 3f8455b090 ("Add clock_nanosleep()")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D57343
2026-05-29 18:25:42 -04:00
Ed Maste 32a7ba251a route: Fix flush w/o specified address family
PR:		291867
Reported by:	gavin
Reviewed by:	pouria, melifaro
Sponsored by:	The FreeBSD Foundation
Fixes: c597432e22 ("route(8): convert to netlink")
Differential Revision: https://reviews.freebsd.org/D57336
2026-05-29 18:18:20 -04:00
Dag-Erling Smørgrav b5dce0ae4f login_class: Fix kqueues, pipebuf resource types
* kqueues is a count but is listed as a size

* pipebuf is a size but is listed as a count

PR:		295623
MFC after:	1 week
Fixes:          a4c04958f5 ("libutil: support RLIMIT_PIPEBUF")
Fixes:          85a0ddfd0b ("Add a resource limit for the total...")
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D57333
2026-05-30 00:06:44 +02:00
Dag-Erling Smørgrav dce6aff90b fts: Improve the description of FTS_NOSTAT
Note that we still need to stat directories and the roots.

MFC after:	1 week
Sponsored by:	Klara, Inc.
Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D57325
2026-05-29 19:45:10 +02:00
Dag-Erling Smørgrav b2b95249ae fts: Check link count before using it
* Check the range of the link count before trying to use it.

* Rewrite the comment explaining what the link count is used for.

MFC after:	1 week
Sponsored by:	Klara, Inc.
Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D57324
2026-05-29 19:45:06 +02:00
Dag-Erling Smørgrav 7ec549870f fts: Add some depth to the options test
MFC after:	1 week
Sponsored by:	Klara, Inc.
Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D57323
2026-05-29 19:45:01 +02:00
Sulev-Madis Silber ee41a88205 spi: switch to switch
use recommended switch with default case to catch invalid values
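
The pattern can be sketched as follows.  The commit message does not
say which values are being checked, so decoding an SPI mode number is
used here purely as an illustration, with hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical decoded clock settings for an SPI mode number. */
struct spi_clock {
	bool	cpol;	/* clock polarity */
	bool	cpha;	/* clock phase */
};

/*
 * Decode an SPI mode (0..3).  Returns 0 on success and -1 for any
 * other value, which the default case catches.
 */
int
spi_mode_decode(int mode, struct spi_clock *out)
{
	switch (mode) {
	case 0:
		out->cpol = false;
		out->cpha = false;
		break;
	case 1:
		out->cpol = false;
		out->cpha = true;
		break;
	case 2:
		out->cpol = true;
		out->cpha = false;
		break;
	case 3:
		out->cpol = true;
		out->cpha = true;
		break;
	default:
		/* Invalid value: reject it instead of falling through. */
		return (-1);
	}
	return (0);
}
```

Compared with a chain of if/else tests, the default case gives one
obvious place where out-of-range input is rejected.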

Reviewed by:	kevans, adrian
Differential Revision:	https://reviews.freebsd.org/D54759
2026-05-29 09:58:50 -07:00
Stefan Eßer aa029088ec tools/test/stress2/misc: Add msdosfs tests (currently failing)
Test msdos22.sh creates 1000 files with long random names consisting
of only ASCII characters. The mount is performed without -L option,
therefore no use of iconv to convert between character sets.

Test msdos23.sh mixes some non-ASCII characters into the file names.
The file system is therefore mounted with -L C.UTF-8 to include tests
of the conversions between UTF-8 and UTF-16.

Test msdos24.sh adds emojis to the names to test the (not yet
committed) support of UTF-16 surrogate pairs in filenames.

All 3 tests succeed with a small number of files (e.g., 10), but fail
most of the time when testing with 1000 files.

The tests have been added to all.exclude since they are expected to
fail. They shall be enabled as regression tests, when the msdosfs code
has been fixed.
2026-05-29 18:15:33 +02:00
Andrew Turner f6911b941f sys: Renumber MTE SEGV codes
Some third party software expects these to not conflict. As the MTE
support isn't fully in the tree, and these values aren't in a release
we can renumber them without any backwards compatibility issues.

Sponsored by:	Arm Ltd
2026-05-29 17:06:14 +01:00
Olivier Certner 851499046d MAC/do: Add consistency tests
Test that:
1. Concurrent changes to different parameters on the same jail are
   independent/atomic.
2. Inheritance works.
3. Relaxing only parent jail rules does not leak to a subjail thanks to
   sequential consistency.
4. Sysctl knobs and jail parameters stay consistent.

Some of these tests may be extended in the future with several layers of
jails (there is only a single subjail currently).

Reviewed by:    bapt
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:41:51 +02:00
Olivier Certner a95ff5ef7d MAC/do: Tests: Add support for exec paths, jail parameters, subjails
And also allow configuration of the mdo(1) executable path.

This commit only contains new or modified infrastructure.  No functional
change intended at this point.

Reviewed by:    bapt
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:41:36 +02:00
Olivier Certner 33daea3f86 MAC/do: Tests: Quote the source directory
In a standard test suite installation this is not necessary, but be
robust against custom ones, however improbable.

Reviewed by:    bapt
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:41:29 +02:00
Olivier Certner 6159187329 MAC/do: Tests: Declare required programs closer to use
Reviewed by:    bapt
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:41:24 +02:00
Olivier Certner b0c948fe92 MAC/do: Tests: Fix copyrights
No comma needed after a single year.  Add SPDX.

Reviewed by:    bapt
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:41:17 +02:00
Olivier Certner 79a987aba1 MAC/do: Tests: Remove shebang lines
They are automatically added by <bsd.test.mk>.

Reviewed by:    bapt
MFC after:      3 days
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:41:02 +02:00
Olivier Certner 39818654ae mac_do.4: Document executable paths, default jail values and consistency
While here, fix the bug of mentioning 'enable' as a possible value for
the 'mac.do' jail parameter whereas it is 'new' instead.

Reviewed by:    bapt
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:40:25 +02:00
Olivier Certner fcb0018634 MAC/do: Update copyright
Update years for the Foundation.

While here, remove the initial '/*-' which has been useless for a long
time.

While here, add a missing space on bapt@'s copyright line (approved by
him).

Reviewed by:    bapt
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:39:20 +02:00
Olivier Certner 1fa1e3f395 MAC/do: Do not skip blanks when parsing executable paths
The kind of tolerance we apply to parsing rules, whose format we have
defined, cannot be applied to paths since blank characters are allowed
there.

There is still the limitation that no escape character is currently
supported, and so it is not possible to configure a path having a ':'
character.

Reviewed by:    bapt
Fixes:          9818224174 ("MAC/do: Executable paths feature (GSoC 2025's final state)")
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:37:14 +02:00
Olivier Certner 4c98f7a002 MAC/do: Serialize installing/modifying some jail's configuration
See the immediately preceding commit for explanations on what this is
fixing.

When setting 'mac.do' to 'inherit' on a jail with 'mac.do.rules' and
'mac.do.exec_paths' also specified in the same call, ensure that the
check that these passed parameters are the same as those to be inherited
is atomic with respect to enabling the inheritance (i.e., removing the
jail's 'struct conf' object).  (See previous commit "MAC/do: Fix the
recent logic to set jail parameters, make it more tolerant" as for why
this check exists.)

Because we currently only modify a single configuration object per
transaction, we introduce the parse_and_commit_conf() wrapper around
parse_and_set_conf() to remove duplicated code that would ensue from
calling the latter directly, namely, releasing the 'mac_do_rwl' lock and
freeing the old configuration object (if any).

Taking the 'mac_do_rwl' lock for writing as a way to freeze all accesses
to mac_do(4) configurations was deemed too thin an operation to be worth
wrapping.

Reviewed by:    bapt (older version)
Fixes:          9818224174 ("MAC/do: Executable paths feature (GSoC 2025's final state)")
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:36:29 +02:00
Olivier Certner 0db7f110cb MAC/do: Support for atomically modifying configurations
As mentioned in previous commits "MAC/do: parse_and_set_conf(): Require
the model configuration" and "MAC/do: Sequential consistency for
configuration retrieval", the introduction of the "executable path"
feature and, more fundamentally, the fact that there is now more than one
per-jail parameter and that parameters can be independently modified or
copied, cause an atomicity problem in case of concurrent accesses to
a jail's applicable configuration.

Partially modifying a configuration is indeed akin to
a read-modify-write operation, where the read is either to the current
or an inherited configuration.  More precisely, once pointed to by
a jail, a configuration object is immutable, and changing the jail's
configuration means making the jail point to another configuration
object.  To change a jail's configuration, a new configuration object is
thus built, and if only some parameters have been explicitly specified,
those that have not been are set by copying the corresponding values
from an existing configuration object (in case of partial modification
of the existing configuration, from the original configuration object
that is going to be replaced; in case of breakage of inheritance or at
jail creation, from the applicable configuration object, which is on an
ancestor jail).  This process is not immune to concurrent modifications
because nothing prevents changes of configurations between reading
existing values and setting the new configuration.  Thus, some other
thread could change the value of a parameter after a copy of it has been
made into the new object but before that copy is actually installed,
which effectively will erase the other thread's modification.
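
The lost update can be replayed deterministically in a single thread.
The sketch below uses hypothetical types and integer stand-ins for the
two parameters; it is not the mac_do(4) code, only an illustration of
the interleaving described above and of what serializing the two
changes achieves.

```c
#include <assert.h>

/* Hypothetical immutable configuration with two independent parameters. */
struct conf {
	int	rules;
	int	exec_paths;
};

/* Hypothetical jail that points to its current configuration object. */
struct jail {
	const struct conf	*conf;
};

/* New object changing only 'rules'; the rest is copied from 'old'. */
struct conf
conf_with_rules(const struct conf *old, int rules)
{
	struct conf c = *old;

	c.rules = rules;
	return (c);
}

/* New object changing only 'exec_paths'; the rest is copied from 'old'. */
struct conf
conf_with_exec_paths(const struct conf *old, int exec_paths)
{
	struct conf c = *old;

	c.exec_paths = exec_paths;
	return (c);
}

/* Interleaved updates: returns the final exec_paths value. */
int
lost_update_demo(void)
{
	const struct conf initial = { .rules = 1, .exec_paths = 1 };
	struct conf t1_new, t2_new;
	struct jail j = { .conf = &initial };

	/* Thread 1 reads the current object and prepares its copy. */
	t1_new = conf_with_rules(j.conf, 2);
	/* Thread 2 performs a complete update in the meantime. */
	t2_new = conf_with_exec_paths(j.conf, 7);
	j.conf = &t2_new;
	/* Thread 1 installs its stale copy, erasing thread 2's change. */
	j.conf = &t1_new;
	return (j.conf->exec_paths);
}

/* Same updates, each one read-copy-install without interruption. */
int
serialized_demo(void)
{
	const struct conf initial = { .rules = 1, .exec_paths = 1 };
	struct conf t1_new, t2_new;
	struct jail j = { .conf = &initial };

	t2_new = conf_with_exec_paths(j.conf, 7);
	j.conf = &t2_new;
	t1_new = conf_with_rules(j.conf, 2);
	j.conf = &t1_new;
	return (j.conf->exec_paths);
}
```

In the interleaved replay the exec_paths value set by the second thread
is gone at the end, while in the serialized replay both changes
survive.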

To avoid this, we introduce support for serializing configuration
changes on a given jail.  To this end, we move the jail climbing process
from find_conf() into find_conf_locked(), and make the former call the
latter in a read-locked section.  Similarly, we isolate setting
a configuration in the new set_conf_locked() function, and make
set_conf() call it inside a write-locked section.  The new *_locked()
variants make it possible to prevent any configuration access between
determining and reading an applicable configuration, computing from it
a new configuration object and finally setting it, by holding a write
lock over the whole process (there is a trade-off here, as read-mostly
locks cannot be upgraded), effectively making it atomic and realizing
full sequential consistency of configuration changes.  Also, the
'mac_do_rm' global read-mostly lock is made sleepable so that it can be
write-locked over sysctl_handle*() functions or memory allocations
(eases implementation, at the expense of a potential loss of concurrency
which is most probably irrelevant).

No functional change (intended) at this point.

Reviewed by:    bapt
Fixes:          9818224174 ("MAC/do: Executable paths feature (GSoC 2025's final state)")
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:35:19 +02:00
Olivier Certner 5b194a4ae3 MAC/do: Sequential consistency for configuration retrieval
Since the inception of mac_do(4), find_conf(), used to retrieve the
applicable configuration, has been weakly consistent with respect to
concurrent modifications to configuration inheritance that influence its
result (and it has been sequentially consistent with respect to other
configuration modifications, which the initial executable paths feature
and introduction of implicit parameters broke and which will be fixed in
a subsequent commit).

Indeed, find_conf() climbs the jail tree to find an applicable
configuration, which is not an atomic operation.  It examines the
current jail's configuration pointer for each browsed jail, which does
not prevent concurrent modifications of the configuration pointer for
jails below or above it.  Modifications above the current jail are not
a problem, since if climbing needs to continue (i.e., the current jail
inherits), these modifications will be seen if performed before that
check (and may or may not be seen if performed after that check).
However, modifications below the current jail impair sequential
consistency, because they could be done before other modifications at
the current jail or higher up, and the latter could still be picked up
by the rest of the climb, effectively ignoring that the former should
have blocked the climb earlier, making them look as if they had happened
after for the climbing thread.

As a concrete example of this situation, let's examine a scenario where
some jail A is the parent of some jail B, and B inherits its
configuration from A.  An administrator may want to relax the rules only
for jail A but not B.  To this end, he first copies the current rules from
A over to B and then relaxes the rules on A.  He can intuitively and
reasonably expect that changing B's rules first will prevent A's relaxed
rules from leaking to threads in B.  Unfortunately, that is not the case: as
explained in the previous paragraph, there can be a time window where
threads from B can still pick up A's new configuration just after it has
been installed.  This arguably makes changing inheritance in mac_do(4)
in a fully secure fashion almost impossible.
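
The window can be replayed step by step in a single thread.  The types
and names below are hypothetical and far simpler than the real prison
and configuration structures; the sketch only reproduces the A/B
scenario, first with the climb interrupted between B and A, then with
the climb happening entirely after both changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical configuration object; 'relaxed' is 1 for relaxed rules. */
struct conf {
	int	relaxed;
};

/* Hypothetical jail node; a NULL 'conf' means "inherit from the parent". */
struct jail {
	struct jail		*parent;
	const struct conf	*conf;
};

const struct conf strict_rules = { .relaxed = 0 };
const struct conf relaxed_rules = { .relaxed = 1 };

/* Climb from 'j' to the first jail that holds an actual configuration. */
const struct conf *
climb(const struct jail *j)
{
	while (j != NULL && j->conf == NULL)
		j = j->parent;
	return (j != NULL ? j->conf : NULL);
}

/* Climb interrupted between B and A: returns 1 if B sees relaxed rules. */
int
racy_lookup(void)
{
	struct jail a = { .parent = NULL, .conf = &strict_rules };
	struct jail b = { .parent = &a, .conf = NULL };
	const struct jail *cur = &b;

	/* Thread in B: B inherits at this instant, so move on to A. */
	if (cur->conf == NULL)
		cur = cur->parent;
	/* Administrator: pin B to the current rules, then relax A. */
	b.conf = &strict_rules;
	a.conf = &relaxed_rules;
	/* Thread in B resumes the climb at A and sees the relaxed rules. */
	return (climb(cur)->relaxed);
}

/* Whole climb performed after both changes, as a single lock would force. */
int
serialized_lookup(void)
{
	struct jail a = { .parent = NULL, .conf = &strict_rules };
	struct jail b = { .parent = &a, .conf = NULL };

	b.conf = &strict_rules;
	a.conf = &relaxed_rules;
	return (climb(&b)->relaxed);
}
```

The interrupted climb ends up with A's relaxed rules even though B was
pinned first; the uninterrupted one stops at B and returns the strict
rules.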

If preserving fine-grained locking of prisons, we could prevent this
problem by having find_conf(), once it has climbed to a non-NULL pointer
(actual, non-inherited configuration), do another climb to check that it
can reach the same configuration on the same jail again.  If the new
climb gives another pointer or jail, it could make it the new candidate
and do a climb check again until the situation stabilizes.  A climb
check detects whether changes in jails below that of the candidate
configuration object happened, catching in particular such changes that
happened before changes to the candidate object.  However, that process
alone would still be subject to ABA problems, and we would additionally
need to tag each prison with some modification timestamp (global, or
local but necessitating allocating memory during the check) to fix them.

In the end, we considered this direction to be unnecessarily complex,
given that configuration changes are to be rare events and most uses
will just be configuration determination.

Consequently, switch to protecting jail configurations with a single
read-mostly lock.

While here, adapt set_conf() to accept NULL as the new configuration
object, and have remove_conf() call it, which removes duplicated code.

While here, add a comment explaining why we do not need to take any more
locks when climbing the jail tree.

Reviewed by:    bapt
MFC after:      1 month
Sponsored by:   The FreeBSD Foundation
Pull Request:   https://ron-dev.freebsd.org/FreeBSD/src/pulls/38
2026-05-29 17:34:04 +02:00