This simplifies the lock state management considerably since the
previously pushed type doesn't need to be tracked. Instead, 2 counters
are kept to track how many times each lock type has been pushed. When
the number of exclusive locks drops to 0, the lock transitions back to
shared.
This API is push rather than pull, which makes it much more
suitable to use cases like parsing a tar file from external
code.
Now, we have a large mess in this area internally because
the original file writing code was pull based, but static
deltas hit the same problem of wanting a push API, so I added
this special `OstreeRepoBareContent` just for writing regular
files from a push API.
Eventually...I'd like to deprecate the pull based API,
and rework things so that for regular files the push API
is the default, and then `write_content_object()` would
be split up into archive/bare cases.
In this world the `ostree_repo_write_content()` API would
then need to hackily bridge pull to push and it'd be
less efficient.
Anyways for now due to this bifurcation, this API only
works on non-archive repositories, but that's fine for
now because that's what I want for the `ostree-ext-container`
bits.
Continuation of the addition of `ostree_repo_write_regfile_inline()`.
This will be helpful for ostree-rs-ext and importing from tar, it's
quite inefficient and awkward for small files to end up creating
a whole `GInputStream` and `GFileInfo` and etc. for small files.
When working on ostree-ext and importing from tar, it's
quite inefficient and awkward for small files to end up creating
a whole `GInputStream` and `GFileInfo` and etc. for small files.
Plus the gtk-rs binding API to map from `impl Read` to Gio
https://docs.rs/gio/0.9.1/gio/struct.ReadInputStream.html
requires that the input stream is `Send` but the Rust `tar` API
isn't.
This is only 1/3 of the problem; we also need similar APIs
to directly create a symlink, and to stream large objects via
a push-based API.
If `glnx_make_lock_file` falls back to `flock`, on NFS this uses POSIX
locks (`F_SETLK`). As such, we need to be able to handle `EACCES` as
well as `EAGAIN` (see `fnctl(2)`).
I think this is what coreos-ostree-importer has been hitting, which runs
on RHEL7 in the Fedora infra and does locking over an NFS share where
multiple apps could concurrently pull things into the repo.
I've only noticed this by inspection. But I think it's possible for
`cleanup_txn_dir` to get called with the `staging-...-lock` file since
it matches the prefix.
Make the checking here stronger by verifying that it's a directory. If
it's not a directory (lockfile), then follow the default pruning expiry
logic so that we still cleanup stray lockfiles eventually.
The [dev-overlay](332c6ab3b9/src/cmd-dev-overlay)
script shipped in coreos-assembler mostly exists to deal
with the nontrivial logic around SELinux policy. Let's make
the use case of "commit some binaries overlaying a base tree, using
the base's selinux policy" just require a magical
`--selinux-policy-from-base` argument to `ostree commit`.
A new C API was added to implement this in the case of `--tree=ref`;
when the base directory is already checked out, we can just reuse
the existing logic that `--selinux-policy` was using.
Requires: https://github.com/ostreedev/ostree/pull/2039
Using fs-verity is natural for OSTree because it's file-based,
as opposed to block based (like dm-verity). This only covers
files - not symlinks or directories. And we clearly need to
have integrity for the deployment directories at least.
Also, what we likely need is an API that supports signing files
as they're committed.
So making this truly secure would need a lot more work. Nevertheless,
I think it's time to start experimenting with it. Among other things,
it does *finally* add an API that makes files immutable, which will
help against some accidental damage.
This is basic enablement work that is being driven by
Fedora CoreOS; see also https://github.com/coreos/coreos-assembler/pull/876
Append a byte encoding the OSTree object type for each object in the
metadata. This allows the commit metadata to be fetched and then for the
program to see which objects it already has for an accurate calculation
of which objects need to be downloaded.
This slightly breaks the `ostree.sizes` `ay` metadata entries. However,
it's unlikely anyone was asserting the length of the entries since the
array currently ends in 2 variable length integers. As far as I know,
the only users of the sizes metadata are the ostree test suite and
Endless' eos-updater[1]. The former is updated here and the latter
already expects this format.
1. https://github.com/endlessm/eos-updater/
If the object was already in the repo then the sizes metadata entry was
skipped. Move the sizes entry creation after the data has been computed
but before the early return for an existing object.
The object sizes hash table was only being cleared when the repo was
finalized. That means that performing multiple commits while the repo
was open would reuse all the object sizes metadata for each commit.
Clear the hash table when the sizes metadata is setup and when it's
added to a commit. This still does not fix the issue all the way since
it does nothing to prevent the program from constructing multiple
commits simultaneously. To handle that, the object sizes hash table
should be attached to the MutableTree since that has the commit state.
However, the MutableTree is gone when the commit is actually created.
The hash table would have to be transferred to the root file when
writing the MutableTree. That would be an awkward addition to
OstreeRepoFile, though. Add a FIXME to capture that.
After the corruption has been fixed with "ostree fsck -a --delete", a
second run of the "ostree fsck" command will print X partial commits
not verified and exit with a zero.
The zero exit code makes it hard to detect if a repair operation needs
to be run. When ever fsck creates a partial commit it should add a
reason for the partial commit to the state file found in
state/<hash>.commitpartial. This will allow a future execution of the
fsck to still return an error indicating that the repository is still
in the damaged state, awaiting repair.
Additional reason codes could be added in the future for why a partial
commit exists.
Text from: https://github.com/ostreedev/ostree/pull/1880
====
cgwalters commented:
To restate, the core issue is that it's valid to have partial commits
for reasons other than fsck pruned bad objects, and libostree doesn't
have a way to distinguish.
Another option perhaps is to write e.g. fsck-partial into the
statefile state/<hash>.commitpartial which would mean "partial, and
expected to exist but was pruned by fsck" and fsck would continue to
error out until the commit was re-pulled. Right now the partial stamp
file is empty, so it'd be fully compatible to write a rationale into
it.
====
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Closes: #1910
Approved by: cgwalters
We want a case where we can disable the min-free-space check. Initially,
it felt like to add a OSTREE_REPO_PULL_FLAGS_DISABLE_FREE_SPACE_CHECK but
the problem is prepare_transaction() does not have a OstreeRepoPullFlags
parameter which we can parse right here. On top of it, prepare_transaction()
enforces min-free-space check and won't let the transaction proceed if
the check failed.
This is pretty bad in conjunction with "inherit-transaction" as what
Flatpak uses. There is no way to disable this check unless we remove
it altogether from prepare_transaction.
This issue came out to light when flatpak wasn't able to write metadata
after fetching from remote:
[uajain@localhost ~]$ flatpak remote-info flathub org.kde.Platform//5.9
error: min-free-space-size 500MB would be exceeded
Metadata objects helps in housekeeping and restricting them means
restricting crucial UX (like search, new updates) functionalities
in clients like gnome-software. The error banners originated from
these issues are also abrupt and not much helpful to the user. This
is the specific instance of the issue this patches tries to address.
See https://github.com/flatpak/flatpak/issues/2139 for discussion.
Closes: #1779
Approved by: mwleeds
When falling back to copying objects when importing them into a
bare-user repo, we only actually need to transfer over the
`user.ostreemeta` xattr.
This allows the destination repo to be on a separate filesystem that
might not even support `security.selinux`. (I hit this while importing
over virtio-9p).
Closes: #1771
Approved by: cgwalters
I found this useful while hacking on rpm-ostree but I think it might be
useful enough to upstream. This stat is really helpful for validating
that a pipeline is hitting the devino cache sweet spot.
Closes: #1772
Approved by: cgwalters
The idea is that if the process is running as root, it can change
ownership of newly written files to match the owner of the repo.
Unfortunately, it currently applies in the other direction, too - a
non-root user writing to a root owned repository. If the repo is
writable by the user but owned by root, it can still create files and
directories there, but it can't change ownership of them.
This feature comes from
https://bugzilla.gnome.org/show_bug.cgi?id=738954. As it turns out, this
feature was never completed. It only works on content objects and not
metadata objects, refs, deltas, summaries, etc. Rather than try to fix
all of those, remove the feature until someone has interest in
completing it.
Closes: #1754
Approved by: cgwalters
There are use cases for libostree as a local content store
for content derived or delivered via other mechanisms (e.g. OCI
images, RPMs, etc.). rpm-ostree today imports RPMs into OSTree
branches, and puts the RPM header value as commit metadata.
Some of these can be quite large because the header includes
permissions for each file. Similarly, some OCI metadata is large.
Since there's no security issues with this, support committing
such content.
We still by default limit the size of metadata fetches, although
for good measure we make this configurable too via a new
`max-metadata-size` value.
Closes: https://github.com/ostreedev/ostree/issues/1721Closes: #1744
Approved by: jlebon
Copying the xattrs on metadata objects is wrong in general, we
don't "own" them. Notably this would fail in the situation of
doing a pull from e.g. a `bare-user` source to a destination
that was on a different mount point (so we couldn't hardlink),
and the source had e.g. a `security.selinux` attribute.
Closes: #1734Closes: #1736
Approved by: jlebon
Earlier, the actual reserved space (in blocks) were calculated inside the
transaction codepath ostree_repo_prepare_transaction(). However, while
reworking on ostree_repo_get_min_free_space_bytes() API, it was realized that
this calculation can be done independently from the transaction's codepaths, hence
enabling the usage for ostree_repo_get_min_free_space_bytes() API irrespective
of whether there is an ongoing transaction or not.
https://github.com/ostreedev/ostree/issues/1720Closes: #1722
Approved by: pwithnall
when converted to bytes
In a subsequent commit, we add a public API to read the value of
min-free-space-* value in bytes. The value for free space check
is enforced in terms of block size instead of bytes. Therefore,
for consistency we check while preparing the transaction that the
value doesn't overflow when converted to bytes.
https://phabricator.endlessm.com/T23694Closes: #1715
Approved by: cgwalters
Mildly bikeshed, though I find the name `auto-update-summary` to be
easier to grok than `change-update-summary`. I think it's because it can
be read as "verb-verb-noun" rather than "noun-verb-noun".
Closes: #1693
Approved by: mwleeds
This commits adds and implements a boolean repo config option called
"change-update-summary" which updates the summary file every time a ref
changes (additions, updates, and deletions).
The main impetus for this feature is that the `ostree create-usb` and
`flatpak create-usb` commands depend on the repo summary being up to
date. On the command line you can work around this by asking the user to
run `ostree summary --update` but in the case of GNOME Software calling
out to `flatpak create-usb` this wouldn't work because it's running as a
user and the repo is owned by root. That strategy also means flatpak
can't update the repo metadata refs for fear of invalidating the
summary.
Another use case for this relates to LAN updates. Specifically, the
component of eos-updater that generates DNS-SD records advertising ostree
refs depends on the repo summary being up to date.
Since ostree_repo_regenerate_summary() now takes an exclusive lock, this
should be safe to enable. However it's not enabled by default because of
the performance cost, and because it's more useful on clients than
servers (which likely have another mechanism for updating the summary).
Fixes https://github.com/ostreedev/ostree/issues/1664Closes: #1681
Approved by: jlebon
In `write_metadata_object()`, make sure when creating tombstone commits
that we're actually passed an expected checksum to use.
In `write_dir_entry_to_mtree_internal()`, sanity check that `dfd_iter`
is indeed not `NULL` before trying to dereference it.
Discovered by Coverity.
Closes: #1692
Approved by: cgwalters
Since min_free_space_size_mb is considered before min_free_space_percent
in min_free_space_calculate_reserved_blocks(), it has to be considered
first when generating the error message in order for it to be accurate.
Closes: #1691
Approved by: jlebon
Running `ostree commit --tree=ref=a --tree=dir=b` involves reading the
whole of a into an `OstreeMutableTree` before composing `b` on top. This
is inefficient if a is a complete rootfs and b is just touching one file.
We process O(size of a + size of b) directories rather than
O(number of touched dirs).
This commit makes `ostree commit` more efficient at composing multiple
directories together. With `ostree_mutable_tree_fill_empty_from_dirtree`
we create a lazy `OstreeMutableTree` which only reads the underlying
information from disk when needed. We don't need to read all the
subdirectories just to get the checksum of a tree already checked into the
repo.
This provides great speedups when composing a rootfs out of multiple other
rootfs as we do in our build system. We compose multiple containers
together with:
ostree commit --tree=ref=base-rootfs --tree=ref=container1 --tree=ref=container2
and it is much faster now.
As a test I ran
time ostree --repo=... commit --orphan --tree=ref=big-rootfs --tree=dir=modified_etc
Where modified_etc contained a modified sudoers file under /etc. I used
`strace` to count syscalls and I seperatly took timing measurements. To
test with a cold cache I ran
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
Results:
| | Before | After |
| -------------------- | ------ | ----- |
| Time (cold cache) | 8.1s | 0.12s |
| Time (warm cache) | 3.7s | 0.08s |
| `openat` calls | 53589 | 246 |
| `fgetxattr` calls | 78916 | 0 |
I'm not sure if this will have some negative interaction with the
`_ostree_repo_commit_modifier_apply` which is short-circuited here. I
think it was disabled for `--tree=ref=` anyway, but I'm not certain. All
the tests pass anyway.
I originally implemented this in terms of the `OstreeRepoFile` APIs, but
it was *way* less efficient, opening and reading many files unnecessarily.
Closes: #1643
Approved by: cgwalters