There's still some silliness here, but there is now only one opcode
open-splice-and-close, that writes a single chunk from the payload.
This is really all we need for metadata, and small content objects are
also fine with this.
We get some deduplication between content objects by creating a
dictionary for (uid,gid,mode) tuples and xattrs.
This still keeps the operation/payload code in, so we could do
rollsums in a future update easily.
I was hitting a bug in libguestfs/guestmount/FUSE where it blew up
with EINVAL on directories containing lots of files (more than
32000?). We really want to use prefixed subdirs just like the real
objects/ directory does.
This allows us to share more code between the paths, is more
efficient, etc.
gnome-continuous uses the ostree_repo_scan_hardlinks() mode to
avoid re-checksumming everything. However, when I ported the commit
code to use openat() and friends, this optimization was lost.
Re add it. The difference is about 15s versus 5 minutes.
This follows up from the previous commit; now that pull knows how to
do the efficient link() or copy for local files, we can just have
pull-local call into ostree_repo_pull().
As part of this:
- pull() can also accept a file:/// URI instead
of a remote name (since pull local supports anonymous pulls)
- pull() knows an "override-remote-name" option, since pull-local
supported writing a ref out even if there wasn't a remote with
that name
It's always been suboptimal to have both pull and pull-local; as we go
beyond the raw object data into things like deltas and summary files,
the logic to perform e.g. mirroring should only be in one place.
This will be used by Pulp's OSTree content plugin at least to perform
promotions.
When doing a pull --mirror from an archive-z2 repository into another
archive-z2 repository, currently we gunzip/checksum/gzip each content
object. The re-gzip process in particular is fairly expensive.
This does assume that the upstream content is trusted and correct.
It'd be nice in the future to do at least a CRC check, if not the full
checksum. (Could we append CRC data to the end of filez objects?)
We could also choose to only do this optimization if fetching over
TLS.
before: 1626 metadata, 20320 content objects fetched; 299634 KiB transferred in 62 seconds
after : 1626 metadata, 20320 content objects fetched; 299634 KiB transferred in 11 seconds
For future delta work where we do more interesting things than just
"tar of new objects", this lays the groundwork for doing streaming
writes into content objects.
It's also more efficient, as we avoid many intermediate allocations
and virtual calls. Just a single `g_output_stream_write_all` for the
splice case.
Conflicts:
src/libostree/ostree-repo-private.h
src/libostree/ostree-repo-static-delta-processing.c
We could just make everything relative to this, but the objects/ and
tmp/ are accessed very often, so I think it's worth holding individual
fds.
This fd can cover everything else: refs, deltas, etc.
Do not write directly to objects/ but maintain pulled files under tmp/
with a "tmpobject-$CHECKSUM.$OBJTYPE" name until they are syncfs'ed to
disk.
Move them under objects/ at ostree_repo_commit_transaction cleanup
time.
Before (test done on a local network):
$ LANG=C sudo time ./ostree --repo=repo pull origin master
0 metadata, 3 content objects fetched; 83820 KiB; 4 delta parts
fetched, transferred in 417 seconds
16.42user 6.73system 6:57.19elapsed 5%CPU (0avgtext+0avgdata
248428maxresident)k
24inputs+794472outputs (0major+233968minor)pagefaults 0swaps
After:
$ LANG=C sudo time ./ostree --repo=repo pull origin master
0 metadata, 3 content objects fetched; 83820 KiB; 4 delta parts
fetched, transferred in 9 seconds
14.70user 2.87system 0:09.99elapsed 175%CPU (0avgtext+0avgdata
256168maxresident)k
0inputs+794472outputs (0major+164333minor)pagefaults 0swaps
https://bugzilla.gnome.org/show_bug.cgi?id=728065
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
subscription-manager has a daemon that runs in a confined domain,
and it doesn't have permission to write usr_t, which is the default
label of /ostree/deploy/$osname/deploy.
A better long term fix is probably to move the origin file into the
deployment root as /etc/ostree/origin.conf or so.
In the meantime, let's ensure the .origin files are labeled as
configuration.
If an object already existed and we somehow tried to pull it, the
caller would still expect a returned checksum.
This appears to happen with static deltas for some reason; we might be
including duplicate metadata objects. Regardless, this is a bug that
should be fixed.
We have a chain of checksums from the root up until here. While doing
checksums of the objects individually would be a good redundancy check
for test cases and the like, when doing a pull there's no good reason
to burn cycles on SHA256.
This caused deadlocks and/or EMFILE due to the interaction between
threads and fds. What we really want here is a better pull-based
model for parsing content objects.
Another idea would be to change static deltas so that content objects
have a special opcode that includes their metadata first, and then do
rollsums etc. only over actual content.
This will avoid too many open files at the same time that could cause
an EMFILE error.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
(cherry picked from commit bc092b06f0e34e93f7d6102957bf55fd7ffd1b9e)
This is a noticeable cleanup, and fixes another big user of GFile* in
performance/security sensitive codepaths.
I'm specifically making this change because the static deltas code was
leaking temporary files, and cleaning that up nicely would be best if
we were fd relative.
This will be used for a later change to use openat() for the fetching
code. Note that we drop the code to use mmap() - it was an attempt to
avoid keeping a fd open, but we do correctly close anyways.
You create these with something like:
ostree static-delta generate --empty --to=master
These will be automatically used during pull if no previous revision
exists in the target repo.
These work very much like the normal static deltas except they
are named just by the "to" revision. I.e:
deltas/94/f7d2dc23759dd21f9bd01e6705a8fdf98f90cad3e0109ba3f6c091c1a3774d
for a from-scratch to 94f7d2dc23759dd21f9bd01e6705a8fdf98f90cad3e0109ba3f6c091c1a3774d delta.
https://bugzilla.gnome.org/show_bug.cgi?id=721799
The current layout uses a prefix of two bytes as the initial dir
and a second directory inside that with the superblock. This
updates the list code to handle that.
https://bugzilla.gnome.org/show_bug.cgi?id=721799
Regression from 86764dbf00
This function is kind of fiendish now that we have 3 cases, each of
which want to be optimized somewhat to only load what's necessary
(e.g. don't open the file if we don't have an output for stream
requested).
Clean things up so that BARE_USER and BARE are separate conditionals
that share as much as possible, and fix the bug that asserted we
were in BARE mode.
I tested this by running test-basic-user.sh by hand.
Some use cases for checkouts don't need to fsync during checkout.
Installer programs for example will just do a global fsync at the end.
In the future, the default "ostree admin" core could also be
rearchitected to only do a transaction commit right before reboot, and
do the fsync then.
https://bugzilla.gnome.org/show_bug.cgi?id=742482
This is just an efficiency optimization. We're getting fairly close
to all of the hot code paths using `*at()`.
Note that we end up maintaining a half-duplicate code path set here,
because we still need to support commits from an arbitrary GFile *,
which in a possible common case is an OSTree commit.
I think it's worth it though.
rpm-ostree had code to check for this, which didn't actually work.
I don't see a no backwards compatibility concern in changing this, as
it's unlikely a caller would try to sensibly disambiguate FAILED.
We were already using openat() for the contents, but not the xattrs.
Now that libgsystem 2014.3 has gs_fd_get_all_xattrs(), make use of it.
Clean things up a bit so we only open the fd once.
For Anaconda, I needed OSTREE_REPO_REMOTE_CHANGE_ADD_IF_NOT_EXISTS,
with the GFile *sysroot argument to avoid ugly hacks. We want to
write the content provided via "ostreesetup" as a remote to the target
chroot only in the case where it isn't provided as part of the tree
content itself.
This is also potentially useful in idempotent systems management tools
like Ansible.
https://bugzilla.gnome.org/show_bug.cgi?id=741577
ostree_repo_pull_with_options() needs this, and I'd rather keep the
OstreeRemote struct definition tucked away in ostree-repo.c with its
own internal API.
OstreeRemote is a reference-counted struct that encompasses data about a
remote, whether read from a configuration file or created explicitly via
ostree_repo_remote_add().
OstreeRemotes are held in an internal table indexed by remote name.
This solves some problems caused by merging system-wide remote data into
the OstreeRepo's internal config key file.
Also fixes https://bugzilla.gnome.org/show_bug.cgi?id=740911
This format is pretty much the same as the "bare" format, except the
file ownership and xattrs is not stored in the actual filesystem object, but
rather on the side in a user xattr. This means two things:
1) An unprivileged user can store such a repo independent of the types
of files in it or their xattrs. And you can later (as root)
reconstruct the real filesystem tree with ownership. Although you
can't do that using hardlink-sharing. This also means ostree
fsck does a full verification.
2) Such a repository can be checked out with user-mode (checkout -U)
as an unprivileged user using hardlinks for space sharing.
Additionally, symlinks are stored as regular files (with the content
being the symlink target) because user xattrs are not supported on
symlinks. We know at checkout time if the file is a symlink because
the original st_mode is stored in the xattr metadata.
https://bugzilla.gnome.org/show_bug.cgi?id=741125
Applying xattrs on a symlink during checkout failed since
it was setting the xattrs on the final filename, not the
temporary name.
This made the "checkout union 1" test in test-basic.sh
fail.
https://bugzilla.gnome.org/show_bug.cgi?id=741125
When commiting a symlink we do store the uid/gid of the actual
symlink (i.e. not target). However, this was not restored
on non-user-mode checkout as it should.
This commit fixes that, and additionally it ensures xattrs
on symlinks are not set in user-mode checkout.
https://bugzilla.gnome.org/show_bug.cgi?id=741125
rpm-ostree at least has the option to generate a tree with just that
instead of /boot, but while we were enumerating the latter, we'd still
return paths from /boot.
https://bugzilla.gnome.org/show_bug.cgi?id=740947
In Anaconda, we're using "ostree admin --sysroot=/mnt/sysimage
instutil set-kargs", and it was working before, but newer versions of
lorax strip out /etc/system-release which grub2 wants.
That was wrong anyways as we want the /etc/system-release from the
target root.
(Man, grub2 sucks...give me a declarative config file format I can just
write)
https://bugzilla.gnome.org/show_bug.cgi?id=740697
Make _ostree_fetcher_request_uri_with_partial_async and
ostree_fetcher_stream_uri_async simple wrapper around the same
function, all the requests are created in the same place now.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Rename _ostree_fetcher_contents_membuf_sync to
ostree_fetcher_request_uri_to_membuf and drop unused argument
user_data.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
_ostree_fetcher_query_state_text() and_ostree_fetcher_get_n_requests()
have no callers, so remove them.
If they will be needed, they can be easily copied back from the git
history.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Use the pattern:
$PRETTY_NAME [$COMMIT_VERSION] (ostree[:$OSNAME][:$DEPLOYMENT_INDEX])
$OSNAME is only shown if there are multiple values.
$COMMIT_VERSION refers to the version tag in the commit's metadata.
$DEPLOYMENT_INDEX is only shown if no $COMMIT_VERSION is available.
https://bugzilla.gnome.org/show_bug.cgi?id=739416
src/libostree/ostree-repo-pull.c:1676:22: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We potentially need a lot of argument types for pull. Rather than
have a C function with tons of arguments, let's use a GVariant a{sv}
as a handy extensible (and immutable) bag of properties.
This is prepratory work for adding an option to pull to traverse
history.
https://bugzilla.gnome.org/show_bug.cgi?id=737844
fixes a coredump when using a command like:
$ ostree --repo=repo checkout -U --subpath=/usr/lib/passwd \
fedora-atomic/rawhide/x86_64/docker-host usrlib-new
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We need basic support for UEFI - many newer servers don't support
BIOS compatibility mode anymore.
However, this patch only implements non-atomic because UEFI is FAT, and
we can't do the previous design for OSTree of atomic swap of
/boot/loader.
The Fedora/RHEL UEFI layout has the kernels on a "real" /boot
partition, and /boot/efi/EFI/$vendor just holds the grub2 UEFI binary
and grub.cfg.
Following this, /boot/loader is still on the OS boot partition, and we
still atomically swap it. This potentially paves the way to atomic
upgrades in the future.
https://bugzilla.gnome.org/show_bug.cgi?id=724246
Some package systems need to be run as root, so the process linking to
libostree may also be root. However, it's reasonable to have the
target repository be owned by a uid other than root.
This patch makes it Just Work by chowning the file content to match.
Note this only operates on archive-z2 repositories, because you can't
usefully serve bare repositories via HTTP.
https://bugzilla.gnome.org/show_bug.cgi?id=738954
For Anaconda, we have an ugly bootstrapping problem where we need to
add the remote to the repository's config, then do a pull+deploy, then
remove and re-add the config, because /etc/ostree/remotes.d doesn't
exist yet in the target system.
https://bugzilla.gnome.org/show_bug.cgi?id=738698
While we did support disabling the uncompressed-objects-cache
per-repository:
1) We didn't actually respect that operation when doing
CHECKOUT_MODE_USER on archive-z2 repositories
2) It'd be better to automatically detect we can't write to the
repo and disable the uncompressed cache then.
In this approach, we drop a /etc/grub.d/15_ostree file which is a
hybrid of shell/C that picks up bits from the GRUB2 library (e.g. the
block device script generation), and then calls into libostree's
GRUB2 code which knows about the BLS entries.
This is admittedly ugly. There exists another approach for GRUB2 to
learn the BLS specification. However, the spec has a few issues:
https://www.redhat.com/archives/anaconda-devel-list/2014-July/msg00002.html
This approach also gives a bit more control to the admin via the
naming of the 15_ostree symlink; they can easily disable it:
Or reorder the ostree entries ahead of 10_linux:
Also, this approach doesn't require patches for grub2, which is an
issue with the pressure to backport (rpm-)OSTree to EL7.
src/libostree/ostree-repo.c:1759: Warning: OSTree:
ostree_repo_import_object_from: unknown parameter 'checksum' in
documentation comment, should be 'sha256'
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Some operating systems may come with external tools for subscription
management that drive access to the content. In that case, the origin
file may not be useful (for example, it could refer to an installer
ISO).
This patch will allow OS installers to inject that state, with a
useful error message, directing the system administrator to an
external tool.
See: https://github.com/projectatomic/rpm-ostree/issues/31https://bugzilla.gnome.org/show_bug.cgi?id=737686
Now that we have a summary file, we can use it to allow a simple:
ostree pull --mirror
To download the latest commit on every branch. Also, for a case I'm
dealing with there's only one branch, but I don't want mirror users to
have to hardcode it.
https://bugzilla.gnome.org/show_bug.cgi?id=737807
And use it in pull-local. As one might expect, this is blazingly fast
if they're on the same filesystem.
I'll be using this to "promote" builds between different repositories.
Add command line arguments:
--import-proc-cmdline: import values from /proc/cmdline
--merge: import current values
--replace=ARG=VALUE: replace value
--append=ARG=VALUE: append a new argument
Extra command line arguments are treated like --append=, which
gives backwards compatibility.
https://bugzilla.gnome.org/show_bug.cgi?id=731051
Previously, in the case where a parent directory of a modified config
file was removed, we would throw an exception. This happens when
switching from a tree that has some software (e.g. firewalld), to one
that does not.
While it's nice to have this warning that your config file probably no
longer applies, there's no need to make it so...fatal.
It's particularly problematic that the only easy workaround is to
remove the config files from your current tree - which breaks
rollback.
The solution then is for for us to take ownership of the parent
directories too into the new /etc. Admins can clean up these files
afterwards at any time.
https://bugzilla.gnome.org/show_bug.cgi?id=734293
This fixes a regression introduced with https://git.gnome.org/browse/ostree/commit/?id=7baa600e237b326899de2899a9bc54a6b863943c
The original code in "ostree admin upgrade" had a comment:
/* Here we perform cleanup of any leftover data from previous
* partial failures. This avoids having to call gs_shutil_rm_rf()
* at random points throughout the process. */
But since I deleted that initial cleanup call, we *do* need to do the
cleanup during the process run. It turns out there are only a few
places this is necessary.
https://bugzilla.gnome.org/show_bug.cgi?id=733030
While looking to fix a different bug here, I found the current
state of things where we had a mix of fd-relative API versus not
frustrating.
Change the code around to consistently use *at, and also add some more
tests.
For Fedora and potentially other distributions which use globally
distributed mirrors, metalink is a popular solution to redirect
clients to a dynamic set of mirrors.
In order to make metalink work though, it needs *one* file which can
be checksummed. (Well, potentially we could explode all refs into the
metalink.xml, but that would be a lot more invasive, and a bit weird
as we'd end up checksumming the checksum file).
This commit adds a new command:
$ ostree summary -u
To regenerate the summary file. Can only be run by one process at a
time.
After that's done, the metalink can be generated based on it, and the
client fetch code will parse and load it.
https://bugzilla.gnome.org/show_bug.cgi?id=729585
Otherwise, we're potentially holding up subsequent requests.
I was hitting this when testing the metalink code, where we want to
continue doing more fetches after hitting a 404.
https://bugzilla.gnome.org/show_bug.cgi?id=729585
Changes the pull API to allow pulling only a single directory instead
of the whole deployment. This option is utilized by the check-diff
option in rpm-ostree.
Add a new state directory to hold <checksum>.commitpartial files, so
we know that we've only downloaded partial state.
I noticed OSTree was a bit slower, did some investigation
and saw we were enumerating all objects for things like
$ ostree rev-parse blah
Since "blah" can never be an object (because of the 'l' and 'h'), just
return no matches.
The user might "ostree ls /usr/bin/bash/blah", which previously would
segfault.
A somewhat related future enhancement here would be for "ostree ls" to
follow symbolic links.
Reported-by: Dusty Mabe <dustymabe@gmail.com>
https://bugzilla.gnome.org/show_bug.cgi?id=733476
Prune has worked fine on bare repositories for some time, but now that
I finally try to delete data on the server side, I notice we weren't
actually enumerating content objects =/
That caused them to not be pruned.
https://bugzilla.gnome.org/show_bug.cgi?id=733458
The prune API duplicated logic to delete objects, and furthermore the
core API to delete an object didn't clean up detached metadata.
Fix the duplication by doing the obvious thing: prune should call
_delete.
https://bugzilla.gnome.org/show_bug.cgi?id=733452
This patch adds a function that will parse a partial checksum when
resolving a refspec. If the inputted refspec matches a truncated
existing checksum, it will return that checksum to be parsed. If
multiple truncated checksums match the partial refspec, it is not
unique and will return false. This addition is inspired by the same
functionality in Docker, which allows a user to reference a specific
commit without typing the entire checksum.
partial checksums: Add function to abstract comparison
This modifies the list_objects and list_objects_at functions
to take an additional argument for the string that a commit starts
with. If this string arg is not null, it will only list commit
objects beginning with that string. This allows for a new function
ostree_repo_list_commit_objects_starting_with to pass a partial string
and return a list of all matching commits. This improves on the
previous strategy of listing refs because it will list all commit objects,
even ones in past history. This update also includes bugfixes on
error handling and string comparison, and changes the output structure
of resolve_partial_checksum. The new strcuture will no longer return FALSE
without error. Also, the hashtable foreach now uses iter. Also
includes modified test file
Some organizations will want to use private Certificate Authorities to
serve content to their clients. While it's possible to add the CA
to the system-wide CA store, that has two drawbacks:
1) Compromise of that cert means it can be used for other web traffic
2) All of ca-certificates is trusted
This patch allows a much stronger scenario where *only* the CAs in
tls-ca-path are used for verification from the given repository.
https://bugzilla.gnome.org/show_bug.cgi?id=726256
We were using unsigned size when we should have been using signed,
this means we basically weren't checking for errors on write...ouch.
Luckily if we e.g. hit ENOSPC during a pull, the checksums wouldn't
match and we'd return an error anyways. However when writing an
object, we'd end up silently ignoring it =/
https://bugzilla.gnome.org/show_bug.cgi?id=732020
The generic GKeyFile error isn't quite informative enough here.
I hit this with the new compose process where we don't automatically
inject a configured remote into the generated disk images; we expect
people to add them.
https://bugzilla.gnome.org/show_bug.cgi?id=731346
There's several use cases for calling into ostree itself to do
mirroring, instead of using bare rsync. For example, it's a bit more
efficient as it doesn't require syncing the objects/ directory.
https://bugzilla.gnome.org/show_bug.cgi?id=728351
In the case of running ostree as non-root on a regular filesystem (not
tmpfs which doesn't support immutable), we should just silently do
nothing if we encounter EPERM. Cache the result to avoid spam in
strace.
https://bugzilla.gnome.org/show_bug.cgi?id=728006
We weren't installing the headers, but at the moment all symbols
starting with ostree_ were being exported. Fix that by prefixing
non-static symbols with '_'.
https://bugzilla.gnome.org/show_bug.cgi?id=731369
Finally, fsync to ensure all entries are on disk, unless disabled.
We support disabling this for cases like server-side buildroot
construction where we don't need to be robust against power loss
This prevents people from creating new directories there and expecting
them to be persisted. The OSTree model has all local state to be in
/etc and /var.
This introduces a compile-time dependency on libe2fsprogs.
We're only doing this for the root directory at the moment.
https://bugzilla.gnome.org/show_bug.cgi?id=728006
If fetching GPG-signed commits over plain HTTP, a MitM attacker can
fill up the drive of targets by simply returning an enormous stream
for the commit object.
Related to this, an attacker can also cause OSTree to perform large
memory allocations by returning enormous GVariants in the metadata.
This helps close that attack by limiting all metadata objects to 10
MiB, so the initial fetch will be truncated.
But now the attack is only slightly more difficult as the attacker
will have to return a correctly formed commit object, then return a
large stream of < 10 MiB dirmeta/dirtree objects.
https://bugzilla.gnome.org/show_bug.cgi?id=725921