This large patch moves the core xattr logic down into libgsystem,
which allows the gs_shutil_cp_a() API to copy them. In turn, this
allows us to just use that API instead of rolling our own recursive
copy here.
As noted in the new comment though, one case that we are explicitly
regressing is where the new /etc removes a parent directory that's
needed by a modified file. This seems unlikely for most vendors now,
but let's do that as a separate bug.
https://bugzilla.gnome.org/show_bug.cgi?id=711058
Several APIs in libostree were moved there from the commandline code,
and have hardcoded g_print() for progress and notifications. This
isn't useful for people who want to write PackageKit backends, custom
GUIs and the like.
From what I can tell, there isn't really a winning precedent in GLib
for progress notifications.
PackageKit has the model where the source has GObject properties that
change as async ops execute, which isn't bad...but I'd like something
a bit more general where say you can have multiple outstanding async
ops and sensibly track their state.
So, OstreeAsyncProgress is basically a threadsafe property bag with a
change notification signal.
Use this new API to move the GSConsole usage (i.e. g_print()) out from
libostree/ and into ostree/.
Add a --generate-sizes option to commit to add size information to the
commit metadata. This will be used by higher level code which wants
to determine the total size necessary for downloading.
The gpgme examples use this, but from what I can tell we don't really
need to because we don't need detailed results; we only care whether
we signed it at all.
This uses gpgv for verification against DATADIR/ostree/pubring.gpg by
default. The keyring can be overridden by specifying OSTREE_GPG_HOME.
Add a unit test for commit signing with gpg key and verifying on pull;
to implement this we ship a test GPG key generated with no password
for Ostree Tester <test@test.com>.
Change all of the existing tests to disable GPG verification.
Add an optional dependency on gpgme to add GPG signatures into the
detached metadata, with the key "ostree.gpgsigs", as an "aay", an
array of signatures (treated as binary data).
The commit command gains a --gpg-sign=<key-id> argument. Also add an
argument --gpg-homedir to set the GPG homedir where we look for
keyrings.
read_commit resolves the ref to a commit, and a lot of consumers want
the resolved commit for their own purposes; this prevents them from
calling resolve_rev themselves.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
We want an OstreeRepoFile to be the way to represent a filesystem tree
inside an ostree repository. In order to do this, we need to drop the
commit from an OstreeRepoFile, and make that go to callers.
Switch all current users of ostree_repo_file_new_root to
ostree_repo_read_commit, and make the actual constructor private.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
This is a relic from long ago when we were trying to stage objects
before finally committing them all in one go in the pull code.
We're no longer doing that, so stop trying to make the directory.
This also fixes trying to use ostree as non-root to read the
root-owned repo, since we'd fail to create the pending dir.
Clean up how we deal with the uncompressed object cache; we now use
openat()/linkat() and such just like we do for the main objects/.
Use linkat() between the objects and the destination, if possible.
https://bugzilla.gnome.org/show_bug.cgi?id=707733
We can't quite do it for bare repositories yet because we need to have
a way to go from struct stat -> GFileInfo, and that's buried in gio's
private GLocalFile class.
Just use openat() for locating variants, rather than doing the lstat()
+ open(). This also drops several malloc+object allocations from the
lookup path.
Add new internal API to both fstatat() and write a pathname for the
given object. Use it in commit, and also wrapped in the old
GFile-based API.
This is more efficient.
Rather than having separate write_ref calls, make clients start a
transaction, add some refs, and then commit it. While this doesn't
make it 100% atomic, it makes it easier for us to use an atomic
model, and it means we don't do as much I/O updating the summary
file and such.
https://bugzilla.gnome.org/show_bug.cgi?id=707644
An earlier version of this API acted like git in that some objects
would be staged in a temporary directory which would be then committed
in one go by moving files around. The API doesn't match most users
expectations though, as while the stage is nice as a high-level API
it isn't really suited for low-level APIs.
While the stage was removed, the APIs were never renamed. Rename
them now so that they match expectations.
https://bugzilla.gnome.org/show_bug.cgi?id=707644
Update libgsystem submodule for a bugfix.
This is both more efficient from a kernel perspective, and avoids us
calling gs_file_get_path_cached() on tmp_dir constantly, which
triggered another bug due to lack of locking.
It's now isolated almost entirely to ostree-core.c, except
ostree-repo.c needs to know how to create archive-z2 file headers. So
give it a private API for that.
See the new comment in the source; basically if we're fetching content
over http, then someone with the capability to MITM the network could
create a transient setuid binary on disk with arbitrary content. If
they also had a process running on the system (such as an application)
it could be escalated to root.
https://bugzilla.gnome.org/show_bug.cgi?id=707139
Pull the cleanup code to a helper function, and ensure we delete
leftover temporary files also when aborting a transaction. Mainly
this will happen if a local 'ostree commit' fails.
While we're here, also change it to use gs_shutil_rm_rf() which also
handles directories, should we start using those.
Reviewed-by: Jeremy Whiting <jpwhiting@kde.org>
It turns out every builtin (with one special exception) that takes a
repo argument did the same thing; let's just centralize it. The
special exception was "ostree init --repo=foo" where foo is expected
to *not* actually be a repo. In that case, simply skip the
ostree_repo_check() invocation.
https://bugzilla.gnome.org/show_bug.cgi?id=706762
We'll always have "bare" mode for keeping files-as-hardlinks as root.
But "archive" was my second attempt at a format for non-root file
storage, used by the gnome-ostree buildsystem which runs as non-root.
It was really handy to have a "tar" like mode where I can create
tarballs as a user, that contain files owned by root for example.
The "archive" mode stored content files as two pieces in the
filesystem; ".file" contained metadata, and ".filecontent" was the
actual content, uncompressed. The nice thing about this was that to
check out a tree as non-root, you could just hardlink into the repo.
However, archive was fairly bad for serving via HTTP; it required
*two* HTTP requests per content object, greatly magnifing the already
inefficient fetch process. So "archive-z2" was introduced.
To allow gnome-ostree to still check out trees as a user, the
"uncompressed-object-cache" was introduced, and that's how things have
been working for a while.
So we should just be able to kill this code. Specifically note just
how much better the stage_object() function became.
https://bugzilla.gnome.org/show_bug.cgi?id=706057
We have APIs to load metadata as variants, and files as parsed
content/info/xattrs, but for some cases such as static deltas, all we
want is to operate on all objects in their canonical representation.
https://bugzilla.gnome.org/show_bug.cgi?id=706031
While the actual commit object format is presently the same, for a
number of reasons we'd like to change it fairly radically. Among
other things, we need to drop our a{sv} types in objects, to protect
against GVariant changing format.
Since now gnome-ostree now longer uses related objects, and nothing
ever used metadata, just drop them both.
If pull is interrupted, we may have downloaded an arbitrary subset of
the requested objects. Previously, we handled this by scanning for
all objects each time.
However, there's an easy optimization - this patch creates a lock file
in the repo. If we don't see that file when starting a pull, we know
we don't need to stat() every file; presence of a dirtree object for
example implies the existence of everything it references.
When multiple threads need to uncompress an object, there was
a race condition where thread A could get EEXIST, unlink,
then thread B calls linkat(), then thread A tries to link() but
fails.
We can just loop in this case.
This seems to work around a likely Linux kernel VFS bug, where I
randomly see ENOENT on link() when we *definitely* called mkdir() at
an earlier point in time.
This is an incompatible change to archive-z, thus it is now renamed to
archive-z2 and ostree will no longer parse archive-z.
I noticed in perf that we were spending some time zlib-decompressing
file headers, which is just inefficient. Rather than do this, keep
the headers uncompressed, and just zlib-compress content.
This gives us something closer to the advantages of archive and
archive-z when using the latter. Concretely we get deduplication
among multiple checkouts, along with the "devino" hash table trick
during commits to avoid checksumming content again.
This is enabled by default.
For similar reasons as metadata, this avoids having the main thread
blocked in fdatasync(), and even better - we can achieve much higher
parallelism if we have multiple threads blocked on fdatasync().
This is where loose content objects are stored as one compressed file,
instead of the two separate ones for regular archive mode. This mode
would be suitable for HTTP servers, beause only one HTTP request is
necessary, and the result would be compressed.
Cleanly separate metadata/content APIs, rather than defaulting to
raw streams. This helps most use cases.
Also, drop support for staging content without knowing the total
length. This complicated the code, and for things like streaming
HTTP, we should be able to figure this out from Content-Length.
They're not a large efficiency win at the moment, because we don't
do any delta compression.
At the moment, they simply served to compress data, but we will change
the archive mode to do that by default.
I run builds on my laptop, but it also crashes about 1/4 of the time
while suspending. It's definitely undesrirable to get e.g. empty
.dirtree objects because they corrupt builds. Concretely, I was
getting empty contents committed for xorg-util-macros.
Now, we used to write out temporary files using g_file_replace() which
does a fsync() during close, but then switched to a more "manual"
g_file_append_to().
We could switch back to g_file_replace(), but the problem is, we don't
want to call fsync() on temporary files in the case where we already
have the object. Attempting to add an object we already have is a
*very* common case.
This is both the old and new code sequence for the case where an
object is already stored:
open(temp, O_WRONLY)
write() write() write()
close()
lstat(objects/3a/9fe332...) = 0
unlink(temp)
In the *new* code, here's the case where an object *isn't* stored:
open(temp, O_WRONLY)
write() write() write()
close()
lstat(objects/3a/9fe332...) = -1
open(temp, O_RDONLY)
fdatasync()
close()
rename(temp, objects/3a/9fe332)
Compare with the *old* code path for when an object isn't stored:
open(temp, O_WRONLY)
write() write() write()
close()
lstat(objects/3a/9fe332...) = -1
link(temp, objects/3a/9fe332)
unlink(temp)
The problem with this is we really need to fdatasync(). Also doing
just rename() instead of the weird link()/unlink() helps us express to
the filesystem that we want atomic semantics. For example, BTRFS has
special handling for rename().
We really don't have a sane story for private files. This is a
defensive step ensuring that with old versions of gnome-ostree,
components that mistakenly have un-world-readable files don't break
pulls.