It allows an optimization to skip the download of the summary file
if its .sig file is unchanged.
Downloading the .sig file is much cheaper than downloading the summary
file from repositories with many branches.
https://bugzilla.gnome.org/show_bug.cgi?id=762973
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
I'd like to incrementally convert all of `ostree-repo*.c` to
fd-relative usage, so that we can sanely introduce
`ostree_repo_new_at()` which doesn't involve GFile.
This one is medium risk, but passes the test suite.
A fast way to generate new OSTree content using an existing
tree is to checkout (as hard links), add/replace files, then
call `ostree_repo_scan_hardlinks()`, then commit.
But `ostree_repo_scan_hardlinks()` scans the entire repo, which
can be slow if you have a lot of content.
All we really need is a mapping of (device,inode) -> checksum
just for the objects we checked out, then use that mapping
for commits.
This patch adds API so that callers can create a mapping via
`ostree_repo_devino_cache_new()`, then pass it to
`ostree_repo_checkout_tree_at()` which will populate it, and then
`ostree_repo_write_directory_to_mtree()` can consume it.
I plan to use this in rpm-ostree for package layering work.
Notes:
- The old `ostree_repo_scan_hardlinks()` API still works.
- I tweaked the cache to be a set with the checksum colocated with
the key, to avoid a separate malloc block per entry.
https://github.com/GNOME/ostree/pull/167
Concurrent pulls break since we're sharing the staging directory for
all transactions in the repo. This makes us use a per-transaction directory.
However, in order for resumes to work we first look for existing
staging directories and try to aquire an exclusive lock for them. If
we can't find any staging directory or they are all already locked,
then we create a new one.
https://bugzilla.gnome.org/show_bug.cgi?id=757611
This creates a subdirectory of the tmp dir with a selected prefix,
and takes a lockfile to ensure that nobody else is using the same directory.
However, if a directory with the same prefix already exists and is
not locked that is used instead.
The later is useful if you want to support some kind of resumed operation
on the tmpdir.
touch reused dirs
https://bugzilla.gnome.org/show_bug.cgi?id=757611
Also renames OstreeRepoTrustedContentBareCommit to
OstreeRepoContentBareCommit so that it can be used by both.
This will be needed when we introduce checksum verification of objects
in static deltas.
Had a rare situation where I had no libsoup development files, so I
took the opportunity to fix the build errors. Ugly, but works now.
Would be nice if libsoup could be a hard dependency since we rarely
ever test a configuration without it.
External daemons like rpm-ostree want push notification any time a
change is made by an external entity. inotify provides notification,
but a problem is there's no easy way to monitor all of the refs.
In the past, there has been discussion of opt-in recursive timestamps:
https://lkml.org/lkml/2013/4/5/307
But in today's world, let's just bump the mtime on the repo itself, as
a central inotify point.
Closes: https://github.com/GNOME/ostree/pull/111
Similar to c2b01ad. For some reason I was thinking the commit data
still needed to be written to disk prior to verifying, but it's just
another artifact of spawning gpgv2 (predates using GPGME).
Makes for a nice cleanup in fetch_metadata_to_verify_delta_superblock()
as well.
I was hitting a bug in libguestfs/guestmount/FUSE where it blew up
with EINVAL on directories containing lots of files (more than
32000?). We really want to use prefixed subdirs just like the real
objects/ directory does.
This allows us to share more code between the paths, is more
efficient, etc.
This follows up from the previous commit; now that pull knows how to
do the efficient link() or copy for local files, we can just have
pull-local call into ostree_repo_pull().
As part of this:
- pull() can also accept a file:/// URI instead
of a remote name (since pull local supports anonymous pulls)
- pull() knows an "override-remote-name" option, since pull-local
supported writing a ref out even if there wasn't a remote with
that name
When doing a pull --mirror from an archive-z2 repository into another
archive-z2 repository, currently we gunzip/checksum/gzip each content
object. The re-gzip process in particular is fairly expensive.
This does assume that the upstream content is trusted and correct.
It'd be nice in the future to do at least a CRC check, if not the full
checksum. (Could we append CRC data to the end of filez objects?)
We could also choose to only do this optimization if fetching over
TLS.
before: 1626 metadata, 20320 content objects fetched; 299634 KiB transferred in 62 seconds
after : 1626 metadata, 20320 content objects fetched; 299634 KiB transferred in 11 seconds
For future delta work where we do more interesting things than just
"tar of new objects", this lays the groundwork for doing streaming
writes into content objects.
It's also more efficient, as we avoid many intermediate allocations
and virtual calls. Just a single `g_output_stream_write_all` for the
splice case.
Conflicts:
src/libostree/ostree-repo-private.h
src/libostree/ostree-repo-static-delta-processing.c
We could just make everything relative to this, but the objects/ and
tmp/ are accessed very often, so I think it's worth holding individual
fds.
This fd can cover everything else: refs, deltas, etc.
Do not write directly to objects/ but maintain pulled files under tmp/
with a "tmpobject-$CHECKSUM.$OBJTYPE" name until they are syncfs'ed to
disk.
Move them under objects/ at ostree_repo_commit_transaction cleanup
time.
Before (test done on a local network):
$ LANG=C sudo time ./ostree --repo=repo pull origin master
0 metadata, 3 content objects fetched; 83820 KiB; 4 delta parts
fetched, transferred in 417 seconds
16.42user 6.73system 6:57.19elapsed 5%CPU (0avgtext+0avgdata
248428maxresident)k
24inputs+794472outputs (0major+233968minor)pagefaults 0swaps
After:
$ LANG=C sudo time ./ostree --repo=repo pull origin master
0 metadata, 3 content objects fetched; 83820 KiB; 4 delta parts
fetched, transferred in 9 seconds
14.70user 2.87system 0:09.99elapsed 175%CPU (0avgtext+0avgdata
256168maxresident)k
0inputs+794472outputs (0major+164333minor)pagefaults 0swaps
https://bugzilla.gnome.org/show_bug.cgi?id=728065
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
ostree_repo_pull_with_options() needs this, and I'd rather keep the
OstreeRemote struct definition tucked away in ostree-repo.c with its
own internal API.
OstreeRemote is a reference-counted struct that encompasses data about a
remote, whether read from a configuration file or created explicitly via
ostree_repo_remote_add().
OstreeRemotes are held in an internal table indexed by remote name.
This solves some problems caused by merging system-wide remote data into
the OstreeRepo's internal config key file.
Also fixes https://bugzilla.gnome.org/show_bug.cgi?id=740911
Some package systems need to be run as root, so the process linking to
libostree may also be root. However, it's reasonable to have the
target repository be owned by a uid other than root.
This patch makes it Just Work by chowning the file content to match.
Note this only operates on archive-z2 repositories, because you can't
usefully serve bare repositories via HTTP.
https://bugzilla.gnome.org/show_bug.cgi?id=738954
While we did support disabling the uncompressed-objects-cache
per-repository:
1) We didn't actually respect that operation when doing
CHECKOUT_MODE_USER on archive-z2 repositories
2) It'd be better to automatically detect we can't write to the
repo and disable the uncompressed cache then.
Changes the pull API to allow pulling only a single directory instead
of the whole deployment. This option is utilized by the check-diff
option in rpm-ostree.
Add a new state directory to hold <checksum>.commitpartial files, so
we know that we've only downloaded partial state.
The current "transaction" symlink was introduced to fix issues with
interrupted pulls; normally we assume that if we have a metadata
object, we also have all objects to which it refers.
There used to be a "summary" which had all the available refs, but I
deleted it because it wasn't really used, and was still racy despite
the transaction bits.
We still want the pull process to use the transaction link, so don't
delete the APIs, just relax the restriction on object writing, and
introduce a new ostree_repo_set_ref_immediate().
This has a very basic level of functionality (deltas can be generated,
and applied offline). There is only some stubbed out pull code to
fetch them via HTTP.
But, better to commit this now and improve it from a known starting
point, rather than have it languish in a branch.
This will be used by guestmount - it's WAY faster. We only take disks
as a unit, so it's safe. If the process fails halfway through, we
just start over from scratch the next time anyways.
Add a --generate-sizes option to commit to add size information to the
commit metadata. This will be used by higher level code which wants
to determine the total size necessary for downloading.
We want an OstreeRepoFile to be the way to reference a "filesystem
tree" that's stored in the repo, which is a combination of a DIR_TREE
and a DIR_META. The idea is that once you write an mtree to the repo
using ostree_repo_write_mtree, it becomes serialized and you get an
OstreeRepoFile in return.
Change any APIs that care about DIR_TREE / DIR_META checksums to care
about OstreeRepoFiles instead, which right now is mostly is
ostree_repo_write_commit.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
We want an OstreeRepoFile to be the way to represent a filesystem tree
inside an ostree repository. In order to do this, we need to drop the
commit from an OstreeRepoFile, and make that go to callers.
Switch all current users of ostree_repo_file_new_root to
ostree_repo_read_commit, and make the actual constructor private.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
Previously I thought we'd have to ditch the current commit
format to avoid a{sv} due to
See https://bugzilla.gnome.org/show_bug.cgi?id=673012
But I realized that we don't really have to care about
unpacking/repacking commit objects, so let's just re-expose the
existing metadata a{sv} in commits in the API.
Also, add support for "detached" metadata that can be updated at any
time post-commit. This is specifically designed for GPG signatures.
https://bugzilla.gnome.org/show_bug.cgi?id=707379
This is a relic from long ago when we were trying to stage objects
before finally committing them all in one go in the pull code.
We're no longer doing that, so stop trying to make the directory.
This also fixes trying to use ostree as non-root to read the
root-owned repo, since we'd fail to create the pending dir.
Clean up how we deal with the uncompressed object cache; we now use
openat()/linkat() and such just like we do for the main objects/.
Use linkat() between the objects and the destination, if possible.
https://bugzilla.gnome.org/show_bug.cgi?id=707733
Add new internal API to both fstatat() and write a pathname for the
given object. Use it in commit, and also wrapped in the old
GFile-based API.
This is more efficient.
Rather than having separate write_ref calls, make clients start a
transaction, add some refs, and then commit it. While this doesn't
make it 100% atomic, it makes it easier for us to use an atomic
model, and it means we don't do as much I/O updating the summary
file and such.
https://bugzilla.gnome.org/show_bug.cgi?id=707644
An earlier version of this API acted like git in that some objects
would be staged in a temporary directory which would be then committed
in one go by moving files around. The API doesn't match most users
expectations though, as while the stage is nice as a high-level API
it isn't really suited for low-level APIs.
While the stage was removed, the APIs were never renamed. Rename
them now so that they match expectations.
https://bugzilla.gnome.org/show_bug.cgi?id=707644
Update libgsystem submodule for a bugfix.
This is both more efficient from a kernel perspective, and avoids us
calling gs_file_get_path_cached() on tmp_dir constantly, which
triggered another bug due to lack of locking.