I learned today that `docker version` does this and I really like
the idea. While we have the patient open, also add the gitrev
with code taken from https://github.com/projectatomic/rpm-ostree/pull/584
Closes: #691
Approved by: giuseppe
It's just simpler, and I'm not sure people are going to care
much about the difference by default.
We already folded the fallback sizes into the download totals, so folding in
the count makes things consistent; previously you could see e.g.
`3/3 parts, 100MB/150MB` and be confused.
Closes: #678
Approved by: giuseppe
I don't know why I added support for this; it makes no sense really. If we
have large metadata objects, something has gone badly wrong.
The delta compiler has always only processed fallbacks for regular
content files.
Dropping support in the fetcher for this will simplify later handling of
fallback progress accounting.
Closes: #678
Approved by: giuseppe
There were a few bugs here.
- We need to keep track of the size of the delta parts we've already processed,
in order to make progress reliable at all in the face of interruptions. Add
a new `fetched-delta-part-size` async progress variable for this.
- Previously, the total disregarded what we'd already downloaded, which was
  confusing. Now a progress percentage is simply `fetched/total` (see the
  sketch after this list).
- Correctly handle "unknown bytes/sec" in the progress display.
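As a minimal sketch (not the exact pull code), a consumer could derive the
percentage from the async progress variables like so:
```
/* Minimal sketch: fetched-delta-part-size is the new variable;
 * total-delta-part-size is the preexisting total counterpart. */
guint64 fetched = ostree_async_progress_get_uint64 (progress, "fetched-delta-part-size");
guint64 total = ostree_async_progress_get_uint64 (progress, "total-delta-part-size");
guint percent = (total > 0) ? (guint)((100 * fetched) / total) : 0;
```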
However, to be fully correct we need to show the fallback objects too. That
would require tracking in the pull code when we fetch an object as a fallback
versus "normally". This would be simpler really if we could assume in a run we
were *only* processing a delta, but currently we don't do that.
Related: https://github.com/ostreedev/ostree/issues/475
Closes: #678
Approved by: giuseppe
The output of `g_variant_print (superblock)` is unreadable and not very
useful, since it shows the checksums as byte arrays.
However, do show the checksums for fallback objects. This makes it easier to see
which objects are fallbacks (and inspect why).
Closes: #678
Approved by: giuseppe
In https://github.com/ostreedev/ostree/pull/634 we introduced
a subtle regression - the unreadable object was added to the *new*
reachable objects, when it shouldn't have been. Because it
was a *from* object, clients already had it.
This became more obvious now that I'm working on fixing delta
progress - I noticed my deltas were always starting out with 40MB
fetched, which turned out to be a non-world-readable initramfs object.
This code should simply *skip* the unreadable object, and the delta processing
below properly iterates over "new objects", so we'll pick it up from there.
Closes: #678
Approved by: giuseppe
We should get a release out to try to keep to at least a once-a-month cadence.
This one has some exciting stuff like libcurl and Rust, and various bugfixes.
Also importantly I want to cut this *before* we land some other bigger stuff, so
rpm-ostree can start using the reload_config API etc.
Closes: #685
Approved by: jlebon
These allow us to avoid copying a lot of data around
in userspace. Instead we splice the data directly from
the fd to the destination fd.
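One plausible kernel-side primitive here is copy_file_range(2) (splice(2) is
similar but requires a pipe); a sketch of the loop, not the exact code:
```
/* Sketch: copy length bytes from src_fd to dest_fd inside the kernel,
 * with no userspace buffer. Callers fall back to read()/write() on error. */
static gboolean
copy_bytes_in_kernel (int src_fd, int dest_fd, size_t length)
{
  while (length > 0)
    {
      ssize_t bytes = copy_file_range (src_fd, NULL, dest_fd, NULL, length, 0);
      if (bytes < 0)
        return FALSE;
      if (bytes == 0)
        break;  /* unexpected EOF */
      length -= bytes;
    }
  return length == 0;
}
```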
Closes: #684
Approved by: cgwalters
Switching between local branches should be supported too.
Signed-off-by: Anton Gerasimov <anton@advancedtelematic.com>
Closes: #683
Approved by: cgwalters
Clarify the documentation for functions like
ostree_repo_get_remote_boolean_option(), stating what out_value will be
set to on error.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #676
Approved by: cgwalters
When fetching over a fast enough connection, we can be receiving files
faster than we write them. This can then lead to EMFILE when we have
enough files open. This was made very easy to notice with the upcoming
libcurl backend, which makes use of pipelining.
Closes: #675
Approved by: cgwalters
For rpm-ostree, we already link to libcurl indirectly via librepo, and
only having one HTTP library in process makes sense.
Further, libcurl is (I think) more popular in the embedded space. It
also supports HTTP/2.0 today, which is a *very* nice-to-have for OSTree.
This seems to be working fairly well for me in my local testing, but it's
obviously brand new nontrivial code, so it's going to need some soak time.
The ugliest part of this is having to vendor in the soup-url code. With
Oxidation we could follow the path of Firefox and use the
[Servo URL parser](https://github.com/servo/rust-url). Having to redo
cookie parsing also sucked, and that would also be a good oxidation target.
But that's for the future.
Closes: #641
Approved by: jlebon
The libcurl backend does all the work in the main thread/loop, which
seems to starve the idle scanning worker more. With the libcurl
backend, we're a lot more likely to have at least one outstanding
metadata request.
But with libcurl it can more easily happen transiently that all of our current
fetches are content. To be accurate here, just show "Estimating" if we're
scanning too.
Closes: #654
Approved by: jlebon
Now that we have queuing in the higher level pull logic, we don't
need to do this anymore.
It's tempting to keep it since the code diff is so small (without
completely rewriting things), but dropping it here will make
it easier to see when things go wrong at a higher level.
Note that I kept an assertion.
Closes: #654
Approved by: jlebon
Working on the libcurl backend, I didn't want to reimplement another queue. I
think the queue logic is really better done at the high level, since the fetcher
knows how we want to prioritize metadata over content, etc.
Adding another queue here is duplication, but things will look nicer when we can
actually delete the libsoup one in the next commit.
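Conceptually, the higher-level queue looks something like this (hypothetical
names, not the actual pull code):
```
/* Sketch: one queue per class; metadata always dequeues first. */
typedef struct {
  GQueue metadata_requests;  /* elements: FetchRequest* */
  GQueue content_requests;
} FetchQueues;

static gpointer
next_request (FetchQueues *queues)
{
  gpointer req = g_queue_pop_head (&queues->metadata_requests);
  if (req == NULL)
    req = g_queue_pop_head (&queues->content_requests);
  return req;  /* NULL: nothing pending */
}
```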
Closes: #654
Approved by: jlebon
The gzip default is 6. When I was writing this code, I chose 9 under
the assumption that for long-term archival, the extra compression was
worth it.
Turns out level 9 is really, really not worth it. Here's a run at level 9
compressing the current Fedora Atomic Host into archive:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 2m38.115s
user 2m31.210s
sys 0m3.114s
617M repo
```
And here's the new default level of 6:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 0m53.712s
user 0m43.727s
sys 0m3.601s
619M repo
619M total
```
As you can see, we run almost *three times* faster, and we take up *less
than one percent* more space.
Conclusion: Using level 9 is dumb. And here's a run at compression level 1:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 0m24.073s
user 0m17.574s
sys 0m2.636s
643M repo
643M total
```
I would argue that many people would actually prefer even this for "devel" repos.
For production repos, you want static deltas anyways. (However, perhaps
we should support a model where generating a delta involves re-compressing
fallback objects with a bit stronger compression level).
Anyways, let's make everyone's life better and switch the default to 6.
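In sketch form, assuming the archive path compresses content via GLib's
`GZlibCompressor` (the format flag here is illustrative), the change amounts to:
```
/* Previously level 9; 6 is ~3x faster for <1% more space. */
GConverter *compressor =
  G_CONVERTER (g_zlib_compressor_new (G_ZLIB_COMPRESSOR_FORMAT_RAW, 6));
```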
Closes: #671
Approved by: jlebon
For a long time we've cached the remote configs in the repo, which
mostly makes sense for the `repo/config` file, but less sense
for `/etc/ostree/remotes.d`, because we want to support admins
interactively editing them.
One can delete the repo instance and create a new one, but that's a bit ugly.
Let's introduce an API for this so rpm-ostree can reload remotes after
admins/scripts edit them in `/etc`. We also might as well reload
any other entries in the config.
Structurally now, `ostree_repo_open()` deals with file descriptors, and then
calls `ostree_repo_reload_config()`. The exception is the uncompressed cache,
which is the only configurable thing that deals with fds; but we want to
delete that anyways.
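Usage from a daemon then becomes a one-liner (minimal sketch):
```
/* Sketch: after an admin edits /etc/ostree/remotes.d, a long-lived
 * OstreeRepo can pick up the changes in place. */
if (!ostree_repo_reload_config (repo, cancellable, error))
  return FALSE;
```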
No tests, since we don't have a daemon in this codebase, and I don't want to
shave that yak just today.
Closes: #662
Approved by: jlebon
We weren't running it before. Also I switched it to use GLib. Preparation for
some oxidation work (having an implementation of bupsplit in Rust).
I exported another function to do the raw rollsum operation, which is what
this test suite uses.
Closes: #655
Approved by: jlebon
I was working on https://bugzilla.redhat.com/show_bug.cgi?id=1393545
and it was annoying that I couldn't know what the new (unsigned)
commit hash was until verification succeeded. I could pull it
manually without GPG, but then it'd be sitting in the repo.
Now:
```
Updating from: fedora-atomic:fedora-atomic/25/x86_64/docker-host
Receiving metadata objects: 0/(estimating) -/s 0 bytes
error: Commit 2fb89decd2cb5c3bd73983f0a7b35c7437f23e3aaa91698fab952bb224e46af5: GPG verification enabled, but no signatures found (use gpg-verify=false in remote config to disable)
```
Closes: #663
Approved by: giuseppe
Add an arg description for `-P`, otherwise it's not immediately obvious
that it takes an argument.
Mention that `-` is supported for `--log-file`.
Closes: #657
Approved by: cgwalters
There are use cases for having a single repo with branches
with different lifecycles; a simple example of what I was
trying to do in CentOS Atomic Host work is to have "stable"
and "devel" branches, where we want to prune devel, but
retain *all* of stable.
This patch is split into two parts - first we add a low level "delete all
objects not in this set" API, and change the current prune API
to use this.
Next, we move more logic into the "ostree prune" command. This paves the way for
demonstrating how more sophisticated algorithms/logic could be developed outside
of the ostree core.
Also, the --keep-younger-than logic already lived in the commandline, so it
makes sense to keep extending it there.
Closes: https://github.com/ostreedev/ostree/issues/604
Closes: #646
Approved by: jlebon
This is prep for the libcurl porting. `GTlsCertificate/GTlsDatabase` are
abstract classes implemented in glib-networking for gnutls. curl's APIs take
file paths as strings, so it's easier to work on both if we move the GLib TLS
bits into the libsoup code.
Closes: #651
Approved by: giuseppe
I was making some other changes in this code, and noticed that we were adding
checksums without object types into the same hash table for metadata. We should
*never* do this with both metadata and content objects, since in theory a
content object could have the same hash as a metadata object.
I don't actually think it's possible in practice for pure metadata to collide,
since they have different structures, but let's do this anyways since it's
conceptually right.
Closes: #651
Approved by: giuseppe
I was trying to debug `test-pull-c`, and typing `Ctrl-C` in gdb
ended up sending `SIGINT` to trivial-httpd as well, killing it.
Daemonize a bit more properly to avoid this. I also followed the standard
`/dev/null` guidelines.
Closes: #643
Approved by: jlebon
For the pending libcurl port, the backend is a bit more
sensitive to the main context setup. The delta superblock
fetch here is a synchronous request in the section that's
supposed to be async.
Now, libsoup definitely supports mixing sync and async requests, but it wasn't
hard to help the libcurl port here by making this one async. Now fetchers are
either sync or async.
Closes: #636
Approved by: jlebon
Working on the libcurl backend, I hit the issue that the trivial-httpd program
depends on libsoup. I briefly considered having two versions, but libcurl is
client only, and moreover trivial-httpd is no longer trivial - it has various
features which are used by the test suite extensively.
Hence, what we'll do is build it as a separate binary which links to libsoup,
and use it during the tests. We *also* currently still provide `ostree
trivial-httpd`, since some things like `rpm-ostree-toolbox` and the
Cockpit tests use it.
After those are ported to use some other webserver, I plan to add a build-time
option to drop it.
Closes: #636
Approved by: jlebon
The previous commit introduced a single low level API - however,
we can do things in a more optimal way for the curl backend if
we drop the "streaming API" variant. Currently, we only use
it to synchronously splice into a memory buffer...which is pretty
silly when we could just do that in the backend.
The only tweak here is that we have an "add NUL character" flag that is
(possibly) needed when fetching into a membuf.
The code here ends up being better I think, since we avoid the double return
value for the `_finish()` invocation, and now most of the fetcher code (in the
soup case) writes to a `GOutputStream` consistently.
This will again make things easier for a curl backend.
Closes: #636
Approved by: jlebon
Conceptually these now lie on top of the core API, and don't reference libsoup.
This is preparation for libcurl porting, but it's also just generally better.
Closes: #636
Approved by: jlebon
This is in preparation for the libcurl port. We're basically making public what
we had internally. The next step here is to create `ostree-fetcher-util.[ch]`
that only operates in terms of this lower level API.
Also drop the `_mirrored` from the function name since it's
the default now.
Closes: #636
Approved by: jlebon
Without the element-type annotations, bindings don't know how to handle
the elements of the hash table. Since the table is created with destroy
functions, the caller does not own the elements, so transfer container
is used.
Closes: #635
Approved by: cgwalters
ostree_object_name_serialize returns a floating ref, so sink it before
adding it to the hash table so it can properly be freed later when the
hash table is destroyed.
This is particularly a problem for pygobject, which sinks the refs on
variants as it marshals them to native python types. If the ref isn't
already sunk, then the ref count won't increase and a critical warning
will be raised when both the hash table and pygobject try to unref it.
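The fix is essentially one sink (sketch; surrounding variables are from
context):
```
/* Sink the floating variant before inserting it into a table created
 * with g_variant_unref as its destroy function. */
GVariant *objname = ostree_object_name_serialize (checksum, objtype);
g_hash_table_add (table, g_variant_ref_sink (objname));
```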
Closes: #635
Approved by: cgwalters
This will prevent including in the delta the bits to update files that
are not world readable, so that we don't run into a permissions problem
when applying the deltas from a bare-user repository that has a bare
repository set as its parent.
This is the case for Endless when updating flatpak runtimes, as the
temporary directory created in ~/.local/share/flatpak/system-cache will
be of type bare-user with its parent set to /var/lib/flatpak which is a
bare repository in EOS, as it's shared with the one at /ostree/repo.
https://phabricator.endlessm.com/T14159
Closes: #634
Approved by: cgwalters
I was having this thought today about making more of the OS readonly,
and ultimately if we got to the point where all ostree operations are
through the repo and sysroot dfds, we could have rpm-ostree be the
only process holding those fds open, and have a read-only bind mount
on top.
Anyways, we're not there, likely won't be soon, but this gets us
closer to being fully fd relative.
Closes: #628
Approved by: jlebon
This is a migration from the origin version. It's
nicer to have it in the remote, since that's what one
needs to change. Then tools don't need to mess with
the origin file.
In fact in this scenario one can keep the "media source" like
`file:///install/repo` or whatever, since conceptually that's where it
came from. We're just providing a better error.
Closes: https://github.com/ostreedev/ostree/issues/626
Closes: #627
Approved by: jlebon
These are out parameters, so add the (out) annotation and switch
(nullable) to (optional) since the latter is used for the purpose of
optional out parameters.
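Illustratively (parameter name hypothetical), the annotations now read:
```
/*
 * @out_value: (out) (optional): return location, or %NULL to ignore
 *
 * (out) marks the out parameter; (optional) -- not (nullable) -- is
 * what marks an out parameter the caller may omit by passing NULL.
 */
```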
Closes: #629
Approved by: cgwalters
We were leaking in a few places that I noticed in an ASAN run. Also,
this was one of the last places doing non-autoptr cleanup in an `out:`
section, bringing us a lot closer to a potential full-tree rewrite to
`return FALSE`.
Closes: #624
Approved by: jlebon
I installed `parallel` in my dev container, which got me
the sysroot locking tests, which caught this leak when
built with ASAN.
Closes: #623
Approved by: jlebon
The "remote cookies" code broke this. While I'm not sure anyone is
actually using ostree-without-http, it isn't too hard to keep the
build time conditional going. Further, this work is preparatory for
libcurl porting.
Closes: #621
Approved by: jlebon
Due to the way glib-mkenums runs the preprocessor itself, it
doesn't pick up the `AC_USE_SYSTEM_EXTENSIONS()` that we have in
`configure.ac`.
This blew up in an obscure way when I later wanted to `#include
"libglnx.h"` in one of the headers, since it needs the `basename()`
from `string.h` which is only available with `_GNU_SOURCE`.
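A small sketch of why this matters:
```
/* With _GNU_SOURCE (which AC_USE_SYSTEM_EXTENSIONS arranges),
 * <string.h> declares the GNU basename(), which never modifies its
 * argument; without it, only the POSIX basename() from <libgen.h>
 * is available, and that one may modify the string passed in. */
#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>

int
main (void)
{
  printf ("%s\n", basename ("/usr/bin/ostree"));  /* => "ostree" */
  return 0;
}
```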
Closes: #616
Approved by: jlebon
This is what we do for non-local (i.e. HTTP) pulls; we want to
correctly handle being interrupted during partial pulls.
Closes: https://github.com/ostreedev/ostree/issues/579
Closes: #613
Approved by: jlebon
This is a follow-up to a conversation on the list - in practice, if we're
backing away from summary signing, then it makes sense to remove the
special casing for checksums in deltas around summary signatures.
This is also related to the recent change to enable GPG checking for
commits in deltas - now we have a more coherent story between the
previous pull path and deltas.
I didn't do any performance checking, and while it's slightly annoying
that we're now doing sha256 on the delta content twice (once for the
part and once per object)...sha256 is pretty fast, I think most users
are I/O bound anyways, and it'd drop even further if we started using
openssl.
Closes: #612
Approved by: jlebon
We should be religious about the "only set output variables on
success" rule; otherwise it makes leaks more likely.
But the real leak was us simply not using autoptr in one place.
Closes: #598
Approved by: jlebon
And "move semantics" via `g_steal_pointer()`. Just a minor code
cleanup I noticed when I was hunting for a leak, which ended up being
elsewhere.
Closes: #598
Approved by: jlebon
glnx_make_lock_file requires that the dfd passed in survives the
lifetime of the lock. Since dfd_iter.fd gets cleaned up after the
function returns, this isn't the case. dfd_iter.fd should be equivalent
to tmpdir_dfd, since we iter on ".", and that survives past the
function, so just use that instead.
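In sketch form (lock name hypothetical), the correct pattern is:
```
/* Lock relative to tmpdir_dfd, which outlives the function, rather
 * than the short-lived dfd_iter.fd. */
GLnxLockFile lock = GLNX_LOCK_FILE_INIT;
if (!glnx_make_lock_file (tmpdir_dfd, ".lock", LOCK_EX, &lock, error))
  return FALSE;
/* ... work under the lock ... */
glnx_release_lock_file (&lock);
```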
Closes: #591
Approved by: cgwalters
The fact that we weren't doing this is at best an oversight, and
for some deployment models a security vulnerability. Having both
`gpg-verify` and `gpg-verify-summary` shows that we were intending
them to be orthogonal/independent.
Lately I've been advocating moving towards pinned TLS instead of
gpg-signed summaries, and if we follow that path, performing GPG
verification of commit objects even if using deltas is more important,
as it provides an at-rest verifiable authenticity and integrity
mechanism.
Content providers which are signing their summary files and/or using
TLS (particularly pinned TLS) for transport should treat this as a
nice-to-have. However, for providers which are serving content over
plain HTTP and relying on GPG, this is a critical update.
Closes: https://github.com/ostreedev/ostree/issues/517
Closes: #589
Approved by: jlebon
`-fsanitize=address` complained that the `refcount > 0` assertions
were reading without atomics. We can fix this by reworking them
to read the previous value.
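In sketch form, the rework is:
```
/* Assert on the value the atomic op itself returns, rather than doing
 * a separate non-atomic read of the refcount. */
int prev = g_atomic_int_add (&self->refcount, -1);
g_assert (prev > 0);
```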
Closes: #582
Approved by: jlebon
It turns out this is basically racy with the presence of other
threads. It was really cosmetic so let's stop doing it and make
`-fsanitize=thread` happy.
Closes: #582
Approved by: jlebon
This is actually fine in practice, but it triggers this
`-fsanitize=undefined` warning I saw in the test suite log:
```
src/libostree/ostree-repo-static-delta-compilation.c:160:10: runtime error: null pointer passed as argument 1, which is declared to never be null
```
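The usual fix for this class of warning is a zero-length guard (sketch;
variable names illustrative):
```
/* memcpy with a NULL pointer is undefined even for length 0. */
if (len > 0)
  memcpy (dest, src, len);
```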
Closes: #584
Approved by: jlebon
You'd expect
ostree commit --tree=ref=A --tree=ref=B
to produce a commit with the union of the trees given. Instead you'd get
a commit with the contents of just the latter commit. This was due to an
optimisation where we'd skip filling out the `files` and `subdirs`
members of the mtree, just filling in the metadata instead. This backfires
because this same code relies on checking the `files` and `subdirs` members
itself to work out whether the mtree is empty.
This commit removes the optimisation, fixing the bug. Maybe there's a way
to keep the optimisation and still fix the bug but it's not obvious to
me.
Closes: #581
Approved by: cgwalters
Conceptually we've been moving towards having our GPG verification
paths be per-remote. The code internally supports this, but we
didn't expose an API to use it conveniently.
This came up when trying to add a new `gpgkeypath` option, since
right now rpm-ostree manually finds keyrings for the remote, and
hence it wasn't looking at the keypath, and said "Unknown key"
in status.
Adding an API fixes this nicely.
Closes: #576
Approved by: giuseppe
For Project Atomic, we already have RPM signatures which use files in
`/etc/pki/rpm-gpg`. It's convenient to simply bind the OSTree remote
configuration to those file paths, rather than having duplicate key
data.
This does mean that we need to parse the files for verification, so we
end up importing them into the verifier's temporary keyring, which is
a bit ugly, but it's what other projects do.
Closes: https://github.com/ostreedev/ostree/issues/573
Closes: #575
Approved by: giuseppe
When doing commit --tree=ref=XXX while at the same time applying some
form of modifier, ostree dies trying to read the xattrs using the
raw syscalls. We fix this by falling back to ostree_repo_file_get_xattrs()
in this case.
Also add a testcase for this.
Closes: #577
Approved by: cgwalters
What the code calls "scanning" is ensuring (potentially
recursively) that we have an object, and if not, fetching it. And then if
it's metadata, parsing it and finding new objects to fetch.
This logic has grown fairly complex. What I'm trying to fix
right now is that if we're doing a pull-local to a remote repository
via `sshfs` (FUSE) we still end up scanning, which is inefficient.
We can take advantage of the "commitpartial" logic here - if a commit
isn't partial, it's complete, hence we don't need to scan it.
At the same time, I'm changing the logic here to *always* do scans for
dirtree objects. This will fix cases where multiple commits share
dirtree objects. We have "commitpartial" metadata, but no such concept
of partial/complete for dirtrees.
But, we'll only ever scan dirtrees if we scan commits, which is
what the section above fixes.
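In sketch form, the "is this commit partial?" check amounts to testing for a
marker file (the path construction here is an assumption for illustration):
```
static gboolean
commit_is_partial (int repo_dfd, const char *checksum)
{
  g_autofree char *path = g_strdup_printf ("state/%s.commitpartial", checksum);
  struct stat stbuf;
  return fstatat (repo_dfd, path, &stbuf, 0) == 0;
}
```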
Closes: https://github.com/ostreedev/ostree/issues/543
Closes: #564
Approved by: alexlarsson
Some deployments may want to gate access to content based on things
like OAuth. In this model, the client system would normally compute a
token and pass it to the server via an API.
We could theoretically support this in the remote config too, but
that'd be a bit weird for OAuth as the information is dynamic.
Therefore this cleans up the code a little bit to more clearly handle
the case that the fetcher is initialized from both remote config
data plus pull options.
Closes: #574
Approved by: giuseppe
Otherwise it's possible for us to exhaust available file descriptors
or (on 32 bit) run up against mmap limits.
In the rollsum case, we didn't need to hold open the "from" object
at all. And in the bsdiff case, we weren't even looking at either of
the files until we started processing.
Also, while we have the patient open, switch to using O_TMPFILE
if available.
Closes: #567
Approved by: giuseppe
Private CloudFront instances return 403 for objects which don't exist
rather than a 404.
Change the fetcher to assume 403 is ok for downloads that are "optional",
rather than erroring out at that step (e.g. when trying to download a static
delta the remote repo doesn't have).
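In sketch form (helper names hypothetical):
```
/* Map 403 to the same "not found" handling as 404 when the request
 * was flagged as optional. */
if (msg->status_code == SOUP_STATUS_NOT_FOUND ||
    msg->status_code == SOUP_STATUS_FORBIDDEN)
  {
    if (is_optional)
      return handle_not_found (request);
  }
```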
Closes: #531
Approved by: cgwalters
Optionally read cookie jars for a remote to be used when downloading
data. This can be used for private repositories which require specific
cookies to be present, e.g. repositories hosted on Amazon cloudfront
using signed cookies.
Closes: #531
Approved by: cgwalters
We should just download the commit objects directly, as it's
obviously a lot more efficient than deltas.
I had to generate a summary file in more places in the tests,
since once created, it needs to be updated.
Closes: https://github.com/ostreedev/ostree/issues/528
Closes: #566
Approved by: jlebon
I was doing a chain of mirroring like A -> B -> C
And repo B had A as a remote. When I added B as
a remote to C, the summary file of B had a ref
upstream:foo/bar/baz, which caused all pulls from
B to C to fail, since the summary file is only
expected to have refs, not refspecs.
Closes: https://github.com/ostreedev/ostree/issues/561
Closes: #565
Approved by: jlebon
Various bootloaders add kernel command-line options dynamically; filter
these out when grabbing boot options from /proc/cmdline. Specifically,
grub adds BOOT_IMAGE and systemd-boot adds initrd.
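In sketch form, the filter is:
```
/* Skip options bootloaders inject dynamically when copying boot
 * options from /proc/cmdline. */
static gboolean
is_dynamic_boot_option (const char *arg)
{
  return g_str_has_prefix (arg, "BOOT_IMAGE=") ||  /* grub */
         g_str_has_prefix (arg, "initrd=");        /* systemd-boot */
}
```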
Closes: #560
Approved by: cgwalters
Found by valgrind memcheck. g_variant_new_from_bytes takes a ref to the
bytes, so we need to release the original ref.
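In sketch form:
```
/* g_variant_new_from_bytes() takes its own ref on the bytes, so
 * release ours once the variant exists. */
GVariant *v = g_variant_new_from_bytes (type, bytes, TRUE);
g_bytes_unref (bytes);
```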
Signed-off-by: Simon McVittie <smcv@debian.org>
Closes: #556
Approved by: cgwalters
ostree_repo_pull_with_options() and ostree_repo_remote_change() don't
sink floating GVariant arguments, and doing so now would be an
ABI change; so don't rely on them to do so.
Leak found with valgrind memcheck.
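In sketch form, the caller-side fix is:
```
/* Sink the floating options variant ourselves instead of relying on
 * the callee to do it. */
g_autoptr(GVariantBuilder) builder =
  g_variant_builder_new (G_VARIANT_TYPE ("a{sv}"));
g_autoptr(GVariant) options =
  g_variant_ref_sink (g_variant_builder_end (builder));
if (!ostree_repo_pull_with_options (repo, remote, options,
                                    NULL, NULL, error))
  return FALSE;
```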
Signed-off-by: Simon McVittie <smcv@debian.org>
Closes: #556
Approved by: cgwalters