Its often the case that we want to look at objects inside a commit,
before the objects the transaction is finished. For instance:
https://github.com/flatpak/flatpak/pull/837
Which tries to verify the file permissions before committing the
transaction.
And:
1e5ffa926a
Which collects the storage size of the objects so that we can
put the total download size in the commit metadata.
I tried to find all the places where we did reads from the
object directories, and in particular this fixes:
- `ostree_repo_load_file()` for `bare` repos (`archive` was already working).
- `ostree_repo_query_object_storage_size()`
- Applying deltas that reference not-yet-commited objects
Closes: #916
Approved by: cgwalters
This came up in: https://github.com/ostreedev/ostree/pull/881
Basically doing streaming for metadata is dumb. Split up the metadata/content
paths so we pass metadata around as `GVariant`. This drops the last internal
caller of `ostree_repo_write_metadata_stream_trusted()` which was the dumb
function mentioned.
Closes: #923
Approved by: jlebon
If there are no deltas to be listed in the summary file, don’t bother
including the key for them in the additional metadata section of the
file. This saves a few bytes in some cases.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #911
Approved by: cgwalters
This makes it a bit more easily separable from the rest of the code in
the function. No functional changes.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #911
Approved by: cgwalters
Copying xattrs when manipulating the GPG keyring for a repository
causes errors when the underlying filesystem doesn't support writing
xattrs - overlayfs is a common example. It also causes the selinux
attributes of the keyring files to be copied from the temporary
location instead of properly inherited from the destination directory
(ending up, for example, as unconfined_u:object_r:user_tmp_t:s0, rather
than unconfined_u:object_r:data_home_t:s0)
Closes: #910
Approved by: cgwalters
This reverts commit 1eff3e8343. There
are a few issues with it. It's not a critical thing for now, so
let's ugly up the git history and revisit when we have time to
debug it and add more tests.
Besides the below issue, I noticed that the simple `ostree remote add`
now writes to `/ostree/repo/config` because we *aren't* using the
`--sysroot` argument.
Closes: https://github.com/ostreedev/ostree/issues/901Closes: #902
Approved by: mike-nguyen
Using `${sysroot}` to mean the physical storage root: We don't want to write to
`${sysroot}/etc/ostree/remotes.d`, since nothing will read it, and really
`${sysroot}` should just have `/ostree` (ideally). Today the Anaconda rpmostree
code ends up writing there. Fix this by adding a notion of "physical" sysroot.
We determine whether the path is physical by checking for `/sysroot`, which
exists in deployment roots (and there shouldn't be a `${sysroot}/sysroot`).
In order to unit test this, I added a `--sysroot` argument to `remote add`.
However, doing this better would require reworking the command line parsing for
the `remote` argument to support specifying `--repo` or `--sysroot`, and I
didn't quite want to do that yet in this patch.
Closes: https://github.com/ostreedev/ostree/issues/892Closes: #896
Approved by: jlebon
This is prep for introducing a fd-relative `ostree_repo_new_at()`.
Previously, `ostree_repo_is_system()` compared `GFile` paths, but
there's a much simpler check we can do first - if this repository
was created via `OstreeSysroot`, it must be a system repo.
Closes: #886
Approved by: jlebon
Make it an internal, not static, API; like _ostree_repo_add_remote(). It
will be used in many the same situations.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #875
Approved by: cgwalters
Return whether the remote already existed. This is an internal API, so
it’s not an API break. The return value will be useful in upcoming
commits for working out whether to later remove a remote again.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #875
Approved by: cgwalters
Add a name argument to the internal OstreeRemote constructor,
since this member (and several derived from it) is non-nullable,
and hence must always be set at construction time.
This changes the only call sites of the constructor to use the new API,
which is internal.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #875
Approved by: cgwalters
Follow up to a previous patch that addressed a double-close; I
realized we already had a helper for doing "open dfd iter, do nothing
if we get ENOENT". Raise it to libotuil, and port all consumers.
Closes: #863
Approved by: jlebon
Previously it was static to ostree-repo.c. Make it usable throughout
libostree so it can be used by an upcoming commit, but also expose the
typedef and reference counting functions so that opaque OstreeRemote
pointers can be used by user code, in anticipation of exposing more of
its API publicly in future.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #832
Approved by: cgwalters
• Commit timestamps, so it’s easy to work out whether a given commit is
newer than the one we have locally
• Summary file timestamp, so it’s easy to work out whether the summary
file is more up to date than another summary file
• Summary file expiry time, so clients can work out when they should
expect the summary file to next be updated, and hence can query for
it at roughly the right time
The expiry time requires input from the user, so is currently never set
automatically. Programs using libostree can set it if they wish.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #826
Approved by: cgwalters
If we're freeing the segment, it's basically always better to use
`autoptr()`. Fewer lines, more reliable, etc.
Noticed an instance of this in the pull code while reviewing a different PR,
decided to do a grep for it and fix it tree wide.
Closes: #836
Approved by: pwithnall
The keyring isn't large, so let's just fall back to copying it
rather than requiring `renameat()`.
Prep for `ostree_repo_open_at()`.
Closes: #821
Approved by: jlebon
Use the new well-known `status` key for OstreeAsyncProgress to get and
set the status atomically with other keys in an OstreeAsyncProgress
instance.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #819
Approved by: cgwalters
This will eliminate most of the potential races in progress reporting.
ostree_repo_pull_default_console_progress_changed() still calls three
getters, so there may still be races there, however.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #819
Approved by: cgwalters
I happened to glance at the top of my most recent patch and
noticed that I used an `throw_errno()` function in a non-errno place.
I scanned the patch for other instances of this but didn't find one.
Closes: #811
Approved by: jlebon
I was planning to change some of the object loading code in the
future, so here's some porting.
Note that I rewrote `_ostree_repo_has_loose_object()` since it
used an error return across multiple functions.
Honestly I'm not sure about this `TEMP_FAILURE_RETRY()` business...
in reality we're going to end up with a ton of code linked in
process that doesn't do it. Unix sucks =( But I'm keeping
what was there out of consistency.
Closes: #809
Approved by: jlebon
This did a `closedir` in the `goto out` section before, but it
turns out more nicely if we follow the usual pattern of doing
the `open(O_DIRECTORY)` in the callee function and handle `ENOENT`
there.
Closes: #809
Approved by: jlebon
I was reading a strace the other day and noticed we were loading the same
`.dirmeta` object many times. Unlike the other object types, `.dirmeta` objects
don't accumulate much over time; there are only so many directory metadata types.
(Without SELinux involved it'd probably be 5-6 I'd guess offhand).
For `fedora-atomic/25/x86_64/docker-host` there are currently 34 `.dirmeta` in
the tree.
But how many times during a checkout did we load those 34 dirmeta objects?
With a quick strace:
```
$ strace -s 2048 -f -o strace.log ostree --repo=repo-build checkout -U fedora-atomic/25/x86_64/docker-host host-test-checkout
$ grep dirmeta strace.log | wc -l
7165
```
After, as you'd expect, we just loaded `34` from disk. We do
6 system calls (`openat+fstat+fstat+read+read+close`) per dirmeta,
so we dropped a total of 42780 system calls - which is about 20% of the total
system calls made.
`perf record` tells me that we're spending ~40 of our time in the kernel during
a checkout, so reducing syscall traffic helps. Though most of that appears to be
in the VFS and XFS layers for `linkat` (which isn't surprising).
So how much did perf improve? Well, on my workstation, I get a lot of
fluctuation in timing, sometimes by 30%, so this was well within the noise. But
it's well worth speeding up checkout, and I think this optimization will shine
more as we improve performance elsewhere.
Closes: #795
Approved by: jlebon
These are leftovers from the packfile code and should have been
deleted in commit: 2a0601efc7
I noticed this now since I wanted to add a new type of caching.
Closes: #795
Approved by: jlebon
`perf record ostree checkout ...` for a bare-user repo was telling
me we were spending a good 13% of our time in the depchain of `ot_lgexattrat()`.
The problem here is that traversing the `/proc` path turns out to be
somewhat expensive - there are LSM (SELinux) checks, etc.
For regular files, opening and just getting the xattr, then closing is still
quite cheap. For symlinks, we'll always need to open anyways.
This appears to shave about ~0.1 seconds off of a checkout of
`fedora-atomic/25/x86_64/docker-host` on my workstation.
Oh, and this was the last user of `ot_lgexattrat()` so we can kill it, which is
nice - the xattr code should really live in libglnx.
Closes: #796
Approved by: jlebon
I was planning to change one here, decided to do a conversion
of some of the simpler functions in this file to keep up momentum.
Closes: #776
Approved by: jlebon
This mode is similar to bare-user, but does not store the permission,
ownership (uid/gid) and xattrs in an xattr on the file objects in the
repo. Additionally it stores symlinks as symlinks rather than as
regular files+xattrs, like the bare mode. The later is needed because
we can't store the is-symlink in the xattr.
This means that some metadata is lost, such as the uid. When reading a
repo like this we always report uid, gid as 0, and no xattrs, so
unless this is true in the commit the resulting repository will
not fsck correctly.
However, it the main usecase of the repository is to check out with
--user-mode, then no information is lost, and the repository can
work on filesystems without xattrs (such as tmpfs).
Closes: #750
Approved by: cgwalters
There are a lot of things suboptimal about this approach, but
on the other hand we need to get our CI back up and running.
The basic approach is to - in the test suite, detect if we're on overlayfs. If
so, set a flag in the repo, which gets picked up by a few strategic places in
the core to turn on "ignore xattrs".
I also had to add a variant of this for the sysroot work.
The core problem here is while overlayfs will let us read and
see the SELinux labels, it won't let us write them.
Down the line, we should improve this so that we can selectively ignore e.g.
`security.*` attributes but not `user.*` say.
Closes: https://github.com/ostreedev/ostree/issues/758Closes: #759
Approved by: jlebon
I've seen code in a few places that I think on balance is definitely better this
way. Some of our functions have huge variable declaration sections.
This change includes one small example where we could start using declarations
after statements.
A concern I had was - how does this interact with `__attribute__((cleanup))` and
early returns? I tested it, and AFAICS the behavior is what you'd expect - the
cleanup function isn't called if its variable isn't reachable.
Closes: #718
Approved by: jlebon
It's just simpler, and I'm not sure people are going to care
much about the difference by default.
We already folded in the fallback sizes into the download totals, so folding in
the count makes things consistent; previously you could see e.g.
`3/3 parts, 100MB/150MB` and be confused.
Closes: #678
Approved by: giuseppe
There were a few bugs here.
- We need to keep track of the size of the delta parts we've already processed,
in order to make progress reliable at all in the face of interruptions. Add
a new `fetched-delta-part-size` async progress variable for this.
- The total before disregarded what we'd already downloaded, which was confusing.
Now, a progress percentage is `fetched/total`.
- Correctly handle "unknown bytes/sec" in the progress display.
However, to be fully correct we need to show the fallback objects too. That
would require tracking in the pull code when we fetch an object as a fallback
versus "normally". This would be simpler really if we could assume in a run we
were *only* processing a delta, but currently we don't do that.
Related: https://github.com/ostreedev/ostree/issues/475Closes: #678
Approved by: giuseppe
Clarify the documentation for functions like
ostree_repo_get_remote_boolean_option(), stating what out_value will be
set to on error.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #676
Approved by: cgwalters
The libcurl backend does all the work in the main thread/loop, which
seems to starve the idle scanning worker more. With the libcurl
backend, we're a lot more likely to have at least one outstanding
metadata request.
But it can more easily transiently happen with libcurl that all of our current
fetches are content. To be accurate here, just show Estimating if we're scanning
too.
Closes: #654
Approved by: jlebon
The gzip default is 6. When I was writing this code, I chose 9 under
the assumption that for long-term archival, the extra compression was
worth it.
Turns out level 9 is really, really not worth it. Here's run at level 9
compressing the current Fedora Atomic Host into archive:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 2m38.115s
user 2m31.210s
sys 0m3.114s
617M repo
```
And here's the new default level of 6:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 0m53.712s
user 0m43.727s
sys 0m3.601s
619M repo
619M total
```
As you can see, we run almost *three times* faster, and we take up *less
than one percent* more space.
Conclusion: Using level 9 is dumb. And here's a run at compression level 1:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 0m24.073s
user 0m17.574s
sys 0m2.636s
643M repo
643M total
```
I would argue actually many people would prefer even this for "devel" repos.
For production repos, you want static deltas anyways. (However, perhaps
we should support a model where generating a delta involves re-compressing
fallback objects with a bit stronger compression level).
Anyways, let's make everyone's life better and switch the default to 6.
Closes: #671
Approved by: jlebon
For a long time we've cached the remote configs in the repo, which
mostly makes sense for the `repo/config` file, but less sense
for `/etc/ostree/remotes.d`, because we want to support admins
interactively editing them.
One can delete the repo instance and create a new one, but that's a bit ugly.
Let's introduce an API for this so rpm-ostree can reload remotes after
admins/scripts edit them in `/etc`. We also might as well reload
any other entries in the config.
Structurually now, `ostree_repo_open()` deals with file descriptors, and then
calls `ostree_repo_reload_config()`. Except for the uncompressed cache, which is
the only thing that deals with FDs that can be configured. But we want to delete
that anyways.
No tests, since...we don't have a daemon in this codebase, don't want to shave
that yak just today.
Closes: #662
Approved by: jlebon
I was working on https://bugzilla.redhat.com/show_bug.cgi?id=1393545
and it was annoying that I couldn't know what the new (unsigned)
commit has was until verification succeeded. I could pull it
manually without GPG, but then it'd be sitting in the repo.
Now:
```
Updating from: fedora-atomic:fedora-atomic/25/x86_64/docker-host
Receiving metadata objects: 0/(estimating) -/s 0 bytes
error: Commit 2fb89decd2cb5c3bd73983f0a7b35c7437f23e3aaa91698fab952bb224e46af5: GPG verification enabled, but no signatures found (use gpg-verify=false in remote config to disable)
```
Closes: #663
Approved by: giuseppe
Without the element-type annotations, bindings don't know how to handle
the elements of the hash table. Since the table is created with destroy
functions, the caller does not own the elements, so transfer container
is used.
Closes: #635
Approved by: cgwalters
ostree_object_name_serialize returns a floating ref, so sink it before
adding it to the hash table so it can properly be freed later when the
hash table is destroyed.
This is particularly a problem for pygobject, which sinks the refs on
variants as it marshals them to native python types. If the ref isn't
already sunk, then the ref count won't increase and a critical warning
will be raised when both the hash table and pygobject try to unref it.
Closes: #635
Approved by: cgwalters
I was having this thought today about making more of the OS readonly,
and ultimately if we got to the point where all ostree operations are
through the repo and sysroot dfds, we could have rpm-ostree be the
only process holding those fds open, and have a read-only bind mount
on top.
Anyways, we're not there, likely won't be soon, but this gets us
closer to being fully fd relative.
Closes: #628
Approved by: jlebon