The faster (OpenSSL/GnuTLS) code lived in a `GInputStream` wrapper, and that
adds a lot of weight (GObject + vtable calls). Move it into a simple
autoptr-struct wrapper, and use it in the metadata path, so we're
now using the faster checksums there too.
This also drops a malloc there as the new API does hexdigest in place to a
buffer.
Prep for more work in the commit path to avoid `GInputStream` for local file
commits, and ["adopting" files](https://github.com/ostreedev/ostree/pull/1255).
Closes: #1256
Approved by: jlebon
For many cases of commit, we can actually optimize things by simply "adopting"
the object rather than writing a new copy. For example, in rpm-ostree package
layering.
We can only make that optimization though if we take ownership of the file. This
commit hence adds an API where a caller tells us to do so. For now, that just
means we `unlink()` the files/dirs as we go, but we can now later add the
"adopt" optimization.
Closes: #1255
Approved by: jlebon
Buried in this large patch is a logical fix:
```
- if (!map)
- return glnx_throw_errno_prefix (error, "mmap");
+ if (map == (void*)-1)
+ return glnx_null_throw_errno_prefix (error, "mmap");
```
Which would have helped me debug another patch I was working
on. But it turns out that actually correctly checking for
errors from `mmap()` triggers lots of other bugs - basically
because we sometimes handle zero-length variants (in detached
metadata). When we start actually returning errors due to
this, things break. (It wasn't a problem in practice before
because most things looked at the zero size, not the data).
Anyways there's a bigger picture issue here - a while ago
we made a fix to only use `mmap()` for reading metadata from disk
only if it was large enough (i.e. `>16k`). But that didn't
help various other paths in the pull code and others that were
directly doing the `mmap()`.
Fix this by having a proper low level fs helper that does "read all data from
fd+offset into GBytes", which handles the size check. Then the `GVariant` bits
are just a clean layer on top of this. (At the small cost of an additional
allocation)
Side note: I had to remind myself, but the reason we can't just use
`GMappedFile` here is it doesn't support passing an offset into `mmap()`.
Closes: #1251
Approved by: jlebon
Appease Coverity by using the same condition for both the ternary check
and the if-condition later on. It should be smart enough to figure out
that `dir_enum == NULL` implies that `dfd_iter != NULL` from the
assertion at the top of the function.
Coverity CID: #1457318Closes: #1250
Approved by: cgwalters
Spotted while reading through the code, it looks like the
copy_detached_metadata() call is accidentally omitted if a hardlink
already exists for the .commit object.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #1242
Approved by: cgwalters
We can't use the cache if the file we want to commit has been modified
by the client through the file info or xattr modifiers. We would
prematurely look into the cache in `write_dfd_iter_to_mtree_internal`,
regardless of whether any filtering applied.
We remove that path there, and make sure that we only use the cache if
there were no modifications. We rename the `get_modified_xattrs` to
`get_final_xattrs` to reflect the fact that the xattrs may not be
modified.
One tricky bit that took me some time was that we now need to store the
st_dev & st_ino values in the GFileInfo because the cache lookup relies
on it. I'm guessing we regressed on this at some point.
This patch does slightly change the semantics of the xattr callback.
Previously, returning NULL from the cb meant no xattrs at all. Now, it
means to default to the on-disk state. We might want to consider putting
that behind a flag instead. Though it seems like a more useful behaviour
so that callers can only override the files they want to without losing
original on-disk state (and if they don't want that, just return an
empty GVariant).
Closes: #1165Closes: #1170
Approved by: cgwalters
In particular I'd like to get the copy fix in, since it might affect users for
the keyring bits.
Update submodule: libglnx
Closes: #1225
Approved by: jlebon
Noticed this while reading the code. The `child` var hasn't been
initialized yet at the time we throw this error (and even then, it's
only conditionally initialized). To be nice, let's just always calculate
the child path and pass that along.
Also do some minor style porting to decl near use.
Closes: #1216
Approved by: cgwalters
Add a few comments for each of the central functions used for committing
data from a directory. Took me a bit to understand the relationship
between those functions.
Closes: #1216
Approved by: cgwalters
This fixes up the last of the embarassing bits I saw from
the stack trace in:
https://github.com/ostreedev/ostree/issues/1184
We had a hardlink fast path, but that doesn't apply across
devices, which occurs in two notable cases:
- Installer ISO with local repo
- Tools like pungi that copy the repo to a local snapshot
Obviously there are a lot of subtleties here around things like the
bare-user-only conversions as well as exactly what data we copy. I think to get
better test coverage we may want to add `pull-local --no-hardlink` or so.
Closes: #1197
Approved by: jlebon
For the old `OSTREE_REPO_MODE_ARCHIVE_Z2`. Use it mostly tree
wide except for the repo finder tests (to avoid conflicting with
some outstanding PRs).
Just noted another user coming in some of those tests and wanted to do a
cleanup.
Closes: #1209
Approved by: jlebon
We added a `.dir-locals.el` in commit: 9a77017d87
There's no need to have it per-file, with that people might think
to add other editors, which is the wrong direction.
Closes: #1206
Approved by: jlebon
While opening a repo we've recorded the device/inode for a while; use it to
avoid calling `linkat()` during object import if we know it's going to fail.
Closes: #1193
Approved by: jlebon
Conceptually `ostree-repo-pull.c` should be be written using
just public APIs; we theoretically support building without HTTP
for people who just want to use the object store portion and
do their own fetching.
We have some nontrivial behaviors in the pull layer though; one
of those is the "bareuseronly" verification. Make a new internal
API that accepts flags, move it into `commit.c`. This
is prep for further work in changing object import to support
reflinks.
Closes: #1193
Approved by: jlebon
There are use cases for not syncing at all; think build cache repos, etc. Let's
be consistent here and make sure if fsync is disabled we do no sync at all.
I chose this opportunity to add tests using the shiny new strace fault
injection. I can forsee using this for a lot more things, so I made
the support for detecting things generic.
Related: https://github.com/ostreedev/ostree/issues/1184Closes: #1186
Approved by: jlebon
Update libglnx, which is mostly port the repo stagedir code
to the new tmpdir API. This turned out to require some
libglnx changes to support de-allocating the tmpdir ref while
still maintaining the on-disk dir.
Update submodule: libglnx
Closes: #1172
Approved by: jlebon
Doing this in prep for libglnx tmpdir porting, but I think we should also do
this because the partial fetch code IMO was never fully baked; among other
things it was never integrated into the scheme we came up with for "boot id
sync" that we use for complete/staged objects.
There's a lot of complexity here that while we have some coverage for, I think
we need to refocus on the core functionality. The libcurl backend doesn't have
an equivalent to this today.
In particular for small objects, this is simply overly complex. The downside is
clearly for large objects like FAH's 61MB initramfs; not being able to resume
fetches of those is unfortunate.
In practice though, I think most people should be using deltas, and we need to
make sure deltas work for large objects anyways.
Further ultimately the peer-to-peer work should help a lot for people
with truly unreliable connections.
Closes: #1176
Approved by: jlebon
There were some important ones there like a random `syncfs()`. The remaining
users are mostly blocked on the "fstatat enoent" case, I'll wait to port those.
Closes: #1150
Approved by: jlebon
The new --selinux-policy added in [0] exposed a subtle issue in the way
we handle labeling during commit. The CI system in rpm-ostree hit this
when trying to make use of it[1].
Basically, because of the way we use a GVariant to represent xattrs, if
a file to be committed already has an SELinux label, the xattr object
ends up with *two* label entries. This of course throws off fsck later
on, since the checksum will have gone over both entries, even though the
on-disk file will only have a single label (in which the second entry
wins).
I confirmed that the `fsck` added in the installed test fails without
the rest of this patch.
[0] https://github.com/ostreedev/ostree/pull/1114
[1] https://github.com/projectatomic/rpm-ostree/pull/953Closes: #1121
Approved by: cgwalters
The wrapping here is unnecessary, since `_ostree_repo_commit_modifier_apply()`
already does what this function did. Further, the return type was wrong.
Saw this while doing some libarchive work.
Closes: #1104
Approved by: jlebon
This is mostly the `copy_file_range` changes plus the Coverity files.
```
Colin Walters (4):
localalloc: Abort on EBADF from close() by default
local-alloc: Remove almost all macros like glnx_free, glnx_unref_variant
console: Fix Coverity NULL deref warning
fdio: Merge systemd code to use copy_file_range(), use FICLONE
Jonathan Lebon (1):
console: trim useless check
Matthew Leeds (1):
dirfd: Fix typo in comment
Philip Withnall (1):
glnx-console: Add missing NULL check before writing out text
```
Update submodule: libglnx
Closes: #1081
Approved by: jlebon
(remaining > 0) is asserted by the loop condition, and remaining is not
modified between that check and the G_UNLIKELY — so the condition in the
G_UNLIKELY will always be true.
Spotted by Coverity as issue #1452617.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #1059
Approved by: cgwalters
Part of cleaning up our usage of libglnx; we want to use what's in GLib where we
can.
Had to change a few .c files to `#include ostree.h` early on to pick up
autoptrs for the core types.
Closes: #1040
Approved by: jlebon
There are multiple use cases where we'd like to alias refs.
First, having a "stable" alias which gets swapped across major
versions: https://pagure.io/atomic-wg/issue/228
Another case is when a ref is obsoleted;
<https://pagure.io/atomic-wg/issue/303>
This second one could be done with endoflife rebase, but I think
this case is better on the server side, as we might later change
our minds and do actual releases there.
I initially just added some test cases for symlinks in the `refs/heads` dir to
ensure this actually works (and it did), but I think it's worth having APIs.
Closes: #1033
Approved by: jlebon
Coverity complained that the `else if (bytes_read == 0)` was dead
code if we happened to find it was already false when testing
`else if (G_UNLIKELY (bytes_read == 0 ...`.
There was nothing wrong with the logic, but let's rework it to
only test the value once; I think it does end up nicer anyways.
Coverity CID: 1452186
Closes: #1037
Approved by: jlebon
Mostly for the latest `-Wmaybe-uninitialized` fix, but while here also port some
places to newer APIs.
Update submodule: libglnx
Closes: #1027
Approved by: jlebon
Regression from previous tmpfile refactoring; unfortunately
the `OSTREE_REPO_COMMIT_MODIFIER_FLAGS_GENERATE_SIZES` option
only has coverage via gjs currently.
Might expose it via the cmdline in a later option, but in the big picture the
idea was that this data is better kept in static deltas.
Closes: https://github.com/ostreedev/ostree/issues/1014Closes: #1016
Approved by: jlebon
Using the error prefixing in the delta processing allows us to
do new code style. Also strip trailing whitespace.
Use error prefixing in a few other random places. I didn't
hunt for all of them, just testing out the new API.
Use `glnx_fchmod()`. Also note I dropped one `fchmod (tmpf, 0600)`
which is no longer necessary.
Update submodule: libglnx
Closes: #1011
Approved by: jlebon
Use goffset rather than gsize for file sizes. More importantly, get the
unpacked_size from g_file_info_get_size() (goffset) rather than from the
splice return value, which has type gssize.
This will make a difference on 32-bit systems, where goffset is defined
as off64_t, but gsize is 32 bits.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #999
Approved by: cgwalters
And in general, if for some reason we can't write `user.` xattrs, provide an
error immediately rather than doing it during a later pull. This way the failure
cause is a lot more obvious.
Related: https://github.com/ostreedev/ostree/issues/991Closes: #993
Approved by: jlebon
For ostree-as-host, we're the superuser, so we'll blow past
any reserved free space by default. While deltas have size
metadata, if one happens to do a loose fetch, we can fill
up the disk.
Another case is flatpak: the system helper has similar concerns
here as ostree-as-host, and for `flatpak --user`, we also
want to be nice and avoid filling up the user's quota.
Closes: https://github.com/ostreedev/ostree/issues/962Closes: #987
Approved by: jlebon
This is prep for storage space checks, where we look at free
space after parsing the metadata, before we write anything.
We did length-limited writes in the fd-based input path, but not for the
`GInputStream` path which in practice is used for HTTP pulls.
Closes: #987
Approved by: jlebon
Some of the Jenkins jobs for Fedora Atomic Host broke after updating
to 2017.7, and it turns out that we regressed handling unreadable
files in `bare-user` mode. An example of this is `/etc/shadow`, which
ends up in the ostree-as-host content as `/usr/etc/shadow`.
Now there are better fixes here; we should probably delete it and create it
during the config merge if it doesn't exist. In general, having secret files in
ostree really isn't supported, so it doesn't make sense to include them.
But let's fix this regression - when operating as an unprivileged user we don't
have `CAP_DAC_OVERRIDE` and hence will fail to open un-user-readable objects.
(We still preserve the actual `0` mode of course in the xattr and will
apply it in `bare`)
Closes: #989
Approved by: jlebon
I had thought `glnx_link_tmpfile_at()` actually consumed the tmpfile;
it does consume the *path* but not the fd. In the non-delta path
things were fine since we used the autocleanup.
But the delta code had a tmpfile allocated in its struct that got reused, and
hence leaked the fd. Fix this by making the commit API actually consume the
tmpfile fully, just like the path path.
Closes: #986
Approved by: jlebon
It's more natural for a few calling places. Prep for patches to go the other
way, which in turn are prep for adding a commit filter v2 that takes `struct
stat`.
`ot_gfile_type_for_mode()` was only used in this function, so inline it here.
Closes: #974
Approved by: jlebon
Use the new macros introduced recently in libglnx to make iterating over
hash tables cleaner. This is just a start, it does not migrate the whole
tree.
Update submodule: libglnx
Closes: #971
Approved by: cgwalters
There's lots of mechanically replacing `OtTmpFile` with `GLnxTmpfile`;
the biggest changes are in the commit path. Symlink commits are now
very clearly separated from regular files. Symlinks are `OtCleanupUnlinkat`,
and regular files are `GLnxTmpfile`.
The commit codepath separates those as `_ostree_repo_commit_path_final()` and
`_ostree_repo_commit_tmpf_final()`. A nice aspect of all of this is that they
both *consume* the temporary on success. This avoids an extra spurious
`unlink()` call.
One of the biggest bits of code motion is in `commit_loose_regfile_object()`,
which no longer needs to care about symlinks. For the most parth though it's
just removing conditionals.
Update submodule: libglnx
Closes: #958
Approved by: jlebon
The `OstreeRepoContentBareCommit` struct was basically an `OtTmpFile`, so let's
make it one. I moved the "convert to `GOutputStream`" logic into the callers,
since that bit can't fail; it makes the implementation much simpler since we can
just return the result of `ot_open_tmpfile_linkable_at()`.
Prep for `GLnxTmpfile` porting.
Closes: #957
Approved by: jlebon
The pull code also could make use of this in both the metadata and content
paths. I changed it to own the tempfile malloc (just like `GLnxTmpFile`), since
there's no reason to have different lifetimes for the filename and the file, and
that way we only have one variable rather than two.
The content path turns out to be a special case though, where
at least for mirroring archives, we directly pass the file *path*
down into `_ostree_repo_commit_loose_final()`.
This is prep for `GLnxTmpFile` porting.
Closes: #957
Approved by: jlebon
The variables here were duplicative; we don't need two booleans to distinguish
between symlinks and regular files. What we do need to handle is the "physical"
state versus the "object" state. Symlinks objects are stored as regular files in
`bare-user` and `archive`.
Prep for more cleanup.
Closes: #957
Approved by: jlebon
These are tuples of (collection ID, ref name) which are a globally-unique
form of local ref. They use OstreeCollectionRef as an identifier, and hence
need to be accessed using new API, as the existing API uses string
identifiers and sometimes accepts refspecs. Remote names are not
supported as part an OstreeCollectionRef.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Closes: #924
Approved by: cgwalters
Both callers of `commit_loose_object_trusted()` were passing
`OSTREE_OBJECT_TYPE_FILE`, so drop that parameter. This in turn
allows us to drop lots of checking of that inside the function.
Add a doc comment, and rename to `commit_loose_content_object()` for clarity.
Closes: #914
Approved by: alexlarsson
I noticed my previous patches incorrectly started doing `return glnx_throw*`
inside a `goto out;` function. Fix this by porting forward consistently to new
style. We just do the error prefixing in the caller.
Closes: #914
Approved by: alexlarsson
When a transaction is finished and we have moved all the staged loose
objects into the repo we fsync all the object directory, to ensure the
filenames are stable before we update the refs files to point to the
new commits.
With out this an unclean shutdown after the transaction is finished
could result in a refs file that points to an incomplete commit.
https://bugzilla.gnome.org/show_bug.cgi?id=759442Closes: #918
Approved by: cgwalters
These exist in the wild for flatpak, and aren't really a problem. The canonical
permissions are still either `0755` or `0644`, we just support the additional
writable bit for the group (i.e. extend the set to include `0775` and `0664`)
now to avoid breaking some flatpak content.
Closes: #913
Approved by: alexlarsson
Having every object in a bare-user repo (and checkouts) be executable
is ugly. I can't think of a good reason to do that; they should only
be executable if their input is. This does
for `bare-user` what we did for `bare-user-only` in
https://github.com/ostreedev/ostree/pull/909
It's also a stronger version of what we do with `checkout -U` in suppressing
suid - here we also strip world-writable files and the sticky bit (even though
that's meaningless today, it might not be in the future).
Closes: https://github.com/ostreedev/ostree/issues/907Closes: #908
Approved by: alexlarsson
For the flatpak use case where bare-user-only was introduced, we actually
don't want to support s{u,g} id files in particular.
Actually, I can't think of a reason to have anything outside of the
`0755 i.e. (u=rwx,g=rx,o=rx)` mask, so that's what we do here.
This will have the effect of treating existing `bare-user-only` repositories as
corrupted if they have files outside of that mask, but I think we should do this
now; most of the flatpak users will still be on `bare-user`, and we haven't
changed the semantics of that mode yet.
Note that in this patch we will also *reject* file content that doesn't
match this. This is somewhat asymmetric, since we aren't similarly rejecting
e.g. directory metadata. But, this will close off the biggest source
of the problem for flatpak (setuid binaries).
See: https://github.com/ostreedev/ostree/pull/908
See: https://github.com/flatpak/flatpak/pull/837Closes: #909
Approved by: alexlarsson
There was a lot of conditionals inside `write_object()` differentating
between metadata/content, and then for content, on the different repo
types. Further, in the metadata path since the logic is simpler, can
present a non-streaming API, and further use `OtTmpfile`, etc.
Splitting them up helps drop a lot of conditionals. We introduce a small
`CleanupUnlinkat` that allows us to fully convert to the new code style in both
functions.
This itself is still prep for fully switching to `GLnxTmpfile`.
Closes: #881
Approved by: jlebon
If we have an expected checksum, call `fstatat(repo_dfd, checksum)`
early on before we do much else. This actually duplicates code,
but future work here is going to split up the metadata/content
commit paths, so they'll need to diverge anyways.
Closes: #881
Approved by: jlebon
First, the streaming metadata API is pretty dumb, since metadata
should be small. Really we should have supported a `GBytes`
version. Currently, this API *is* used when we do local pulls,
so this commit has test coverage. However, I plan to change
the object import to avoid using this. But that's fine, since
I can't think of why someone would use this API.
Next, the only difference between `ostree_repo_write_metadata()` and
`ostree_repo_write_metadata_trusted()` is whether or not we pass
an output checksum; so just dedup the implementations.
Also while I'm here break out the input length validation and do
it early in the streaming case.
Closes: #881
Approved by: jlebon
Rather than `g_output_stream_splice()`, where the input is a regular
file.
See https://github.com/GNOME/libglnx/pull/44 for some more information.
I didn't try to measure the performance difference, but seeing the
read()/write() to/from userspace mixed in with the pointless `poll()` annoyed me
when reading strace.
As a bonus, we will again start using reflinks (if available) for `/etc`,
which is a regression from the https://github.com/ostreedev/ostree/pull/797
changes (which before used `glnx_file_copy_at()`).
Also, for the first time we'll use reflinks when doing commits from file-backed
content. This happens in `rpm-ostree compose tree` today for example.
Update submodule: libglnx
Closes: #817
Approved by: jlebon
Allow GI bindings to delete refs through ostree_repo_transaction_set_ref
and ostree_repo_transaction_set_refspec by setting the checksum to NULL.
Closes: #834
Approved by: cgwalters
While running the testsuite under valgrind a small memory leak showed up:
==16487== 65 bytes in 1 blocks are definitely lost in loss record 773 of 1,123
==16487== at 0x4C2BBAF: malloc (vg_replace_malloc.c:299)
==16487== by 0x6048E08: g_malloc (gmem.c:94)
==16487== by 0x6062EAE: g_strdup (gstrfuncs.c:363)
==16487== by 0x54CE3E6: write_object (ostree-repo-commit.c:776)
==16487== by 0x54CF2D4: ostree_repo_write_metadata (ostree-repo-commit.c:1528)
==16487== by 0x54CF505: _ostree_repo_write_directory_meta (ostree-repo-commit.c:1712)
==16487== by 0x54D0AB4: write_dfd_iter_to_mtree_internal (ostree-repo-commit.c:2650)
==16487== by 0x54D0E2D: ostree_repo_write_dfd_to_mtree (ostree-repo-commit.c:2793)
==16487== by 0x1190C4: ostree_builtin_commit (ot-builtin-commit.c:474)
==16487== by 0x11F2EE: ostree_run (ot-main.c:200)
==16487== by 0x116F32: main (main.c:78)
The reason for this is that ot_checksum_instream_get_string returns a chunk of newly allocated memory which never got freed.
Make actual_checksum something that gets autocleanend and own the memory
assigned to it in all cases.
Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Closes: #827
Approved by: pwithnall
If we're freeing the segment, it's basically always better to use
`autoptr()`. Fewer lines, more reliable, etc.
Noticed an instance of this in the pull code while reviewing a different PR,
decided to do a grep for it and fix it tree wide.
Closes: #836
Approved by: pwithnall
I was working on `rpm-ostree livefs` which does some ostree-based
filesystem diffs, and noticed that we were ending up with `/proc`
not being labeled in our base trees.
Reading the selinux-policy source, indeed we have:
```
/proc -d <<none>>
/proc/.* <<none>>
```
This dates pretty far back. We really don't want unlabeled
content in ostree. In this case it's mostly OK since the kernel
will assign a label, but again *everything* should be labeled via
OSTree so that it's all consistent, which will fix `ostree diff`.
Notably, `/proc` is the *only* file path that isn't covered when composing a
Fedora Atomic Host. So I added a hack here to hardcode it (although I'm a bit
uncertain about whether it should really be `proc_t` on disk before systemd
mounts or not).
Out of conservatism, I made this a flag, so if we hit issues down the line, we
could easily change rpm-ostree to stumble on as it did before.
Closes: #768
Approved by: jlebon
I didn't touch everything since at least `commit_loose_object_trusted`
does this:
```
out:
if (G_UNLIKELY (error && *error))
g_prefix_error (error, "Writing object %s.%s: ", checksum, ostree_object_type_to_string (objtype));
```
Which...it'd be interesting to make into an autocleanup. But for now just
keeping up with converting things bit by bit.
Closes: #761
Approved by: jlebon
This adds to file permission masks the same bitmask that will
be applied to file objects in bare-user* repos. This will be
needed in the testsuite to ensure that the things we commit
will be expressable in bare-user-only repos.
Closes: #750
Approved by: cgwalters
This mode is similar to bare-user, but does not store the permission,
ownership (uid/gid) and xattrs in an xattr on the file objects in the
repo. Additionally it stores symlinks as symlinks rather than as
regular files+xattrs, like the bare mode. The later is needed because
we can't store the is-symlink in the xattr.
This means that some metadata is lost, such as the uid. When reading a
repo like this we always report uid, gid as 0, and no xattrs, so
unless this is true in the commit the resulting repository will
not fsck correctly.
However, it the main usecase of the repository is to check out with
--user-mode, then no information is lost, and the repository can
work on filesystems without xattrs (such as tmpfs).
Closes: #750
Approved by: cgwalters
There are a lot of things suboptimal about this approach, but
on the other hand we need to get our CI back up and running.
The basic approach is to - in the test suite, detect if we're on overlayfs. If
so, set a flag in the repo, which gets picked up by a few strategic places in
the core to turn on "ignore xattrs".
I also had to add a variant of this for the sysroot work.
The core problem here is while overlayfs will let us read and
see the SELinux labels, it won't let us write them.
Down the line, we should improve this so that we can selectively ignore e.g.
`security.*` attributes but not `user.*` say.
Closes: https://github.com/ostreedev/ostree/issues/758Closes: #759
Approved by: jlebon
The current `OstreeChecksumInputStream` is public due to a historical
mistake. I'd like to add an OpenSSL checksum backend, but that's
harder without breaking this API.
Let's ignore it and create a new private version, so it's easier to do the
GLib/OpenSSL abstraction in one place.
Closes: #738
Approved by: jlebon
The gzip default is 6. When I was writing this code, I chose 9 under
the assumption that for long-term archival, the extra compression was
worth it.
Turns out level 9 is really, really not worth it. Here's run at level 9
compressing the current Fedora Atomic Host into archive:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 2m38.115s
user 2m31.210s
sys 0m3.114s
617M repo
```
And here's the new default level of 6:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 0m53.712s
user 0m43.727s
sys 0m3.601s
619M repo
619M total
```
As you can see, we run almost *three times* faster, and we take up *less
than one percent* more space.
Conclusion: Using level 9 is dumb. And here's a run at compression level 1:
```
ostree --repo=repo pull-local repo-build fedora-atomic/25/x86_64/docker-host
real 0m24.073s
user 0m17.574s
sys 0m2.636s
643M repo
643M total
```
I would argue actually many people would prefer even this for "devel" repos.
For production repos, you want static deltas anyways. (However, perhaps
we should support a model where generating a delta involves re-compressing
fallback objects with a bit stronger compression level).
Anyways, let's make everyone's life better and switch the default to 6.
Closes: #671
Approved by: jlebon
`-fsanitize=address` complained that the `refcount > 0` assertions
were reading without atomics. We can fix this by reworking them
to read the previous value.
Closes: #582
Approved by: jlebon
You'd expect
ostree commit --tree=ref=A --tree=ref=B
to produce a commit with the union of the trees given. Instead you'd get
a commit with the contents of just the latter commit. This was due to an
optimisation where we'd skip filling out the `files` and `subdirs`
members of the mtree, just filling in the metadata instead. This backfires
becuase this same code relies on checking the `files` and `subdirs` members
itself to work out whether the mtree is empty.
This commit removes the optimisation, fixing the bug. Maybe there's a way
to keep the optimisation and still fix the bug but it's not obvious to
me.
Closes: #581
Approved by: cgwalters
When doing commit --tree=ref=XXX while at the same time applying some
form of modifier, ostree dies trying to read the xattrs using the
raw syscalls. We fix this by falling back to ostree_repo_file_get_xattrs()
in this case.
Also adds a testcase for this.
Closes: #577
Approved by: cgwalters
If there is a transaction active, then we put writes to detached
metadata into the staging dir, and when reading it we look there
first. This allows transactions to be aborted half-way without
writing the detached metadata into the repository (possibly
overwriting any old metadata from there).
This fixes https://github.com/ostreedev/ostree/issues/526Closes: #539
Approved by: giuseppe
If the detached metadata is not in the repo, try in the parent
repo if that is set.
Without this a commit will not gpg validate in the child repo
Closes: #539
Approved by: giuseppe
We hold a fd open on this, and it's basically now expected
to be immortal. Confer that status.
This was showing up in flatpak crashers, because we'd get
an unexpected errno.
(I didn't test this fixes the crasher, but it's clearly right)
https://bugzilla.redhat.com/show_bug.cgi?id=1347293Closes: #476
Approved by: alexlarsson
In general this is even cleaner now, though it was better after I
extracted a helper function for the "write tempfile with contents"
bits that were shared between metadata and regular file codepaths.
Closes: #369
Approved by: jlebon
When reworking the ostree core [to use O_TMPFILE](https://github.com/ostreedev/ostree/pull/369),
I hit an issue in the way the untrusted delta codepath ends up trying
to re-open the file to checksum it. That's not possible with
`O_TMPFILE` since the fd (which we opened `O_WRONLY`) is the only
accessible reference to the content.
Fix this by changing the delta processing code to update a checksum as
we're doing writes, which is also faster, and ends up simplifying the
code as well.
What would be an even larger simplification here is if we e.g. used a
separate thread calling `write_object()` or something like that; the
main issue I see there is somehow bridging the fact that function
wants a `GInputStream*` but the delta code is generating stream of
writes.
Closes: #392
Approved by: jlebon
When trying to switch ostree to `O_TMPFILE`, I hit the fact that
by default it uses mode `000`. It still works to write to the
open fd of course, but it *doesn't* work to set xattrs because
that code path for some reason in the kernel checks the mode bits.
This only broke for bare-user repos where we tried to set the xattr
before calling `fchmod()`, so just invert those two operations.
Closes: #391
Approved by: jlebon
Import `gs_file_enumerator_iterate()` for the next six months or
so...after RHEL 7.3 is released I'm strongly considering hard
requiring 2.46 or so.
Likely at some point we should figure out how to share more "glib
backport" code with NetworkManager at least.
Closes: #341
Approved by: jlebon
The recent memleak fixes motivated me to look at the bitrotted code to
run invocations of `ostree` in the test suite underneath valgrind.
There are a few things here. First, update suppressions file from
libhif, since I recently worked on it.
When running *uninstalled* as we now support, we need
`libtool --mode=execute` in the mix so it expands out to
the uninstalled binary and we don't valgrind the intermediate shell.
However, it's harder than that because we chdir into a tmpdir,
which defeats the libtool logic. AFAICS, the only fix for this
is to determine the realbin path before we chdir, and then unfortunately
we need to change every use of `ostree` to `${OSTREE}` =(
Then this immediately breaks for me on RHEL7 because my ancient
copy of `valgrind-3.10.0-16.el7.x86_64` is unaware of syscall 306, i.e.
`syncfs`.
But let's do this first before I dive into that.
Closes: #292
Approved by: krnowak
1 is a better choice than 0 because some programs use 0
as a special value; for example, GNU Tar warns of an
"implausibly old timestamp" with 0.
Closes: #330
Approved by: cgwalters
We have a lot of "allow_noent" type wrapper functions since
a common pattern is to allow files to not exist, but still
throw cleanly on other issues.
This is another instance of that, and cleans up duplicated error
handling code.
Part of this is prep for moving away from `GFile` consumers.
Closes: #319
Approved by: jlebon
In addition to generic fd relative porting,
this is a necessary preparatory step for libglnx porting, because
when I tried to use `g_mapped_file_new` I hit an issue with
it using a different error domain from GIO.
Thankfully libglnx consistently uses the GIO error domain, and here
we're now using it for the `open()` call.
Closes: #317
Approved by: jlebon
This tries to avoid leaking GVariantBuilders and GVariants in some
situations. The leaks were usually happening when some error occurred
or because of unclear variant ownership situation.
The former is mostly about making sure that g_variant_builder_clear is
called on builders that didn't finish their variant building process.
The latter is surely more work - sometimes the result of
g_variant_builder_end() should not be passed directly to a function,
but rather stored in a g_autoptr(GVariant), sunk and then passed to a
function. IMO, with an advent of g_autoptr, GVariants should be always
sunk instead of relying on some receiver function sinking it. This
would make an easy-to-follow policy of always sinking your
variants. Functions could then assume that the passed variant is
already sunk. These leaks are still happenning in commands, but they
are less harmful, since that code will not be used by some daemon as a
library routine.
Closes: #291
Approved by: cgwalters
- Make hardlink handling more generic. The previous strategy worked for
tar archives, but not for cpio. It now works for both.
- Add support for SEL labeling (through the OstreeRepoCommitModifier)
- Add support for xattr_callback (through the OstreeRepoCommitModifier)
- Add support for filter (through the OstreeRepoCommitModifier)
- Add a use_ostree_convention option
Closes: #275
Approved by: cgwalters
The filesystem commit code will never give us potentially hostile
filenames, and when importing from archives, we do some validation.
However, we should be extra paranoid and also add error messages in
the mtree in case someone tries to import a hostile
libarchive-supported format.
Closes: #283
Approved by: jlebon
We were arbitrarily only deleting content after exactly one day. Some
use cases may want something else; make it configurable.
Closes: #170
Approved by: jlebon
We had a policy of cleaning up all files in `$repo/tmp` older
than one day, but we should really clean up previous bootid staging
directories too, as they can potentially take up a lot of disk space.
https://bugzilla.gnome.org/show_bug.cgi?id=760531Closes: #170
Approved by: jlebon
Setting this causes commit to error out. There are other ways we
could do this in a more sophisticated fashion, such as via SystemTap
etc. But this has low-tech applicablity, works as non-root.
The reason I'm adding this is so that we can add test cases for
cleanup of the `tmp/staging-` directory.
Closes: #170
Approved by: jlebon
There was some leftover intermediate cruft here I noticed
while reviewing another patch:
- We had an output `GFile*` for that was never used
- We required the caller to allocate the loose pathbuf, but
none of them ever reused it
- We had an extra intermediate function
Also while looking at this, I'm now uncertain whether some of the
callers of `_ostree_repo_has_loose_object` should really be invoking
`ostree_repo_has_object()`, but let's leave that aside for now.
Closes: #272
Approved by: alexlarsson
in write_directory_content_to_mtree_internal dfd_iter can be NULL,
for instance if commiting from --tree=ref=FOO. Don't blindly de-ref
it to avoid crashing.
Closes: #256
Approved by: cgwalters
In some cases (such as `ostree-sysroot-cleanup.c`), the surrounding
code would be substantially cleaner if it was also ported to
fd-relative, but I'm going to do that in a separate patch.
That way these patches are easier to review for mechanical
correctness. I used an Emacs keyboard macro as the poor man's
[Coccinelle](http://coccinelle.lip6.fr/).
⚠️ There is a notable spiked pit trap here around
`posix_fallocate()` and `errno`. This has bit other projects,
see e.g.
7bb87460e6
Otherwise the port was straightforward.
A fast way to generate new OSTree content using an existing
tree is to checkout (as hard links), add/replace files, then
call `ostree_repo_scan_hardlinks()`, then commit.
But `ostree_repo_scan_hardlinks()` scans the entire repo, which
can be slow if you have a lot of content.
All we really need is a mapping of (device,inode) -> checksum
just for the objects we checked out, then use that mapping
for commits.
This patch adds API so that callers can create a mapping via
`ostree_repo_devino_cache_new()`, then pass it to
`ostree_repo_checkout_tree_at()` which will populate it, and then
`ostree_repo_write_directory_to_mtree()` can consume it.
I plan to use this in rpm-ostree for package layering work.
Notes:
- The old `ostree_repo_scan_hardlinks()` API still works.
- I tweaked the cache to be a set with the checksum colocated with
the key, to avoid a separate malloc block per entry.
https://github.com/GNOME/ostree/pull/167
Concurrent pulls break since we're sharing the staging directory for
all transactions in the repo. This makes us use a per-transaction directory.
However, in order for resumes to work we first look for existing
staging directories and try to aquire an exclusive lock for them. If
we can't find any staging directory or they are all already locked,
then we create a new one.
https://bugzilla.gnome.org/show_bug.cgi?id=757611
ostree_repo_write_commit_with_time() converts the timestamp to
big-endian byte order.
ostree_repo_write_commit() was also doing this when calling
ostree_repo_write_commit_with_time(), resulting in a corrupted
commit object (timestamp bytes were backwards).
Recent regression in 14ffd7022a
Do not delete a .commitmeta file after removing the last metadata entry.
This way a client will pull the empty .commitmeta file and overwrite old
metadata as expected.
https://bugzilla.gnome.org/750459
Also renames OstreeRepoTrustedContentBareCommit to
OstreeRepoContentBareCommit so that it can be used by both.
This will be needed when we introduce checksum verification of objects
in static deltas.
When a commit is deleted and the repo is configured to use tombstone
commits, create one. Delete the tombstone file only if the commit is
pulled again.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This originally was a way that we detected the case where a pull was
interrupted. Later, we added `.commitpartial` files which also cover
this case.
See also https://github.com/GNOME/ostree/pull/85
We still want to honor their existence (and unlink them) in case an
old version of ostree was in use, but I believe it's safe to stop
creating them now.
The only case where this would break is if you have a version of
ostree that predates commitpartial in your rollback history, but such
old versions are no longer in use by operating systems I support at
least.
Closes: https://github.com/GNOME/ostree/pull/100
I was hitting a bug in libguestfs/guestmount/FUSE where it blew up
with EINVAL on directories containing lots of files (more than
32000?). We really want to use prefixed subdirs just like the real
objects/ directory does.
This allows us to share more code between the paths, is more
efficient, etc.
gnome-continuous uses the ostree_repo_scan_hardlinks() mode to
avoid re-checksumming everything. However, when I ported the commit
code to use openat() and friends, this optimization was lost.
Re add it. The difference is about 15s versus 5 minutes.
When doing a pull --mirror from an archive-z2 repository into another
archive-z2 repository, currently we gunzip/checksum/gzip each content
object. The re-gzip process in particular is fairly expensive.
This does assume that the upstream content is trusted and correct.
It'd be nice in the future to do at least a CRC check, if not the full
checksum. (Could we append CRC data to the end of filez objects?)
We could also choose to only do this optimization if fetching over
TLS.
before: 1626 metadata, 20320 content objects fetched; 299634 KiB transferred in 62 seconds
after : 1626 metadata, 20320 content objects fetched; 299634 KiB transferred in 11 seconds
For future delta work where we do more interesting things than just
"tar of new objects", this lays the groundwork for doing streaming
writes into content objects.
It's also more efficient, as we avoid many intermediate allocations
and virtual calls. Just a single `g_output_stream_write_all` for the
splice case.
Conflicts:
src/libostree/ostree-repo-private.h
src/libostree/ostree-repo-static-delta-processing.c
Do not write directly to objects/ but maintain pulled files under tmp/
with a "tmpobject-$CHECKSUM.$OBJTYPE" name until they are syncfs'ed to
disk.
Move them under objects/ at ostree_repo_commit_transaction cleanup
time.
Before (test done on a local network):
$ LANG=C sudo time ./ostree --repo=repo pull origin master
0 metadata, 3 content objects fetched; 83820 KiB; 4 delta parts
fetched, transferred in 417 seconds
16.42user 6.73system 6:57.19elapsed 5%CPU (0avgtext+0avgdata
248428maxresident)k
24inputs+794472outputs (0major+233968minor)pagefaults 0swaps
After:
$ LANG=C sudo time ./ostree --repo=repo pull origin master
0 metadata, 3 content objects fetched; 83820 KiB; 4 delta parts
fetched, transferred in 9 seconds
14.70user 2.87system 0:09.99elapsed 175%CPU (0avgtext+0avgdata
256168maxresident)k
0inputs+794472outputs (0major+164333minor)pagefaults 0swaps
https://bugzilla.gnome.org/show_bug.cgi?id=728065
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
If an object already existed and we somehow tried to pull it, the
caller would still expect a returned checksum.
This appears to happen with static deltas for some reason; we might be
including duplicate metadata objects. Regardless, this is a bug that
should be fixed.