The generic GKeyFile error isn't quite informative enough here.
I hit this with the new compose process where we don't automatically
inject a configured remote into the generated disk images; we expect
people to add them.
https://bugzilla.gnome.org/show_bug.cgi?id=731346
There's several use cases for calling into ostree itself to do
mirroring, instead of using bare rsync. For example, it's a bit more
efficient as it doesn't require syncing the objects/ directory.
https://bugzilla.gnome.org/show_bug.cgi?id=728351
For the local repository on the system, it's not the usual case to
have the complete compose history. Rather than erroring out, provide
a bit more friendly message.
https://bugzilla.gnome.org/show_bug.cgi?id=731538
In the case of running ostree as non-root on a regular filesystem (not
tmpfs which doesn't support immutable), we should just silently do
nothing if we encounter EPERM. Cache the result to avoid spam in
strace.
https://bugzilla.gnome.org/show_bug.cgi?id=728006
We weren't installing the headers, but at the moment all symbols
starting with ostree_ were being exported. Fix that by prefixing
non-static symbols with '_'.
https://bugzilla.gnome.org/show_bug.cgi?id=731369
Finally, fsync to ensure all entries are on disk, unless disabled.
We support disabling this for cases like server-side buildroot
construction where we don't need to be robust against power loss
The previous S_IMMUTABLE commit broke ostree-remount; / is now not
actually writable. All we really wanted to know though was whether it
was *mounted* writable, so check that via statvfs() which is cleaner
anyways (i.e. not via access() which kernel people hate).
https://bugzilla.gnome.org/show_bug.cgi?id=728006
On some storage configurations, fsync() can be extremely expensive.
Developers and users with slow hard drives may want the ability to opt
for speed over safety.
Furthermore, many production servers have UPS and stable kernels, and
the risk of not fsync'ing in that scenario is fairly low. These users
should also be able to opt out.
This prevents people from creating new directories there and expecting
them to be persisted. The OSTree model has all local state to be in
/etc and /var.
This introduces a compile-time dependency on libe2fsprogs.
We're only doing this for the root directory at the moment.
https://bugzilla.gnome.org/show_bug.cgi?id=728006
If fetching GPG-signed commits over plain HTTP, a MitM attacker can
fill up the drive of targets by simply returning an enormous stream
for the commit object.
Related to this, an attacker can also cause OSTree to perform large
memory allocations by returning enormous GVariants in the metadata.
This helps close that attack by limiting all metadata objects to 10
MiB, so the initial fetch will be truncated.
But now the attack is only slightly more difficult as the attacker
will have to return a correctly formed commit object, then return a
large stream of < 10 MiB dirmeta/dirtree objects.
https://bugzilla.gnome.org/show_bug.cgi?id=725921
The current "transaction" symlink was introduced to fix issues with
interrupted pulls; normally we assume that if we have a metadata
object, we also have all objects to which it refers.
There used to be a "summary" which had all the available refs, but I
deleted it because it wasn't really used, and was still racy despite
the transaction bits.
We still want the pull process to use the transaction link, so don't
delete the APIs, just relax the restriction on object writing, and
introduce a new ostree_repo_set_ref_immediate().
They shouldn't be loaded for random test/personal repositories. Doing
so triggers another bug in that we return them from
ostree_repo_get_config() when then causes clients to write them out
permanently to disk with ostree_repo_write_config(). This caused test
suite failures.
This makes it easy to use for the case where rpm-ostree-toolbox is
injecting systemd services into the deployment root, and we don't
actually need to traverse the whole FS.
For many OS install scenarios, one runs through an installer which may
come with embedded data, and then the OS is configured post-install to
receive updates.
In this model, it'd be nice to avoid the post-install having to rewrite
the /ostree/repo/config file.
Additionally, it feels weird for admins to interact with "/ostree" -
let's make the system feel more like Unix and have our important
configuration in /etc.
https://bugzilla.gnome.org/show_bug.cgi?id=729343
The option --port-file=- is most useful when the stdout of the daemon
is programatically redirected and not going to a terminal. The
flush-after-a-line behavior of stdout is specific to terminals, so
we need an explicit flush.
https://bugzilla.gnome.org/show_bug.cgi?id=729609
For the static deltas work, we're using the already-extant internal
API to perform a HTTP fetch for optional data - static deltas are
optional.
Except that we didn't correctly unset the error if we were doing an
optional fetch and the data wasn't found.
There's two halves to this; first, when we create an hierarchy, we
need to call fsync(). Second, we need to fsync again anytime after
we've modified a directory.
To be really sure that any directory entries have hit disk we need to
call fsync() on the directory fd. This API allows us to conveniently
create a directory hierarchy, fsyncing all of it along the way.
Let's be a bit more conservative here and actually fdatasync() the
configurations we're generating.
I'm seeing an issue at the moment where syslinux isn't finding the
config sometimes, and while I don't think this is the issue, let's try
it.
There was an attempted optimization to only write if changed, but this
is broken - we always write the bootloader config into a new
directory.
In theory we should only be writing if it changed, but let's not do a
broken optimization.
The previous commit here changed things so that we do mkdir(x, 0700),
then fchmod later only if we created the directory.
However the logic was incorrect; we still need to chmod even in
MODE_USER if we created the directory.
Otherwise this broke atomicity; we could fetch/store the ref, then
crash, and then not upgrade the next time we tried upgrading.
The correct model is: the tree has changed if the new ref is different
from the merge deployment.
Trying to implement "rpm-ostree rollback", in the case where we have 2
deployments with the same bootconfig that we're reordering, we need to
write bootconfig, not just swap the bootlinks.
It turns out people sometimes want to be able to change the kernel
arguments. Add a convenient API to do so for the current deployment.
This will be used by Anaconda.
There are going to be a few utilities that are only useful for
installers and disk image creation tools. Let's not expose them all
at the toplevel; instead, hide them under "instutil".
Code like rpm-ostree generates disk images directly. In order to
ensure SELinux labeling is correct, it currently has a helper program
that runs over the deployment root, then over the whole disk and to
only set a label if none exist.
In order to make it easier to write installers such as Anaconda
without having them depend on rpm-ostree (or whatever other
build-server side program), pull in the helper code here.
We shouldn't g_print() from a library, particularly when the
expectation is that the client has an async progress set up.
This should fix the pull output extending the status line.
If a MITM attacker (or just network corruption) causes a temporary
downloaded object in tmp/ to be corrupted, we'll end up
continually trying to commit it, and fail.
Fix this unlinking the temp file immediately after opening it. This
will ensure that if we exit due to an error (or crash), the kernel
will clean up the space for us.
https://bugzilla.gnome.org/show_bug.cgi?id=725924
The idea with this is that things like yum should be able to look for
it and determine whether or not they should assume that they can
change things on the system.
https://bugzilla.gnome.org/show_bug.cgi?id=725380
This is a bit more efficient in that we're not walking full paths, and
it helps avoid security/reliability issues if an attacker (or just a
misbehaving process) has the ability to mutate paths in the middle.
Mixing async and threads has proved to be too much for my little mind.
It has race conditions that I've tried repeatedly to fix, but failed.
The threading here was scanning metadata objects - and there are
two parts to that:
1) Physically loading them from disk
2) Parsing them
Now #1 has been partially addressed by avoiding a storm of lstat() if
we're starting from a known working state. If pull gets interrupted,
then we do need to rescan all objects. Also, we can address this with
local metadata packfiles.
The other potentially slow bit is that we recurse across the metadata,
blocking the main thread. We could ameliorate that in the future by
scheduling metadata parsing as idle "chunks".
Anyways, let's move the needle back to reliability, and readd speed
more carefully.
https://bugzilla.gnome.org/show_bug.cgi?id=706456
We don't want to allow MITM attackers to intercept upgrade requests
and provide clients with older OS versions vulnerable to security
flaws.
Only "ostree admin upgrade" gets this behavior for now - whether we
want to do it for "ostree admin switch" is another question.
It's better if this is independent from the OstreeSysroot; for
example, a policy is active in a given deployment root at once, not
for a sysroot globally.
We can also collect SELinux-related API in one place.
Unfortunately at the moment there can be only one instance of this
class per process.
We're seeing some hangs while ostree is fetching updates.
I imagine the fact that SoupSessionAsync has no timeout by default
could be the cause of this.
Set timeout values to 60 seconds, which is the default for the new
SoupSession API which we may switch to later.
https://bugzilla.gnome.org/show_bug.cgi?id=724310
tmpfiles.d configurations generally require write access to some places
that are read-only until ostree-remount runs.
Make sure ostree-remount has run first.
Thanks to Cosimo Cecchi for finding and diagnosing this problem.
https://bugzilla.gnome.org/show_bug.cgi?id=724183
The instructions one finds on the internets are apparently wrong, we
really need to keep the default here, since gpgme uses it to actually
find the helper binary it runs.
This fixes the GPG tests for me on EL7 at least.
This has a very basic level of functionality (deltas can be generated,
and applied offline). There is only some stubbed out pull code to
fetch them via HTTP.
But, better to commit this now and improve it from a known starting
point, rather than have it languish in a branch.
First, /var needs to be labeled at least once. We should probably
rearrange things so that /var is only created (and labeled) on the
first deployment, but this patch adds a /var/.ostree-selabeled file
instead.
Second, when doing the /etc merge, we compare the xattrs of the old
/usr/etc versus the current /etc. The problem with that is that the
policy has different labels for /usr/etc on disk than the real /etc.
The correct fix for this is a bit invasive - we have to take the
physical content of the old /usr/etc, but compare the labels as if
they were really in /etc.
Instead for now, just ignore changes to xattrs. If the file
content/mode changes, then we take the new file (including any changed
xattrs).
Bottom line: just doing chcon -t blah_t /etc/foo.conf may be lost on
upgrade (for now).
This will be used by guestmount - it's WAY faster. We only take disks
as a unit, so it's safe. If the process fails halfway through, we
just start over from scratch the next time anyways.
The trees as shipped come with /usr/etc, which should just be labeled
as usr_t. When we do a deployment, we need to relabel the copies of
the files we're making in /etc.
SELinux support is compile and runtime optional.
The intent of this code I'm fairly certain was to use *.gpg from the
trusted.gpg.d, directory. But right now, we're only using
"pubring.gpg" from that directory, which is odd.
Let's fix this to use all keys ending in .gpg, which will also
include pubring.gpg.
Only send _IDLE messages if and only if we state transition the main
thread (from idle -> !idle or !idle -> idle). This ensures that we
don't send IDLE, then get it back, and process that when we're !idle.
This is a redesign (again) of the pull code. It is simpler and
survives 20 minutes of testing in a loop, whereas the old code would
only go from 30 seconds to 2 minutes.
The problem with the old code was that there was a race where we might
determine idle state even when there are content requests in flight
between the metadata thread and the main one.
This code majorly reworks things - there's now only one IDLE message,
sent in a circle from the main thread, through the metadata scanner,
and back to the main one.
Crucially it's only sent when the *main* thread is idle. Previously
we were looking at whether the metadata scanner is idle, but that
doesn't make a lot of sense. First let's make sure the main thread is
idle, then verify that the metadata one is.
This closes the loop because we'll have ensured we get any pending
requests.
https://bugzilla.gnome.org/show_bug.cgi?id=706456
The "ordered hash" code was really just for kernel arguments. And it
turns out it needs to be a multihash (for e.g. multiple console=
arguments).
So turn the OstreeOrderedHash into OstreeKernelArgs, and move the bits
to split key=value and such into there.
Now we're not making this public API yet - the public OstreeSysroot
just takes char **kargs. To facilitate code reuse between ostree/ and
libostree/, make it a noinst libtool library. It'll be duplicated in
the binary and library, but that's OK for now. We can investigate
making OstreeKernelArgs public later.
https://bugzilla.gnome.org/show_bug.cgi?id=721136
The official way to add bootloader arguments to the current deployment
is to redeploy with --karg. However, doing so tripped up an
optimization made inside the deployment code to just swap the
bootlinks if we're keeping the same "bootcsum".
Change this optimization to look at the pair of (bootcsum, options).
We can't use #ifdef in the headers, since then g-ir-scanner won't pick
up the functions (unless we included config.h). Let's instead always
have the symbols, but just set an error if we were built without
support for it, just like how pull works.
This large patch moves the core xattr logic down into libgsystem,
which allows the gs_shutil_cp_a() API to copy them. In turn, this
allows us to just use that API instead of rolling our own recursive
copy here.
As noted in the new comment though, one case that we are explicitly
regressing is where the new /etc removes a parent directory that's
needed by a modified file. This seems unlikely for most vendors now,
but let's do that as a separate bug.
https://bugzilla.gnome.org/show_bug.cgi?id=711058
Previously the progress meter would bump in large chunks after we
completed a download. Instead, poll in progress files via fstat() for
their size, and add those to the running total.
The libostree already treats passing NULL for osname as "booted
osname, if any". We should do the same inside the tools. The upgrade
builtin had this logic duplicated there; we should be able to safely
remove it.
https://bugzilla.gnome.org/show_bug.cgi?id=710970
When booted into an ostree you can deploy without passing
an --os option. That was crashing though, because
ot_admin_complete_deploy_one is called with NULL
osname but it was not handling it properly.
https://bugzilla.gnome.org/show_bug.cgi?id=710970
Several APIs in libostree were moved there from the commandline code,
and have hardcoded g_print() for progress and notifications. This
isn't useful for people who want to write PackageKit backends, custom
GUIs and the like.
From what I can tell, there isn't really a winning precedent in GLib
for progress notifications.
PackageKit has the model where the source has GObject properties that
change as async ops execute, which isn't bad...but I'd like something
a bit more general where say you can have multiple outstanding async
ops and sensibly track their state.
So, OstreeAsyncProgress is basically a threadsafe property bag with a
change notification signal.
Use this new API to move the GSConsole usage (i.e. g_print()) out from
libostree/ and into ostree/.
Add a --generate-sizes option to commit to add size information to the
commit metadata. This will be used by higher level code which wants
to determine the total size necessary for downloading.
The gpgme examples use this, but from what I can tell we don't really
need to because we don't need detailed results; we only care whether
we signed it at all.
We need to use the full shutil_rm_rf() in order to actually delete
complete directories.
Test suite code based on a patch from Sjoerd Simons <sjored@luon.net>
https://bugzilla.gnome.org/show_bug.cgi?id=710097
Adapted from Google protobufs. For several cases, we want to support
e.g. file sizes up to guint64, but paying the cost of 8 bytes for each
number is too high.
This will be used for static deltas and sizes metadata.
If a deployment is somehow in the list twice, the hash table will free
the *new* value with g_hash_table_insert which gets all broken. Just
use g_hash_table_replace().
This commit changes the sysroot API so that one can create arbitrary
new deployment checkouts, then commit them as one step. This is to
enable things like an automatic bisection tool which say create 50
deployments at once, then when done clean them up.
This also moves some printfs from the library into src/ostree.
I plan to rename all of these APIs to use the term 'loose', so that it
makes more sense after pack files are introduced. External users
should not use them; instead use _load_variant() or _read_commit().
This uses gpgv for verification against DATADIR/ostree/pubring.gpg by
default. The keyring can be overridden by specifying OSTREE_GPG_HOME.
Add a unit test for commit signing with gpg key and verifying on pull;
to implement this we ship a test GPG key generated with no password
for Ostree Tester <test@test.com>.
Change all of the existing tests to disable GPG verification.
Add an optional dependency on gpgme to add GPG signatures into the
detached metadata, with the key "ostree.gpgsigs", as an "aay", an
array of signatures (treated as binary data).
The commit command gains a --gpg-sign=<key-id> argument. Also add an
argument --gpg-homedir to set the GPG homedir where we look for
keyrings.
I was getting hangs in the test suite, and looking at the previous
commit, we were calling the async completion functions out of the
finalizer for the URI, which is weird. I didn't analyze what's going
wrong, but what we really should be doing is processing our internal
queue after we've downloaded a file, and the request is about to be
finalized.
I suspect doing queue management from the finalizer created a circular
reference type situation.
This patch deduplicates the queue processing bits too.
https://bugzilla.gnome.org/show_bug.cgi?id=708126
On a large ostree repository pulling over http slows to a crawl. Pulling
from localhost results in:
5944 metadata, 63734 content objects fetched; 850509 KiB transferred in
1106 seconds
In other words about 800KiB/s. Some profiling shows that essentially
all of the CPU goes into libsoup doing its request bookkeeping instead
of into the actual downloading.
Adding a simple queue to limit to number of active request sent into
libsoup makes for a dramatic improvement:
5944 metadata, 63734 content objects fetched; 850509 KiB transferred
in 89 seconds
So around 9450 KiB/s.
https://bugzilla.gnome.org/show_bug.cgi?id=708126
If we had two deployments with different boot checksums, and were
trying to remove the one that was the same and add a new one (the
normal case), we'd end up assuming due to comparison with 0 that
we only needed to do the fast subbootversion swap.
Fix this by actually putting 1 where we really mean 1.
And update the tests to verify the fix; I have double-verified by
undoing the fix, and noting that the test fails.
https://bugzilla.gnome.org/show_bug.cgi?id=708351
The actual deployment checksum shouldn't be in there, because we may
just swap bootlinks, rendering the name of the old bootloader entry
file invalid. Thankfully nothing actually parsed the names of these
files, so let's just use the index.
This is just something I noticed on inspection; we should catch any
changes to /boot in the sync(), even though theoretically gio should
have done fdatasync().
Now that we have a real GObject for the sysroot, we have a convenient
place to keep track of 4 pieces of state:
* The current deployment list
* The current bootversion
* The current subbootversion
* The current booted deployment (if any)
Avoid requiring callers to pass all of this around and load it
piecemeal; instead the new thing is ostree_sysroot_load().
I ran into Jeremy Katz today, and he gave me permission to relicense
the small bits of switch-root.c to LGPLv2+. This combined with
permission from Peter Jones allows OSTree to become fully LGPLv2+.
Not a big deal, it's just a lot clearer to only have one license, and
it makes it easier to turn application code into library code.
read_commit resolves the ref to a commit, and a lot of consumers want
the resolved commit for their own purposes; this prevents them from
calling resolve_rev themselves.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
We want an OstreeRepoFile to be the way to reference a "filesystem
tree" that's stored in the repo, which is a combination of a DIR_TREE
and a DIR_META. The idea is that once you write an mtree to the repo
using ostree_repo_write_mtree, it becomes serialized and you get an
OstreeRepoFile in return.
Change any APIs that care about DIR_TREE / DIR_META checksums to care
about OstreeRepoFiles instead, which right now is mostly is
ostree_repo_write_commit.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
We want an OstreeRepoFile to be the way to represent a filesystem tree
inside an ostree repository. In order to do this, we need to drop the
commit from an OstreeRepoFile, and make that go to callers.
Switch all current users of ostree_repo_file_new_root to
ostree_repo_read_commit, and make the actual constructor private.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
Previously I thought we'd have to ditch the current commit
format to avoid a{sv} due to
See https://bugzilla.gnome.org/show_bug.cgi?id=673012
But I realized that we don't really have to care about
unpacking/repacking commit objects, so let's just re-expose the
existing metadata a{sv} in commits in the API.
Also, add support for "detached" metadata that can be updated at any
time post-commit. This is specifically designed for GPG signatures.
https://bugzilla.gnome.org/show_bug.cgi?id=707379
...get_thread_default returns NULL when the thread default is also the global
default, so this only shows up when running in a thread (eg g_task_run_in_thread)
This is a relic from long ago when we were trying to stage objects
before finally committing them all in one go in the pull code.
We're no longer doing that, so stop trying to make the directory.
This also fixes trying to use ostree as non-root to read the
root-owned repo, since we'd fail to create the pending dir.
For the cases where we can't hardlink, use at-relative walking of the
path where possible. We still don't have lsetxattrat, so we also need
to deal with pathnames, but that is now only for symlinks.
Again, the advantages of this are a lot less malloc() of pathnames in
ostree, and much less time spent traversing paths inside the kernel.
https://bugzilla.gnome.org/show_bug.cgi?id=707733
Nothing external uses it. We keep ostree_get_xattrs_for_file() public
because it's convenient for external consumers to get xattrs in
exactly the format we desire.
https://bugzilla.gnome.org/show_bug.cgi?id=707733
Do as many operations as we can using the original file descriptor
while we have it open, rather than writing, closing, then reopening.
This necessitated very explicitly special casing symbolic links,
mainly due to the lack of lsetxattrat().
https://bugzilla.gnome.org/show_bug.cgi?id=707733
Clean up how we deal with the uncompressed object cache; we now use
openat()/linkat() and such just like we do for the main objects/.
Use linkat() between the objects and the destination, if possible.
https://bugzilla.gnome.org/show_bug.cgi?id=707733
Instead, use OstreeRepoFile as a handle for the parent commit.
We need to add an accessor for the metadata checksum, as that
hasn't been exposed before.
https://bugzilla.gnome.org/show_bug.cgi?id=707727
We can't quite do it for bare repositories yet because we need to have
a way to go from struct stat -> GFileInfo, and that's buried in gio's
private GLocalFile class.
Just use openat() for locating variants, rather than doing the lstat()
+ open(). This also drops several malloc+object allocations from the
lookup path.
Add new internal API to both fstatat() and write a pathname for the
given object. Use it in commit, and also wrapped in the old
GFile-based API.
This is more efficient.
Rather than having separate write_ref calls, make clients start a
transaction, add some refs, and then commit it. While this doesn't
make it 100% atomic, it makes it easier for us to use an atomic
model, and it means we don't do as much I/O updating the summary
file and such.
https://bugzilla.gnome.org/show_bug.cgi?id=707644