This is inspired by the [Coccinelle](http://coccinelle.lip6.fr/) usage
in systemd. I also took it a bit further and added infrastructure
to have spatches which should never apply. This acts as a blacklist.
The reason to do the latter is that Coccinelle is *way* more powerful than the
regular expressions we have in `make syntax-check`.
I started with blacklisting `g_error_free()` directly. The reason that's bad is
that it leaves a dangling pointer.
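As a sketch of the idea (the in-tree spatch may differ), a never-apply
semantic patch just matches the forbidden call; if `spatch` flags any site,
the check fails:
```
// Hypothetical blacklist entry: flag any direct g_error_free() call.
// g_clear_error() avoids leaving a dangling pointer behind.
@@
expression e;
@@
* g_error_free(e);
```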
Closes: #754
Approved by: jlebon
I was playing around in a FAH vagrant box, and hit:
```
Receiving delta parts: 3/4 453.2 kB/s 1.8 MB/145.8 MB
error: opcode set-read-source: No such file object b6e54ba3471b9c116ce6b9bfbf9e55fec60d35cfdb9ae5ae1ee219af02a591b7
```
This is because this host version doesn't yet have
https://github.com/ostreedev/ostree/pull/710
which incidentally fixed this for the case where the OS vendor is using
summary files.
Some organizations may not be using summary files - at least we still try to
support that case. So let's copy the logic very recently added in that commit to
handle the legacy case too.
No new tests since this is a nice-to-have - we really do
expect people to be using summary files now.
Closes: #739
Approved by: jlebon
When using Flatpak with GNOME Software, it is important to
show the progress of the download and install as close as
possible to the real progress.
However, OSTree fixes the frequency of the async progress
callback at 1 second, which causes an unpleasant effect
on the UI, especially when the download is so small that
everything happens in less than 1 second.
Fix that by making OSTree read a custom 'update-frequency'
option and set the timeout source interval to it. If no
custom frequency is passed, we assume the default 1 second
timeout, maintaining the current behavior.
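A minimal sketch of the caller side, assuming 'update-frequency' is a `u`
entry (in milliseconds) in the `a{sv}` pull options and that the remote is
named "origin":
```
#include <ostree.h>

/* Sketch: request progress callbacks every 100 ms instead of the
 * default 1 s.  Assumes "update-frequency" is a 'u' (milliseconds)
 * entry in the pull options dictionary. */
static gboolean
pull_with_fast_progress (OstreeRepo *repo, OstreeAsyncProgress *progress,
                         GCancellable *cancellable, GError **error)
{
  g_autoptr(GVariantBuilder) builder =
      g_variant_builder_new (G_VARIANT_TYPE ("a{sv}"));
  g_variant_builder_add (builder, "{s@v}", "update-frequency",
                         g_variant_new_variant (g_variant_new_uint32 (100)));
  g_autoptr(GVariant) options =
      g_variant_ref_sink (g_variant_builder_end (builder));
  return ostree_repo_pull_with_options (repo, "origin", options,
                                        progress, cancellable, error);
}
```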
Closes: #725
Approved by: jlebon
The previous logic for static deltas was to use the current
branch tip as the FROM revision. However, we want to support
deltas between branches in an automatic fashion.
If a summary file is available, we already have an
enumerated list of deltas - so the logic introduced
here is to search it, and find the newest commit
we have locally that matches the TO revision target.
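Roughly, the search looks like this (a sketch, not the actual pull code;
assumes delta names of the form "FROM-TO" with full hex checksums):
```
#include <ostree.h>
#include <string.h>

/* Sketch: among deltas enumerated in the summary, pick the newest
 * commit we already have locally whose TO end matches the target. */
static char *
find_newest_local_from (OstreeRepo *repo, GPtrArray *delta_names,
                        const char *to_revision)
{
  g_autofree char *best = NULL;
  guint64 best_timestamp = 0;

  for (guint i = 0; i < delta_names->len; i++)
    {
      const char *name = delta_names->pdata[i];
      const char *dash = strchr (name, '-');
      if (dash == NULL || strcmp (dash + 1, to_revision) != 0)
        continue; /* not a delta ending at our target */

      g_autofree char *from = g_strndup (name, dash - name);
      g_autoptr(GVariant) commit = NULL;
      if (!ostree_repo_load_variant_if_exists (repo, OSTREE_OBJECT_TYPE_COMMIT,
                                               from, &commit, NULL)
          || commit == NULL)
        continue; /* we don't have the FROM end locally */

      guint64 ts = ostree_commit_get_timestamp (commit);
      if (ts > best_timestamp)
        {
          best_timestamp = ts;
          g_free (best);
          best = g_steal_pointer (&from);
        }
    }
  return g_steal_pointer (&best);
}
```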
This builds on some thoughts from
https://github.com/ostreedev/ostree/pull/151#issuecomment-232390232
Closes: https://github.com/ostreedev/ostree/pull/151
Closes: #710
Approved by: giuseppe
In https://github.com/ostreedev/ostree/pull/408, we disabled the use of
static deltas when mirroring. Later,
https://github.com/ostreedev/ostree/pull/506 loosened this up again so
that we could use static deltas when mirroring into bare{-user} repos.
However, the issue which originally spurred #408 is even more generic
than that: we want to avoid static deltas for any archive repo, not just
when doing a mirror pull. This patch tightens this up, and also
relocates the decision code to make it easier to read.
Closes: #715
Approved by: cgwalters
Particularly when HTTP requests fail, I really want a lot more information.
We could theoretically stuff it into the `GError` message field, but
that gets ugly *fast*.
Using the systemd journal allows us to log things in a structured fashion.
Right now e.g. rpm-ostree won't be aware of this additional information,
but I think we could teach it to be down the line.
In the short term, users can learn to find it from `systemctl status rpm-ostreed`
or `journalctl -b -r -u rpm-ostreed`, etc.
One thing I'd like to do next is log successful fetches of e.g. commit objects
as well with more information about the originating server (things like the
final URL if we were redirected, did we use TLS pinning, what was the negotiated
TLS version+cipher, etc).
Closes: #708
Approved by: jlebon
It's just simpler, and I'm not sure people are going to care
much about the difference by default.
We already folded the fallback sizes into the download totals, so folding in
the count makes things consistent; previously you could see e.g.
`3/3 parts, 100MB/150MB` and be confused.
Closes: #678
Approved by: giuseppe
I don't know why I added support for this; it makes no sense really. If we have
large metadata objects something has gone badly wrong.
The delta compiler has always only processed fallbacks for regular
content files.
Dropping support in the fetcher for this will simplify later handling of
fallback progress accounting.
Closes: #678
Approved by: giuseppe
There were a few bugs here.
- We need to keep track of the size of the delta parts we've already processed,
in order to make progress reliable at all in the face of interruptions. Add
a new `fetched-delta-part-size` async progress variable for this.
- Previously, the total disregarded what we'd already downloaded, which was confusing.
Now, the progress percentage is simply `fetched/total`.
- Correctly handle "unknown bytes/sec" in the progress display.
However, to be fully correct we'd need to show the fallback objects too. That
would require the pull code to track whether an object was fetched as a
fallback or "normally". This would really be simpler if we could assume a run
*only* processes a delta, but currently we don't do that.
Related: https://github.com/ostreedev/ostree/issues/475
Closes: #678
Approved by: giuseppe
When fetching over a fast enough connection, we can be receiving files
faster than we write them. This can then lead to EMFILE when we have
enough files open. This was made very easy to notice with the upcoming
libcurl backend, which makes use of pipelining.
Closes: #675
Approved by: cgwalters
For rpm-ostree, we already link to libcurl indirectly via librepo, and
only having one HTTP library in process makes sense.
Further, libcurl is (I think) more popular in the embedded space. It
also supports HTTP/2.0 today, which is a *very* nice-to-have for OSTree.
This seems to be working fairly well for me in my local testing, but it's
obviously brand new nontrivial code, so it's going to need some soak time.
The ugliest part of this is having to vendor in the soup-url code. With
Oxidation we could follow the path of Firefox and use the
[Servo URL parser](https://github.com/servo/rust-url). Having to redo
cookie parsing also sucked, and that would also be a good oxidation target.
But that's for the future.
Closes: #641
Approved by: jlebon
The libcurl backend does all the work in the main thread/loop, which
seems to starve the idle scanning worker more. With the libcurl
backend, we're a lot more likely to have at least one outstanding
metadata request.
But with libcurl it can more easily happen transiently that all of our current
fetches are content. To be accurate here, just show "Estimating" if we're
scanning too.
Closes: #654
Approved by: jlebon
Working on the libcurl backend, I didn't want to reimplement another queue. I
think the queue logic is really better done at the high level, since the fetcher
knows how we want to prioritize metadata over content, etc.
Adding another queue here is duplication, but things will look nicer when we can
actually delete the libsoup one in the next commit.
Closes: #654
Approved by: jlebon
I was working on https://bugzilla.redhat.com/show_bug.cgi?id=1393545
and it was annoying that I couldn't know what the new (unsigned)
commit hash was until verification succeeded. I could pull it
manually without GPG, but then it'd be sitting in the repo.
Now:
```
Updating from: fedora-atomic:fedora-atomic/25/x86_64/docker-host
Receiving metadata objects: 0/(estimating) -/s 0 bytes
error: Commit 2fb89decd2cb5c3bd73983f0a7b35c7437f23e3aaa91698fab952bb224e46af5: GPG verification enabled, but no signatures found (use gpg-verify=false in remote config to disable)
```
Closes: #663
Approved by: giuseppe
This is prep for the libcurl porting. `GTlsCertificate/GTlsDatabase` are
abstract classes implemented in glib-networking for gnutls. curl's APIs take
file paths as strings, so it's easier to work on both if we move the GLib TLS
bits into the libsoup code.
Closes: #651
Approved by: giuseppe
I was making some other changes in this code, and noticed that we were adding
checksums without object types into the same hash table for metadata. We should
*never* do this with both metadata and content objects, since in theory a
content object could have the same hash as a metadata object.
I don't actually think it's possible in practice for pure metadata to collide,
since they have different structures, but let's do this anyway since it's
conceptually right.
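A sketch of the conceptually-right shape (hypothetical helpers, not the
actual pull code), keying by the serialized (checksum, objtype) pair:
```
#include <ostree.h>

/* Hypothetical helpers: track requested objects keyed by the serialized
 * (checksum, objtype) pair, so a content object can never alias a
 * metadata object that happens to share its hash. */
static GHashTable *
new_requested_set (void)
{
  return g_hash_table_new_full (g_variant_hash, g_variant_equal,
                                (GDestroyNotify) g_variant_unref, NULL);
}

static void
note_requested (GHashTable *requested, const char *checksum,
                OstreeObjectType objtype)
{
  GVariant *key =
      g_variant_ref_sink (ostree_object_name_serialize (checksum, objtype));
  g_hash_table_add (requested, key); /* the set owns the reference */
}
```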
Closes: #651
Approved by: giuseppe
For the pending libcurl port, the backend is a bit more
sensitive to the main context setup. The delta superblock
fetch here is a synchronous request in the section that's
supposed to be async.
Now, libsoup definitely supports mixing sync and async requests, but it wasn't
hard to help the libcurl port here by making this one async. Now fetchers are
either sync or async.
Closes: #636
Approved by: jlebon
The previous commit introduced a single low level API - however,
we can do things in a more optimal way for the curl backend if
we drop the "streaming API" variant. Currently, we only use
it to synchronously splice into a memory buffer...which is pretty
silly when we could just do that in the backend.
The only tweak here is that we have an "add NUL character" flag that is
(possibly) needed when fetching into a membuf.
The code here ends up being better I think, since we avoid the double return
value for the `_finish()` invocation, and now most of the fetcher code (in the
soup case) writes to a `GOutputStream` consistently.
This will again make things easier for a curl backend.
Closes: #636
Approved by: jlebon
Conceptually these now sit on top of the core API, and don't reference libsoup.
This is preparation for libcurl porting, but it's also just generally better.
Closes: #636
Approved by: jlebon
This is in preparation for the libcurl port. We're basically making public what
we had internally. The next step here is to create `ostree-fetcher-util.[ch]`
that only operates in terms of this lower level API.
Also drop the `_mirrored` from the function name since it's
the default now.
Closes: #636
Approved by: jlebon
This is a migration from the origin version. It's
nicer to have it in the remote, since that's what one
needs to change. Then tools don't need to mess with
the origin file.
In fact in this scenario one can keep the "media source" like
`file:///install/repo` or whatever, since conceptually that's where it
came from. We're just providing a better error.
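For illustration only (this commit doesn't name the key, so the
`unconfigured-state` name below is my assumption), the remote stanza could
look like:
```
# Hypothetical remote stanza; assumes the key is named unconfigured-state.
[remote "vendor-os"]
url=file:///install/repo
unconfigured-state=This system is not registered; use your vendor tooling to configure it.
```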
Closes: https://github.com/ostreedev/ostree/issues/626
Closes: #627
Approved by: jlebon
These are out parameters, so add the (out) annotation and switch
(nullable) to (optional) since the latter is used for the purpose of
optional out parameters.
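For a hypothetical function, the corrected annotation reads:
```
/**
 * example_lookup:
 * @key: the key to look up
 * @out_value: (out) (optional): return location for the value, or %NULL
 *
 * (nullable) would describe a NULL value; (out) (optional) is the
 * right pair for an out parameter the caller may omit.
 */
```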
Closes: #629
Approved by: cgwalters
This is what we do for non-local (i.e. HTTP) pulls; we want to
correctly handle being interrupted during partial pulls.
Closes: https://github.com/ostreedev/ostree/issues/579
Closes: #613
Approved by: jlebon
This is a follow up to conversation on list - in practice, if we're
backing away from summary signing, then it makes sense to remove the
special casing for checksums in deltas around summary signatures.
This is also related to the recent change to enable GPG checking for
commits in deltas - now we have a more coherent story between the
previous pull path and deltas.
I didn't do any performance checking, and while it's slightly annoying
that we're now doing sha256 on the delta content twice (once for the
part and once per object)...sha256 is pretty fast, I think most users
are I/O bound anyway, and it'd drop even further if we started using
openssl.
Closes: #612
Approved by: jlebon
The fact that we weren't doing this is at best an oversight, and
for some deployment models a security vulnerability. Having both
`gpg-verify` and `gpg-verify-summary` shows that we were intending
them to be orthogonal/independent.
Lately I've been advocating moving towards pinned TLS instead of
gpg-signed summaries, and if we follow that path, performing GPG
verification of commit objects even if using deltas is more important,
as it provides an at-rest verifiable authenticity and integrity
mechanism.
Content providers which are signing their summary files and/or using
TLS (particularly pinned TLS) for transport should treat this as a
nice-to-have. However, for providers which are serving content over
plain HTTP and relying on GPG, this is a critical update.
Closes: https://github.com/ostreedev/ostree/issues/517
Closes: #589
Approved by: jlebon
What the code calls "scanning" is ensuring we (potentially
recursively) have an object, and fetching it if not. Then, if
it's metadata, parsing it and finding new objects to fetch.
This logic has grown fairly complex. What I'm trying to fix
right now is that if we're doing a pull-local to a remote repository
via `sshfs` (FUSE) we still end up scanning, which is inefficient.
We can take advantage of the "commitpartial" logic here - if a commit
isn't partial, it's complete, hence we don't need to scan it.
At the same time, I'm changing the logic here to *always* do scans for
dirtree objects. This will fix cases where multiple commits share
dirtree objects. We have "commitpartial" metadata, but no such concept
of partial/complete for dirtrees.
But, we'll only ever scan dirtrees if we scan commits, which is
what the section above fixes.
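A sketch of the "is this commit partial?" test, assuming the on-disk layout
`repo/state/<checksum>.commitpartial` for the partial-state markers:
```
#include <glib.h>
#include <unistd.h>

/* Sketch: a commit is partial iff its .commitpartial marker exists. */
static gboolean
commit_is_partial (int repo_dfd, const char *checksum)
{
  g_autofree char *path =
      g_strdup_printf ("state/%s.commitpartial", checksum);
  return faccessat (repo_dfd, path, F_OK, 0) == 0;
}
```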
Closes: https://github.com/ostreedev/ostree/issues/543
Closes: #564
Approved by: alexlarsson
Some deployments may want to gate access to content based on things
like OAuth. In this model, the client system would normally compute a
token and pass it to the server via an API.
We could theoretically support this in the remote config too, but
that'd be a bit weird for OAuth as the information is dynamic.
Therefore this cleans up the code a little bit to more clearly handle
the case that the fetcher is initialized from both remote config
data plus pull options.
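A sketch of the caller side, assuming the option surfaces as an `a(ss)`
"http-headers" entry in the pull options:
```
#include <ostree.h>

/* Sketch: inject a dynamically computed OAuth token into every HTTP
 * request via a (presumed) "http-headers" a(ss) pull option. */
static GVariant *
build_pull_options (const char *bearer_token)
{
  g_autofree char *value = g_strdup_printf ("Bearer %s", bearer_token);
  g_autoptr(GVariantBuilder) headers =
      g_variant_builder_new (G_VARIANT_TYPE ("a(ss)"));
  g_variant_builder_add (headers, "(ss)", "Authorization", value);

  g_autoptr(GVariantBuilder) options =
      g_variant_builder_new (G_VARIANT_TYPE ("a{sv}"));
  g_variant_builder_add (options, "{s@v}", "http-headers",
                         g_variant_new_variant (g_variant_builder_end (headers)));
  return g_variant_ref_sink (g_variant_builder_end (options));
}
```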
Closes: #574
Approved by: giuseppe
Optionally read cookie jars for a remote to be used when downloading
data. This can be used for private repositories which require specific
cookies to be present, e.g. repositories hosted on Amazon cloudfront
using signed cookies.
Closes: #531
Approved by: cgwalters
We should just download the commit objects directly, as it's
obviously a lot more efficient than deltas.
I had to generate a summary file in more places in the tests,
since once created, it needs to be updated.
Closes: https://github.com/ostreedev/ostree/issues/528
Closes: #566
Approved by: jlebon
If this is true, don't initiate, abort, or commit a transaction;
instead, it is assumed that the caller initiated the transaction
and will eventually commit it.
This allows you to do multiple pulls or a combination of pulls and
commits in a single transaction.
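A sketch of the multi-pull case, assuming the new boolean pull option is
named "inherit-transaction" and that both option dicts carry it as true:
```
#include <ostree.h>

/* Sketch: one repo transaction spanning two pulls.  Each options dict
 * is assumed to contain inherit-transaction=true plus its refs.  A real
 * caller would ostree_repo_abort_transaction() on the error paths. */
static gboolean
pull_two_in_one_txn (OstreeRepo *repo, GVariant *options_a,
                     GVariant *options_b, GCancellable *cancellable,
                     GError **error)
{
  if (!ostree_repo_prepare_transaction (repo, NULL, cancellable, error))
    return FALSE;
  if (!ostree_repo_pull_with_options (repo, "origin", options_a,
                                      NULL, cancellable, error))
    return FALSE;
  if (!ostree_repo_pull_with_options (repo, "origin", options_b,
                                      NULL, cancellable, error))
    return FALSE;
  return ostree_repo_commit_transaction (repo, NULL, cancellable, error);
}
```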
Closes: #525
Approved by: cgwalters
In https://github.com/ostreedev/ostree/pull/408 we fixed a
bug where we would crash when trying to execute deltas into
an archive repo (which isn't presently supported).
But that was overly aggressive - we obviously *can* execute deltas
when mirroring into a bare repo. This should fix a regression with
the way flatpak uses mirroring to pull from a user repo into the
system.
Closes: #506
Approved by: alexlarsson
While converting the mirrorlist code from using GSList to GPtrArray, I
completely missed the fact that there is now a much cleaner way to do
this.
Closes: #484
Approved by: cgwalters
This commit adds mirrorlist support to the fetcher. Users can now
prefix url and/or contenturl with mirrorlist= to have the link
interpreted as a mirrorlist.
If an object is not found, the fetcher will automatically try the next
mirror in the order given in the list (assuming the order returned by
the server is significant).
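For example (hypothetical URL), the remote config would carry:
```
# Hypothetical remote whose fetches go through a mirrorlist.
[remote "example"]
url=mirrorlist=https://mirrors.example.com/ostree/mirrorlist.txt
```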
Closes: #469
Approved by: cgwalters
This made sense back when we used a main loop even when we needed to
fetch objects synchronously. Nowadays, we no longer actually update
progress before the FETCHING_OBJECTS phase, which is only for async
requests.
This allows us to get rid of fetch_uri_contents_membuf_sync() and to
generalize fetch_uri_contents_utf8_sync() so that it only requires a
fetcher. This will be needed later.
Closes: #469
Approved by: cgwalters
Allow users to pass a --contenturl during `remote add` and store it in
the remote config.
Fish out the contenturl setting from the remote config and use it when
downloading static deltas and objects (except for commit signatures).
The idea here is that items in the trust chain (summary & sigs) can be
fetched from a more secure e.g. TLS-pinned location, while objects
themselves are fetched from another location. Once mirrorlist support is
added, this use-case will become even more advantageous.
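A hypothetical split configuration (both URLs made up):
```
[remote "example"]
# summary and signatures come from the trusted endpoint
url=https://secure.example.com/repo
# objects and static deltas come from the CDN
contenturl=https://cdn.example.com/repo
```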
Closes: #469
Approved by: cgwalters
The OSTree function ostree_repo_pull_with_options() starts a
series of operations that makes heavy use of the PullData's
cancellable.
This isn't effective, however, since nowhere in the code is
the OtPullData.cancellable field set. This is visible,
for example, when trying to cancel a Flatpak pull: nothing
happens, because the cancellable is not properly passed
to the pull data.
Fix that by setting the cancellable field of the pull data. It
owns a reference for safety reasons, and unreferences it at the
end of the operation.
ostreedev/ostree#482
Closes: #483
Approved by: cgwalters
Fixes this warning:
src/libostree/ostree-repo-pull.c:2162: Warning: OSTree: ostree_repo_pull_with_options: unknown parameter 'remote_name_or_baseurl' in documentation comment, should be 'remote_name'
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Closes: #472
Approved by: jlebon
In CentOS, these happened to appear in a repo that is served
via rsync, and having them not be world-readable caused mirroring
tools to fail.
They aren't secret, so don't make them so.
Closes: #468
Approved by: giuseppe