Commit Graph

77 Commits

Author SHA1 Message Date
Colin Walters 7519457382 fetcher: Define an abstraction over SoupURI
This is preparatory work for a potential libcurl backend.

Closes: #616
Approved by: jlebon
2016-12-07 23:00:58 +00:00
Colin Walters c1c70bceb7 [TSAN] Rework assertions to always access refcount atomically
`-fsanitize=address` complained that the `refcount > 0` assertions
were reading without atomics.  We can fix this by reworking them
to read the previous value.

Closes: #582
Approved by: jlebon
2016-11-17 19:41:57 +00:00
Colin Walters 37c07d2f1c pull: Add support for `http-headers` option
Some deployments may want to gate access to content based on things
like OAuth.  In this model, the client system would normally compute a
token and pass it to the server via an API.

We could theoretically support this in the remote config too, but
that'd be a bit weird for OAuth as the information is dynamic.
Therefore this cleans up the code a little bit to more clearly handle
the case that the fetcher is initialized from both remote config
data plus pull options.

Closes: #574
Approved by: giuseppe
2016-11-16 10:04:22 +00:00
Sjoerd Simons be9a3a7a19 OsreeFetcher: Treat 403 as not found
Private Cloudfront instances return 403 for objects which don't exist
rather then a 404.

Change the fetcher to assume 403 is ok for download that are "optional"
rather then erroring out at that step (e.g. trying to download a static
delta if the remote repo doesn't have those)

Closes: #531
Approved by: cgwalters
2016-11-05 17:34:09 +00:00
Sjoerd Simons 96356aa192 pull: Add per-remote cookie jar
Optionally read cookie jars for a remote to be used when downloading
data. This can be used for private repositories which require specific
cookies to be present, e.g. repositories hosted on Amazon cloudfront
using signed cookies.

Closes: #531
Approved by: cgwalters
2016-11-05 17:34:09 +00:00
Colin Walters b77edf24a3 tree-wide: Remove unused variables detected by CLang
CLang finds these, whereas GCC treats having
`__attribute__((cleanup))` as a use.

This obsoletes https://github.com/ostreedev/ostree/pull/411

Closes: #548
Approved by: jlebon
2016-10-27 17:02:01 +00:00
Jonathan Lebon a660e5650d OstreeFetcher: provide proxy credentials if needed
There seems to be an issue in libsoup which causes basic auth
credentials to not be passed to the proxy during requests. We thus have
to handle PROXY_UNAUTHORIZED responses and provide the auth ourselves.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1370558
Related: https://bugzilla.gnome.org/show_bug.cgi?id=772932

Closes: #529
Approved by: cgwalters
2016-10-14 16:06:08 +00:00
Colin Walters 82b756783b fetcher: Fix another finalization deadlock
If the current repo is already up to date (we have no content to
fetch), it's possible for the fetcher to not request any URIs.  So
create and then finalize it quickly.

Finalization involves calling `g_main_loop_quit()` +
`g_thread_wait()`.  However, if `g_main_loop_quit()` is run *before*
`g_main_loop_run()`, we'll deadlock because GMainLoop assumes in
`_run()` to start things.

This is a common trap - ideally, GMainLoop would record if `_quit()`
was called before `_run()` or something, but doing that now would
likely break people who are expecting quit() -> run() to restart.

In general, we've moved in various GLib-consuming apps to an
explicit "main context iteration with termination condition" model;
see `pull_termination_condition()` in the pull code.

This fixes this race condition.

I verified that an assertion in `_finalize` that more than
zero URIs were requested was hit in multiple test cases, and this patch
has survived a while of make check loops.

Closes: https://github.com/ostreedev/ostree/issues/496

Closes: #499
Approved by: jlebon
2016-09-08 20:09:07 +00:00
Jonathan Lebon 157d878ce1 pull: add mirrorlist support
This commit adds mirrorlist support to the fetcher. Users can now
prepend url or/and contenturl by mirrorlist= to interpret the link as a
mirrorlist.

If an object is not found, the fetcher will automatically try the next
mirror in the order given in the list (assuming the order returned by
the server is significant).

Closes: #469
Approved by: cgwalters
2016-08-31 16:52:12 +00:00
Colin Walters ba21023d6c fetcher: Explicitly join thread if it's not self
This fixes a valgrind leak; see [this StackOverflow thread](http://stackoverflow.com/questions/17642433/why-pthread-causes-a-memory-leak).

Closes: #410
Approved by: giuseppe
2016-07-28 10:10:17 +00:00
Colin Walters 972ed3e54e fetcher: Remove unused GTask structure member
Spotted by mbarnes.

Closes: #383
Approved by: mbarnes
2016-07-08 18:38:11 +00:00
Colin Walters 5d21650ea5 fetcher: Clear all data for session in session thread
Conceptually the session thread owns the session, so let's clear out
everything predictably there, rather than sometimes having it happen
on the main thread.

Also, this moves up clearing the pending/outstanding queues *before*
we unreference the session, since conceptually they need to reference
it as well.

Based on a patch from: Matthew Barnes <mbarnes@redhat.com>

Closes: #383
Approved by: mbarnes
2016-07-08 18:38:11 +00:00
Colin Walters b4c15209e8 fetcher: Hold a ref to main context for lifetime of thread
I don't think this fixes the bug I was seeing, but it makes me more
comfortable to know we have a strong ref to the main context across
the thread lifetime, and we only unset the default right before
we go away.

If something in `thread_closure_unref()` used
`g_main_context_get_thread_default()` for example it'd be wrong
before.

Closes: #383
Approved by: mbarnes
2016-07-08 18:38:11 +00:00
Colin Walters 535033a4f0 pull: Ensure we always process queue only from main thread
I was easily reproducing a hang on pulls with thousands of requests on
current git master.  The initial symptom seemed to be that there are
multiple code paths where we don't invoke
`session_thread_process_pending_queue()`.  We really need to do
that any time we remove something from the outstanding queue,
to ensure it gets filled again.

A further issue is that we were tying the lifecycle of the pending
object to the `GTask`, but the task could be unref'd from the main
thread (via a `GSource` on the main thread), and that introduced
threadsafety issues, because the hash table and other data suddenly
could be concurrently modified.

Both of these need to be fixed together.  First, we introduce
`Arc<Pending>`, and ensure that both the main and worker threads hold
references.

Second, we ensure that we re-process the queue *immediately* whenever
a task is done, inside the worker thread, rather than doing it
incidentally via an unref.  This architecture is quite similar to what
the outside pull code is doing.

Closes: #350
Approved by: jlebon
2016-06-17 18:19:23 +00:00
Alexander Larsson d368624798 Build on older versions of glib
Various places need to include libglnx.h for the autoptr backport
fallbacks to be there before ostree-autocleanups.h is included.

This fixes the build on centos7·

Closes: #309
Approved by: giuseppe
2016-05-25 14:01:39 +00:00
Gatis Paeglis 441c03ba9e Fix build when have_libsoup_client_certs=no
This fixes a build failure with older libsoup versions
that do not have the client certificates feature.

Closes: #294
Approved by: cgwalters
2016-05-13 14:33:08 +00:00
Colin Walters 6724519080 libglnx porting: Migrate to glnx_stream_fstat()
I ended up deciding to move this one into libglnx, seems like
something other libglnx-using software might want to do, even though
xdg-app doesn't right now.

Closes: #282
Approved by: jlebon
2016-05-06 14:29:59 +00:00
Colin Walters a56ba6081a repo: Clean up staging directory for previous boot IDs
We had a policy of cleaning up all files in `$repo/tmp` older
than one day, but we should really clean up previous bootid staging
directories too, as they can potentially take up a lot of disk space.

https://bugzilla.gnome.org/show_bug.cgi?id=760531

Closes: #170
Approved by: jlebon
2016-05-02 18:44:44 +00:00
Alexander Larsson 6a57d0a2f0 fetcher: Initialize output_stream_set_lock mutex
ostree pull-local crashed for me in thread_closure_unref () doing:
    g_mutex_clear (&thread_closure->output_stream_set_lock);

Seems like we never initialize this mutex.

Closes: #254
Approved by: cgwalters
2016-04-13 14:11:46 +00:00
Colin Walters d456fe5adb libglnx porting: Use glnx_set_error_from_errno
⚠️ There is a notable spiked pit trap here around
`posix_fallocate()` and `errno`.  This has bit other projects,
see e.g.
7bb87460e6

Otherwise the port was straightforward.
2016-03-23 10:26:01 -04:00
Matthew Barnes 5adafd7674 fetcher: Fix hung GTlsInteraction
The GTlsInteraction instance must be created in the session thread
so it uses the correct GMainContext.
2016-02-09 00:58:17 +00:00
Matthew Barnes 1f1bfbf711 fetcher: Lazily create tmp directory
The tmp directory is lazily created for each fetcher instance, since
it may require superuser permissions and some instances only need
_ostree_fetcher_request_uri_to_membuf() which keeps everything in
memory buffers.
2015-12-19 09:21:22 -05:00
Matthew Barnes f0b143ca8a pull: Push a temporary main context for sync requests
Given the previous commit, which isolates SoupSession in a separate
thread, it should be safe to start pushing a temporary main context
for synchronous requests again.

This partially reverts 84fe2ff, which partially reverted 9f3d586.

Related to https://bugzilla.gnome.org/show_bug.cgi?id=753336
2015-12-14 11:11:34 -05:00
Matthew Barnes 54066420cf fetcher: Move the SoupSession to a separate thread
Move the SoupSession to a separate thread with its own isolated main
context and main loop.  All interaction with the SoupSession occurs
by way of idle sources attached to the session's main context, which
execute on the session's thread.

This should solve the problem of running an asynchronous fetch request
synchronously by pushing a new thread-default main context and iterating
a main loop until the request completes.  Prior to this, the new thread-
default main context would interfere with the SoupSession's own async
processing.
2015-12-14 11:11:29 -05:00
Matthew Barnes af30fc764a fetcher: Add "config-flags" construct-only property
A lot of effort here just to avoid touching SoupSession directly in
ostree_fetcher_new().  The reason will become apparent in subsequent
commits.

Note this introduces generated enum/flags GTypes using glib-mkenums.
I could have just made the property type as plain integer, but doing
properties right will henceforth be easier now that the automake-fu
is established.
2015-12-14 09:41:29 -05:00
Alexander Larsson 96eed95720 repo: Allocate a tmpdir for each OstreeFetcher to isolate concurrent downloads
This way two pulls will not use the same tmpdir and accidentally
overwrite each other. However, consecutive OstreeFetchers will reuse
the tmpdirs, so that we can properly resume downloading large objects.

https://bugzilla.gnome.org/show_bug.cgi?id=757611
2015-12-14 08:39:11 +01:00
Matthew Barnes 581b7d6183 fetcher: Remove "total_requests" counter
Incremented, but not used for anything.
2015-12-01 12:34:34 -05:00
Matthew Barnes 97efe12ac6 fetcher: Remove "sending_messages" hash table
Vestige of ostree_fetcher_query_state_text(), removed last year.
2015-12-01 12:34:28 -05:00
Matthew Barnes e21188a245 fetcher: Track outstanding requests with a table
Track outstanding HTTP requests in a table for easier debugging.

Also fixes a bug discussed in https://bugzilla.gnome.org/755224
where the outstanding request counter was not decremented in the
event of an error, which could result in the fetcher hitting its
max request limit and locking up.

The bug is fixed by removing the request struct from the table in
pending_uri_free(), which is always called regardless of error,
so the outstanding request count is always accurate.
2015-09-24 10:01:01 -04:00
Matthew Barnes 771075d319 fetcher: Rework reference counting
Have OstreeFetcherPendingURI be the GTask's task_data and pass the GTask
around in queues and callback closures.  The reference counting before
was a little confusing and this helps clarify it, at least to me.

OstreeFetcherPendingURI no longer needs its own reference count.
2015-09-23 19:52:42 -04:00
Matthew Barnes 330a99c40b fetcher: Convert from GSimpleAsyncResult to GTask
Obsessive compulsive cleanup.
2015-09-23 19:52:10 -04:00
Matthew Barnes df4865e395 fetcher: Remove message_to_request table
Does not appear to be needed, no lookups on the table.
2015-09-23 13:50:50 -04:00
Colin Walters 84fe2ffb2b pull: Go back to using one main context
xdg-app was hanging for me with v2015.8, but worked with v2015.7.
I narrowed things down to the GMainLoop/context commit, in which
we started pushing a temporary main context for synchronous
requests internally.

That's never really going to work with libsoup - there needs
to be a single main context which works on the socket.  Furthermore,
clients couldn't get progress messages that way.

For *other* internal uses where we added APIs that talk to the remote
repo, we cleanly push a temporary main context.

(Note that I kind of snuck in a change here around the GError handling
 in pulls that isn't strictly related but came up in testing)
2015-09-01 14:39:24 -04:00
Colin Walters 0110183675 fetcher: Use 0666 (-umask) for temporary files
There's no reason to keep them hidden.  I have a hard policy that
OSTree should *not* be used to carry secrets.  Things like host ssh
private keys should be set up out of band by an OS-external
configuration mechanism such as kickstart, cloud-init, etc.

We also assume that hiding binaries is not very useful as most
attackers would be able to find them on the Internet or (for
subscribed content) acting as a customer.

This fixes a bug with mirroring because we changed to take the
unmodified upstream objects rather than uncompress <-> recompress.

https://bugzilla.gnome.org/show_bug.cgi?id=748959
2015-08-27 11:36:48 -04:00
Colin Walters 9f3d586993 pull: Stop using GMainLoop
First of all, what we were doing with having GMainLoop in the internal
APIs is wrong.  Synchronous APIs should always create their own main
context and not iterate the caller's.  Doing the latter creates
potential for evil reentrancy issues.  Sync API should block, async
API is for not blocking.

Now that's out of the way, fix the pull code to do the clean

```
while (termination_condition (state))
  g_main_context_iteration (mainctx, TRUE);
```

model for looping.  This is a lot easier to understand and ultimately
more reliable than having other code call `g_main_loop_quit()`, as the
loop condition is in exactly one place.

We can also remove the idle source which only fired once.

Note we have to add a hack here to discard the synchronous session and
create a new one which we only use async.

https://bugzilla.gnome.org/show_bug.cgi?id=753336
2015-08-13 22:02:00 -04:00
Colin Walters 31d16c9cce pull: Plug a memory leak 2015-06-29 21:57:44 -04:00
Colin Walters 889b86e96d pull: Avoid leaking signal handlers across fetch requests
libsoup will cache sessions, so it might be the case that we get a
reused session when pulling from the same repo multiple times in one
process.

In this case we were leaking signal connections, which caused
callbacks into freed memory with bad consequences.

Fix it by tying the signal connection to the object lifetime.
2015-06-29 21:56:03 -04:00
Matthew Barnes 4ef0280941 Remove unnecessary #include "libgsystem.h" 2015-05-06 22:07:11 -04:00
Matthew Barnes e6556dd223 Use g_autoptr(GBytes) instead of gs_unref_bytes 2015-05-06 22:07:10 -04:00
Matthew Barnes 6a5f7b1288 Use glnx_unref_object instead of gs_unref_object
For non-GIO object types, at least until autocleanup support for GObject
based types becomes more widespread.
2015-05-06 22:07:04 -04:00
Matthew Barnes 4ee1acd981 Use g_autoptr() for GIO object types
GLib 2.44 supplies all the necessary autocleanup macros for GIO types,
and libglnx backports the relevant macros for ostree.
2015-05-06 21:51:19 -04:00
Matthew Barnes 7a62d64968 Use g_autofree instead of gs_free 2015-05-06 21:50:17 -04:00
Colin Walters 115e05746b pull: Handle remote web server not honoring range requests
It's valid for the remote server to say 200 OK and give us the entire
file instead of a 206 Partial Content, and in that case we should blow
away the previous cached data, rather than blindly appending to it and
thus creating multiple copies of the data inside the file.

This problem primarily occurs when we do have the complete file, and
we're interrupted, then try again, where the new process didn't record
the download was already complete.  We do a range request for bytes
past the end, and some web servers (e.g. Akamai) will return 200 OK
with the whole content again, rather than a 416 Requested Range Not
Satisfiable.

Thus we could also fix this by saner caching strategy - since we know
the file is complete, rename it again to $checksum.done or something
before it's processed.  (Or really, rework how we do caching more
intelligently in general).

This fixes the issue that interrupted pulls failed with such
webservers, although repeated attempts would eventually succeed
because we'd unlink files that failed to pull.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1207292
2015-04-06 14:33:16 -04:00
Colin Walters 9020fe2547 Change OstreeFetcher to be dirfd-relative
This is a noticeable cleanup, and fixes another big user of GFile* in
performance/security sensitive codepaths.

I'm specifically making this change because the static deltas code was
leaking temporary files, and cleaning that up nicely would be best if
we were fd relative.
2015-01-14 22:12:08 -05:00
Matthew Barnes 5c26e392ec fetcher: Add a priority value to async requests 2015-01-11 18:48:21 -05:00
Giuseppe Scrivano f699153f67 ostree-fetcher: move more logic into ostree_fetcher_request_uri_internal
Make _ostree_fetcher_request_uri_with_partial_async and
ostree_fetcher_stream_uri_async simple wrapper around the same
function, all the requests are created in the same place now.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2014-11-12 21:20:28 -05:00
Giuseppe Scrivano a5491f98cb ostree-fetcher: make _ostree_fetcher_stream_uri_sync private
Rename _ostree_fetcher_contents_membuf_sync to
ostree_fetcher_request_uri_to_membuf and drop unused argument
user_data.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2014-11-12 21:20:28 -05:00
Giuseppe Scrivano c2bc99bc16 ostree-fetcher: Remove _ostree_fetcher_request_uri_to_stream function
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2014-11-12 21:20:28 -05:00
Giuseppe Scrivano d48aca5645 ostree-fetcher: add max_size argument to change _ostree_metalink_request_sync
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2014-11-12 21:20:28 -05:00
Giuseppe Scrivano a4a4921d3f ostree-fetcher: remove two unused functions
_ostree_fetcher_query_state_text() and_ostree_fetcher_get_n_requests()
have no callers, so remove them.

If they will be needed, they can be easily copied back from the git
history.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2014-11-12 21:20:28 -05:00