By default, when doing a commit, scan all of our loose objects and
build up a (device,inode) -> checksum hash. Then when we're doing a
commit, if we see a file with that (device,inode) pair, we can avoid
checksumming it.
This will allow us to use hard links again for user-mode checkouts,
rather than the hackish link cache. It was pretty silly anyways to
have file objects be stored with just a small metadata header
prepended, but uncompressed.
Either they should be hardlinkable, or compressed (in pack files).
Rather than passing xattr/file_info for all objects, change the API to
assume we're passing the defined object stream for each type. Namely,
for OSTREE_OBJECT_TYPE_FILE, we're now giving the "archive file" data.
This significantly cleans up the code for committing to archive mode
repositories, at the cost of having to (at present) create an
intermediate temporary file when committing to raw repositories.
This will be useful for ostbuild; a user can create their own archive
mode repository which transparently inherits objects from the
root-owned one in /ostree.
This is a convenient way to have a lookaside directory of hard links,
which can greatly speed up checkouts. In the future we probably want
to push this down into the repository.
Having the archived vs not distinction in the object system wasn't
useful in light of pack files. In fact, we should probably move
towards generating a pack file per commit by default.
Don't expose GChecksum in APIs. Add a new stream class which allows
us to pass an input stream somewhere, but gather a checksum as it's
read.
Move some bits of the internals towards binary csums.
Previously we had the "staged" state to ensure we didn't add a commit
object without the associated dirtree, etc. However it's
easier/better to just ensure in the pull command that we have all
referenced objects.
Also change pull to download metadata first. This will allow adding
a progress bar later.
Expose the lower-level functionality in libostree, change checkout
builtin to be a higher level driver. This will allow us to more
easily improve the "checkout" builtin..
We want to support both "bare" lookups where "foo" can be local, or in
any remote, as well as prefixed ones for a specific remote.
This fixes ostree-pull noticing that nothing has changed.
Before we were creating randomly-named temporary files in repo/tmp
when downloading via pull, but that means if the download process is
interrupted, we have to redownload everything again.
Let's still keep the concept of a "transaction" where files are
stored in the repository as atomically as possible (i.e. we
do a bunch of rename() calls), but now we also have an explicit
"tmp/pending/objects" directory that contains named objects.
This allows us to then skip redownloading things that are pending.
The builder wants the ability to mark a given file as e.g. setuid. To
implement this, the repo now has a callback-based API when importing a
directory to modify or remove items.
The commit tool accepts a "statoverride" file as input which looks like:
+mode /path/to/file
If multiple files have the same hash, we need to ensure we're not
overwriting other tempfiles in the same transaction. Instead
just delete them, since we know they're in the repo.
The tar files we're making of artifacts don't include parent
directories. Now we could change the builder to make them, but we can
also just autocreate them on import. Mode 0755 with no xattrs seems
OK here.
It's pretty trivial to map a previously existing commit tree into a
mutable tree too. While we're here change the command line arguments
for commit so that we can now properly overlay any combination of
directory, commit, or tarfile.
Rather than offering high level "commit directory", instead perform
operations on a mtree. Commits are treated more like regular objects.
Change the commit builtin to drive this all at a lower level.
The tar import code forced the resuscitation of a hackish "FileTree"
data type for representing an in-memory tree. Split this out
into an OstreeMutableTree class for future use by any other in-memory
tree construction.
ostbuild will generate two artifacts: foo-runtime.tar.gz and
foo-devel.tar.gz in the general case. When committing to the devel
tree, it'd be lame (i.e. slower and not atomic) to have to commit
twice.
This will allow us to have hardlink checkouts of archives. A key use
case here is an archive repo of an OS (with root-owned files etc.)
where we want to do builds in a user tree.
A positive side effect of doing things this way is that now the SHA256
checksums for a given file should be identical regardless of whether
it's stored in an archive or bare repository.
It's too confusing that we call the mode "archive" but the actual
files ".packfile". Also, git already has a "packfile" that serves a
totally different purpose.
Note this change makes it so we no longer call link() from an import
filesystem tree to the repository. This is a Good Thing really; it
makes local FS commits slower, but also less prone to corruption.
We really want the ability to take a .tar.gz and directly import
it into a repository, without creating a temporary filesystem tree.
First, doing it this way is significantly faster. Also, this allows
us to handle importing tar files with e.g. uid 0 files into packed
repositories as non-root, which is very useful for tests and builds.
We never actually dropped into the bits to write metadata as packfiles,
because such a thing doesn't exist.
Also add a GInputStream-based API for writing packfiles.
The default is always ignore_exists. Also port the internals here
to use more GIO code, and stop using *at syscall variants since they're
only useful if used 100%.
This commit originally was to port ostree_stat_and_checksum_file() to
GFile*, but I noticed that the checksum code was reading data in host
endianness. Fix that while we're here.
This invalidates all existing repositories.
This necessitated a large set of changes.
We now support an "archive" mode for repositories. In this mode,
files are stored "packed" rather than hard linked. This allows one to
e.g. store an OSTree repository with root-owned files as non-root. It
is also used as the basis for serving repositories via HTTP.
While doing this I realized that GVariant is endianness-dependent; I
decided to just store all data in big endian.