ostree/src/libostree
Colin Walters 27c3e7884e core: Make write_object() a bit more efficient
Do as many operations as we can using the original file descriptor
while we have it open, rather than writing, closing, then reopening.

This necessitated very explicitly special casing symbolic links,
mainly due to the lack of lsetxattrat().

https://bugzilla.gnome.org/show_bug.cgi?id=707733
2013-09-08 14:40:52 -04:00
README.md core: Associate branches with remotes, move trigger runs into checkout 2012-04-03 23:46:34 -04:00
ostree-1.pc.in Install a shared library 2013-07-26 19:25:07 -04:00
ostree-chain-input-stream.c libostree: Fix many gtk-doc warnings 2013-08-17 08:41:31 -04:00
ostree-chain-input-stream.h core: Fix all introspection warnings 2013-07-27 10:13:30 -04:00
ostree-checksum-input-stream.c libostree: Fix many gtk-doc warnings 2013-08-17 08:41:31 -04:00
ostree-checksum-input-stream.h Switch to #pragma once for headers 2013-07-09 18:53:22 -04:00
ostree-core-private.h core: Use linkat() for hardlink checkouts too 2013-09-08 14:40:09 -04:00
ostree-core.c core: Use linkat() for hardlink checkouts too 2013-09-08 14:40:09 -04:00
ostree-core.h core: Add malloc-free API for objects, use *at functions for storing 2013-09-07 04:18:41 -04:00
ostree-diff.c repo-file: s/content_checksum/contents_checksum/ 2013-09-08 11:50:51 -04:00
ostree-diff.h core: Fix all introspection warnings 2013-07-27 10:13:30 -04:00
ostree-fetcher.c libostree: Change synchronous fetching API to return a stream 2013-09-02 14:48:21 -04:00
ostree-fetcher.h libostree: Change synchronous fetching API to return a stream 2013-09-02 14:48:21 -04:00
ostree-libarchive-input-stream.c Install a shared library 2013-07-26 19:25:07 -04:00
ostree-libarchive-input-stream.h Switch to #pragma once for headers 2013-07-09 18:53:22 -04:00
ostree-mutable-tree.c libostree: Fix many gtk-doc warnings 2013-08-17 08:41:31 -04:00
ostree-mutable-tree.h Switch to #pragma once for headers 2013-07-09 18:53:22 -04:00
ostree-repo-checkout.c core: Use linkat() for hardlink checkouts too 2013-09-08 14:40:09 -04:00
ostree-repo-commit.c core: Make write_object() a bit more efficient 2013-09-08 14:40:52 -04:00
ostree-repo-file-enumerator.c Install a shared library 2013-07-26 19:25:07 -04:00
ostree-repo-file-enumerator.h Switch to #pragma once for headers 2013-07-09 18:53:22 -04:00
ostree-repo-file.c builtin-commit: Don't parse the parent's GVariant by hand 2013-09-08 11:50:51 -04:00
ostree-repo-file.h builtin-commit: Don't parse the parent's GVariant by hand 2013-09-08 11:50:51 -04:00
ostree-repo-libarchive.c repo: Rename "stage" to "write" in the API 2013-09-06 20:31:12 -04:00
ostree-repo-private.h core: Use linkat() for hardlink checkouts too 2013-09-08 14:40:09 -04:00
ostree-repo-prune.c Fix warnings about unused variables 2013-08-30 14:23:45 -04:00
ostree-repo-pull.c Move ref writing to be transaction-based 2013-09-06 20:31:12 -04:00
ostree-repo-refs.c Move ref writing to be transaction-based 2013-09-06 20:31:12 -04:00
ostree-repo-traverse.c Fix warnings about unused variables 2013-08-30 14:23:45 -04:00
ostree-repo.c core: Use linkat() for hardlink checkouts too 2013-09-08 14:40:09 -04:00
ostree-repo.h repo: Drop the branch parameter from ostree_repo_commit 2013-09-08 11:50:51 -04:00
ostree-types.h core: Drop duplicated type declarations 2013-08-17 08:23:28 -04:00
ostree.h Install a shared library 2013-07-26 19:25:07 -04:00

README.md

Repository design

At the heart of OSTree is the repository. Like git, it is built around content-addressed storage. However, OSTree is designed to store operating system binaries, not source code, and this has several consequences. The key difference from git is that the OSTree definition of "content" includes key Unix metadata such as owner uid/gid, as well as all extended attributes.
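To make the idea of metadata-inclusive content addressing concrete, here is a minimal sketch. The function name `object_checksum` and the byte layout it hashes are assumptions made for illustration; this is not OSTree's actual serialization format. It only shows that when ownership and extended attributes feed into the checksum, two files with identical bytes but different metadata get different object identities.

```python
import hashlib
import struct


def object_checksum(uid, gid, mode, xattrs, content):
    """SHA256 over ownership, mode, xattrs and file content.

    Hypothetical layout for illustration only, not OSTree's real format.
    `xattrs` maps attribute names (bytes) to values (bytes).
    """
    h = hashlib.sha256()
    h.update(struct.pack(">III", uid, gid, mode))
    # Hash the attribute count, then each attribute in sorted order so the
    # result does not depend on dictionary ordering.
    h.update(struct.pack(">I", len(xattrs)))
    for name, value in sorted(xattrs.items()):
        # Length prefixes keep field boundaries unambiguous.
        h.update(struct.pack(">I", len(name)) + name)
        h.update(struct.pack(">I", len(value)) + value)
    h.update(struct.pack(">Q", len(content)) + content)
    return h.hexdigest()


data = b"#!/bin/sh\necho hello\n"
root_owned = object_checksum(0, 0, 0o100755, {}, data)
user_owned = object_checksum(1000, 1000, 0o100755, {}, data)
labeled = object_checksum(
    0, 0, 0o100755, {b"security.selinux": b"system_u:object_r:bin_t:s0"}, data
)
```

In this sketch `root_owned`, `user_owned` and `labeled` are three distinct checksums even though the file bytes are the same, whereas a checksum over the bytes alone would treat them as one object.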

Essentially, OSTree is designed so that if two files have the same OSTree checksum, it's safe to replace them with a hard link. Because checked-out files can be hard links into the repository, an OSTree repository imposes negligible disk overhead on top of a checkout. In contrast, a git repository stores separate zlib-compressed copies of the data.
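The hard link property can be sketched as follows. The helper `checkout_file` and the file names are hypothetical, and the example assumes a filesystem that supports hard links; it is not how the OSTree checkout code is written, only a demonstration that a hard-linked checkout shares one inode with the repository object instead of holding a second copy.

```python
import os
import tempfile


def checkout_file(repo_object_path, dest_path):
    """Check out a file by hard-linking to the repository object (no copy)."""
    os.link(repo_object_path, dest_path)


with tempfile.TemporaryDirectory() as tmp:
    # A stand-in for one stored object in the repository.
    obj = os.path.join(tmp, "object")
    with open(obj, "wb") as f:
        f.write(b"payload")

    # Two checkouts of the same object.
    first = os.path.join(tmp, "checkout-a")
    second = os.path.join(tmp, "checkout-b")
    checkout_file(obj, first)
    checkout_file(obj, second)

    st = os.stat(obj)
    # All three names refer to the same inode, so the data is stored once.
    same_inode = os.stat(first).st_ino == st.st_ino == os.stat(second).st_ino
    link_count = st.st_nlink
```

Since every name points at the same inode, a write through one checkout would be visible through the others. This is why it matters that the checksum covers everything that distinguishes two files.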

Key differences versus git

  • As mentioned above, extended attributes and owner uid/gid are versioned
  • Optimized for Unix hardlinks between repository and checkout
  • SHA256 instead of SHA1
  • Support for empty directories

Binary files

This is still at the planning stage, but I intend to heavily optimize OSTree for versioning ELF-based operating systems. In industry jargon, this would be "content-aware storage".

Trimming history

OSTree will also be optimized to trim intermediate history; in theory one can regenerate binaries from corresponding (git) source code, so we don't need to keep all possible builds over time.

MILESTONE 1

  • Basic pack files (like git)

MILESTONE 2

  • Store checksums as ay (the GVariant type string for a byte array)
  • Drop version/metadata from tree/dirmeta objects
  • Add index size to superindex, pack size to index
    • So pull can calculate how much we need to download
  • Split pack files into metadata/data
  • pull: Extract all we can from each packfile one at a time, then delete it
  • Restructure repository so that links can be generated as a cache; i.e. objects/raw becomes a cache, and pack files are now the canonical store
  • For files, checksum combination of metadata variant + raw data
    • i.e. there is only OSTREE_OBJECT_TYPE_FILE (again)

MILESTONE 3

  • Drop archive/raw distinction - archive repositories always generate packfiles per commit
  • Include git packv4 ideas:
    • metadata packfiles have string dictionary (tree filenames and checksums)
    • data packfiles match up similar objects
  • Rolling checksums for partitioning large files? Kernel debuginfo
  • Improved pack clustering
    • file fingerprinting?
  • ELF-x86 aware deltas
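
The rolling-checksum item in the list above is only a question at this point, so the following is a generic sketch of the technique rather than anything OSTree implements. It cuts a byte string into chunks wherever a rolling sum over a small window matches a bit mask, so chunk boundaries depend on local content rather than on absolute offsets. The names `chunk_boundaries` and `split_chunks`, and the `window`, `mask` and `min_size` values, are all assumptions chosen for illustration.

```python
import random


def chunk_boundaries(data, window=16, mask=0x3F, min_size=32):
    """Return chunk end offsets chosen by a rolling sum over `window` bytes."""
    boundaries = []
    rolling = 0
    start = 0
    for i, byte in enumerate(data):
        # Slide the window: add the new byte, drop the one that fell out.
        rolling += byte
        if i >= window:
            rolling -= data[i - window]
        # Cut when the low bits of the sum match the mask and the current
        # chunk has reached the minimum size.
        if i + 1 - start >= min_size and (rolling & mask) == mask:
            boundaries.append(i + 1)
            start = i + 1
    if start < len(data):
        # The remaining tail becomes the final chunk.
        boundaries.append(len(data))
    return boundaries


def split_chunks(data, **kwargs):
    """Split `data` at the offsets returned by chunk_boundaries()."""
    pieces, prev = [], 0
    for end in chunk_boundaries(data, **kwargs):
        pieces.append(data[prev:end])
        prev = end
    return pieces


rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(4096))
ends = chunk_boundaries(data)
pieces = split_chunks(data)
```

A plain byte sum is a weak rolling function; a real design would likely use a stronger one, but the sliding-window structure is the same.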

  • git: http://git-scm.com/
  • Venti: http://plan9.bell-labs.com/magic/man2html/6/venti
  • Elephant FS: http://www.hpl.hp.com/personal/Alistair_Veitch/papers/elephant-hotos/index.html

Compression

  • xdelta: http://xdelta.org/
  • Bsdiff: http://www.daemonology.net/bsdiff/
  • xz: http://tukaani.org/xz/