
AkihiroSuda / filegrain

Licence: Apache-2.0
transport-agnostic, fine-grained content-addressable container image layout

Programming Languages

go
31211 projects - #10 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to filegrain

fullmetalupdate
FullMetalUpdate Python client application.
Stars: ✭ 19 (-17.39%)
Mutual labels:  oci, oci-image
oci-build-task
a Concourse task for building OCI images
Stars: ✭ 57 (+147.83%)
Mutual labels:  oci, oci-image
ocibuilder
A tool to build OCI compliant images
Stars: ✭ 63 (+173.91%)
Mutual labels:  oci, oci-image
Terrier
Terrier is an image and container analysis tool that can be used to scan images and containers to identify and verify the presence of specific files according to their hashes.
Stars: ✭ 203 (+782.61%)
Mutual labels:  container, oci
inclavare-containers
A novel container runtime, aka confidential container, for cloud-native confidential computing and enclave runtime ecosystem.
Stars: ✭ 510 (+2117.39%)
Mutual labels:  container, oci
vilicus
Vilicus is an open source tool that orchestrates security scans of container images (docker/oci) and centralizes all results into a database for further analysis and metrics.
Stars: ✭ 82 (+256.52%)
Mutual labels:  oci, oci-image
ctnr
rootless runc-based container engine - deprecated in favour of podman
Stars: ✭ 30 (+30.43%)
Mutual labels:  oci, oci-image
imgcrypt
OCI Image Encryption Package
Stars: ✭ 214 (+830.43%)
Mutual labels:  oci, oci-image
undock
Extract contents of a container image in a local folder
Stars: ✭ 119 (+417.39%)
Mutual labels:  container, oci
Clair
Vulnerability Static Analysis for Containers
Stars: ✭ 8,356 (+36230.43%)
Mutual labels:  oci, oci-image
Buildkit
concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
Stars: ✭ 4,537 (+19626.09%)
Mutual labels:  oci, oci-image
Runtime
OCI (Open Containers Initiative) compatible runtime using Virtual Machines
Stars: ✭ 588 (+2456.52%)
Mutual labels:  container, oci
Cc Oci Runtime
OCI (Open Containers Initiative) compatible runtime for Intel® Architecture
Stars: ✭ 418 (+1717.39%)
Mutual labels:  container, oci
Runtime
Kata Containers version 1.x runtime (for version 2.x see https://github.com/kata-containers/kata-containers).
Stars: ✭ 2,103 (+9043.48%)
Mutual labels:  container, oci
K8s Diagrams
A collection of kubernetes-related diagrams
Stars: ✭ 227 (+886.96%)
Mutual labels:  container
Aws Containers Task Definitions
Task Definitions for running common applications on Amazon ECS
Stars: ✭ 210 (+813.04%)
Mutual labels:  container
Archon
Cluster operation the Kubernetes way
Stars: ✭ 197 (+756.52%)
Mutual labels:  container
Ko
Build and deploy Go applications on Kubernetes
Stars: ✭ 3,755 (+16226.09%)
Mutual labels:  container
Harbor
An open source trusted cloud native registry project that stores, signs, and scans content.
Stars: ✭ 16,320 (+70856.52%)
Mutual labels:  container
Kruise
Automate application management on Kubernetes (project under CNCF)
Stars: ✭ 2,819 (+12156.52%)
Mutual labels:  container

⚠️ FILEgrain is abandoned in favor of stargz/CRFS. See containerd#3731 and https://github.com/ktock/remote-snapshotter


FILEgrain: transport-agnostic, fine-grained content-addressable container image layout


FILEgrain is a (long-term) proposal to extend the OCI Image Format to support content-addressable storage (CAS) at the granularity of a file, in a transport-agnostic way.

Your feedback is welcome.

Talks

Pros and Cons

Pros:

  • Higher concurrency when pulling an image, in a transport-agnostic way
  • Files can be lazily pulled, i.e. a file can appear in the filesystem before its content has actually been pulled.
  • Finer deduplication granularity

Cons:

  • The blobs directory in the image can contain a large number of files, so readdir() on that directory is likely to become slow. This could be mitigated by using an external blob store, though.

Format

FILEgrain defines an image manifest which is almost identical to the OCI image manifest, but differs in the following points:

  • A FILEgrain image manifest supports a continuity manifest (application/vnd.continuity.manifest.v0+pb and ...+json) as an Image Layer Filesystem Changeset. Regular files in an image are stored as OCI blobs and accessed via the digest values recorded in the continuity manifest. FILEgrain still supports tar layers (application/vnd.oci.image.layer.v1.tar and its families), and it is even possible to put a continuity layer on top of tar layers, and vice versa. Tar layers might be useful for forcing a large number of small files to be downloaded in a batch (as a single tar file).
  • A FILEgrain image manifest SHOULD have the annotation filegrain.version=20170501, in both the manifest JSON itself and the image index JSON. This annotation WILL change in future versions.

It is possible and recommended to put both a FILEgrain manifest file and an OCI manifest file in a single image.
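
To make the file-level addressing concrete, a reader that has already parsed a continuity manifest could map a regular-file entry to its content blob roughly as follows. This is a minimal sketch: the fileEntry type and the hard-coded paths are hypothetical stand-ins, while the real format uses the continuity protobuf/JSON manifest together with the standard OCI blobs/<algorithm>/<hex> layout.

package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

// fileEntry is a hypothetical, simplified view of a regular-file record in a
// continuity manifest: the in-image path plus the digest of the file content.
type fileEntry struct {
    Path   string
    Digest string // e.g. "sha256:4355a4..."
}

// blobPath maps a content digest to the corresponding file in the OCI image
// layout's blobs directory (blobs/<algorithm>/<hex>).
func blobPath(layoutRoot, dgst string) (string, error) {
    parts := strings.SplitN(dgst, ":", 2)
    if len(parts) != 2 {
        return "", fmt.Errorf("malformed digest %q", dgst)
    }
    return filepath.Join(layoutRoot, "blobs", parts[0], parts[1]), nil
}

func main() {
    entry := fileEntry{
        Path:   "/usr/bin/java",
        Digest: "sha256:4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865",
    }
    p, err := blobPath("/tmp/filegrain-image", entry.Digest)
    if err != nil {
        panic(err)
    }
    fmt.Printf("%s -> %s\n", entry.Path, p)
}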

Example

image index: (The second entry is a FILEgrain manifest)

{
    "schemaVersion": 2,
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            ...
        },
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            ...,
            "annotations": {
                "filegrain.version": "20170501"
            }
        }
    ]
}

image manifest: (a continuity layer on top of a tar layer)

{
    "schemaVersion": 2,
    "layers": [
        {
            "mediaType": "application/vnd.continuity.manifest.v0+json",
            ...
        },
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            ...
        }
    ],
    "annotations": {
        "filegrain.version": "20170501"
    }
}
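
Given an image index like the one above, a consumer can pick out the FILEgrain manifest by checking for the filegrain.version annotation. A minimal sketch using the OCI image-spec Go types; the index path assumes an on-disk OCI layout such as the /tmp/filegrain-image used in the POC section, and error handling is trimmed.

package main

import (
    "encoding/json"
    "fmt"
    "os"

    ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// findFILEgrainManifest returns the first manifest descriptor in an image
// index that carries the filegrain.version annotation.
func findFILEgrainManifest(index ocispec.Index) (ocispec.Descriptor, bool) {
    for _, desc := range index.Manifests {
        if _, ok := desc.Annotations["filegrain.version"]; ok {
            return desc, true
        }
    }
    return ocispec.Descriptor{}, false
}

func main() {
    raw, err := os.ReadFile("/tmp/filegrain-image/index.json")
    if err != nil {
        panic(err)
    }
    var index ocispec.Index
    if err := json.Unmarshal(raw, &index); err != nil {
        panic(err)
    }
    if desc, ok := findFILEgrainManifest(index); ok {
        fmt.Println("FILEgrain manifest:", desc.Digest)
    } else {
        fmt.Println("no FILEgrain manifest; fall back to the plain OCI manifest")
    }
}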

Distribution

FILEgrain is designed to be agnostic to the transport and hence can be distributed in any way.

My personal recommendation is to simply put the image directory on IPFS. However, I intentionally designed FILEgrain not to depend on IPFS multiaddr/multihash.

Future support for IPFS blob store

To avoid putting a large number of files into a single OCI blob directory, it might be good to consider using IPFS as an additional blob store.

IPFS blob store support has not been implemented yet, but it could look like this:

{
    "schemaVersion": 2,
    "layers": [
        {
            "mediaType": "application/vnd.continuity.manifest.v0+json",
            ...,
            "annotations": {
                "filegrain.ipfs": "QmFooBar"
            }
        }
    ],
    "annotations": {
        "filegrain.version": "2017XXXX"
    }
}

In this case, the layer SHOULD be fetched via the IPFS multihash rather than via the digest values specified in the continuity manifest. Also, the continuity manifest MAY omit digest values, since IPFS provides them redundantly.

Note that this is different from just putting the blobs directory onto IPFS, which would still create a lot of files in a single directory when the image is pulled by a non-FILEgrain implementation.
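
A consumer that understands such an annotation might resolve a layer roughly as follows. This is only a sketch of the idea: the filegrain.ipfs key is taken from the example above, while the local gateway address and the fallback to the layout's blobs directory are assumptions, not specified behavior.

package main

import (
    "fmt"
    "io"
    "net/http"
    "os"
    "path/filepath"
    "strings"

    ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// openLayer returns a reader for a layer blob. It prefers the IPFS multihash
// recorded in the filegrain.ipfs annotation (fetched through a local IPFS HTTP
// gateway) and falls back to the blobs directory of the local image layout.
func openLayer(layoutRoot string, desc ocispec.Descriptor) (io.ReadCloser, error) {
    if cid, ok := desc.Annotations["filegrain.ipfs"]; ok {
        // Default go-ipfs gateway address; adjust as needed.
        resp, err := http.Get("http://127.0.0.1:8080/ipfs/" + cid)
        if err != nil {
            return nil, err
        }
        if resp.StatusCode != http.StatusOK {
            resp.Body.Close()
            return nil, fmt.Errorf("ipfs gateway: %s", resp.Status)
        }
        return resp.Body, nil
    }
    // Fallback: blobs/<algorithm>/<hex> keyed by the descriptor digest.
    algHex := strings.Replace(desc.Digest.String(), ":", string(filepath.Separator), 1)
    return os.Open(filepath.Join(layoutRoot, "blobs", algHex))
}

func main() {
    desc := ocispec.Descriptor{
        MediaType:   "application/vnd.continuity.manifest.v0+json",
        Annotations: map[string]string{"filegrain.ipfs": "QmFooBar"},
    }
    rc, err := openLayer("/tmp/filegrain-image", desc)
    if err != nil {
        panic(err)
    }
    defer rc.Close()
    n, _ := io.Copy(io.Discard, rc)
    fmt.Println("fetched", n, "bytes")
}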

POC

Builder:

  • Build a FILEgrain image from an existing OCI image (--source-type oci-image)
  • Build a FILEgrain image from an existing Docker image (--source-type docker-image)
  • Build a FILEgrain image from a raw rootfs directory (--source-type rootfs)

Lazy Puller:

Mounter:

  • Read-only mount using FUSE (Linux)

A writable mount is not planned at the moment, as FILEgrain is optimized for "cattle" rather than "pets". Users should use bind mounts or a union filesystem for /tmp, /run, and /home.
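
Stripped of FUSE details, the lazy-pull behaviour of the mounter boils down to "fetch a blob into a local cache the first time it is read". The following is a simplified sketch of that idea, not the actual implementation; the blobCache type and its fetch callback are hypothetical.

package main

import (
    "fmt"
    "io"
    "os"
    "path/filepath"
    "strings"
)

// blobCache materializes blobs from a (possibly remote) source into a local
// cache directory the first time they are read, mimicking lazy pulling.
type blobCache struct {
    dir   string                                     // e.g. /tmp/filegrain-blobcacheXXXXX
    fetch func(digest string) (io.ReadCloser, error) // pulls a single blob by digest
}

// open returns the locally cached copy of the blob, fetching it on first access.
func (c *blobCache) open(digest string) (*os.File, error) {
    path := filepath.Join(c.dir, digest)
    if _, err := os.Stat(path); os.IsNotExist(err) {
        src, err := c.fetch(digest)
        if err != nil {
            return nil, err
        }
        defer src.Close()
        tmp, err := os.CreateTemp(c.dir, "pulling-")
        if err != nil {
            return nil, err
        }
        if _, err := io.Copy(tmp, src); err != nil {
            tmp.Close()
            return nil, err
        }
        tmp.Close()
        if err := os.Rename(tmp.Name(), path); err != nil {
            return nil, err
        }
    }
    return os.Open(path)
}

func main() {
    dir, err := os.MkdirTemp("", "filegrain-blobcache")
    if err != nil {
        panic(err)
    }
    cache := &blobCache{
        dir: dir,
        // Stand-in for pulling from a registry, IPFS, etc.
        fetch: func(digest string) (io.ReadCloser, error) {
            return io.NopCloser(strings.NewReader("content of " + digest)), nil
        },
    }
    f, err := cache.open("sha256:deadbeef")
    if err != nil {
        panic(err)
    }
    defer f.Close()
    data, _ := io.ReadAll(f)
    fmt.Println(string(data))
}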

POC Usage

Install the FILEgrain binary:

$ go get github.com/AkihiroSuda/filegrain

Convert a Docker image (e.g. java:8) to a FILEgrain image /tmp/filegrain-image:

# filegrain build -o /tmp/filegrain-image --source-type docker-image java:8

Prepare an OCI runtime bundle /tmp/bundle from ./oci-runtime-bundle.template:

# cp -r ./oci-runtime-bundle.template /tmp/bundle
# cd /tmp/bundle
# ./prepare.sh

Mount the local FILEgrain image /tmp/filegrain-image on /tmp/bundle/rootfs:

# filegrain mount /tmp/filegrain-image /tmp/bundle/rootfs

In the future, filegrain mount should support mounting remote images over the Docker Registry HTTP API as well.

Open another terminal, and start runC with the bundle /tmp/bundle:

# cd /tmp/bundle
# runc run foo

Instead of runc, you will also be able to use docker run once Docker supports running an arbitrary OCI runtime bundle.

The container starts without pulling all the blobs. Pulled blobs can be found under /tmp/filegrain-blobcacheXXXXX:

# du -hs /tmp/filegrain-blobcache*

This directory grows as you read(2) files within the container rootfs.

POC Benchmark

Please refer to #17.

e.g. pulling 352 MB of blobs is enough to use NLTK with the 8.3 GB kaggle/python image.

Similar work

Lazy distribution

FAQ

Q. Why not just use an IPFS directory? It is CAS at the granularity of a file.

A. Because IPFS does not support file metadata. Also, that approach is not transport-agnostic.

Q. What are the use cases for lazy-pulling?

A. Here are some examples I can come up with:

  • Huge web application composed of a lot of static HTML and graphic files
  • Huge scientific data (a content-addressable image with full code and full data would be great for reproducible research)
  • Huge OS image (e.g. Windows Server, Linux with VNC desktop)
  • Huge runtime (e.g. Java, .NET)
  • Huge image composed of multiple software stacks for integration testing

Please also refer to the list of similar work about lazy distribution.

Q. Isn't it a bad idea to put a lot of files into a single blobs directory?

A. This could be mitigated by avoiding putting files into the OCI blob store and using an external blob store instead, e.g. IPFS (go-ipfs supports sharding), although that is not transport-agnostic. See also the idea about future support for an IPFS blob store above.

Also, there is an idea to implement sharding in the OCI native blob store: opencontainers/image-spec#449.
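
To make the sharding idea concrete, a sharded blob store could key each blob by a short prefix of its hex digest so that no single directory grows unbounded. The layout below is purely illustrative and not part of any spec:

package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

// shardedBlobPath spreads blobs across subdirectories keyed by the first two
// hex characters of the digest, so that no single directory grows unbounded.
func shardedBlobPath(layoutRoot, dgst string) string {
    parts := strings.SplitN(dgst, ":", 2) // "sha256:4355a4..." -> ["sha256", "4355a4..."]
    alg, hex := parts[0], parts[1]
    return filepath.Join(layoutRoot, "blobs", alg, hex[:2], hex)
}

func main() {
    fmt.Println(shardedBlobPath("/tmp/filegrain-image",
        "sha256:4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865"))
    // prints /tmp/filegrain-image/blobs/sha256/43/4355a46b19d348dc...
}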
