# containerd Mounts and Mount Management
## Mount Type
`Mount` is an important struct in containerd used to represent a filesystem
without needing any active state. This allows deferring the mounting of
filesystems to when they are needed. This also allows temporary mounts, normally
done by containerd to inspect a container image's filesystem or make changes
before handing off a filesystem to the lower level runtime. This also allows the
lower level runtime to make optimizations such as using virtio-blk over
virtio-fs for some mounts or performing the mounts inside another mount
namespace.
The `Mount` type is used by the `Snapshotter` interface to return the filesystem
for a snapshot. This allows the snapshotter to focus on the storage lifecycle of
the snapshot without complicated logic to handle the runtime lifecycle of the
mounted filesystem. This is part of containerd's decoupled architecture, where
snapshotters and runtimes don't need to share state; only the set of mounts
needs to be communicated.
## Mount Management
In containerd 2.2, the mount manager was introduced to extend the functionality
of mounts, allowing for more powerful snapshotters and complex use cases that
are hard to represent with native filesystem mounts. It also adds an extra
layer of resource tracking to mounts, to add more protection against leaking
mounts in the host mount namespace.
### Extending the mount types
Typically, snapshotters are limited to mount types which are mountable by
the host kernel or, in some advanced use cases, a VM guest kernel. The mount
manager is able to extend the mount types by using a pluggable interface to
handle custom mount types.
The interface for a custom mount handler is very simple.
```go
type Handler interface {
	Mount(context.Context, Mount, string, []ActiveMount) (ActiveMount, error)
	Unmount(context.Context, string) error
}
```
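To make the shape of the interface concrete, here is a minimal sketch of a handler that only records what it is asked to mount. It is not part of containerd: `Mount` and `ActiveMount` are simplified local stand-ins, `recordingHandler` and `demo` are hypothetical names, and treating the string argument as the mount target is an assumption of this sketch.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Mount and ActiveMount are simplified stand-ins for the containerd types,
// reduced to the fields this sketch needs (assumed, not the real definitions).
type Mount struct {
	Type    string
	Source  string
	Options []string
}

type ActiveMount struct {
	Mount
	MountPoint string
}

// Handler mirrors the interface shown above.
type Handler interface {
	Mount(context.Context, Mount, string, []ActiveMount) (ActiveMount, error)
	Unmount(context.Context, string) error
}

var errNotMounted = errors.New("not mounted")

// recordingHandler is a hypothetical handler that performs no real mounts;
// it only tracks which targets are considered active.
type recordingHandler struct {
	mu     sync.Mutex
	active map[string]Mount
}

func newRecordingHandler() *recordingHandler {
	return &recordingHandler{active: make(map[string]Mount)}
}

// Mount records m under target (assumed to be the mount point) and reports it as active.
func (h *recordingHandler) Mount(ctx context.Context, m Mount, target string, _ []ActiveMount) (ActiveMount, error) {
	if err := ctx.Err(); err != nil {
		return ActiveMount{}, err
	}
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, ok := h.active[target]; ok {
		return ActiveMount{}, fmt.Errorf("target %q is already mounted", target)
	}
	h.active[target] = m
	return ActiveMount{Mount: m, MountPoint: target}, nil
}

// Unmount forgets the target, failing if it was never mounted.
func (h *recordingHandler) Unmount(_ context.Context, target string) error {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, ok := h.active[target]; !ok {
		return fmt.Errorf("%q: %w", target, errNotMounted)
	}
	delete(h.active, target)
	return nil
}

// demo mounts and unmounts one example entry and returns the mount point used.
func demo() (string, error) {
	var h Handler = newRecordingHandler()
	ctx := context.Background()
	am, err := h.Mount(ctx, Mount{Type: "example", Source: "/path/to/disk.img"}, "/run/example/1", nil)
	if err != nil {
		return "", err
	}
	return am.MountPoint, h.Unmount(ctx, am.MountPoint)
}

func main() {
	mountPoint, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("mounted and unmounted", mountPoint)
}
```

A real handler would perform the mount in `Mount` and undo it in `Unmount`; the bookkeeping shown here only illustrates the call sequence.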
#### Built-in Mount Handlers
##### Loopback Handler
The loopback handler (`loop`) allows mounting files as loopback devices. This is useful
for mounting disk images or filesystem images without requiring a pre-configured loopback device.
When another mount type supports the `loop` option, prefer that option over this handler, since it
lets the mount type decide whether a loopback is actually needed. Combining this handler with
another mount type may force a loopback to be used even when it is not necessary.
```go
// Example mount using loopback
mount.Mount{
	Type:    "loop",
	Source:  "/path/to/disk.img",
	Options: []string{},
}
```
The handler automatically:
- Sets up the loopback device using the first available loop device
- Makes the device available at the mount point
- Handles cleanup on unmount
### Mount Transformers
Mount transformers are interfaces that can modify mounts based on previous mount state.
Transformers are useful for preparing mounts before they are activated, such as creating
directories or formatting filesystems.
```go
type Transformer interface {
	Transform(context.Context, Mount, []ActiveMount) (Mount, error)
}
```
Transformers are specified in the mount type using a prefix pattern: `<transformer>/<mount-type>`.
Multiple transformers can be chained: `<transformer1>/<transformer2>/<mount-type>`.
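As an illustration of the prefix pattern, the following sketch splits a mount type into its transformer prefixes and the final mount type. `splitMountType` is a hypothetical helper written for this example, not containerd's implementation, and it assumes the final path segment is always the mount type.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMountType separates the transformer prefixes from the final mount type,
// following the <transformer>/<mount-type> pattern. Every segment except the
// last is treated as a transformer name (an assumption of this sketch).
func splitMountType(t string) (transformers []string, mountType string) {
	parts := strings.Split(t, "/")
	return parts[:len(parts)-1], parts[len(parts)-1]
}

func main() {
	for _, t := range []string{"overlay", "mkfs/loop", "format/mkdir/overlay"} {
		transformers, mountType := splitMountType(t)
		fmt.Printf("%-22s transformers=%v type=%s\n", t, transformers, mountType)
	}
}
```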
#### Built-in Transformers
##### Format Transformer (`format/`)
In order to chain mounts together, the results from a previous mount may be
needed for subsequent mounts. Some of these mount parameters may not be
known until mount time, making it impossible to represent with static
mount values. Formatted mounts allow providing templated values for mount
parameters to be filled in at mount activation time using the previous
mounts' results.
Formatted mounts have a type that starts with `format/` followed by the
intended mount type after filling in format values.
Values are filled in using templating based on
[Go templates](https://pkg.go.dev/text/template).
Values are referenced by the index of the previous active mounts. The following
is a list of supported values that can be provided in formatted mounts.
| Value | Args | Example | Description |
|-------|------|---------|-------------|
| `source` | `<index>` | `{{ source 0 }}` | Source from the active mount at `<index>` |
| `target` | `<index>` | `{{ target 0 }}` | Target from the active mount at `<index>` |
| `mount` | `<index>` | `{{ mount 0 }}` | Mount point from the active mount at `<index>` |
| `overlay` | `<start> <end>` | `{{ overlay 0 2 }}` | Fill in overlayfs lowerdir arguments for active mount points at `<start>` through `<end>` |
Formatted mounts are handled differently than other custom mounts. If the
resulting mount after formatting is a supported system mount, it does not
need to be mounted by the mount handlers like custom mounts do.
**Example:**
```go
// First mount provides the lower layer
mount.Mount{
	Type:    "bind",
	Source:  "/var/lib/containerd/snapshots/1",
	Options: []string{"ro"},
},
// Second mount uses formatting to reference the first mount
mount.Mount{
	Type:   "format/overlay",
	Source: "overlay",
	Options: []string{
		"lowerdir={{ mount 0 }}",
		"upperdir=/upper",
		"workdir=/work",
	},
}
```
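To show how such templated values could be resolved, here is a small sketch built on `text/template`. It is an illustration only: `activeMount` and `formatOption` are hypothetical names, the mount point `/run/mounts/1` is made up, and the `overlay` function is left out because its exact output format is not specified here.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// activeMount is a hypothetical stand-in holding the values a template may reference.
type activeMount struct {
	Source     string
	Target     string
	MountPoint string
}

// formatOption fills in one templated option using previously activated mounts.
func formatOption(opt string, active []activeMount) (string, error) {
	// lookup builds a template function that selects one field by mount index.
	lookup := func(field func(activeMount) string) func(int) (string, error) {
		return func(i int) (string, error) {
			if i < 0 || i >= len(active) {
				return "", fmt.Errorf("no active mount at index %d", i)
			}
			return field(active[i]), nil
		}
	}
	funcs := template.FuncMap{
		"source": lookup(func(a activeMount) string { return a.Source }),
		"target": lookup(func(a activeMount) string { return a.Target }),
		"mount":  lookup(func(a activeMount) string { return a.MountPoint }),
	}
	tmpl, err := template.New("option").Funcs(funcs).Parse(opt)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, nil); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	active := []activeMount{
		{Source: "/var/lib/containerd/snapshots/1", MountPoint: "/run/mounts/1"},
	}
	out, err := formatOption("lowerdir={{ mount 0 }}", active)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints "lowerdir=/run/mounts/1"
}
```

Referencing an index with no active mount makes the template execution fail, which surfaces mistakes in a mount chain early rather than producing an empty path.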
##### Mkfs Transformer (`mkfs/`)
The mkfs transformer creates and formats filesystem images. It supports creating
ext2, ext3, ext4, and xfs filesystems in files that can then be mounted as loopback devices.
Mount options with the `X-containerd.mkfs.` prefix are consumed by the transformer:
| Option | Description | Example |
|--------|-------------|---------|
| `X-containerd.mkfs.size` | Size of the filesystem image (supports units like MiB, GiB) | `X-containerd.mkfs.size=100MiB` |
| `X-containerd.mkfs.fs` | Filesystem type (ext2, ext3, ext4, xfs) | `X-containerd.mkfs.fs=ext4` |
| `X-containerd.mkfs.uuid` | UUID for the filesystem | `X-containerd.mkfs.uuid=550e8400-e29b-41d4-a716-446655440000` |
**Example:**
```go
mount.Mount{
	Type:   "mkfs/loop",
	Source: "/path/to/disk.img",
	Options: []string{
		"X-containerd.mkfs.size=1GiB",
		"X-containerd.mkfs.fs=ext4",
	},
}
```
This will:
1. Create a 1GiB file at `/path/to/disk.img`
2. Format it as ext4
3. Set up a loopback device for the file
4. Return the loop device for subsequent mounting
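The option handling described above can be sketched as follows. This is not containerd's implementation: `mkfsConfig`, `parseSize`, and `splitMkfsOptions` are hypothetical names, and the supported unit suffixes (KiB, MiB, GiB) and the treatment of a bare number as bytes are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const mkfsPrefix = "X-containerd.mkfs."

// mkfsConfig is a hypothetical holder for the parsed transformer options.
type mkfsConfig struct {
	SizeBytes int64
	FS        string
	UUID      string
}

// parseSize converts a size such as "100MiB" or "1GiB" to bytes.
// A bare number is treated as bytes (an assumption of this sketch).
func parseSize(s string) (int64, error) {
	factor := int64(1)
	for suffix, f := range map[string]int64{"KiB": 1 << 10, "MiB": 1 << 20, "GiB": 1 << 30} {
		if strings.HasSuffix(s, suffix) {
			s, factor = strings.TrimSuffix(s, suffix), f
			break
		}
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("invalid size %q", s)
	}
	return n * factor, nil
}

// splitMkfsOptions consumes options with the mkfs prefix and returns the
// remaining options untouched, in their original order.
func splitMkfsOptions(opts []string) (mkfsConfig, []string, error) {
	var cfg mkfsConfig
	var rest []string
	for _, o := range opts {
		if !strings.HasPrefix(o, mkfsPrefix) {
			rest = append(rest, o)
			continue
		}
		key, value, _ := strings.Cut(strings.TrimPrefix(o, mkfsPrefix), "=")
		switch key {
		case "size":
			n, err := parseSize(value)
			if err != nil {
				return mkfsConfig{}, nil, err
			}
			cfg.SizeBytes = n
		case "fs":
			cfg.FS = value
		case "uuid":
			cfg.UUID = value
		default:
			return mkfsConfig{}, nil, fmt.Errorf("unknown mkfs option %q", o)
		}
	}
	return cfg, rest, nil
}

func main() {
	cfg, rest, err := splitMkfsOptions([]string{
		"X-containerd.mkfs.size=1GiB",
		"X-containerd.mkfs.fs=ext4",
		"ro",
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("size=%d fs=%s remaining=%v\n", cfg.SizeBytes, cfg.FS, rest)
}
```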
##### Mkdir Transformer (`mkdir/`)
The mkdir transformer creates directories before mounting. This is useful for
ensuring overlay upperdir and workdir directories exist, or for creating mount points.
Mount options with the `X-containerd.mkdir.` prefix are consumed by the transformer:
| Option Format | Description |
|--------------|-------------|
| `X-containerd.mkdir.path=<dir>` | Create directory with default permissions (0700) |
| `X-containerd.mkdir.path=<dir>:<mode>` | Create directory with specified octal mode |
| `X-containerd.mkdir.path=<dir>:<mode>:<uid>:<gid>` | Create directory with mode and ownership |
**Example:**
```go
mount.Mount{
	Type:   "format/mkdir/overlay",
	Source: "overlay",
	Options: []string{
		"X-containerd.mkdir.path={{ mount 0 }}/upper:0755",
		"X-containerd.mkdir.path={{ mount 0 }}/work:0755",
		"lowerdir={{ mount 1 }}",
		"upperdir={{ mount 0 }}/upper",
		"workdir={{ mount 0 }}/work",
	},
}
```
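The three option formats from the table can be parsed with a few lines of Go. The sketch below is illustrative rather than containerd's code: `mkdirSpec` and `parseMkdirOption` are hypothetical names, a UID and GID of -1 are used here to mean "ownership not specified", and paths containing `:` are not handled.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

const mkdirPathPrefix = "X-containerd.mkdir.path="

// mkdirSpec is a hypothetical holder for one parsed mkdir option.
type mkdirSpec struct {
	Path string
	Mode os.FileMode
	UID  int // -1 when no ownership is given (a convention of this sketch)
	GID  int
}

// parseMkdirOption accepts <dir>, <dir>:<mode>, or <dir>:<mode>:<uid>:<gid>.
func parseMkdirOption(opt string) (mkdirSpec, error) {
	if !strings.HasPrefix(opt, mkdirPathPrefix) {
		return mkdirSpec{}, fmt.Errorf("not a mkdir option: %q", opt)
	}
	fields := strings.Split(strings.TrimPrefix(opt, mkdirPathPrefix), ":")
	if fields[0] == "" || (len(fields) != 1 && len(fields) != 2 && len(fields) != 4) {
		return mkdirSpec{}, fmt.Errorf("malformed mkdir option: %q", opt)
	}
	// Default permissions are 0700 when no mode is given.
	spec := mkdirSpec{Path: fields[0], Mode: 0700, UID: -1, GID: -1}
	if len(fields) >= 2 {
		mode, err := strconv.ParseUint(fields[1], 8, 32)
		if err != nil {
			return mkdirSpec{}, fmt.Errorf("invalid mode in %q: %w", opt, err)
		}
		spec.Mode = os.FileMode(mode)
	}
	if len(fields) == 4 {
		var err error
		if spec.UID, err = strconv.Atoi(fields[2]); err != nil {
			return mkdirSpec{}, fmt.Errorf("invalid uid in %q: %w", opt, err)
		}
		if spec.GID, err = strconv.Atoi(fields[3]); err != nil {
			return mkdirSpec{}, fmt.Errorf("invalid gid in %q: %w", opt, err)
		}
	}
	return spec, nil
}

func main() {
	for _, opt := range []string{
		"X-containerd.mkdir.path=/data/upper",
		"X-containerd.mkdir.path=/data/upper:0755",
		"X-containerd.mkdir.path=/data/upper:0755:1000:1000",
	} {
		spec, err := parseMkdirOption(opt)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s mode=%o uid=%d gid=%d\n", spec.Path, uint32(spec.Mode), spec.UID, spec.GID)
	}
}
```

A transformer built on this could then create each directory, for example with `os.MkdirAll(spec.Path, spec.Mode)` followed by `os.Chown` when ownership is given.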
#### Chaining Transformers
Transformers can be chained together to perform multiple operations in sequence:
```go
mount.Mount{
	Type:   "mkfs/loop",
	Source: "/data/fs.img",
	Options: []string{
		"X-containerd.mkfs.size=500MiB",
		"X-containerd.mkfs.fs=xfs",
	},
},
mount.Mount{
	// The format/ prefix is needed for the template below to be filled in
	Type:    "format/xfs",
	Source:  "{{ source 0 }}", // Loop device from previous mount
	Options: []string{},
},
mount.Mount{
	Type:   "format/mkdir/overlay",
	Source: "overlay",
	Options: []string{
		"X-containerd.mkdir.path={{ mount 1 }}/upper:0755",
		"X-containerd.mkdir.path={{ mount 1 }}/work:0755",
		"lowerdir=/lower",
		"upperdir={{ mount 1 }}/upper",
		"workdir={{ mount 1 }}/work",
	},
}
```
This example:
1. Creates and formats a 500MiB XFS image
2. Sets up a loop device and mounts the XFS filesystem
3. Creates directories on the XFS filesystem and sets up an overlay
### Garbage Collection and Backreferences
The mount manager integrates with containerd's garbage collection system to ensure
mounts are properly tracked and cleaned up. Mounts can reference other resources
using special labels:
| Label | Description |
|-------|-------------|
| `containerd.io/gc.bref.container.*` | Back reference to a container |
| `containerd.io/gc.bref.content.*` | Back reference to content in the content store |
| `containerd.io/gc.bref.image.*` | Back reference to an image |
| `containerd.io/gc.bref.snapshot.*` | Back reference to a snapshot |
The `.*` suffix allows for named backreferences separated by `.` or `/`.
**Example:**
```go
info, err := mountManager.Activate(ctx, "my-mount", mounts,
	mount.WithLabels(map[string]string{
		"containerd.io/gc.bref.container":          "container-id-123",
		"containerd.io/gc.bref.snapshot.overlayfs": "active-snapshot-key",
	}),
)
```
These labels ensure that the mount won't be garbage collected while the
referenced resources still exist, and the mount will be automatically cleaned
up when the references are removed.
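The label naming scheme can be illustrated with a small parser that recognizes a backreference key and extracts its optional name. This is a sketch of the naming convention described above, not the garbage collector's actual code; `parseBackref` and `backrefPrefixes` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// backrefPrefixes lists the label prefixes from the table above with the
// kind of resource each one refers to.
var backrefPrefixes = []struct{ prefix, kind string }{
	{"containerd.io/gc.bref.container", "container"},
	{"containerd.io/gc.bref.content", "content"},
	{"containerd.io/gc.bref.image", "image"},
	{"containerd.io/gc.bref.snapshot", "snapshot"},
}

// parseBackref reports the resource kind and the optional name suffix of a
// backreference label key. The name is whatever follows a "." or "/"
// separator after the prefix.
func parseBackref(key string) (kind, name string, ok bool) {
	for _, p := range backrefPrefixes {
		if key == p.prefix {
			return p.kind, "", true
		}
		if strings.HasPrefix(key, p.prefix+".") || strings.HasPrefix(key, p.prefix+"/") {
			return p.kind, key[len(p.prefix)+1:], true
		}
	}
	return "", "", false
}

func main() {
	for _, key := range []string{
		"containerd.io/gc.bref.container",
		"containerd.io/gc.bref.snapshot.overlayfs",
		"example.com/unrelated",
	} {
		kind, name, ok := parseBackref(key)
		fmt.Printf("%-42s ok=%-5v kind=%q name=%q\n", key, ok, kind, name)
	}
}
```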
### Relationship with runtimes
The runtime should use the mount manager to initiate activation of the mounts
before setting up the rootfs for a container. The runtime name should be passed
along to the activation call so that the mount manager may be configured for
runtime specific behavior.
The `ActivateOptions` allow runtimes to indicate which mount types they can handle:
```go
// Runtime can handle formatting, so don't let mount manager do it
info, err := mountManager.Activate(ctx, name, mounts,
	mount.WithAllowMountType("format/*"),
)

// Runtime can handle loop devices
info, err = mountManager.Activate(ctx, name, mounts,
	mount.WithAllowMountType("loop"),
)
```
#### Support with containerd shims
By default, the containerd runtime will call the mount manager to activate mounts,
which will perform any transformations and custom mounts. However, a runtime shim may
choose to handle some mount types or transformations itself in order to optimize
performance based on the runtime environment. For example, a VM based runtime may
choose to handle loopback mounts itself by passing the disk image file directly to
the VM instead of setting up a loop device on the host. The runtime shim may export
the annotation `containerd.io/runtime-allow-mounts` in its runtime info to indicate
which mount types the shim can handle. The values are comma separated and passed
via the `mount.WithAllowMountType` option when activating mounts.
### Mount Manager Interface
The complete mount manager interface:
```go
type Manager interface {
	Activate(context.Context, string, []Mount, ...ActivateOpt) (ActivationInfo, error)
	Deactivate(context.Context, string) error
	Info(context.Context, string) (ActivationInfo, error)
	Update(context.Context, ActivationInfo, ...string) (ActivationInfo, error)
	List(context.Context, ...string) ([]ActivationInfo, error)
}
```
**Methods:**
- `Activate`: Activate a set of mounts with a unique name
- `Deactivate`: Unmount and clean up an activation
- `Info`: Get information about an active mount
- `Update`: Update an active mount (not yet implemented)
- `List`: List all active mounts, optionally filtered
**ActivationInfo** contains:
- `Name`: Unique identifier for the activation
- `Active`: Mounts that were handled by the mount manager
- `System`: Remaining mounts that must be performed by the system/runtime
- `Labels`: Labels associated with the activation
### Storage and Persistence
The mount manager stores activation state in a BoltDB database and maintains
mount targets in a dedicated directory. This provides:
- Crash recovery: Mounts can be tracked and cleaned up after daemon restart
- Garbage collection: Integration with containerd's GC system
- Lease support: Mounts can be associated with leases for lifecycle management
### Example Usage
```go
// Initialize the mount manager (error handling is omitted for brevity)
mm, err := manager.NewManager(
	db,
	targetDir,
	manager.WithMountHandler("loop", mount.LoopbackHandler()),
)

// Create mounts for a writable overlay backed by a custom block device
mounts := []mount.Mount{
	{
		Type:   "mkfs/loop",
		Source: "/data/writable.img",
		Options: []string{
			"X-containerd.mkfs.size=1GiB",
			"X-containerd.mkfs.fs=ext4",
		},
	},
	{
		// The format/ prefix is needed for the template below to be filled in
		Type:   "format/ext4",
		Source: "{{ source 0 }}",
	},
	{
		// format/ comes first so the mkdir paths are filled in before use,
		// matching the earlier examples
		Type:   "format/mkdir/overlay",
		Source: "overlay",
		Options: []string{
			"X-containerd.mkdir.path={{ mount 1 }}/upper:0755",
			"X-containerd.mkdir.path={{ mount 1 }}/work:0755",
			"lowerdir=/snapshots/base",
			"upperdir={{ mount 1 }}/upper",
			"workdir={{ mount 1 }}/work",
		},
	},
}

// Activate with a backreference to the container
info, err := mm.Activate(ctx, "container-123-rootfs", mounts,
	mount.WithLabels(map[string]string{
		"containerd.io/gc.bref.container": "container-123",
	}),
)
// info.Active contains mounts handled by mount manager
// info.System contains remaining mounts to perform in container namespace

// Later, clean up
err = mm.Deactivate(ctx, "container-123-rootfs")
```