Bringing Docker Containers To IoT & Edge Devices, Part 2: Optimal and Reliable Storage Usage | Foundries.io


Posted on Jan 24, 2023 by Mykhaylo Sul


Part one walks you through our path towards container applications on IoT & Edge devices. It also outlines the docker engine vulnerability to power cuts during image downloading and unpacking (e.g. https://github.com/moby/moby/issues/42964), and how our platform overcomes this by means of "Restorable" Compose Apps.

The key element of the Foundries.io™ solution is an Over-The-Air (OTA) update mechanism that allows for updating an ostree-based rootfs as well as the Compose Apps on devices. One of the OTA update challenges is to make sure that a device disk will not become full. This can be guaranteed via the following two assurances:

  1. Files that have become obsolete after an update are removed.
  2. New update data won't fill the entire disk while downloading and storing to a disk.

Failing to meet these two assurances can lead to undefined behavior of a device and/or Apps or, in the worst case, a bricked device. The following is how we at Foundries.io have addressed this challenge to guarantee optimal and reliable storage usage.

Problem Statement

Ostree has a built-in mechanism to guarantee that the disk is not entirely filled during an OTA update. It removes unused files following a successful update, and checks whether a defined level of free space hasn't been crossed. By default, this is three percent of the total storage volume size. It is configurable through the min-free-space-percent and min-free-space-size variables. Check out man ostree.repo-config for more details.

Neither Docker Engine nor skopeo has such a mechanism to prevent OTA updates from overfilling storage over a device's lifetime. Originally, Docker Engine was designed for Cloud usage, and you know what they say about the Cloud: "it's elastic and so is storage!" Why the heck worry about storage overfilling and optimal usage then?

Because device storage is not elastic, and edge storage depends heavily on the type of "edge". We have no choice but to care about storage.

To be fair to Docker Engine, it does use storage as optimally as it can. The whole idea behind sharing image layers among many containers is about optimal network bandwidth and storage usage. It also has an API and commands to delete unused data (e.g. docker image prune). But it never checks how much free storage is left prior to downloading or unpacking a layer, hence storage can be overfilled during image pulling. Also, neither image nor container pruning works perfectly, and this leaves garbage that accumulates over time.

As you may know from reading the previous part, our solution uses skopeo for image pulling. Unfortunately, it also does not verify available storage before fetching image layers.

To remedy this issue, we advise our customers to decouple their system rootfs storage from Apps storage. Specifically, to keep the docker engine data directory (/var/lib/docker) on a separate volume. While this is a good idea in general, and helps to reduce the negative effect of overfilling the storage volume dedicated to Apps, it may still lead to undefined App behaviour, and can even make the device unusable from a user perspective.

Therefore, we rolled up our sleeves and implemented a solution to address the issue.

Making The OTA Update Storage Friendly

The solution consists of the following steps:

  1. Gathering metadata (hash and size) of each image layer for all App images for every supported architecture.
  2. Adding the metadata to the Compose App container image.
  3. Fetching the metadata about App image layers and comparing them with the layers currently stored on a device (skopeo store).
  4. Calculating the overall size of the missing layers that will be downloaded, and checking if there is enough storage to save them.
  5. Downloading the missing layers, then installing and starting the updated Apps/containers.
  6. Regardless of the install & start result (success or failure), removing the unused containers and images from both the skopeo and docker layer stores.

Gathering And Attaching Metadata

A Compose App is packaged and distributed as a container image. Packaging the App and pushing it to the Registry is the last step of the container CI build in the "publish-compose-apps" run. We extended the packaging utility with the functionality to gather metadata about the layers of all App images for each architecture. The metadata is then attached to the Compose App image manifest and pushed to the Registry along with the rest of the image data. This is an example of an App image manifest, shown by using fioctl targets show compose-app <build number> <app name> --manifest:

Manifest:
	annotations:
	  compose-app: v1
	config:
	  digest: sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
	  mediaType: application/vnd.oci.image.config.v1+json
	layers:
	- digest: sha256:ba16118aa9d7550d15f2dcf586e7ae51fbe167ed52aacdc7c4d3a782566428b1
	  mediaType: application/octet-stream
	  size: 586
	manifests:
	- digest: sha256:7fa755bd502c1ae33fed5e0f23f7d45a7d485924ef276c69f651320daec6e8e8
	  mediaType: application/vnd.oci.image.index.v1+json
	  platform:
	    architecture: amd64
	    os: ""
	  size: 1629
	- digest: sha256:4a24d4a7609c2adaa6105035b2811eed634d2d9c01358526d7544dcbe7e57cb5
	  mediaType: application/vnd.oci.image.index.v1+json
	  platform:
	    architecture: arm64
	    os: ""
	  size: 1789
	mediaType: application/vnd.oci.image.manifest.v1+json
	schemaVersion: 2

As you can see, it includes references to the "layer index" images for each supported architecture (amd64 and arm64 in this case). Each "layer index" image lists the size and the digest of each image layer.

skopeo inspect docker://hub.foundries.io/<factory>/<app>@sha256:7fa755bd502c1ae33fed5e0f23f7d45a7d485924ef276c69f651320daec6e8e8 --raw | jq

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "platform": {
    "architecture": "amd64",
    "os": ""
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 120,
      "digest": "sha256:388169b3a3d311d29d28eeba4fa687b0c18207b040f74b49622e5eca4b765fa6"
    },
    ...
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1732918,
      "digest": "sha256:f2767c9963aac10b5b498e5b98bc719fd8a9ee9e79d7d6dee720d0fc441ea1d9"
    }
  ],
  "annotations": {
    "compose-app-layers": "v1"
  }
}
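
As a rough illustration of how such a layer index could be consumed, the following Python sketch parses a document of the shape shown above and sums the compressed layer sizes. The function name layer_sizes and the trimmed sample document are illustrative assumptions (the digests are placeholders); this is not the code used by the update agent.

```python
from __future__ import annotations

import json

# Trimmed sample with the same shape as the layer index above. The digests are
# placeholders; only the two sizes are taken from the example.
SAMPLE_INDEX = """
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "platform": {"architecture": "amd64", "os": ""},
  "layers": [
    {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
     "size": 120, "digest": "sha256:aaa"},
    {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
     "size": 1732918, "digest": "sha256:bbb"}
  ],
  "annotations": {"compose-app-layers": "v1"}
}
"""


def layer_sizes(index_json: str) -> dict[str, int]:
    """Map each layer digest to its compressed size in bytes."""
    index = json.loads(index_json)
    return {layer["digest"]: int(layer["size"]) for layer in index.get("layers", [])}


sizes = layer_sizes(SAMPLE_INDEX)
total = sum(sizes.values())
print(f"{len(sizes)} layers, {total} bytes compressed")
```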

Checking if the Update Fits

The update agent (aktualizr-lite) fetches the App manifest and the layer index from the Registry, then iterates over the locally stored layers to find out which App layers are already present in the store and which are missing and need to be downloaded. The sum of the missing layer sizes is the overall size required to accommodate the given update in the local store. In addition, we need to take into account the storage that the missing layers require in unpacked form in the docker store. We estimate it by assuming a gzip compression ratio rather than trying to determine the exact number, which would require an unjustifiable level of effort. Then, depending on whether the docker and skopeo stores are located on the same storage volume, the update agent performs one or two checks of available storage.
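
The calculation described above might look roughly like the Python sketch below. It is only an approximation under stated assumptions: the ratio value ASSUMED_UNPACK_RATIO, all function names, and the way the single check on a shared volume combines both sizes are hypothetical, since the article does not spell out these details of aktualizr-lite.

```python
import os

# Assumed ratio of unpacked to compressed layer size. The text only says that a
# gzip compression ratio is assumed; this value is an illustrative placeholder.
ASSUMED_UNPACK_RATIO = 3.0


def missing_layers(app_layers, local_digests):
    """Return {digest: size} for App layers that are not in the local store."""
    return {d: s for d, s in app_layers.items() if d not in local_digests}


def required_space(missing, ratio=ASSUMED_UNPACK_RATIO):
    """Return (download_bytes, estimated_unpacked_bytes) for the missing layers."""
    download = sum(missing.values())
    return download, int(download * ratio)


def same_volume(path_a, path_b):
    """True if both paths live on the same filesystem (same device ID)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def update_fits(download, unpacked, skopeo_avail, docker_avail, shared):
    """One combined check on a shared volume, otherwise one check per store."""
    if shared:
        # Assumption: on a shared volume both forms compete for the same space.
        return download + unpacked <= skopeo_avail
    return download <= skopeo_avail and unpacked <= docker_avail


# Usage with made-up numbers: one of two layers is already stored locally.
app = {"sha256:aaa": 120, "sha256:bbb": 1_000_000}
missing = missing_layers(app, {"sha256:aaa"})
download, unpacked = required_space(missing)
print(update_fits(download, unpacked, 5_000_000, 5_000_000, shared=True))
```

On a device, same_volume could be called with the two store directories to decide the value of shared before running the check.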

The aktualizr-lite parameter storage_watermark can be used to customize the watermark for storage usage by Compose Apps. By default, it is set to 80 percent of the overall volume size.
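
To make the watermark arithmetic concrete, here is a minimal Python sketch, assuming the watermark is interpreted as the maximum share of the volume that may be in use once an update is stored. The function names are hypothetical, and the exact formula used by aktualizr-lite may differ.

```python
import shutil


def available_for_apps(total_bytes, used_bytes, watermark_percent=80):
    """Bytes that can still be written before usage crosses the watermark."""
    limit = total_bytes * watermark_percent // 100
    return max(0, limit - used_bytes)


def available_on_volume(path, watermark_percent=80):
    """Apply the same arithmetic to the volume that holds the given path."""
    usage = shutil.disk_usage(path)
    return available_for_apps(usage.total, usage.used, watermark_percent)


# A 10,000-byte volume with 7,000 bytes used leaves 1,000 bytes under an
# 80 percent watermark; with 9,000 bytes used nothing is left.
print(available_for_apps(10_000, 7_000))
print(available_for_apps(10_000, 9_000))
```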

Pruning Unused Images

Once again, regardless of the update status (success or failure), the agent removes the unused layers in both stores, skopeo and Docker. Image pruning can be turned off for all images or for specified images; see the reference manual for more details.

It is worth noting that if the docker store becomes corrupted during layer unpacking, the store state can be restored. To do so:

  1. stop the aktualizr-lite systemd daemon if it is running;
  2. remove the corrupted images/containers or even the /var/lib/docker/image/overlay2 and /var/lib/docker/overlay2 directories;
  3. start the aktualizr-lite systemd daemon or run it from command line.

Please be aware that removing these directories removes the containers' top read-write layers too. This means the removal of any user data written directly to the container filesystem during its lifetime. In any case, we highly advise using a read-only file system for containers and writing data only to mounted volumes.

Next Steps

While Restorable Apps have been proven over time, there is always room for improvement. In addition to keeping our container technologies up-to-date in the LmP, we've been following a few interesting developments in that domain, for example nerdctl, composefs, and Nydus. As they mature, we may incorporate some of their elements into our solution.
