OSTree Static Deltas

Rolling out big over-the-air (OTA) updates efficiently is something we are often asked about. This article covers an OSTree feature called static deltas and how it makes platform updates more efficient.

Background

An OSTree server is essentially a content-addressable storage (CAS) system. A good way to explain OSTree is to compare and contrast it with two other familiar CAS systems: Git and the Docker Registry. When you update your local copy to the latest version, you need to figure out how to get from point A to point B (or what I'll call "source" to "destination"). Each system does this differently based on its own needs.

Git

Git's modern "smart" protocol is interactive. You tell the server your source and your desired destination, and the server helps you figure out the best way to get there. This makes sense for Git: you have lots of clients at different sources and many storage objects in a tree-based hierarchy. It does have a big drawback: hosting a Git server is pretty resource-intensive.

Docker Registry

Docker has a totally different usage pattern. A container image is a few layers, and each layer is an object. The client simply looks at the layers in a container image and downloads the ones it doesn't already have. This works pretty well, especially on server-class hardware in data centers. It does have a drawback: a single layer can sometimes be huge, so clients occasionally have to download huge files. If a huge download fails, the client has to retry downloading the entire file.

OSTree

OSTree is inspired by Git. However, the designers had an explicit goal of avoiding smart servers. This is great for their stated reasons, but it has a drawback. OSTree has many objects in a tree hierarchy, and these objects are often small. This leads to clients making hundreds (sometimes thousands) of small HTTP requests during an update. Even with advancements like HTTP connection reuse, this can be slow for both the client and the server.

Static deltas are OSTree's solution to this problem. Since OSTree clients aren't spread across as many arbitrary sources as Git clients, an operator can better reason about what updates are going to look like. Static deltas allow an operator to generate optimized downloads ahead of time. For example, I've produced an OTA that would change 750 files totaling about 1.5 GB of data. Without static deltas, an OTA like this is nearly impossible. OSTree generates the static delta as 38 files that are each about 30 MB. This is a pretty good balance for a resource-constrained device with spotty internet connectivity.
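
For a feel of what this looks like on the OSTree side, here is a minimal sketch using the upstream ostree command line. The function name, repo path, and revisions are placeholders for illustration, and this is not necessarily how the Foundries.io backend drives it.

```shell
# Sketch: generate and inspect a static delta with the upstream ostree CLI.
# OSTREE can be overridden (for example, for a dry run of this script).
# make_static_delta is a made-up helper name, not part of ostree.
OSTREE="${OSTREE:-ostree}"

make_static_delta() {
    repo="$1"   # path to the OSTree repo on the server (placeholder)
    from="$2"   # commit the devices are running now (source)
    to="$3"     # commit being rolled out (destination)

    # Pre-compute the source -> destination delta inside the repo.
    "$OSTREE" --repo="$repo" static-delta generate --from="$from" --to="$to" || return 1

    # List the static deltas the repo now holds.
    "$OSTREE" --repo="$repo" static-delta list
}
```

Calling make_static_delta with a repo path and two commit checksums would generate one delta for that source and destination pair. Devices on any other source would fall back to fetching individual objects.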

Foundries.io Approach to Static Deltas

For development workflows, static deltas might not be necessary. However, for production and more formal test devices, adding static-delta generation to your workflow is a good idea. There are two ways you might look at this: with and without waves.

Without Waves

Consider this scenario: You have the classic "devel" and "master" tagged Targets to handle CI. Certain "master" Targets are occasionally tagged with "promoted" and used by your "production" or formal verification devices. You want to roll out Target #42, which is going to result in a big OTA.

# Find out what static deltas are needed for your devices looking at the "promoted" tag:
$ fioctl targets static-deltas --dryrun --by-tag promoted 42
Dry run: Would generated static deltas for target versions:
   22 -> 42
   32 -> 42

What this is saying is that all "promoted" devices are running either Target #22 or Target #32. The backend can produce two static deltas to make an optimal OTA to #42. The deltas can then be produced by running the same command without the --dryrun option. This command takes a while to complete and runs in our CI system. Fioctl will tail the output from CI in case you want to wait and watch.
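
If you want to act on that list in a script, for example to double-check which versions are out in the field, the source versions can be pulled out of the dry-run output. This is a small sketch that assumes the output format shown above and that it is written to stdout; the function name is made up.

```shell
# Sketch: read dry-run output on stdin and print the source version of
# each "SRC -> DST" line. The header line is skipped because its second
# field is not "->". delta_sources is a hypothetical helper name.
delta_sources() {
    awk '$2 == "->" { print $1 }'
}
```

Piping the dry-run command into delta_sources would then print one source version per line.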

Once static deltas are in place, you mark Target #42 as "promoted" with:

$ fioctl targets tag -T master,promoted --by-version 42

Now as promoted devices check in, they'll see and apply the OTA for Target #42.
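
The steps above can be strung together so that the Target is only tagged once delta generation has succeeded. This is a sketch, not an official workflow: the function name is made up, and it assumes fioctl exits non-zero when the static-delta job fails.

```shell
# Sketch of the non-wave rollout as one function.
# FIOCTL can be overridden for testing; promote_with_deltas is a
# hypothetical helper name.
FIOCTL="${FIOCTL:-fioctl}"

promote_with_deltas() {
    follow_tag="$1"   # tag the devices follow, e.g. promoted
    new_tags="$2"     # full tag list for the Target, e.g. master,promoted
    version="$3"      # Target version to roll out, e.g. 42

    # 1. Show which static deltas would be produced.
    "$FIOCTL" targets static-deltas --dryrun --by-tag "$follow_tag" "$version" || return 1

    # 2. Produce them: same command without --dryrun (runs in CI, takes a while).
    "$FIOCTL" targets static-deltas --by-tag "$follow_tag" "$version" || return 1

    # 3. Only once the deltas exist, tag the Target so devices pick it up.
    "$FIOCTL" targets tag -T "$new_tags" --by-version "$version"
}
```

Running promote_with_deltas promoted master,promoted 42 would issue the same commands as the walkthrough above, stopping before the tag step if either static-deltas call fails.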

With Waves

With waves, the process is pretty much the same. I'd actually use the exact same process first, to see some of my formal test devices apply the static delta correctly. Then it's just a matter of rolling out the OTA using the waves commands.

Conclusion

OSTree static deltas are a great way to ensure large OTAs are performed efficiently. And like most things in our Factory, when you think of device management in terms of Targets, things integrate in a pretty sensible way.
