We've been using a tool called "fiotest" for months now, but haven't really explained the what, why, and how. This is the first part of a small series of articles to explain fiotest.
Before going further into the "what", I think people who know us are curious about the "why". Specifically:
- Why would Andy Doan and Tyler Baker, two people that helped create LAVA, create yet another testing tool?
- Why would Milo Casagrande and Tyler Baker, two people that helped create KernelCI, create yet another testing tool?
LAVA is amazing, but one thing I personally lost sight of while I led the team was its strength: testing unreliable development boards booting unreliable kernels. LAVA can gracefully test things that may not boot and still give pretty reliable results. It's kind of amazing.
KernelCI has a different focus: boot testing kernels. However, it did teach us a great lesson we've used for fiotest: you need a well-defined API for reporting test results. If people can make peace with the API, then they are free to report results with any tooling they choose.
We created fiotest with a few important architectural things in mind:
- Centered around Targets. Operators, release managers, and others need to be able to look at a Target and understand if it's safe to roll out to production. We tether test results to a Target so that it's easy to run fioctl targets tests <target number>.
- Cover the hard stuff. Clean APIs are often clean because they don't deal with the hard stuff. For instance, tests need to be able to trigger reboots.
- Container friendly. Like everything else, we think most application-level things on a device can be done from a container. Fiotest is no different.
- Host friendly. While containers can do almost anything, you still need to run some commands directly on the host.
- Simple conventions. For example, having a test upload artifacts is simple. Files placed into $ARTIFACTS_DIR will be uploaded through the API as test artifacts.
- Extensibility. The default fiotest container is pretty handy, but by choosing Ubuntu as the base image, it's easy for you to build your own container with the packages required for your tests. The implementation is done in Python so that it's easy to hack on.
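As a small illustration of the artifact convention, here is a minimal Python sketch of a test helper that drops a file into $ARTIFACTS_DIR. The variable name comes from the convention described above; the helper name save_artifact and the fallback to a temporary directory (so the script also runs off-device) are assumptions made for this example, not part of fiotest itself.

```python
import os
import tempfile
from pathlib import Path


def save_artifact(name: str, content: str) -> Path:
    """Write a file into $ARTIFACTS_DIR so it can be collected as an artifact."""
    # ARTIFACTS_DIR is the variable named in the article. Falling back to a
    # temporary directory is an assumption so the sketch runs anywhere.
    artifacts_dir = Path(os.environ.get("ARTIFACTS_DIR") or tempfile.mkdtemp())
    artifacts_dir.mkdir(parents=True, exist_ok=True)
    path = artifacts_dir / name
    path.write_text(content)
    return path


if __name__ == "__main__":
    # Hypothetical usage: save some output produced by a test run.
    print(save_artifact("console.log", "example test output\n"))
```

A test only has to write files; the upload itself is left to fiotest, which is what keeps the convention simple.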
Our update agent, aktualizr-lite, has a callback mechanism. The fiotest application configures aktualizr-lite so that fiotest receives its callbacks. The application waits for the install-post message from aktualizr-lite, which signals that a new Target is ready to be tested. At this point, fiotest executes the tests defined in a test-spec and reports the results via the device-gateway API. A simple test-spec could be:
    sequence:
      # Run the LTP test suite once:
      - tests:
          - name: ltp
            command:
              - /usr/share/fio-tests/ltp.sh
      # Now reboot the system. After reboot, fiotest will start up and
      # continue with the next test in the sequence
      - reboot:
          command:
            - /bin/true
      # Run a test repeatedly:
      - tests:
          - name: sleep
            command:
              - /bin/sleep
              - 10
        repeat:
          total: 4  # run 4 times. If not specified, repeat forever
          delay_seconds: 3  # sleep 3 seconds between loops, default 3600 (1 hour)
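To make the ordering in that spec concrete, here is a minimal Python sketch that expands it into the list of steps a runner would take once the install-post message arrives. This is not fiotest's implementation: the names SPEC, plan, and on_callback are hypothetical, the spec is written as a dict to avoid needing a YAML parser, and the sketch assumes the delay applies only between iterations of a repeated step.

```python
from typing import Iterator, List

# The test-spec above as a Python dict, so the sketch needs no YAML parser.
SPEC = {
    "sequence": [
        {"tests": [{"name": "ltp", "command": ["/usr/share/fio-tests/ltp.sh"]}]},
        {"reboot": {"command": ["/bin/true"]}},
        {
            "tests": [{"name": "sleep", "command": ["/bin/sleep", 10]}],
            "repeat": {"total": 4, "delay_seconds": 3},
        },
    ]
}

DEFAULT_DELAY_SECONDS = 3600  # default named in the spec comments


def _cmd(command: list) -> str:
    """Render a command list as a single string."""
    return " ".join(str(arg) for arg in command)


def plan(spec: dict) -> Iterator[str]:
    """Yield the steps in order, without executing anything."""
    for step in spec["sequence"]:
        if "reboot" in step:
            yield "reboot: " + _cmd(step["reboot"]["command"])
            continue
        repeat = step.get("repeat")
        if repeat is None:
            total, delay = 1, 0  # no repeat block: run the tests once
        elif "total" not in repeat:
            # Per the spec comments this means "repeat forever";
            # the sketch refuses to expand an endless plan.
            raise ValueError("repeat without total runs forever")
        else:
            total = repeat["total"]
            delay = repeat.get("delay_seconds", DEFAULT_DELAY_SECONDS)
        for i in range(total):
            for test in step["tests"]:
                yield f"test {test['name']}: {_cmd(test['command'])}"
            if i < total - 1:
                yield f"wait {delay}s"


def on_callback(message: str, spec: dict) -> List[str]:
    """Only the install-post message starts a test run."""
    return list(plan(spec)) if message == "install-post" else []


if __name__ == "__main__":
    for line in on_callback("install-post", SPEC):
        print(line)
```

A real runner also has to remember its position in the sequence across the reboot step, which this sketch deliberately leaves out.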
Once everything is hooked up, results can be viewed with fioctl.
See all Targets that have been tested:
    $ fioctl targets tests
    Tested Targets:
    171
    170
    169
    168
View the testing done on a specific Target:
    $ fioctl targets tests 171
    NAME    STATUS  ID                                    CREATED AT                     DEVICE
    ----    ------  --                                    ----------                     ------
    system  PASSED  4097531b-76fa-47ba-a2f6-136943d9c0cb  2020-10-30 15:06:40 +0000 UTC  doanac-imx6-postmerge
    ltp     PASSED  b564d135-6904-44c8-a749-80b2f20a5c6d  2020-10-30 15:06:40 +0000 UTC  doanac-imx6-postmerge
    reboot  PASSED  47a8680e-a402-443a-bd40-6f66c1317b14  2020-10-30 15:08:48 +0000 UTC  doanac-imx6-postmerge
    system  PASSED  6c49af4a-2891-463f-8a52-d7efc1d139e2  2020-10-30 18:01:36 +0000 UTC  doanac-rpi3-postmerge
    ltp     PASSED  7fc0d4a9-a127-4629-957f-b2b00e7eb3c5  2020-10-30 18:01:36 +0000 UTC  doanac-rpi3-postmerge
    reboot  PASSED  78cffebb-a8dc-42a1-92a4-b285ef3f096f  2020-10-30 18:01:36 +0000 UTC  doanac-rpi3-postmerge
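Since the point of tethering results to a Target is to decide whether it's safe to roll out, a listing like the one above can be turned into a simple pass/fail gate. The following is a rough Python sketch that parses that text and checks that every test passed. It assumes the column layout shown here and test names without spaces; parse_rows and all_passed are hypothetical helpers, not part of fioctl.

```python
import re
from typing import Dict, List

# A few rows copied from the listing above.
SAMPLE = """\
NAME    STATUS  ID                                    CREATED AT                     DEVICE
----    ------  --                                    ----------                     ------
system  PASSED  4097531b-76fa-47ba-a2f6-136943d9c0cb  2020-10-30 15:06:40 +0000 UTC  doanac-imx6-postmerge
ltp     PASSED  b564d135-6904-44c8-a749-80b2f20a5c6d  2020-10-30 15:06:40 +0000 UTC  doanac-imx6-postmerge
reboot  PASSED  47a8680e-a402-443a-bd40-6f66c1317b14  2020-10-30 15:08:48 +0000 UTC  doanac-imx6-postmerge
"""

# name, status, 36-character test ID, creation time, device.
ROW = re.compile(r"(\S+)\s+(\S+)\s+([0-9a-f-]{36})\s+(.+?)\s+(\S+)$")


def parse_rows(text: str) -> List[Dict[str, str]]:
    """Extract one dict per result row; header and prompt lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            name, status, test_id, created, device = match.groups()
            rows.append({"name": name, "status": status, "id": test_id,
                         "created": created, "device": device})
    return rows


def all_passed(rows: List[Dict[str, str]]) -> bool:
    """True only if there is at least one result and none is non-PASSED."""
    return bool(rows) and all(row["status"] == "PASSED" for row in rows)


if __name__ == "__main__":
    print("safe to roll out" if all_passed(parse_rows(SAMPLE)) else "hold")
```

Parsing console text is fragile, so anything beyond a quick check is better built on the API directly.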
View the details of a specific test:
    $ fioctl targets tests 171 b564d135-6904-44c8-a749-80b2f20a5c6d
    Name:      ltp
    Status:    PASSED
    Created:   2020-10-30 15:06:40 +0000 UTC
    Completed:
    Device:    doanac-imx6-postmerge
    Artifacts:
        console.log

    TEST RESULT  STATUS
    -----------  ------
    madvise01    PASSED
    madvise02    PASSED
    madvise05    PASSED
    madvise06    SKIPPED
    madvise07    SKIPPED
    madvise08    PASSED
    madvise09    SKIPPED
    madvise10    PASSED
View a test artifact:
    $ fioctl targets tests 171 b564d135-6904-44c8-a749-80b2f20a5c6d console.log
    Checking for required user/group ids
    'nobody' user id and group found.
    'bin' user id and group found.
    'daemon' user id and group found.
    Users group found.
    Sys group found.
    Required users/groups exist.
    no big block device was specified on commandline.
    Tests which require a big block device are disabled.
    You can specify it with option -z
    INFO: Test start time: Fri Oct 30 15:07:55 UTC 2020
    COMMAND: /opt/ltp/bin/ltp-pan -q -e -S -a 18 -n 18 -p -f /tmp/ltp-YCxLqXaKRr/alltests -l /tmp/LTP_1604070445_.log -C /opt/ltp/output/LTP_RUN_ON-LTP_1604070445_.log.failed -T /opt/ltp/output/LTP_RUN_ON-LTP_1604070445_.log.tconf
    INFO: Restricted to madvise
    LOG File: /tmp/LTP_1604070445_.log
    FAILED COMMAND File: /opt/ltp/output/LTP_RUN_ON-LTP_1604070445_.log.failed
    TCONF COMMAND File: /opt/ltp/output/LTP_RUN_ON-LTP_1604070445_.log.tconf
    Running tests.......
    tst_test.c:1244: INFO: Timeout per run is 0h 05m 00s
    madvise01.c:112: PASS: madvise test for MADV_NORMAL PASSED
    madvise01.c:112: PASS: madvise test for MADV_RANDOM PASSED
    madvise01.c:112: PASS: madvise test for MADV_SEQUENTIAL PASSED
    madvise01.c:112: PASS: madvise test for MADV_WILLNEED PASSED
    madvise01.c:112: PASS: madvise test for MADV_DONTNEED PASSED
    madvise01.c:112: PASS: madvise test for MADV_REMOVE PASSED
    madvise01.c:112: PASS: madvise test for MADV_DONTFORK PASSED
    madvise01.c:112: PASS: madvise test for MADV_DOFORK PASSED
    madvise01.c:104: CONF: MADV_HWPOISON is not supported
    madvise01.c:104: CONF: MADV_MERGEABLE is not supported
    madvise01.c:104: CONF: MADV_UNMERGEABLE is not supported
    madvise01.c:104: CONF: MADV_HUGEPAGE is not supported
    madvise01.c:104: CONF: MADV_NOHUGEPAGE is not supported
    madvise01.c:112: PASS: madvise test for MADV_DONTDUMP PASSED
    madvise01.c:112: PASS: madvise test for MADV_DODUMP PASSED
    madvise01.c:112: PASS: madvise test for MADV_FREE PASSED
    madvise01.c:112: PASS: madvise test for MADV_WIPEONFORK PASSED
    madvise01.c:112: PASS: madvise test for MADV_KEEPONFORK PASSED

    Summary:
    passed   13
    failed   0
    skipped  5
    warnings 0
    ...
Why no test visualization?
Every organization seems to have its own approach and preferences for this. For the time being, we've focused on a simple API that makes it easy to extract results.
Can this hook into code review (e.g., GitHub pull requests)?
Not quite yet. I'm starting to think about this topic as it would be really useful for Foundries.io. The great thing about being Target focused is that we know the exact change that produced the Target. I'm hopeful we can work backward from that and create check runs that would show up in a code review.
Can this run in my private test lab?
That's the exact reason we made this. You don't need us. You just need the API.
I already run my own tests.
You may have no need for this feature. However, you do have the option to feed your results into our API if you'd like to integrate them with your factory.
Hopefully this has provided some context about fiotest. In the next article, we'll go over integrating fiotest into your factory and include some examples of how to do testing.