Automation is paramount to our success. To meet our ambitious goals, we built JobServ, a horizontally scalable, open-source CI system.
Somewhere in the last 10 years, CI went from “nice-to-have” to “business-critical”. CI servers can’t suffer downtime, and CI workers must be fault tolerant so that users don’t spend their day clicking “re-build”.
At a high level, JobServ can be thought of as something similar to Jenkins. However, several design decisions were made so that the service can be highly available and horizontally scalable. The public-facing APIs are based on stateless HTTP calls, which means:
Multiple instances of the API can run behind a load balancer and receive rolling updates without existing CI builds noticing.
Workers have a simple way to connect and can retry status updates if something unexpected happens.
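
To make the retry idea concrete, here is a minimal sketch of how a worker might resend a status update when a stateless HTTP call fails. It is not JobServ's actual worker code: the function name `send_with_retry`, the `send` callable, the payload shape, and the backoff values are all assumptions made for illustration. Because each call carries everything the server needs, a retry can land on any API instance behind the load balancer.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[dict], int],
    update: dict,
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send(update) until it returns a 2xx status or attempts run out.

    `send` is a hypothetical callable that performs one HTTP request and
    returns the status code, raising OSError on a connection failure.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            status = send(update)
            if 200 <= status < 300:
                return status
            if status < 500:
                # Any other status below 500 will not improve on retry.
                raise ValueError(f"update rejected with status {status}")
            # 5xx: the instance may be restarting during a rolling update.
            last_error = RuntimeError(f"server returned {status}")
        except OSError as exc:
            # Connection refused, reset, or timed out: worth retrying.
            last_error = exc
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

In a real worker, `send` would wrap whatever HTTP client is in use. Passing `sleep` as a parameter is only a convenience so the backoff can be exercised in tests without waiting.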
We also built in a “simulator” so that users can re-create failed builds from their desk, or develop new changes, without having to repeatedly push “oops - Fix typo” commits to the production CI repository.