How I helped my company ship features 10 times faster, and made dev and ops win

Using one product, deployed correctly, you can improve your company’s work methods, speed up software delivery, reduce errors, avoid maintenance and create a self-healing CI system that is scalable, agile and customizable.

What

After a few years of working with Jenkins for enterprises and Travis CI for open source and side projects, I began looking for a self-hosted solution: one that would provide the speed and agility of container-native servers, but would also be open source, with a live community and an easy way to contribute features and create plugins.

Drone combines it all and a bit more, although the first official major version hasn’t been released yet (0.8 is the latest to date). The Golang-based server has been alive and flourishing for over 2 years. New features and fixes are released regularly, and a live community responds to any query, question or call for help.

Leveraging Go’s lightweight concurrency primitives (goroutines), Drone runs build pipelines, stages and jobs in parallel to improve build times. The system provides an easy-to-set-up server and agent(s), out-of-the-box integration with VCSs like GitHub and GitLab, and a very quick deployment process overall.

Drone UI, giving a simple yet powerful overview on build history, steps separation with logs timing and a clear understanding of the CI status.

Why

*Configuration as code.* Much like other similar CI servers, Drone operates based on a list of stages provided in a YAML template. Stages, plugins, commands and additional services are all described in the .drone.yml, which enforces an infrastructure-as-code method of work. Used well, these templates ensure that your application configuration data, environment settings and deployment architecture are all documented and never lost. Say goodbye to custom pipelines with arbitrary code that isn’t getting backed up or documented. Changes are structured, templated and agreed upon as part of the code contribution and review process.

“Pipelines are executed inside containers and isolated from the host machine. Images are downloaded automatically; no manual installs or upgrades are necessary.”

*Plugins.* Being part of an open source community has its advantages: Drone provides a plugin store, developed by and for the community. You can find most of the common integrations: AWS, GCP, Bluemix, GitHub, GitLab, NPM, Slack, Chef, Gitter, Telegram, Terraform, Docker and many more… even Jenkins! In case you can’t find what you’re looking for or would like to customize yourself, this next section is for you →

*Develop your own.* Drone, as mentioned, is a Docker-based system; every stage is a container with input parameters and optional output. As such, any container can be run to perform any task. Turning a recurring process into a plugin, and even publishing it to the community as an open source tool, has never been easier. The documentation provides examples in Go and Bash, but Python or any other language is just as easy. Try to always create a plugin instead of using plain commands. You’ll be practicing code and preventing future errors and waste, but more (or most) importantly, you’ll be contributing to this rapidly growing community.
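To give a feel for how small a plugin can be, here is a minimal Python sketch. It assumes that Drone hands each plugin setting from the .drone.yml to the container as an environment variable prefixed with PLUGIN_, as the plugin documentation describes. The setting names `message` and `channel`, and the function names `read_settings` and `run`, are hypothetical and exist only for this illustration.

```python
import sys


def read_settings(environ, prefix="PLUGIN_"):
    """Collect plugin settings from environment variables.

    Assumption: every setting arrives as PLUGIN_<NAME>; other
    variables (PATH, HOME, ...) are ignored.
    """
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


def run(environ):
    """Hypothetical plugin body: validate the input, then do the work.

    Returns the exit code the container should finish with.
    """
    settings = read_settings(environ)
    if "message" not in settings:
        # A non-zero exit code marks the pipeline step as failed.
        print("missing required setting: message", file=sys.stderr)
        return 1
    channel = settings.get("channel", "general")
    # A real plugin would call an external service here.
    print(f"[{channel}] {settings['message']}")
    return 0
```

The container’s entrypoint would then be a one-liner along the lines of `sys.exit(run(os.environ))`. Keeping the logic in a function that takes the environment as an argument makes the plugin easy to test without Docker.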

A small part of the plugins marketplace.

When (or when not)

Drone integrates with most familiar Git systems. Still, check that your tech stack fits the listed backends, plugins, dev processes and methods.

For example, if your application deployment requires a highly customized, tailor-made solution built from different sets of scripts, you may want to rethink the migration. Nine times out of ten, this means the overly complex process can be split into a logical group of plugins and steps, and probably simplified and narrowed down. The tenth time, when the process cannot, or more accurately should not, change, it can be turned into a plugin. But do you have the capacity to make the transition and move the process into a new system? Are you willing to make the change?

How

Drone uses gRPC for communication between the server and its agents, which requires HTTP/2 or TCP-level communication. I’ve created my own architecture for such a deployment, and it works great: communication is seamless, and the system is scalable and deployable whenever I want. Build, user and configuration data is stored in an external database of your choice, making the server practically stateless. This contributes to the high availability of the setup, making sure that any crash is recoverable.

Although I describe a specific architecture here, the same can be done on top of almost any platform, as long as it meets the basic communication requirements. Mine is only one option of many.

This is how it goes: the Drone deployment code is a Git repo containing the deployment structure templates and a .drone.yml file. **Yes, Drone deploys itself.** It’s built so that it can be updated using CI just like any other application. In case of a failure, it can fall back to previous server or agent versions.

The server is deployed as an Elastic Beanstalk application. I decided to go with the full application deployment since it provides an out-of-the-box Elastic Load Balancer and runs an EC2 instance in an Auto Scaling group. With such a deployment, my server is accessible over TCP via the load balancer and recoverable thanks to the Auto Scaling group. Whether it’s a new version deployment or a required system recovery, the EB application knows how to heal itself.

The agents are the easiest part. Being the “active” side of the server-agent relationship, the agents are the ones that contact the server to let it know they’re alive. This means they can be deployed pretty much anywhere. In my case that’s an ECS service, but the same can be done with any other orchestrator or platform.

The ECS cluster is built with an Application Load Balancer, which fully supports neither HTTP/2 nor TCP mode. For gRPC traffic, use either the new Network Load Balancer or the good old ELB.

Having a container orchestrator host my agents lets me enjoy the benefits of quick scaling, recovery and deployments. The agents are more “disposable”, hence scalable based on any metric or action you require. One such option is a Lambda function that samples the number of pending jobs using the API and scales the agents accordingly.
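The scaling decision itself can be tiny. Below is a minimal Python sketch of the logic such a Lambda function might run; it is not the exact function I use. The two callables are hypothetical placeholders: `fetch_pending_jobs` stands for whatever call reads the queue depth from the Drone API, and `set_agent_count` stands for whatever updates the desired task count of the ECS service. The default of two jobs per agent and the minimum and maximum limits are assumptions you would tune for your own setup.

```python
import math


def desired_agent_count(pending_jobs, jobs_per_agent=2, min_agents=1, max_agents=10):
    """Return how many agents should run for the given queue depth.

    All defaults are illustrative assumptions, not Drone settings.
    """
    if pending_jobs < 0:
        raise ValueError("pending_jobs must be non-negative")
    # Enough agents to drain the queue, rounded up...
    needed = math.ceil(pending_jobs / jobs_per_agent)
    # ...but never outside the allowed range.
    return max(min_agents, min(max_agents, needed))


def scale(fetch_pending_jobs, set_agent_count, **limits):
    """Sample the queue once and apply the resulting agent count.

    fetch_pending_jobs: hypothetical callable returning an int.
    set_agent_count: hypothetical callable taking the target count.
    """
    pending = fetch_pending_jobs()
    target = desired_agent_count(pending, **limits)
    set_agent_count(target)
    return target
```

A Lambda handler triggered on a schedule would simply call `scale` with real implementations of the two callables. Separating the arithmetic from the API calls keeps the decision easy to reason about and to test.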

A simple sketch of the deployment:

Drone deployment using Elastic Beanstalk with ELB, and an ECS cluster with ALB.

Example pipeline deploying Drone

An example gist of .drone.yml, showing the use of Drone plugins and a YAML pipeline to deploy a new version of Drone itself into the system. The plugins used below are the public Elastic Beanstalk plugin and a custom plugin I created to deploy my own ECS services. Being able to create my own custom ECS plugin allowed me to add functionality such as mounting docker.sock to the image, adding different log handlers, injecting secrets, etc.

Personal thoughts

Migrations are rarely done in a day. A migration is usually a process, and as such it requires organizational and team changes to methods, concepts and solutions. What, up until today, involved customized, undocumented jobs and processes can now be templated and structured into code and generic flows.

Keeping it simple and generic is the key to zero maintenance: no more reverse engineering, or trying to hack out how the hell things actually work. It also helps to avoid the common scenario of:

“Hmm.. I wonder what that piece of bash does, it looks completely unnecessary! Who in the world wrote this terrible thing he calls code?? I’m removing it. Someone has to clean things up.”

  • “Shit.. production is down!”

Change is always hard and often rejected, but keeping the goal visible helps with making it through. Allowing such a change in your systems brings a faster work pace, developer happiness and a high-throughput system, but most of all, it brings peace and calm. You will notice a dramatic decrease in firefighting, and a whole lot more planned, fluent work.

Moreover, it backs everyone up. When your CI system is written down as code, deployable instantly, self-healing and ready for disaster recovery, it becomes transparent before you know it. No more organizational human focal points. To put it in Steve Jobs’ words: “it just works”. Can you describe your current system in a similar manner?

A CI server controls your delivery and development tempo. When it works, people can do their jobs; when it works fast, people enjoy their jobs; but when it fails, frustration is the least of your problems: engineers’ time is wasted and you are practically losing money. Remove the obstacles, don’t let the CI become a bottleneck, and gain developers’ trust and happiness by making them confident in their deployment pipeline to production: one that tests, builds and ships their code fluently and flawlessly.

Don’t wait for the shit to hit the fan, be proactive; kill waste; focus on innovation.

Thank you for reading! I hope you’ve found this post as helpful as I’ve found Drone. Since I first heard of it, I’ve done several exceptional migrations and the feedback is just awesome. It brings transparency and confidence to developers and, after a while, to the management tier as well. No more manually logging in, searching for a job, clicking it, waiting for the build, manually updating the process and so on. As I see it (and so, I’ve learned, do most organizations), developers should be doing one thing, and one thing only: pushing code.

Drone is one of the many solutions we at ProdOps provide directly on your cloud platform account. Feel free to reach out for further details if you’re interested or have any questions.

These days I’m working on another project that will expose CI metrics as visual charts based on the Drone API. So keep in touch for updates, and clap if you liked it 😎