Today's business environment is fast-paced and overwhelmingly digital, making it more critical than ever for organizations to frequently and successfully deploy innovative software. However, as organizations navigate the intricacies of modern software development, they often face code-breaking bugs and errors that only arise post-deployment. The costliness of these deployment errors makes it paramount for developers to retain access to rollback capabilities.
Rollbacks are a critical safety net in software deployment because they allow developers to revert code to its previous state if an issue arises. Rollbacks provide an immediate solution to user-facing functionality errors and minimize the impact of costly disruptions. Consider the following:
■ According to Information Technology Intelligence Consulting (ITIC), an hour of downtime costs enterprises anywhere from $1 million to $5 million.
■ Meanwhile, small and medium businesses (SMBs) lose around $10,000 for every hour of downtime, more than enough to shutter certain SMBs.
Given these figures, it's essential for software organizations of all sizes to have an effective rollback strategy in place.
Yet frequent rollbacks also delay the implementation of new features and waste developer time, causing frustration. Thus, while DevOps teams should prioritize easy rollback capabilities, they should ultimately work toward a future with more sustainable deployment practices and tools.
One such practice DevOps teams can prioritize today is continuous deployment (CD). CD significantly reduces the need for rollbacks by testing and validating code pre-deployment while also enabling teams to seamlessly revert to older working code versions when necessary. Let's unpack how CD can reduce the need for rollbacks and, when one is unavoidable, keep the rollback process smooth.
How Continuous Deployment Reduces Your Organization's Reliance on Rollbacks
Although rollbacks are a necessary component of any deployment strategy, they are suboptimal for obvious reasons: they waste developer time and resources and reduce the frequency of essential code updates. According to Armory research, 52% of engineering and operations leaders measure success by deployment frequency. Rollbacks jeopardize this leading indicator of success.
Developers who rely on CD improve productivity and the overall efficiency of their organization's software development lifecycle, eliminating the need for rollbacks in most situations. Leading CD tools offer policy enforcement, permissions and entitlements management, and architectural best practice enforcement. They also leverage progressive rollout strategies such as:
■ Blue/green deployments: This technique allows developers to run two versions simultaneously in the production environment. One operates as the live version and the other as a test version, and developers can change which version receives traffic as necessary. Developers test new features on the test version, and that code only receives production traffic after validation. Because both the new and old versions are running, traffic can be instantly redirected back to the old version if a problem is encountered, rolling back the change. Blue/green deployment simplifies testing and reduces risk by empowering developers to continuously integrate new features and deploy code changes with zero downtime.
■ Canary deployments: To stress-test code changes while decreasing their blast radius, developers can send the new version a small subset of traffic. As with blue/green deployment, canary deployments run old and new versions in parallel. With a canary strategy, instead of redirecting traffic all at once, traffic is slowly shifted to the new version in small increments for beta testing. This provides developers with crucial information about code viability in a live environment. Once developers deem code changes robust, they can deploy them to the next batch of users, eventually transitioning all users to the new application. Leading canary deployment tools also ensure that rollbacks, when necessary, are instantaneous, and how the traffic split is controlled determines whether that is possible. Some canary strategies only upgrade a subset of copies of the application (e.g., servers, VMs, containers or pods) to the new version. If three pods are running the old version and one is running the new version, then the new version receives 25% of traffic. But controlling the traffic split via the number of running copies limits rollback speed, because the upgraded copies must be redeployed with the previous version before the rollback is complete. Leading canary tools can switch to a service mesh approach when the client requires it. This approach typically provides more fine-grained control of the traffic split, while also allowing for instantaneous rollbacks.
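The two strategies above can be sketched in a few lines of Python. This is a simplified model, not the API of any particular CD tool: the names BlueGreenRouter, replica_split and WeightedCanary are hypothetical and exist only for illustration. It shows why a blue/green rollback is just a pointer swap, how a copy-count canary arrives at the 25% figure, and why a weight-based split (as a service mesh might provide) can be reset without redeploying anything.

```python
from dataclasses import dataclass


@dataclass
class BlueGreenRouter:
    """Hypothetical model: tracks which of two running versions gets production traffic."""
    live: str      # version currently serving users
    standby: str   # the other version, which keeps running

    def switch(self) -> None:
        # Promotion and rollback are the same operation: swap the two roles.
        self.live, self.standby = self.standby, self.live


def replica_split(old_copies: int, new_copies: int) -> float:
    """Share of traffic the new version receives when traffic follows copy count."""
    return new_copies / (old_copies + new_copies)


@dataclass
class WeightedCanary:
    """Hypothetical mesh-style split: the weight does not depend on copy count."""
    new_weight: float = 0.0  # fraction of traffic routed to the new version

    def shift(self, step: float) -> None:
        # Move another increment of traffic to the new version, capped at 100%.
        self.new_weight = min(1.0, self.new_weight + step)

    def rollback(self) -> None:
        # The old version's copies were never removed, so this is only a routing change.
        self.new_weight = 0.0


if __name__ == "__main__":
    router = BlueGreenRouter(live="v1", standby="v2")
    router.switch()              # promote v2
    router.switch()              # roll back to v1
    print(router.live)           # v1
    print(replica_split(3, 1))   # 0.25
```

In the copy-count model, undoing the canary means changing how many copies of each version run, whereas WeightedCanary.rollback only resets a number. That difference is the rollback-speed gap the canary bullet describes.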
Progressive deployment strategies such as these are complex, but automation allows even the smallest teams to successfully employ such practices. Many modern CD tools come with off-the-shelf implementations of these strategies, enabling your DevOps team to benefit from them while focusing crucial time on implementing functionalities unique to your application.
The Power of Automated Rollbacks
As discussed, every developer's goal should be rollback reduction. However, there are instances in which a rollback is the only immediate solution to a significant bug or code break.
For example, say a top client relies on an organization's software to prepare their board presentation materials. While preparing for a next-day presentation, this client encounters a system-breaking bug. They submit an urgent support ticket, but the organization's DevOps team cannot address root-cause issues until the next day, at the earliest.
In cases like this, developers must use a reliable rollback tool, ideally one that offers both one-click rollbacks and automated rollbacks. An automated rollback restores a previous version of code without developer involvement once an update meets predetermined failure criteria (for instance, once user traffic has crashed a critical piece of your organization's software). These tools improve software consistency and reliability, reverting users to a previous application version complete with all dependencies and mechanisms required for service delivery.
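As a rough illustration of a threshold-triggered rollback, the sketch below keeps a version history and reverts automatically when an observed error rate crosses a limit. Everything here is an assumption made for the example: the Deployment class, the auto_rollback function and the 5% default threshold are hypothetical, and a real tool would evaluate richer criteria than a single error rate.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Deployment:
    """Hypothetical record of the versions deployed so far, oldest first."""
    history: List[str] = field(default_factory=list)

    def deploy(self, version: str) -> None:
        self.history.append(version)

    @property
    def current(self) -> str:
        return self.history[-1]

    def rollback(self) -> str:
        # Drop the newest version and return to the one before it.
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()
        return self.current


def auto_rollback(deployment: Deployment, error_rate: float,
                  max_error_rate: float = 0.05) -> bool:
    """Roll back without human involvement once the error rate exceeds the limit.

    The 5% default is an illustrative assumption, not a recommended value.
    Returns True if a rollback was performed.
    """
    if error_rate > max_error_rate:
        deployment.rollback()
        return True
    return False


if __name__ == "__main__":
    app = Deployment()
    app.deploy("v1")
    app.deploy("v2")
    auto_rollback(app, error_rate=0.20)  # exceeds the limit, so v2 is reverted
    print(app.current)                   # v1
```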
And quick restoration is key. In the above example, the client's retainer likely represents a significant proportion of the software organization's revenue. When trust is lost in this relationship, it can translate to millions of dollars in losses. A painless, expedient rollback may be the linchpin in maintaining this robust relationship.
Moreover, leading CD tools have other ways to prevent costly, ill-fated deployments, such as the automated rollback of containerized applications. For example, developers can roll back Kubernetes deployments based on their complexity and the number of deployed artifacts. Additionally, developers can use CD tools to guard against unwanted rollbacks through conditional checks. This feature ensures a rollback is strictly necessary by first verifying that the failure was not caused by an unrelated deployment pipeline error (e.g., network interference).
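One way to picture such a conditional check is to separate probes that never reached the application from probes that reached it and got an error back. The sketch below is a hypothetical model, not a feature of any specific CD tool: ProbeResult, rollback_is_warranted and the min_failures default are names and values assumed for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProbeResult:
    """Hypothetical health-probe outcome."""
    reached_service: bool              # False: the probe itself failed (e.g., network)
    status_code: Optional[int] = None  # HTTP status, when the service was reached


def rollback_is_warranted(probes: List[ProbeResult], min_failures: int = 2) -> bool:
    """Return True only if enough probes reached the service and saw server errors.

    Probes that never reached the service are ignored, so a network problem
    alone does not trigger a rollback. min_failures=2 is an assumed default.
    """
    app_failures = sum(
        1
        for p in probes
        if p.reached_service and p.status_code is not None and p.status_code >= 500
    )
    return app_failures >= min_failures


if __name__ == "__main__":
    network_only = [ProbeResult(False), ProbeResult(False), ProbeResult(True, 200)]
    app_broken = [ProbeResult(True, 500), ProbeResult(True, 503), ProbeResult(False)]
    print(rollback_is_warranted(network_only))  # False
    print(rollback_is_warranted(app_broken))    # True
```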
Reaching the Deployment "Sweet Spot"
Ideally, an organization would never rely on rollbacks. But realistically, no organization — nor any one developer — is immune to code-breaking bugs. The "sweet spot" of deployment, then, is a toolkit that prioritizes sustainable development tools like CD while also adopting leading, automation-based rollback capabilities. By focusing on preventative measures and leveraging the proper deployment tools, organizations can enhance their efficiency, reduce downtime and ultimately deliver their end users a much more reliable, robust product.