StackPulse Debuts Automated Kubernetes Troubleshooting and Remediation Tools
April 28, 2021

StackPulse announced a Kubernetes-centric “operations center” initiative as a part of its Reliability platform.

With these additions, StackPulse gives organizations running Kubernetes a powerful set of capabilities to augment their existing incident response practices. Site Reliability Engineers (SREs) can understand and investigate issues faster and deploy well-tested outage mitigation strategies, helping prevent customer-facing downtime.

Since Kubernetes is the de facto standard for running containerized applications, StackPulse wanted to create a set of code-based tools engineers could use to operationalize incident response for production Kubernetes-based applications. When an error is detected in a Kubernetes environment, StackPulse automatically executes diagnostic steps to gather information from the clusters and assists engineers in performing root-cause analysis. This automation helps them quickly identify how to mitigate and resolve an issue.
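
As a rough illustration of what automated diagnostics on an alert can look like, the sketch below maps an alert to a list of read-only kubectl commands. It is not StackPulse's implementation or API: the alert fields (reason, pod, namespace), the DIAGNOSTICS table, and the plan_diagnostics function are all hypothetical names chosen for this example. Only the kubectl subcommands and flags themselves are standard.

```python
from typing import Dict, List

# Hypothetical mapping from an alert reason to read-only kubectl diagnostics.
# The reasons and the choice of steps are assumptions made for illustration.
DIAGNOSTICS: Dict[str, List[List[str]]] = {
    "CrashLoopBackOff": [
        ["kubectl", "describe", "pod", "{pod}", "-n", "{namespace}"],
        ["kubectl", "logs", "{pod}", "-n", "{namespace}", "--previous", "--tail=100"],
    ],
    "Pending": [
        ["kubectl", "describe", "pod", "{pod}", "-n", "{namespace}"],
        ["kubectl", "get", "events", "-n", "{namespace}", "--sort-by=.lastTimestamp"],
    ],
}

# Fallback when the alert reason is not in the table.
DEFAULT_STEPS: List[List[str]] = [
    ["kubectl", "get", "pod", "{pod}", "-n", "{namespace}", "-o", "wide"],
]


def plan_diagnostics(alert: dict) -> List[List[str]]:
    """Return the kubectl commands to run for an alert, without executing them."""
    steps = DIAGNOSTICS.get(alert.get("reason"), DEFAULT_STEPS)
    return [
        [part.format(pod=alert["pod"], namespace=alert["namespace"]) for part in step]
        for step in steps
    ]


if __name__ == "__main__":
    # Example alert with hypothetical pod and namespace names.
    alert = {"reason": "CrashLoopBackOff", "pod": "api-0", "namespace": "prod"}
    for command in plan_diagnostics(alert):
        print(" ".join(command))
```

The function only builds the command lists. A real responder would run each one (for example with subprocess.run) and attach the output to the alert sent to the on-call engineer.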

Additionally, StackPulse has released more than a dozen playbooks built by SRE experts that remediate common Kubernetes problems. Using the StackPulse platform to automate these playbooks significantly reduces the time to resolution, helping teams restore services faster and meet SLOs.

“If you're serious about cloud-native, you're using Kubernetes, but it requires learning new concepts, and tuning applications alongside infrastructure for best performance,” said Leonid Belkind, CTO and Co-Founder of StackPulse. “While developer teams push to adopt K8s due to the benefits in velocity it brings, it can be hard for Ops teams or on-call developers to know how to respond to alerts, or fix issues in production. This leads to costly incidents and outages. What we’re releasing today is a set of automated tools for diagnostics, mitigation, and remediation that help any Kubernetes environment operate with the best practices of planet-scale Kubernetes shops.”

All the Kubernetes tools and automated diagnostics are available to teams in the same platform as StackPulse's incident response functionality so teams can communicate during outages, centralize event data, and take action to remediate. From detecting issues by correlating signals from multiple sources to enriching alerts sent to on-call teams with root cause and remediation information, StackPulse drastically decreases the customer impact of production issues, helping stop outages in their tracks.

Industry News

May 06, 2021

Splunk announced the new Splunk Observability Cloud, the full-stack, analytics-powered and enterprise-grade Observability solution.

May 06, 2021

Gluware unveiled its DevOps for NetOps framework featuring Gluware Lab, its integrated development environment (IDE).

May 06, 2021

Ambassador Labs announced the new Ambassador Developer Control Plane (DCP), which gives developers the ability to manage the entire modern software development lifecycle for Kubernetes environments using tools and processes that are familiar to them.

May 06, 2021

Code Dx and Secure Code Warrior have teamed up to launch Project Better Code, an initiative to tackle a major challenge facing innovative organizations today – pushing the pace of software development without compromising software security.

May 06, 2021

Pegasystems announced Pega Infinity version 8.6, the latest evolution of its Pega Infinity software suite, to help speed and simplify digital transformation (DT) initiatives.

May 06, 2021

Accurics announced that its open source project Terrascan, which enables teams to detect compliance and security violations across Infrastructure as Code (IaC), now integrates with the Argo Project.

May 05, 2021

Amazon Web Services announced the general availability of Amazon DevOps Guru, a fully managed operations service that uses machine learning to make it easier for developers to improve application availability by automatically detecting operational issues and recommending specific actions for remediation.

May 05, 2021

SmartBear has added API testing support for the popular, open source event streaming platform, Apache Kafka.

May 05, 2021

Red Hat unveiled its Developer Sandbox for Red Hat OpenShift, an OpenShift-based development environment designed to enable organizations to accelerate the path from code to production for Kubernetes-based applications.

May 05, 2021

DevOps Institute announced the lineup for SKILup Days in the second quarter of 2021.

May 05, 2021

Idera announced the acquisition of Xblend Software.

May 04, 2021

ThoughtSpot announced the launch of ThoughtSpot Everywhere.

May 04, 2021

Perforce Software announced the availability of virtual devices (Android emulators and iOS simulators) as part of the comprehensive device lab within Perfecto’s Intelligent Test Automation platform.

May 04, 2021

LogiGear announced the newest release of its flagship TestArchitect™ Enterprise product, TestArchitect Enterprise 9.0.

May 04, 2021

Rafay Systems announced new enhancements to its flagship Kubernetes Management Cloud (KMC).