Lightstep Incident Response: Helping teams reduce downtime

RJ Jainendra
March 14, 2022

Incident response: reducing downtime for SREs

Downtime—especially in customer-facing services—can cost businesses thousands of dollars an hour and incalculable customer trust. No company can afford to pay this price. To reduce downtime, software engineering teams must act quickly and decisively. But that’s easier said than done.

With Lightstep® Incident Response, generally available from ServiceNow today, we're unlocking speed, agility, and productivity for your engineers and your software-powered business.

We hear from customers that site reliability engineers (SREs) and developers can receive up to 30,000 alerts a day, leading to massive alert fatigue and uncertainty about what needs attention. Cloud-native distributed architectures with tens or hundreds of dependent microservices increase complexity, making it difficult to determine the root cause of real issues, let alone resolve them.

While triaging alerts and addressing incidents, SREs must work with multiple tools for observability, collaboration, on-call, and incident management. They must manually build the context of what is happening and involve the right responders to collaborate on analysis and resolution. Every second they spend toggling between applications pushes them further from aggressive mean time to repair (MTTR) goals.

Converting real-time insights into action

More than a third of SRE organizations measure success based on improved revenues, profits, and customer satisfaction, according to a February 2022 SRE survey by IDC [1]. Resilient software and services are a big part of that story.

Without streamlined incident management processes and the required context, it can take hours for your developers and SREs to investigate and resolve unexpected issues. Pay-per-seat pricing for on-call and incident management tools makes it cost-prohibitive for all developers to participate in delivering reliable services.

We’re on a mission to change that with Lightstep Incident Response. Our goal is to provide a cloud-native reliability platform that enables engineering teams to move fast and innovate boldly. Lightstep Incident Response is a crucial next step on the journey that started with Lightstep Observability. By combining real-time observability with incident resolution, Lightstep now gives teams the capabilities to deliver both innovation and reliability.

Lightstep Incident Response homepage

Powered by the Now Platform®, Lightstep Incident Response gives developers and SREs the context and automation to pinpoint root causes and streamline the incident response workflow across observability, on-call and incident management, and remediation. This significantly reduces alert fatigue and eases the pain of sifting through and resolving context-free alerts, resulting in less downtime, happier customers, and increased productivity.

Extending observability across the organization

With an innovative, usage-based pricing model aligned to the number of active services, customers can fully embrace the “you build it, you run it” culture of service ownership. The whole team can participate in on-call activities, collaborate on critical incidents, and learn from blameless postmortems to build more resilient systems without worrying about exorbitant pricing.

The Now Platform has set the gold standard for IT Operations Management and IT Service Management and is used by many IT organizations for enterprise-wide incident management. Lightstep Incident Response integrates with the Now Platform so that distributed DevOps product teams can seamlessly stay connected with platform teams to deliver reliable and resilient services throughout the organization.

Find out more about Lightstep Incident Response.

[1] IDC, US IT Quick Poll Site Reliability Engineering Survey, #US48859422 (February 2022).

© 2022 ServiceNow, Inc. All rights reserved. ServiceNow, the ServiceNow logo, Now, and other ServiceNow marks are trademarks and/or registered trademarks of ServiceNow, Inc. in the United States and/or other countries. Other company names, product names, and logos may be trademarks of the respective companies with which they are associated.

