Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.


Observations from the digital trenches

When AT&T Incident Response Consultants first engage a client during a ransomware incident, the situation is often chaotic. The client's ability to conduct business has stopped, critical services are offline, and its reputation is being damaged. Usually, this is the first time the client has suffered an outage of such magnitude. Employees may wrongly fear that a previous action is a direct cause of the incident and the resulting consequences.


This is How Blameless Integrates with JIRA

Atlassian JIRA, one of the most popular ticketing systems, allows teams to catalogue incidents, follow-up actions, bugs, stories, and more. As a common tool in any DevOps/SRE operation’s toolchain, JIRA is a key integration at Blameless. Blameless’ integration with JIRA allows teams to automatically generate a ticket within both Blameless and JIRA. This integration also allows teams to track follow-up actions via Blameless’ postmortem tool.


Digital Retail Tips: Reduce Downtime on Black Friday (and Cyber Monday)

Black Friday is one of the biggest days of the year for online consumers and retailers alike. This year, the coronavirus (COVID-19) pandemic is reshaping Black Friday shopping, and digital consumers and retailers must plan accordingly. The pandemic will likely cause Black Friday shopping to decline this year. As a result, many digital retailers are launching early Black Friday sales so they can capture consumers’ interest ahead of the competition.


How PagerDuty and Slack Empower the "Work Where You Are" Mindset

Our reliance on digital services continues to be heightened by the ongoing COVID-19 pandemic. For work, school, and play, digital remains the primary channel. This puts huge pressure on ITOps and DevOps teams, making it critical that they can collaborate easily to resolve incidents rapidly. Many modern ITOps and DevOps teams rely on one of PagerDuty’s key integration partners, Slack, to meet this need.


Five worthy reads: Preparing an incident response plan for the pandemic and beyond

Five worthy reads is a regular column on five noteworthy items we’ve discovered while researching trending and timeless topics. With the rising concern over cyberattacks in the distributed workforce, this week we explore the concept of cybersecurity incident response during a pandemic.


Lightstep Adds Complete System Context to PagerDuty Alerts

There is a lot of noise surrounding the term “Observability”. While vendors and pundits debate three pillars, Lightstep has partnered with PagerDuty to ensure software teams can move from context within an incident to quickly understanding and determining the root cause. Together we’re augmenting incident response solutions for pre-production scenarios.


Delivering Always-On Digital Experiences in Retail

How is it already near the end of October? We know our retailer customers have been heads-down thinking about code freezes and hypercare during the high season as we approach the holidays. Disruption and pivoting quickly to meet changing customer expectations are nothing new to the retail industry.


"The clearest and most singular footprint for AIOps": BigPanda named leader once again in EMA's Radar Report on AIOps

Analysts continue to recognize BigPanda’s leading role in the AIOps ecosystem. Earlier this year, the GigaOm AIOps Radar report placed BigPanda in the leader section for the company’s strong market impact as a platform that delivers event correlation at scale.


On Call Schedule

An enterprise can use an on-call schedule that defines who is available to respond to incidents 24/7. Yet, how your enterprise builds and manages its on-call schedule can impact departments and stakeholders across your organization. When it comes to on-call scheduling, your enterprise must plan as much as possible. Fortunately, with the right processes and tools, you can effectively implement and manage an on-call schedule.
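
To make the idea concrete, here is a minimal sketch of how a schedule can answer "who is on call right now?". It assumes a simple round-robin rotation with fixed-length shifts; the responder names, start date, and the `on_call_at` helper are hypothetical and chosen only for illustration. Real scheduling tools also handle overrides, escalations, and time zones.

```python
from datetime import datetime, timedelta, timezone


def on_call_at(responders, rotation_start, shift_length, when):
    """Return the responder on call at `when` in a round-robin rotation.

    Assumes fixed-length shifts that begin at `rotation_start` and cycle
    through `responders` in order (a simplification for illustration).
    """
    if not responders:
        raise ValueError("at least one responder is required")
    if when < rotation_start:
        raise ValueError("`when` is before the rotation starts")
    # Integer division of two timedeltas gives the number of completed shifts.
    shifts_elapsed = (when - rotation_start) // shift_length
    return responders[shifts_elapsed % len(responders)]


# Hypothetical team and rotation: weekly shifts starting on a Monday at 09:00 UTC.
responders = ["alice", "bob", "carol"]
rotation_start = datetime(2020, 11, 2, 9, 0, tzinfo=timezone.utc)
shift_length = timedelta(weeks=1)

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(on_call_at(responders, rotation_start, shift_length, now))
```

Because the rotation is computed from a start time rather than stored shift by shift, coverage never has gaps: every moment after `rotation_start` maps to exactly one responder.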


How to connect ServiceNow and Elasticsearch for bidirectional communication

The Elastic Stack (ELK) has been used for observability and security for many years now, so much so that we now offer the two as out-of-the-box solutions. However, identifying issues and finding the root cause is only part of the process. Often, organizations want to integrate the Elastic Stack into their everyday workflows so they can resolve those issues quickly. This typically involves integrating with some form of ticketing/incident tracking framework.