Blameless

San Mateo, CA, USA
2017
May 3, 2021   |  By Blameless Community
Blameless is excited to announce a new source for monitoring data for your SLIs and SLOs. Prometheus is an open source monitoring and alerting solution which is highly customizable.
May 3, 2021   |  By Blameless Community
Blameless is excited to announce a new source for monitoring data for your SLIs and SLOs. New Relic is an observability platform that helps engineers instrument, analyze, troubleshoot, and optimize their entire software stack.
May 3, 2021   |  By Blameless Community
Blameless is excited to announce a new source for monitoring data for your SLIs and SLOs. Pingdom is a leading monitoring platform that lets users monitor both applications and infrastructure, synthetically and with real user data.
May 3, 2021   |  By Blameless Community
Blameless is excited to announce a new source for monitoring data for your SLIs and SLOs. Datadog is a monitoring and security platform for cloud applications. It brings together end-to-end traces, metrics, and logs to make applications, infrastructure, and third-party services observable.
Apr 26, 2021   |  By Emily Arnott
Wondering what SRE is all about? We will explain what it is, how it works, why it was developed, and how it can help your organization. So what is SRE (Site Reliability Engineering)? SRE is a methodology that fuses software and operations teams, with the goal of producing reliable, resilient, and scalable systems. Site Reliability Engineering (SRE) was developed by Google engineer Ben Treynor Sloss in 2003. Google’s goal was to increase the reliability of its sites and services.
Apr 20, 2021   |  By Blameless Community
Spring is here! We have rain! We have flowers! We have allergies! We also have some of the most exciting Tweets, content, and events happening in the SRE and resilience engineering community this month.
Apr 19, 2021   |  By Blameless Community
Resilience in Action is a podcast about all things resilience, from SRE to software engineering, to how it affects our personal lives, and more. Resilience in Action is hosted by Kurt Andersen. Kurt is a practitioner and an active thought leader in the SRE community. He speaks at major DevOps & SRE conferences and publishes his work through O'Reilly in quintessential SRE books such as Seeking SRE, What is SRE?, and 97 Things Every SRE Should Know.
Apr 13, 2021   |  By Emily Arnott
Data helps best-in-class teams make the right decisions. Analyzing your system’s metrics shows you where to invest time and resources. A common type of metric is Mean Time to X, or MTTx. These metrics detail the average time it takes for something to happen. The “x” can represent events or stages in a system’s incident response process. Yet, MTTx metrics rarely tell the whole story of a system’s reliability.
Apr 12, 2021   |  By Harry Hull
You aren't sure how long you've been here, but the view outside the window sure is soothing. Before you can fully take in your surroundings, a siren rips you back into the conscious world. Slowly, you begin to piece together that you exist, and you are on call. The ringing, much louder now, pierces through your skull as you begin to open your bleary eyes. You turn over your pillow, grab your phone, and click through the PagerDuty notification.
Apr 6, 2021   |  By Blameless Community
Blameless recently had the privilege of hosting SRE leaders Kurt Andersen, SRE Architect at Blameless; Vanessa Yiu, Executive Director, Enterprise Architecture at Goldman Sachs; and Tony Hansmann, former Global CTO at Pivotal Software, Inc.

Blameless offers the only complete reliability engineering platform that brings together AI-driven incident resolution, blameless postmortems, SLOs/Error Budgets, and reliability insights reports and dashboards, enabling businesses to optimize reliability and innovation.

Enabling modern software businesses to adopt SRE best practices:

  • Incident Resolution: Use AI to engage the right people and teams in the right way to stop problems fast, ensure customer satisfaction and prevent incidents from happening again.
  • Blameless Postmortems: Learn without pointing fingers, ensuring continuous improvements. We automatically bring relevant information, proper context and industry best practices to your postmortem process.
  • SLOs/Error Budgets: Create SLOs and see your remaining error budgets with the SLO dashboard. Teams gain insight into which parts of the business are consuming the error budget, allowing them to make informed trade-offs between releasing new features and improving reliability.
  • Reliability Insights: Consume event data from across your entire DevOps stack, query it, and create custom dashboards, so teams can quickly find the signal in their DevOps data noise.

The Complete Site Reliability Engineering (SRE) Platform.