Monday's incident report

February 27, 2017

On Monday morning at 7:40 AM PT, we experienced a global outage across all services. Clearbit was down for approximately 29 minutes. For that, we’re extremely sorry. We understand that Clearbit plays a large role in many sales, marketing, and product processes, and we deeply apologize for any inconvenience caused.

Issue Summary

From 7:41 AM to 8:10 AM PT, requests to Clearbit APIs returned unauthorized error responses. The root cause was a misconfigured developer machine whose automated tests ran directly against the production database. Because the automated tests rely on an empty database, the test setup deleted the account records, which temporarily removed access to Clearbit’s accounts.

Timeline (all times Pacific Time)

  • 7:40 AM: Test run finished
  • 7:41 AM: Pagers alerted team
  • 7:45 AM: Complete outage after our caching system was invalidated
  • 7:48 AM: Accounts database restore started from a snapshot
  • 7:55 AM: Root cause discovered
  • 8:08 AM: Account backup restored
  • 8:10 AM: 100% of traffic back online

Root Cause

At 7:40 AM PT, one of our developers started a test run on a machine whose development environment was inadvertently configured to use our production cluster. While preparing a test run, our code deletes all rows from the authentication database.

Resolution and Recovery

At 7:41 AM PT, the monitoring systems alerted our engineers, who investigated and quickly escalated the issue.

By 7:48 AM, the team had started restoring accounts from a backup, but the root cause was still not clear.

At 7:55 AM, the developer who caused the issue found the root cause after noticing that the timeline of events matched the test run.

The account restore completed at 8:08 AM, and 100% of traffic was back online by 8:10 AM.

Corrective and Preventative Measures

The following are actions we are taking to address the underlying causes of the issue and to help prevent them from happening again:

  1. Improve checks to the testing environment to prevent usage of production settings.
  2. Change database recovery process to be more time efficient.
  3. Increase backups to an hourly cadence.
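
As an illustration of the first measure, here is a minimal sketch in Python of a guard that runs before any test setup clears a database. It is not Clearbit's actual code: the function names, the environment label "test", and the allowlist of local hosts are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts a test run is permitted to touch.
SAFE_TEST_HOSTS = {"localhost", "127.0.0.1", "::1"}


class UnsafeTestDatabaseError(RuntimeError):
    """Raised when a test run is pointed at a database that is not clearly local."""


def assert_safe_test_database(database_url: str, environment: str) -> None:
    """Raise unless both the environment name and the database host look like a test setup."""
    if environment != "test":
        raise UnsafeTestDatabaseError(
            f"environment is {environment!r}, expected 'test'"
        )
    host = urlparse(database_url).hostname
    if host not in SAFE_TEST_HOSTS:
        raise UnsafeTestDatabaseError(
            f"refusing to run tests against database host {host!r}"
        )


def prepare_test_database(database_url: str, environment: str, truncate) -> None:
    """Run the destructive `truncate` callback only after the safety check passes."""
    assert_safe_test_database(database_url, environment)
    truncate()
```

The check is deliberately an allowlist rather than a blocklist: a new or renamed production host is rejected by default, and the destructive step is never reached unless the guard passes.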

We appreciate your patience and again apologize for any inconvenience. We thank you for your business and continued support.
