About Gremlin

Gremlin is building resilient systems through chaos engineering, a new engineering philosophy that safely injects failure into systems to proactively identify and fix unknown faults. Gremlin aims to make the internet more reliable and prevent costly outages by empowering engineers to safely experiment on complex systems in order to test assumptions and build more resilient software.

How it Works

Through a web interface or an API, engineering teams can inject failure into their systems to find faults and fix them before they affect the business.

Key benefits:

  • Test assumptions by showing how the system will behave in the face of failure
  • Validate that your system behaves the way it is designed to under duress
  • Minimize blast radius - contain the impact of each experiment so teams can safely experiment in production
  • Save time and resources - without Gremlin, a team of 5-10 engineers would need two years to build robust and fully secure chaos engineering tools, or would have to piece together a solution from open source software that requires engineers to provide their own security, safety, ops, and support


The adoption of cloud computing and the trend toward microservices have created infrastructure that continues to mature and reveal ways to develop, deploy, and operate applications that were never before possible. This has created a complexity gap: systems are now too complex for any engineer, or team of engineers, to predict how and when they could fail.

Because increasingly complex systems create more opportunities for unforeseen failures, engineering teams are constantly paged at odd hours to triage avoidable outages, leaving engineers on the brink of burnout. On top of that, every outage costs the business significant money and frustrates large numbers of customers.

The numbers tell us the depth of this issue:

  • 40% of IT managers report burnout as their biggest concern for teams
  • Half of all software engineers in the Bay Area turn over within 2 years - compared to other professions, where workers stay an average of 4.2 years
  • On average, 60 minutes of downtime costs businesses more than $300,000

With Gremlin, any company operating on the internet can make its systems fundamentally more reliable, reduce outages, and avoid potentially millions of dollars in losses as well as customer dissatisfaction.


Leadership

  • Kolton Andrus, CEO
  • Matthew Fornaciari, CTO


Investors

  • Index Ventures
  • Amplify Partners


Customers

  • Expedia, Twilio, Confluent, and Remind

Company size: 11-50 employees

