Panopta Incident Hub

The Panopta Incident Hub is an expansive set of new product features designed to give teams complete visibility into incidents across complex infrastructure setups and to resolve customer pain points. From smaller features like automation runbooks, performance charts and filters, and incident summaries to much larger undertakings like the Incident Overview experience, browser synthetic checks, alert timelines, and in-dashboard communication, the Incident Hub provides tools that help teams diagnose and resolve incidents, automate environments and alerts, and collaborate easily across teams.

Objective

Define and leverage a system of incident response tools to help reposition the Panopta product not only as an infrastructure monitoring platform but also as an incident response tool.

Incident Details Screen & Incident Overview Screen

User Interviews & Research

To begin shaping our incident management experience, the product team held multiple rounds of user interviews to fully understand the pain points that Support and Ops teams deal with regularly.

Incident Details Experience

The Incident Details Screen is the center point of the Incident Hub, housing the majority of the features used to diagnose incidents, resolve incidents, collaborate as a team, and automate tasks.

Incident Details Screen

Side Bar Messaging, Active Incidents Pane & Performance Charts

SideBar Messaging – Active Incident Pane – Performance Charts

Incident Overview Experience

One customer pain point that came up in many user interviews was the need to view present and past infrastructure problems to help inform triage and post-mortem operations. To address this need, we created the Incident Overview experience, which allows practitioners to view the entire history of their infrastructure from a single location. Through complex filtering, the Incident Overview experience lets practitioners identify recurring patterns and trends across their infrastructure, drill into historical data to help inform remediation decisions on active incidents, and perform quick-action incident management tasks.

Incident Overview Screen

Browser Synthetic Checks

While building the overall Incident Hub, we learned a lot about incident management and the fine granularity at which practitioners need to monitor their infrastructure. One highly important area where Support Ops Engineers needed detailed visibility was customer-facing interactions. To address this need, we built browser synthetic checks into our incident management toolkit. Through source code testing, step-by-step breakdowns of tasks, waterfall visualization, and detailed screenshots, practitioners can now test code for new deployments before releasing features to customers, as well as quickly identify where problems are occurring in already deployed interactions.

Source Code Testing – Test Output – Test Details Modal 

Alert Timelines

To give IT teams a greater level of granularity in their infrastructure management and incident response, we also refined and rebuilt the experience for our Alert Timelines. To address user needs, we created a highly configurable experience that lets users customize how soon to alert teammates after an outage occurs, which channels to notify teammates on, and even which specific metric thresholds should trigger alerts (e.g., "CPU above 90% for 5 minutes").
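
As a rough illustration of the configuration described above, the sketch below models a metric threshold rule and a set of timed notification steps in Python. It is not Panopta's implementation or API: the names `ThresholdRule`, `TimelineStep`, and `due_channels`, the channel names, and the one-sample-per-minute input are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical rule such as "CPU above 90% for 5 minutes"."""
    metric: str
    threshold: float
    duration_minutes: int

    def __post_init__(self) -> None:
        if self.duration_minutes < 1:
            raise ValueError("duration_minutes must be at least 1")

    def is_breached(self, samples: Sequence[float]) -> bool:
        # Assumption: `samples` holds one reading per minute, oldest first.
        if len(samples) < self.duration_minutes:
            return False
        recent = samples[-self.duration_minutes:]
        # Every reading in the window must exceed the threshold.
        return all(value > self.threshold for value in recent)


@dataclass(frozen=True)
class TimelineStep:
    """Hypothetical timeline entry: notify `channels` after a delay."""
    delay_minutes: int
    channels: Tuple[str, ...]


def due_channels(steps: Sequence[TimelineStep],
                 minutes_since_outage: int) -> List[str]:
    """Return the channels whose delay has elapsed, without duplicates."""
    due: List[str] = []
    for step in sorted(steps, key=lambda s: s.delay_minutes):
        if step.delay_minutes <= minutes_since_outage:
            for channel in step.channels:
                if channel not in due:
                    due.append(channel)
    return due


# Example: the rule quoted in the text, plus a three-step timeline.
cpu_rule = ThresholdRule(metric="cpu_percent", threshold=90.0, duration_minutes=5)
timeline = [
    TimelineStep(delay_minutes=0, channels=("email",)),
    TimelineStep(delay_minutes=10, channels=("sms", "email")),
    TimelineStep(delay_minutes=30, channels=("phone",)),
]
```

With this sketch, `cpu_rule.is_breached(...)` answers whether the threshold condition has held for the full window, and `due_channels(timeline, 12)` lists the channels that should already have been notified 12 minutes into an outage.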

Alert Timeline Table – Alert Timeline Builder