Break-Glass Accounts: Design, Test, Audit — A Working Playbook
Guide April 8, 2026 10 min read



DataDike Security Research

PAM Research & Field Engineering

Every Zero Standing Privileges program ships with a small population of accounts that retain standing privilege, typically two or three per domain, designed to be used only when the normal request workflow has failed. The PAM industry calls these break-glass accounts. They are simultaneously the most important and least exercised control in the entire identity-security stack. The combination is what makes them dangerous: critical when needed, invisible the rest of the time, easy to neglect, and even easier for an attacker who knows where to look to find.

What break-glass actually protects against

A break-glass account is the answer to a specific failure mode: the PAM itself is unavailable (network split, infrastructure outage, ransomware), the identity provider is unavailable (Entra ID degradation, AD outage), or the approval workflow is unreachable (a pager misfire that takes out the on-call). In those moments the operator must regain administrative access without the platform that normally brokers it.

Note what break-glass does not protect against: routine after-hours work, vendor maintenance windows, planned changes. Those have their own approval paths. If your operators use break-glass for ordinary work, you no longer have a Zero Standing Privileges program — you have standing privileges with a more dramatic name.

The design specification

  1. Two to three accounts per identity boundary (per domain, per cloud tenant, per critical platform). One is a single point of failure. More than three adds more permanent attack surface than the extra redundancy justifies.
  2. Each account holds the minimum standing privilege required for the recovery action. A break-glass account that can do "anything" gives an attacker who compromises it the same unlimited reach.
  3. Credentials live in a sealed envelope (literal or digital) outside the PAM. The envelope is opened only by a designated incident commander. Opening it triggers a SIEM alert that cannot be suppressed.
  4. Passwords are randomly generated and at least 24 characters long. A hardware token (FIDO2, such as a YubiKey) is mandatory as the second factor; TOTP and SMS do not qualify. The token lives with the envelope.
  5. Logon to the account is hard-wired to an audit ingestion path that does not depend on the PAM (because the PAM may be what failed). Direct SIEM forwarding, separate syslog channel, or platform-native audit (Azure activity logs, AWS CloudTrail) all qualify.
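
The checkable parts of this specification can be expressed as a small lint over the break-glass inventory. The sketch below is illustrative only: the `BreakGlassAccount` record and its field names are assumptions, not a schema from any particular PAM product, and item 2 (minimum privilege) is left out because it needs a per-platform review rather than a boolean check.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BreakGlassAccount:
    # Hypothetical inventory record; field names are assumptions.
    name: str
    boundary: str          # identity boundary: domain, cloud tenant, or platform
    password_length: int
    second_factor: str     # e.g. "fido2", "totp", "sms"
    stored_in_pam: bool    # True if the credential lives inside the PAM vault
    audit_via_pam: bool    # True if logon auditing depends on the PAM


def check_spec(accounts):
    """Return a list of findings; an empty list means no violations."""
    findings = []
    per_boundary = Counter(a.boundary for a in accounts)
    for boundary, count in sorted(per_boundary.items()):
        if count < 2:
            findings.append(f"{boundary}: {count} account(s), single point of failure")
        elif count > 3:
            findings.append(f"{boundary}: {count} accounts, excess standing privilege")
    for a in accounts:
        if a.password_length < 24:
            findings.append(f"{a.name}: password shorter than 24 characters")
        if a.second_factor != "fido2":
            findings.append(f"{a.name}: second factor is {a.second_factor}, not FIDO2")
        if a.stored_in_pam:
            findings.append(f"{a.name}: credential stored inside the PAM")
        if a.audit_via_pam:
            findings.append(f"{a.name}: audit path depends on the PAM")
    return findings


inventory = [
    BreakGlassAccount("bg-corp-1", "corp.example", 32, "fido2", False, False),
    BreakGlassAccount("bg-corp-2", "corp.example", 20, "totp", True, False),
    BreakGlassAccount("bg-aws-1", "aws-prod", 32, "fido2", False, False),
]
for finding in check_spec(inventory):
    print(finding)
```

In this made-up inventory the lint flags the lone `aws-prod` account and three problems with `bg-corp-2`. A check like this only helps if the inventory it reads is itself maintained outside the PAM.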

Common failure pattern

Most break-glass accounts we audit have their credentials in the same vault as the rest of the privileged inventory. When the vault goes down, the break-glass goes with it. The credential of last resort must be reachable when the platform of first resort is not.
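
One way to catch this pattern before an outage does is to write down what each credential path depends on and ask which accounts survive each failure scenario. The following is a minimal sketch under assumed names: the account names, the dependency labels, and the scenario list are all hypothetical and would come from your own architecture review.

```python
# Hypothetical dependency model: each credential path lists the systems that
# must be up for an operator to retrieve and use the credential.
CREDENTIAL_PATHS = {
    "bg-corp-1": {"pam-vault", "idp"},   # stored in the vault: the pattern to avoid
    "bg-corp-2": {"physical-safe"},      # sealed envelope, local logon
}

# Failure scenarios expressed as the set of systems that are down.
FAILURE_SCENARIOS = {
    "pam outage": {"pam-vault"},
    "idp outage": {"idp"},
    "pam and idp outage": {"pam-vault", "idp"},
}


def reachable_accounts(paths, failed):
    """Accounts whose every dependency is still available."""
    return sorted(name for name, deps in paths.items() if not deps & failed)


def uncovered_scenarios(paths, scenarios, minimum=1):
    """Scenarios in which fewer than `minimum` accounts remain reachable."""
    return [
        name
        for name, failed in scenarios.items()
        if len(reachable_accounts(paths, failed)) < minimum
    ]


print(reachable_accounts(CREDENTIAL_PATHS, {"pam-vault"}))
print(uncovered_scenarios(CREDENTIAL_PATHS, FAILURE_SCENARIOS, minimum=2))
```

With `minimum=1` this example passes, because the envelope-backed account survives every scenario. Raising `minimum` to 2 shows that the vault-stored account contributes nothing in exactly the situations break-glass exists for.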

The testing cadence nobody runs

A break-glass mechanism that has never been tested is not a control — it is an assumption. The minimum viable testing program is quarterly: pick a Saturday morning, simulate the PAM being down, walk the incident commander through opening the envelope, logging in, performing a documented recovery action, and writing up the post-mortem. The post-mortem is the deliverable; the technical exercise is the means to produce it.

One of two things goes wrong in the first test, every time. Either the credentials no longer work (the PAM rotated them and the envelope was not updated), or the hardware token is unavailable (a lost YubiKey, or a dead battery on a battery-powered FIDO2 device). Both findings are why you test: discover them in a planned exercise on a Saturday, not at 02:00 during a real incident.
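
The first of those findings, a credential rotated after the envelope was sealed, can also be caught between exercises by comparing two timestamps per account. This is a rough sketch that assumes you can export a last-rotation time and a last-sealed time for each account; the function and variable names are illustrative.

```python
from datetime import datetime, timezone


def stale_envelopes(rotated_at, sealed_at):
    """Accounts rotated after the envelope was sealed, or with no envelope at all."""
    stale = []
    for account, rotated in sorted(rotated_at.items()):
        sealed = sealed_at.get(account)
        if sealed is None or rotated > sealed:
            stale.append(account)
    return stale


def utc(year, month, day):
    """Small helper to build timezone-aware example timestamps."""
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical exports: last credential rotation and last envelope seal.
rotated_at = {
    "bg-corp-1": utc(2026, 3, 1),
    "bg-corp-2": utc(2026, 1, 10),
    "bg-aws-1": utc(2026, 2, 1),
}
sealed_at = {
    "bg-corp-1": utc(2026, 1, 15),   # sealed before the March rotation
    "bg-corp-2": utc(2026, 1, 10),   # sealed at rotation time
}
print(stale_envelopes(rotated_at, sealed_at))  # ['bg-aws-1', 'bg-corp-1']
```

A clean result here does not replace the exercise. It says the envelope should match the credential, not that someone has actually opened it and logged in.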

The audit evidence regulators expect

External auditors and inspectors of any maturity ask three questions about break-glass: (1) how many accounts exist and what can each one do, (2) what alerted when each was last used and what was the resolution, (3) when was the last documented test and what did it find. If you cannot answer all three from a dashboard within five minutes, the program is not auditable.

A mature break-glass record looks like a thin paper trail with no surprises: inventory of the accounts and their scope, an alert log showing each use was reviewed within the SLA, and a quarterly test log with findings closed. The volume is small by design — break-glass should be used three to five times per year in a 5,000-employee environment. More than that and you have an over-reliance problem; fewer than that and you should question whether the mechanism actually works.
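
The alert-log part of that record reduces to two checks: was every use reviewed in time, and does the yearly volume fall in the expected band. The sketch below assumes one year of events is passed in. The 24-hour review SLA is a placeholder, since the actual value is a policy decision, and the default three-to-five band is the figure quoted above for a 5,000-employee environment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class UseEvent:
    # Hypothetical alert-log record for one break-glass logon.
    account: str
    used_at: datetime
    reviewed_at: Optional[datetime]  # None if the alert has not been reviewed


def audit_summary(events, review_sla=timedelta(hours=24), expected_uses=(3, 5)):
    """Summarize one year of break-glass use events."""
    late = [
        e.account
        for e in events
        if e.reviewed_at is None or e.reviewed_at - e.used_at > review_sla
    ]
    low, high = expected_uses
    if len(events) > high:
        volume = "over-reliance"
    elif len(events) < low:
        volume = "mechanism unproven"
    else:
        volume = "within expected range"
    return {"uses": len(events), "late_or_unreviewed": late, "volume": volume}


events = [
    UseEvent("bg-corp-1", datetime(2026, 1, 10, 2, 0), datetime(2026, 1, 10, 9, 0)),
    UseEvent("bg-corp-2", datetime(2026, 4, 4, 9, 0), datetime(2026, 4, 6, 9, 0)),
    UseEvent("bg-aws-1", datetime(2026, 7, 4, 9, 0), None),
]
print(audit_summary(events))
```

Here the volume is in range, but two of the three uses breach the assumed SLA: one was reviewed after 48 hours and one was never reviewed.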

How DataDike handles break-glass

DataDike treats break-glass as a first-class object in the platform: separate inventory section, separate alerting rules that escalate to security leadership rather than the normal IR queue, and a separate audit stream that ingests events even when the main vault is partially degraded. The credentials themselves are wrapped in an out-of-band envelope (typically a sealed HSM-backed export held by the security team), so an outage of the vault control plane does not eliminate the recovery path.

Quarterly testing is built into the operator console as a scheduled workflow that produces the audit artifact directly. If your last "we tested break-glass" record is older than 90 days, the dashboard surfaces it as a finding. That posture — making the test schedule itself a tracked control — is what separates a real break-glass program from one that exists only on the architecture diagram.
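
The 90-day rule is simple enough to reproduce wherever the test log lives. The sketch below is not DataDike's implementation, only an illustration of the same check over a hypothetical mapping from identity boundary to last documented test date.

```python
from datetime import date, timedelta

MAX_TEST_AGE = timedelta(days=90)


def overdue_boundaries(last_tests, today):
    """Boundaries never tested, or whose last test is older than 90 days."""
    return sorted(
        boundary
        for boundary, tested in last_tests.items()
        if tested is None or today - tested > MAX_TEST_AGE
    )


# Hypothetical test log: boundary -> date of last documented test (None = never).
last_tests = {
    "corp.example": date(2026, 3, 14),
    "aws-prod": date(2025, 11, 1),
    "azure-tenant": None,
}
print(overdue_boundaries(last_tests, today=date(2026, 4, 8)))  # ['aws-prod', 'azure-tenant']
```

Treating a missing test date as overdue is a deliberate choice in this sketch: a boundary that has never been exercised is the case the finding exists to surface.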