SPrime AI

Coffee, runbooks, and an on-call rotation nobody dreads.

Ask most engineers how they feel about being on call and you get a flinch. We wanted the opposite: a week where the person holding the pager sleeps fine, because the system was designed to let them. This post covers how we built that rotation: structured handoffs, laminated runbooks, and a few rules that keep the overnight pager quiet.

The Desk on Monday Morning ☕️

A rotation starts with coffee and a handoff, not with a fire. The outgoing engineer walks the incoming one through what happened last week — what paged, what was noise, what to keep an eye on.

Why the Runbooks are Laminated 📚

It sounds like a joke. It is not. A laminated runbook is one that gets pulled off the shelf at 2am, spilled on, and put back. The lamination is a signal: this document is used, not archived.

  • Every alert links to a runbook. If a page does not have a written response, it is not allowed to page anyone. That rule alone cut our overnight noise significantly.
  • The runbook is short. One screen. If it needs more than one screen, the alert is too broad and gets split.
  • It says when to escalate. The hardest thing at 2am is deciding whether to wake someone. The runbook decides that for you.
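The three rules above are mechanical enough to check automatically. Here is a minimal sketch of such a check, not our actual tooling: the `Alert` and `Runbook` types, the `lint_alert` function, and the 40-line stand-in for "one screen" are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: "one screen" is approximated as 40 lines of plain text.
MAX_RUNBOOK_LINES = 40


@dataclass
class Runbook:
    title: str
    body: str  # the written response, as plain-text steps
    escalation: Optional[str] = None  # when to wake someone, and whom


@dataclass
class Alert:
    name: str
    pages: bool  # True if this alert is allowed to page a human
    runbook: Optional[Runbook] = None


def lint_alert(alert: Alert) -> list[str]:
    """Return the rule violations for one alert (empty list means it passes)."""
    if not alert.pages:
        return []  # non-paging alerts are outside these rules
    if alert.runbook is None:
        return [f"{alert.name}: no runbook, so it must not page"]
    problems = []
    line_count = len(alert.runbook.body.splitlines())
    if line_count > MAX_RUNBOOK_LINES:
        problems.append(f"{alert.name}: runbook is {line_count} lines; split the alert")
    if not (alert.runbook.escalation or "").strip():
        problems.append(f"{alert.name}: runbook does not say when to escalate")
    return problems
```

Run over every alert definition before it ships, a check like this would turn "not allowed to page anyone" from a convention into a gate.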

On-call management tools such as PagerDuty and Opsgenie also emphasize clear runbooks.

A good on-call week is one where the runbook answered the question before you had to think.

The Handoff Doc 📄

The artifact that carries the rotation is a single living document. Each week appends to it; nothing is deleted. The incoming engineer reads the last entry and knows the state of the world.
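The append-only property is the part worth modelling. The sketch below assumes a weekly entry with the fields the Monday handoff covers (what paged, what was noise, what to watch). The `HandoffEntry` and `HandoffLog` names and their fields are hypothetical, and a real handoff doc may simply be a shared document rather than code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: an entry cannot be edited once written
class HandoffEntry:
    week_of: date
    outgoing: str
    incoming: str
    paged: tuple[str, ...]  # what actually paged
    noise: tuple[str, ...]  # pages that turned out to be noise
    watch: tuple[str, ...]  # things to keep an eye on


class HandoffLog:
    """Append-only: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[HandoffEntry] = []

    def append(self, entry: HandoffEntry) -> None:
        if self._entries and entry.week_of <= self._entries[-1].week_of:
            raise ValueError("entries must be appended in chronological order")
        self._entries.append(entry)

    def latest(self) -> HandoffEntry:
        """The entry the incoming engineer reads first."""
        if not self._entries:
            raise LookupError("no handoff recorded yet")
        return self._entries[-1]

    def history(self) -> tuple[HandoffEntry, ...]:
        return tuple(self._entries)  # a copy, so callers cannot mutate the log
```

The class deliberately has no delete or update method, which mirrors the "nothing is deleted" rule.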

Why Nobody Dreads the Pager 📟

The dread does not come from being on call. It comes from being on call for a system you do not understand, with no plan, alone. We removed all three.

  • You are never alone. A secondary is always named. They do not get paged first, but they exist, and everyone knows who they are.
  • The pager is quiet by design. Fewer than two pages a week, most weeks. We treat every false page as a bug to fix, not noise to tolerate.
  • The week ends. A rotation is one week, then it is someone else's turn. Nobody carries the pager into a second week, and nobody carries the stress past Friday.
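The scheduling rules above (one week each, a named secondary, nobody primary two weeks running) fit a simple round-robin. This is a sketch under assumptions, not a description of our scheduler: `build_rotation` is a hypothetical helper, and making next week's primary this week's secondary is one possible choice among several.

```python
from datetime import date, timedelta


def build_rotation(engineers: list[str], start: date, weeks: int) -> list[dict]:
    """Round-robin schedule: one primary and one named secondary per week."""
    if len(engineers) < 2:
        raise ValueError("need at least two engineers so a secondary always exists")
    schedule = []
    for i in range(weeks):
        primary = engineers[i % len(engineers)]
        # Assumption: the next person in the list is secondary this week
        # and becomes primary the week after.
        secondary = engineers[(i + 1) % len(engineers)]
        schedule.append({
            "week_of": start + timedelta(weeks=i),
            "primary": primary,
            "secondary": secondary,
        })
    return schedule
```

With at least two engineers, adjacent weeks always have different primaries, so nobody carries the pager into a second week.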

Boring is the goal. The best compliment our on-call engineers give is that they forgot they were holding the pager. Scheduling tools such as VictorOps (now Splunk On-Call) can handle the mechanics of a rotation like this.

Frequently asked questions

Why are the runbooks laminated?

Lamination is a signal that the document is used, not archived. A laminated runbook is one that gets pulled off the shelf at 2am, spilled on, and put back. It reflects a culture where runbooks are living operational tools rather than wiki pages nobody opens during an incident.

What rules do alerts and runbooks follow?

Every alert must link to a runbook: if a page doesn't have a written response, it isn't allowed to page anyone. That rule alone cut overnight noise significantly. Runbooks are kept to one screen; if a response needs more than one screen, the alert is too broad and gets split. Each runbook also states explicitly when to escalate.

How long does a rotation last?

A rotation is one week, then it's someone else's turn. Nobody carries the pager into a second week, and nobody carries the stress past Friday. The fixed, bounded week is part of why nobody dreads the pager: the dread comes from open-ended responsibility for a system you don't understand, alone, which the team deliberately removed.

What does the secondary do?

A secondary is always named so the primary is never alone. The secondary doesn't get paged first, but they exist and everyone knows who they are. Being alone with an unfamiliar system is one of the three sources of dread the team eliminated, alongside lack of understanding and lack of a plan.

How often does the pager go off?

The pager is quiet by design: fewer than two pages a week, most weeks. Every false page is treated as a bug to fix, not noise to tolerate. Combined with the rule that no alert can page without a linked runbook, this keeps overnight interruptions low and lets the on-call engineer actually sleep.

What is the handoff doc?

It's a single living document that carries the rotation. Each week appends to it and nothing is deleted, so the incoming engineer reads the last entry and immediately knows the state of the world. The rotation literally starts with coffee and a handoff, with the outgoing engineer walking the incoming one through what paged, what was noise, and what to watch.

Why does nobody dread the pager?

The dread doesn't come from being on call; it comes from being on call for a system you don't understand, with no plan, alone. The team removed all three: runbooks give you the plan and the understanding, a named secondary means you're never alone, and the quiet-by-design pager plus a one-week limit cap the stress. Boring is the goal: the best compliment is that engineers forgot they were holding the pager.

Why is boring the goal?

Because a good on-call week is one where the runbook answered the question before the engineer had to think, and where the system was designed to let the person holding the pager sleep fine. Boring means the upstream decisions (scoping, release discipline, alert hygiene) were sound. The aim is reliability you can sustain, not heroics, so engineers don't flinch at the prospect of holding the pager.
