People ask us to draw Aegis on a whiteboard often enough that we figured we should just publish the drawing. Here it is, cleaned up, with the signals between each stage labeled.
Stage 1 — Planning.
The unit of work entering this stage is a ticket. The unit of work leaving this stage is a smaller ticket, paired with a list of dependencies the team didn't notice the first time around. AI assists; humans decide what gets scoped in.
Stage 2 — Pre-release quality.
The diff arrives. AI extends test coverage to what changed, runs the regression library against adjacent surfaces, and emits a one-page risk note. The note exists so the on-call engineer has something to read, not a CI page to interpret.
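To make the risk note concrete, here is a minimal sketch of what one could carry. The `RiskNote` class, its fields, and the rule for picking a risk level are all hypothetical and chosen for illustration. They are not the actual Aegis format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RiskNote:
    """Hypothetical shape of a generated risk note (illustrative only)."""
    diff_id: str
    changed_surfaces: List[str]
    tests_added: int
    regression_failures: List[str] = field(default_factory=list)

    @property
    def level(self) -> str:
        # Assumed rule: any regression failure makes the release high risk;
        # otherwise the level depends on how many surfaces the diff touches.
        if self.regression_failures:
            return "high"
        return "medium" if len(self.changed_surfaces) > 3 else "low"

    def render(self) -> str:
        """Render the note as short plain text for the on-call engineer."""
        lines = [
            f"Risk note for {self.diff_id}: {self.level.upper()}",
            f"Changed surfaces: {', '.join(self.changed_surfaces) or 'none'}",
            f"Tests added: {self.tests_added}",
        ]
        if self.regression_failures:
            lines.append("Regression failures:")
            lines.extend(f"  - {name}" for name in self.regression_failures)
        else:
            lines.append("Regression failures: none")
        return "\n".join(lines)


if __name__ == "__main__":
    note = RiskNote("diff-123", ["checkout", "cart"], tests_added=4)
    print(note.render())
```

The point of the sketch is the shape: a handful of fields a person can read in under a minute, rather than a raw CI log.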
Stage 3 — Deploy.
This is the only stage where a human is the primary actor on the critical path. The on-call reads the note, confirms the rollback target, and approves the release. We deliberately did not automate this decision. The risk of removing the human is greater than the cost of keeping them.
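The gate described above can be sketched as a check that refuses to deploy without an explicit human approval. This is an illustration under assumptions: `ReleaseRequest`, `Approval`, and `can_deploy` are hypothetical names, not part of Aegis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReleaseRequest:
    diff_id: str
    risk_note: str
    rollback_target: str  # the version to return to if the release goes wrong


@dataclass
class Approval:
    approver: str
    note_read: bool
    confirmed_rollback_target: Optional[str]


def can_deploy(request: ReleaseRequest, approval: Optional[Approval]) -> bool:
    """Allow a deploy only when a human has read the note and confirmed
    the same rollback target the release request names."""
    if approval is None:
        return False  # no human decision yet: never deploy automatically
    if not approval.note_read:
        return False
    return approval.confirmed_rollback_target == request.rollback_target


if __name__ == "__main__":
    request = ReleaseRequest("diff-123", "LOW: checkout only", "v41")
    approval = Approval("on-call", note_read=True, confirmed_rollback_target="v41")
    print(can_deploy(request, approval))
```

Comparing the confirmed rollback target against the one on the request is a design choice in this sketch: it makes the confirmation a concrete value the human has to supply, not a checkbox.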
Stage 4 — Monitoring.
Anomaly detection runs against production traffic in five-minute windows. If something drifts, the team sees it before customers do. The output of this stage feeds back into Stage 1 — the dashed orange line in the diagram.
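As an illustration of windowed detection, the sketch below buckets request events into five-minute windows and flags a window whose error rate sits far above the trailing baseline. The z-score rule, the baseline size, and the threshold are assumptions made for the example. The post does not say which detector Aegis uses.

```python
from collections import defaultdict
from statistics import mean, pstdev

WINDOW_SECONDS = 300  # five-minute windows


def error_rate_by_window(events):
    """Group (unix_timestamp, is_error) pairs into windows and
    return {window_index: error_rate}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for ts, is_error in events:
        window = int(ts) // WINDOW_SECONDS
        totals[window] += 1
        if is_error:
            errors[window] += 1
    return {w: errors[w] / totals[w] for w in sorted(totals)}


def drifting_windows(rates, baseline_size=6, threshold=3.0):
    """Return windows whose error rate is more than `threshold` standard
    deviations above the mean of the previous `baseline_size` windows."""
    windows = sorted(rates)
    flagged = []
    for i in range(baseline_size, len(windows)):
        baseline = [rates[w] for w in windows[i - baseline_size:i]]
        mu = mean(baseline)
        # Floor the spread so a perfectly flat baseline does not divide by zero.
        sigma = max(pstdev(baseline), 1e-3)
        if (rates[windows[i]] - mu) / sigma > threshold:
            flagged.append(windows[i])
    return flagged


if __name__ == "__main__":
    events = []
    for w in range(7):
        error_count = 20 if w == 6 else 1  # the last window drifts
        for i in range(100):
            events.append((w * WINDOW_SECONDS + i, i < error_count))
    print(drifting_windows(error_rate_by_window(events)))
```

In a loop like the one described here, each flagged window would become an input to planning, for example as a ticket that names the affected surface.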
The loop is the product. The individual stages are just where the loop pauses to think.
Where this differs from a generic CI/CD setup.
- The unit of work is constrained at planning time, not deploy time.
- The risk note is generated, not produced by a human at 1am.
- The signal from monitoring re-enters planning explicitly, not as an after-the-fact incident review.
— Silicon Prime team. May 2026.