The Conference Prod Push

It happens the same way every time. Someone at the table has a laptop open. There is a deploy that has been sitting in staging. Everything looks fine. The drinks are good. The person with the laptop says "should we just push this?" and the table goes quiet.

Not because anyone objects. Because everyone at the table is an operator and everyone at the table knows exactly what this is. You are about to play a game with production infrastructure from a hotel bar at 11pm. The VPN is flaky. Your phone battery is at 40%. The person who knows the database schema is asleep.

The push goes out. Sometimes it is fine. Sometimes it is very much not fine and you spend the next three hours hunched over a laptop in a conference hallway while your colleagues wave from the after-party. The ratio of "fine" to "not fine" is high enough that the ritual continues.

The on-call rotation does not care that you are in a different time zone. It does not care that you have a keynote in the morning. The rotation is the rotation.
— Learned at altitude, at speed, the hard way

The Incident Report Nobody Files

After the conference push goes wrong, there is a resolution. Someone fixes it. The service comes back. The affected users — it was late, so not many — move on with their lives. And because the blast radius was small and everyone is tired and the conference starts in six hours, the incident report does not get written.

This is how a certain kind of institutional knowledge disappears. Not in a catastrophic failure with a thorough postmortem, but in a hundred small fires that got put out quietly and were never documented. The next person who tries this deploy will not know there was a problem. They will find out the same way you did.

The conference circuit carries its own oral history of these moments: the names change, the systems change, but the structure of the story stays constant. Someone pushed at the wrong time. Someone fixed it. Nobody wrote it down. The thing keeps happening.