PKI Deployment Mistakes
The person who set this up is gone

6. "Dave set this up before he left"
What happens
Engineer deploys certificate, doesn't document the process, and leaves the company. All knowledge of how it was set up walks out the door with them.
Why it seems reasonable
"Dave knows what he's doing. He'll be here if we need help."
The reality
Everyone leaves their jobs eventually. They quit, get fired, or die. Dave is not an exception.
Real-world consequence
Hours of archaeology to figure out what Dave did in 5 minutes three years ago. You're reverse-engineering your own infrastructure while the outage clock is ticking.
The fix
- Deploy as if you're leaving tomorrow
- Runbook for every deployment - no exceptions
- No single-person dependencies on any certificate
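One way to make the last two rules checkable is to keep a small inventory and audit it on a schedule. The sketch below is illustrative only: the CertRecord fields (name, owners, runbook) and the two-owner threshold are assumptions made for this example, not a prescribed format.

```python
from dataclasses import dataclass, field


@dataclass
class CertRecord:
    """Hypothetical inventory entry for one deployed certificate."""
    name: str
    owners: list = field(default_factory=list)  # people who can redeploy it
    runbook: str = ""                           # link or path to the runbook


def audit(records):
    """Return human-readable findings for risky inventory entries."""
    findings = []
    for record in records:
        # Fewer than two distinct owners means one departure loses the knowledge.
        if len(set(record.owners)) < 2:
            findings.append(f"{record.name}: single-person dependency")
        if not record.runbook.strip():
            findings.append(f"{record.name}: no runbook")
    return findings


if __name__ == "__main__":
    inventory = [
        CertRecord("api.example.com", owners=["dave"]),
        CertRecord("www.example.com", owners=["ana", "li"], runbook="wiki/certs/www"),
    ]
    for finding in audit(inventory):
        print(finding)
```

Running the audit in CI or a weekly job turns "only Dave knows" from a surprise into a ticket.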
Warning signs
- "Only [person] knows how that works"
- Fear when key employee updates LinkedIn
7. "Just install the cert, it's easy"
What happens
Certificate gets installed but the intermediate chain is missing. Browsers work fine but APIs and mobile apps break.
Why it seems reasonable
"Site shows the padlock, it works!"
The reality
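Desktop browsers paper over the gap: they may have cached the intermediate certificate from a previous visit to another site, or they fetch it on demand from the URL in the certificate's AIA extension. Many other clients (curl, API libraries, mobile HTTP stacks) do neither, so they can't complete the chain.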
Real-world consequence
QA tests in Chrome: works perfectly. Mobile app users start complaining. APIs fail with SSL errors. Support tickets pile up while you insist "it works for me."
The fix
- Test with openssl s_client -showcerts
- Use SSL Labs to verify full chain
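- Test from a clean environment (a different machine and a non-browser client such as curl; an incognito window alone is not enough, since the browser can still fill in the missing intermediate)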
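The same check can be scripted. The sketch below uses Python's standard ssl module, which does not fetch missing intermediates on its own, so it tends to fail in the same way strict API clients do. The function name check_chain and its return shape are assumptions for illustration; the exact error text comes from the underlying OpenSSL build.

```python
import socket
import ssl


def check_chain(host, port=443, timeout=5.0):
    """Handshake like a strict non-browser client; return (ok, message)."""
    # Default context: verifies the chain against the system trust store
    # and checks the hostname. It will not download missing intermediates.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, "chain verified against the system trust store"
    except ssl.SSLCertVerificationError as exc:
        # A missing intermediate typically surfaces here, often as
        # "unable to get local issuer certificate".
        return False, f"verification failed: {exc.verify_message}"
    except OSError as exc:
        # DNS failure, refused connection, timeout, non-TLS service, etc.
        return False, f"connection failed: {exc}"


if __name__ == "__main__":
    ok, message = check_chain("example.com")
    print("OK" if ok else "FAIL", "-", message)
```

A failure here while Chrome shows a padlock is a strong hint that the server is not sending its intermediate certificates.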
Warning signs
- "Works in Chrome, must be their problem"
- Different behavior on different devices
8. "We renewed it, why is it still broken?"
What happens
Certificate expires. Team renews with the CA, declares victory. Site is still down.
Why it seems reasonable
"We renewed it! The CA says it's valid!"
The reality
Renewal ≠ Deployment. The new certificate is sitting in someone's Downloads folder, not on the server.
Real-world consequence
Certificate expires at 2am. On-call engineer downloads new cert from CA portal. Marks ticket resolved. Goes back to sleep. Site is still down. Customers are still seeing errors.
The fix
- Runbooks include deployment, not just renewal
- Verification step that checks the live service
- "Done" means verified in production, not downloaded
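A verification step can be as small as comparing fingerprints: the certificate the live service presents should be the one that was just renewed. This is a minimal sketch using only the standard library; the function names fingerprint and is_deployed are hypothetical, and it assumes the first PEM block in the renewed file is the server certificate rather than an intermediate.

```python
import hashlib
import re
import ssl

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def fingerprint(pem_text):
    """SHA-256 fingerprint (hex) of the first certificate in a PEM string."""
    match = PEM_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no PEM certificate found")
    der = ssl.PEM_cert_to_DER_cert(match.group(0))
    return hashlib.sha256(der).hexdigest()


def is_deployed(renewed_pem, host, port=443):
    """True only if the live service presents the renewed certificate."""
    # Fetches whatever certificate the server is serving right now.
    live_pem = ssl.get_server_certificate((host, port))
    return fingerprint(live_pem) == fingerprint(renewed_pem)


if __name__ == "__main__":
    # "renewed.pem" is a placeholder path for the file downloaded from the CA.
    with open("renewed.pem", encoding="ascii") as handle:
        renewed = handle.read()
    print("deployed" if is_deployed(renewed, "example.com") else "NOT deployed")
```

Wiring this into the runbook's last step means the ticket cannot be closed on a certificate that is still sitting in a Downloads folder. Behind a load balancer, each backend would need its own check.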
Warning signs
- "We renewed it" without "and deployed it"
- No deployment verification step in the process
9. "Nobody else needs access to this key"
What happens
Private key exists on one machine only. No backup. That machine dies or the person who has access leaves.
Why it seems reasonable
"Fewer people with access = more secure"
The reality
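There's "secure" and there's "we lost the only copy." You can respond to a compromise by revoking and reissuing while the service stays up. A lost key can't be restored at all; the only way back is a new key and a reissued certificate, obtained while the site is down.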
Real-world consequence
Server fails. You need to restore the certificate. The private key only existed on that dead disk. Now you're reissuing certificates while the site is down.
The fix
- Key escrow with encrypted backup
- Document key locations in your certificate inventory
- Test key recovery before you need it
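A recovery drill can end with a mechanical check that the restored key actually pairs with the deployed certificate. The sketch below leans on ssl.SSLContext.load_cert_chain, which raises ssl.SSLError when the key and certificate do not match or cannot be parsed. The function name key_matches_cert is hypothetical, and the sketch assumes the restored key is a PEM file that is either unencrypted or accompanied by its passphrase.

```python
import ssl
from typing import Optional


def key_matches_cert(cert_path: str, key_path: str,
                     password: Optional[str] = None) -> bool:
    """Return True if the private key at key_path pairs with cert_path."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # Loads both files and checks that the key belongs to the certificate.
        context.load_cert_chain(certfile=cert_path, keyfile=key_path,
                                password=password)
    except ssl.SSLError:
        # Mismatched pair, wrong passphrase, or unparseable PEM.
        return False
    return True


if __name__ == "__main__":
    # Placeholder paths: the deployed certificate and the key restored
    # from the encrypted backup during the drill.
    print(key_matches_cert("deployed-cert.pem", "restored-key.pem"))
```

A missing file raises an OSError instead of returning False, which keeps "the backup is wrong" distinct from "the drill was pointed at the wrong path".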
Warning signs
- Key locations undocumented
- "It's on the server" as the only backup plan
Key Takeaways
- Document every deployment as if you're leaving tomorrow
- Test certificate installations from a clean environment, not your browser
- Renewal is not deployment - verify the live service
- Backup private keys securely - "secure" doesn't mean "lost forever"