The wmfusercontent.org certificate expired recently, and it should have been on our tracking calendar.
I've created this task to track auditing the existing certificates and ensuring all are on the calendar.
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | BBlack | T112521 Track/notify cert expiries better
Resolved | | RobH | T112542 audit all SSL certificates expiry on ops tracking gcal
This task is changing scope to become a checklist for all SSL certificate purchases, covering how to review and audit them.
Since we've had a second cert expire on us unexpectedly now in a span of a few days, I went ahead and audited the expiries on all of the cert files stored in puppet's files/ssl/ directory. Nothing else is coming up imminently. There's a batch that will expire in November, but it's the prod SNI certs we're not currently using and don't currently plan to renew.
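For anyone repeating this kind of audit, here is a minimal sketch of how it could be scripted. It is not the exact procedure used above. It assumes the `openssl` CLI is on the PATH and that the certificates are PEM `.crt` files in a single directory; the function names and the 60-day window are illustrative.

```python
import ssl
import subprocess
import sys
import time
from pathlib import Path


def parse_enddate(line):
    """Turn openssl's 'notAfter=Nov 10 12:00:00 2015 GMT' into epoch seconds."""
    return ssl.cert_time_to_seconds(line.strip().split("=", 1)[1])


def days_left(expiry_epoch, now=None):
    """Whole days until expiry (negative once the cert has expired)."""
    now = time.time() if now is None else now
    return int((expiry_epoch - now) // 86400)


def cert_enddate(path):
    """Ask the openssl CLI for the notAfter line of the first cert in the file."""
    result = subprocess.run(
        ["openssl", "x509", "-enddate", "-noout", "-in", str(path)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def audit(directory, window_days=60):
    """Return sorted (days_left, filename) pairs for certs inside the window."""
    soon = []
    for crt in sorted(Path(directory).glob("*.crt")):
        remaining = days_left(parse_enddate(cert_enddate(crt)))
        if remaining <= window_days:
            soon.append((remaining, crt.name))
    return sorted(soon)


# Hypothetical invocation: python audit_certs.py files/ssl
if __name__ == "__main__" and len(sys.argv) > 1:
    for remaining, name in audit(sys.argv[1]):
        print(f"{remaining:5d} days  {name}")
```

Sorting by days remaining puts already-expired or soon-to-expire certs at the top of the output.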
Wasn't there an Icinga check that tested that certificates were good for another x days?
We only have that icinga check on the primary unified cert, which covers the production endpoints for:
... and all of their mobile subdomains and whatnot. It's a pretty verbose check, validates functional SSL for all of the SAN domains, checks the cert expiry, etc.
But we don't have any kind of checking in place for the various other misc certs we own that are deployed for smaller or one-off services, or deployed to third parties (or in some cases, rare today but important later - not deployed at all but still critical). Just looking at puppet's files/ssl/ today, that list is something like:
archiva.wikimedia.org.crt
blog.wikimedia.org.crt
dumps.wikimedia.org.crt
ecc-star.wmfusercontent.org.crt
eventdonations.wikimedia.org.crt
ganglia.wikimedia.org.crt
gerrit.wikimedia.org.crt
icinga.wikimedia.org.crt
labvirt-star.eqiad.wmnet.crt
ldap-codfw.wikimedia.org.crt
ldap-eqiad.wikimedia.org.crt
ldap-mirror.wikimedia.org.crt
librenms.wikimedia.org.crt
lists.wikimedia.org.crt
policy.wikimedia.org.crt
rt.wikimedia.org.crt
star.planet.wikimedia.org.crt
star.wmflabs.org.crt
star.wmfusercontent.org.crt
stream.wikimedia.org.crt
tendril.wikimedia.org.crt
ticket.wikimedia.org.crt
toolserver.org.crt
virt-star.eqiad.wmnet.crt
wikitech.wikimedia.org.crt
Of those, I can see in our icinga config direct expiry checks only for:
lists.wikimedia.org
ticket.wikimedia.org
ldap-codfw.wikimedia.org
ldap-eqiad.wikimedia.org
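The gap between those two lists can be computed rather than eyeballed. The sketch below is only an illustration: it hardcodes the two lists from this comment, whereas a real version would presumably read the directory listing and the Icinga config. The names `CERT_FILES`, `CHECKED` and `unchecked_certs` are made up for the example.

```python
# Cert files seen in puppet's files/ssl/ (copied from the list above).
CERT_FILES = """
archiva.wikimedia.org.crt blog.wikimedia.org.crt dumps.wikimedia.org.crt
ecc-star.wmfusercontent.org.crt eventdonations.wikimedia.org.crt
ganglia.wikimedia.org.crt gerrit.wikimedia.org.crt icinga.wikimedia.org.crt
labvirt-star.eqiad.wmnet.crt ldap-codfw.wikimedia.org.crt
ldap-eqiad.wikimedia.org.crt ldap-mirror.wikimedia.org.crt
librenms.wikimedia.org.crt lists.wikimedia.org.crt policy.wikimedia.org.crt
rt.wikimedia.org.crt star.planet.wikimedia.org.crt star.wmflabs.org.crt
star.wmfusercontent.org.crt stream.wikimedia.org.crt tendril.wikimedia.org.crt
ticket.wikimedia.org.crt toolserver.org.crt virt-star.eqiad.wmnet.crt
wikitech.wikimedia.org.crt
""".split()

# Names that already have a direct expiry check in Icinga (from the list above).
CHECKED = {
    "lists.wikimedia.org",
    "ticket.wikimedia.org",
    "ldap-codfw.wikimedia.org",
    "ldap-eqiad.wikimedia.org",
}


def unchecked_certs(cert_files, checked):
    """Return cert names (file name minus '.crt') that have no expiry check."""
    suffix = ".crt"
    names = {f[:-len(suffix)] if f.endswith(suffix) else f for f in cert_files}
    return sorted(names - set(checked))


missing = unchecked_certs(CERT_FILES, CHECKED)
print(f"{len(missing)} of {len(CERT_FILES)} certs have no expiry check:")
for name in missing:
    print("  " + name)
```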
Related but slightly tangential to this: we also have other private material that's bound to expire (e.g. the puppet CA, gpg keyrings for apt repos, certs for cassandra server/client auth). I was thinking we could extend the checks by reading that material directly from private.git locally and alerting accordingly. Thoughts?
For the first steps, I've created https://docs.google.com/a/wikimedia.org/spreadsheets/d/1yT5rvoEEUHhNeJAQRVamr8ECqN3TLsMaO8N_At4Ki3I/edit?usp=sharing
This lists all the one-off certificates mentioned earlier in this task. I'll use it to check the ops tracking calendar for these renewals, adding the SSL renewals to the calendar entries as needed.
The Google sheet has been updated with the most recent mail purchases, and all info has been added to the Google calendar for expiry tracking. Each entry has a 4-week notification email set to go to both myself and the SSL renewal alias (intentionally not listed in this task).
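As a small worked example of the 4-week lead time: the reminder date is the expiry date minus 28 days. The expiry date below is made up for illustration and is not taken from the sheet.

```python
from datetime import date, timedelta


def reminder_date(expiry, weeks_before=4):
    """Date on which the renewal reminder should fire."""
    return expiry - timedelta(weeks=weeks_before)


# Illustrative expiry date only, not a real certificate's.
example_expiry = date(2015, 11, 30)
print(reminder_date(example_expiry))  # 2015-11-02
```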
So while this is 99% complete, we should still add Icinga checks for the various hosts, so I'll be changing the overall subject of the task.
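To make the per-host check idea concrete, here is a rough sketch of what such a check does. It is not our existing Icinga plugin or its configuration. It follows the usual Nagios/Icinga plugin convention of exit codes 0/1/2 for OK/WARNING/CRITICAL; the 30-day and 14-day thresholds, the function names and the output wording are assumptions for the example.

```python
import socket
import ssl
import sys
import time

OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios/Icinga plugin exit codes


def classify(days_left, warn_days=30, crit_days=14):
    """Map days until expiry to a plugin status (thresholds are assumptions)."""
    if days_left <= crit_days:
        return CRITICAL
    if days_left <= warn_days:
        return WARNING
    return OK


def remote_days_left(host, port=443, timeout=10):
    """Connect over TLS and return whole days until the served cert expires."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            not_after = tls.getpeercert()["notAfter"]
    return int((ssl.cert_time_to_seconds(not_after) - time.time()) // 86400)


# Hypothetical invocation: python check_cert_expiry.py gerrit.wikimedia.org
if __name__ == "__main__" and len(sys.argv) > 1:
    host = sys.argv[1]
    try:
        days = remote_days_left(host)
    except OSError as exc:
        # Covers connection failures and TLS verification errors, which is
        # also what an already-expired certificate produces.
        print(f"SSL CRITICAL - {host}: {exc}")
        sys.exit(CRITICAL)
    status = classify(days)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    print(f"SSL {label} - {host} certificate expires in {days} days")
    sys.exit(status)
```

A check like this only covers certs that are actually served on a reachable endpoint. Certs deployed to third parties or not deployed at all would still need a file-based check or the calendar entries.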