NET::ERR_CERT_COMMON_NAME_INVALID error when accessing the JS SDK on Enterprise line
Incident Report for TokBox
Postmortem

Timeline

Around 2018-07-12 15:20 PDT, we continued the process of migrating to a more secure and stable API gateway as part of a larger system improvement project. At 2018-07-12 15:34 PDT, we received the first report from a customer that the Enterprise JS SDK was not accessible. At 2018-07-12 15:52 PDT, we published the first incident. At 2018-07-12 16:15 PDT, we pushed a fix, but it did not persist. On 2018-07-13 at 08:28 PDT, we created a new incident based on new reports, though the incidence and reproducibility of the problem were significantly lower. On 2018-07-13 at 10:23 PDT, we resolved the incident.

Root-Cause Analysis

The outage was a result of migrating to a more secure and stable API gateway. We use multiple external DNS providers, and one of them had an incorrect record set. Compounding the problem, the current gateway servers were pointing to an internal DNS that had cached the incorrect record set.
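
A check of the kind that could catch this class of problem is sketched below. It is only an illustration under stated assumptions, not the tooling we use: the function name, the provider names, and the addresses (taken from the reserved documentation ranges) are all hypothetical, and the record values are assumed to have already been collected from each provider by some other means.

```python
def find_mismatched_providers(expected, answers):
    """Return the providers whose record set differs from the expected one.

    expected: iterable of record values the hostname should resolve to.
    answers:  mapping of provider name -> iterable of record values that
              provider actually returned for the same hostname.
    Both are hypothetical inputs; fetching them is out of scope here.
    """
    # Compare as sets so that record order and duplicates do not matter.
    expected_set = frozenset(expected)
    return {
        name: set(records)
        for name, records in answers.items()
        if frozenset(records) != expected_set
    }


if __name__ == "__main__":
    # Hypothetical data: one external provider serves a stale record set.
    answers = {
        "provider-a": ["192.0.2.10"],
        "provider-b": ["198.51.100.7"],
        "provider-c": ["192.0.2.10"],
    }
    mismatched = find_mismatched_providers(["192.0.2.10"], answers)
    for name, records in mismatched.items():
        print(f"{name} returned unexpected records: {sorted(records)}")
```

Comparing against an explicitly stated expected set, rather than against whatever most providers return, means the check still flags a problem when several providers are wrong in the same way.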

Remediation

This happened as we are conducting an important project to improve the Enterprise line. We expect to complete that initiative in a week or so. Once it is completed, we'll resume efforts to improve monitoring that could have detected this problem, and to automate some notifications to our status page.

Posted 2 days ago. Jul 13, 2018 - 12:03 PDT

Resolved
The outage was a result of migrating to a more secure and stable API gateway. We use multiple external DNS providers, and one of them had an incorrect record set. Compounding the problem, the current gateway servers were pointing to an internal DNS that had cached the incorrect record set. This morning at 9:15am we found the 2nd incorrect external record and updated it.
Posted 2 days ago. Jul 13, 2018 - 10:23 PDT
Investigating
A few customers have started to experience this problem. It does not happen consistently, and we are investigating.
Posted 2 days ago. Jul 13, 2018 - 08:28 PDT
This incident affected: Enterprise (Enterprise Video).