Authentication and authorisation outage
Incident Report for Docplanner Phone
Postmortem

Update 24/09/2024

Key takeaways from the post-mortem meeting are:

  • improving application monitoring so that automated alerts are sent earlier
  • improving the rollback process so that it can be easily started manually now and fully automated later
  • reviewing the Fallback Workstation process so that it also covers a partial outage of the application
  • improving incident management so that we can quickly declare an incident and show notifications to customers
  • changing the auth library
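
As an illustration of the first takeaway, earlier automated alerts could be driven by the share of failed authorization attempts in a short sliding window. The sketch below is hypothetical: the class name, window length, threshold, and minimum request count are assumptions made for the example, not a description of our actual monitoring.

```python
from collections import deque


class AuthFailureMonitor:
    """Hypothetical sliding-window check on the authorization failure rate."""

    def __init__(self, window_seconds=60.0, threshold=0.5, min_requests=20):
        # All three values are illustrative assumptions, not production settings.
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.min_requests = min_requests
        self._events = deque()  # (timestamp, failed) pairs, oldest first

    def record(self, timestamp, failed):
        """Store one authorization attempt and drop attempts outside the window."""
        self._events.append((timestamp, failed))
        self._evict(timestamp)

    def _evict(self, now):
        # Remove attempts that are older than the window.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            self._events.popleft()

    def should_alert(self, now):
        """Return True when enough recent attempts exist and too many failed."""
        self._evict(now)
        total = len(self._events)
        if total < self.min_requests:
            return False
        failures = sum(1 for _, failed in self._events if failed)
        return failures / total >= self.threshold


# Usage: simulate a short burst in which every authorization attempt fails.
monitor = AuthFailureMonitor(window_seconds=60, threshold=0.5, min_requests=10)
for second in range(10):
    monitor.record(timestamp=second, failed=True)
outage_detected = monitor.should_alert(now=10)
```

The minimum request count keeps a handful of failed logins during quiet periods from triggering an alert.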

We will start implementing these changes as soon as possible.


Initial report 18/09/2024

Timeframe [CEST]:

  • EU between 14:39 and 14:53
  • BR between 14:39 and 15:12
  • MX between 14:39 and 15:10
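
For reference, the length of the outage in each region follows directly from the windows above. The short sketch below only does that arithmetic; the variable and function names are illustrative.

```python
from datetime import datetime

# Outage windows in CEST, copied from the timeframe above.
WINDOWS = {
    "EU": ("14:39", "14:53"),
    "BR": ("14:39", "15:12"),
    "MX": ("14:39", "15:10"),
}


def outage_minutes(start, end):
    """Return the number of whole minutes between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


durations = {region: outage_minutes(*window) for region, window in WINDOWS.items()}
for region, minutes in durations.items():
    print(f"{region}: {minutes} min")
# EU: 14 min
# BR: 33 min
# MX: 31 min
```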

Summary
Users were not able to log in. Those already logged in could not do anything in the app because the connection between the interface and the servers was blocked. We made a mistake while optimizing the deployment process and removed a part of the workflow that is needed to properly authorize users.

Details
The problem that we introduced yesterday affected the authentication and authorization part of the application. Users were not able to log in or access any internal endpoints of the system. All secured communication between the interface and the servers was down because we were unable to authorize users to access the API (even if they were already logged in).

Fallback

The part that communicates with the telecommunication network was working well, which is why the fallback workstation path was not activated. We will take a closer look at how we can activate the fallback mechanism in such cases in the future.

We will analyze the details during the post-mortem review next week, and we will keep you updated on our plan to prevent this from happening again.

Posted Sep 19, 2024 - 09:49 CEST

Resolved
Users are not able to log in. Those logged in are unable to do anything in the app because the connection between the interface and servers is blocked.
Posted Sep 18, 2024 - 14:30 CEST