A letter from our CEO: Downtime report - Thursday, March 22nd

Dear Desk-Net customers,

As you will probably have noticed, Desk-Net was down today for a very long time. We have not experienced such a long downtime during European office hours in our more than eleven years of operation.

For most of you, this happened during some of the busiest hours of your daily use of Desk-Net.

Before I explain what happened, please accept my sincere apologies for this downtime.

We know you rely on Desk-Net for your day-to-day operations, and issues like these can have a significant impact.

What happened?

Desk-Net went down unexpectedly for a significant duration due to a fatal issue on our central application server at our hosting provider, Amazon Web Services. For those of you interested in the technical details: This is what happened.

The issue also took down our marketing site, and our email server was unavailable as well.

The Desk-Net team was fully staffed at that moment and immediately began analyzing the issue and starting the failover procedure. Our customer success team communicated with our users via Twitter as usual and answered all incoming phone calls.

Due to the severity of the issue, this was never going to be a quick process. However, overall the recovery took longer than it should have.

What will we do about it?

We are already in the process of significantly reducing the risk of something like this happening again:

  • We are just a few weeks away from separating our email server from the main server instance. This will allow us to communicate with users via email in a case like today's.

  • We will once again review and test our documented failover procedure, which was written for exactly such a case. The objective is to significantly reduce the time it takes to recover Desk-Net in instances like these.

  • The above-mentioned separation of the email server from the main application instance is part of an overall project of splitting Desk-Net up into micro-services.
    Once this is largely completed in about eight months, we will be able to distribute the main Desk-Net application across several server instances.
    This will enable Desk-Net to stay up even if one of the application instances fails, as happened today. Every month we dedicate significant resources to this work as we strive to make Desk-Net more resilient and easier to maintain and improve.

We had planned to deploy a new release of Desk-Net tomorrow, Friday, March 23rd. As a precautionary measure, we are postponing this until Tuesday, March 27th. We will keep you informed via Twitter.

Once again, please accept my apologies for this downtime.

We are working hard to make sure this does not happen again.

Best regards,
Matthias Kretschmer
CEO & Founder, Desk-Net GmbH