We had a VMware crash (at the hypervisor level) on our main Redis server.
The log says:
An application (/bin/vmx) running on ESXi host has crashed (1 time(s) so far). A core file might have been created at /vmfs/volumes/579f2eb5-e0d5763e-df1c-f48e38-c36596/pa2-rediscore01/vmx-zdump.000.
The VM was powered off automatically by VMware after the crash.
Our monitoring system detected the issue immediately. In a situation like this, we can either deploy a new configuration that points to a backup Redis instance (we have 4), or wait for the main instance to come back online.
Because the cause of the outage was identified very quickly, we were able to restart the VM right away and chose not to switch to a backup instance. Switching would have taken roughly as long as the restart.
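The trade-off described above can be sketched roughly as follows. This is only an illustration in Python, not actual tooling: the function name, the returned labels, and the rule that a tie favors restarting the main instance are all assumptions made for the example.

```python
def choose_recovery(restart_estimate_s: float, failover_estimate_s: float) -> str:
    """Pick the recovery path with the shorter estimated delay.

    Both arguments are hypothetical estimates in seconds. On a tie we
    assume restarting the main instance is preferred, because it avoids
    deploying a new configuration.
    """
    if restart_estimate_s <= failover_estimate_s:
        return "restart_primary"
    return "failover_to_backup"


# Illustrative values only, not measurements from this incident.
print(choose_recovery(240, 240))
print(choose_recovery(600, 240))
```

With equal estimates the sketch picks the restart, which matches the choice made here; a much slower restart would tip it toward a backup instance.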
After the VM was started again, the outage was over.
We are sorry about this issue. We will analyze the logs and try to prevent any further potential issues.
Oct 18, 15:37 CEST
The internal component has been restored. Everything is back to normal. We are monitoring closely.
Oct 18, 15:24 CEST
An internal component has crashed. We are working on a fix. We will give more details soon.
Oct 18, 15:20 CEST