Errors running queries and starting/resuming clusters
Incident Report for Starburst Galaxy
Resolved
This incident has been resolved.
Posted Oct 26, 2023 - 16:16 UTC
Update
All Galaxy systems continue to be operational. We are still monitoring this incident until we get an RCA from the cloud provider.
Posted Oct 26, 2023 - 16:06 UTC
Update
We are continuing to monitor for further issues and to work with the cloud provider to obtain an RCA and drive toward resolution.
Posted Oct 25, 2023 - 22:48 UTC
Monitoring
Our cloud provider has mitigated the incident. We are continuing to monitor the situation, but our services should now be operational again.
Posted Oct 25, 2023 - 19:09 UTC
Update
Our cloud provider has identified the issue and is working on a fix. We are continuing to monitor on our end and will post updates as systems become operational again.
Posted Oct 25, 2023 - 16:57 UTC
Identified
Due to an incident at our upstream cloud provider, we are experiencing a partial outage. Any new or resumed Galaxy clusters may be inaccessible from clients. We are engaged with the cloud provider and are working with them on a resolution. Existing running clusters should continue to function.
Posted Oct 25, 2023 - 15:30 UTC
Monitoring
The issue has now been mitigated. We implemented a workaround by restarting cloudflared in the affected Trino planes. We are continuing to monitor the situation.
Posted Oct 25, 2023 - 02:54 UTC
Identified
The root cause has been identified, and our engineers are working on mitigating the issue. It was caused by one of our providers not updating its configuration from the upstream. We are engaging with the provider to resolve the issue at the source.
Posted Oct 25, 2023 - 01:14 UTC
Update
We are continuing to investigate this issue.
Posted Oct 24, 2023 - 19:45 UTC
Investigating
We have received several reports of queries failing intermittently and sometimes taking a long time to run. We are aware of the issue and are currently investigating. We will post further updates once a root cause is identified.
Posted Oct 24, 2023 - 18:16 UTC
This incident affected: Starburst Galaxy UI, Clusters (AWS) (AWS ap-northeast1, AWS ap-southeast1, AWS ap-southeast2, AWS ca-central1, AWS eu-central1, AWS eu-west1, AWS eu-west2, AWS eu-west3, AWS us-east1, AWS us-east2, AWS us-west1, AWS us-west2), Clusters (Azure) (Azure centralindia, Azure eastus, Azure eastus2, Azure francecentral, Azure southcentralus, Azure westeurope), and Clusters (GCP) (GCP asia-south1, GCP asia-southeast1, GCP europe-west2, GCP europe-west6, GCP southamerica-east1, GCP us-central1, GCP us-east1).