
13th April 2026 Outage

Generated by Nathan Wuiske via incident.io on April 28, 2026 2:16 PM. All timestamps are local to Australia/Brisbane.

Summary

A frontend deployment took Astalty down: users saw a blank page instead of the application.

Incident Timeline

Date: 2026-04-13

Time      Event

13:22:49  Incident reported by Nathan Wuiske. Severity: Critical. Status: Investigating.

13:24:33  Nathan Wuiske shared an update. Status: Investigating → Fixing. Custom timestamp "Identified at" recorded.

13:32:02  Incident resolved and closed. Nathan Wuiske shared an update. Status: Fixing → Closed. Custom timestamp "Fixed at" recorded.

Root Cause Analysis

Root cause

A circular dependency between two frontend data validation schemas caused the application to crash immediately on load. Two schema modules each referenced the other, creating a loop. When the browser attempted to initialise the application, one schema was still undefined when the other tried to use it, throwing a runtime error before the app could render. This resulted in a blank white page for every user.

Contributing factors

  • No browser-level smoke test in CI/CD: The existing CI pipeline (type checking and unit tests) does not boot the application in a real browser. Circular dependencies that pass type checking can still crash at runtime. There was no automated test to verify the app actually loads before deploying.

  • Growing complexity of the schema layer: As the platform has grown, the frontend data validation layer has accumulated many cross-references between domains. This increases the likelihood of circular dependencies being introduced without the developer being aware.

Technical analysis

A frontend deployment introduced a circular dependency between the plan services and participant record type schemas. When the browser loaded the application, JavaScript's module system evaluated the two schema modules one after the other rather than together. Because each depended on the other, the module that was evaluated first referenced a schema the other had not yet initialised. This caused a runtime TypeError that prevented the entire Vue application from mounting, rendering a blank white page instead of the UI. Critically, the TypeScript compiler and unit test suite both passed: the circular dependency only manifested when the application was booted in a real browser, which was not part of the CI pipeline.
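
The failure mode can be reproduced in miniature. The sketch below is an illustration only, not the Astalty code: it assumes a simplified CommonJS-style loader of the kind bundlers often emulate, and the module ids, schema names, and `describe` method are hypothetical. It shows how a type cast can satisfy the compiler while the value is still undefined when the module body runs.

```typescript
// Hypothetical, simplified module loader used only to illustrate the failure.
type ModuleExports = Record<string, unknown>;
type ModuleFactory = (
  requireModule: (id: string) => ModuleExports,
  moduleExports: ModuleExports,
) => void;

function createLoader(factories: Map<string, ModuleFactory>) {
  const cache = new Map<string, ModuleExports>();
  const requireModule = (id: string): ModuleExports => {
    const cached = cache.get(id);
    // A module that is still mid-evaluation is returned half-populated.
    if (cached) return cached;
    const factory = factories.get(id);
    if (!factory) throw new Error(`Unknown module: ${id}`);
    const moduleExports: ModuleExports = {};
    cache.set(id, moduleExports);
    factory(requireModule, moduleExports);
    return moduleExports;
  };
  return requireModule;
}

interface Schema {
  describe(): string;
}

// Two hypothetical schema modules that import each other.
const factories = new Map<string, ModuleFactory>([
  ["planService", (requireModule, moduleExports) => {
    const { participantRecordTypeSchema } = requireModule("participantRecordType");
    const inner = participantRecordTypeSchema as Schema;
    moduleExports.planServiceSchema = {
      describe: () => `planService(${inner.describe()})`,
    };
  }],
  ["participantRecordType", (requireModule, moduleExports) => {
    // planService has not finished evaluating, so its export is undefined here.
    const { planServiceSchema } = requireModule("planService");
    const inner = planServiceSchema as Schema; // the cast satisfies the type checker
    const description = inner.describe(); // eager use at load time throws a TypeError
    moduleExports.participantRecordTypeSchema = {
      describe: () => `participant(${description})`,
    };
  }],
]);

// Stand-in for application start-up: loading the entry module triggers the cycle.
function bootApp(): string {
  try {
    createLoader(factories)("planService");
    return "mounted";
  } catch (error) {
    return error instanceof TypeError
      ? "TypeError during module initialisation"
      : "other error";
  }
}
```

Calling `bootApp()` here ends in the TypeError branch, because the second module uses the first module's export before it has been assigned. Nothing in the types flags the problem, which matches the report's observation that type checking passed.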

Impact Assessment

Customer impact

All users experienced a complete service outage. Navigating to Astalty yielded a blank white page with no interactive UI. Users could not access any functionality including scheduling, participant management, finance, or any other feature. The outage affected 100% of users across all roles (admins, coordinators, support workers) as the failure occurred before any part of the interface could render.

System impact

The frontend application was entirely non-functional. The backend API remained operational and unaffected -- no data was lost or corrupted. The failure was isolated to the frontend: the JavaScript bundle loaded but crashed during initialisation before the application could start.

Business analysis

Complete loss of access to the platform for all users during the outage window. Users were unable to view or manage schedules, participant records, invoicing, or any operational workflows. Support workers in the field could not access shift information or participant details.

Resolution Steps

Resolution summary

Fixed the circular schema dependency by deferring the evaluation of the cross-referenced schema (using Zod's lazy evaluation pattern). Added Playwright end-to-end smoke tests to CI/CD to prevent this class of failure from reaching production again.
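
The deferral idea can be sketched without any dependencies. The example below does not use Zod itself: the `Schema` interface and the `text`, `object`, and `lazy` helpers are hand-rolled stand-ins, and the schema and field names are hypothetical. Like Zod's `z.lazy`, the `lazy` helper takes a function that returns a schema, so the reference is only resolved when data is parsed rather than when the module loads.

```typescript
// Minimal stand-in for a schema library (an assumption for illustration, not Zod).
interface Schema {
  parse(input: unknown): unknown;
}

const text: Schema = {
  parse(input) {
    if (typeof input !== "string") throw new TypeError("expected a string");
    return input;
  },
};

function object(shape: Record<string, Schema>): Schema {
  return {
    parse(input) {
      if (typeof input !== "object" || input === null) {
        throw new TypeError("expected an object");
      }
      const record = input as Record<string, unknown>;
      const result: Record<string, unknown> = {};
      for (const [key, field] of Object.entries(shape)) {
        result[key] = field.parse(record[key]);
      }
      return result;
    },
  };
}

// Defers the lookup: getSchema runs at parse time, not at definition time.
function lazy(getSchema: () => Schema): Schema {
  return { parse: (input) => getSchema().parse(input) };
}

// Defined first, before participantRecordTypeSchema exists. This mirrors the
// position of whichever module in an import cycle is evaluated first.
const planServiceSchema: Schema = object({
  name: text,
  participantRecordType: lazy(() => participantRecordTypeSchema),
});

const participantRecordTypeSchema: Schema = object({ label: text });

// By the time parse is called, both schemas have been initialised.
const parsedPlanService = planServiceSchema.parse({
  name: "Support coordination",
  participantRecordType: { label: "Self-managed" },
});
```

The trade-off is that a lazy reference moves any resolution failure from load time to first use, so a smoke test that actually exercises the schema remains useful alongside the fix.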

Detailed steps

  • Step 1: Identified the root cause by inspecting the browser console error, which pointed to an undefined schema reference during app initialisation. Traced this back to the circular dependency between the two schema modules.

  • Step 2: Applied a one-line fix to defer the evaluation of the cross-referenced schema, breaking the circular dependency. The schema is now resolved at the time it is used rather than at the time the module loads.

  • Step 3: Deployed the fix to production, restoring full application functionality.

  • Step 4: Added Playwright end-to-end smoke tests that boot the application in a real browser and verify both the login page and an authenticated dashboard page load without errors. These tests include a console error guard that fails the test on any runtime error.

  • Step 5: Added a CI workflow that runs the Playwright smoke tests on every pull request and every push to deployment branches, gating deployments on a successful app boot.
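
The console error guard mentioned in Step 4 could look roughly like the sketch below. This is not the team's actual test code. `PageLike` and `ConsoleMessageLike` are hypothetical interfaces modelling only the `console` and `pageerror` events that Playwright's `Page` emits, and `FakePage` is a test double so that the example runs without a browser.

```typescript
// Hypothetical subset of a browser page object: just the two events the guard needs.
interface ConsoleMessageLike {
  type(): string;
  text(): string;
}

interface PageLike {
  on(event: "console", listener: (msg: ConsoleMessageLike) => void): unknown;
  on(event: "pageerror", listener: (error: Error) => void): unknown;
}

// Records console.error output and uncaught exceptions seen while the page boots.
function attachConsoleErrorGuard(page: PageLike) {
  const errors: string[] = [];
  page.on("console", (msg) => {
    if (msg.type() === "error") errors.push(`console.error: ${msg.text()}`);
  });
  page.on("pageerror", (error) => {
    errors.push(`uncaught: ${error.message}`);
  });
  return {
    errors,
    // Call at the end of a smoke test; throws if anything was recorded.
    assertClean(): void {
      if (errors.length > 0) {
        throw new Error(`Runtime errors during boot:\n${errors.join("\n")}`);
      }
    },
  };
}

// Test double standing in for a real browser page.
class FakePage implements PageLike {
  private consoleListeners: Array<(msg: ConsoleMessageLike) => void> = [];
  private errorListeners: Array<(error: Error) => void> = [];

  on(event: "console", listener: (msg: ConsoleMessageLike) => void): this;
  on(event: "pageerror", listener: (error: Error) => void): this;
  on(
    event: "console" | "pageerror",
    listener: ((msg: ConsoleMessageLike) => void) | ((error: Error) => void),
  ): this {
    if (event === "console") {
      this.consoleListeners.push(listener as (msg: ConsoleMessageLike) => void);
    } else {
      this.errorListeners.push(listener as (error: Error) => void);
    }
    return this;
  }

  emitConsole(type: string, text: string): void {
    for (const listener of this.consoleListeners) {
      listener({ type: () => type, text: () => text });
    }
  }

  emitPageError(error: Error): void {
    for (const listener of this.errorListeners) listener(error);
  }
}
```

In a real smoke test the guard would presumably be attached before navigating to the login or dashboard page and checked after the page has rendered, so that an initialisation TypeError like the one in this incident fails the run.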

Verification

  • Confirmed the fix by loading the application in a browser and verifying the app rendered correctly with no console errors.

  • Validated the new smoke tests catch the regression by temporarily reintroducing the circular dependency, confirming the tests fail, then restoring the fix.

  • The Playwright CI workflow now runs on all frontend changes going forward, preventing this class of failure from reaching production.

Lessons Learned

What went well

  • Root cause was identified and fixed quickly once the team investigated the browser console errors.

  • The fix was minimal and targeted -- a single-line change to defer the schema evaluation.

  • The team immediately invested in prevention by building end-to-end smoke tests and CI integration on the same day, rather than deferring it.

Areas for improvement

  • The CI pipeline lacked any browser-level test that would have caught the application failing to load before deployment. Type checking and unit tests alone are insufficient to verify the app actually boots.

  • There is no automated detection of circular dependencies in the frontend codebase. Linting rules or dependency analysis tooling could flag these at PR time before they reach production.

  • The frontend schema layer has grown in complexity without a documented convention for handling cross-domain references safely, making it easy to inadvertently introduce circular dependencies.
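
Automated detection of the kind described above amounts to finding a cycle in the import graph. The sketch below is a minimal depth-first search over a hand-written graph. The `ImportGraph` type, the `findImportCycle` function, and the module paths are hypothetical, and a real check would first need to extract the graph from the source files. Existing tools such as the `import/no-cycle` rule from eslint-plugin-import or madge's `--circular` option cover this ground; the report does not say which approach, if any, was chosen.

```typescript
// Maps each module to the modules it imports (hypothetical, hand-written input).
type ImportGraph = Record<string, string[]>;

// Returns the first import cycle found as a path such as [a, b, a], or null if none exists.
function findImportCycle(graph: ImportGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  const visit = (moduleId: string): string[] | null => {
    state.set(moduleId, "visiting");
    stack.push(moduleId);
    for (const dependency of graph[moduleId] ?? []) {
      const dependencyState = state.get(dependency);
      if (dependencyState === "visiting") {
        // Back edge: the cycle runs from the dependency's stack position to here.
        return [...stack.slice(stack.indexOf(dependency)), dependency];
      }
      if (dependencyState === undefined) {
        const cycle = visit(dependency);
        if (cycle) return cycle;
      }
    }
    stack.pop();
    state.set(moduleId, "done");
    return null;
  };

  for (const moduleId of Object.keys(graph)) {
    if (!state.has(moduleId)) {
      const cycle = visit(moduleId);
      if (cycle) return cycle;
    }
  }
  return null;
}

// Example graph with the same shape of problem as this incident.
const exampleGraph: ImportGraph = {
  "schemas/planService": ["schemas/participantRecordType", "schemas/shared"],
  "schemas/participantRecordType": ["schemas/planService"],
  "schemas/shared": [],
};

const detectedCycle = findImportCycle(exampleGraph);
```

A CI step could fail the build whenever the result is not null and print the path, giving the author the exact chain of imports to break.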

Key takeaways

  • Type checking and unit tests are not sufficient to catch all runtime failures -- browser-level smoke tests are essential for frontend deployments.

  • Circular dependencies can pass type checking and unit tests yet be catastrophic at runtime. Automated detection should be added to the CI pipeline.

  • A convention for safely handling cross-domain schema references should be documented and enforced to reduce the likelihood of recurrence.