@cfsmp3 cfsmp3 commented Dec 23, 2025

Summary

Adds health check endpoints to enable deployment verification and monitoring:

  • /health - Returns 200 if all critical checks pass (database, config), 503 otherwise
  • /health/live - Liveness probe, always returns 200 if Flask is responding
  • /health/ready - Readiness probe, same as /health

Why This Is Needed

Currently, the deployment pipeline has no way to verify that a deployment was successful. If code is deployed that crashes on startup or can't connect to the database, the platform becomes "bricked" with no automatic recovery.

These endpoints enable:

  1. Post-deployment verification: After deploying, check /health to confirm the app started successfully
  2. Automatic rollback: If the health check fails, roll back to the previous version
  3. Load balancer integration: Standard health endpoints for nginx/HAProxy
  4. Container orchestration: Kubernetes liveness/readiness probes
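
A minimal sketch of the post-deployment verification step, assuming a deploy script that polls /health a few times before deciding whether to roll back. The function names, the retry count, and the delay are hypothetical and are not taken from this PR; only the 200/503 contract comes from the description above.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for `url`, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (for example 503 from a failing check) land here.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar errors.
        return None


def wait_until_healthy(url, attempts=5, delay=2.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll `url` until it returns 200; give up after `attempts` tries."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give the app time to finish starting
    return False
```

A deploy script could call `wait_until_healthy()` with the deployed instance's /health URL and trigger a rollback when it returns False. The `fetch` and `sleep` parameters exist only so the polling logic can be exercised without a network.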

Test Plan

  • /health returns 200 when database and config are OK
  • /health returns 503 when database fails
  • /health returns 503 when config is missing
  • /health/live always returns 200
  • /health/ready behaves the same as /health
  • Verify CI passes

Example Response

{
  "status": "healthy",
  "timestamp": "2025-12-23T15:30:00Z",
  "checks": {
    "database": {"status": "ok"},
    "config": {"status": "ok"}
  }
}
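
For illustration, a response of this shape could be assembled roughly as follows. This is a framework-agnostic sketch, not the code merged in this PR: `run_health_checks` is a hypothetical helper, and the "unhealthy" status string is an assumption, since only the healthy response is shown above.

```python
from datetime import datetime, timezone


def run_health_checks(checks):
    """Run each named check and return a (payload, http_status) pair.

    `checks` maps a check name to a zero-argument callable that returns a
    dict such as {"status": "ok"} or {"status": "error", "message": "..."}.
    """
    results = {name: check() for name, check in checks.items()}
    healthy = all(result.get("status") == "ok" for result in results.values())
    payload = {
        # "unhealthy" is an assumed value; the PR only shows "healthy".
        "status": "healthy" if healthy else "unhealthy",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "checks": results,
    }
    # 200 when every check passes, 503 otherwise, as described in the summary.
    return payload, (200 if healthy else 503)


payload, status = run_health_checks({
    "database": lambda: {"status": "ok"},
    "config": lambda: {"status": "ok"},
})
```

In a Flask view the pair would typically be returned as `jsonify(payload), status`.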

Related

This is part of a larger effort to make deployments more reliable. Future PRs will:

  1. Add deployment scripts with pre/post checks
  2. Update GitHub Actions workflow to use health checks and auto-rollback

🤖 Generated with Claude Code

codecov bot commented Dec 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.02%. Comparing base (53b3d7f) to head (ab0cf74).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #948      +/-   ##
==========================================
+ Coverage   86.88%   87.02%   +0.14%     
==========================================
  Files          35       36       +1     
  Lines        3759     3800      +41     
  Branches      767      774       +7     
==========================================
+ Hits         3266     3307      +41     
  Misses        355      355              
  Partials      138      138              
Flag Coverage Δ
unittests 87.02% <100.00%> (+0.14%) ⬆️

        db.remove()
        return {'status': 'ok'}
    except Exception as e:
        return {'status': 'error', 'message': str(e)}
Member

We don't want to just dump the exception message here; it might contain sensitive information. Better to log the exception and return a generic error message.

Contributor Author

Good catch! Fixed - now logging the exception server-side with g.log.exception() and returning a generic "Database connection failed" message to the client.
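
The pattern agreed on in this thread can be sketched as below. This is an illustration under assumptions, not the merged code: it uses the standard library `logging` module as a stand-in for the application's logger, and `probe` is a hypothetical callable that touches the database.

```python
import logging

# Stand-in for the application's logger; the name "health" is arbitrary.
logger = logging.getLogger("health")


def check_database(probe):
    """Run `probe` and report the result without leaking exception details."""
    try:
        probe()
        return {"status": "ok"}
    except Exception:
        # The full traceback goes to the server-side log only.
        logger.exception("Health check: database probe failed")
        # The client sees a fixed message with no exception text in it.
        return {"status": "error", "message": "Database connection failed"}
```

The exception text, which might include connection strings or credentials, never reaches the response body, while operators can still find the cause in the logs.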

@canihavesomecoffee
Member

Also, you might want to rebase the branch :)

cfsmp3 and others added 4 commits December 23, 2025 23:21
Add /health, /health/live, and /health/ready endpoints to enable
deployment verification and monitoring:

- /health: Returns 200 if database and config checks pass, 503 otherwise
- /health/live: Always returns 200 if Flask is responding (liveness probe)
- /health/ready: Same as /health (readiness probe)

These endpoints are essential for implementing safe deployments with
automatic rollback - the deployment pipeline can verify the app is
healthy after deploying new code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Log database exceptions server-side with g.log.exception()
- Return generic "Database connection failed" message to client
- Add comment explaining db.remove() connection pool cleanup
- Update test to expect generic error message

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Contributor Author

cfsmp3 commented Dec 23, 2025

Rebased on master and addressed review comments in the latest commit. CI is now running.

g.log is only available during HTTP requests (set in before_request),
but current_app.logger works in any app context including tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@canihavesomecoffee canihavesomecoffee merged commit 535719e into master Dec 24, 2025
6 checks passed
@cfsmp3 cfsmp3 deleted the feat/health-endpoint branch December 24, 2025 11:26