Add configurable database health check with automatic restart on failure by rositsa-popova · Pull Request #38 · cloudfoundry/locket

rositsa-popova · 2026-03-04T08:31:13Z

Read the Contributing document.

Summary

Implements a configurable database health check for Locket that monitors database connectivity and automatically restarts the process when failures are detected. Follows the same pattern as BBS (cloudfoundry/bbs#134).

Resolves: cloudfoundry/diego-release#1105

Problem: Locket can enter a degraded state when the database becomes unresponsive, with no automatic recovery mechanism.

Solution:

- Periodic health check (UPSERT + SELECT on a dedicated table)
- Configurable interval, timeout, and failure threshold
- Automatic process exit on sustained failures, allowing BOSH to restart
- Disabled by default for backward compatibility

Test Results

All tests passed on a dev landscape with a PostgreSQL backend:

Test 1 - Backward Compatibility (Health Check Disabled): ✅

- Verified no health check activity when enable_db_health_check: false (default)
- Locket functions normally without any behavior changes

Test 2 - Health Check Enabled and Working: ✅

- Health check runner started successfully
- 54+ consecutive successful health checks observed
- Checks occurring every 10 seconds as configured
- Logs at INFO level: locket.db-health-check-runner.health-check-succeeded

Test 3 - Database Failure Detection: ✅

- Blocked database connectivity using iptables (PostgreSQL port 5432)
- Health check detected failure in 12 seconds
- Three consecutive timeouts recorded (5s each)
- Log message: "database-failure-detected-restarting-locket"
- Locket process exited and was restarted by BOSH monit
- Health checks resumed successfully after recovery

Test 4 - Timeout Protection: ✅

- Added a 10-second network delay to database traffic
- Health check timeout (5s) triggered correctly; the check did not wait the full 10s
- Three consecutive timeouts detected
- Locket restarted as expected
- System recovered after the delay was removed
- Confirms that health checks do not hang on a slow database

Test 5 - Configuration Parameters: ✅

- Verified all configuration values applied correctly:
  - enable_db_health_check: true
  - health_check_interval: 10s
  - health_check_timeout: 5s
  - health_check_failure_threshold: 3
- Measured actual interval: exactly 10 seconds between checks

Database Support

✅ MySQL 8.0 (tested in Docker)
✅ PostgreSQL (tested in Docker and on a dev landscape)

Backward Compatibility

Breaking Change? No

This feature is disabled by default and requires explicit operator opt-in via the enable_db_health_check BOSH property. When disabled (default), Locket behaves exactly as before with no changes to functionality or performance.

When enabled:

- Locket creates an additional table, locket_health_check (a simple 2-column table)
- Minimal performance overhead: one UPSERT + SELECT every 10 seconds
- Restarts only on sustained database failures (3+ consecutive failures)
- No breaking changes to existing APIs, interfaces, or behaviors

linux-foundation-easycla · 2026-03-04T08:31:22Z

❌ - login: @rositsa-popova / name: Rositsa Popova. The commit (312aa8d) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please submit a support request ticket.
❌ - login: @rositsa-popova / name: rositsa-popova. The commit (c86743f) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please submit a support request ticket.

rositsa-popova added 2 commits February 25, 2026 14:25

Add configurable DB health check to Locket.

312aa8d

Change health check success log from debug to info level.

c86743f

rositsa-popova requested a review from a team as a code owner March 4, 2026 08:31

rositsa-popova mentioned this pull request Mar 4, 2026

Add BOSH properties for Locket DB health check. cloudfoundry/diego-release#1111

Open

1 task

cf-foundation-community-automation bot added this to Application Runtime Platform Working Group Mar 4, 2026

cf-foundation-community-automation bot moved this to Inbox in Application Runtime Platform Working Group Mar 4, 2026

rositsa-popova wants to merge 2 commits into cloudfoundry:main from rositsa-popova:add-locket-db-health-check
