This project provides an automated deployment infrastructure for the CachetHQ status page system using Podman containers, with integrated Prometheus AlertManager webhook middleware for automatic incident management.
All application sources and dependencies are prepared automatically by the deployment script; you do not need to manually clone or manage any application repositories except this one.
This infrastructure deploys a complete status page system consisting of:
- Cachet Application: Open-source status page system (Laravel-based)
- PostgreSQL Database: Data persistence layer
- Traefik Reverse Proxy: HTTP/HTTPS routing and SSL termination
- AlertManager Webhook Middleware: Python service that receives Prometheus alerts and automatically manages Cachet incidents and component statuses
- Two-tier Component Architecture: Invisible components (per-target monitoring) + visible components (aggregated service status)
The middleware handles alert lifecycle (firing/resolved) and intelligently updates component statuses and incidents based on alert state and target criticality.
To deploy the infrastructure, ensure the following prerequisites are met:
- System Requirements:
- Bash: Ensure Bash is installed as the default shell.
- Podman: Install Podman for container management.
- Podman-Compose: Install Podman-Compose using the following commands to avoid issues with outdated versions:
sudo curl -L https://raw.githubusercontent.com/containers/podman-compose/main/podman_compose.py -o /usr/local/bin/podman-compose
sudo chmod +x /usr/local/bin/podman-compose
Note: The version available via apt (1.0.6) is outdated and contains bugs related to volume mounting.
- Python 3: Ensure Python 3 and pip3 are installed.
- curl: Required for testing webhook endpoints.
- htpasswd: Installable via apache2-utils, used for HTTP authentication.
- openssl: Required for generating random keys (e.g., APP_KEY).
- systemctl: Required for managing the Podman rootless socket (systemd-based systems).
- sed, grep, awk: Standard utilities for file and string manipulation.
By following these steps and ensuring the required tools are installed, you can successfully deploy the infrastructure and complete the setup process.
Copy the example environment file and edit it with your values, following the inline instructions:
cp .env.example .env
nano .env
Leave empty (auto-generated):
- APP_KEY
- CACHET_API_TOKEN
Authentication:
The webhook endpoint is protected by Basic Auth. You can configure the credentials by setting the WEBHOOK_BASIC_AUTH environment variable in .env.
The format is user:hash. Running htpasswd -nb user password prints a complete user:hash entry; an online generator (BCrypt, MD5, SHA1) also works.
If not set, the default credentials are admin:admin.
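As a rough illustration of the user:hash format, the sketch below builds an entry in the SHA1 scheme (the shape that htpasswd -nbs prints) using only the Python standard library. The function name make_basic_auth_entry is hypothetical and not part of this project; a BCrypt hash produced by htpasswd is generally the stronger choice, so treat this only as a picture of what the value looks like.

```python
import base64
import hashlib


def make_basic_auth_entry(user: str, password: str) -> str:
    """Return a user:hash entry in the htpasswd SHA1 scheme.

    The hash part is the literal prefix {SHA} followed by the
    base64-encoded SHA-1 digest of the password.
    """
    if ":" in user:
        raise ValueError("user name must not contain ':'")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{user}:{{SHA}}{encoded}"


# Hypothetical credentials, for illustration only.
entry = make_basic_auth_entry("admin", "password")
print(f"WEBHOOK_BASIC_AUTH={entry}")
```

The printed line can be pasted into .env as-is, assuming the deployment accepts SHA1 entries as the generator list above suggests.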
Note for rootless Podman and privileged ports (80/443):
By default, non-root users cannot bind to ports below 1024 (such as 80 and 443). If you want to expose Traefik or other services directly on these ports in rootless mode, you must configure the following kernel parameter on your host system:
sudo sysctl net.ipv4.ip_unprivileged_port_start=80
To make this change persistent after reboot, add this line to /etc/sysctl.conf:
net.ipv4.ip_unprivileged_port_start=80
and apply with:
sudo sysctl -p
Do not set this in any project file (.env, docker-compose.yml, etc.): it must be configured at the OS level.
Copy the example configuration and edit with your infrastructure details:
cp middleware/prometheus.yml.example middleware/prometheus.yml
nano middleware/prometheus.yml
Copy the example configuration and edit to define your component groups:
cp middleware/config.json.example middleware/config.json
nano middleware/config.json
Configure the groups_configuration section:
{
  "new_incident_name": "System %s is experiencing issues",
  "new_incident_message": "We are investigating an issue affecting this service.",
  "resolved_incident_message": "The issue has been resolved.",
  "cachet_per_page_param": 50,
  "groups_configuration": [
    {
      "status_page_group": "Web Services",
      "status_page_components": [
        "Web Server",
        "Load Balancer"
      ]
    },
    {
      "status_page_group": "Database Services",
      "status_page_components": [
        "Database",
        "Backend"
      ]
    }
  ]
}
Configuration parameters:
- status_page_group: Name of the group that will be created on the status page
- status_page_components: Array of visible component names that belong to this group
Each visible component referenced in your Prometheus labels must be mapped to a group in this configuration. The setup-components.py script will use this mapping to automatically create groups and organize components during initialization.
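The mapping described above can be pictured as a simple lookup from visible component name to group name. The following is a minimal sketch, not the logic of setup-components.py: the function build_group_lookup is hypothetical, the config is inlined rather than read from middleware/config.json, and it assumes the %s placeholder in new_incident_name is filled with the component name.

```python
import json

# Inlined excerpt of the example config shown above.
CONFIG_TEXT = """
{
  "new_incident_name": "System %s is experiencing issues",
  "groups_configuration": [
    {"status_page_group": "Web Services",
     "status_page_components": ["Web Server", "Load Balancer"]},
    {"status_page_group": "Database Services",
     "status_page_components": ["Database", "Backend"]}
  ]
}
"""


def build_group_lookup(config: dict) -> dict:
    """Map each visible component name to the group that contains it."""
    lookup = {}
    for group in config["groups_configuration"]:
        for component in group["status_page_components"]:
            lookup[component] = group["status_page_group"]
    return lookup


config = json.loads(CONFIG_TEXT)
group_of = build_group_lookup(config)
print(group_of["Database"])

# Assumption: %s stands for the affected component's name.
incident_title = config["new_incident_name"] % "Database"
print(incident_title)
```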
Required Prometheus labels for status page integration:
Your Prometheus targets configuration must include these custom labels:
prometheus_targets:
  node_exporter:
    - targets:
        - '192.168.1.10:9100'
        - '192.168.1.11:9100'
      labels:
        status_page_alert: true                  # NEW: Enable status page monitoring
        status_page_component: 'Web Server'      # NEW: Visible component name(s)
        status_page_critical_target: false       # NEW: Mark as critical target (optional)
    - targets:
        - '192.168.1.20:9100'
      labels:
        status_page_alert: true
        status_page_component: 'Database, Backend'  # Multiple components supported
        status_page_critical_target: true           # Critical target forces major outage
New Label Definitions:
- status_page_alert (required): Set to true to enable monitoring for this target
- status_page_component (required): Name(s) of visible component(s) affected by this target (comma-separated for multiple)
- status_page_critical_target (optional): Set to true to mark target as critical. When a critical target fails, the entire visible component is set to major outage regardless of other targets' status. Default: false
Important: Make sure all component names used in status_page_component labels are also defined in the groups_configuration section of middleware/config.json.
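A consistency check along these lines can catch a mismatch before deployment. The sketch below is an assumption about how such a check might look, not code shipped with the project: find_unmapped_components is a hypothetical name, and the target groups are written as Python dictionaries mirroring the YAML above so the example needs no YAML parser.

```python
def find_unmapped_components(target_groups: list, groups_configuration: list) -> list:
    """Return component names used in labels but missing from the group config."""
    known = {
        component
        for group in groups_configuration
        for component in group["status_page_components"]
    }
    missing = set()
    for entry in target_groups:
        labels = entry.get("labels", {})
        if not labels.get("status_page_alert"):
            continue  # target is not published on the status page
        raw = labels.get("status_page_component", "")
        for name in (part.strip() for part in raw.split(",")):
            if name and name not in known:
                missing.add(name)
    return sorted(missing)


# Data mirroring the examples above, plus one deliberately unmapped name.
target_groups = [
    {"targets": ["192.168.1.10:9100"],
     "labels": {"status_page_alert": True, "status_page_component": "Web Server"}},
    {"targets": ["192.168.1.20:9100"],
     "labels": {"status_page_alert": True, "status_page_component": "Database, Cache"}},
]
groups_configuration = [
    {"status_page_group": "Web Services",
     "status_page_components": ["Web Server", "Load Balancer"]},
    {"status_page_group": "Database Services",
     "status_page_components": ["Database", "Backend"]},
]

unmapped = find_unmapped_components(target_groups, groups_configuration)
print(unmapped)
```

Here "Cache" is reported because no group lists it; an empty result means every labelled component has a group.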
For local development (without HTTPS), ensure your .env is properly configured:
ENVIRONMENT=local
CACHET_DOMAIN=localhost
WEBHOOK_DOMAIN=localhost
APP_ENV=local
APP_DEBUG=true
APP_URL=http://localhost:8080
ASSET_URL=http://localhost:8080
CERT_RESOLVER=
Run the deployment script:
./deploy.sh
The script will automatically:
- Validate configuration
- Prepare all application sources and dependencies (no manual cloning required)
- Generate the Laravel APP_KEY automatically
- Configure webhook authentication
- Build container images
- Start Traefik, PostgreSQL, Cachet and Middleware services
- Run database migrations
- Create the admin user and generate the API token
- Set up components from the Prometheus configuration and config.json, if requested
- Verify the webhook endpoint
At the end of the process, the Cachet status page and middleware will be fully operational.
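To exercise the webhook by hand after deployment, a request like the one sketched below may be useful. It follows the general shape of an Alertmanager webhook payload and uses the default admin:admin credentials mentioned earlier. The base URL, the alert name, and the exact set of labels the middleware reads are assumptions; build_test_alert_request is a hypothetical helper, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_test_alert_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /webhook carrying one firing alert."""
    payload = {
        "version": "4",
        "status": "firing",
        "alerts": [
            {
                "status": "firing",
                "labels": {
                    "alertname": "InstanceDown",  # hypothetical alert name
                    "instance": "192.168.1.10:9100",
                    "status_page_alert": "true",
                    "status_page_component": "Web Server",
                },
                "annotations": {"summary": "Test alert"},
            }
        ],
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/webhook",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Assumed local address; adjust to your WEBHOOK_DOMAIN and port.
request = build_test_alert_request("http://localhost:8080", "admin", "admin")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=10)
```

Sending the same payload with "status": "resolved" would be the natural way to check the resolve path, assuming the middleware keys on that field.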
The middleware implements a two-tier component architecture:
Invisible Components (one per monitored target):
- Name format: <instance> | <component_names>
- Example: 192.168.1.10:9100 | Web Server, Database
- Status: Operational (1) or Major Outage (4)
- Purpose: Track individual target health
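The naming scheme above can be reproduced with a few lines of Python. This is only a sketch of the format as documented here; invisible_component_name is a hypothetical helper, and the whitespace normalization around commas is an assumption about how the label value is cleaned up.

```python
def invisible_component_name(instance: str, component_label: str) -> str:
    """Build '<instance> | <component_names>' from a status_page_component label."""
    names = [part.strip() for part in component_label.split(",") if part.strip()]
    return f"{instance} | {', '.join(names)}"


name = invisible_component_name("192.168.1.10:9100", "Web Server,Database")
print(name)
```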
Visible Components (aggregated by service):
- Name: Service name (e.g., Web Server, Database)
- Status: Calculated by aggregating invisible component statuses
- Incident: Created/closed when status changes to/from Major Outage
Visible component status is determined by:
- Major Outage (4): All invisible components down OR any critical target down
- Partial Outage (3): Mixed statuses (some up, some down) without critical targets down
- Operational (1): All invisible components operational
Critical targets (marked with status_page_critical_target: true) have priority: if any critical target fails, the entire visible component immediately goes to Major Outage.
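The aggregation rules above can be expressed as a small function. The sketch below is one possible reading of those rules, not the middleware's actual code: aggregate_status and the tuple layout are hypothetical, and treating a component with no targets as Operational is an assumption.

```python
OPERATIONAL = 1
PARTIAL_OUTAGE = 3
MAJOR_OUTAGE = 4


def aggregate_status(targets: list) -> int:
    """Compute a visible component's status.

    targets holds one (status, is_critical) tuple per invisible component,
    where status is OPERATIONAL or MAJOR_OUTAGE.
    """
    if not targets:
        return OPERATIONAL  # assumption: nothing monitored means nothing down
    down_flags = [critical for status, critical in targets if status == MAJOR_OUTAGE]
    if not down_flags:
        return OPERATIONAL
    # A failed critical target, or every target failing, is a major outage.
    if any(down_flags) or len(down_flags) == len(targets):
        return MAJOR_OUTAGE
    return PARTIAL_OUTAGE


print(aggregate_status([(MAJOR_OUTAGE, False), (OPERATIONAL, False)]))
print(aggregate_status([(MAJOR_OUTAGE, True), (OPERATIONAL, False)]))
```

The first call has one of two non-critical targets down (Partial Outage); the second has a critical target down, which takes priority (Major Outage).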
To monitor received webhook requests and component status changes from the middleware container, you can use the following commands:
To view all requests received on the /webhook endpoint:
podman logs <container-name> 2>/dev/null | grep "\[WEBHOOK_REQUEST\]"
Example output:
[WEBHOOK_REQUEST] source_ip=10.0.0.5 headers=[Host: example.com; User-Agent: curl/7.68.0; Content-Type: application/json] body={...}
To view all component status changes (both visible and invisible):
podman logs <container-name> 2>/dev/null | grep "\[COMPONENT_STATUS_CHANGE\]"
Example output:
[COMPONENT_STATUS_CHANGE] component="Database" old_status=1 (Operational) new_status=4 (Outage)
Replace <container-name> with the actual container name (e.g., cachet-middleware).
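For anything beyond a quick grep, the status-change lines can be parsed into structured records. The pattern below is derived solely from the single example output shown above, so real log lines may differ; parse_status_change is a hypothetical helper.

```python
import re

# Pattern inferred from the documented example line only.
STATUS_CHANGE_RE = re.compile(
    r'\[COMPONENT_STATUS_CHANGE\] component="(?P<component>[^"]+)" '
    r'old_status=(?P<old>\d+) \([^)]*\) '
    r'new_status=(?P<new>\d+) \([^)]*\)'
)


def parse_status_change(line: str):
    """Return a dict for a status-change log line, or None if it does not match."""
    match = STATUS_CHANGE_RE.search(line)
    if match is None:
        return None
    return {
        "component": match["component"],
        "old_status": int(match["old"]),
        "new_status": int(match["new"]),
    }


sample = (
    '[COMPONENT_STATUS_CHANGE] component="Database" '
    'old_status=1 (Operational) new_status=4 (Outage)'
)
record = parse_status_change(sample)
print(record)
```

Feeding it the output of the podman logs command line by line would yield one record per status change, with unrelated lines returning None.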
- middleware/README.md: Detailed middleware architecture and API
- Cachet Documentation
- Prometheus Documentation
- Traefik Documentation