Security

This document describes the security posture, configuration, and guidelines for the server monitoring system.

Secret Management

Django Secret Key

The DJANGO_SECRET_KEY environment variable is required. The application raises a RuntimeError at startup if it is not set (config/settings.py:35).

# Generate a production-grade key
python -c "from django.core.management.utils import get_random_secret_key; print(get_random_secret_key())"
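
The startup check described above can be sketched roughly as follows. This is an illustration, not the actual contents of config/settings.py: the helper name `load_secret_key` is hypothetical, and treating an empty value the same as a missing one is an assumption.

```python
import os


def load_secret_key(environ=os.environ):
    """Return DJANGO_SECRET_KEY, failing fast when it is missing.

    Assumption: an empty string is treated the same as an unset variable.
    """
    key = environ.get("DJANGO_SECRET_KEY", "")
    if not key:
        raise RuntimeError("DJANGO_SECRET_KEY environment variable is required")
    return key
```

In a settings module this would be used as `SECRET_KEY = load_secret_key()`, so a misconfigured deployment stops at startup instead of running with a weak or default key.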

Environment Variables

Secrets are read from environment variables; python-dotenv (config/env.py) populates them from .env files. Key rules:

Never commit .env files — only .env.sample is tracked
Existing shell environment variables always take precedence (override=False)
.env.dev is loaded only when DJANGO_ENV=dev
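
The precedence rules above can be illustrated with a small standard-library sketch. It does not call python-dotenv; it only mimics the documented `override=False` behaviour on an already-parsed set of values. The function names are hypothetical.

```python
import os


def apply_env_file(file_values, environ=os.environ):
    """Copy parsed .env values into environ without overriding existing keys.

    This mirrors the effect of override=False: a value already set in the
    shell wins over the value from the file.
    """
    for name, value in file_values.items():
        environ.setdefault(name, value)
    return environ


def should_load_dev_file(environ=os.environ):
    """.env.dev is only considered when DJANGO_ENV=dev."""
    return environ.get("DJANGO_ENV") == "dev"
```

For example, with `DJANGO_DB_PASSWORD` already exported in the shell, a different value in the file is ignored, while variables missing from the shell are filled in from the file.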

Security-sensitive variables:

Variable	Purpose	Required
`DJANGO_SECRET_KEY`	Django cryptographic signing	Yes (enforced)
`DJANGO_DB_PASSWORD`	Database credentials	When using MySQL/PostgreSQL
`CELERY_BROKER_URL`	Redis connection (may contain password)	When using Celery

DB-Stored Secrets

API keys and credentials for notification channels and intelligence providers are stored in database JSON fields:

NotificationChannel.config — webhook URLs, SMTP passwords, API keys
IntelligenceProvider.config — AI provider API keys

In production deployments:

Use secret references rather than raw values where possible
Restrict database access to the Django application user
Consider encrypting sensitive fields at the application layer for high-security environments

Django Security Configuration

Middleware Stack

The following security middleware is enabled (config/settings.py:64-72):

Middleware	Protection
`SecurityMiddleware`	HTTPS redirects, HSTS, `X-Content-Type-Options: nosniff` header
`CsrfViewMiddleware`	Cross-site request forgery protection
`AuthenticationMiddleware`	Session-based authentication
`XFrameOptionsMiddleware`	Clickjacking protection

Password Validation

All four Django password validators are enabled:

UserAttributeSimilarityValidator
MinimumLengthValidator
CommonPasswordValidator
NumericPasswordValidator

HTTPS Hardening (Production)

The following settings should be enabled in production environments via environment variables or a production settings module:

SECURE_SSL_REDIRECT = True
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
SECURE_HSTS_SECONDS = 31536000  # 1 year
SECURE_HSTS_INCLUDE_SUBDOMAINS = True
SECURE_HSTS_PRELOAD = True
SECURE_CONTENT_TYPE_NOSNIFF = True
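
One way to drive these settings from the environment is sketched below. The switch name `DJANGO_HTTPS_ONLY` and both helper functions are assumptions made for illustration; the project may use different variable names or a separate production settings module instead.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Interpret 1/true/yes/on (case-insensitive) as True."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def https_settings(environ=os.environ):
    """Build the hardening settings from a single hypothetical switch."""
    enabled = env_bool("DJANGO_HTTPS_ONLY", environ=environ)
    return {
        "SECURE_SSL_REDIRECT": enabled,
        "SESSION_COOKIE_SECURE": enabled,
        "CSRF_COOKIE_SECURE": enabled,
        # 31536000 seconds = 1 year; 0 disables the HSTS header.
        "SECURE_HSTS_SECONDS": 31536000 if enabled else 0,
        "SECURE_HSTS_INCLUDE_SUBDOMAINS": enabled,
        "SECURE_HSTS_PRELOAD": enabled,
        # Safe to keep on in every environment.
        "SECURE_CONTENT_TYPE_NOSNIFF": True,
    }
```

A settings module could apply the result with `globals().update(https_settings())`. Keeping HSTS off outside production avoids browsers caching an HTTPS-only policy for a development host.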

Debug Mode

DJANGO_DEBUG defaults to 1 (enabled) for local development. Always set DJANGO_DEBUG=0 in production.

Webhook Security

CSRF Exemption

The webhook endpoint (POST /alerts/webhook/) is exempt from CSRF checks because external alerting systems (Grafana, Alertmanager, PagerDuty, etc.) cannot include Django CSRF tokens. This is standard practice for webhook receivers.

Payload Validation

Each alert driver validates incoming payloads structurally via validate() and parse() methods. Drivers check for required keys and expected payload shapes.
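
A minimal sketch of this validate/parse split is shown below. `ExampleDriver`, `PayloadError`, and the payload keys are hypothetical; whether the real drivers return a boolean from validate() or raise is also an assumption.

```python
class PayloadError(ValueError):
    """Raised when a webhook payload does not have the expected shape."""


class ExampleDriver:
    """Hypothetical driver; the required keys are illustrative only."""

    required_keys = ("status", "alerts")

    def validate(self, payload):
        """Structural check: a dict with the required keys and a list of alerts."""
        if not isinstance(payload, dict):
            return False
        if any(key not in payload for key in self.required_keys):
            return False
        return isinstance(payload["alerts"], list)

    def parse(self, payload):
        """Normalize a validated payload into a list of flat alert dicts."""
        if not self.validate(payload):
            raise PayloadError("payload is missing required keys or has the wrong shape")
        return [
            {"status": payload["status"], "name": alert.get("name", "unknown")}
            for alert in payload["alerts"]
            if isinstance(alert, dict)
        ]
```

Note that structural validation only confirms the payload's shape. It does not authenticate the sender.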

API Key Authentication

API endpoints require authentication via an API key, except for the paths listed under Exempt Paths. Keys are managed via the Django admin.

Setup

Enable: set API_KEY_AUTH_ENABLED=1 in your environment
Create a key via admin (/admin/config_app/apikey/) or shell:

from config.models import APIKey
key = APIKey.objects.create(name="my-client")
print(key.key)  # 40-char hex token

Usage

Include the key in every request via one of:

Authorization: Bearer <key>
X-API-Key: <key>
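
As a client-side illustration, the sketch below builds (but does not send) a request carrying the key in the Authorization header, using only the standard library. The host name, the JSON body, and the `build_request` helper are assumptions; only the header formats come from this document.

```python
import json
import urllib.request


def build_request(url, api_key, payload):
    """Build a POST request that authenticates with a Bearer API key."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical host and payload; pass the result to urllib.request.urlopen to send it.
request = build_request("https://monitoring.example.com/notify/", "<key>", {"message": "test"})
```

Load the key from the environment or a secret store rather than hard-coding it in client code.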

Endpoint Restrictions

Keys can optionally restrict access to specific path prefixes via the allowed_endpoints JSON field (e.g., ["/alerts/"]). An empty list grants access to all API endpoints.
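
The prefix rule can be sketched as a small helper. The function name is hypothetical, and plain `str.startswith` matching is an assumption about how prefixes are compared.

```python
def is_path_allowed(path, allowed_endpoints):
    """Return True when the key may access the given request path.

    An empty (or missing) list means the key is not restricted.
    """
    if not allowed_endpoints:
        return True
    return any(path.startswith(prefix) for prefix in allowed_endpoints)
```

With `["/alerts/"]`, a request to `/alerts/webhook/` is allowed and a request to `/notify/send/` is rejected. Prefixes should end with a slash so that `/alerts` does not also match an unrelated path such as `/alerts-archive/`.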

Exempt Paths

The following paths do not require an API key:

/alerts/webhook/ — GET health check for the webhook endpoint
/intelligence/health/ — GET service health status
/admin/* — Django session auth (not API key auth)
/static/*

All other GET and POST requests on API paths (/alerts/, /orchestration/, /notify/, /intelligence/) require a valid key. In particular, data-returning endpoints such as /orchestration/pipelines/, /intelligence/providers/, and /intelligence/recommendations/ are not exempt.

Webhook Signature Verification

Drivers support opt-in HMAC signature verification. When a secret is configured for a driver, incoming webhooks must include a valid signature.

Configuration

Set an environment variable per driver:

Variable	Driver
`WEBHOOK_SECRET_GRAFANA`	Grafana
`WEBHOOK_SECRET_PAGERDUTY`	PagerDuty
`WEBHOOK_SECRET_NEWRELIC`	New Relic
`WEBHOOK_SECRET_GENERIC`	Generic webhook

Drivers without native signature support (Alertmanager, Datadog, OpsGenie, Zabbix) do not perform verification.

How It Works

The driver declares its signature header (e.g., X-Grafana-Signature)
On incoming POST, if the env var is set, the middleware computes HMAC-SHA256(secret, request.body) and compares it with the header value
Missing or invalid signature → 403 Forbidden
No env var configured → verification skipped (opt-in)
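
The verification flow above can be sketched with the standard library. The function names are hypothetical, and the hex encoding of the signature is an assumption; individual providers may use base64 or a prefixed format instead.

```python
import hashlib
import hmac


def expected_signature(secret, body):
    """HMAC-SHA256 of the raw request body, hex-encoded (encoding is an assumption)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_webhook(secret, body, header_value):
    """Return the HTTP status the request should receive: 200 or 403."""
    if not secret:
        return 200  # no secret configured: verification is skipped (opt-in)
    if not header_value:
        return 403  # secret configured but no signature supplied
    expected = expected_signature(secret, body).encode("ascii")
    supplied = header_value.strip().encode("utf-8")
    # compare_digest runs in constant time to avoid leaking the match position.
    if hmac.compare_digest(expected, supplied):
        return 200
    return 403
```

The signature must be computed over the raw bytes of the request body. Re-serializing parsed JSON can change whitespace or key order and invalidate an otherwise correct signature.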

Rate Limiting

Rate limiting is applied at the application level, using the Django cache with fixed-window counters (one bucket per UTC minute per identity and path prefix).

Configuration

Enable: set RATE_LIMIT_ENABLED=1 in your environment

Default limits (configurable via RATE_LIMITS in settings):

Path prefix	Limit
`/alerts/`	120 req/min
`/orchestration/`	30 req/min
`/notify/`	30 req/min
`/intelligence/`	20 req/min

Identity

Limits are tracked per API key name (if authenticated) or per client IP (via X-Forwarded-For or REMOTE_ADDR).
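
The fixed-window scheme and identity selection can be sketched as below. A plain dict stands in for the Django cache, and the key format, the function names, and the choice of the first X-Forwarded-For entry are assumptions rather than the project's actual implementation.

```python
from datetime import datetime, timezone


def client_identity(api_key_name, meta):
    """Prefer the API key name; otherwise fall back to the client IP."""
    if api_key_name:
        return f"key:{api_key_name}"
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    if forwarded:
        # Assumption: the first entry is the original client address.
        return "ip:" + forwarded.split(",")[0].strip()
    return "ip:" + meta.get("REMOTE_ADDR", "unknown")


def bucket_key(identity, prefix, now):
    """One counter per identity, path prefix, and UTC minute."""
    minute = now.astimezone(timezone.utc).strftime("%Y%m%d%H%M")
    return f"ratelimit:{identity}:{prefix}:{minute}"


def allow_request(counters, identity, prefix, limit, now):
    """Increment the current bucket and report whether the request is within the limit."""
    key = bucket_key(identity, prefix, now)
    counters[key] = counters.get(key, 0) + 1
    return counters[key] <= limit
```

With a shared cache the read-then-write shown here would need an atomic increment instead. X-Forwarded-For is only trustworthy when it is set by a proxy you control, since clients can send arbitrary values.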

Cache Backend

Rate limiting requires a shared cache backend (Redis or Memcached) in multi-process deployments. A Django system check (config.W001) warns if rate limiting is enabled with LocMemCache.

Current Limitations

The following are known areas for improvement:

DB-stored secrets not encrypted — API keys and provider credentials in JSON fields are not encrypted at rest. Consider field-level encryption for high-security environments.

Data Handling

Redacted References

The pipeline stores references to data rather than raw payloads to avoid leaking secrets:

normalized_payload_ref — Reference to normalized inbound payload (no raw secrets)
checker_output_ref — Reference to checker output
intelligence_output_ref — Reference to AI analysis (prompt/response refs, redacted)
notify_output_ref — Reference to notification delivery results

Logging

Never log raw webhook payloads that may contain credentials
Never log API keys, tokens, or webhook URLs
Use structured logging with trace_id/run_id for correlation without exposing sensitive data
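
As a safety net for these rules, a logging filter can mask credentials that slip into log messages. The sketch below is an assumption about how this could be done, not a description of the project's logging setup; the class name and pattern are hypothetical, and the pattern only covers the two header formats named in this document.

```python
import logging
import re

# Matches "Bearer <token>" and "X-API-Key: <token>", case-insensitively.
_SECRET_PATTERN = re.compile(r"(?i)\b(bearer\s+|x-api-key:\s*)([A-Za-z0-9._\-]+)")


class RedactSecretsFilter(logging.Filter):
    """Replace API key values in the formatted message with a placeholder."""

    def filter(self, record):
        message = record.getMessage()
        record.msg = _SECRET_PATTERN.sub(lambda m: m.group(1) + "[REDACTED]", message)
        record.args = ()  # the message is already formatted
        return True
```

The filter can be attached with `handler.addFilter(RedactSecretsFilter())`. It complements, rather than replaces, the rule of not logging secrets in the first place, because pattern-based redaction will miss formats it does not know about.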

Celery Security

JSON-only serialization — CELERY_ACCEPT_CONTENT, CELERY_TASK_SERIALIZER, and CELERY_RESULT_SERIALIZER are all set to json, preventing pickle deserialization attacks
Broker URL may contain credentials — treat CELERY_BROKER_URL as a secret
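
The settings named above, together with one way to keep the broker password out of logs and error messages, can be sketched as follows. The list form of `CELERY_ACCEPT_CONTENT` is the usual Celery convention; `mask_broker_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlsplit, urlunsplit

# JSON-only serialization, as described above.
CELERY_ACCEPT_CONTENT = ["json"]
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"


def mask_broker_url(url):
    """Return the URL with any password replaced by ***, for safe display.

    IPv6 literal hosts are not handled in this sketch.
    """
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Use the masked form anywhere the URL might be displayed, such as startup banners or health-check output, and keep the real value only in the environment.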

CI Security Checks

Automated Checks (`.github/workflows/security.yml`)

The security workflow runs automatically on:

Every push to main
Pull requests that change Python files, pyproject.toml, uv.lock, Docker config, or the workflow itself

Code Security job:

Check	Tool	What it does
Dependency audit	`pip-audit`	Scans installed packages for known CVEs
Security lint	`bandit`	Static analysis for common Python security issues
Secret detection	`detect-secrets`	Scans for accidentally committed credentials

Docker Security job:

Check	Tool	What it does
Image vulnerability scan	`trivy`	Scans the Docker image for OS and library CVEs (blocks on CRITICAL)
HIGH vulnerability report	`trivy`	Reports HIGH-severity vulnerabilities (non-blocking)

Addressing Security Alerts

When a vulnerability is reported (by pip-audit, GitHub Dependabot, or manual audit):

Identify the package and fix version — check the CVE details for the patched version
Bump the dependency:
- Direct dependency: update version in pyproject.toml, then uv lock
- Transitive dependency: uv lock --upgrade-package <package>
Verify the fix: uv sync --extra dev && uv run pip-audit --strict --desc
Create a PR — the security workflow triggers automatically for dependency changes
Merge promptly — security fixes should not wait in review queues

For Docker image vulnerabilities (trivy), rebuild with an updated base image or pin a patched version of the affected OS package.

CI Pipeline (`.github/workflows/ci.yml`)

Lint: Black formatting + Ruff linting (catches common issues)
Type check: mypy with django-stubs (catches type-related bugs)
Tests: pytest across Python 3.10, 3.11, 3.12
Django checks: manage.py check + migration consistency

Admin Interface

The Django admin (/admin/) is the primary operations surface:

Protected by Django’s built-in staff/superuser authentication
Custom MonitoringAdminSite inherits all default auth protections
Admin actions (acknowledge, resolve, retry) require authenticated staff access

Reporting Vulnerabilities

If you discover a security vulnerability, please report it responsibly by opening a private issue or contacting the maintainer directly. Do not disclose vulnerabilities in public issues.
