Environment Variable Cleanup — Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Remove application-level env vars that duplicate DB/definition-based config, clean up .env.sample to infrastructure-only, and remove legacy env var fallbacks from code.
Architecture: Pipeline definitions, NotificationChannel, and IntelligenceProvider models are the source of truth for application behavior. Environment variables only configure infrastructure (Django core, Celery, metrics). Code that reads removed env vars gets deleted or updated; tests that assert on removed env vars get rewritten.
Tech Stack: Django 5.2, Python, pytest
Task 1: Remove CHECKERS_SKIP settings from settings.py
Files:
- Modify:
config/settings.py:198-214
Step 1: Remove the checker skip settings block
Delete lines 198-214 in config/settings.py — the entire Checkers Configuration section:
# DELETE THIS ENTIRE BLOCK:
# ---------------------------------------------------------------------------
# Checkers Configuration
# ---------------------------------------------------------------------------
# Disable all checkers globally.
# ...
CHECKERS_SKIP_ALL = os.environ.get("CHECKERS_SKIP_ALL", "0") in {"1", "true", "True", "yes", "on"}
# List of checker names to skip (disabled checkers).
# ...
_skip_checkers = os.environ.get("CHECKERS_SKIP", "")
CHECKERS_SKIP: list[str] = [c.strip() for c in _skip_checkers.split(",") if c.strip()]
# If CHECKERS_SKIP_ALL is enabled, treat all checkers as skipped.
# We do this by overriding the skip list downstream (in apps.checkers.checkers.is_checker_enabled).
Step 2: Run tests to see what breaks
Run: uv run pytest apps/checkers/_tests/test_registry.py -v Expected: FAIL — tests use @override_settings(CHECKERS_SKIP_ALL=...) which now does nothing meaningful
Step 3: Commit
git add config/settings.py
git commit -m "refactor: remove CHECKERS_SKIP_ALL and CHECKERS_SKIP from settings.py"
Task 2: Remove is_checker_enabled / get_enabled_checkers and their tests
Files:
- Modify:
apps/checkers/checkers/__init__.py:25-72 - Modify:
apps/checkers/_tests/test_registry.py
Step 1: Remove is_checker_enabled and get_enabled_checkers from checkers/__init__.py
In apps/checkers/checkers/__init__.py, delete:
"get_enabled_checkers"and"is_checker_enabled"from__all__(lines 24-25)- The
is_checker_enabled()function (lines 41-61) - The
get_enabled_checkers()function (lines 64-72)
The file should only export the registry and checker classes:
__all__ = [
"BaseChecker",
"CheckResult",
"CheckStatus",
"CPUChecker",
"MemoryChecker",
"DiskChecker",
"DiskCommonChecker",
"DiskLinuxChecker",
"DiskMacOSChecker",
"NetworkChecker",
"ProcessChecker",
"CHECKER_REGISTRY",
]
Add "CHECKER_REGISTRY" to __all__ since external code uses it.
Step 2: Remove checker skip tests from test_registry.py
In apps/checkers/_tests/test_registry.py, delete:
- The import of
is_checker_enabled(line 14) test_skip_all_disables_every_checker(lines 39-42)test_skip_list_disables_only_selected_checkers(lines 46-51)
Step 3: Run tests
Run: uv run pytest apps/checkers/_tests/test_registry.py -v Expected: PASS (remaining registry tests should still work)
Step 4: Commit
git add apps/checkers/checkers/__init__.py apps/checkers/_tests/test_registry.py
git commit -m "refactor: remove is_checker_enabled and get_enabled_checkers
Pipeline definitions control which checkers run via context node checker_names."
Task 3: Remove is_checker_enabled usage from alerts app
Files:
- Modify:
apps/alerts/check_integration.py:47,445-446,477-482 - Modify:
apps/alerts/_tests/test_check_integration.py:151 - Modify:
apps/alerts/management/commands/check_and_alert.py:33,88,109,121-126
Step 1: Update check_integration.py
In apps/alerts/check_integration.py:
- Remove the import of
is_checker_enabled(line 47) - In
run_check_and_alert(): remove theif not is_checker_enabled(checker_name)guard (lines 445-446). The caller (pipeline definition) already specifies which checkers to run. - In
run_checks_and_alert(): remove theif not is_checker_enabled(checker_name)skip (lines 477-482). Just iterate all provided checker names.
Step 2: Update check_and_alert.py management command
In apps/alerts/management/commands/check_and_alert.py:
- Remove the import of
is_checker_enabled(line 33) - Remove
--include-skippedargument (line 88) — no skip list to override - Remove the skip filtering logic (lines 109-126). Use all checkers from
CHECKER_REGISTRYdirectly.
Step 3: Update test_check_integration.py
In apps/alerts/_tests/test_check_integration.py:
- Remove the mock patch of
is_checker_enabled(line 151)
Step 4: Run tests
Run: uv run pytest apps/alerts/_tests/ -v Expected: PASS
Step 5: Commit
git add apps/alerts/check_integration.py apps/alerts/_tests/test_check_integration.py apps/alerts/management/commands/check_and_alert.py
git commit -m "refactor: remove is_checker_enabled from alerts app
Checker selection is now controlled by pipeline definitions, not global skip lists."
Task 4: Remove NOTIFY_SKIP functions from notify drivers
Files:
- Modify:
apps/notify/drivers/__init__.py:15-16,28-59
Step 1: Remove is_notify_enabled and get_enabled_notify_drivers
In apps/notify/drivers/__init__.py:
- Remove
"is_notify_enabled"and"get_enabled_notify_drivers"from__all__(lines 15-16) - Delete
is_notify_enabled()function (lines 28-48) - Delete
get_enabled_notify_drivers()function (lines 51-59)
The file should be:
"""
Notification drivers for sending notifications to various platforms.
"""
from apps.notify.drivers.base import BaseNotifyDriver, NotificationMessage
from apps.notify.drivers.email import EmailNotifyDriver
from apps.notify.drivers.generic import GenericNotifyDriver
from apps.notify.drivers.pagerduty import PagerDutyNotifyDriver
from apps.notify.drivers.slack import SlackNotifyDriver
__all__ = [
"NotificationMessage",
"BaseNotifyDriver",
"DRIVER_REGISTRY",
"SlackNotifyDriver",
"PagerDutyNotifyDriver",
"EmailNotifyDriver",
"GenericNotifyDriver",
]
# Registry of available notification drivers
DRIVER_REGISTRY: dict[str, type[BaseNotifyDriver]] = {
"slack": SlackNotifyDriver,
"pagerduty": PagerDutyNotifyDriver,
"email": EmailNotifyDriver,
"generic": GenericNotifyDriver,
}
Step 2: Search for any callers of the removed functions
Run: uv run pytest apps/notify/_tests/ -v Expected: PASS (these functions were never called)
Step 3: Commit
git add apps/notify/drivers/__init__.py
git commit -m "refactor: remove unused is_notify_enabled and get_enabled_notify_drivers
NotificationChannel.is_active controls channel enable/disable in DB."
Task 5: Remove OpenAI env var fallbacks from provider
Files:
- Modify:
apps/intelligence/providers/openai.py:8,36,39-40 - Modify:
apps/intelligence/_tests/providers/test_openai.py:23-25
Step 1: Update the test to expect DB-only config
In apps/intelligence/_tests/providers/test_openai.py, update test_initialization_defaults():
- Remove the
patch.dict("os.environ", ...)wrapper - Test that when no args are passed, provider has empty/None values (no env fallback)
- Add a test that verifies provider works when config is passed via kwargs (the DB path)
def test_initialization_defaults(self):
"""Provider uses empty defaults when no config provided."""
provider = OpenAIRecommendationProvider()
assert provider.api_key is None
assert provider.model == "gpt-4o-mini" # default_model class attr
assert provider.max_tokens == 1024 # BaseAIProvider default
def test_initialization_with_kwargs(self):
"""Provider accepts config from DB (IntelligenceProvider model)."""
provider = OpenAIRecommendationProvider(
api_key="sk-test", model="gpt-4o", max_tokens=2048
)
assert provider.api_key == "sk-test"
assert provider.model == "gpt-4o"
assert provider.max_tokens == 2048
Step 2: Run test to verify it fails
Run: uv run pytest apps/intelligence/_tests/providers/test_openai.py -v Expected: FAIL — provider still reads env vars
Step 3: Update openai.py to remove env var fallbacks
In apps/intelligence/providers/openai.py:
- Remove
import os(line 8) - Update
__init__to remove allos.environ.get()calls:
def __init__(
self,
api_key: str | None = None,
model: str | None = None,
max_tokens: int | None = None,
**kwargs: Any,
) -> None:
super().__init__(
api_key=api_key or "",
model=model or "",
max_tokens=max_tokens or 0,
)
if not api_key:
self.api_key = None # type: ignore[assignment]
self._client = None
Step 4: Run test to verify it passes
Run: uv run pytest apps/intelligence/_tests/providers/test_openai.py -v Expected: PASS
Step 5: Commit
git add apps/intelligence/providers/openai.py apps/intelligence/_tests/providers/test_openai.py
git commit -m "refactor: remove env var fallbacks from OpenAI provider
Credentials and config come from IntelligenceProvider DB model, not env vars."
Task 6: Remove env var writing from setup_instance
Files:
- Modify:
apps/orchestration/management/commands/setup_instance.py:708-730 - Modify:
apps/orchestration/_tests/test_setup_instance.py:1019-1051,1323-1327
Step 1: Update setup_instance handle() to stop writing removed env vars
In apps/orchestration/management/commands/setup_instance.py, update the env_updates block (lines 708-730):
Remove these lines:
# DELETE: CHECKERS_SKIP writing (lines 713-720)
if checkers:
from apps.checkers.checkers import CHECKER_REGISTRY
all_checkers = set(CHECKER_REGISTRY.keys())
enabled = set(checkers["enabled"])
skipped = all_checkers - enabled
if skipped:
env_updates["CHECKERS_SKIP"] = ",".join(sorted(skipped))
# DELETE: Intelligence env var writing (lines 722-727)
if intelligence:
env_updates["INTELLIGENCE_PROVIDER"] = intelligence["provider"]
if intelligence.get("api_key"):
env_updates["OPENAI_API_KEY"] = intelligence["api_key"]
if intelligence.get("model"):
env_updates["OPENAI_MODEL"] = intelligence["model"]
Keep env_updates = {} and the _write_env call (it may still write ALERTS_ENABLED_DRIVERS on line 711). If env_updates ends up empty, consider skipping the _write_env call entirely:
env_updates = {}
if alerts:
env_updates["ALERTS_ENABLED_DRIVERS"] = ",".join(alerts)
if env_updates:
self._write_env(env_path, env_updates)
self.stdout.write(self.style.SUCCESS(f"✓ Updated .env with {len(env_updates)} setting(s)"))
Step 2: Update setup_instance tests
In apps/orchestration/_tests/test_setup_instance.py:
test_generates_checkers_skip_env(line 1019): Rewrite to assertCHECKERS_SKIPis NOT in env_updates (setup no longer writes it)test_generates_openai_env_vars(line 1045): Rewrite to assertINTELLIGENCE_PROVIDER,OPENAI_API_KEY,OPENAI_MODELare NOT in env_updatestest_all_checkers_enabled_no_skip_env(line 1323): This test already asserts CHECKERS_SKIP is absent — keep it, but simplify the name/docstring
Step 3: Run tests
Run: uv run pytest apps/orchestration/_tests/test_setup_instance.py -v Expected: PASS
Step 4: Commit
git add apps/orchestration/management/commands/setup_instance.py apps/orchestration/_tests/test_setup_instance.py
git commit -m "refactor: stop writing CHECKERS_SKIP and OPENAI_* to .env from setup_instance
These are now DB-only: IntelligenceProvider model for AI config, pipeline definitions for checker selection."
Task 7: Clean up .env.sample
Files:
- Modify:
.env.sample
Step 1: Rewrite .env.sample with infrastructure-only vars, organized
# .env.sample
# Copy this file to .env for local development.
# Only infrastructure config belongs here. Application behavior
# (checkers, intelligence, notify) is configured via Django Admin
# and pipeline definitions.
# Django
DJANGO_SECRET_KEY=
DJANGO_DEBUG=1
DJANGO_ALLOWED_HOSTS=localhost,127.0.0.1
DJANGO_ENV=dev
# Celery / Redis
CELERY_BROKER_URL=redis://localhost:6379/0
# CELERY_RESULT_BACKEND=redis://localhost:6379/0
CELERY_TASK_ALWAYS_EAGER=0
# Orchestration
ORCHESTRATION_MAX_RETRIES_PER_STAGE=3
ORCHESTRATION_BACKOFF_FACTOR=2.0
ORCHESTRATION_INTELLIGENCE_FALLBACK_ENABLED=1
ORCHESTRATION_METRICS_BACKEND=logging
# StatsD (only used when ORCHESTRATION_METRICS_BACKEND=statsd)
STATSD_HOST=localhost
STATSD_PORT=8125
STATSD_PREFIX=pipeline
# Django System Checks
# Silence specific checks during unrelated commands (migrate, runserver, etc.)
# The preflight command still shows all checks regardless.
# SILENCED_SYSTEM_CHECKS=checkers.W009,checkers.W010
Step 2: Run the env var system check to make sure it still works
Run: uv run pytest apps/checkers/_tests/test_checks.py -v Expected: Some tests may need updating — test_required_env_vars_warns_on_missing uses OPENAI_API_KEY as a sample var. Update to use a var that still exists in .env.sample (e.g., DJANGO_DEBUG).
Step 3: Update env var check tests
In apps/checkers/_tests/test_checks.py, update:
test_required_env_vars_warns_on_missing(line 198): Change sample content from"DJANGO_DEBUG=1\nOPENAI_API_KEY=\n"to"DJANGO_DEBUG=1\nSTATSD_HOST=localhost\n"and assert onSTATSD_HOSTinstead ofOPENAI_API_KEYtest_required_env_vars_ok_when_all_set(line 213): Same — changeOPENAI_API_KEYtoSTATSD_HOST
Step 4: Run tests
Run: uv run pytest apps/checkers/_tests/test_checks.py -v Expected: PASS
Step 5: Commit
git add .env.sample apps/checkers/_tests/test_checks.py
git commit -m "refactor: clean up .env.sample to infrastructure-only
Removed CHECKERS_SKIP*, NOTIFY_SKIP*, OPENAI_*, INTELLIGENCE_PROVIDER,
and DJANGO_DB_* keys. Application config lives in DB and pipeline definitions."
Task 8: Update documentation
Files:
- Modify:
docs/Architecture.md:270-282 - Modify:
apps/notify/README.md(remove NOTIFY_SKIP docs) - Modify:
apps/checkers/README.md(remove CHECKERS_SKIP docs if present)
Step 1: Update Architecture.md env var table
Replace the env var table (lines 272-281) with:
| Variable | Purpose | Default |
|----------|---------|---------|
| `DJANGO_SECRET_KEY` | Django secret key | Required in production |
| `DJANGO_DEBUG` | Debug mode | `0` |
| `DJANGO_ALLOWED_HOSTS` | Comma-separated allowed hosts | `*` |
| `CELERY_BROKER_URL` | Redis broker URL | `redis://localhost:6379/0` |
| `CELERY_TASK_ALWAYS_EAGER` | Run tasks synchronously (dev) | `False` |
| `ORCHESTRATION_MAX_RETRIES_PER_STAGE` | Retries before pipeline failure | `3` |
| `ORCHESTRATION_BACKOFF_FACTOR` | Exponential backoff multiplier | `2.0` |
| `ORCHESTRATION_INTELLIGENCE_FALLBACK_ENABLED` | Continue pipeline when AI fails | `1` |
| `ORCHESTRATION_METRICS_BACKEND` | Metrics backend (`logging` or `statsd`) | `logging` |
| `STATSD_HOST` | StatsD server host | `localhost` |
| `STATSD_PORT` | StatsD server port | `8125` |
| `STATSD_PREFIX` | StatsD metric prefix | `pipeline` |
Add a note after the table:
Application-level configuration (which checkers to run, intelligence provider, notification channels) is managed through Django Admin and pipeline definitions — not environment variables. See the [Setup Guide](Setup-Guide) for details.
Step 2: Update notify README.md
Remove the NOTIFY_SKIP_ALL and NOTIFY_SKIP documentation section (around lines 140-172 per agent report).
Step 3: Update checkers README.md
Remove any CHECKERS_SKIP documentation if present.
Step 4: Run full test suite
Run: uv run pytest -v Expected: ALL PASS
Step 5: Commit
git add docs/Architecture.md apps/notify/README.md apps/checkers/README.md
git commit -m "docs: update env var documentation to reflect cleanup
Removed references to CHECKERS_SKIP*, NOTIFY_SKIP*, OPENAI_* from docs.
Added orchestration and StatsD vars to Architecture.md table."