CLI Aliases & CLI-First Documentation Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Add optional shell aliases for all 8 management commands and overhaul every app README to document every CLI flag and combination.
Architecture: Create bin/setup_aliases.sh that generates a sourceable bin/aliases.sh file, add a Django system check for dev environments, update cli.sh with alias hints, and rewrite CLI reference sections in all app READMEs to cover all 67 flags.
Tech Stack: Bash, Django system checks, Markdown
Task 1: Create bin/setup_aliases.sh and update .gitignore
Files:

- Create: `bin/setup_aliases.sh`
- Modify: `.gitignore`
Step 1: Add bin/aliases.sh to .gitignore
Append to the end of `.gitignore`:

```gitignore
# Generated alias file (user-specific, project-path-locked)
bin/aliases.sh
```
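Once the setup script from Step 2 has generated the file, you can confirm the new rule actually matches it; `git check-ignore -v` prints the matching `.gitignore` rule and exits non-zero when the path is not ignored:

```shell
# Prints something like ".gitignore:NN:bin/aliases.sh  bin/aliases.sh"
git check-ignore -v bin/aliases.sh
```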
Step 2: Create `bin/setup_aliases.sh`

```bash
#!/usr/bin/env bash
#
# Set up shell aliases for server-maintanence management commands.
#
# Usage:
#   ./bin/setup_aliases.sh                 # Interactive setup (default prefix: sm)
#   ./bin/setup_aliases.sh --prefix maint  # Custom prefix
#   ./bin/setup_aliases.sh --remove        # Remove aliases and source line
#   ./bin/setup_aliases.sh --list          # Show current aliases
#
set -e

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'

info()    { echo -e "${BLUE}[INFO]${NC} $1"; }
success() { echo -e "${GREEN}[SUCCESS]${NC} $1"; }
warn()    { echo -e "${YELLOW}[WARN]${NC} $1"; }
error()   { echo -e "${RED}[ERROR]${NC} $1"; }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
ALIASES_FILE="$SCRIPT_DIR/aliases.sh"
SOURCE_LINE="source \"$ALIASES_FILE\" # server-maintanence aliases"

# Detect shell profile
detect_shell_profile() {
    if [ -n "$ZSH_VERSION" ] || [ "$SHELL" = "$(which zsh 2>/dev/null)" ]; then
        echo "$HOME/.zshrc"
    else
        echo "$HOME/.bashrc"
    fi
}
SHELL_PROFILE="$(detect_shell_profile)"

# Parse arguments
PREFIX="sm"
ACTION="install"
while [[ $# -gt 0 ]]; do
    case $1 in
        --prefix)
            PREFIX="$2"
            shift 2
            ;;
        --prefix=*)
            PREFIX="${1#*=}"
            shift
            ;;
        --remove)
            ACTION="remove"
            shift
            ;;
        --list)
            ACTION="list"
            shift
            ;;
        --help|-h)
            echo "Usage: $0 [OPTIONS]"
            echo ""
            echo "Options:"
            echo "  --prefix PREFIX  Alias prefix (default: sm)"
            echo "  --remove         Remove aliases and source line"
            echo "  --list           Show current aliases"
            echo "  --help           Show this help"
            exit 0
            ;;
        *)
            error "Unknown option: $1"
            exit 1
            ;;
    esac
done

generate_aliases() {
    local prefix="$1"
    cat <<ALIASES
#!/usr/bin/env bash
#
# Auto-generated by bin/setup_aliases.sh
# Prefix: ${prefix}
# Project: ${PROJECT_DIR}
#
# Re-generate: ./bin/setup_aliases.sh --prefix ${prefix}
# Remove:      ./bin/setup_aliases.sh --remove
#
alias ${prefix}-check-health='cd "${PROJECT_DIR}" && uv run python manage.py check_health'
alias ${prefix}-run-check='cd "${PROJECT_DIR}" && uv run python manage.py run_check'
alias ${prefix}-check-and-alert='cd "${PROJECT_DIR}" && uv run python manage.py check_and_alert'
alias ${prefix}-get-recommendations='cd "${PROJECT_DIR}" && uv run python manage.py get_recommendations'
alias ${prefix}-run-pipeline='cd "${PROJECT_DIR}" && uv run python manage.py run_pipeline'
alias ${prefix}-monitor-pipeline='cd "${PROJECT_DIR}" && uv run python manage.py monitor_pipeline'
alias ${prefix}-test-notify='cd "${PROJECT_DIR}" && uv run python manage.py test_notify'
alias ${prefix}-list-notify-drivers='cd "${PROJECT_DIR}" && uv run python manage.py list_notify_drivers'
alias ${prefix}-cli='${SCRIPT_DIR}/cli.sh'
ALIASES
}

do_install() {
    echo ""
    echo "============================================"
    echo "  server-maintanence Alias Setup"
    echo "============================================"
    echo ""

    # If no --prefix was passed, prompt interactively
    if [ "$PREFIX" = "sm" ] && [ -t 0 ]; then
        read -p "Alias prefix (default: sm): " user_prefix
        PREFIX="${user_prefix:-sm}"
    fi

    info "Using prefix: ${PREFIX}"
    info "Generating aliases..."
    generate_aliases "$PREFIX" > "$ALIASES_FILE"
    chmod +x "$ALIASES_FILE"
    success "Created $ALIASES_FILE"
    echo ""
    info "Aliases that will be available:"
    echo ""
    echo "  ${PREFIX}-check-health          Run all health checks"
    echo "  ${PREFIX}-run-check             Run a single checker"
    echo "  ${PREFIX}-check-and-alert       Run checks and create alerts"
    echo "  ${PREFIX}-get-recommendations   Get AI recommendations"
    echo "  ${PREFIX}-run-pipeline          Execute a pipeline"
    echo "  ${PREFIX}-monitor-pipeline      Monitor pipeline runs"
    echo "  ${PREFIX}-test-notify           Test notification delivery"
    echo "  ${PREFIX}-list-notify-drivers   List notification drivers"
    echo "  ${PREFIX}-cli                   Interactive CLI"
    echo ""

    # Add source line to shell profile
    if [ -f "$SHELL_PROFILE" ]; then
        if grep -qF "server-maintanence aliases" "$SHELL_PROFILE" 2>/dev/null; then
            # Replace existing source line ("|| true" guards set -e when
            # the profile contains nothing but the source line)
            local tmp_file
            tmp_file=$(mktemp)
            grep -vF "server-maintanence aliases" "$SHELL_PROFILE" > "$tmp_file" || true
            echo "$SOURCE_LINE" >> "$tmp_file"
            mv "$tmp_file" "$SHELL_PROFILE"
            success "Updated source line in $SHELL_PROFILE"
        else
            echo "" >> "$SHELL_PROFILE"
            echo "$SOURCE_LINE" >> "$SHELL_PROFILE"
            success "Added source line to $SHELL_PROFILE"
        fi
    else
        warn "$SHELL_PROFILE not found. Add this line manually:"
        echo "  $SOURCE_LINE"
    fi

    echo ""
    info "To activate now, run:"
    echo "  source $SHELL_PROFILE"
    echo ""
    success "Alias setup complete!"
}

do_remove() {
    echo ""
    info "Removing aliases..."

    # Remove aliases file
    if [ -f "$ALIASES_FILE" ]; then
        rm "$ALIASES_FILE"
        success "Removed $ALIASES_FILE"
    else
        warn "$ALIASES_FILE not found (already removed?)"
    fi

    # Remove source line from shell profile
    if [ -f "$SHELL_PROFILE" ] && grep -qF "server-maintanence aliases" "$SHELL_PROFILE" 2>/dev/null; then
        local tmp_file
        tmp_file=$(mktemp)
        grep -vF "server-maintanence aliases" "$SHELL_PROFILE" > "$tmp_file" || true
        mv "$tmp_file" "$SHELL_PROFILE"
        success "Removed source line from $SHELL_PROFILE"
    fi

    echo ""
    info "Restart your shell or run: source $SHELL_PROFILE"
    success "Aliases removed!"
}

do_list() {
    if [ -f "$ALIASES_FILE" ]; then
        info "Current aliases ($ALIASES_FILE):"
        echo ""
        grep "^alias " "$ALIASES_FILE" | sed 's/alias /  /'
    else
        warn "No aliases configured. Run: ./bin/setup_aliases.sh"
    fi
}

case $ACTION in
    install) do_install ;;
    remove)  do_remove ;;
    list)    do_list ;;
esac
```
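One side effect worth calling out: each generated alias runs a bare `cd`, so after you use one, your interactive shell stays in the project directory. If that is unwanted, a subshell variant confines the `cd`; a minimal demo of the difference (the alias name and path here are illustrative, not part of the generated file):

```shell
#!/usr/bin/env bash
# Demo: wrapping the command in a subshell ( ... ) confines the cd,
# so the caller's working directory is untouched afterwards.
shopt -s expand_aliases  # alias expansion is off by default in scripts

start_dir="$PWD"
# Subshell variant of a generated alias (illustrative path):
alias demo-check-health='(cd /tmp && echo "running in: $PWD")'
demo-check-health
[ "$PWD" = "$start_dir" ] && echo "caller directory unchanged"
```

If the team prefers this behavior, `generate_aliases` could emit `'(cd "${PROJECT_DIR}" && ...)'` instead of the plain `cd` form; exit codes are preserved either way.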
Step 3: Make executable

```bash
chmod +x bin/setup_aliases.sh
```

Step 4: Verify

```bash
# Test help
./bin/setup_aliases.sh --help

# Test generation (non-interactive with explicit prefix)
./bin/setup_aliases.sh --prefix test-sm
cat bin/aliases.sh
./bin/setup_aliases.sh --remove
```
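An extra sanity check that costs nothing at this point: have bash parse the generated file without executing it, which catches quoting mistakes introduced by an unusual prefix or project path:

```shell
# Parse-only syntax check; exits non-zero on a syntax error.
bash -n bin/aliases.sh && echo "bin/aliases.sh parses cleanly"
```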
Step 5: Commit

```bash
git add bin/setup_aliases.sh .gitignore
git commit -m "feat: add shell alias setup script with customizable prefix"
```
Task 2: Add Django system check for aliases
Files:

- Modify: `apps/checkers/checks.py` — add `@register("aliases")` check
- Create: `apps/checkers/_tests/test_checks_aliases.py` — tests
Step 1: Write failing tests in `apps/checkers/_tests/test_checks_aliases.py`

```python
"""Tests for the aliases Django system check."""

from unittest.mock import patch

from django.test import TestCase, override_settings

from apps.checkers.checks import check_aliases_configured


class CheckAliasesConfiguredTests(TestCase):
    """Tests for the check_aliases_configured system check."""

    @override_settings(DEBUG=False)
    def test_skips_in_production(self):
        """Check should return no warnings when DEBUG=False."""
        result = check_aliases_configured(None)
        assert result == []

    @override_settings(DEBUG=True)
    def test_skips_during_tests(self):
        """Check should return no warnings when running under pytest."""
        result = check_aliases_configured(None)
        assert result == []

    @override_settings(DEBUG=True)
    @patch("apps.checkers.checks._is_testing", return_value=False)
    def test_warns_when_aliases_file_missing(self, _mock_testing):
        """Check should warn when bin/aliases.sh does not exist."""
        with patch("apps.checkers.checks._aliases_file_exists", return_value=False):
            result = check_aliases_configured(None)
        assert len(result) == 1
        assert result[0].id == "checkers.W009"
        assert "aliases" in result[0].msg.lower()

    @override_settings(DEBUG=True)
    @patch("apps.checkers.checks._is_testing", return_value=False)
    def test_no_warning_when_aliases_file_exists(self, _mock_testing):
        """Check should return no warnings when bin/aliases.sh exists."""
        with patch("apps.checkers.checks._aliases_file_exists", return_value=True):
            result = check_aliases_configured(None)
        assert result == []
```
Step 2: Run tests to verify they fail

```bash
uv run pytest apps/checkers/_tests/test_checks_aliases.py -v
```

Expected: FAIL (`ImportError` — `check_aliases_configured` and `_aliases_file_exists` don’t exist yet)
Step 3: Implement in `apps/checkers/checks.py`

Add after the existing `check_crontab_configuration` function (before `@register(Tags.database, deploy=True)`):

```python
def _aliases_file_exists():
    """Return True if bin/aliases.sh exists in the project root."""
    from django.conf import settings

    base_dir = getattr(settings, "BASE_DIR", None)
    if base_dir is None:
        return True  # Can't check, don't warn
    aliases_path = os.path.join(str(base_dir), "bin", "aliases.sh")
    return os.path.isfile(aliases_path)


@register("aliases")
def check_aliases_configured(app_configs, **kwargs):
    """
    Check that shell aliases are configured for management commands.

    Only runs in development (DEBUG=True and not in tests).
    """
    from django.conf import settings

    if not settings.DEBUG or _is_testing():
        return []
    errors = []
    if not _aliases_file_exists():
        errors.append(
            Warning(
                "Shell aliases not configured for management commands",
                hint=(
                    "Run 'bin/setup_aliases.sh' to set up quick aliases like "
                    "sm-check-health, sm-run-check, etc. "
                    "This is optional but improves developer experience."
                ),
                id="checkers.W009",
            )
        )
    return errors
```

Also add `import os` at the top of the file (if not already present).
Step 4: Run tests to verify they pass

```bash
uv run pytest apps/checkers/_tests/test_checks_aliases.py -v
```

You can also trigger the check on its own with `uv run python manage.py check --tag aliases` (the string passed to `@register` doubles as the check tag).
Step 5: Update Django System Checks table in apps/checkers/README.md
Add a row to the existing table at line ~122:
| `aliases` | `checkers.W009` | Shell aliases not configured (dev only) |
Step 6: Commit

```bash
git add apps/checkers/checks.py apps/checkers/_tests/test_checks_aliases.py apps/checkers/README.md
git commit -m "feat: add Django system check for shell aliases (dev only)"
```
Task 3: Update bin/cli.sh — startup hint + install menu option
Files:

- Modify: `bin/cli.sh`
Step 1: Add alias hint to `show_banner()`

Replace the `show_banner` function (lines 39-46) with:

```bash
show_banner() {
    clear
    echo -e "${CYAN}"
    echo "╔══════════════════════════════════════════════════════════════╗"
    echo "║                    Server Maintenance CLI                    ║"
    echo "╚══════════════════════════════════════════════════════════════╝"
    echo -e "${NC}"

    # Show alias hint if aliases are not configured
    if [ ! -f "$SCRIPT_DIR/aliases.sh" ]; then
        echo -e "${YELLOW}Tip:${NC} Run ${CYAN}bin/setup_aliases.sh${NC} for quick command aliases (sm-check-health, sm-run-check, etc.)"
        echo ""
    fi
}
```
Step 2: Add “Setup shell aliases” to install menu

Replace the `install_project` function (lines 137-175). The options array becomes:

```bash
local options=(
    "Full installation (uv sync + pre-commit)"
    "Install dependencies only (uv sync)"
    "Install pre-commit hooks"
    "Setup shell aliases"
    "Check installation status"
    "Back to main menu"
)
```
And the select case becomes:

```bash
select opt in "${options[@]}"; do
    case $REPLY in
        1)
            echo -e "${YELLOW}Running full installation...${NC}"
            run_command "uv sync" "Installing dependencies"
            run_command "uv run pre-commit install" "Installing pre-commit hooks"
            ;;
        2)
            run_command "uv sync" "Installing dependencies"
            ;;
        3)
            run_command "uv run pre-commit install" "Installing pre-commit hooks"
            ;;
        4)
            run_command "$SCRIPT_DIR/setup_aliases.sh" "Setting up shell aliases"
            ;;
        5)
            check_installation
            ;;
        6)
            return
            ;;
        *)
            echo -e "${RED}Invalid option${NC}"
            ;;
    esac
    break
done
```
Step 3: Add alias check to `check_installation()`

Add after the pre-commit check (around line 199):

```bash
# Check aliases
if [ -f "$SCRIPT_DIR/aliases.sh" ]; then
    echo -e "${GREEN}✓${NC} Shell aliases configured"
else
    echo -e "${YELLOW}!${NC} Shell aliases not configured (run bin/setup_aliases.sh)"
fi
```
Step 4: Verify

```bash
./bin/cli.sh help
```

Step 5: Commit

```bash
git add bin/cli.sh
git commit -m "feat: add alias setup option and hint to interactive CLI"
```
Task 4: Rewrite apps/checkers/README.md CLI reference
Files:

- Modify: `apps/checkers/README.md` — replace lines 144-264 (the “Running checks” section)
Step 1: Replace the “Running checks” section (lines 144-264) with comprehensive CLI reference
The new section should cover `check_health` (10 flags) and `run_check` (11 flags), with examples for each flag and the useful combinations:
## CLI Reference
There are two management commands for running checks. All flags can be passed after the alias too (e.g., `sm-check-health --json`).
### `check_health`
Run all checkers (or a selection) and show a summary.
```bash
# Run ALL registered checkers
uv run python manage.py check_health
# Run specific checkers only
uv run python manage.py check_health cpu memory
uv run python manage.py check_health cpu memory disk network process
# List available checkers and exit
uv run python manage.py check_health --list
```
#### JSON output
```bash
# JSON output (for scripts, cron, piping to jq)
uv run python manage.py check_health --json
# Specific checkers + JSON
uv run python manage.py check_health cpu disk --json
```
#### Exit codes for CI/automation
By default: exit `2` if any CRITICAL, `1` if any UNKNOWN, `0` otherwise.
```bash
# Exit 1 if ANY check is WARNING or CRITICAL (strictest)
uv run python manage.py check_health --fail-on-warning
# Exit 1 only if ANY check is CRITICAL
uv run python manage.py check_health --fail-on-critical
# CI pipeline example: fail build on critical
uv run python manage.py check_health --fail-on-critical --json
```
#### Threshold overrides
Override default warning/critical thresholds for all checkers in this run:
```bash
# Lower thresholds (more sensitive)
uv run python manage.py check_health --warning-threshold 60 --critical-threshold 80
# Higher thresholds (less sensitive)
uv run python manage.py check_health --warning-threshold 85 --critical-threshold 98
# Override thresholds for specific checkers only
uv run python manage.py check_health cpu memory --warning-threshold 75 --critical-threshold 95
```
#### Checker-specific options
These flags are passed to the relevant checker when it runs:
```bash
# Disk: check specific mount points
uv run python manage.py check_health disk --disk-paths / /var /tmp /home
# Network: ping specific hosts
uv run python manage.py check_health network --ping-hosts 8.8.8.8 1.1.1.1 github.com
# Process: verify specific processes are running
uv run python manage.py check_health process --processes nginx postgres redis celery
```
#### Combined examples
```bash
# Full CI check: all checkers, JSON, fail on warning
uv run python manage.py check_health --json --fail-on-warning
# Disk + network with custom targets + thresholds
uv run python manage.py check_health disk network \
  --disk-paths / /var/log \
  --ping-hosts 8.8.8.8 google.com \
  --warning-threshold 75 --critical-threshold 90
# Cron job: all checks, JSON, append to log
uv run python manage.py check_health --json >> /var/log/health-checks.log 2>&1
# Quick smoke test: CPU + memory, fail on critical
uv run python manage.py check_health cpu memory --fail-on-critical
```
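Since `--json` is designed for piping, it pairs naturally with a pretty-printer. `python -m json.tool` ships with the interpreter, so no extra tooling is assumed (jq works just as well if installed):

```shell
# Pretty-print the JSON results without extra tooling
uv run python manage.py check_health --json | python -m json.tool
```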
#### Flag reference
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `checkers` (positional) | str... | all | Specific checkers to run (space-separated) |
| `--list` | flag | — | List available checkers and exit |
| `--json` | flag | — | Output results as JSON |
| `--fail-on-warning` | flag | — | Exit 1 if any WARNING or CRITICAL |
| `--fail-on-critical` | flag | — | Exit 1 only if any CRITICAL |
| `--warning-threshold` | float | per-checker | Override warning threshold for all checks |
| `--critical-threshold` | float | per-checker | Override critical threshold for all checks |
| `--disk-paths` | str... | `/` | Paths to check (disk checker) |
| `--ping-hosts` | str... | `8.8.8.8 1.1.1.1` | Hosts to ping (network checker) |
| `--processes` | str... | — | Process names to check (process checker) |
---
### `run_check`
Run a **single** checker with checker-specific options.
```bash
# Basic usage
uv run python manage.py run_check cpu
uv run python manage.py run_check memory
uv run python manage.py run_check disk
uv run python manage.py run_check network
uv run python manage.py run_check process
```
#### JSON output
```bash
uv run python manage.py run_check cpu --json
uv run python manage.py run_check disk --json
```
#### Threshold overrides
```bash
# Override thresholds for this single check
uv run python manage.py run_check cpu --warning-threshold 80 --critical-threshold 95
uv run python manage.py run_check memory --warning-threshold 75 --critical-threshold 90
uv run python manage.py run_check disk --warning-threshold 85 --critical-threshold 98
```
#### CPU checker options
```bash
# Default: 5 samples, 1 second apart
uv run python manage.py run_check cpu
# More samples for better accuracy
uv run python manage.py run_check cpu --samples 10
# Faster sampling (0.5s intervals)
uv run python manage.py run_check cpu --sample-interval 0.5
# Quick snapshot (1 sample, no wait)
uv run python manage.py run_check cpu --samples 1 --sample-interval 0
# Per-CPU mode (reports busiest core)
uv run python manage.py run_check cpu --per-cpu
# All CPU options combined
uv run python manage.py run_check cpu --samples 10 --sample-interval 0.5 --per-cpu
# CPU with threshold override + JSON
uv run python manage.py run_check cpu --samples 10 --per-cpu --warning-threshold 80 --critical-threshold 95 --json
```
#### Memory checker options
```bash
# Default: RAM only
uv run python manage.py run_check memory
# Include swap memory in the check
uv run python manage.py run_check memory --include-swap
# Memory with custom thresholds
uv run python manage.py run_check memory --include-swap --warning-threshold 75 --critical-threshold 90 --json
```
#### Disk checker options
```bash
# Default: check /
uv run python manage.py run_check disk
# Check specific paths
uv run python manage.py run_check disk --paths /
uv run python manage.py run_check disk --paths / /var /tmp /home
uv run python manage.py run_check disk --paths /var/log /var/lib
# Disk with thresholds + JSON
uv run python manage.py run_check disk --paths / /var/log --warning-threshold 80 --critical-threshold 95 --json
```
#### Network checker options
```bash
# Default hosts: 8.8.8.8, 1.1.1.1
uv run python manage.py run_check network
# Custom hosts
uv run python manage.py run_check network --hosts 8.8.8.8 1.1.1.1 github.com
uv run python manage.py run_check network --hosts google.com cloudflare.com aws.amazon.com
# Network with JSON
uv run python manage.py run_check network --hosts 8.8.8.8 google.com --json
```
#### Process checker options
```bash
# Check specific processes
uv run python manage.py run_check process --names nginx
uv run python manage.py run_check process --names nginx postgres redis
uv run python manage.py run_check process --names nginx postgres redis celery gunicorn
# Process with JSON
uv run python manage.py run_check process --names nginx postgres --json
```
#### Flag reference
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `checker` (positional) | str | required | Checker name (cpu, memory, disk, network, process, etc.) |
| `--json` | flag | — | Output as JSON |
| `--warning-threshold` | float | per-checker | Override warning threshold |
| `--critical-threshold` | float | per-checker | Override critical threshold |
| `--samples` | int | 5 | Number of CPU samples (cpu only) |
| `--sample-interval` | float | 1.0 | Seconds between CPU samples (cpu only) |
| `--per-cpu` | flag | — | Per-CPU mode, reports busiest core (cpu only) |
| `--include-swap` | flag | — | Include swap memory (memory only) |
| `--paths` | str... | `/` | Disk paths to check (disk only) |
| `--hosts` | str... | `8.8.8.8 1.1.1.1` | Hosts to ping (network only) |
| `--names` | str... | — | Process names to check (process only) |
Step 2: Verify no broken formatting
Visually inspect the rendered markdown or run a markdown linter.
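Beyond a visual pass, one cheap structural check needs no extra tooling: count the fence lines (lines starting with three backticks); an odd count means a fence was dropped somewhere in the rewrite:

```shell
# Fails when a markdown file contains an unbalanced number of code fences.
file="apps/checkers/README.md"
count=$(grep -c '^```' "$file")
if [ $((count % 2)) -ne 0 ]; then
  echo "Unbalanced code fences in $file ($count fence lines)" >&2
  exit 1
fi
echo "Fences balanced in $file ($count fence lines)"
```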
Step 3: Commit

```bash
git add apps/checkers/README.md
git commit -m "docs: comprehensive CLI reference for check_health and run_check"
```
Task 5: Rewrite apps/alerts/README.md CLI reference
Files:

- Modify: `apps/alerts/README.md` — replace lines 338-368 (the management command section under “Creating alerts from health checks”)
Step 1: Replace the management command subsection with comprehensive CLI reference

Replace from `### Using the management command` through the code block ending at line 368 with:
### `check_and_alert` management command
Run health checkers and automatically create alerts/incidents for failures.
```bash
# Run all checks and create alerts
uv run python manage.py check_and_alert
# Run specific checkers only
uv run python manage.py check_and_alert --checkers cpu memory
uv run python manage.py check_and_alert --checkers cpu memory disk network process
```
#### Dry run
Preview what would happen without creating any alerts or incidents:
```bash
uv run python manage.py check_and_alert --dry-run
# Dry run with specific checkers
uv run python manage.py check_and_alert --checkers cpu disk --dry-run
# Dry run with JSON output
uv run python manage.py check_and_alert --dry-run --json
```
#### Incident control
```bash
# Skip automatic incident creation (create alerts only)
uv run python manage.py check_and_alert --no-incidents
# Specific checkers without incidents
uv run python manage.py check_and_alert --checkers cpu memory --no-incidents
```
#### Custom labels
Add key=value labels to all generated alerts (repeatable):
```bash
# Single label
uv run python manage.py check_and_alert --label env=production
# Multiple labels
uv run python manage.py check_and_alert --label env=production --label team=sre --label datacenter=us-east-1
# Labels + specific checkers
uv run python manage.py check_and_alert --checkers cpu memory --label env=staging --label service=api
```
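As a sketch of the `KEY=VALUE` contract these repeatable flags follow (illustrative only; the command's own parser may differ in details), the first `=` splits key from value, and a pair without one is an error:

```shell
# Illustrative KEY=VALUE splitting, mirroring what --label expects.
for pair in "env=production" "team=sre" "datacenter=us-east-1"; do
  key="${pair%%=*}"
  value="${pair#*=}"
  if [ "$key" = "$pair" ] || [ -z "$key" ]; then
    echo "Expected KEY=VALUE, got: $pair" >&2
    exit 1
  fi
  echo "label: $key -> $value"
done
```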
#### Hostname override
Override the auto-detected hostname in alert labels:
```bash
uv run python manage.py check_and_alert --hostname web-server-01
# Hostname + labels
uv run python manage.py check_and_alert --hostname db-primary --label env=production --label role=database
```
#### JSON output
```bash
uv run python manage.py check_and_alert --json
# JSON + specific checkers
uv run python manage.py check_and_alert --checkers cpu disk --json
```
#### Threshold overrides
Override warning/critical thresholds for all checkers:
```bash
# Lower thresholds (more sensitive alerting)
uv run python manage.py check_and_alert --warning-threshold 60 --critical-threshold 80
# Higher thresholds (less noise)
uv run python manage.py check_and_alert --warning-threshold 85 --critical-threshold 98
```
#### Skip list override
```bash
# Include checkers that are normally skipped via CHECKERS_SKIP
uv run python manage.py check_and_alert --include-skipped
```
#### Combined examples
```bash
# Production cron job: all checks, JSON, custom labels
uv run python manage.py check_and_alert --json --label env=production --hostname prod-web-01
# CI smoke test: dry run, specific checkers, JSON
uv run python manage.py check_and_alert --checkers cpu memory --dry-run --json
# Sensitive alerting with full context
uv run python manage.py check_and_alert \
  --warning-threshold 50 --critical-threshold 75 \
  --label env=production --label team=oncall \
  --hostname api-server-03 --json
# Include all checkers even skipped ones, no incidents
uv run python manage.py check_and_alert --include-skipped --no-incidents --json
# Cron: every 5 minutes with labels and JSON log
*/5 * * * * cd /path/to/project && uv run python manage.py check_and_alert --json --label env=production >> /var/log/health-alerts.log 2>&1
```
#### Flag reference
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--checkers` | str... | all enabled | Specific checkers to run |
| `--json` | flag | — | Output as JSON |
| `--dry-run` | flag | — | Preview without creating alerts |
| `--no-incidents` | flag | — | Create alerts but skip incident creation |
| `--hostname` | str | auto-detected | Override hostname in alert labels |
| `--label` | KEY=VALUE | — | Add label to all alerts (repeatable) |
| `--warning-threshold` | float | per-checker | Override warning threshold |
| `--critical-threshold` | float | per-checker | Override critical threshold |
| `--include-skipped` | flag | — | Include checkers from CHECKERS_SKIP |
Step 2: Commit

```bash
git add apps/alerts/README.md
git commit -m "docs: comprehensive CLI reference for check_and_alert"
```
Task 6: Rewrite apps/intelligence/README.md CLI reference
Files:

- Modify: `apps/intelligence/README.md` — replace lines 20-51 (the “Management Command” section)
Step 1: Replace the management command section with comprehensive CLI reference
Replace lines 20-51 with:
### Management Command: `get_recommendations`
```bash
# Default: get general recommendations
uv run python manage.py get_recommendations
```
#### Analysis modes
```bash
# Memory analysis: top processes by memory usage
uv run python manage.py get_recommendations --memory
# Disk analysis: large files, old logs, cleanup candidates
uv run python manage.py get_recommendations --disk
# Disk analysis for a specific path
uv run python manage.py get_recommendations --disk --path /var/log
uv run python manage.py get_recommendations --disk --path /home
# All analysis (memory + disk combined)
uv run python manage.py get_recommendations --all
```
#### Incident-based analysis
```bash
# Analyze a specific incident (auto-detects type from title/description)
uv run python manage.py get_recommendations --incident-id 1
uv run python manage.py get_recommendations --incident-id 42
# Incident analysis with specific provider
uv run python manage.py get_recommendations --incident-id 1 --provider local
```
#### Provider selection
```bash
# List available providers
uv run python manage.py get_recommendations --list-providers
# Use a specific provider
uv run python manage.py get_recommendations --provider local
```
#### Tuning parameters
```bash
# Show top 5 processes (default: 10)
uv run python manage.py get_recommendations --top-n 5
# Show top 20 processes
uv run python manage.py get_recommendations --memory --top-n 20
# Lower threshold for "large" files (default: 100 MB)
uv run python manage.py get_recommendations --disk --threshold-mb 50
# Higher threshold
uv run python manage.py get_recommendations --disk --threshold-mb 500
# Detect files older than 7 days (default: 30)
uv run python manage.py get_recommendations --disk --old-days 7
# Detect files older than 90 days
uv run python manage.py get_recommendations --disk --old-days 90
```
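For intuition about what an `--old-days` cutoff flags in practice (an assumption about the semantics, not a guarantee of the implementation), `find -mtime` expresses the same modification-time cutoff, so you can preview candidates yourself:

```shell
# Preview files under /var/log last modified more than 30 days ago,
# roughly what an --old-days 30 scan would report.
find /var/log -type f -mtime +30 2>/dev/null | head -20
```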
#### JSON output
```bash
uv run python manage.py get_recommendations --json
uv run python manage.py get_recommendations --memory --json
uv run python manage.py get_recommendations --all --json
```
#### Combined examples
```bash
# Full analysis with tuned parameters + JSON
uv run python manage.py get_recommendations --all \
  --top-n 15 --threshold-mb 50 --old-days 14 --json
# Disk analysis for /var/log with low threshold
uv run python manage.py get_recommendations --disk \
  --path /var/log --threshold-mb 10 --old-days 7
# Memory analysis with top 5 + JSON
uv run python manage.py get_recommendations --memory --top-n 5 --json
# Incident analysis with custom provider + JSON
uv run python manage.py get_recommendations --incident-id 1 --provider local --json
```
#### Flag reference
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--memory` | flag | — | Get memory-specific recommendations |
| `--disk` | flag | — | Get disk-specific recommendations |
| `--all` | flag | — | Get all recommendations (memory + disk) |
| `--path` | str | `/` | Path to analyze for disk recommendations |
| `--incident-id` | int | — | Analyze a specific incident by ID |
| `--provider` | str | `local` | Intelligence provider to use |
| `--list-providers` | flag | — | List available providers and exit |
| `--top-n` | int | `10` | Number of top processes to report |
| `--threshold-mb` | float | `100.0` | Minimum file size in MB for "large" |
| `--old-days` | int | `30` | Age in days for old file detection |
| `--json` | flag | — | Output as JSON |
Step 2: Commit

```bash
git add apps/intelligence/README.md
git commit -m "docs: comprehensive CLI reference for get_recommendations"
```
Task 7: Rewrite apps/notify/README.md CLI reference
Files:

- Modify: `apps/notify/README.md` — replace lines 31-35 (management commands section) with comprehensive version
Step 1: Replace the management commands section
The current lines 31-35 are brief. Replace the `### Management commands` section with:
### Management commands
#### `list_notify_drivers`
```bash
# List available notification drivers
uv run python manage.py list_notify_drivers
# Show detailed configuration requirements (required/optional fields per driver)
uv run python manage.py list_notify_drivers --verbose
```
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--verbose` | flag | — | Show required/optional config fields per driver |
#### `test_notify`
Send a test notification to verify driver or channel configuration.
```bash
# Test using first active DB channel (no driver argument needed)
uv run python manage.py test_notify
# Test a specific driver
uv run python manage.py test_notify slack
uv run python manage.py test_notify email
uv run python manage.py test_notify pagerduty
uv run python manage.py test_notify generic
# Test a named DB channel
uv run python manage.py test_notify ops-slack
```
##### Custom message
```bash
# Custom title and message
uv run python manage.py test_notify slack --title "Deploy Alert" --message "Deployment started"
# Custom severity
uv run python manage.py test_notify slack --severity critical
uv run python manage.py test_notify slack --severity warning
uv run python manage.py test_notify slack --severity info
uv run python manage.py test_notify slack --severity success
# Custom channel destination
uv run python manage.py test_notify slack --channel "#ops-alerts"
```
##### Slack driver
```bash
# Slack with webhook URL
uv run python manage.py test_notify slack --webhook-url https://hooks.slack.com/services/T.../B.../XXX
# Slack with custom message + channel
uv run python manage.py test_notify slack \
  --webhook-url https://hooks.slack.com/services/T.../B.../XXX \
  --channel "#alerts" \
  --title "Test Alert" \
  --message "Testing Slack integration" \
  --severity warning
```
##### Email driver
```bash
# Email with SMTP config
uv run python manage.py test_notify email \
  --smtp-host smtp.gmail.com \
  --from-address alerts@example.com
# Email with TLS and custom port
uv run python manage.py test_notify email \
  --smtp-host smtp.gmail.com \
  --smtp-port 587 \
  --from-address alerts@example.com \
  --use-tls
# Email with full options
uv run python manage.py test_notify email \
  --smtp-host smtp.gmail.com \
  --smtp-port 587 \
  --from-address alerts@example.com \
  --use-tls \
  --title "Disk Alert" \
  --message "Disk usage critical on server-01" \
  --severity critical
```
##### PagerDuty driver
```bash
# PagerDuty with integration key
uv run python manage.py test_notify pagerduty --integration-key your-key-here
# PagerDuty with custom severity
uv run python manage.py test_notify pagerduty \
  --integration-key your-key-here \
  --title "API Down" \
  --message "API server not responding" \
  --severity critical
```
##### Generic HTTP driver
```bash
# Generic with endpoint
uv run python manage.py test_notify generic --endpoint https://api.example.com/notify
# Generic with API key
uv run python manage.py test_notify generic \
  --endpoint https://api.example.com/notify \
  --api-key your-api-key
# Generic with full options
uv run python manage.py test_notify generic \
  --endpoint https://api.example.com/notify \
  --api-key your-api-key \
  --title "Custom Alert" \
  --message "Something happened" \
  --severity warning
```
##### JSON config (advanced)
Pass full driver config as a JSON string (for complex configurations):
```bash
uv run python manage.py test_notify slack --json-config '{"webhook_url": "https://hooks.slack.com/...", "channel": "#alerts", "username": "Bot", "icon_emoji": ":robot:"}'
uv run python manage.py test_notify email --json-config '{"smtp_host": "smtp.gmail.com", "smtp_port": 587, "from_address": "alerts@example.com", "to_addresses": ["ops@example.com"], "use_tls": true}'
```
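Since a malformed `--json-config` string only fails at runtime, it can help to validate the JSON before passing it along. A quick sanity check using Python's standard `json.tool` (any JSON validator works; this is just one option):

```bash
# Validate a --json-config string before handing it to test_notify
CONFIG='{"webhook_url": "https://hooks.slack.com/...", "channel": "#alerts"}'
if echo "$CONFIG" | python3 -m json.tool > /dev/null; then
  echo "JSON OK"
  # Safe to pass along, e.g.:
  # uv run python manage.py test_notify slack --json-config "$CONFIG"
else
  echo "Invalid JSON" >&2
fi
```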
##### Flag reference
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `driver` (positional) | str | first active channel | Driver name or DB channel name |
| `--title` | str | `Test Alert` | Notification title |
| `--message` | str | default test message | Notification body |
| `--severity` | choice | `info` | `critical`, `warning`, `info`, or `success` |
| `--channel` | str | `default` | Destination channel/recipient |
| `--json-config` | str | — | Full driver config as JSON string |
| `--smtp-host` | str | — | SMTP host (email driver) |
| `--smtp-port` | int | `587` | SMTP port (email driver) |
| `--from-address` | str | — | Sender address (email driver) |
| `--use-tls` | flag | — | Enable TLS for SMTP (email driver) |
| `--webhook-url` | str | — | Webhook URL (slack driver) |
| `--integration-key` | str | — | Integration key (pagerduty driver) |
| `--endpoint` | str | — | API endpoint (generic driver) |
| `--api-key` | str | — | API key (generic driver) |
Step 2: Commit

```bash
git add apps/notify/README.md
git commit -m "docs: comprehensive CLI reference for list_notify_drivers and test_notify"
```
Task 8: Rewrite apps/orchestration/README.md CLI reference
Files:
- Modify: `apps/orchestration/README.md` — replace lines 120-179 (Usage Examples section)
Step 1: Replace the management command sections
Replace from `## Usage Examples` through the `### Monitor Pipeline Command` section (lines 120-179) with:
## CLI Reference
### `run_pipeline`
Execute the full pipeline (ingest → check → analyze → notify) or parts of it.
```bash
# Run with sample alert payload (quickest test)
uv run python manage.py run_pipeline --sample
# Dry run: show what would happen without executing
uv run python manage.py run_pipeline --sample --dry-run
```
#### Payload sources
```bash
# Sample payload (built-in test data)
uv run python manage.py run_pipeline --sample
# From a JSON file
uv run python manage.py run_pipeline --file alert.json
uv run python manage.py run_pipeline --file /path/to/payload.json
# Inline JSON string
uv run python manage.py run_pipeline --payload '{"name": "Test Alert", "status": "firing", "severity": "warning"}'
```
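For the `--file` variants above, the file is simply the same JSON payload saved to disk. A minimal sketch of `alert.json` using the fields from the inline example (illustrative only; the exact accepted schema depends on the chosen `--source` format):

```json
{
  "name": "Test Alert",
  "status": "firing",
  "severity": "warning"
}
```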
#### Source format
```bash
# Specify the alert source format
uv run python manage.py run_pipeline --sample --source alertmanager
uv run python manage.py run_pipeline --sample --source grafana
uv run python manage.py run_pipeline --sample --source pagerduty
uv run python manage.py run_pipeline --sample --source generic
uv run python manage.py run_pipeline --file alert.json --source datadog
```
#### Environment and correlation
```bash
# Set environment name
uv run python manage.py run_pipeline --sample --environment production
uv run python manage.py run_pipeline --sample --environment staging
# Set custom trace ID for correlation
uv run python manage.py run_pipeline --sample --trace-id my-trace-123
# Both
uv run python manage.py run_pipeline --sample --environment production --trace-id deploy-v2.1.0
```
#### Partial execution
```bash
# Run only the checkers stage (skip alert ingestion)
uv run python manage.py run_pipeline --sample --checks-only
# Checks only + dry run
uv run python manage.py run_pipeline --sample --checks-only --dry-run
```
#### Notification driver
```bash
# Specify which notification driver to use
uv run python manage.py run_pipeline --sample --notify-driver slack
uv run python manage.py run_pipeline --sample --notify-driver email
uv run python manage.py run_pipeline --sample --notify-driver pagerduty
uv run python manage.py run_pipeline --sample --notify-driver generic
```
#### Definition-based pipelines
```bash
# Run a pipeline definition stored in the database
uv run python manage.py run_pipeline --definition my-pipeline-name
# Run from a JSON config file
uv run python manage.py run_pipeline --config path/to/pipeline.json
# Definition + environment
uv run python manage.py run_pipeline --definition production-pipeline --environment production
```
#### JSON output
```bash
uv run python manage.py run_pipeline --sample --json
uv run python manage.py run_pipeline --file alert.json --json
```
#### Combined examples
```bash
# Full production pipeline: file payload, production env, trace ID, slack notify, JSON
uv run python manage.py run_pipeline \
--file alert.json \
--source grafana \
--environment production \
--trace-id incident-2024-001 \
--notify-driver slack \
--json
# Quick smoke test: sample, dry run, JSON
uv run python manage.py run_pipeline --sample --dry-run --json
# Checks-only with custom source and trace
uv run python manage.py run_pipeline --sample --checks-only --source alertmanager --trace-id diag-run-1
# Definition pipeline with all options
uv run python manage.py run_pipeline \
--definition my-pipeline \
--environment staging \
--trace-id test-run-42 \
--notify-driver email \
--json
```
#### Flag reference
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--sample` | flag | — | Use built-in sample alert payload |
| `--payload` | str | — | Inline JSON payload string |
| `--file` | str | — | Path to JSON payload file |
| `--source` | str | `cli` | Alert source format |
| `--environment` | str | `development` | Environment name |
| `--trace-id` | str | auto-generated | Custom trace ID for correlation |
| `--checks-only` | flag | — | Run only checkers stage |
| `--dry-run` | flag | — | Preview without executing |
| `--notify-driver` | str | `generic` | Notification driver to use |
| `--json` | flag | — | Output as JSON |
| `--definition` | str | — | Pipeline definition name (from DB) |
| `--config` | str | — | Path to pipeline definition JSON file |
---
### `monitor_pipeline`
View and monitor pipeline run history.
```bash
# List recent pipeline runs (default: last 10)
uv run python manage.py monitor_pipeline
# Show more runs
uv run python manage.py monitor_pipeline --limit 25
uv run python manage.py monitor_pipeline --limit 50
uv run python manage.py monitor_pipeline --limit 100
```
#### Filter by status
```bash
# Show only failed runs
uv run python manage.py monitor_pipeline --status failed
# Show only completed runs
uv run python manage.py monitor_pipeline --status notified
# Other statuses
uv run python manage.py monitor_pipeline --status pending
uv run python manage.py monitor_pipeline --status ingested
uv run python manage.py monitor_pipeline --status checked
uv run python manage.py monitor_pipeline --status analyzed
uv run python manage.py monitor_pipeline --status retrying
uv run python manage.py monitor_pipeline --status skipped
```
#### Inspect a specific run
```bash
# Get full details for a pipeline run by run_id
uv run python manage.py monitor_pipeline --run-id abc123
uv run python manage.py monitor_pipeline --run-id 550e8400-e29b-41d4-a716-446655440000
```
#### Combined examples
```bash
# Last 50 failed runs
uv run python manage.py monitor_pipeline --status failed --limit 50
# Last 20 completed runs
uv run python manage.py monitor_pipeline --status notified --limit 20
```
#### Flag reference
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--limit` | int | `10` | Number of pipeline runs to show |
| `--status` | str | all | Filter by status (pending, ingested, checked, analyzed, notified, failed, retrying, skipped) |
| `--run-id` | str | — | Show details for a specific pipeline run |
Step 2: Commit

```bash
git add apps/orchestration/README.md
git commit -m "docs: comprehensive CLI reference for run_pipeline and monitor_pipeline"
```
Task 9: Rewrite bin/README.md with alias reference and command quick-reference
Files:
- Modify: `bin/README.md`
Step 1: Rewrite bin/README.md
Replace the entire file with:
# Shell Scripts & CLI
This directory contains shell scripts for installation, automation, and interactive usage.
[toc]
## Quick Command Reference
All management commands and their shell aliases (set up via `setup_aliases.sh`):
| Alias (default `sm-` prefix) | Management Command | App | Description |
|------|------|-----|-------------|
| `sm-check-health` | `check_health` | checkers | Run health checks (CPU, memory, disk, network, process) |
| `sm-run-check` | `run_check` | checkers | Run a single checker with checker-specific options |
| `sm-check-and-alert` | `check_and_alert` | alerts | Run checks and create alerts/incidents |
| `sm-get-recommendations` | `get_recommendations` | intelligence | Get AI-powered system recommendations |
| `sm-run-pipeline` | `run_pipeline` | orchestration | Execute the full pipeline |
| `sm-monitor-pipeline` | `monitor_pipeline` | orchestration | Monitor pipeline run history |
| `sm-test-notify` | `test_notify` | notify | Test notification delivery |
| `sm-list-notify-drivers` | `list_notify_drivers` | notify | List available notification drivers |
| `sm-cli` | — | — | Interactive CLI menu |
Aliases pass all flags through. Example: `sm-check-health --json` = `uv run python manage.py check_health --json`.
For full flag reference per command, see the app READMEs:
- [`apps/checkers/README.md`](../apps/checkers/README.md) — `check_health` (10 flags), `run_check` (11 flags)
- [`apps/alerts/README.md`](../apps/alerts/README.md) — `check_and_alert` (9 flags)
- [`apps/intelligence/README.md`](../apps/intelligence/README.md) — `get_recommendations` (11 flags)
- [`apps/notify/README.md`](../apps/notify/README.md) — `list_notify_drivers` (1 flag), `test_notify` (14 flags)
- [`apps/orchestration/README.md`](../apps/orchestration/README.md) — `run_pipeline` (12 flags), `monitor_pipeline` (3 flags)
---
## Scripts
### `setup_aliases.sh` — Shell Alias Setup
Set up shell aliases so you can run `sm-check-health` instead of `uv run python manage.py check_health`.
```bash
# Interactive setup (prompts for prefix, default: sm)
./bin/setup_aliases.sh
# Custom prefix
./bin/setup_aliases.sh --prefix maint
# Creates: maint-check-health, maint-run-check, etc.
# Show current aliases
./bin/setup_aliases.sh --list
# Remove aliases and source line from shell profile
./bin/setup_aliases.sh --remove
```
**What it does:**
- Generates `bin/aliases.sh` (gitignored) with aliases locked to the project path
- Adds a `source` line to `~/.zshrc` or `~/.bashrc`
- `--remove` undoes both
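For reference, the generated `bin/aliases.sh` is a plain sourceable file along these lines (an illustrative sketch only: the real contents are produced by `setup_aliases.sh`, `/path/to/project` stands in for your actual project path, and the use of uv's `--project` flag to pin the environment is an assumption about the generated shape):

```bash
# bin/aliases.sh (generated by setup_aliases.sh; gitignored, do not edit)
# Path-locked so the aliases work from any working directory.
alias sm-check-health='uv run --project "/path/to/project" python "/path/to/project/manage.py" check_health'
alias sm-run-pipeline='uv run --project "/path/to/project" python "/path/to/project/manage.py" run_pipeline'
# ...one alias per management command, plus:
alias sm-cli='"/path/to/project/bin/cli.sh"'
```

Any flags typed after an alias are appended after the expansion, which is why `sm-check-health --json` behaves like `check_health --json`.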
**After setup, activate immediately:**
```bash
source ~/.zshrc # or source ~/.bashrc
```
---
### `cli.sh` — Interactive CLI
An interactive menu-driven interface for all management commands. Recommended for new users and manual operations.
```bash
# Start interactive mode
./bin/cli.sh
# Direct shortcuts
./bin/cli.sh install # Jump to installation menu
./bin/cli.sh health # Jump to health monitoring
./bin/cli.sh alerts # Jump to alerts menu
./bin/cli.sh intel # Jump to intelligence menu
./bin/cli.sh pipeline # Jump to pipeline menu
./bin/cli.sh notify # Jump to notifications menu
./bin/cli.sh help # Show all options
```
**Features:**
- Color-coded output
- Shows available flags and options for each command
- Confirms before running commands
- Installation status check
- Shell alias setup option
---
### `install.sh` — Project Installer
Full installation script for setting up the project.
```bash
./bin/install.sh
```
**What it does:**
- Verifies Python 3.10+ (tries python3.13 → python3.10 → python3)
- Installs `uv` if missing
- Creates `.env` from `.env.sample`
- Prompts for dev/production configuration
- Installs dependencies with `uv sync`
- Runs Django migrations
- Optionally runs health checks
- Optionally sets up cron
See [`docs/Installation.md`](../docs/Installation.md) for full details.
---
### `setup_cron.sh` — Cron Setup
Sets up scheduled health checks via cron.
```bash
./bin/setup_cron.sh
```
**What it does:**
- Detects project directory
- Lets you choose a schedule (5 min / 15 min / hourly / custom)
- Writes crontab entry for `check_and_alert --json`
- Logs to `cron.log` in project root
**Useful commands after setup:**
```bash
crontab -l # View cron entries
tail -f ./cron.log # Follow cron output
```
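The resulting crontab entry looks roughly like this (a sketch assuming the 5-minute schedule; the actual line is written by the script, and the project path will differ):

```
*/5 * * * * cd /path/to/project && uv run python manage.py check_and_alert --json >> /path/to/project/cron.log 2>&1
```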
---
## Permissions
If you get "permission denied", make scripts executable:
```bash
chmod +x ./bin/*.sh
```
Step 2: Commit

```bash
git add bin/README.md
git commit -m "docs: rewrite bin/README.md with alias reference and command quick-reference"
```
Task 10: Update root README.md with aliases in quickstart
Files:
- Modify: `README.md`
Step 1: Add aliases to quickstart section
After step 3 in the Quickstart section (after the `./bin/cli.sh` block, around line 133), add:

4) (Optional) Set up shell aliases for quick command access:

```bash
./bin/setup_aliases.sh
```

After setup, use aliases like `sm-check-health`, `sm-run-check`, etc. See `bin/README.md` for the full alias list.

**Step 2: Commit**

```bash
git add README.md
git commit -m "docs: add alias setup to quickstart"
```
Task 11: Full verification

Step 1: Run tests

```bash
uv run pytest -v
```

Step 2: Django checks

```bash
uv run python manage.py check
```

Step 3: Linting

```bash
uv run black --check .
uv run ruff check .
```

Step 4: Verify alias script works

```bash
./bin/setup_aliases.sh --prefix test-sm
./bin/setup_aliases.sh --list
./bin/setup_aliases.sh --remove
```

Step 5: Grep for stale references

```bash
grep -r "orchestration-pipelines" . --include="*.md" | grep -v docs/plans
```
Files Summary

| File | Change |
|---|---|
| `bin/setup_aliases.sh` | Create — interactive alias setup script |
| `.gitignore` | Add `bin/aliases.sh` |
| `apps/checkers/checks.py` | Add `@register("aliases")` system check |
| `apps/checkers/_tests/test_checks_aliases.py` | Create — 4 tests for aliases check |
| `bin/cli.sh` | Add alias hint + install menu option |
| `apps/checkers/README.md` | Rewrite CLI reference (`check_health` + `run_check`) |
| `apps/alerts/README.md` | Rewrite CLI reference (`check_and_alert`) |
| `apps/intelligence/README.md` | Rewrite CLI reference (`get_recommendations`) |
| `apps/notify/README.md` | Rewrite CLI reference (`list_notify_drivers` + `test_notify`) |
| `apps/orchestration/README.md` | Rewrite CLI reference (`run_pipeline` + `monitor_pipeline`) |
| `bin/README.md` | Rewrite with alias table + command quick-reference |
| `README.md` | Add alias setup to quickstart |