CLI Aliases & CLI-First Documentation Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Add optional shell aliases for all 8 management commands and overhaul every app README to document every CLI flag and combination.

Architecture: Create bin/setup_aliases.sh that generates a sourceable bin/aliases.sh file, add a Django system check for dev environments, update cli.sh with alias hints, and rewrite CLI reference sections in all app READMEs to cover all 67 flags.

Tech Stack: Bash, Django system checks, Markdown


Task 1: Create bin/setup_aliases.sh and update .gitignore

Files:

  • Create: bin/setup_aliases.sh
  • Modify: .gitignore

Step 1: Add bin/aliases.sh to .gitignore

Append to the end of .gitignore:

# Generated alias file (user-specific, project-path-locked)
bin/aliases.sh

Step 2: Create bin/setup_aliases.sh

#!/usr/bin/env bash
#
# Setup shell aliases for server-maintanence management commands.
#
# Usage:
#   ./bin/setup_aliases.sh                  # Interactive setup (default prefix: sm)
#   ./bin/setup_aliases.sh --prefix maint   # Custom prefix
#   ./bin/setup_aliases.sh --remove         # Remove aliases and source line
#   ./bin/setup_aliases.sh --list           # Show current aliases
#

set -e

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'

info() { echo -e "${BLUE}[INFO]${NC} $1"; }
success() { echo -e "${GREEN}[SUCCESS]${NC} $1"; }
warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
error() { echo -e "${RED}[ERROR]${NC} $1"; }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
ALIASES_FILE="$SCRIPT_DIR/aliases.sh"
SOURCE_LINE="source \"$ALIASES_FILE\"  # server-maintanence aliases"

# Detect shell profile
detect_shell_profile() {
    # $ZSH_VERSION covers being sourced under zsh; matching $SHELL against
    # */zsh covers a zsh login shell regardless of where zsh is installed
    if [ -n "$ZSH_VERSION" ] || [[ "$SHELL" == */zsh ]]; then
        echo "$HOME/.zshrc"
    else
        echo "$HOME/.bashrc"
    fi
}

SHELL_PROFILE="$(detect_shell_profile)"

# Parse arguments
PREFIX="sm"
ACTION="install"

while [[ $# -gt 0 ]]; do
    case $1 in
        --prefix)
            PREFIX="$2"
            shift 2
            ;;
        --prefix=*)
            PREFIX="${1#*=}"
            shift
            ;;
        --remove)
            ACTION="remove"
            shift
            ;;
        --list)
            ACTION="list"
            shift
            ;;
        --help|-h)
            echo "Usage: $0 [OPTIONS]"
            echo ""
            echo "Options:"
            echo "  --prefix PREFIX   Alias prefix (default: sm)"
            echo "  --remove          Remove aliases and source line"
            echo "  --list            Show current aliases"
            echo "  --help            Show this help"
            exit 0
            ;;
        *)
            error "Unknown option: $1"
            exit 1
            ;;
    esac
done
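
The `--prefix=*` branch above relies on Bash parameter expansion. An isolated sketch of what `${1#*=}` does (strips everything up to and including the first `=`):

```shell
# "${arg#*=}" removes the shortest prefix matching "*=", leaving the value
arg="--prefix=maint"
prefix="${arg#*=}"
echo "$prefix"   # maint
```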

generate_aliases() {
    local prefix="$1"
    cat <<ALIASES
#!/usr/bin/env bash
#
# Auto-generated by bin/setup_aliases.sh
# Prefix: ${prefix}
# Project: ${PROJECT_DIR}
#
# Re-generate: ./bin/setup_aliases.sh --prefix ${prefix}
# Remove:      ./bin/setup_aliases.sh --remove
#

alias ${prefix}-check-health='cd "${PROJECT_DIR}" && uv run python manage.py check_health'
alias ${prefix}-run-check='cd "${PROJECT_DIR}" && uv run python manage.py run_check'
alias ${prefix}-check-and-alert='cd "${PROJECT_DIR}" && uv run python manage.py check_and_alert'
alias ${prefix}-get-recommendations='cd "${PROJECT_DIR}" && uv run python manage.py get_recommendations'
alias ${prefix}-run-pipeline='cd "${PROJECT_DIR}" && uv run python manage.py run_pipeline'
alias ${prefix}-monitor-pipeline='cd "${PROJECT_DIR}" && uv run python manage.py monitor_pipeline'
alias ${prefix}-test-notify='cd "${PROJECT_DIR}" && uv run python manage.py test_notify'
alias ${prefix}-list-notify-drivers='cd "${PROJECT_DIR}" && uv run python manage.py list_notify_drivers'
alias ${prefix}-cli='${SCRIPT_DIR}/cli.sh'
ALIASES
}

do_install() {
    echo ""
    echo "============================================"
    echo "   server-maintanence Alias Setup"
    echo "============================================"
    echo ""

    # Prompt interactively when the prefix is still the default and stdin is
    # a TTY (note: an explicit "--prefix sm" also triggers the prompt)
    if [ "$PREFIX" = "sm" ] && [ -t 0 ]; then
        read -r -p "Alias prefix (default: sm): " user_prefix
        PREFIX="${user_prefix:-sm}"
    fi

    info "Using prefix: ${PREFIX}"
    info "Generating aliases..."

    generate_aliases "$PREFIX" > "$ALIASES_FILE"
    chmod +x "$ALIASES_FILE"
    success "Created $ALIASES_FILE"

    echo ""
    info "Aliases that will be available:"
    echo ""
    echo "  ${PREFIX}-check-health          Run all health checks"
    echo "  ${PREFIX}-run-check             Run a single checker"
    echo "  ${PREFIX}-check-and-alert       Run checks and create alerts"
    echo "  ${PREFIX}-get-recommendations   Get AI recommendations"
    echo "  ${PREFIX}-run-pipeline          Execute a pipeline"
    echo "  ${PREFIX}-monitor-pipeline      Monitor pipeline runs"
    echo "  ${PREFIX}-test-notify           Test notification delivery"
    echo "  ${PREFIX}-list-notify-drivers   List notification drivers"
    echo "  ${PREFIX}-cli                   Interactive CLI"
    echo ""

    # Add source line to shell profile
    if [ -f "$SHELL_PROFILE" ]; then
        if grep -qF "server-maintanence aliases" "$SHELL_PROFILE" 2>/dev/null; then
            # Replace existing source line
            local tmp_file
            tmp_file=$(mktemp)
            grep -vF "server-maintanence aliases" "$SHELL_PROFILE" > "$tmp_file"
            echo "$SOURCE_LINE" >> "$tmp_file"
            mv "$tmp_file" "$SHELL_PROFILE"
            success "Updated source line in $SHELL_PROFILE"
        else
            echo "" >> "$SHELL_PROFILE"
            echo "$SOURCE_LINE" >> "$SHELL_PROFILE"
            success "Added source line to $SHELL_PROFILE"
        fi
    else
        warn "$SHELL_PROFILE not found. Add this line manually:"
        echo "  $SOURCE_LINE"
    fi

    echo ""
    info "To activate now, run:"
    echo "  source $SHELL_PROFILE"
    echo ""
    success "Alias setup complete!"
}

do_remove() {
    echo ""
    info "Removing aliases..."

    # Remove aliases file
    if [ -f "$ALIASES_FILE" ]; then
        rm "$ALIASES_FILE"
        success "Removed $ALIASES_FILE"
    else
        warn "$ALIASES_FILE not found (already removed?)"
    fi

    # Remove source line from shell profile
    if [ -f "$SHELL_PROFILE" ] && grep -qF "server-maintanence aliases" "$SHELL_PROFILE" 2>/dev/null; then
        local tmp_file
        tmp_file=$(mktemp)
        grep -vF "server-maintanence aliases" "$SHELL_PROFILE" > "$tmp_file"
        mv "$tmp_file" "$SHELL_PROFILE"
        success "Removed source line from $SHELL_PROFILE"
    fi

    echo ""
    info "Restart your shell or run: source $SHELL_PROFILE"
    success "Aliases removed!"
}

do_list() {
    if [ -f "$ALIASES_FILE" ]; then
        info "Current aliases ($ALIASES_FILE):"
        echo ""
        grep "^alias " "$ALIASES_FILE" | sed 's/alias /  /'
    else
        warn "No aliases configured. Run: ./bin/setup_aliases.sh"
    fi
}

case $ACTION in
    install) do_install ;;
    remove)  do_remove ;;
    list)    do_list ;;
esac

Step 3: Make executable

chmod +x bin/setup_aliases.sh

Step 4: Verify

# Test help
./bin/setup_aliases.sh --help

# Test generation (non-interactive with explicit prefix)
./bin/setup_aliases.sh --prefix test-sm
cat bin/aliases.sh
./bin/setup_aliases.sh --remove

Step 5: Commit

git add bin/setup_aliases.sh .gitignore
git commit -m "feat: add shell alias setup script with customizable prefix"

Task 2: Add Django system check for aliases

Files:

  • Modify: apps/checkers/checks.py — add @register("aliases") check
  • Create: apps/checkers/_tests/test_checks_aliases.py — tests

Step 1: Write failing tests in apps/checkers/_tests/test_checks_aliases.py

"""Tests for the aliases Django system check."""

import os
from unittest.mock import patch

from django.test import TestCase, override_settings

from apps.checkers.checks import check_aliases_configured


class CheckAliasesConfiguredTests(TestCase):
    """Tests for the check_aliases_configured system check."""

    @override_settings(DEBUG=False)
    def test_skips_in_production(self):
        """Check should return no warnings when DEBUG=False."""
        result = check_aliases_configured(None)
        assert result == []

    @override_settings(DEBUG=True)
    def test_skips_during_tests(self):
        """Check should return no warnings when running under pytest."""
        result = check_aliases_configured(None)
        assert result == []

    @override_settings(DEBUG=True)
    @patch("apps.checkers.checks._is_testing", return_value=False)
    def test_warns_when_aliases_file_missing(self, _mock_testing):
        """Check should warn when bin/aliases.sh does not exist."""
        with patch("apps.checkers.checks._aliases_file_exists", return_value=False):
            result = check_aliases_configured(None)
        assert len(result) == 1
        assert result[0].id == "checkers.W009"
        assert "aliases" in result[0].msg.lower()

    @override_settings(DEBUG=True)
    @patch("apps.checkers.checks._is_testing", return_value=False)
    def test_no_warning_when_aliases_file_exists(self, _mock_testing):
        """Check should return no warnings when bin/aliases.sh exists."""
        with patch("apps.checkers.checks._aliases_file_exists", return_value=True):
            result = check_aliases_configured(None)
        assert result == []

Step 2: Run tests to verify they fail

uv run pytest apps/checkers/_tests/test_checks_aliases.py -v

Expected: FAIL (ImportError — check_aliases_configured and _aliases_file_exists don’t exist)

Step 3: Implement in apps/checkers/checks.py

Add after the existing check_crontab_configuration function (before @register(Tags.database, deploy=True)):

def _aliases_file_exists():
    """Return True if bin/aliases.sh exists in the project root."""
    from django.conf import settings

    base_dir = getattr(settings, "BASE_DIR", None)
    if base_dir is None:
        return True  # Can't check, don't warn
    aliases_path = os.path.join(str(base_dir), "bin", "aliases.sh")
    return os.path.isfile(aliases_path)


@register("aliases")
def check_aliases_configured(app_configs, **kwargs):
    """
    Check that shell aliases are configured for management commands.

    Only runs in development (DEBUG=True and not in tests).
    """
    from django.conf import settings

    if not settings.DEBUG or _is_testing():
        return []

    errors = []

    if not _aliases_file_exists():
        errors.append(
            Warning(
                "Shell aliases not configured for management commands",
                hint=(
                    "Run 'bin/setup_aliases.sh' to set up quick aliases like "
                    "sm-check-health, sm-run-check, etc. "
                    "This is optional but improves developer experience."
                ),
                id="checkers.W009",
            )
        )

    return errors

Also add import os at the top of the file (if not already present).
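
The path logic in `_aliases_file_exists` can be exercised without Django. A standalone shell sketch of the same check, using a temp directory as a stand-in for `BASE_DIR`:

```shell
# Same containment check as the Python helper: does bin/aliases.sh exist
# under the given base directory?
aliases_file_exists() { [ -f "$1/bin/aliases.sh" ]; }

base="$(mktemp -d)"
aliases_file_exists "$base" || echo "missing"   # file not created yet
mkdir -p "$base/bin" && : > "$base/bin/aliases.sh"
aliases_file_exists "$base" && echo "present"
rm -rf "$base"
```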

Step 4: Run tests to verify they pass

uv run pytest apps/checkers/_tests/test_checks_aliases.py -v

Step 5: Update Django System Checks table in apps/checkers/README.md

Add a row to the existing table at line ~122:

| `aliases` | `checkers.W009` | Shell aliases not configured (dev only) |

Step 6: Commit

git add apps/checkers/checks.py apps/checkers/_tests/test_checks_aliases.py apps/checkers/README.md
git commit -m "feat: add Django system check for shell aliases (dev only)"

Task 3: Update bin/cli.sh — startup hint + install menu option

Files:

  • Modify: bin/cli.sh

Step 1: Add alias hint to show_banner()

Replace the show_banner function (lines 39-46) with:

show_banner() {
    clear
    echo -e "${CYAN}"
    echo "╔══════════════════════════════════════════════════════════════╗"
    echo "║                    Server Maintenance CLI                    ║"
    echo "╚══════════════════════════════════════════════════════════════╝"
    echo -e "${NC}"
    # Show alias hint if aliases are not configured
    if [ ! -f "$SCRIPT_DIR/aliases.sh" ]; then
        echo -e "${YELLOW}Tip:${NC} Run ${CYAN}bin/setup_aliases.sh${NC} for quick command aliases (sm-check-health, sm-run-check, etc.)"
        echo ""
    fi
}

Step 2: Add “Setup shell aliases” to install menu

Replace the install_project function (lines 137-175). The options array becomes:

    local options=(
        "Full installation (uv sync + pre-commit)"
        "Install dependencies only (uv sync)"
        "Install pre-commit hooks"
        "Setup shell aliases"
        "Check installation status"
        "Back to main menu"
    )

And the select case becomes:

    select opt in "${options[@]}"; do
        case $REPLY in
            1)
                echo -e "${YELLOW}Running full installation...${NC}"
                run_command "uv sync" "Installing dependencies"
                run_command "uv run pre-commit install" "Installing pre-commit hooks"
                ;;
            2)
                run_command "uv sync" "Installing dependencies"
                ;;
            3)
                run_command "uv run pre-commit install" "Installing pre-commit hooks"
                ;;
            4)
                run_command "$SCRIPT_DIR/setup_aliases.sh" "Setting up shell aliases"
                ;;
            5)
                check_installation
                ;;
            6)
                return
                ;;
            *)
                echo -e "${RED}Invalid option${NC}"
                ;;
        esac
        break
    done

Step 3: Add alias check to check_installation()

Add after the pre-commit check (around line 199):

    # Check aliases
    if [ -f "$SCRIPT_DIR/aliases.sh" ]; then
        echo -e "${GREEN}✓${NC} Shell aliases configured"
    else
        echo -e "${YELLOW}!${NC} Shell aliases not configured (run bin/setup_aliases.sh)"
    fi

Step 4: Verify

./bin/cli.sh help

Step 5: Commit

git add bin/cli.sh
git commit -m "feat: add alias setup option and hint to interactive CLI"

Task 4: Rewrite apps/checkers/README.md CLI reference

Files:

  • Modify: apps/checkers/README.md — replace lines 144-264 (the “Running checks” section)

Step 1: Replace the “Running checks” section (lines 144-264) with comprehensive CLI reference

The new section should cover check_health (10 flags) and run_check (11 flags) with every combination:

## CLI Reference

There are two management commands for running checks. All flags can be passed after the alias too (e.g., `sm-check-health --json`).

### `check_health`

Run all checkers (or a selection) and show a summary.

```bash
# Run ALL registered checkers
uv run python manage.py check_health

# Run specific checkers only
uv run python manage.py check_health cpu memory
uv run python manage.py check_health cpu memory disk network process

# List available checkers and exit
uv run python manage.py check_health --list
```

#### JSON output

```bash
# JSON output (for scripts, cron, piping to jq)
uv run python manage.py check_health --json

# Specific checkers + JSON
uv run python manage.py check_health cpu disk --json
```

#### Exit codes for CI/automation

Default exit codes: `2` if any check is CRITICAL, `1` if any check is UNKNOWN, `0` otherwise.

```bash
# Exit 1 if ANY check is WARNING or CRITICAL (strictest)
uv run python manage.py check_health --fail-on-warning

# Exit 1 only if ANY check is CRITICAL
uv run python manage.py check_health --fail-on-critical

# CI pipeline example: fail build on critical
uv run python manage.py check_health --fail-on-critical --json
```
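
A sketch of consuming those exit codes in a wrapper script. The hardcoded `status=2` is a stand-in for actually running `uv run python manage.py check_health; status=$?`:

```shell
# Dispatch on check_health's default exit-code scheme (0/1/2)
status=2   # stand-in for: uv run python manage.py check_health; status=$?
case "$status" in
    0) echo "all checks passed" ;;
    1) echo "at least one check UNKNOWN" ;;
    2) echo "at least one check CRITICAL" ;;
esac
```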

#### Threshold overrides

Override default warning/critical thresholds for all checkers in this run:

```bash
# Lower thresholds (more sensitive)
uv run python manage.py check_health --warning-threshold 60 --critical-threshold 80

# Higher thresholds (less sensitive)
uv run python manage.py check_health --warning-threshold 85 --critical-threshold 98

# Override thresholds for specific checkers only
uv run python manage.py check_health cpu memory --warning-threshold 75 --critical-threshold 95
```

#### Checker-specific options

These flags are passed to the relevant checker when it runs:

```bash
# Disk: check specific mount points
uv run python manage.py check_health disk --disk-paths / /var /tmp /home

# Network: ping specific hosts
uv run python manage.py check_health network --ping-hosts 8.8.8.8 1.1.1.1 github.com

# Process: verify specific processes are running
uv run python manage.py check_health process --processes nginx postgres redis celery
```

#### Combined examples

```bash
# Full CI check: all checkers, JSON, fail on warning
uv run python manage.py check_health --json --fail-on-warning

# Disk + network with custom targets + thresholds
uv run python manage.py check_health disk network \
  --disk-paths / /var/log \
  --ping-hosts 8.8.8.8 google.com \
  --warning-threshold 75 --critical-threshold 90

# Cron job: all checks, JSON, append to log
uv run python manage.py check_health --json >> /var/log/health-checks.log 2>&1

# Quick smoke test: CPU + memory, fail on critical
uv run python manage.py check_health cpu memory --fail-on-critical
```

#### Flag reference

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `checkers` (positional) | str... | all | Specific checkers to run (space-separated) |
| `--list` | flag | — | List available checkers and exit |
| `--json` | flag | — | Output results as JSON |
| `--fail-on-warning` | flag | — | Exit 1 if any WARNING or CRITICAL |
| `--fail-on-critical` | flag | — | Exit 1 only if any CRITICAL |
| `--warning-threshold` | float | per-checker | Override warning threshold for all checks |
| `--critical-threshold` | float | per-checker | Override critical threshold for all checks |
| `--disk-paths` | str... | `/` | Paths to check (disk checker) |
| `--ping-hosts` | str... | `8.8.8.8 1.1.1.1` | Hosts to ping (network checker) |
| `--processes` | str... | — | Process names to check (process checker) |

---

### `run_check`

Run a **single** checker with checker-specific options.

```bash
# Basic usage
uv run python manage.py run_check cpu
uv run python manage.py run_check memory
uv run python manage.py run_check disk
uv run python manage.py run_check network
uv run python manage.py run_check process
```

#### JSON output

```bash
uv run python manage.py run_check cpu --json
uv run python manage.py run_check disk --json
```

#### Threshold overrides

```bash
# Override thresholds for this single check
uv run python manage.py run_check cpu --warning-threshold 80 --critical-threshold 95
uv run python manage.py run_check memory --warning-threshold 75 --critical-threshold 90
uv run python manage.py run_check disk --warning-threshold 85 --critical-threshold 98
```

#### CPU checker options

```bash
# Default: 5 samples, 1 second apart
uv run python manage.py run_check cpu

# More samples for better accuracy
uv run python manage.py run_check cpu --samples 10

# Faster sampling (0.5s intervals)
uv run python manage.py run_check cpu --sample-interval 0.5

# Quick snapshot (1 sample, no wait)
uv run python manage.py run_check cpu --samples 1 --sample-interval 0

# Per-CPU mode (reports busiest core)
uv run python manage.py run_check cpu --per-cpu

# All CPU options combined
uv run python manage.py run_check cpu --samples 10 --sample-interval 0.5 --per-cpu

# CPU with threshold override + JSON
uv run python manage.py run_check cpu --samples 10 --per-cpu --warning-threshold 80 --critical-threshold 95 --json
```

#### Memory checker options

```bash
# Default: RAM only
uv run python manage.py run_check memory

# Include swap memory in the check
uv run python manage.py run_check memory --include-swap

# Memory with custom thresholds
uv run python manage.py run_check memory --include-swap --warning-threshold 75 --critical-threshold 90 --json
```

#### Disk checker options

```bash
# Default: check /
uv run python manage.py run_check disk

# Check specific paths
uv run python manage.py run_check disk --paths /
uv run python manage.py run_check disk --paths / /var /tmp /home
uv run python manage.py run_check disk --paths /var/log /var/lib

# Disk with thresholds + JSON
uv run python manage.py run_check disk --paths / /var/log --warning-threshold 80 --critical-threshold 95 --json
```

#### Network checker options

```bash
# Default hosts: 8.8.8.8, 1.1.1.1
uv run python manage.py run_check network

# Custom hosts
uv run python manage.py run_check network --hosts 8.8.8.8 1.1.1.1 github.com
uv run python manage.py run_check network --hosts google.com cloudflare.com aws.amazon.com

# Network with JSON
uv run python manage.py run_check network --hosts 8.8.8.8 google.com --json
```

#### Process checker options

```bash
# Check specific processes
uv run python manage.py run_check process --names nginx
uv run python manage.py run_check process --names nginx postgres redis
uv run python manage.py run_check process --names nginx postgres redis celery gunicorn

# Process with JSON
uv run python manage.py run_check process --names nginx postgres --json
```

#### Flag reference

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `checker` (positional) | str | required | Checker name (cpu, memory, disk, network, process, etc.) |
| `--json` | flag | — | Output as JSON |
| `--warning-threshold` | float | per-checker | Override warning threshold |
| `--critical-threshold` | float | per-checker | Override critical threshold |
| `--samples` | int | 5 | Number of CPU samples (cpu only) |
| `--sample-interval` | float | 1.0 | Seconds between CPU samples (cpu only) |
| `--per-cpu` | flag | — | Per-CPU mode, reports busiest core (cpu only) |
| `--include-swap` | flag | — | Include swap memory (memory only) |
| `--paths` | str... | `/` | Disk paths to check (disk only) |
| `--hosts` | str... | `8.8.8.8 1.1.1.1` | Hosts to ping (network only) |
| `--names` | str... | — | Process names to check (process only) |

Step 2: Verify no broken formatting

Visually inspect the rendered markdown or run a markdown linter.
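
If no markdown linter is installed, a minimal automated alternative is a fence-balance check: an odd number of ``` markers means some fenced code block was never closed.

```shell
# Rough stand-in for a markdown linter: count lines starting with ``` and
# verify the count is even (every opening fence has a closing fence)
check_fences() {
    local count
    count=$(grep -c '^[[:space:]]*```' "$1")
    [ $((count % 2)) -eq 0 ]
}

# Demo against a small balanced sample file
printf '# Title\n```bash\necho hi\n```\n' > /tmp/fence_sample.md
check_fences /tmp/fence_sample.md && echo "fences balanced"
rm -f /tmp/fence_sample.md
```

This catches only unclosed fences, not other formatting problems, so it complements rather than replaces visual inspection.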

Step 3: Commit

git add apps/checkers/README.md
git commit -m "docs: comprehensive CLI reference for check_health and run_check"

Task 5: Rewrite apps/alerts/README.md CLI reference

Files:

  • Modify: apps/alerts/README.md — replace lines 338-368 (the management command section under “Creating alerts from health checks”)

Step 1: Replace the management command subsection with comprehensive CLI reference

Replace from ### Using the management command through the code block ending at line 368 with:

### `check_and_alert` management command

Run health checkers and automatically create alerts/incidents for failures.

```bash
# Run all checks and create alerts
uv run python manage.py check_and_alert

# Run specific checkers only
uv run python manage.py check_and_alert --checkers cpu memory
uv run python manage.py check_and_alert --checkers cpu memory disk network process
```

#### Dry run

Preview what would happen without creating any alerts or incidents:

```bash
uv run python manage.py check_and_alert --dry-run

# Dry run with specific checkers
uv run python manage.py check_and_alert --checkers cpu disk --dry-run

# Dry run with JSON output
uv run python manage.py check_and_alert --dry-run --json
```

#### Incident control

```bash
# Skip automatic incident creation (create alerts only)
uv run python manage.py check_and_alert --no-incidents

# Specific checkers without incidents
uv run python manage.py check_and_alert --checkers cpu memory --no-incidents
```

#### Custom labels

Add key=value labels to all generated alerts (repeatable):

```bash
# Single label
uv run python manage.py check_and_alert --label env=production

# Multiple labels
uv run python manage.py check_and_alert --label env=production --label team=sre --label datacenter=us-east-1

# Labels + specific checkers
uv run python manage.py check_and_alert --checkers cpu memory --label env=staging --label service=api
```

#### Hostname override

Override the auto-detected hostname in alert labels:

```bash
uv run python manage.py check_and_alert --hostname web-server-01

# Hostname + labels
uv run python manage.py check_and_alert --hostname db-primary --label env=production --label role=database
```

#### JSON output

```bash
uv run python manage.py check_and_alert --json

# JSON + specific checkers
uv run python manage.py check_and_alert --checkers cpu disk --json
```

#### Threshold overrides

Override warning/critical thresholds for all checkers:

```bash
# Lower thresholds (more sensitive alerting)
uv run python manage.py check_and_alert --warning-threshold 60 --critical-threshold 80

# Higher thresholds (less noise)
uv run python manage.py check_and_alert --warning-threshold 85 --critical-threshold 98
```

#### Skip list override

```bash
# Include checkers that are normally skipped via CHECKERS_SKIP
uv run python manage.py check_and_alert --include-skipped
```

#### Combined examples

```bash
# Production cron job: all checks, JSON, custom labels
uv run python manage.py check_and_alert --json --label env=production --hostname prod-web-01

# CI smoke test: dry run, specific checkers, JSON
uv run python manage.py check_and_alert --checkers cpu memory --dry-run --json

# Sensitive alerting with full context
uv run python manage.py check_and_alert \
  --warning-threshold 50 --critical-threshold 75 \
  --label env=production --label team=oncall \
  --hostname api-server-03 --json

# Include all checkers even skipped ones, no incidents
uv run python manage.py check_and_alert --include-skipped --no-incidents --json

# Cron: every 5 minutes with labels and JSON log
*/5 * * * * cd /path/to/project && uv run python manage.py check_and_alert --json --label env=production >> /var/log/health-alerts.log 2>&1
```

#### Flag reference

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--checkers` | str... | all enabled | Specific checkers to run |
| `--json` | flag | — | Output as JSON |
| `--dry-run` | flag | — | Preview without creating alerts |
| `--no-incidents` | flag | — | Create alerts but skip incident creation |
| `--hostname` | str | auto-detected | Override hostname in alert labels |
| `--label` | KEY=VALUE | — | Add label to all alerts (repeatable) |
| `--warning-threshold` | float | per-checker | Override warning threshold |
| `--critical-threshold` | float | per-checker | Override critical threshold |
| `--include-skipped` | flag | — | Include checkers from CHECKERS_SKIP |

Step 2: Commit

git add apps/alerts/README.md
git commit -m "docs: comprehensive CLI reference for check_and_alert"

Task 6: Rewrite apps/intelligence/README.md CLI reference

Files:

  • Modify: apps/intelligence/README.md — replace lines 20-51 (the “Management Command” section)

Step 1: Replace the management command section with comprehensive CLI reference

Replace lines 20-51 with:

### Management Command: `get_recommendations`

```bash
# Default: get general recommendations
uv run python manage.py get_recommendations
```

#### Analysis modes

```bash
# Memory analysis: top processes by memory usage
uv run python manage.py get_recommendations --memory

# Disk analysis: large files, old logs, cleanup candidates
uv run python manage.py get_recommendations --disk

# Disk analysis for a specific path
uv run python manage.py get_recommendations --disk --path /var/log
uv run python manage.py get_recommendations --disk --path /home

# All analysis (memory + disk combined)
uv run python manage.py get_recommendations --all
```

#### Incident-based analysis

```bash
# Analyze a specific incident (auto-detects type from title/description)
uv run python manage.py get_recommendations --incident-id 1
uv run python manage.py get_recommendations --incident-id 42

# Incident analysis with specific provider
uv run python manage.py get_recommendations --incident-id 1 --provider local
```

#### Provider selection

```bash
# List available providers
uv run python manage.py get_recommendations --list-providers

# Use a specific provider
uv run python manage.py get_recommendations --provider local
```

#### Tuning parameters

```bash
# Show top 5 processes (default: 10)
uv run python manage.py get_recommendations --top-n 5

# Show top 20 processes
uv run python manage.py get_recommendations --memory --top-n 20

# Lower threshold for "large" files (default: 100 MB)
uv run python manage.py get_recommendations --disk --threshold-mb 50

# Higher threshold
uv run python manage.py get_recommendations --disk --threshold-mb 500

# Detect files older than 7 days (default: 30)
uv run python manage.py get_recommendations --disk --old-days 7

# Detect files older than 90 days
uv run python manage.py get_recommendations --disk --old-days 90
```

#### JSON output

```bash
uv run python manage.py get_recommendations --json
uv run python manage.py get_recommendations --memory --json
uv run python manage.py get_recommendations --all --json
```

#### Combined examples

```bash
# Full analysis with tuned parameters + JSON
uv run python manage.py get_recommendations --all \
  --top-n 15 --threshold-mb 50 --old-days 14 --json

# Disk analysis for /var/log with low threshold
uv run python manage.py get_recommendations --disk \
  --path /var/log --threshold-mb 10 --old-days 7

# Memory analysis with top 5 + JSON
uv run python manage.py get_recommendations --memory --top-n 5 --json

# Incident analysis with custom provider + JSON
uv run python manage.py get_recommendations --incident-id 1 --provider local --json
```

#### Flag reference

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--memory` | flag | — | Get memory-specific recommendations |
| `--disk` | flag | — | Get disk-specific recommendations |
| `--all` | flag | — | Get all recommendations (memory + disk) |
| `--path` | str | `/` | Path to analyze for disk recommendations |
| `--incident-id` | int | — | Analyze a specific incident by ID |
| `--provider` | str | `local` | Intelligence provider to use |
| `--list-providers` | flag | — | List available providers and exit |
| `--top-n` | int | `10` | Number of top processes to report |
| `--threshold-mb` | float | `100.0` | Minimum file size in MB for "large" |
| `--old-days` | int | `30` | Age in days for old file detection |
| `--json` | flag | — | Output as JSON |

Step 2: Commit

git add apps/intelligence/README.md
git commit -m "docs: comprehensive CLI reference for get_recommendations"

Task 7: Rewrite apps/notify/README.md CLI reference

Files:

  • Modify: apps/notify/README.md — replace lines 31-35 (management commands section) with comprehensive version

Step 1: Replace the management commands section

The current lines 31-35 are brief. Replace the ### Management commands section with:

### Management commands

#### `list_notify_drivers`

```bash
# List available notification drivers
uv run python manage.py list_notify_drivers

# Show detailed configuration requirements (required/optional fields per driver)
uv run python manage.py list_notify_drivers --verbose
```

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--verbose` | flag | — | Show required/optional config fields per driver |

#### `test_notify`

Send a test notification to verify driver or channel configuration.

```bash
# Test using first active DB channel (no driver argument needed)
uv run python manage.py test_notify

# Test a specific driver
uv run python manage.py test_notify slack
uv run python manage.py test_notify email
uv run python manage.py test_notify pagerduty
uv run python manage.py test_notify generic

# Test a named DB channel
uv run python manage.py test_notify ops-slack
```

##### Custom message

```bash
# Custom title and message
uv run python manage.py test_notify slack --title "Deploy Alert" --message "Deployment started"

# Custom severity
uv run python manage.py test_notify slack --severity critical
uv run python manage.py test_notify slack --severity warning
uv run python manage.py test_notify slack --severity info
uv run python manage.py test_notify slack --severity success

# Custom channel destination
uv run python manage.py test_notify slack --channel "#ops-alerts"
```

##### Slack driver

```bash
# Slack with webhook URL
uv run python manage.py test_notify slack --webhook-url https://hooks.slack.com/services/T.../B.../XXX

# Slack with custom message + channel
uv run python manage.py test_notify slack \
  --webhook-url https://hooks.slack.com/services/T.../B.../XXX \
  --channel "#alerts" \
  --title "Test Alert" \
  --message "Testing Slack integration" \
  --severity warning
```

##### Email driver

```bash
# Email with SMTP config
uv run python manage.py test_notify email \
  --smtp-host smtp.gmail.com \
  --from-address alerts@example.com

# Email with TLS and custom port
uv run python manage.py test_notify email \
  --smtp-host smtp.gmail.com \
  --smtp-port 587 \
  --from-address alerts@example.com \
  --use-tls

# Email with full options
uv run python manage.py test_notify email \
  --smtp-host smtp.gmail.com \
  --smtp-port 587 \
  --from-address alerts@example.com \
  --use-tls \
  --title "Disk Alert" \
  --message "Disk usage critical on server-01" \
  --severity critical
```

##### PagerDuty driver

```bash
# PagerDuty with integration key
uv run python manage.py test_notify pagerduty --integration-key your-key-here

# PagerDuty with custom severity
uv run python manage.py test_notify pagerduty \
  --integration-key your-key-here \
  --title "API Down" \
  --message "API server not responding" \
  --severity critical
```

##### Generic HTTP driver

```bash
# Generic with endpoint
uv run python manage.py test_notify generic --endpoint https://api.example.com/notify

# Generic with API key
uv run python manage.py test_notify generic \
  --endpoint https://api.example.com/notify \
  --api-key your-api-key

# Generic with full options
uv run python manage.py test_notify generic \
  --endpoint https://api.example.com/notify \
  --api-key your-api-key \
  --title "Custom Alert" \
  --message "Something happened" \
  --severity warning
```

##### JSON config (advanced)

Pass full driver config as a JSON string (for complex configurations):

```bash
uv run python manage.py test_notify slack --json-config '{"webhook_url": "https://hooks.slack.com/...", "channel": "#alerts", "username": "Bot", "icon_emoji": ":robot:"}'

uv run python manage.py test_notify email --json-config '{"smtp_host": "smtp.gmail.com", "smtp_port": 587, "from_address": "alerts@example.com", "to_addresses": ["ops@example.com"], "use_tls": true}'
```
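For context, a command that accepts both a JSON blob and individual flags typically parses the blob with `json.loads` and merges the two. A minimal sketch of that merge logic (the helper name and the flags-win precedence rule are illustrative assumptions, not the actual `test_notify` implementation):

```python
import json

def build_driver_config(json_config, **flag_overrides):
    """Merge a --json-config blob with individual CLI flags.

    Hypothetical helper: explicit flags override JSON values here;
    the real command may resolve precedence differently.
    """
    config = json.loads(json_config) if json_config else {}
    config.update({k: v for k, v in flag_overrides.items() if v is not None})
    return config

# e.g. --json-config '{"smtp_host": "a", "smtp_port": 25}' --smtp-port 587
print(build_driver_config('{"smtp_host": "a", "smtp_port": 25}', smtp_port=587))
# → {'smtp_host': 'a', 'smtp_port': 587}
```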

##### Flag reference

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `driver` (positional) | str | first active channel | Driver name or DB channel name |
| `--title` | str | `Test Alert` | Notification title |
| `--message` | str | default test message | Notification body |
| `--severity` | choice | `info` | `critical`, `warning`, `info`, or `success` |
| `--channel` | str | `default` | Destination channel/recipient |
| `--json-config` | str | — | Full driver config as JSON string |
| `--smtp-host` | str | — | SMTP host (email driver) |
| `--smtp-port` | int | `587` | SMTP port (email driver) |
| `--from-address` | str | — | Sender address (email driver) |
| `--use-tls` | flag | — | Enable TLS for SMTP (email driver) |
| `--webhook-url` | str | — | Webhook URL (slack driver) |
| `--integration-key` | str | — | Integration key (pagerduty driver) |
| `--endpoint` | str | — | API endpoint (generic driver) |
| `--api-key` | str | — | API key (generic driver) |

Step 2: Commit

```bash
git add apps/notify/README.md
git commit -m "docs: comprehensive CLI reference for list_notify_drivers and test_notify"
```

Task 8: Rewrite apps/orchestration/README.md CLI reference

Files:

  • Modify: apps/orchestration/README.md — replace lines 120-179 (Usage Examples section)

Step 1: Replace the management command sections

Replace from ## Usage Examples through ### Monitor Pipeline Command section (lines 120-179) with:

## CLI Reference

### `run_pipeline`

Execute the full pipeline (ingest → check → analyze → notify) or parts of it.

```bash
# Run with sample alert payload (quickest test)
uv run python manage.py run_pipeline --sample

# Dry run: show what would happen without executing
uv run python manage.py run_pipeline --sample --dry-run
```

#### Payload sources

```bash
# Sample payload (built-in test data)
uv run python manage.py run_pipeline --sample

# From a JSON file
uv run python manage.py run_pipeline --file alert.json
uv run python manage.py run_pipeline --file /path/to/payload.json

# Inline JSON string
uv run python manage.py run_pipeline --payload '{"name": "Test Alert", "status": "firing", "severity": "warning"}'
```

#### Source format

```bash
# Specify the alert source format
uv run python manage.py run_pipeline --sample --source alertmanager
uv run python manage.py run_pipeline --sample --source grafana
uv run python manage.py run_pipeline --sample --source pagerduty
uv run python manage.py run_pipeline --sample --source generic
uv run python manage.py run_pipeline --file alert.json --source datadog
```

#### Environment and correlation

```bash
# Set environment name
uv run python manage.py run_pipeline --sample --environment production
uv run python manage.py run_pipeline --sample --environment staging

# Set custom trace ID for correlation
uv run python manage.py run_pipeline --sample --trace-id my-trace-123

# Both
uv run python manage.py run_pipeline --sample --environment production --trace-id deploy-v2.1.0
```

#### Partial execution

```bash
# Run only the checkers stage (skip alert ingestion)
uv run python manage.py run_pipeline --sample --checks-only

# Checks only + dry run
uv run python manage.py run_pipeline --sample --checks-only --dry-run
```

#### Notification driver

```bash
# Specify which notification driver to use
uv run python manage.py run_pipeline --sample --notify-driver slack
uv run python manage.py run_pipeline --sample --notify-driver email
uv run python manage.py run_pipeline --sample --notify-driver pagerduty
uv run python manage.py run_pipeline --sample --notify-driver generic
```

#### Definition-based pipelines

```bash
# Run a pipeline definition stored in the database
uv run python manage.py run_pipeline --definition my-pipeline-name

# Run from a JSON config file
uv run python manage.py run_pipeline --config path/to/pipeline.json

# Definition + environment
uv run python manage.py run_pipeline --definition production-pipeline --environment production
```

#### JSON output

```bash
uv run python manage.py run_pipeline --sample --json
uv run python manage.py run_pipeline --file alert.json --json
```

#### Combined examples

```bash
# Full production pipeline: file payload, production env, trace ID, slack notify, JSON
uv run python manage.py run_pipeline \
  --file alert.json \
  --source grafana \
  --environment production \
  --trace-id incident-2024-001 \
  --notify-driver slack \
  --json

# Quick smoke test: sample, dry run, JSON
uv run python manage.py run_pipeline --sample --dry-run --json

# Checks-only with custom source and trace
uv run python manage.py run_pipeline --sample --checks-only --source alertmanager --trace-id diag-run-1

# Definition pipeline with all options
uv run python manage.py run_pipeline \
  --definition my-pipeline \
  --environment staging \
  --trace-id test-run-42 \
  --notify-driver email \
  --json
```

#### Flag reference

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--sample` | flag | — | Use built-in sample alert payload |
| `--payload` | str | — | Inline JSON payload string |
| `--file` | str | — | Path to JSON payload file |
| `--source` | str | `cli` | Alert source format |
| `--environment` | str | `development` | Environment name |
| `--trace-id` | str | auto-generated | Custom trace ID for correlation |
| `--checks-only` | flag | — | Run only checkers stage |
| `--dry-run` | flag | — | Preview without executing |
| `--notify-driver` | str | `generic` | Notification driver to use |
| `--json` | flag | — | Output as JSON |
| `--definition` | str | — | Pipeline definition name (from DB) |
| `--config` | str | — | Path to pipeline definition JSON file |

---

### `monitor_pipeline`

View and monitor pipeline run history.

```bash
# List recent pipeline runs (default: last 10)
uv run python manage.py monitor_pipeline

# Show more runs
uv run python manage.py monitor_pipeline --limit 25
uv run python manage.py monitor_pipeline --limit 50
uv run python manage.py monitor_pipeline --limit 100
```

#### Filter by status

```bash
# Show only failed runs
uv run python manage.py monitor_pipeline --status failed

# Show only completed runs
uv run python manage.py monitor_pipeline --status notified

# Other statuses
uv run python manage.py monitor_pipeline --status pending
uv run python manage.py monitor_pipeline --status ingested
uv run python manage.py monitor_pipeline --status checked
uv run python manage.py monitor_pipeline --status analyzed
uv run python manage.py monitor_pipeline --status retrying
uv run python manage.py monitor_pipeline --status skipped
```

#### Inspect a specific run

```bash
# Get full details for a pipeline run by run_id
uv run python manage.py monitor_pipeline --run-id abc123
uv run python manage.py monitor_pipeline --run-id 550e8400-e29b-41d4-a716-446655440000
```

#### Combined examples

```bash
# Last 50 failed runs
uv run python manage.py monitor_pipeline --status failed --limit 50

# Last 20 completed runs
uv run python manage.py monitor_pipeline --status notified --limit 20
```

#### Flag reference

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| `--limit` | int | `10` | Number of pipeline runs to show |
| `--status` | str | all | Filter by status (pending, ingested, checked, analyzed, notified, failed, retrying, skipped) |
| `--run-id` | str | — | Show details for a specific pipeline run |

Step 2: Commit

```bash
git add apps/orchestration/README.md
git commit -m "docs: comprehensive CLI reference for run_pipeline and monitor_pipeline"
```

Task 9: Rewrite bin/README.md with alias reference and command quick-reference

Files:

  • Modify: bin/README.md

Step 1: Rewrite bin/README.md

Replace the entire file with:

# Shell Scripts & CLI

This directory contains shell scripts for installation, automation, and interactive usage.

[toc]

## Quick Command Reference

All management commands and their shell aliases (set up via `setup_aliases.sh`):

| Alias (default `sm-` prefix) | Management Command | App | Description |
|------|------|-----|-------------|
| `sm-check-health` | `check_health` | checkers | Run health checks (CPU, memory, disk, network, process) |
| `sm-run-check` | `run_check` | checkers | Run a single checker with checker-specific options |
| `sm-check-and-alert` | `check_and_alert` | alerts | Run checks and create alerts/incidents |
| `sm-get-recommendations` | `get_recommendations` | intelligence | Get AI-powered system recommendations |
| `sm-run-pipeline` | `run_pipeline` | orchestration | Execute the full pipeline |
| `sm-monitor-pipeline` | `monitor_pipeline` | orchestration | Monitor pipeline run history |
| `sm-test-notify` | `test_notify` | notify | Test notification delivery |
| `sm-list-notify-drivers` | `list_notify_drivers` | notify | List available notification drivers |
| `sm-cli` | — | — | Interactive CLI menu |

Aliases pass all flags through; for example, `sm-check-health --json` is equivalent to `uv run python manage.py check_health --json`.

For full flag reference per command, see the app READMEs:
- [`apps/checkers/README.md`](../apps/checkers/README.md): `check_health` (10 flags), `run_check` (11 flags)
- [`apps/alerts/README.md`](../apps/alerts/README.md): `check_and_alert` (9 flags)
- [`apps/intelligence/README.md`](../apps/intelligence/README.md): `get_recommendations` (11 flags)
- [`apps/notify/README.md`](../apps/notify/README.md): `list_notify_drivers` (1 flag), `test_notify` (14 flags)
- [`apps/orchestration/README.md`](../apps/orchestration/README.md): `run_pipeline` (12 flags), `monitor_pipeline` (3 flags)

---

## Scripts

### `setup_aliases.sh` — Shell Alias Setup

Set up shell aliases so you can run `sm-check-health` instead of `uv run python manage.py check_health`.

```bash
# Interactive setup (prompts for prefix, default: sm)
./bin/setup_aliases.sh

# Custom prefix
./bin/setup_aliases.sh --prefix maint
# Creates: maint-check-health, maint-run-check, etc.

# Show current aliases
./bin/setup_aliases.sh --list

# Remove aliases and source line from shell profile
./bin/setup_aliases.sh --remove
```

**What it does:**
- Generates `bin/aliases.sh` (gitignored) with aliases locked to the project path
- Adds a `source` line to `~/.zshrc` or `~/.bashrc`
- `--remove` undoes both

**After setup, activate immediately:**
```bash
source ~/.zshrc   # or source ~/.bashrc
```
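The generated file contains one alias per management command, hard-coded to the project path. A minimal sketch of the generation loop (the prefix, path, and `uv run --project` form are illustrative assumptions, not the script's exact output):

```bash
# Sketch of the alias-generation loop; see setup_aliases.sh for the real
# logic. PREFIX and PROJECT_DIR come from user input during setup.
PREFIX="sm"
PROJECT_DIR="/opt/server-maintanence"
COMMANDS="check_health run_check check_and_alert get_recommendations \
run_pipeline monitor_pipeline test_notify list_notify_drivers"

OUT="$(mktemp)"
for cmd in $COMMANDS; do
    # check_health -> sm-check-health; flags typed after the alias pass through
    echo "alias ${PREFIX}-${cmd//_/-}='uv run --project \"${PROJECT_DIR}\" python \"${PROJECT_DIR}/manage.py\" ${cmd}'" >> "$OUT"
done
grep -c '^alias ' "$OUT"   # prints the alias count (8 commands)
```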

---

### `cli.sh` — Interactive CLI

An interactive menu-driven interface for all management commands. Recommended for new users and manual operations.

```bash
# Start interactive mode
./bin/cli.sh

# Direct shortcuts
./bin/cli.sh install    # Jump to installation menu
./bin/cli.sh health     # Jump to health monitoring
./bin/cli.sh alerts     # Jump to alerts menu
./bin/cli.sh intel      # Jump to intelligence menu
./bin/cli.sh pipeline   # Jump to pipeline menu
./bin/cli.sh notify     # Jump to notifications menu
./bin/cli.sh help       # Show all options
```

**Features:**
- Color-coded output
- Shows available flags and options for each command
- Confirms before running commands
- Installation status check
- Shell alias setup option

---

### `install.sh` — Project Installer

Full installation script for setting up the project.

```bash
./bin/install.sh
```

**What it does:**
- Verifies Python 3.10+ (tries python3.13 → python3.10 → python3)
- Installs `uv` if missing
- Creates `.env` from `.env.sample`
- Prompts for dev/production configuration
- Installs dependencies with `uv sync`
- Runs Django migrations
- Optionally runs health checks
- Optionally sets up cron

See [`docs/Installation.md`](../docs/Installation.md) for full details.
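The interpreter fallback described above can be sketched as follows (the candidate order is taken from the bullet; the real script may differ in detail):

```bash
# Try known-good interpreters newest-first, fall back to generic python3.
PYTHON_BIN=""
for candidate in python3.13 python3.10 python3; do
    if command -v "$candidate" >/dev/null 2>&1; then
        PYTHON_BIN="$candidate"
        break
    fi
done
if [ -z "$PYTHON_BIN" ]; then
    echo "No suitable Python found (need 3.10+)" >&2
    exit 1
fi
echo "Using interpreter: $PYTHON_BIN"
```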

---

### `setup_cron.sh` — Cron Setup

Sets up scheduled health checks via cron.

```bash
./bin/setup_cron.sh
```

**What it does:**
- Detects project directory
- Lets you choose a schedule (5 min / 15 min / hourly / custom)
- Writes crontab entry for `check_and_alert --json`
- Logs to `cron.log` in project root

**Useful commands after setup:**
```bash
crontab -l           # View cron entries
tail -f ./cron.log   # Follow cron output
```
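For the 5-minute schedule, the resulting crontab entry looks roughly like this (the path is illustrative; the script fills in the detected project directory):

```
*/5 * * * * cd /path/to/server-maintanence && uv run python manage.py check_and_alert --json >> ./cron.log 2>&1
```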

---

## Permissions

If you get "permission denied", make scripts executable:

```bash
chmod +x ./bin/*.sh
```

Step 2: Commit

```bash
git add bin/README.md
git commit -m "docs: rewrite bin/README.md with alias reference and command quick-reference"
```

Task 10: Update root README.md with aliases in quickstart

Files:

  • Modify: README.md

Step 1: Add aliases to quickstart section

After step 3 in the Quickstart section (after the `./bin/cli.sh` block, around line 133), add:

4) (Optional) Set up shell aliases for quick command access:

```bash
./bin/setup_aliases.sh
```

After setup, use aliases like `sm-check-health`, `sm-run-check`, etc. See [`bin/README.md`](bin/README.md) for the full alias list.

Step 2: Commit

```bash
git add README.md
git commit -m "docs: add alias setup to quickstart"
```

Task 11: Full verification

Step 1: Run tests

```bash
uv run pytest -v
```

Step 2: Django checks

```bash
uv run python manage.py check
```

Step 3: Linting

```bash
uv run black --check .
uv run ruff check .
```

Step 4: Verify the alias script works

```bash
./bin/setup_aliases.sh --prefix test-sm
./bin/setup_aliases.sh --list
./bin/setup_aliases.sh --remove
```

Step 5: Grep for stale references

```bash
grep -r "orchestration-pipelines" . --include="*.md" | grep -v docs/plans
```

Files Summary

| File | Change |
|------|--------|
| `bin/setup_aliases.sh` | Create — interactive alias setup script |
| `.gitignore` | Add `bin/aliases.sh` |
| `apps/checkers/checks.py` | Add `@register("aliases")` system check |
| `apps/checkers/_tests/test_checks_aliases.py` | Create — 4 tests for aliases check |
| `bin/cli.sh` | Add alias hint + install menu option |
| `apps/checkers/README.md` | Rewrite CLI reference (`check_health` + `run_check`) |
| `apps/alerts/README.md` | Rewrite CLI reference (`check_and_alert`) |
| `apps/intelligence/README.md` | Rewrite CLI reference (`get_recommendations`) |
| `apps/notify/README.md` | Rewrite CLI reference (`list_notify_drivers` + `test_notify`) |
| `apps/orchestration/README.md` | Rewrite CLI reference (`run_pipeline` + `monitor_pipeline`) |
| `bin/README.md` | Rewrite with alias table + command quick-reference |
| `README.md` | Add alias setup to quickstart |
