Disk Checkers: Global Sort Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Make the disk checkers’ space_hogs / old_files / large_files lists be sorted descending by size_mb globally across scan targets, not just within each target.
Architecture: Add one .sort(key=lambda x: x["size_mb"], reverse=True) call after each of the seven per-target collection loops in disk_common.py / disk_macos.py / disk_linux.py. No new modules, no helper extraction, no metrics-shape changes. Five new tests cover the multi-target sort invariant.
Tech Stack: Python 3, Django management commands, unittest.TestCase via django.test.TestCase, pytest runner, coverage.
Design doc: docs/plans/2026-05-07-disk-global-sort-design.md
Branch: fix/disk-global-sort (already created from main with the design doc committed).
Single commit planned. TDD red-green: write 5 failing tests, add 7 sort lines, all green.
Independent of PR #132 and PR #134 — this branch does not touch check_health.py, run_check.py, or _metrics_format.py. Different files.
Background — what is changing
Each disk checker collects per-target scan results into a single list:
space_hogs = []
seen = set()
for target in scan_targets:
for item in scan_directory(target, timeout=self.timeout):
if item["path"] not in seen:
seen.add(item["path"])
space_hogs.append(item)
# (issue #133: needs space_hogs.sort(...) here)
Each per-target call returns sorted-desc results, but the parent loop just appends — so a 500 MB item from a later target lands after small items from an earlier one. The fix: one .sort() after each loop.
The 7 sort sites:
| File | Section | Loop ends after current line | Targets |
|---|---|---|---|
disk_common.py | space_hogs | space_hogs.append(item) | 2 (/var/log, ~/.cache) |
disk_common.py | old_files | old_files.append(item) | 2 (/tmp, /var/tmp) |
disk_common.py | large_files | large_files.append(item) | 1 (single-target today; sort is no-op for uniformity) |
disk_macos.py | space_hogs | space_hogs.append(item) | 4 |
disk_macos.py | old_files | old_files.append(item) | 1 (single-target today) |
disk_linux.py | space_hogs | space_hogs.append(item) | 4 |
disk_linux.py | old_files | old_files.append(item) | 1 (single-target today) |
Task 1: Add multi-target sort tests to test_disk_common.py
Three tests covering all three lists. Place them at the bottom of DiskCommonCheckerTests.
Files:
- Modify:
apps/checkers/_tests/checkers/test_disk_common.py
Step 1: Add the three tests
Append to DiskCommonCheckerTests:
@patch("apps.checkers.checkers.disk_common.os.path.expanduser")
@patch("apps.checkers.checkers.disk_common.scan_directory")
@patch("apps.checkers.checkers.disk_common.find_old_files")
@patch("apps.checkers.checkers.disk_common.find_large_files")
def test_space_hogs_globally_sorted_across_scan_targets(
self, mock_large, mock_old, mock_scan, mock_expanduser
):
"""space_hogs is sorted desc by size_mb across /var/log and ~/.cache."""
mock_expanduser.side_effect = lambda p: p.replace("~", "/home/testuser")
def fake_scan(path, timeout=None):
if path == "/var/log":
return [
{"path": "/var/log/a", "size_mb": 5.0},
{"path": "/var/log/b", "size_mb": 3.0},
]
if path == "/home/testuser/.cache":
return [{"path": "/home/testuser/.cache/big", "size_mb": 500.0}]
return []
mock_scan.side_effect = fake_scan
mock_old.return_value = []
mock_large.return_value = []
checker = self._get_checker_class()()
result = checker.check()
sizes = [item["size_mb"] for item in result.metrics["space_hogs"]]
self.assertEqual(sizes, sorted(sizes, reverse=True))
self.assertEqual(sizes[0], 500.0)
@patch("apps.checkers.checkers.disk_common.os.path.expanduser")
@patch("apps.checkers.checkers.disk_common.scan_directory")
@patch("apps.checkers.checkers.disk_common.find_old_files")
@patch("apps.checkers.checkers.disk_common.find_large_files")
def test_old_files_globally_sorted_across_targets(
self, mock_large, mock_old, mock_scan, mock_expanduser
):
"""old_files is sorted desc by size_mb across /tmp and /var/tmp."""
mock_expanduser.side_effect = lambda p: p.replace("~", "/home/testuser")
mock_scan.return_value = []
def fake_old(path, max_age_days=7, timeout=None):
if path == "/tmp":
return [{"path": "/tmp/small", "size_mb": 2.0, "age_days": 10}]
if path == "/var/tmp":
return [{"path": "/var/tmp/big", "size_mb": 200.0, "age_days": 15}]
return []
mock_old.side_effect = fake_old
mock_large.return_value = []
checker = self._get_checker_class()()
result = checker.check()
sizes = [item["size_mb"] for item in result.metrics["old_files"]]
self.assertEqual(sizes, sorted(sizes, reverse=True))
self.assertEqual(sizes[0], 200.0)
@patch("apps.checkers.checkers.disk_common.os.path.expanduser")
@patch("apps.checkers.checkers.disk_common.scan_directory")
@patch("apps.checkers.checkers.disk_common.find_old_files")
@patch("apps.checkers.checkers.disk_common.find_large_files")
def test_large_files_sorted_descending(
self, mock_large, mock_old, mock_scan, mock_expanduser
):
"""large_files is sorted desc — single-target today, locks invariant for the future."""
mock_expanduser.side_effect = lambda p: p.replace("~", "/home/testuser")
mock_scan.return_value = []
mock_old.return_value = []
# Return UNSORTED data to verify the checker sorts it (real find_large_files
# would already sort, but this test pins the .sort() call regardless).
mock_large.return_value = [
{"path": "/home/testuser/small.iso", "size_mb": 150.0},
{"path": "/home/testuser/big.iso", "size_mb": 800.0},
{"path": "/home/testuser/medium.iso", "size_mb": 400.0},
]
checker = self._get_checker_class()()
result = checker.check()
sizes = [item["size_mb"] for item in result.metrics["large_files"]]
self.assertEqual(sizes, sorted(sizes, reverse=True))
Step 2: Run the new tests — expect failure
Run: uv run pytest apps/checkers/_tests/checkers/test_disk_common.py -v -k "globally_sorted or sorted_descending"
Expected: 3 FAIL.
test_space_hogs_globally_sorted_across_scan_targetsfails because the 500 MB item from~/.cachelands after the 5 MB and 3 MB items from/var/log.test_old_files_globally_sorted_across_targetsfails for the same reason (200 MB after 2 MB).test_large_files_sorted_descendingfails because the mock returns unsorted data and the current code doesn’t re-sort.
If any of the three unexpectedly passes, STOP and report — that means the assertion is too weak or the mock isn’t being hit.
Task 2: Add multi-target sort test to test_disk_macos.py
Files:
- Modify:
apps/checkers/_tests/checkers/test_disk_macos.py
Step 1: Add the test
Append to DiskMacOSCheckerTests:
@patch("apps.checkers.checkers.disk_macos.sys")
@patch("apps.checkers.checkers.disk_macos.os.path.expanduser")
@patch("apps.checkers.checkers.disk_macos.scan_directory")
@patch("apps.checkers.checkers.disk_macos.find_old_files")
def test_space_hogs_globally_sorted_across_scan_targets(
self, mock_old, mock_scan, mock_expanduser, mock_sys
):
"""space_hogs is sorted desc across the four macOS scan targets."""
mock_sys.platform = "darwin"
mock_expanduser.side_effect = lambda p: p.replace("~", "/Users/testuser")
def fake_scan(path, timeout=None):
# First (~/Library/Caches) returns a small item;
# second (/Library/Caches) returns the largest;
# third (~/Library/Logs) returns a medium item.
if path == "/Users/testuser/Library/Caches":
return [{"path": f"{path}/small", "size_mb": 5.0}]
if path == "/Library/Caches":
return [{"path": f"{path}/big", "size_mb": 800.0}]
if path == "/Users/testuser/Library/Logs":
return [{"path": f"{path}/medium", "size_mb": 100.0}]
return []
mock_scan.side_effect = fake_scan
mock_old.return_value = []
checker = self._get_checker_class()()
result = checker.check()
sizes = [item["size_mb"] for item in result.metrics["space_hogs"]]
self.assertEqual(sizes, sorted(sizes, reverse=True))
self.assertEqual(sizes[0], 800.0)
Step 2: Run — expect failure
Run: uv run pytest apps/checkers/_tests/checkers/test_disk_macos.py -v -k "globally_sorted"
Expected: FAIL — the 800 MB item lands in position 2 (after the 5 MB item from the first target).
Task 3: Add multi-target sort test to test_disk_linux.py
Files:
- Modify:
apps/checkers/_tests/checkers/test_disk_linux.py
Step 1: Add the test
Append to DiskLinuxCheckerTests:
@patch("apps.checkers.checkers.disk_linux.sys")
@patch("apps.checkers.checkers.disk_linux.scan_directory")
@patch("apps.checkers.checkers.disk_linux.find_old_files")
def test_space_hogs_globally_sorted_across_scan_targets(
self, mock_old, mock_scan, mock_sys
):
"""space_hogs is sorted desc across the four Linux scan targets."""
mock_sys.platform = "linux"
def fake_scan(path, timeout=None):
# apt cache small; journal largest; docker medium.
if path == "/var/cache/apt/archives":
return [{"path": f"{path}/small", "size_mb": 50.0}]
if path == "/var/log/journal":
return [{"path": f"{path}/big", "size_mb": 900.0}]
if path == "/var/lib/docker":
return [{"path": f"{path}/medium", "size_mb": 300.0}]
return []
mock_scan.side_effect = fake_scan
mock_old.return_value = []
checker = self._get_checker_class()()
result = checker.check()
sizes = [item["size_mb"] for item in result.metrics["space_hogs"]]
self.assertEqual(sizes, sorted(sizes, reverse=True))
self.assertEqual(sizes[0], 900.0)
Step 2: Run — expect failure
Run: uv run pytest apps/checkers/_tests/checkers/test_disk_linux.py -v -k "globally_sorted"
Expected: FAIL — the 900 MB item lands in position 2 (after the 50 MB apt-cache item).
Task 4: Add the seven sort lines
Add .sort(key=lambda x: x["size_mb"], reverse=True) after each of the seven per-target collection loops.
Files:
- Modify:
apps/checkers/checkers/disk_common.py - Modify:
apps/checkers/checkers/disk_macos.py - Modify:
apps/checkers/checkers/disk_linux.py
Step 1: disk_common.py
Find the for target in scan_targets: block (currently lines 36-41). Immediately after the inner loop ends — before the comment # Build set of already-scanned paths to exclude from large file walk — add:
space_hogs.sort(key=lambda x: x["size_mb"], reverse=True)
Find the for target in old_file_targets: block (currently lines 44-49). Immediately after the inner loop ends — before the next blank line and the comment about exclude_paths — add:
old_files.sort(key=lambda x: x["size_mb"], reverse=True)
Find the for target in large_file_targets: block (currently lines 57-64). Immediately after the inner loop ends — before the total = (...) line — add:
large_files.sort(key=lambda x: x["size_mb"], reverse=True)
Match the existing 12-space indent of the surrounding code (these blocks are inside try: inside def check():).
Step 2: disk_macos.py
Find the for target in scan_targets: block (currently lines 39-44). Immediately after the inner loop ends — before the old_files = [] line — add:
space_hogs.sort(key=lambda x: x["size_mb"], reverse=True)
Find the for target in old_file_targets: block (currently lines 47-52). Immediately after the inner loop ends — before the total = ... line — add:
old_files.sort(key=lambda x: x["size_mb"], reverse=True)
Step 3: disk_linux.py
Find the for target in scan_targets: block (currently lines 38-42). Immediately after the inner loop ends — before the old_files = [] line — add:
space_hogs.sort(key=lambda x: x["size_mb"], reverse=True)
Find the for target in old_file_targets: block (currently lines 45-49). Immediately after the inner loop ends — before the total = ... line — add:
old_files.sort(key=lambda x: x["size_mb"], reverse=True)
Task 5: Verify tests pass
Step 1: Run the five new tests
Run: uv run pytest apps/checkers/_tests/checkers/test_disk_common.py apps/checkers/_tests/checkers/test_disk_macos.py apps/checkers/_tests/checkers/test_disk_linux.py -v -k "globally_sorted or sorted_descending"
Expected: 5 PASS.
If any fails, the sort line is missing, in the wrong place, or has a typo. Use git diff to inspect.
Step 2: Run the full disk checker test suites
Run: uv run pytest apps/checkers/_tests/checkers/test_disk_common.py apps/checkers/_tests/checkers/test_disk_macos.py apps/checkers/_tests/checkers/test_disk_linux.py -v
Expected: all PASS. Existing tests should be unaffected — the sort is unconditional and order-preserving for already-sorted input.
If any unrelated test fails: it likely asserts on a specific list order that depended on the old per-target concatenation. Read the failure carefully; if the assertion is genuinely about per-target order rather than global sort, that’s a real conflict — STOP and report. If the assertion is just brittle (e.g., asserts position 0 is a specific item that happens to land there because of the old order but isn’t actually guaranteed), update it to match the new global-sorted order.
Step 3: Run the full checkers suite
Run: uv run pytest apps/checkers/ -v
Expected: all PASS.
Task 6: Coverage
CLAUDE.md requires 100% branch coverage on every PR.
Step 1: Run
uv run coverage run --branch -m pytest apps/checkers/_tests/checkers/test_disk_common.py apps/checkers/_tests/checkers/test_disk_macos.py apps/checkers/_tests/checkers/test_disk_linux.py
uv run coverage report -m --include='apps/checkers/checkers/disk_common.py,apps/checkers/checkers/disk_macos.py,apps/checkers/checkers/disk_linux.py'
Expected: 100% on all three files. The .sort() lines are unconditional, so any test that runs check() exercises them.
If less than 100%, identify the uncovered lines (likely something pre-existing — investigate, don’t add unrelated tests).
Task 7: Lint, format, type-check
uv run black --check apps/checkers/checkers/disk_common.py apps/checkers/checkers/disk_macos.py apps/checkers/checkers/disk_linux.py apps/checkers/_tests/checkers/test_disk_common.py apps/checkers/_tests/checkers/test_disk_macos.py apps/checkers/_tests/checkers/test_disk_linux.py
uv run ruff check apps/checkers/checkers/disk_common.py apps/checkers/checkers/disk_macos.py apps/checkers/checkers/disk_linux.py apps/checkers/_tests/checkers/test_disk_common.py apps/checkers/_tests/checkers/test_disk_macos.py apps/checkers/_tests/checkers/test_disk_linux.py
uv run mypy apps/checkers/checkers/disk_common.py apps/checkers/checkers/disk_macos.py apps/checkers/checkers/disk_linux.py
Expected: all clean.
Task 8: Commit
git add apps/checkers/checkers/disk_common.py apps/checkers/checkers/disk_macos.py apps/checkers/checkers/disk_linux.py apps/checkers/_tests/checkers/test_disk_common.py apps/checkers/_tests/checkers/test_disk_macos.py apps/checkers/_tests/checkers/test_disk_linux.py
git commit -m "$(cat <<'EOF'
fix(checkers): globally sort disk checker lists across scan targets
Each disk checker collected per-target scan results into a single list
without re-sorting globally. scan_directory / find_old_files /
find_large_files each return sorted-desc results, but appending across
multiple targets left the combined list sorted *within* targets and
not *across* them — so the displayed "top 10" could show small items
above a large one from a later target.
Add one .sort() after each of the seven per-target collection loops
in disk_common, disk_macos, and disk_linux. Five focused tests cover
the multi-target sort invariant.
Closes #133.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EOF
)"
Pre-commit hooks should pass. After commit, git status clean. git log --oneline main..HEAD should show 2 commits (design doc + this fix).
Task 9: Push and open PR
git push -u origin fix/disk-global-sort
gh pr create --title "fix(checkers): globally sort disk checker lists across scan targets" --body "$(cat <<'EOF'
## Summary
- Adds one \`.sort(key=lambda x: x["size_mb"], reverse=True)\` after each of the seven per-target collection loops in \`disk_common.py\`, \`disk_macos.py\`, and \`disk_linux.py\`
- Disk checker output now shows the actually-largest items first across scan targets, not just within each target
- 5 focused tests pin the multi-target sort invariant; 7 unconditional sort lines added; no metrics-shape change
Design doc: \`docs/plans/2026-05-07-disk-global-sort-design.md\`
Closes #133.
## Why
Before this PR, a 596 MB item from \`~/.cache\` could land in position 8 of \`disk_common\`'s "top 10" \`space_hogs\` list, below items in the 1–22 MB range that came from an earlier scan target (\`/var/log\`). Reconciliation math (added in #132) still works because totals are order-independent — but the truncated "top 10" view doesn't actually show the biggest items, defeating its purpose.
## Test plan
- [x] \`uv run pytest apps/checkers/\` — full suite green; 5 new tests cover space_hogs/old_files/large_files sort across multi-target loops in all three checkers
- [x] \`uv run coverage report\` — 100% on disk_common.py, disk_macos.py, disk_linux.py
- [x] black / ruff / mypy clean
## Out of scope
- Extracting a shared helper for the duplicated loop pattern (the sort/dedup/append shape appears 7 times). Tempting, but a separate refactor — issue #133's suggested fix is literally "add \`.sort()\`", and this PR keeps that contract.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
Return the PR URL.
Notes for the implementer
- Just inline the sort. Do not extract a helper, even if the duplication is annoying.
- Match existing indentation. The collection loops sit inside
try:insidedef check():— that’s 12 spaces of indent on thespace_hogs.sort(...)line. large_filesindisk_commonis single-target today. The sort is added for uniformity (per Approach A in the design) and is a no-op on already-sorted input. The testtest_large_files_sorted_descendingpasses UNSORTED data to the mock, which forces the test to depend on the new sort line — without it, the test fails. Don’t worry that it “shouldn’t” be needed; it locks the invariant for the future.- Don’t change
disk_utils.py. The per-target scanners already sort correctly; the bug is only at the aggregation layer. - Don’t introduce a tie-breaker. Python’s sort is stable; equal
size_mbvalues keep insertion order. The user previously rejected explicit tie-breaking (“just sort however you sort for usual cases” — PR #132 review). - Branch is independent of #132 and #134. This PR can merge in any order relative to those.