# Test Coverage Assessment

**Date:** 2026-04-02
**Scope:** All packages in `cmd/`, `internal/`, and `config/`
**Test files reviewed:** 32 across 12 packages

---

## Overall: ~60% unit / ~20% integration

The pattern is consistent: **pure helper functions are well-tested; the actual command execution paths are mostly not**.

---

## Per-Package Breakdown

| Package | Est. Coverage | Notes |
|---|---|---|
| `internal/logger/` | ~85% | Comprehensive public API coverage |
| `internal/prompts/` | ~85% | All load paths + fallback covered |
| `internal/errors/` | ~80% | Error types + unwrapping covered |
| `internal/todo/` | ~80% | Bootstrap idempotency solid |
| `internal/linter/` | ~75% | Language detection strong; output parsing weak |
| `config/` | ~75% | Config getters good; file-load errors not tested |
| `internal/workon/` | ~70% | State transitions good; `Run()` itself untested |
| `internal/git/` | ~70% | Real git integration, good core coverage |
| `internal/grok/` | ~60% | Streaming + SSE parsing tested; error paths absent |
| `internal/version/` | ~50% | Just variable presence check |
| `internal/recipe/` | ~50% | Loading tested; execution engine (`Run()`, `refactorFiles()`, `handleApplyStep()`) entirely untested |
| `cmd/` | ~25–35% | Message builders tested; nearly all `run*()` functions untested |

---

## What's Systematically Untested

**Command execution:** Most `run*()` functions in `cmd/` have zero direct tests:
- `runAgent()`, `runChat()`, `runQuery()`, `runRecipe()`, `runTestgen()`, `runWorkon()`, `runChangelogCommand()`, `runDocs()`

**API error scenarios:** The grok client has no tests for HTTP errors, rate limits, auth failures, malformed SSE chunks, timeouts, or stream cancellation.

**Recipe execution:** `internal/recipe/` tests cover only YAML loading. The entire execution side (`Run()`, `refactorFiles()`, `handleApplyStep()`, `executeReadOnlyShell()`) is untested.

**User interaction:** All interactive confirmation and prompt flows are untested. No test supplies a mock stdin.

**File system errors:** Read-only paths, permission failures, and disk-full conditions are all untested.

---

## Highest-Impact Gaps

1. `internal/recipe/`: the execution engine is production code with zero test coverage
2. `internal/grok/`: error-path coverage is missing entirely for a network client
3. `cmd/agent.go`, `cmd/chat.go`, `cmd/query.go`: core UX features without tests
4. `config/`: file parsing errors and env var overrides are not tested

---

## Strengths

- Good use of `t.TempDir()` for isolation
- Real git integration in `internal/git/` tests (not mocked)
- Mock injection via function variables (git runner, API client) makes testing feasible
- `t.Parallel()` used consistently where appropriate
- Error type chain verification (`errors.Is`) is present

## Weaknesses

- The "Live" test pattern (`TestScaffoldCmd_Live`) uses `testing.Short()` inconsistently: the test is gated, but with semantics inverted from the convention that `-short` skips slow or environment-dependent tests
- No benchmark tests anywhere
- No CI separation between unit and integration tests, so "live" tests quietly depend on the environment
- Mocking strategy is inconsistent across packages (some use testify mocks, some manual function variables)

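As a point of comparison for the first item above, the conventional gating skips a live test when `-short` is set, and an explicit opt-in makes the environment dependency visible. This is a sketch under assumptions: the `LIVE_TESTS` variable and the `liveSkipReason` helper are hypothetical, and the body of the real `TestScaffoldCmd_Live` is not reproduced here.

```go
package main

import (
	"os"
	"testing"
)

// liveSkipReason reports why a live test should be skipped, or "" if it may run.
// LIVE_TESTS is a hypothetical opt-in variable, not one the project defines.
func liveSkipReason(short bool, optIn string) string {
	switch {
	case short:
		return "live test skipped in -short mode"
	case optIn != "1":
		return "live test skipped: set LIVE_TESTS=1 to enable"
	}
	return ""
}

func TestScaffoldCmd_Live(t *testing.T) {
	if reason := liveSkipReason(testing.Short(), os.Getenv("LIVE_TESTS")); reason != "" {
		t.Skip(reason)
	}
	// ... exercise the real command against the live environment here ...
}
```
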
---

## Priority Recommendations

**High:** Add tests for `internal/recipe/` execution, grok client error scenarios, and `runAgent()`/`runChat()`/`runQuery()`.

**Medium:** Test `config/` file-load failure paths, git error scenarios (uninitialized repo), and file permission errors.

**Low:** Standardize the Live test pattern, add benchmarks for SSE streaming, and add concurrent operation tests.
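A benchmark for the SSE path could start as small as the following. `parseSSEData` is a hypothetical stand-in for whatever line parser the grok client actually uses, so treat this as a template rather than a measurement of the real code.

```go
package main

import (
	"strings"
	"testing"
)

// parseSSEData is a hypothetical stand-in for the client's SSE line parser.
// It returns the payload of a "data:" line and whether the line was one.
func parseSSEData(line string) (string, bool) {
	if !strings.HasPrefix(line, "data:") {
		return "", false
	}
	payload := strings.TrimPrefix(line, "data:")
	// A single leading space after the colon is not part of the payload.
	return strings.TrimPrefix(payload, " "), true
}

func BenchmarkParseSSEData(b *testing.B) {
	line := "data: " + strings.Repeat("x", 256)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, ok := parseSSEData(line); !ok {
			b.Fatal("line was not recognised as a data line")
		}
	}
}
```

Saved in a `_test.go` file, it can be run with `go test -bench=. -benchmem`.
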