Test Coverage Assessment
Date: 2026-04-02
Scope: All packages in cmd/, internal/, and config/
Test files reviewed: 32 across 12 packages
Overall: ~60% unit / ~20% integration
The pattern is consistent: pure helper functions are well-tested; the actual command execution paths are mostly not.
Per-Package Breakdown
| Package | Est. Coverage | Notes |
|---|---|---|
| `internal/logger/` | ~85% | Comprehensive public API coverage |
| `internal/prompts/` | ~85% | All load paths + fallback covered |
| `internal/errors/` | ~80% | Error types + unwrapping covered |
| `internal/todo/` | ~80% | Bootstrap idempotency solid |
| `internal/linter/` | ~75% | Language detection strong; output parsing weak |
| `config/` | ~75% | Config getters good; file-load errors not tested |
| `internal/workon/` | ~70% | State transitions good; `Run()` itself untested |
| `internal/git/` | ~70% | Real git integration, good core coverage |
| `internal/grok/` | ~60% | Streaming + SSE parsing tested; error paths absent |
| `internal/version/` | ~50% | Just variable presence check |
| `internal/recipe/` | ~50% | Loading tested; execution engine (`Run()`, `refactorFiles()`, `handleApplyStep()`) entirely untested |
| `cmd/` | ~25–35% | Message builders tested; nearly all `run*()` functions untested |
What's Systematically Untested
Command execution — Most run*() functions in cmd/ have zero direct tests:
`runAgent()`, `runChat()`, `runQuery()`, `runRecipe()`, `runTestgen()`, `runWorkon()`, `runChangelogCommand()`, `runDocs()`
API error scenarios — The grok client has no tests for: HTTP errors, rate limits, auth failures, malformed SSE chunks, timeouts, or stream cancellation.
Recipe execution — internal/recipe/ tests only YAML loading. The entire execution side (Run, refactorFiles, handleApplyStep, executeReadOnlyShell) is untouched.
User interaction — All interactive confirmation/prompt flows are untested. No mock stdin anywhere.
File system errors — Read-only paths, permission failures, disk full — none tested.
Highest-Impact Gaps
- `internal/recipe/`: execution engine is production code with zero test coverage
- `internal/grok/`: error path coverage missing entirely for a network client
- `cmd/agent.go`, `cmd/chat.go`, `cmd/query.go`: core UX features without tests
- `config/`: file parsing errors and env var overrides not tested
Strengths
- Good use of `t.TempDir()` for isolation
- Real git integration in `internal/git/` tests (not mocked)
- Mock injection via function variables (git runner, API client) makes testing feasible
- `t.Parallel()` used consistently where appropriate
- Error type chain verification (`errors.Is`) is present
Weaknesses
- The "Live" test pattern (`TestScaffoldCmd_Live`) uses `testing.Short()` logic inconsistently: it's gated, but the semantics are inverted from convention
- No benchmark tests anywhere
- No CI separation between unit and integration tests — "live" tests quietly depend on environment
- Mocking strategy is inconsistent across packages (some use testify mocks, some manual function variables)
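For reference, the usual Go convention is that `go test -short` skips slow or external tests, so `testing.Short()` returning true should lead to `t.Skip`, never to a run. The sketch below puts that decision in a small pure function so it can be checked on its own. `liveSkipReason` and the `EXAMPLE_API_KEY` variable are placeholders, not names from this project.

```go
package main

import "fmt"

// liveSkipReason returns a non-empty reason when a live test should be
// skipped, and "" when it may run. By convention, short mode always skips.
// A missing credential also skips, with an explicit reason, so the test does
// not silently depend on the environment.
func liveSkipReason(short bool, apiKey string) string {
	switch {
	case short:
		return "skipping live test in -short mode"
	case apiKey == "":
		return "skipping live test: no API key in environment"
	default:
		return ""
	}
}

// In a _test.go file the helper would be used roughly like this
// (EXAMPLE_API_KEY is a placeholder, not a variable the project defines):
//
//	func TestScaffoldCmd_Live(t *testing.T) {
//		if reason := liveSkipReason(testing.Short(), os.Getenv("EXAMPLE_API_KEY")); reason != "" {
//			t.Skip(reason)
//		}
//		// ... exercise the real command ...
//	}

func main() {
	fmt.Printf("%q\n", liveSkipReason(true, "key"))
	fmt.Printf("%q\n", liveSkipReason(false, ""))
	fmt.Printf("%q\n", liveSkipReason(false, "key"))
}
```

Routing every live test through one helper like this would also give CI a single switch for separating unit runs (`-short`) from integration runs.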
Priority Recommendations
High: Add tests for internal/recipe/ execution, grok client error scenarios, and runAgent()/runChat()/runQuery().
Medium: Test config/ file-load failure paths, git error scenarios (uninitialized repo), and file permission errors.
Low: Standardize the Live test pattern, add benchmarks for SSE streaming, add concurrent operation tests.