grokkit/docs/developer-guide/TEST-COVERAGE.md
Greg Gauthier c49f6d84ef docs: add WOMM certification badge and developer guides
- Add WOMM bronze certification SVG and text certificate
- Add CLAUDE.md for AI coding guidance
- Update README.md with certification badge
- Add TEST-COVERAGE.md in docs/developer-guide
2026-04-02 17:25:45 +01:00

# Test Coverage Assessment
**Date:** 2026-04-02

**Scope:** All packages in `cmd/`, `internal/`, and `config/`

**Test files reviewed:** 32 across 12 packages

---
## Overall: ~60% unit / ~20% integration
The pattern is consistent: **pure helper functions are well-tested; the actual command execution paths are mostly not**.
---
## Per-Package Breakdown
| Package | Est. Coverage | Notes |
|---|---|---|
| `internal/logger/` | ~85% | Comprehensive public API coverage |
| `internal/prompts/` | ~85% | All load paths + fallback covered |
| `internal/errors/` | ~80% | Error types + unwrapping covered |
| `internal/todo/` | ~80% | Bootstrap idempotency solid |
| `internal/linter/` | ~75% | Language detection strong; output parsing weak |
| `config/` | ~75% | Config getters good; file-load errors not tested |
| `internal/workon/` | ~70% | State transitions good; `Run()` itself untested |
| `internal/git/` | ~70% | Real git integration, good core coverage |
| `internal/grok/` | ~60% | Streaming + SSE parsing tested; error paths absent |
| `internal/version/` | ~50% | Just variable presence check |
| `internal/recipe/` | ~50% | Loading tested; execution engine (`Run()`, `refactorFiles()`, `handleApplyStep()`) entirely untested |
| `cmd/` | ~25–35% | Message builders tested; nearly all `run*()` functions untested |
---
## What's Systematically Untested
**Command execution** — Most `run*()` functions in `cmd/` have zero direct tests:
- `runAgent()`, `runChat()`, `runQuery()`, `runRecipe()`, `runTestgen()`, `runWorkon()`, `runChangelogCommand()`, `runDocs()`
**API error scenarios** — The grok client has no tests for: HTTP errors, rate limits, auth failures, malformed SSE chunks, timeouts, or stream cancellation.
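One way the missing API error cases could be covered is with `net/http/httptest`, serving fixed status codes to the client under test. This is a minimal sketch: `doRequest`, `ErrRateLimited`, and `ErrUnauthorized` are hypothetical stand-ins, not the actual `internal/grok/` API, which presumably has its own request function and error types.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"testing"
)

// Hypothetical sentinel errors; the real error types may differ.
var (
	ErrRateLimited  = errors.New("rate limited")
	ErrUnauthorized = errors.New("unauthorized")
)

// doRequest is a stand-in for the real client call. It maps HTTP
// status codes to sentinel errors so callers can use errors.Is.
func doRequest(c *http.Client, url string) error {
	resp, err := c.Get(url)
	if err != nil {
		return fmt.Errorf("request failed: %w", err)
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusOK:
		return nil
	case http.StatusTooManyRequests:
		return ErrRateLimited
	case http.StatusUnauthorized:
		return ErrUnauthorized
	default:
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
}

// statusServer starts a local test server that always replies
// with the given status code.
func statusServer(status int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	}))
}

func TestDoRequest_ErrorStatuses(t *testing.T) {
	cases := []struct {
		name   string
		status int
		want   error
	}{
		{"ok", http.StatusOK, nil},
		{"rate limited", http.StatusTooManyRequests, ErrRateLimited},
		{"unauthorized", http.StatusUnauthorized, ErrUnauthorized},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			srv := statusServer(tc.status)
			defer srv.Close()

			err := doRequest(srv.Client(), srv.URL)
			if !errors.Is(err, tc.want) {
				t.Fatalf("got %v, want %v", err, tc.want)
			}
		})
	}
}
```

The same table could be extended with handlers that write a truncated SSE chunk or sleep past a client timeout, which would address the malformed-chunk and timeout cases listed above.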
**Recipe execution:** `internal/recipe/` tests only YAML loading. The entire execution side (`Run`, `refactorFiles`, `handleApplyStep`, `executeReadOnlyShell`) is untouched.
**User interaction** — All interactive confirmation/prompt flows are untested. No mock stdin anywhere.
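Interactive flows become testable once the prompt reads from an `io.Reader` instead of `os.Stdin` directly. The sketch below assumes a hypothetical `confirm` helper; the real confirmation code in `cmd/` may be shaped differently and might need a small refactor before this applies.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"testing"
)

// confirm is a hypothetical prompt helper. Taking the input and output
// streams as parameters is what allows a test to supply fake stdin.
func confirm(in io.Reader, out io.Writer, question string) bool {
	fmt.Fprintf(out, "%s [y/N]: ", question)
	// On EOF, ReadString still returns whatever was read, so the
	// error can be ignored and an empty answer treated as "no".
	line, _ := bufio.NewReader(in).ReadString('\n')
	switch strings.ToLower(strings.TrimSpace(line)) {
	case "y", "yes":
		return true
	}
	return false
}

// answer feeds a canned stdin string to confirm and discards the prompt.
func answer(input string) bool {
	return confirm(strings.NewReader(input), io.Discard, "Apply changes?")
}

func TestConfirm(t *testing.T) {
	cases := map[string]bool{
		"y\n":   true,
		"YES\n": true,
		"n\n":   false,
		"\n":    false, // bare Enter defaults to "no"
		"":      false, // EOF with no input
	}
	for input, want := range cases {
		if got := answer(input); got != want {
			t.Errorf("confirm with input %q = %v, want %v", input, got, want)
		}
	}
}
```

In production the caller would pass `os.Stdin` and `os.Stdout`, so behaviour is unchanged outside tests.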
**File system errors** — Read-only paths, permission failures, disk full — none tested.
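File system failures can usually be provoked inside `t.TempDir()` without touching the real environment. The following is a sketch under assumptions: `readConfig` is a hypothetical loader, the `.cfg` file names are placeholders, and the permission test assumes a Unix-like system (it skips itself when run as root, since root bypasses permission checks).

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"testing"
)

// readConfig is a hypothetical loader. It wraps the underlying error
// with %w so callers can still match it with errors.Is.
func readConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config %s: %w", path, err)
	}
	return data, nil
}

func TestReadConfig_Missing(t *testing.T) {
	_, err := readConfig(filepath.Join(t.TempDir(), "missing.cfg"))
	if !errors.Is(err, fs.ErrNotExist) {
		t.Fatalf("got %v, want fs.ErrNotExist", err)
	}
}

func TestReadConfig_PermissionDenied(t *testing.T) {
	if os.Geteuid() == 0 {
		t.Skip("root bypasses file permission checks")
	}
	path := filepath.Join(t.TempDir(), "locked.cfg")
	// Mode 0o000 makes the file unreadable for the current user.
	if err := os.WriteFile(path, []byte("x"), 0o000); err != nil {
		t.Fatal(err)
	}
	_, err := readConfig(path)
	if !errors.Is(err, fs.ErrPermission) {
		t.Fatalf("got %v, want fs.ErrPermission", err)
	}
}
```

Disk-full conditions are harder to simulate portably; injecting a failing writer is one option, but that is outside the scope of this sketch.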
---
## Highest-Impact Gaps
1. `internal/recipe/` — execution engine is production code with zero test coverage
2. `internal/grok/` — error path coverage missing entirely for a network client
3. `cmd/agent.go`, `cmd/chat.go`, `cmd/query.go` — core UX features without tests
4. `config/` — file parsing errors and env var overrides not tested
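For item 4, environment overrides can be tested with `t.Setenv`, which restores the previous value when the test ends. The sketch below is illustrative only: `resolveModel` and the `EXAMPLE_MODEL` variable are hypothetical, and the "environment wins over file" precedence is an assumption about how `config/` behaves.

```go
package main

import (
	"os"
	"testing"
)

// resolveModel is a hypothetical getter: a non-empty environment value
// takes precedence over the value loaded from the config file.
// The lookup function is a parameter so it can be replaced in tests.
func resolveModel(fileValue string, getenv func(string) string) string {
	if v := getenv("EXAMPLE_MODEL"); v != "" {
		return v
	}
	return fileValue
}

func TestResolveModel_EnvOverride(t *testing.T) {
	t.Setenv("EXAMPLE_MODEL", "from-env")
	if got := resolveModel("from-file", os.Getenv); got != "from-env" {
		t.Fatalf("got %q, want %q", got, "from-env")
	}
}

func TestResolveModel_FileFallback(t *testing.T) {
	// An empty value counts as unset in this sketch.
	t.Setenv("EXAMPLE_MODEL", "")
	if got := resolveModel("from-file", os.Getenv); got != "from-file" {
		t.Fatalf("got %q, want %q", got, "from-file")
	}
}
```

Note that `t.Setenv` cannot be used in tests that call `t.Parallel()`, so such tests would need to run serially.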
---
## Strengths
- Good use of `t.TempDir()` for isolation
- Real git integration in `internal/git/` tests (not mocked)
- Mock injection via function variables (git runner, API client) makes testing feasible
- `t.Parallel()` used consistently where appropriate
- Error type chain verification (`errors.Is`) is present
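For readers unfamiliar with the function-variable injection mentioned above, the general shape looks roughly like this. All names here (`runGit`, `currentBranch`, `stubGit`) are hypothetical and chosen for illustration; they are not taken from `internal/git/`.

```go
package main

import (
	"errors"
	"os/exec"
	"strings"
	"testing"
)

// runGit is a package-level function variable. Production code calls
// through it; tests replace it with a stub.
var runGit = func(args ...string) (string, error) {
	out, err := exec.Command("git", args...).Output()
	return string(out), err
}

// currentBranch is a hypothetical caller that depends on runGit.
func currentBranch() (string, error) {
	out, err := runGit("rev-parse", "--abbrev-ref", "HEAD")
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(out), nil
}

// stubGit swaps runGit for a canned response and returns a function
// that restores the original.
func stubGit(out string, err error) (restore func()) {
	orig := runGit
	runGit = func(...string) (string, error) { return out, err }
	return func() { runGit = orig }
}

func TestCurrentBranch(t *testing.T) {
	t.Cleanup(stubGit("main\n", nil))
	got, err := currentBranch()
	if err != nil || got != "main" {
		t.Fatalf("got (%q, %v), want (%q, nil)", got, err, "main")
	}
}

func TestCurrentBranch_Error(t *testing.T) {
	t.Cleanup(stubGit("", errors.New("not a git repository")))
	if _, err := currentBranch(); err == nil {
		t.Fatal("expected an error")
	}
}
```

One trade-off of this pattern: because the stub mutates package-level state, tests that swap the variable should not also call `t.Parallel()`.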
## Weaknesses
- The "Live" test pattern (`TestScaffoldCmd_Live`) uses `testing.Short()` logic inconsistently — it's gated but the semantics are inverted from convention
- No benchmark tests anywhere
- No CI separation between unit and integration tests — "live" tests quietly depend on environment
- Mocking strategy is inconsistent across packages (some use testify mocks, some manual function variables)
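For reference, the usual Go convention is that `go test -short` skips slow or externally dependent tests, so a live test skips itself when `testing.Short()` returns true. A minimal sketch of that gating follows; `liveSkipReason` and the `EXAMPLE_API_KEY` variable are hypothetical, and the body of the real `TestScaffoldCmd_Live` is not reproduced here.

```go
package main

import (
	"os"
	"testing"
)

// liveSkipReason returns a non-empty reason when a live test should be
// skipped, and an empty string when it should run.
func liveSkipReason(short bool, apiKey string) string {
	switch {
	case short:
		return "live test skipped in -short mode"
	case apiKey == "":
		return "live test skipped: no API key in environment"
	}
	return ""
}

func TestScaffoldCmd_Live(t *testing.T) {
	// EXAMPLE_API_KEY is a placeholder for whatever credential the
	// live tests actually need.
	if reason := liveSkipReason(testing.Short(), os.Getenv("EXAMPLE_API_KEY")); reason != "" {
		t.Skip(reason)
	}
	// ... exercise the real API here ...
}
```

With this shape, CI could run `go test -short ./...` for the unit stage and a plain `go test ./...` with credentials for the integration stage, and a missing credential produces a visible skip rather than a silent environment dependency.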
---
## Priority Recommendations
**High:** Add tests for `internal/recipe/` execution, grok client error scenarios, and `runAgent()`/`runChat()`/`runQuery()`.

**Medium:** Test `config/` file-load failure paths, git error scenarios (uninitialized repo), and file permission errors.

**Low:** Standardize the Live test pattern, add benchmarks for SSE streaming, add concurrent operation tests.
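As a starting point for the SSE benchmark recommendation, a benchmark can feed a fixed in-memory payload through the parser and report throughput. The parser below is a deliberately simplified, hypothetical stand-in (it only collects `data: ` lines) and is not the parser in `internal/grok/`.

```go
package main

import (
	"bufio"
	"io"
	"strings"
	"testing"
)

// parseSSEData is a simplified, hypothetical parser: it collects the
// payload of every "data: " line and ignores all other SSE fields.
func parseSSEData(r io.Reader) ([]string, error) {
	var events []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "data: ") {
			events = append(events, strings.TrimPrefix(line, "data: "))
		}
	}
	return events, sc.Err()
}

// parseSSEString is a convenience wrapper for in-memory payloads.
func parseSSEString(s string) ([]string, error) {
	return parseSSEData(strings.NewReader(s))
}

func BenchmarkParseSSEData(b *testing.B) {
	// 1000 events, each terminated by a blank line as in SSE framing.
	payload := strings.Repeat("data: {\"delta\":\"hi\"}\n\n", 1000)
	b.SetBytes(int64(len(payload))) // enables MB/s reporting
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		events, err := parseSSEString(payload)
		if err != nil || len(events) != 1000 {
			b.Fatalf("got %d events, err %v", len(events), err)
		}
	}
}
```

It would be run with `go test -bench=. -benchmem`; swapping in the real parser should only require changing the function called inside the loop.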