Greg Gauthier c49f6d84ef docs: add WOMM certification badge and developer guides

- Add WOMM bronze certification SVG and text certificate
- Add CLAUDE.md for AI coding guidance
- Update README.md with certification badge
- Add TEST-COVERAGE.md in docs/developer-guide

2026-04-02 17:25:45 +01:00


Test Coverage Assessment

Date: 2026-04-02
Scope: All packages in cmd/, internal/, and config/
Test files reviewed: 32 across 12 packages

Overall: ~60% unit / ~20% integration

The pattern is consistent: pure helper functions are well-tested; the actual command execution paths are mostly not.

Per-Package Breakdown

Package	Est. Coverage	Notes
`internal/logger/`	~85%	Comprehensive public API coverage
`internal/prompts/`	~85%	All load paths + fallback covered
`internal/errors/`	~80%	Error types + unwrapping covered
`internal/todo/`	~80%	Bootstrap idempotency solid
`internal/linter/`	~75%	Language detection strong; output parsing weak
`config/`	~75%	Config getters good; file-load errors not tested
`internal/workon/`	~70%	State transitions good; `Run()` itself untested
`internal/git/`	~70%	Real git integration, good core coverage
`internal/grok/`	~60%	Streaming + SSE parsing tested; error paths absent
`internal/version/`	~50%	Just variable presence check
`internal/recipe/`	~50%	Loading tested; execution engine (`Run()`, `refactorFiles()`, `handleApplyStep()`) entirely untested
`cmd/`	~25–35%	Message builders tested; nearly all `run*()` functions untested

What's Systematically Untested

Command execution — Most run*() functions in cmd/ have zero direct tests:

runAgent(), runChat(), runQuery(), runRecipe(), runTestgen(), runWorkon(), runChangelogCommand(), runDocs()

API error scenarios — The grok client has no tests for: HTTP errors, rate limits, auth failures, malformed SSE chunks, timeouts, or stream cancellation.

Recipe execution — internal/recipe/ tests only YAML loading. The entire execution side (Run, refactorFiles, handleApplyStep, executeReadOnlyShell) is untouched.

User interaction — All interactive confirmation/prompt flows are untested. No mock stdin anywhere.

File system errors — Read-only paths, permission failures, disk full — none tested.

Highest-Impact Gaps

internal/recipe/ — execution engine is production code with zero test coverage
internal/grok/ — error path coverage missing entirely for a network client
cmd/agent.go, cmd/chat.go, cmd/query.go — core UX features without tests
config/ — file parsing errors and env var overrides not tested

Strengths

Good use of t.TempDir() for isolation
Real git integration in internal/git/ tests (not mocked)
Mock injection via function variables (git runner, API client) makes testing feasible
t.Parallel() used consistently where appropriate
Error type chain verification (errors.Is) is present

Weaknesses

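The "Live" test pattern (TestScaffoldCmd_Live) uses testing.Short() inconsistently: the test is gated, but the gate is inverted relative to the usual Go convention, in which a slow or environment-dependent test calls t.Skip() when testing.Short() returns true (that is, under go test -short)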
No benchmark tests anywhere
No CI separation between unit and integration tests; "live" tests silently depend on the environment they run in
Mocking strategy is inconsistent across packages (some use testify mocks, some manual function variables)

Priority Recommendations

High: Add tests for internal/recipe/ execution, grok client error scenarios, and runAgent()/runChat()/runQuery().

Medium: Test config/ file-load failure paths, git error scenarios (uninitialized repo), and file permission errors.

Low: Standardize the Live test pattern, add benchmarks for SSE streaming, add concurrent operation tests.
