Evaluation
Reach a 100 Sonde Score
Reproducible path to achieve and verify a 100/100 Sonde score for a CLI.
Purpose
Define a practical and repeatable process to push a CLI toward a 100/100 Sonde score.
Inputs
- A target CLI with stable --help output.
- Support (or equivalent) for machine-readable and non-interactive execution.
- Sonde CLI installed or available via npx.
Outputs
- Stable manifest and run results.
- Score output at or near 100, with actionable deltas if below 100.
Reproducible workflow
1) Generate a fresh manifest
sonde generate <cli> --json
2) Validate deterministic run behavior
sonde run <cli> --json
Review these fields in result:
- ok should be true
- exitCode should be 0
- interactiveDetected should be false
- usedPreferences should include JSON and non-interactive flags when available
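The field review in step 2 can be automated. The following is a minimal sketch, assuming the result object from sonde run --json has already been parsed into a Python dict; the function name check_run_result is hypothetical and not part of Sonde.

```python
from typing import Any


def check_run_result(result: dict[str, Any]) -> list[str]:
    """Return human-readable problems found in a parsed run result object."""
    problems: list[str] = []
    if result.get("ok") is not True:
        problems.append("ok is not true")
    if result.get("exitCode") != 0:
        problems.append(f"exitCode is {result.get('exitCode')!r}, expected 0")
    if result.get("interactiveDetected") is not False:
        problems.append("interactiveDetected is not false")
    # usedPreferences is left for manual review: its shape is not specified
    # here, and the flags only apply when the target CLI supports them.
    return problems


sample = {"ok": True, "exitCode": 0, "interactiveDetected": False, "usedPreferences": []}
print(check_run_result(sample))  # prints []
```

An empty list means the three hard expectations hold; any entry names the field to investigate before scoring.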
3) Validate manifest contract exposure
sonde manifest --json
Use this output as the reference for what target CLIs should expose via their own manifest command.
4) Score with repeat runs
sonde score <cli> --json
Target:
{
  "ok": true,
  "apiVersion": "1.0.0",
  "command": "score",
  "result": {
    "manifestVersion": "1.0.0",
    "generatedAt": "2026-03-05T00:00:00.000Z",
    "total": 100
  }
}
What usually prevents 100/100
- CLI output changes across runs.
- Prompt-like output appears in stdout/stderr.
- Non-zero exits in baseline or repeat run.
- Missing machine-readable output support.
- Missing AI-native capabilities (manifest, non-interactive, dry-run, field selection).
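Several of these causes can be caught before scoring by running the target command twice and comparing the results. The sketch below is an independent pre-check, not Sonde's own logic: the function name check_repeat_runs is hypothetical, and the prompt patterns are guesses at what prompt-like output might look like.

```python
import re
import subprocess
import sys

# Heuristic only: assumed examples of prompt-like text, not Sonde's detection rules.
PROMPT_PATTERN = re.compile(r"\[y/n\]|\(y/n\)|press enter|password:", re.IGNORECASE)


def check_repeat_runs(argv: list[str], runs: int = 2, timeout: float = 30.0) -> list[str]:
    """Run argv several times and report exit, prompt, and stability problems."""
    problems: list[str] = []
    outputs = []
    for i in range(runs):
        # stdin is closed so a command that waits for input cannot hang on a
        # terminal; subprocess.TimeoutExpired is raised if it still stalls.
        proc = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            stdin=subprocess.DEVNULL,
            timeout=timeout,
        )
        if proc.returncode != 0:
            problems.append(f"run {i + 1}: exit code {proc.returncode}")
        if PROMPT_PATTERN.search(proc.stdout + proc.stderr):
            problems.append(f"run {i + 1}: prompt-like output")
        outputs.append((proc.stdout, proc.stderr))
    # Any difference in stdout or stderr between runs counts as unstable.
    if len(set(outputs)) > 1:
        problems.append("output differs across runs")
    return problems
```

For example, check_repeat_runs(["mycli", "--help"]), with mycli standing in for the target binary, returns an empty list when every run exits 0, prints nothing matching the prompt patterns, and produces identical output.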
Improving score systematically
- Prefer stable output formats (--json, --format json, or equivalent).
- Add explicit non-interactive mode (--non-interactive, --yes, etc.).
- Add a manifest command and expose required capability flags in help output.
- Add --dry-run and --fields (or equivalent) for safer agent execution.
- Keep --help and command metadata deterministic.
- Avoid timestamps/random values in default output paths checked by Sonde.
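As an illustration of the flags listed above, here is a minimal sketch of a target CLI built with Python's argparse. The program name mycli and its payload are hypothetical; the sketch shows only the flag surface and stable output, and does not implement a manifest command.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mycli", description="Hypothetical example CLI.")
    parser.add_argument("--json", action="store_true", help="emit machine-readable JSON")
    # This sketch never prompts; the flag is accepted so agents can always pass it.
    parser.add_argument("--non-interactive", action="store_true", help="never prompt")
    parser.add_argument("--dry-run", action="store_true", help="report actions without performing them")
    parser.add_argument("--fields", default="", help="comma-separated output fields to keep")
    return parser


def main(argv: list[str]) -> str:
    args = build_parser().parse_args(argv)
    # Fixed payload: no timestamps or random values, so repeat runs match.
    payload = {"name": "example", "status": "ready", "dryRun": args.dry_run}
    if args.fields:
        wanted = [f.strip() for f in args.fields.split(",") if f.strip()]
        payload = {k: v for k, v in payload.items() if k in wanted}
    if args.json:
        # sort_keys keeps key order stable however the payload was built.
        return json.dumps(payload, sort_keys=True)
    return "\n".join(f"{k}: {v}" for k, v in sorted(payload.items()))


print(main(["--json", "--fields", "name,status"]))
# prints {"name": "example", "status": "ready"}
```

Sorting keys and keeping volatile values out of the default payload are the parts that matter for repeat-run stability; the flag names themselves follow the suggestions in the list.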
Versioning and compatibility
- Ensure manifest version remains on supported major version 1 while upgrading Sonde.
- Use manifestVersion to enforce compatibility and generatedAt to track report freshness.
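These two checks can be scripted against the score output. The sketch below assumes the JSON envelope shown under step 4; the function name check_score_report and the seven-day freshness window are assumptions made for illustration, not Sonde defaults. While the report is parsed, the same function also reports the gap to 100.

```python
import json
from datetime import datetime, timedelta, timezone


def check_score_report(raw: str, now: datetime, max_age: timedelta = timedelta(days=7)) -> list[str]:
    """Check manifest major version, report age, and total in a score report."""
    result = json.loads(raw).get("result", {})
    problems: list[str] = []

    major = str(result.get("manifestVersion", "")).split(".")[0]
    if major != "1":
        problems.append(f"unsupported manifestVersion major: {major!r}")

    generated = result.get("generatedAt")
    try:
        # Swap the trailing Z for an offset so older Python versions parse it.
        generated_at = datetime.fromisoformat(generated.replace("Z", "+00:00"))
    except (AttributeError, ValueError):
        problems.append("generatedAt is missing or not ISO 8601")
    else:
        if now - generated_at > max_age:
            problems.append("report is older than the allowed age")

    total = result.get("total")
    if not isinstance(total, (int, float)):
        problems.append("total is missing")
    elif total < 100:
        problems.append(f"total is {total}, {100 - total} below target")
    return problems


raw = """{"ok": true, "apiVersion": "1.0.0", "command": "score",
 "result": {"manifestVersion": "1.0.0",
            "generatedAt": "2026-03-05T00:00:00.000Z", "total": 100}}"""
print(check_score_report(raw, now=datetime(2026, 3, 6, tzinfo=timezone.utc)))  # prints []
```

Passing now explicitly (as a timezone-aware datetime) keeps the check itself deterministic, which makes it easier to run in CI.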
Edge cases
- run and score currently execute manifest.cli.binary --help; ensure that path is deterministic first.
- Passing <cli> to run/score does not override manifest.cli.binary.
- AI-native capability gaps are score penalties with remediation notes (not hard failures).