Interesting, do you plan to make it available for other languages?
I'm not a testing expert, so I don't know whether something similar is already on the market.
Your point of only retesting the diff is great, as it will reduce CD costs a lot on big projects :)
Thanks a lot! About other languages though, the concept transfers but this implementation does not. The two things that make it work are Python-specific: coverage.py's C tracer for the function-level map, and sys.addaudithook for catching non-Python file reads (such as JSON fixtures and templates). AFAIK you would need to rebuild both primitives from scratch for another runtime. I'm not planning to do so, but the same approach would be valid.
As for the market, the idea is not new. The file-level version exists in several other ecosystems: Jest has --changedSince, Nx does affected-test detection for monorepos, and Bazel/Pants track build-graph edges. The Python space has pytest-testmon for local dev. Where the gap was (for me) is a CI-shareable map plus honest benchmarks. The CI-sharing part matters more than it sounds. If the map lives on one developer's machine, it does not help your PR pipeline.
Whether it reduces CD costs in practice depends a lot on how decoupled your test suite is. The honest range of test-run reduction from my benchmark is ~21% (Flask, a tightly coupled codebase) to ~96% (Boltons, one test file per module). Both are real per-commit replays, not just synthetic scenarios.
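A toy illustration of why coupling drives that number, assuming a simple test-to-dependency map. The maps, file names and helper functions below are invented for the example and have nothing to do with the Flask or Boltons results.

```python
def select_tests(test_map, changed):
    """Return the tests whose recorded dependencies overlap the changed files."""
    return {test for test, deps in test_map.items() if deps & changed}


def reduction(test_map, changed):
    """Fraction of the suite that can be skipped for this change."""
    return 1 - len(select_tests(test_map, changed)) / len(test_map)


# Hypothetical decoupled suite: each test touches only its own module.
decoupled = {
    "test_a": {"a.py"},
    "test_b": {"b.py"},
    "test_c": {"c.py"},
    "test_d": {"d.py"},
}

# Hypothetical coupled suite: most tests also go through a shared app.py.
coupled = {
    "test_a": {"app.py", "a.py"},
    "test_b": {"app.py", "b.py"},
    "test_c": {"app.py", "c.py"},
    "test_d": {"d.py"},
}

decoupled_saving = reduction(decoupled, {"a.py"})  # only test_a is selected
coupled_saving = reduction(coupled, {"app.py"})    # three of four tests are selected
```

Here `decoupled_saving` is 0.75 and `coupled_saving` is 0.25: a commit that touches shared code pulls in almost the whole suite, while a commit to an isolated module skips most of it.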