12 comments

  • recsv-heredoc 16 minutes ago

    The market timing on this is perfect - it fills a major gap I've seen emerging.

    I've heard a few stories of QA departments nearing burnout due to the rate developers are shipping at these days. Even we're looking for any available QA resources we can pull in here.

    No harm meant with the question - but what's the advantage over Claude Code + the GitHub integrations?

  • blintz 3 hours ago

    I really want automated QA to work better! It's a great thing to work on.

    Some feedback:

    - I definitely don't want three long new messages on every PR. Max 1, ideally none? Codex does a great job just using emoji.

    - The replay is cool. I don't make a website, so maybe I'm not the target market, but I'd like QA for our backend.

    - Honestly, I'd rather just run a massive QA run every day, and then have any failures bisected, rather than per-PR.

    - I am worried that there's not a lot of value beyond the intelligence of the foundation models here.
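    The daily-run-and-bisect idea above amounts to a binary search over the commits since the last green run. A minimal sketch of that search (the names `first_bad_commit` and `is_bad` are hypothetical, and it assumes the failure persists once introduced, the same monotonicity assumption `git bisect` makes):

```python
def first_bad_commit(commits, is_bad):
    """Find the first commit where a QA scenario fails.

    commits: commit ids ordered oldest -> newest.
    is_bad:  predicate that replays the failing scenario at a commit
             (e.g. checks it out and runs the scenario).
    Assumes once a commit is bad, every later commit is also bad.
    """
    if not commits:
        return None
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid      # failure already present; look earlier
        else:
            lo = mid + 1  # still green here; culprit is later
    return commits[lo] if is_bad(commits[lo]) else None
```

    Like `git bisect`, this needs only O(log n) scenario replays per failure, which is what makes one big nightly run cheaper than re-testing every PR.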

    • Bnjoroge 2 hours ago

      Agree on your last point, and it's going to be a very bitter lesson. In any case, you probably want to shift a lot of the code verification as far left as possible, so doing review at PR time isn't the right strategy imo. And Claude/Codex are well positioned to do the local review.

    • Visweshyc 3 hours ago

      Thanks for the feedback!

      - Agreed that the form factor can be condensed, with a link to the detailed information
      - With the codebase understanding, backend is where we are looking to expand and provide value
      - The intelligence of the models lays the foundation, but combining the strengths of these models unlocks a system of specialized agents that each reason about the codebase differently to catch the unknown unknowns

  • warmcat 3 hours ago

    Good work. But what makes this different from just another feature in Gemini Code Assist or GitHub Copilot?

    • Visweshyc 2 hours ago

      Thanks! To execute these tests reliably, you would need custom browser fleets, ephemeral environments, data seeding, and device farms.

  • Bnjoroge 2 hours ago

    What kinds of tests does it generate, and how is this different from the tens of code review startups out there?

    • Visweshyc 2 hours ago

      The system focuses on going beyond the happy path and generating edge case tests that try to break the application. For example, a Grafana PR added visual drag feedback to query cards. The system came up with an edge case like: does drag feedback still work when there's only one card in the list, with nothing to reorder against?

  • solfox 3 hours ago

    Not a direct competitor but another YC company I use and enjoy for PR reviews is cubic.dev. I like your focus on automated tests.

    • Visweshyc 3 hours ago

      Thanks! We believe executing the scenarios and showing what actually broke closes the loop.

  • solfox 3 hours ago

    Looks interesting! Looks like perhaps no support for Flutter apps yet?

    • Visweshyc 2 hours ago

      Yes, we currently support web apps, but we plan to extend the foundation to test mobile applications on device emulators.