3 comments

  • leoncos a day ago ago

    When I use Codex/Claude to complete a computer vision task, such as extracting assets from an image, OpenCV is their default solution. However, I believe that using YOLO and other methods is outdated. The best solution now is to directly use Nano Banana or other AI image models. A paper has proven that image generation models can perform most CV tasks well. I believe the new OpenCV should become a wrapper for VLM or AI image models.

    • serf 18 hours ago ago

      do you realize how many edge or unconnected nodes do OpenCV work?

      some SBC w/ an industrial camera that is doing pick-place or go/no-go operations on a conveyor belt against a singular object type doesn't need a huge image-gen/llm model governing it.

      I mean have you even considered the kind of performance an opencv function can get w/ just mask-matching? I mean even with a fancy YOLO model these answers get thrown out in 1.5-50ms ; this is just a wholly different time scaling.

  • hbcondo714 13 hours ago ago

    > LLMs and VLMs, Running Inside OpenCV…Qwen 2.5, Gemma 3, PaliGemma, and the GPT-2 / GPT-4 family

    Why these specific models / versions?