2 comments

  • yarivk 6 hours ago

    Nice project.

    Voice to text with a push to talk workflow makes a lot of sense for developers who don't want constant background listening.

    Curious how the local mode performs compared to the cloud version in terms of latency and accuracy.

    • AleksDoesCode 2 hours ago

      Hey yarivk,

      Thanks, I appreciate it! The local mode depends heavily on the hardware and the model you choose.

      I run it exclusively in local mode nowadays: an old MacBook Pro with an M1 chip and 32 GB of RAM + the Large v3 Turbo model.

      Transcribing one minute of audio takes around 2-3 seconds, compared to 0.5 to 1 second when using the OpenAI API.

      For my use case this is a no-brainer. Not having to pay + keeping all of my data private is well worth waiting ~1.5 seconds longer. Also, I simply think it's pretty cool to run models locally, so there's that :D
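      If you want to sanity-check the trade-off yourself, the quoted ranges work out like this (a back-of-envelope sketch; taking the midpoints of the ranges above is my assumption, not a benchmark):

      ```python
      # Rough latency gap per minute of audio, using the ranges quoted above:
      # local M1 + Large v3 Turbo: 2-3 s; OpenAI API: 0.5-1 s.
      local_s = (2.0 + 3.0) / 2   # midpoint of the local range
      cloud_s = (0.5 + 1.0) / 2   # midpoint of the API range
      extra = local_s - cloud_s   # extra wait per minute of audio

      print(f"extra wait per minute of audio: ~{extra:.2f} s")  # ~1.75 s
      ```

      So for a typical one-minute dictation you're giving up well under two seconds in exchange for zero API cost and full privacy.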

      Feel free to try it out (it's completely free) and lmk your thoughts!