Show HN: Open source framework OpenAI uses for Advanced Voice

(github.com)

95 points | by russ 10 hours ago

17 comments

  • pj_mukh 5 hours ago

    Super cool! Didn't realize OpenAI is just using LiveKit.

    Does the pricing break down to be the same as having an OpenAI Advanced Voice socket open the whole time? It's like $9/hr!

    It would theoretically be cheaper to use this without keeping the Advanced Voice socket open the whole time: use the GPT-4o streaming service [1] only when inference is needed (pay per token), and use LiveKit's other components for the rest (TTS, VAD, etc.).

    What's the trade-off here?

    [1]: https://platform.openai.com/docs/api-reference/streaming

    • davidz 4 hours ago

      Currently it does: all audio is sent to the model.

      However, we are working on turn detection within the framework, so you won't have to send silence to the model when the user isn't talking. It's a fairly straightforward path to cutting the cost by ~50%.
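      The gating idea described above can be sketched with a simple energy-threshold VAD (purely illustrative; this is not LiveKit's actual turn-detection code, and the frame format, threshold, and hangover values are assumptions):

```python
# Sketch: gate audio frames with an energy-based VAD so silence is
# never forwarded to the per-audio-token realtime model.
import struct

def frame_energy(pcm16: bytes) -> float:
    """Mean absolute amplitude of a 16-bit little-endian PCM frame."""
    samples = struct.unpack(f"<{len(pcm16) // 2}h", pcm16)
    return sum(abs(s) for s in samples) / max(len(samples), 1)

def gate_frames(frames, threshold=500.0, hangover=3):
    """Yield frames only during (or shortly after) detected speech.

    `hangover` keeps a few trailing frames so word endings aren't clipped;
    silent frames past that window are dropped instead of sent upstream.
    """
    quiet = hangover
    for frame in frames:
        if frame_energy(frame) >= threshold:
            quiet = 0
        else:
            quiet += 1
        if quiet <= hangover:
            yield frame  # would be forwarded to the model
```

      In a conversation that is roughly half silence, dropping the silent frames is where the ~50% saving comes from.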

  • solarkraft 2 hours ago

    That’s some crazy marketing for an “our library happened to support this relatively simple use case” situation. Impressive!

    By the way: The cerebras voice demo also uses LiveKit for this: https://cerebras.vercel.app/

    • russ 26 minutes ago

      There’s a ton of complexity under the “relatively simple use case” when you get to a global, 200M+ user scale.

  • FanaHOVA 6 hours ago

    Olivier, Michelle, and Romain gave you guys a shoutout like 3 times in our DevDay recap podcast if you need more testimonial quotes :) https://www.latent.space/p/devday-2024

    • russ 4 hours ago

      I had no idea! <3 Thank you for sharing this, made my weekend.

    • shayps 3 hours ago

      You guys are honestly the best

  • mycall 6 hours ago

    I wonder when Azure OpenAI will get this.

    • davidz 4 hours ago

      I'm working on a PR now :)

  • gastonmorixe 6 hours ago

    Nice they have many partners on this. I see Azure as well.

    There is a consensus that the new Realtime API is not actually using the same Advanced Voice model/engine (or however it works), since at least the TTS part doesn’t seem to be as capable as the one shipped with the official OpenAI app.

    Any idea on this?

    Source: https://github.com/openai/openai-realtime-api-beta/issues/2

    • russ 4 hours ago

      It's using the same model/engine. I don't have knowledge of the internals, but there is a different subsystem/set of dedicated resources for API traffic versus first-party apps.

      One thing to note: there is no separate TTS phase here; it happens internally within GPT-4o, in both the Realtime API and Advanced Voice.

  • willsmith72 4 hours ago

    That was cool, but got up to $1 usage real quick

    • russ 4 hours ago

      We had our playground (https://playground.livekit.io) up for a few days using our key. Def racked up a $$$$ bill!

      • wordpad25 2 hours ago

        How much is it per minute of talking?

        • russ 2 hours ago

          50% human speaking at $0.06/minute of tokens

          50% AI speaking at $0.24/minute of tokens

          we (LiveKit Cloud) charge ~$0.0005/minute for each participant (in this case there would be 2)

          So blended is $0.151/minute
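          As a sanity check, the blended figure follows directly from the rates quoted above (the 50/50 speaking split is the assumption stated in the comment):

```python
# Reproducing the blended per-minute cost from the quoted rates.
human_rate = 0.06      # $/min of audio tokens while the human speaks
ai_rate = 0.24         # $/min of audio tokens while the AI speaks
livekit_rate = 0.0005  # $/min per participant on LiveKit Cloud
participants = 2       # the user and the agent

blended = 0.5 * human_rate + 0.5 * ai_rate + participants * livekit_rate
print(f"${blended:.3f}/minute")  # → $0.151/minute
```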

        • shayps 2 hours ago

          It shakes out to around $0.15 per minute for an average conversation. If history is a guide though, this will get a lot cheaper pretty quickly.

          • cdolan 26 minutes ago

            This is cheaper than old cellular calls, inflation adjusted