I just tried to build it on Linux, and it keeps panicking because it requires dozen(s) of API keys. I was not expecting that from local-first software.
I'm always amazed at these relatively tiny projects that "launch" with a "customers" list that reads like they've spent 10 years doing hard outbound enterprise sales: Google, Intel, Apple, Amazon, Deloitte, IBM, Ford, Meta, Uber, Tencent, etc.
We’re firmly in a world where “cheat on everything” is an acceptable business: startups that were hacked together in a week at YC claim they have SOC 2, and vibecoded GPT wrappers claim they “trained a model”. Shameless lying has taken over tech, and if anyone catches you lying, you double down, make a scene, and a bunch of podcasts will talk about you. Free advertising!
Of course, dishonesty is as old as time, but these last couple of years have been hard to watch…
have to admit that we did some logo plays. but our users really are all over the place and we just wanted to show that off! i'm not sure how it looked, but that's why we didn't use terms like "teams" or "customers" - to be honest while still showing some validation.
"Logo play" is such a YCombinator word for Lie.
It says "Our users are everywhere" and shows some logos for the companies these users are from.
If the users are from those companies, this is not lying.
If they added logos for companies their users are not from, it would be lying.
Adding a logo to your webpage tends to follow different patterns depending on the stage of the company.
Early stage companies show things like "people at X, Y, Z use our product!" (showing logos without permission), whilst later stage ones tend to show logos after asking for permission, and with more formal case studies.
They may not have asked for permission to show these logos, but that's not the same thing as lying.
There's a lot of heavy lifting in the idea that someone who happens to work for, say, Google and tried or used your product of their own volition is the same as your product being "used by Google".
It's a lie of accuracy, but still a lie.
The customer used a Gmail address!
I feel like everyone already understands that argument and it won't convince anyone that it's any less of a lie.
> we did some logo plays
Help me understand what this means
The most honest version is that the company is paying for the tool. The most stretched version I’ve seen is a former employee of a company using the tool in a personal capacity. Most commonly, for newly launched things, it means someone with an @company email has tried the tool (even if they didn’t pay). You could, for example, set up a waitlist and then let anyone with a logo-worthy email in.
I think this goes way too far. For me personally, the threshold for putting up the logo is that someone within the company is paying, even if the whole company isn't under contract. For example, you might not have a full-fledged contract with Google, but one manager of a tiny team might have used her/his company credit card to pay for your tool. If the sum is below a certain threshold, they don't need authorization or vendor vetting and all that.
The threshold should be that a relevant representative of the company agreed to have the logo displayed on the website. Anything other than that is deceptive.
Well "Joe in accounting from Google is using it" doesn't have the same glamour as putting the Google logo apparently.
to show that we are acknowledged by many users from various orgs. we listed users we talked to, but we don't know if they still use it, as some of them are no longer reachable (lost contact). i admit that we wanted to seem official - that's why we had all these logos where our users are "from".
kudos for being transparent on your approach here
Congrats on the launch. I never understood why an AI meeting notetaker needed SOTA LLMs and subscriptions (talking about literally all the other notetakers) - thanks for making it local-first. I use a locally patched-up whisperx + qwen3:1.7 + nomic embed (of course with a Swift script that picks up the audio buffer from the microphone) and it works just fine. Rarely, I create next steps / SOPs from the transcript - I use gemini 2.5 and export it as a PDF. I’ll give Hyprnote a try soon.
I hope, since it’s opensource, you are thinking about exposing api / hooks for downstream tasks.
What kind of API/hooks would you expect us to expose? We're down to do that.
The ability to receive live transcripts from a webhook, including speaker diarization metadata, would be super useful.
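To make it concrete, here is roughly the receiving end I have in mind - a minimal Node/TypeScript sketch, where the event shape is invented for illustration since no webhook format has been published:

  // Minimal webhook receiver for live transcript events.
  // The TranscriptEvent shape below is hypothetical - nothing Hyprnote emits today.
  import { createServer } from "node:http";

  interface TranscriptEvent {
    sessionId: string; // hypothetical: which meeting the utterance belongs to
    speaker: string;   // diarization label, e.g. "Speaker 1"
    text: string;      // the transcribed utterance
    startMs: number;   // timing within the recording
    endMs: number;
  }

  createServer((req, res) => {
    if (req.method !== "POST" || req.url !== "/transcript") {
      res.writeHead(404).end();
      return;
    }
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      const event = JSON.parse(body) as TranscriptEvent;
      // downstream task goes here: index it, ping Slack, update a CRM, ...
      console.log(`[${event.speaker}] ${event.text}`);
      res.writeHead(204).end();
    });
  }).listen(8787, () => console.log("listening on :8787"));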
Can you share the Swift script? I was thinking of doing something similar but was banging my head against the audio side of macOS.
Looks great & kudos for making it local-first & open-source, much appreciated!
From a business perspective, and as someone looking also into the open-source model to launch tools, I'd be interested though how you expect revenue to be generated?
Is it solely relying on the audience segment that doesn't know how to hook up the API manually to use the open-source version? How do you calculate this? Since you're pushing it via open source/GitHub, you'd think most people exposed to it are technical enough to just run it from source.
Nice!
Would be great if you could include in your launch message how you plan to monetize this. Everybody likes open source software and local-first is excellent too, but if you mention YC too then everybody also knows that there is no free lunch, so what's coming down the line would be good to know before deciding whether to give it a shot or just move on.
For individuals:
We have a Pro license implemented in our app. Some non-essential features like custom templates or multi-turn chat are gated behind a paid license. (A custom STT model will also be included soon.) There's still no sign-up required. We use keygen.sh to generate offline-verifiable license keys. Currently, it's priced at $179/year.
For business:
If they want to self-host some kind of admin server with integrations, access control, and SSO, we plan to sell a business license.
Does that mean the admin server is not open source?
Another sso.tax candidate.
Let's actively not support software that chooses anti-security.
totally fair concern. we’re actually on the same side when it comes to promoting good security practices like SSO.
the reason we’re gating the admin server under a business license is less about profiting off sso and more about drawing a line between individual and organizational use. it includes a bunch of enterprise-specific features (sso, access control, integrations, ...) that typically require more support and maintenance.
that said, the core app is fully open-source and always will be - so individuals and teams who don’t need the admin layer can still use it freely and privately, without compromising security.
we’ll keep listening and evolving the model - after all, we're still very early and flexible. appreciate the pushback.
(edit: added some more words to reinforce our flexibility)
Fair stance. I believe sso tax is a necessary evil.
How are you balancing accuracy vs. time-to-word-on-live-transcript? Is this something you're actively balancing, or can allow an end user to tune?
I find myself often using otter.ai - because while it's inferior to Whisper in many ways, and anything but on-device, it's able to show words on the live transcript with minimal delay, rather than waiting for a moment of silence or for a multi-second buffer to fill. That's vital if I'm using my live transcription both to drive async summarization/notes and for my operational use in the same call, to let me speed-read to catch up to a question that was just posed to me while I was multitasking (or doing research for a prior question!)
It sometimes boggles me that we consider the latency of keypress-to-character-on-screen to be sacrosanct, but are fine with waiting for a phrase or paragraph or even an entire conversation to be complete before visualizing its transcription. Being able to control this would be incredible.
It's more of an AI model problem than app logic - transcribing more frequently requires more computation. Things like speculative decoding can help, though.
Doing it locally is hard, but we expect to ship it very soon. Please join our Discord (https://hyprnote.com/discord) if you're interested in hearing from us.
Do you intend to reach feature parity with something like MacWhisper? I'd love to switch to something open source, but automated meeting detection and push-to-transcribe (with custom rewrite actions) are two features I've learned to love, besides the basic transcript. I also enjoy the automatic transcription from an audio or video file, or even a YouTube link.
But because MacWhisper does not store transcripts or do much with them (other than giving you export options), there are some missed opportunities: I'd love to be able to add project tags to transcripts, so that any new transcript is summarized with the context of all previous transcript summaries that share the same tag. Thinking about it, maybe I should build a Logseq extension to do that myself, as I store all my meeting summaries there anyway.
Speaker detection is not great in MacWhisper (at least in my context, where I work mostly with non-native English speakers), so that would be a good differentiation too.
definitely planning to catch up to other tools - we ship FAST!
automated meeting detection - working on this. push-to-transcribe - want to understand more about this. (could we talk more over at our discord? https://hyprnote.com/discord)
if you're using logseq, we'd love to build an integration for you.
finally, speaker identification is a big challenge for us too.
so many things to do - so exciting!
Let's go! Trying it out, appreciate you building this as the hosted/online alternatives (Granola, Notion recorder, ChatGPT recorder etc) don't really make sense if you can record locally.
hope you give us some feedback that we can push further on! this is our discord(https://hyprnote.com/discord) if you're interested
I just downloaded it on a Mac mini M4 Pro. I installed the Apple Silicon version and tried to launch it, and it fails. No error message or anything - the icon just keeps bouncing in the Dock. I assumed it needed privacy, screen recording, and audio permissions and explicitly granted them, but it still just bounces in the Dock and the app does not open. (OS: macOS Sequoia 15.5)
Seems like this (https://github.com/fastrepl/hyprnote/blob/d0cb0122556da5f517...) is invalid on the Mac mini. Should be fixed today.
working on identifying the problem! could you come over to our discord, where we can better support you? https://hyprnote.com/discord
That is very strange. Can you launch it from the command line and share what you got?
/Applications/Hyprnote.app/Contents/MacOS/Hyprnote
Looks really cool - I noticed Enterprise has smart consent management?
The thing I think some enterprise customers are worried about in this space is that in many jurisdictions you legally need to disclose recording - having a bot join the call can do that disclosure - but users hate the bot and it takes up too much visibility on many of these calls.
Would love to learn more about your approach there
yes, we’re rolling out flexible consent options based on legal needs - like chat messages, silent bots, blinking backgrounds, or consent links before/during meetings. but still figuring out if there's a more elegant way to do this. would love to hear your take as well.
Please shoot me a note - I'm trying to figure this out for my enterprise now, would love to figure out a way to get you in / trial it out.
can i send you a follow-up to the email that's on your profile?
yes
Congrats on the launch! I'm very bullish on how powerful <10B-param models are becoming, so the on-device angle is cool (and great for your bottom line too, as it's cheaper for you to run).
Something that I think is interesting about AI note taking products is focus. How does it choose what's important vs what isn't? The better it is at distinguishing the signal from the noise, the more powerful it is. I wonder if there is an in-context learning angle here where you can update the model weights (either directly or via LoRA) as you get to know the user better. And, of course, everything stays private and on-device.
> How does it choose what's important vs what isn't?
The idea of Hyprnote is that you write a chicken-scratch raw note during the meeting (whatever you think is important), and the AI enhances it from there.
On-device learning is interesting too. For example, Gboard: https://arxiv.org/abs/2305.18465
And yes - we are open to this too
How do you deal with the AI meeting-recording features that come built into conferencing software?
hyprnote listens to both the sound coming out (system audio output) and going in (microphone input), so it will work perfectly fine with virtual meetings.
I was talking about this a week ago. Someone wanted to make a PDF tutorial on how to use a piece of software. I asked him to record himself in Teams, share his screen, and have AI take notes. It creates a fabulous summary with snapshots of everything he goes over.
Congratulations! Is there a mobile version as well, especially for Android?
have it planned for the 4th quarter!
Awesome, will look forward to it!
Looking forward to testing the Windows version. Hope it also has the ability to upload recordings, etc. Meetily is nice, but the setup feels too convoluted, with a backend and frontend that have to be installed separately...
will push harder! hop into our discord (https://hyprnote.com/discord) to get the latest updates :) you can sign up for the waitlist on our website as well.
Congrats on the launch. Is there a reason why the app isn't sandboxed?
What kind of sandboxing do you expect? keen to learn more about it since we care about security too. (one reason we use Tauri over Electron)
Just the default macOS sandboxing entitlement[1], which is mandatory for all apps on the Mac App Store, and optional for notarized apps.
[1]: https://developer.apple.com/documentation/xcode/configuring-...
This is perfect timing! I just cancelled my fireflies.ai subscription yesterday because it just felt unnecessary. I prefer using fewer platforms and more tools, especially ones that can work under the surface.
happy to support you! would love to hear your feedback. we have a discord up and running, fyi.
This is really cool! I've been using Obsidian more and more as a second brain and getting data in has consistently been the point of failure, so I've been wanting something just like this. Specifically something that runs locally and offline.
Is the future goal of Hyprnote specifically meeting notes and leaning into features around meeting notes, or more general note taking and recall features?
At least for the near future, we'll be focusing on the meeting-notepad side of things.
We actually have "export to Obsidian". I think you can pair Hyprnote nicely with Obsidian.
Screenshot: https://github.com/user-attachments/assets/5149b68d-486c-4bd...
You need this plugin installed in Obsidian first: https://github.com/coddingtonbear/obsidian-local-rest-api
Obsidian export code 1:
https://github.com/fastrepl/hyprnote/blob/d0cb0122556da5f517...
Obsidian export code 2:
https://github.com/fastrepl/hyprnote/tree/main/plugins/obsid...
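for the curious, the export essentially boils down to a PUT against that plugin's local REST server. a rough TypeScript sketch (not our exact code - port and endpoint follow the plugin's docs, and the vault path here is made up):

  // Push a markdown note into Obsidian via obsidian-local-rest-api.
  // 27123 is the plugin's optional non-encrypted port; the default is
  // https://127.0.0.1:27124 with a self-signed certificate.
  // The API key comes from the plugin's settings pane.
  const OBSIDIAN_API_KEY = process.env.OBSIDIAN_API_KEY!;

  async function exportNote(path: string, markdown: string): Promise<void> {
    const res = await fetch(`http://127.0.0.1:27123/vault/${encodeURI(path)}`, {
      method: "PUT", // creates the note, or overwrites it if it already exists
      headers: {
        Authorization: `Bearer ${OBSIDIAN_API_KEY}`,
        "Content-Type": "text/markdown",
      },
      body: markdown,
    });
    if (!res.ok) throw new Error(`Obsidian export failed: ${res.status}`);
  }

  await exportNote("Meetings/Weekly Standup.md", "# Standup\n\n- raw note\n- enhanced note");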
Curious what made you decide to start with the macOS version. Some insight that most potential customers are on Macs or simply "we use Macs, dogfood time"? I'd always target Windows first for these tools.
Looks cool, I'll wait for the Linux version and try it.
Integration of automatic translations could be an interesting business plan value-add. Branching out into CRM things also makes sense to me.
Good luck, keep shipping.
The only issue I have with those tools, and I have not seen a single one even acknowledge this, is that they become completely useless when meetings are held in a hybrid fashion, where some people are remote and others are in the office with a shared mic.
Almost all of our meetings are hybrid in this way, and it's a real pain having almost half of the meeting be identified as a single individual talking because the mic is hooked up to their machine.
It's a total dealbreaker for us, and we won't use such tools until that problem is solved.
It can be solved with speaker segmentation/embedding models, although it is not perfect. One thing we do with Hyprnote is that we have a Descript-like transcript editor that allows you to easily edit/assign speakers. Once we integrate a speaker diarization model with that, I think we'll be in good shape.
If you are interested, you can join our Discord and follow updates. :) https://hyprnote.com/discord
Oh awesome - I was reading through to see whether it had speaker diarization (which is why I got rid of the whisper script I was using).
I'll look forward to the Linux version.
Is there any chance of a headless mode? (i.e., start it and have it write the transcript to stdout with some light speaker diarization markup, e.g. "Speaker1: text")
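To make the ask concrete, here's the sort of thing I'd pipe it into - a TypeScript sketch where the "Speaker1: text" line format is just my proposal above, not anything Hyprnote emits today:

  // Consume diarized transcript lines from stdin, e.g.
  //   hyprnote --headless | node consumer.js   (--headless is hypothetical)
  import { createInterface } from "node:readline";

  const bySpeaker = new Map<string, string[]>();
  const rl = createInterface({ input: process.stdin });

  rl.on("line", (line) => {
    const match = /^(\w+):\s*(.*)$/.exec(line);
    if (!match) return; // skip lines without speaker markup
    const [, speaker, text] = match;
    bySpeaker.set(speaker, [...(bySpeaker.get(speaker) ?? []), text]);
  });

  rl.on("close", () => {
    // e.g. dump a per-speaker transcript once the meeting ends
    for (const [speaker, lines] of bySpeaker) {
      console.log(`## ${speaker}\n${lines.join("\n")}\n`);
    }
  });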
our conference rooms even have some sort of rotating camera contraption that automatically focuses on the person speaking
I forbid this kind of meeting on my teams.
Either everyone is in the same physical room, or everyone is remote.
The quality of communication plummets in the hybrid case:
* The physical participants have much higher-bandwidth communication than those who are remote — they share private expressions and gestures to the detriment of the remote participants.
* The physical participants have massively lower-latency communication. In all-online meetings, everyone can adjust to and accommodate the small delays; in hybrid meetings the lag often locks out remote participants, who are always just a little behind or have less time to respond.
* The audio quality of remote participants is significantly worse, which I have seen result in their comments being treated as less credible.
* Remote participants usually get horrible audio quality from those sharing a mic in the room. No one ever acknowledges this, but it dramatically impacts ability to communicate.
you might need an AI for in-person meetings first. Such tools are available to doctors who see patients. The note-taking is great, but I think it is skewed toward a one-person summary where the name of the patient remains unknown. I wonder if the same tool could take notes with two patients in the room and distinguish between them.
The second part is likely a hardware limitation: a multi-cam/multi-mic rig with beamforming capability to deconstruct overlapping sounds.
hyprnote can be used for in-person meetings as well! we have doctors like ophthalmologists and psychiatrists using it right now. and yes - definitely going to be working on speaker identification, as it's crucial.
I recently tried Vibe (https://github.com/thewh1teagle/vibe) on a recording of a meeting taken from one side. It was able to identify the speakers, though only as Speaker 1, 2, etc. Still useful to see.
I think if you put N-1 mics in the room (where N is the number of people) you could easily identify all individuals...
Unfortunately the GPL license makes it dead in the water for using something like this at work :(
Why? You can use it freely; only if you want to extend the tool as part of your (paid) work are you meant to contribute the code changes back.
But none of it should prevent someone from just using it (GPL does not mean any usage data is being made "public").
Since this isn't available yet on Windows, what would be the glue & duct tape alternative? Record audio and dump it in chatGPT? Or do you need to create some kind of automation with n8n / Zapier? I don't have that many meetings but it could be nice to have
I've been using https://www.quillmeetings.com/ on Windows for a few months and it's been great. It processes locally in the same fashion, i.e., no cloud requirement.
Paid subscription, not open source.
https://github.com/thewh1teagle/vibe
Only for speech to text though.
Thanks!
Looks great. From my experience Tauri team has no clue what mobile is and they're not interested in fixing mobile issues. I can already tell the mobile version will be a disappointment.
We tried Tauri mobile (a few months ago) and had the same disappointment. We will use RN or Dioxus (to share Rust code) for the mobile version. So it will be cool :)
I'm glad you figured that out early on. Sounds like mobile is going to be great indeed
is this a known pattern? I did a basic tauri app, but I'm not sure what to do with mobile yet...
from our experience, we tried using stackflow on the frontend but encountered problems with responsive keyboard layouts. thought it was stackflow's problem but soon realized that it just wasn't implemented in tauri yet.
ref: https://github.com/daangn/stackflow
Really cool - how does it compare to Granola outside of the OSS part?
Local-first, controllability (custom endpoint part), and eventually extensibility (VSCode part of the post)
We're putting a lot of effort into making it run smoothly on local machines. There are no signups, and the app works without any internet connection after downloading models.
One of the things I would want to do is - As the meeting is going on - I would like to ask a LLM what questions I could ask at that point in time. Especially if it's a subject I am not expert in.
Would I be able to create an extension that could do this?
you'll definitely be able to do that in the future. we've had that on our mind as well from multiple requests - planning to add "eli5 - explain like i'm five" and "mmss - make me sound smart" ;) (edit: grammar fix)
Wow, does anything like this exist in current commercial tools?
ELI5 sounds useful.
MMSS sounds terrifying though, honestly.
i think there are tools like cluely - where they propose to "cheat" on everything in real-time. or wearables like waves that show ar displays with real-time assist. (i've never used either of them, but that's how i understood their products.) so proactive ai agents are somewhat becoming a thing, i guess. but it all boils down to privacy for us.
mmss was something that a lot of users suggested - they wanted to be saved from public humiliation
another free-tier (but not open-source) recording tool I've been using is MacWhisper. It does this and more, all locally too. Will try hyprnote out because it's neat to do the transcription in real time, and for its note-taking purposes.
https://goodsnooze.gumroad.com/l/macwhisper
we will support even faster realtime transcription in the near future (within a few weeks).
What model? All the realtime transcription services I've seen are Whisper wrappers or custom closed source ones.
It's a custom model (a whisper variant).
Makes sense. How well do you expect Hyprnote to work on mobile? I've found that phones in general are still pretty weak at on-device AI inference.
I think it will work pretty well considering how rapidly the space is moving. (See gemma3n, Cactus, etc.)
congrats, app looks gorgeous. def a good tauri codebase to study (been using https://deepwiki.com/fastrepl/hyprnote)
any interest in the Cluely-style live conversation help/overlay?
thanks! we're focusing on local-first at the moment which makes real-time assist a bit challenging. but we definitely have something similar on our mind - for example, a live summary of what's been going on during the past 5 mins. planning to support this via extension in the future!
Nicely done - I or someone else could push for the translation option too. Well done.
what language do you have in mind? would love to know!
I would like to try this on Linux
we moved the status from "maybe" to "of course". we are definitely interested as well :)
Super cool, congrats on the launch - will be trying this soon! I noticed it’s using Tauri - what are your main takeaways from building a local inference desktop app with it?
Thanks. I learned that running a server on the Rust side and calling it from a TypeScript frontend is a good approach. For example, we run an OpenAI-compatible server using a Tauri plugin (https://github.com/fastrepl/hyprnote/tree/main/plugins/local...) and call it using the Vercel AI SDK.
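In sketch form, the frontend side of that pattern looks something like this (the port and model id are placeholders, not our actual config):

  // Point the Vercel AI SDK at a local OpenAI-compatible server.
  import { createOpenAI } from "@ai-sdk/openai";
  import { streamText } from "ai";

  const local = createOpenAI({
    baseURL: "http://localhost:1234/v1", // wherever the Tauri plugin binds
    apiKey: "unused",                    // the SDK wants a value; a local server ignores it
  });

  const { textStream } = streamText({
    model: local("hypr-llm"), // placeholder id for the locally served model
    prompt: "Summarize this meeting transcript: ...",
  });

  for await (const chunk of textStream) {
    process.stdout.write(chunk); // tokens stream in as the local model generates
  }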
I wanted to build this for myself. I could never figure out how to get the audio output from a Mac. I tried almost all the audio loopback drivers (BlackHole, Soundflower, ...). There were problems everywhere w.r.t. security.
I even tried making a Teams meeting bot, but Teams doesn't give live audio to developers unless you are a special partner.
Glad you made this. Will play around with it.
Great! We use the AudioTap API
great to see that you like it :)
I’ve been thinking about ways to reuse Vercel’s AI SDK, so this is a great one - thanks for sharing!
Also you might find this interesting:
https://github.com/fastrepl/hyprnote/blob/main/packages/util....
Super neat! You’ve made an impressive amount of cool specialized crates - have you considered making them generally usable to the wider community and licensing them under LGPL/MIT instead of GPL?
we def considered it. our line of thinking is to start with a more restrictive license and later switch to MIT or something once we figure things out :)
why use whisper over parakeet? how will you monetise?
Whisper is lighter-weight and smaller. We will support Parakeet in the near future too.
Monetization:
For individuals: We have a Pro license implemented in our app. Some non-essential features like custom templates or multi-turn chat are gated behind a paid license. (A custom STT model will also be included soon.) There's still no sign-up required. We use keygen.sh to generate offline-verifiable license keys. Currently, it's priced at $179/year.
For business: If they want to self-host some kind of admin server with integrations, access control, and SSO, we plan to sell a business license.
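The nice property of offline-verifiable keys is that validation needs no server round-trip: a payload is signed when the key is issued, and the app ships only the public key. A self-contained TypeScript sketch of that general pattern (not keygen.sh's or our exact format):

  // Signature-based offline license keys with Ed25519.
  // In production only the public key ships with the app and signing happens
  // server-side at purchase time; the "payload.signature" format is invented.
  import { generateKeyPairSync, sign, verify } from "node:crypto";

  const { publicKey, privateKey } = generateKeyPairSync("ed25519");

  // Vendor side: sign the claims into a license key string.
  function issueLicense(claims: object): string {
    const payload = Buffer.from(JSON.stringify(claims));
    const signature = sign(null, payload, privateKey); // null algorithm = Ed25519
    return `${payload.toString("base64")}.${signature.toString("base64")}`;
  }

  // App side: verify entirely offline against the embedded public key.
  function verifyLicense(key: string): Record<string, unknown> | null {
    const [payloadB64, sigB64] = key.split(".");
    if (!payloadB64 || !sigB64) return null;
    const payload = Buffer.from(payloadB64, "base64");
    const ok = verify(null, payload, publicKey, Buffer.from(sigB64, "base64"));
    return ok ? JSON.parse(payload.toString("utf8")) : null;
  }

  const key = issueLicense({ plan: "pro", expires: "2026-01-01" });
  console.log(verifyLicense(key)); // -> { plan: "pro", expires: "2026-01-01" }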
Congrats! I'm currently a Granola user, and wanted to build this myself a while back. But I probably wouldn't have gone as far as fine-tuning a small model for meeting summarization. Can't wait to try it out!
Super! let us know how it goes! (we have a discord channel too)
Installation and onboarding was nice & smooth!
A few random bits of realtime feedback:
You have an icon with the Finder face labeled "Open Finder view." I would expect this to open the app's data folder in the macOS Finder. Instead, it opens an accessory window with some helpful views such as calendar view. I'd encourage you to find another name for that window, because it's too confusing to call it "Finder" (especially with the icon).
I'd also add a menu item for Settings (and Command-comma shortcut) in the Application menu.
You also need a dark mode at some point.
Finally, I'm not sure where note files end up. Seeing that there's an Obsidian integration, I would love an option to save notes in Markdown format into a folder of my choice. I'm an iA Writer user, and would love to have meeting notes go directly into my existing notes folder.
I'll let you know how the actual functionality is working for me after my next few meetings!
haha we couldn't find a better icon for the "Finder" view so we just went with the classic one. but i guess it's confusing - noted.
we do have a settings shortcut already! be sure to test it out :)
dark mode - noted as well.
we save our notes in a db.sqlite file that can be found at: ~/Library/Application\ Support/com.hyprnote.stable
this decision was made because we have three documents - raw note, enhanced note, and transcript - assigned to a meeting note.
would love to create an iA integration for you - or just a simple way to export MD for the time being
join our discord for more updates! https://hyprnote.com/discord
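if you want to poke at the data yourself in the meantime, here's a read-only TypeScript sketch (better-sqlite3 is an arbitrary client choice, and since the schema is internal and undocumented it only introspects):

  // List tables and row counts in Hyprnote's local database.
  import Database from "better-sqlite3";
  import { homedir } from "node:os";
  import { join } from "node:path";

  const dbPath = join(
    homedir(),
    "Library/Application Support/com.hyprnote.stable/db.sqlite",
  );
  const db = new Database(dbPath, { readonly: true }); // never write to live data

  const tables = db
    .prepare("SELECT name FROM sqlite_master WHERE type = 'table'")
    .all() as { name: string }[];

  for (const { name } of tables) {
    const row = db.prepare(`SELECT COUNT(*) AS n FROM "${name}"`).get() as { n: number };
    console.log(`${name}: ${row.n} rows`);
  }
  db.close();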
Why in the world is there _background music_ when I start the app?!
It should only play during onboarding - there's no music while using the app.
We thought it'd be cool :) sorry if it disturbed you.
Glad you're having fun, but it had the "website with autoplay" vibe.
Looks promising, but "Linux maybe"? Signing off.
updated readme :) >> Linux (of course)
it was a joke! we are strongly considering getting it done after windows (shipping in aug)