@pratikraj Thanks for your comment!
I wanted to use Apple’s own SDKs for this from the start. A third-party audio driver felt like the wrong foundation for an app built around minimizing dependencies.
On summarization, I actually tested this morning with a real technical meeting recording: technical jargon, multiple speakers, dense acronyms. Gemma 4B finished in under a minute but hallucinated acronym expansions, inventing full names that sounded plausible but were wrong. Fine for a general standup, not great when the content is technical and the stakes are higher. Gemma 12B was slower but more accurate. Claude via BYOK handled the technical terminology correctly and produced the cleanest structured output.
Local models are solid for structured extraction on general meetings. They start hallucinating on dense, domain-specific content. That is usually when people reach for an API key. Giving them custom keywords, which Thoth can do, helps but does not work miracles.
Thoth requires Apple Silicon, though not exactly by choice.
The system audio API I use was introduced in macOS 26, which dropped Intel support entirely, so Intel got gated out at the OS level before I had to make the call.
Base and Small Whisper run fine on M1 (I have an M2 myself); local LLMs need at least 8GB of unified memory.
Early days on user feedback, since I just launched today on PH and a few weeks ago on the App Store, hence the self-testing this morning!



