Help me understand Voice Recognition tech

TimewornTraveler@lemm.ee · edit-2 1 year ago


Diabolo96@lemmy.dbzer0.com · 1 year ago

It’s AI and your voice won’t be used for training if you use a local model.

Use Whisper STT (speech-to-text). It runs on your computer, so nothing leaves your machine. You can adapt the model size based on how powerful your computer is. The bigger the model, the better it will be at transcribing.
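If you're curious what "running it locally" looks like, here's a minimal sketch using the openai-whisper Python package. The `transcribe_file` helper and the `recording.wav` file name are just made up for illustration; `whisper.load_model` and `model.transcribe` are the package's own calls.

```python
def transcribe_file(audio_path: str, model_name: str = "base.en") -> str:
    """Transcribe one audio file locally and return the text.

    transcribe_file is a hypothetical helper, not part of Whisper itself.
    """
    # Imported here so the package is only needed when you actually transcribe.
    import whisper

    # Downloads the model weights on first use, then loads them from disk.
    model = whisper.load_model(model_name)

    # The audio is processed on this machine; the result is a dict
    # whose "text" entry holds the full transcript.
    result = model.transcribe(audio_path)
    return result["text"].strip()


if __name__ == "__main__":
    # "recording.wav" is a placeholder; point it at your own file.
    print(transcribe_file("recording.wav"))
```

You'd need `pip install openai-whisper` plus ffmpeg installed for this to run. Swap `"base.en"` for a smaller or bigger model name depending on your hardware.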

TimewornTraveler@lemm.ee · 1 year ago

That sounds interesting. I was hoping for something that I could use on a mobile app. I’m not sure what “adapting the model size” means so this might be more complicated than I’m looking for.

Diabolo96@lemmy.dbzer0.com · edit-2 1 year ago

I was hoping for something that I could use on a mobile app.

Record, then transcribe later? But you can try https://whisper.ggerganov.com (this runs in your browser, but nothing is sent, so it works even on your Android/iOS phone). The website owner is a trusted dev who made whisper.cpp and llama.cpp, the latter basically being the backbone of much of the local LLM scene.

I’m not sure what “adapting the model size” means so this might be more complicated than I’m looking for.

A bit of complexity is generally the price to pay for freedom from constant surveillance and data gathering. Plus, it's actually super easy. A bigger model means better transcription quality, but the smaller ones are really good already. The base.en model is probably all you need anyway.
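To make "adapting the model size" concrete, here's a rough sketch. The memory numbers are the approximate requirements I remember from the openai-whisper README, so treat them as ballpark figures; `pick_model` and the idea of a memory budget are hypothetical helpers for illustration, not something Whisper ships.

```python
# Approximate memory needed per English-only model, in GB.
# Assumption: ballpark figures from the openai-whisper README; check the
# current README before relying on them.
MODEL_MEMORY_GB = [
    ("tiny.en", 1),
    ("base.en", 1),
    ("small.en", 2),
    ("medium.en", 5),
]


def pick_model(available_gb: float) -> str:
    """Return the largest listed model that fits in available_gb.

    Falls back to the smallest model if nothing fits.
    """
    best = "tiny.en"
    for name, needed_gb in MODEL_MEMORY_GB:
        if needed_gb <= available_gb:
            best = name  # the list is ordered small to large, so keep the last fit
    return best


if __name__ == "__main__":
    for budget in (1, 4, 8):
        print(f"{budget} GB -> {pick_model(budget)}")
```

So "adapting" really just means picking a name from a short list. There's also a multilingual large model if you have the hardware for it, but for English notes the small ones are usually enough.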

On PC, you can generally try any app from GitHub. They basically all use the same backend.

I found a few: https://whishper.net/ https://github.com/chidiwilliams/buzz