You had it 20 years ago: doctors spoke into recorders, transcriptionists turned that into notes, and the doctors reviewed them.
The first study I cited replaces the "spoke into recorders" stage with non-AI voice recognition.
The second study replaces the "spoke into recorders" stage with LLM voice recognition, and... crucially... also replaces the educated transcriptionist step with nothing.
I imagine the real issue is that whether the voice recognition is classic or LLM-based just doesn't matter as much as having two humans in the loop instead of one. But that's not a story that gets you to replace cheap voice recognition with expensive AI.
A pretty insightful viewpoint I heard recently from a doctor friend: doctors and hospitals believe that only a corporation could possibly implement this, so they fall into the SaaS trap and lose data sovereignty.
Under the hood, a lot of these companies are Llama or Gemma wrappers connected to Whisper.
I work in healthcare, and we spend oodles of time and money making sure every technology that can possibly be on-prem is.
Maybe it's just not technically possible yet?