Iterative LLM-based improvement for French Clinical Interview Transcription and Speaker Diarization

arXiv:2603.00086v1 (announce type: cross)

Abstract: Automatic speech recognition for French medical conversations remains challenging, with word error rates often exceeding 30% in spontaneous clinical speech. This study proposes a multi-pass LLM post-processing architecture that alternates between Speaker Recognition and Word Recognition passes to improve transcription accuracy and speaker attribution. Ablation studies on two French clinical datasets (suicide prevention telephone counseling and preoperative awake neurosurgery consultations) investigate four design choices: model selection, prompting strategy, pass ordering, and iteration depth. With Qwen3-Next-80B, Wilcoxon signed-rank tests confirm significant WDER reductions on suicide prevention conversations (p < 0.05, n=18), while results remain stable on awake neurosurgery consultations (n=10). The system produced zero output failures at an acceptable computational cost (RTF 0.32), suggesting feasibility for offline clinical deployment.
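The alternating-pass idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `run_alternating_passes` and `real_time_factor`, the `Pass` type, and the string-tagging stand-ins for the LLM calls are all hypothetical, chosen only to show how pass ordering and iteration depth could be exposed as parameters. The real-time factor is computed with its usual definition, processing time divided by audio duration.

```python
from typing import Callable, List

# Hypothetical type: a pass takes a transcript and returns a revised transcript.
# In a real system each pass would wrap an LLM call; here it is any callable.
Pass = Callable[[str], str]


def run_alternating_passes(
    transcript: str,
    speaker_pass: Pass,
    word_pass: Pass,
    iterations: int = 2,
    speaker_first: bool = True,
) -> List[str]:
    """Apply the two passes in alternation; return all intermediate transcripts."""
    # Pass ordering is one of the design choices, so it is a parameter here.
    order = [speaker_pass, word_pass] if speaker_first else [word_pass, speaker_pass]
    history = [transcript]
    # Iteration depth: each iteration runs both passes once, in the chosen order.
    for _ in range(iterations):
        for apply_pass in order:
            history.append(apply_pass(history[-1]))
    return history


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration (values below 1 are faster than real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Stand-in passes that only tag the transcript, to make the ordering visible.
    steps = run_alternating_passes(
        "raw", lambda t: t + "|S", lambda t: t + "|W", iterations=2
    )
    print(steps[-1])
    # Illustrative numbers only: 96 s of processing for 300 s of audio.
    print(real_time_factor(96.0, 300.0))
```

Keeping every intermediate transcript in `history` makes it straightforward to score each pass separately, which is what an ablation over iteration depth would need.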
