Improving Speech and Speaker Recognition For Multi-Speaker Conversations
Youngberg, Eric R., 1993--
This paper outlines an effective process for transcribing conversations from an audio file. The process combines speech recognition with speaker recognition to prepare the audio signals for transcription without relying on a database of preexisting vocal models. It is intended for multi-speaker conversations where vocal models are unavailable or cannot be created from the amount of data provided. We conclude that the performance of speech recognition on multi-speaker conversations can be improved by leveraging the classifying properties of speaker recognition to reduce variance in the dataset, producing a result that is just as effective as if we were performing mono-speaker speech recognition.
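
The idea of grouping audio by speaker before transcription, with no preexisting vocal models, can be illustrated with a minimal sketch. It is not the method used in the paper: it substitutes a simple greedy distance-threshold clustering for the speaker recognition step, and the function name `group_segments_by_speaker`, the two-dimensional feature vectors, and the `threshold` value are all hypothetical. Each resulting group could then be passed to a speech recognizer as if it were a single-speaker recording.

```python
from math import dist


def group_segments_by_speaker(segments, threshold):
    """Group (segment_id, feature_vector) pairs into speaker clusters.

    Assumption: each audio segment has already been reduced to a
    fixed-length feature vector. No stored vocal models are used;
    clusters are discovered from the conversation itself.
    """
    centroids = []  # running mean feature vector per discovered speaker
    members = []    # segment ids assigned to each discovered speaker

    for seg_id, vec in segments:
        # Find the nearest existing cluster that lies within the threshold.
        best, best_d = None, threshold
        for i, centroid in enumerate(centroids):
            d = dist(vec, centroid)
            if d < best_d:
                best, best_d = i, d

        if best is None:
            # No cluster is close enough: treat this as a new speaker.
            centroids.append(list(vec))
            members.append([seg_id])
        else:
            # Update the running mean of the matched cluster.
            n = len(members[best])
            centroids[best] = [
                (c * n + v) / (n + 1) for c, v in zip(centroids[best], vec)
            ]
            members[best].append(seg_id)

    return members


# Toy features (hypothetical values) for four segments from two speakers.
segments = [
    ("seg0", (0.0, 0.1)),
    ("seg1", (5.0, 5.1)),
    ("seg2", (0.2, 0.0)),
    ("seg3", (5.1, 4.9)),
]
groups = group_segments_by_speaker(segments, threshold=1.0)
print(groups)
```

Grouping segments this way reduces the speaker variance within each batch handed to the recognizer, which is the effect the abstract attributes to the speaker recognition step.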