Speech recognition in Hangouts Meet
There are many possible applications for speech recognition in Real Time Communication services: live captions, simultaneous translation, voice commands, or storing and summarising audio conversations.

Speech recognition in the form of live captioning has been available in Hangouts Meet for some months, but it was recently promoted to a button in the main UI and I have started to use it almost every day.

I'm mostly interested in the recognition technology, and specifically in how to integrate DeepSpeech in RTC media servers to provide a cost-effective solution, but in this post I wanted to spend some time analysing how Hangouts Meet implemented captioning from a signalling point of view.

At a very high level there are at least three possible architectures for speech recognition in RTC services:

A) On-device speech recognition: This is the cheapest option, but not all devices support it, and the quality of the models is not as good as in the clou...