Google Tests Sign Language Detector to Switch ‘Active Speaker’ in Video Calls

Oct. 3, 2020



With most of us holed up in our homes and coordinating work over video calls due to the COVID-19 pandemic, you have probably become well-acquainted with a variety of video conferencing software. A great feature of these video calling apps is automatic switching between video feeds, which puts whoever is talking on screen in real time. This, however, relies on audio, so it doesn’t work for sign language users, who could feel left out of the conversation.

Google researchers have decided to fix this accessibility issue by building a real-time sign language detection engine. It can tell when a person in a video call starts communicating in sign language and bring the spotlight onto them by making them the active speaker.

If you are wondering how this sign language detection engine works, Google has explained it in detail. First, the video passes through PoseNet, which estimates key points of the body such as the eyes, nose, and shoulders. These points let the engine build a stick figure of the person, whose movements are then compared against a model trained on the German Sign Language corpus.
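To make the idea concrete, here is a minimal Python sketch of the "stick figure movement" step. It is not Google's model: the keypoint layout, the function names (`movement_score`, `is_signing`), the shoulder-width normalization, and the fixed threshold are all assumptions for illustration, and a simple threshold stands in for the classifier trained on the sign language corpus.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical keypoint layout: each frame maps a body-part name to an (x, y)
# pixel position, similar in spirit to the output of a pose estimator.
Frame = Dict[str, Tuple[float, float]]


def shoulder_width(frame: Frame) -> float:
    """Distance between the shoulders, used to make scores size-independent."""
    (lx, ly), (rx, ry) = frame["left_shoulder"], frame["right_shoulder"]
    return math.hypot(lx - rx, ly - ry)


def movement_score(prev: Frame, curr: Frame) -> float:
    """Mean keypoint displacement between two frames, in shoulder widths."""
    width = shoulder_width(curr)
    shared = prev.keys() & curr.keys()
    if width == 0 or not shared:
        return 0.0
    total = sum(
        math.hypot(curr[k][0] - prev[k][0], curr[k][1] - prev[k][1])
        for k in shared
    )
    return total / (len(shared) * width)


def is_signing(frames: List[Frame], threshold: float = 0.05) -> List[bool]:
    """One flag per consecutive frame pair: True when movement exceeds the threshold.

    Assumption: a fixed threshold replaces the trained model from the article.
    """
    return [movement_score(a, b) > threshold for a, b in zip(frames, frames[1:])]
```

Dividing by shoulder width means a person sitting close to the camera and a person sitting far away produce comparable scores for the same gesture.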

This is how the researchers detect that a person has started or stopped signing. But how is that person assigned the active speaker role when there is essentially no audio? That was one of the biggest hurdles, and Google overcame it by building a web demo that transmits a 20kHz high-frequency audio signal to the video conferencing app you connect it to. The tone fools the video conferencing app into thinking that the person using sign language is speaking, and thus makes them the active speaker.
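The tone itself is easy to picture in code. The sketch below generates a short 20kHz sine wave as 16-bit PCM samples in Python. Google's demo runs in the browser, so this is only an illustration of the signal, not of how the demo is built; the function names, the 48kHz sample rate, and the amplitude are assumptions.

```python
import math
import struct
from typing import List


def ultrasonic_tone(
    freq_hz: float = 20_000,
    duration_s: float = 0.1,
    sample_rate: int = 48_000,
    amplitude: float = 0.5,
) -> List[int]:
    """Return signed 16-bit samples of a sine tone at freq_hz."""
    # The sample rate must be more than twice the tone frequency,
    # otherwise the tone cannot be represented (Nyquist limit).
    if freq_hz >= sample_rate / 2:
        raise ValueError("sample rate too low for the requested frequency")
    n_samples = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)
    return [
        round(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_samples)
    ]


def to_pcm_bytes(samples: List[int]) -> bytes:
    """Pack samples as little-endian 16-bit PCM, ready for a WAV file or audio stream."""
    return struct.pack(f"<{len(samples)}h", *samples)
```

A 20kHz tone sits at the upper edge of human hearing, so most people will not notice it, while the conferencing app still registers audio activity.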

Google researchers have already managed to achieve 80% accuracy in predicting when a person starts signing, and the model can be optimized to reach over 90% accuracy, which is impressive. This sign detection engine is just a demo (and a research paper) for now, but it may not be long until one of the popular video conferencing apps, be it Meet or Zoom, adopts it to make life easier for people who communicate in sign language.