Uncovering Biases: The Importance of Data Diversity in Speech Recognition

Companies can use artificial intelligence (AI) and machine learning (ML) models for a variety of purposes, such as reviewing job candidates, monitoring employee productivity, or analyzing voice data to better understand customers' needs. ML models are typically trained to recognize certain types of patterns in a set of data, producing an algorithm that can reason over new inputs and learn from them. In theory, such a model should provide unbiased outputs based on the data set it's given. So why does bias still persist in ML algorithms? Because ML is only as good as the data it's given.
As more companies turn to automatic speech recognition (ASR) tools and deploy AI and ML models across their organizations, it will be imperative to build robust models that are free of bias. Bias-free AI models matter because they work for every person and organization, and they produce meaningful results that help solve problems rather than create them. This piece will cover how to eliminate bias in your models and how to ensure your data set is as diverse and representative of your organization as possible, so your models behave impartially.
Recognizing Bias in Model Training
The first step to eliminating bias in model training is to acknowledge that inherent bias does exist in ML models. We saw one such occurrence of unconscious bias play out earlier this year, when a Stanford study of five companies in the voice assistant space uncovered a racial divide in their speech recognition technologies. The study showed that the systems misidentified 35% of words from Black users but only 19% of words from White users. These systems learn by analyzing vast amounts of data, so if an ML model analyzes only White users' voice patterns, bias will inevitably occur.
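To make this concrete, here is a minimal sketch of how you might measure that kind of disparity yourself: compute word error rate (WER) separately for each speaker group and compare. The transcripts and group labels below are hypothetical placeholders, not data from the Stanford study.

```python
# Minimal sketch: per-group word error rate (WER) to surface disparities
# like the one found in the Stanford study. All data here is illustrative.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# (reference transcript, ASR hypothesis, speaker group): placeholder data.
samples = [
    ("turn the lights off", "turn the light off", "group_a"),
    ("play my running playlist", "play my running play list", "group_a"),
    ("what is the weather today", "what the weather today", "group_b"),
    ("call my mother", "call another", "group_b"),
]

totals = {}
for ref, hyp, group in samples:
    totals.setdefault(group, []).append(word_error_rate(ref, hyp))

for group, rates in totals.items():
    print(f"{group}: mean WER = {sum(rates) / len(rates):.2%}")
```

A large gap between groups in a report like this is the signal to go back and rebalance the training data.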
Another, more tangible example is gender bias. We interact with this type of bias almost every day through voice assistant technology that is programmed with a female voice by default. Startups in the voice assistant and speech technology space are beginning to change this, but there is a long road to truly genderless AI. Moving forward, AI and ML could help businesses understand the biases that humans hold and work to correct them over time. Default-female voice assistants are just one example of how bias shows up in speech recognition, but they are representative of an issue most companies face every day. Acknowledging this bias is the first step toward a solution.
Have a Diverse Data Set
The next and most important step to reducing bias in ML models is to have a diverse data set. As the Stanford study above shows, your data must represent not only different dialects but also different genders in order to reduce bias and improve accuracy.
When developing a representative data set, be aware of the people around you and their individual experiences. If a model is built by white heterosexual males living in coastal states, then the model will be more reflective of their biases: the words they use, the cadence of their speech, their dialects and accents. Using a technology that learns from multiple diverse data sets is one way to close these gaps and allow all voices to be heard. It's important to emphasize that the ML itself isn't biased and the algorithm isn't biased up front, but both can acquire biases if the data set provided does not accurately represent your population. To limit as many of these biases as possible, make sure you are representing a wide range of people with various demographics in an organization, not just the organization's leaders.
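One practical way to act on this is to audit your training data's metadata before you train. The sketch below assumes each utterance carries self-reported speaker tags; the field names and the 10% threshold are illustrative choices on my part, not a standard.

```python
# Minimal sketch: auditing speaker metadata for underrepresented groups
# before training. Records and thresholds are illustrative placeholders.
from collections import Counter

utterances = [  # placeholder metadata records
    {"speaker_id": "s1", "dialect": "southern_us", "gender": "female"},
    {"speaker_id": "s2", "dialect": "midwest_us", "gender": "male"},
    {"speaker_id": "s3", "dialect": "midwest_us", "gender": "male"},
    {"speaker_id": "s4", "dialect": "west_coast_us", "gender": "male"},
]

def audit(records, field, min_share=0.10):
    """Print each category's share of the data; flag anything below min_share."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    for category, n in counts.most_common():
        share = n / total
        flag = "  <-- underrepresented" if share < min_share else ""
        print(f"{field}={category}: {share:.0%}{flag}")

audit(utterances, "dialect")
audit(utterances, "gender")
```

Running a check like this per dialect, gender, and age band makes gaps visible before they become gaps in accuracy.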
Use a Deep Neural Network (DNN)
The third step toward eliminating bias in ML models is for companies to use a deep neural network (DNN). With typical ML models, there is the monster (the model) and the creator of the monster (humans), and the monster learns from the data it is given. With a DNN, humans aren't doing any hard coding. Companies like Amazon or Google, whose systems are one part heuristics and hand-written rules and one part convolutional network, are at risk of inserting their own biases.
On the other hand, if you simply use a DNN, only the data you give your model trains it and shapes how it learns over time. Its behavior then has nothing to do with the beliefs of the person who built the network, eliminating those inherent biases. To play devil's advocate, however, the bias is now transferred to the beliefs of the people labeling the training data (and to how that data is collected). Ultimately, the more robust your data science strategy and the more representative the data sets you train on, the higher your accuracy rates will be.
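As a toy illustration of the "only the data trains it" idea, here is a small PyTorch sketch of a feedforward network with no hand-written rules: its behavior comes entirely from its training examples. The features and labels are random stand-ins for real acoustic features and phoneme labels, so this is a shape of the approach, not a working ASR system.

```python
# Toy sketch of a purely data-driven model: no hand-coded rules; the
# network's behavior comes entirely from the (here randomly generated)
# training examples. Real ASR DNNs are far larger and train on audio
# features such as spectrograms; this is illustrative only.
import torch
from torch import nn

torch.manual_seed(0)

# Stand-ins for acoustic feature vectors and per-frame phoneme labels.
features = torch.randn(256, 40)          # 256 frames, 40 features each
labels = torch.randint(0, 10, (256,))    # 10 phoneme classes

model = nn.Sequential(
    nn.Linear(40, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 10),                  # logits over phoneme classes
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()

print(f"final training loss: {loss.item():.3f}")
```

Nothing in that network encodes a rule about how anyone speaks; swap in a more diverse training set and the model's behavior changes with it, which is exactly the point.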
Eliminating bias in ASR is by no means a simple three-step process. Still, it's critical to acknowledge that bias exists and understand how it arises, and to recognize the importance of employing a diverse data set. From there, source audio that is representative of the conversations you are looking to transcribe, consider implementing a DNN, and make ongoing training part of the speech recognition process. This will improve the accuracy of your speech recognition model and help eliminate bias, allowing you to better understand and serve the needs of your customers.