Data, People and Model: A Three-Part Approach to Ethical AI
Ethical AI has taken its rightful place at the forefront of AI conversations, with more companies acknowledging their commitment to building AI responsibly. An AI solution built responsibly must both work as expected and deliver equitable benefits for all of its end users. Accomplishing this requires AI practitioners to be aware of and actively mitigate ethical risks throughout development. While the dialogue around what ethical AI entails spans many topics, we’ll focus here on how it intersects with the data you use to build your model, the people you select to build it, and how the model performs in the real world.
The Data
Keeping in mind that an AI model is a reflection of the data used to train it, responsible AI starts with collecting and building a dataset that accurately represents the model's end users. In data collection, your focus should be on data diversity and breadth. Do you have sufficient data to cover every use case? Is each use case well-represented in the data? If not, you'll need to source additional data or create synthetic data to fill the gaps. Data diversity is essential to ensuring your final model is trained for every scenario it may face in production.
A speech recognition model, for example, will need training data that represents the accents, languages, dialects, and tones of all of the people who may end up using it. Missing data in even one of these areas will result in a noninclusive product that struggles to understand certain speakers.
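To make this concrete, here's a minimal sketch of that kind of coverage audit, assuming a pandas DataFrame of training samples with hypothetical `accent` and `language` columns; the column names and threshold are illustrative, not a prescribed schema:

```python
import pandas as pd

# Hypothetical training manifest: one row per audio sample, tagged with
# the speaker attributes the dataset is expected to cover.
samples = pd.DataFrame({
    "accent":   ["us", "us", "uk", "in", "us", "uk"],
    "language": ["en", "en", "en", "en", "en", "en"],
})

# Illustrative minimum sample count per group; set this from your own
# coverage requirements.
MIN_SAMPLES = 2

# Count samples per (accent, language) group and flag any group that
# falls below the threshold, signalling where to source or synthesize
# additional data.
coverage = samples.value_counts(["accent", "language"]).rename("n_samples")
underrepresented = coverage[coverage < MIN_SAMPLES]

print(coverage)
print("Groups needing more data:")
print(underrepresented)
```

The same audit extends to any attribute that matters for your use case (dialect, tone, recording conditions), and running it before training is far cheaper than discovering a coverage gap in production.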
The People
Your AI team should comprise people who represent your end user profile when possible, as they’ll bring their unique perspectives to strategic decisions around your model and data. Data annotation, in particular, is a critical step where you should make an effort to source diverse people to serve as annotators. Leveraging annotators from different places, with varying backgrounds and experiences, will reduce your chances of incorporating biased perspectives into your data (and, ultimately, your model).
AI practitioners need to embrace diverse annotators as a crucial link in the AI development lifecycle and be intentional and responsible about their work with them. That means providing them with fair pay, flexible hours, open communication lines, privacy, confidentiality, and overall ethical treatment. This often-overlooked component of responsible AI is just as important as other considerations, and needs to be addressed more frequently in ethical AI discourse.
The Model
Diverse data that is also responsibly sourced is the foundation of an ethical model. But another key factor to consider is whether the model you create works as intended. AI products need to perform as designed in order to avoid unintentionally introducing harm to business or society. We've seen many cases where models haven't performed as intended, whether by discriminating against particular users, making poor decisions that lead to business losses, or misdiagnosing patients from underrepresented groups in medical use cases. Even teams with the best intentions can accidentally introduce bias during the AI lifecycle.
To help counter this, incorporate relevant bias and performance metrics, as well as end user feedback, into your KPIs, and continue to measure these post-deployment. If any of your measurements aren't meeting your standards, be sure to have data pipelines in place to retrain and update your model as necessary. Performance monitoring, especially around bias, is critical to maintaining a model that works as intended.
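As one possible illustration, here's a minimal sketch of such a post-deployment check, assuming a log of predictions with a hypothetical `group` attribute; the metric (per-group accuracy), column names, and thresholds are all assumptions, not a prescribed standard:

```python
import pandas as pd

# Hypothetical post-deployment log: predictions alongside ground truth
# and a demographic attribute used for bias monitoring.
log = pd.DataFrame({
    "group":      ["a", "a", "b", "b", "b", "a"],
    "prediction": [1, 0, 1, 1, 0, 1],
    "label":      [1, 0, 0, 1, 1, 1],
})

ACCURACY_FLOOR = 0.75  # illustrative per-group performance standard
MAX_GAP = 0.10         # illustrative cap on the best-to-worst group gap

# Accuracy for each group on the logged traffic.
per_group = (log["prediction"] == log["label"]).groupby(log["group"]).mean()

# Flag retraining if any group misses the floor or the gap is too wide.
needs_retraining = (
    (per_group < ACCURACY_FLOOR).any()
    or (per_group.max() - per_group.min()) > MAX_GAP
)

print(per_group)
print("Retrain:", needs_retraining)
```

In practice you'd swap in whichever fairness and performance metrics match your KPIs and wire the flag into your retraining pipeline, but the shape of the check stays the same: measure per group, compare against your standards, and act on the result.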
Conclusion
As leaders across industries embrace AI as a core component of business, we're seeing more organizations understand the importance of building AI through an ethical lens. AI practitioners have a responsibility to create AI that recognizes everyone and is for everyone; that means applying an ethical approach throughout the AI development lifecycle. Teams pursuing AI must implement the right frameworks and infrastructure to ensure they're creating responsible products, and always keep the end user in mind.
Author:
Titus Capilnean
Titus is the Director of Marketing at Appen, driving responsible, unbiased AI and training data conversations with global companies. Appen has been working with global companies to launch and improve their AI for over 25 years, while enabling people from around the world to participate in the AI economy through data annotation and collection projects. Titus has been in the marketing and technology space for over 10 years, starting before "social media" was even a phrase, with roles across fintech, artificial intelligence, blockchain, consumer goods, and gold mining in both Europe and the US.
Twitter Handle:
@appenglobal