AI Ethics #21: Lexicon of lies, algorithmic accountability, future of AI in education, human rights in AI, organized social media manipulation, and more ...
Discriminatory ads on Facebook, algorithms ruling lives, voice phishing, bias in AI training tools, COVID-19 surveillance infrastructure, digital contact-tracing, and more from the world of AI Ethics!
Welcome to the twenty-first edition of our weekly newsletter that will help you navigate the fast-changing world of AI Ethics! Every week we dive into research papers that caught our eye, sharing a summary of each with you and presenting our thoughts on how it links with other work in the research landscape. We also share brief thoughts on interesting articles and developments in the field. More about us on: https://montrealethics.ai/about/
If someone has forwarded this to you and you want to get one delivered to you every week, you can subscribe to receive this newsletter by clicking below:
Summary of the content this week:
In research summaries this week, we cover algorithmic accountability, designing for human rights in AI, the future of AI in education, a global inventory of organized social media manipulation, and a lexicon of lies detailing terms used to describe problematic information.
In article summaries this week, we look at how Facebook might still be selling discriminatory ads, 5 ways algorithms already rule our lives, how voice phishers target corporate VPNs, how COVID-19 surveillance infrastructure strengthens authoritarian governments, an AI training tool that has been inadvertently passing bias for the past two decades, and a look at the coronavirus digital contact-tracing apps.
In upcoming events, we have a session being hosted to discuss Facial Recognition Technologies. Scroll to the bottom of the email for more information.
MAIEI Learning Community:
Interested in working together with thinkers from across the world to develop interdisciplinary solutions in addressing some of the biggest ethical challenges of AI? Join our learning community; it’s a modular combination of reading groups + collaborating on papers. Fill out this form to receive an invite!
AI Ethics Concept of the week: ‘Ethics Washing’
A pet peeve of ours is when organizations (including governments) claim to adhere to vaguely defined ethical principles in an attempt to thwart real regulation.
Learn about the relevance of ethics washing to AI ethics and more in our AI Ethics Living dictionary. 👇
MAIEI Serendipity Space:
A digital space for unstructured, free-wheeling conversations with the MAIEI staff and the AI Ethics community at large about whatever is on your mind when it comes to building responsible AI systems.
Next one is on September 3rd from 12:15 pm ET to 12:45 pm ET
Register via Zoom here to get started!
Let's look at some highlights of research papers that caught our attention at MAIEI:
Designing for Human Rights in AI by Evgeni Aizenberg and Jeroen van den Hoven
Given the pervasive use of AI in so many spheres of life (healthcare, finance, education, criminal justice), the authors of this paper argue that a design-first approach to potential AI products and services, rather than a focus on the algorithms alone, is needed to address the issues that can impact people’s basic human rights. Using the well-established frameworks and methodologies of Design for Values, Value Sensitive Design, and Participatory Design, along with the EU Charter of Fundamental Rights as the basis for which human rights to address, the authors propose a process to bridge the gap between the purely technical work of AI and its social impact.
To delve deeper, read our full summary here.
Schiff begins by giving a broad overview of the often fraught relationship between education and new technological advances. From the arrival of writing in Plato’s time to social media only a few years ago, novel technologies have generated concerns about their effect on learning and students alike. Clearly, AI in education (AIEd) is no different.
Schiff also makes the crucial distinction between distance education and AIEd. Distance education (or distance learning) follows the structure of the classroom, employs textbooks and lectures in video format, and doesn’t usually involve AI. One prominent form of distance education is Massive Open Online Courses (MOOCs). These courses are meant to be widely accessible, and are typically pre-recorded. The content, level of difficulty, and delivery of the course materials are not personalized for each student. While some believed MOOCs would allow greater accessibility for learners around the world who had little or no access to education, it later became clear that MOOCs instead overwhelmingly reached learners who already had adequate access to education.
On the other hand, one of AIEd’s key features is being able to personalize education in accord with each learner’s preferences, abilities, and situation. AIEd also is not bound by the typical course structure (classroom, lectures, textbooks, et cetera). The main goal of AIEd, according to the author, is to “simulate teachers” and those who might be doing work similar to a teacher’s: mentors, tutors, and perhaps even educational administrators. Ultimately, AIEd doesn’t only aim to convey course material, but also to do this in a way that is characteristic of how a teacher or other person in a teaching position would customize the material for learners.
To delve deeper, read our full summary here.
Algorithmic Accountability by Hetan Shah
Considering that algorithms are being used in high-stakes situations and that public trust is essential for adoption, we need to find a way to make algorithms accountable. We can do this by improving the trustworthiness of algorithms and by moving away from negative screening toward pushing for algorithms that can impact society positively. This has varying implications for 3 primary groups of stakeholders: practitioners, regulators, and the public sector.
To delve deeper, read our full summary here.
What we are thinking:
Op-eds from our research staff that explore some of the most pertinent issues in the field of AI ethics:
Watch this space for more content soon!
From our learning communities:
Research that we covered in the learning communities at MAIEI this week, summarized for ease of reading:
Troops, Trolls and Troublemakers: A Global Inventory of Organized Social Media Manipulation by Samantha Bradshaw and Philip N. Howard
Online media manipulation has become a global phenomenon. Bradshaw and Howard examine it by focusing on “cyber troops,” the organized governmental and political actors who manipulate public opinion via social media. This report provides detailed accounts of such groups across 28 countries. The authors investigate the types of messages, valences, and communication strategies that cyber troops use, and compare their organizational forms, resources, and capacities. They find that organized social media manipulation is pervasive and global. Some organizations target domestic populations, while others try to influence public opinion among foreign populations. Authoritarian regimes tend to run organized social media manipulation campaigns that target their domestic population. In democratic regimes, cyber troops run campaigns that target foreign publics, while political-party-supported campaigns target domestic voters. Over time, the mode of organizing cyber troops has shifted from military operations to private, for-profit communication firms that work with the government.
To delve deeper, read the full summary here.
Let’s look at highlights of some recent articles that we found interesting at MAIEI:
Does Facebook Still Sell Discriminatory Ads? (The Markup)
Discrimination is a constant struggle that automated systems encounter, especially when trying to regulate content that is posted on online platforms at web-scale. As fewer humans are involved in the operations of how decisions are made with regards to what is accepted for publishing on the platform, opportunities emerge for transferring over discriminatory practices from the analog to the digital world. In this article, The Markup unearths how discriminatory ads are still run on the Facebook platform, especially as they relate to housing ads that target specific people. This is illegal as per regulations in the US owing to how landlords used this historically to exclude people from access to housing based on their race and ethnicity. The difference compared to print ads is that while those were geared to appear next to related content, we can now target people directly.
Civil rights activists argue that Facebook does not show users what they want, but what it thinks they want based on stereotypes about their demographics. Because there are very strong correlations between seemingly innocuous actions and race, gender, and other demographic information, Facebook might be interpreting circumstantial evidence as expressed preferences. What has also frustrated activists is that Facebook has mentioned efforts to combat this sort of problem, but it hasn't been transparent about what exactly it is doing or how effective the adopted measures are. Finally, relying on Section 230(c) partially absolves Facebook of its responsibility for ensuring that the content posted meets certain normative guidelines, but many argue that the antiquated law needs to be adapted for the Internet Age to avoid these problems.
The title of this article might sound familiar to the audiences of this newsletter with everything that has been taking place in the UK. But, if you haven't caught up: with the cancellation of the A-level examinations in the UK, automated systems were used to compute grades for students. It turns out that these systems boosted the scores of those who came from advantaged backgrounds and depressed the scores of those who were from disadvantaged backgrounds. In turn, protests were held that led to the rescinding of the scores allocated by the system in favor of those assigned by the students' teachers. This article shows that this is not the only time such egregious faults have come to light; there are many more instances of algorithmic faults within European countries.
In particular, algorithmic systems have been used to automate things like getting loans where criteria outside of traditional evaluation metrics are used in determining eligibility. In making hiring decisions, some companies claim to utilize external social media data to get a better grip on whether the candidate would be a good fit, unfairly penalizing some in favor of others. Criminal investigations and arrests made based on inputs from algorithmic systems are subject to wide coverage and would be more than familiar to frequent readers of the newsletter. Getting access to welfare and crossing borders is something that isn't as well discussed yet happens with alarming frequency in some of the Scandinavian and other European nations as highlighted in the article. As public awareness rises concerning where these systems might fail, we must continue to demand higher levels of transparency in the operations of algorithmic systems that are involved in making important decisions about our lives.
Voice Phishers Targeting Corporate VPNs (Krebs on Security)
As presciently mentioned in this paper, voice is the next frontier where phishing attacks are going to be mounted. Vishing is the application of a likeness of voice to craft phishing attacks. At a time when much work happens from home and people are using VPNs to connect to work resources, attackers are taking advantage of the new attack surfaces that open up. The article mentions how new hires are particularly susceptible to such attacks because they cannot yet tell legitimate corporate resources from illegitimate ones.
The article provides a lot of details on how such actors mount their attacks, including the use of multiple actors working in cahoots, one doing the social engineering over the phone and another directing the unsuspecting employee to a phishing website to steal their credentials. In cases where the attackers fail, they have a chance to try again with another employee, each time gaining a bit more information on the colloquialisms and tooling used by the organization. The researchers also found that the attackers have grown more and more sophisticated in penetrating through the networks but are still learning the most effective ways to cash out.
One of the things that we think will become more prevalent in the future is the impersonation of someone's voice, someone that you trust, and then stealing credentials and other valuable information that way. With the advent of more AI-enabled tools that democratize this ability, user awareness is one of the best defenses that we have.
COVID-19 Surveillance Strengthens Authoritarian Governments (CSET Foretell)
An intriguing article that applies a different lens to understand how the surveillance infrastructure being established during the COVID-19 pandemic might evolve. In particular, the use of indicators is exactly the kind of tangible insight that we relish at MAIEI. Adopting a "canary in the coalmine" approach, the article mentions a few things of note. The incentive structures for businesses, as revenue streams dried up in various places, have helped to fuel their advances into working with players who are looking to put this infrastructure in place. The research domain is similarly pivoting to work in this space because other funding sources are becoming limited.
While the mileage in how these systems get deployed and used will vary in different countries, the more authoritarian regimes will see overt deployment whereas the more democratic regimes will see pushback and calls for fairness, privacy protections, and accounting for other ethical considerations. Finally, the flywheel of AI kicks in where regimes that don't have hindrances to the deployment of these technologies can gain massive competitive data advantages and train more proficient systems. Ultimately as the technology becomes more effective and cheaper, these regimes will become exporters of the technology to the rest of the world, both in subtle and not-so-subtle ways.
CoNLL-2003 is a staple in the world of Natural Language Processing (NLP). It is used to benchmark performance on Named Entity Recognition (NER) and as fodder to train a lot of machine learning systems that are utilized for tasks like creating knowledge panels, identifying contacts, and much more. The creators behind the CoNLL-2003 dataset weren’t thinking too much about bias 17 years ago when they worked hard to create the dataset by parsing through newswires and annotating the different entities in the text. Only upon deeper analysis did researchers find that there were severe problems with representation in the dataset. Specifically, male names were more frequent than female names, and there was only limited consideration given to how some names are gender-neutral.
As some researchers interviewed in the article pointed out, just enhancing the representation within the dataset would also not fix the problem. There are even more issues when trying to infer gender from names since gender is something that people determine for themselves, and shoehorning them into predetermined categories from an undersampled and biased dataset exacerbates the problem. What we at MAIEI believe is that communal resources should be invested to create more representative datasets including the involvement of people from diverse backgrounds in order to embody the principle of “nothing about us, without us.”
As the pandemic raged on, many states rushed to implement their own digital contact-tracing solutions. Then Apple and Google announced a joint proposal that offered a unified API for others to build their solutions on top of. The key benefit is consistency in the application of security and privacy standards, which were being created in an ad-hoc manner for other apps. The Exposure Notification System (ENS) from Apple and Google has the benefits of not storing data in a centralized repository and not tracking location information, two of the primary privacy concerns that people speculate have led to low rates of adoption of this technology. The ENS is not without problems, though: for example, researchers have pointed out that it is vulnerable to Bluetooth spoofing and, until recently, there was also no requirement to verify the positive test results that trigger the notification flow. The latter has now been addressed with the requirement to have a verification server. The former is still an unsolved problem, though researchers are quick to point out that such an attack hasn’t been carried out in practice yet, so it remains unclear how much of a risk it is.
While one of the studies cited in the article mentions that even with adoption rates of 20-40% we could reduce the rates of daily infections, it is still not clear how effective these solutions actually are. Specifically, some argue that investing instead in other basic infrastructure, raising awareness of the proper use of masks, and following social distancing would do more to reduce the spread of infection.
From elsewhere on the web:
Things from our network and more that we found interesting and worth your time.
Our staff researcher Connor Wright was a panelist at Mitigating bias in facial recognition systems hosted by IndiaAI in partnership with the Ministry of Electronics and Information Technology, Government of India where he spoke about his work at MAIEI and the societal impacts of facial recognition technologies, and what we can do to help avoid negative consequences from the use of such technologies.
Our founder Abhishek Gupta was featured in this piece by REWORK on the next roadblocks in the field of AI where he mentioned the importance of translating ethical and responsible AI principles into practice as the next big hurdle to be solved.
From the archives:
Here’s an article from our blogs that we think is worth another look:
Lexicon of Lies: Terms for Problematic Information by Caroline Jack
This article seeks to explain the terms used to describe problematic information. Such information could be inaccurate, misleading, or altogether fabricated. The terms we use to describe information affect how it spreads, who spreads it, and who receives it, as the choice of term is based solely on the perspective of the descriptor. This makes the labelling of information complex, inconsistent, and imprecise.
The 2 major divisions of problematic information are misinformation and disinformation:
Misinformation is when the incorrectness or inaccuracy of information is not intentional but due to mistakes. This is caused by the failure to independently verify a source’s claims or by the rush to pass information along, for example in the case of journalists competing to be the first to report.
Disinformation, on the other hand, is when information is deliberately intended to mislead.
To delve deeper, read the full article here.
If you’ve got an informed opinion on the impact of AI on society, consider writing a guest post for our community — just send your pitch to firstname.lastname@example.org. You can pitch us an idea before you write, or a completed draft.
As a part of our public competence building efforts, we frequently host events spanning different subjects as they relate to building responsible AI systems. You can see a complete list here: https://montrealethics.ai/meetup
September 2, 10 AM - 11:30 AM ET (Online)
You can find all the details on the event page. Please make sure to register, as we have limited spots (because of the online hosting solution).
Signing off for this week, we look forward to seeing you again in a week! If you enjoyed this and know someone else who can benefit from this newsletter, please share it with them!
If you have feedback for this newsletter or think there is an interesting piece of research, development or event that we missed, please feel free to email us at email@example.com