AI Ethics #11: Phantom gains, multimodal disinformation, carbon footprint of ML, evasion attacks against ML, recognizing masked faces, and more ...
Our eleventh weekly edition covering research and news in the world of AI Ethics
Welcome to the eleventh edition of our weekly newsletter that will help you navigate the fast-changing world of AI Ethics! Every week we dive into research papers that caught our eye, sharing summaries of them with you and presenting our thoughts on how they link with other work in the research landscape. We also share brief thoughts on interesting articles and developments in the field. More about us on: https://montrealethics.ai/about/
If someone has forwarded this to you and you want to get one delivered to you every week, you can subscribe to receive this newsletter by clicking below.
We have launched some new initiatives at MAIEI that are aimed at increasing scientific diversity. If you are not already a part of our Slack community, this is a great time to join!
Our intention with the Co-Create Program is to enhance the intellectual diversity and accessibility of CFP responses. We are doing this by enabling members of our community from all academic backgrounds to work together on challenging, cross-disciplinary work, and by lowering the barriers to entry for people who don’t have experience in the traditional academic publishing model. We see this as a way for us to work with each other and include diverse perspectives so that we can come together and share new ideas.
We have also launched the samosa bot on Slack, which helps members of our community better connect with each other. In addition, we have a research paper management system that gives you the opportunity to pick open papers from our library to summarize and share with the community.
In research summaries this week, we look at how multimodal disinformation is becoming an ever more powerful means of persuasion and manipulation, the energy and carbon footprints of machine learning, security and privacy in machine learning, and evasion attacks against machine learning at test time.
In article summaries, we cover how masked face selfies are being used to train new facial recognition systems and how eye-catching advances in some AI fields might not be real and point to “phantom gains”.
Our learning communities continue to receive an overwhelming response! Thank you everyone!
We operate on the open learning concept where we have a collaborative syllabus on each of the focus areas and meet every two weeks to learn from our peers. We are starting with 5 communities focused on: disinformation, privacy, labor impacts of AI, machine learning security and complex systems theory. You can fill out this form to receive an invite!
Hoping you stay safe and healthy and looking forward to seeing you at our upcoming public consultation sessions (virtually!) and our learning communities! Enjoy this week’s content!
Let's look at some highlights of research papers that caught our attention at MAIEI:
A Picture Paints a Thousand Lies? The Effects and Mechanisms of Multimodal Disinformation and Rebuttals Disseminated via Social Media by Michael Hameleers, Thomas E. Powell, Toni G.L.A. Van Der Meer & Lieke Bos
In the current information environment, fake news and disinformation are spreading, and solutions are needed to counter the effects of the dissemination of inaccurate news and information. In particular, many worry that online disinformation, understood as the intentional dissemination of false information through social media, is becoming a powerful persuasive tool to influence and manipulate users’ political views and decisions.
Whereas research on disinformation has so far mostly focused on textual input alone, this paper taps into a new line of research by focusing on multimedia types of disinformation, which include both text and images. Visual tools may represent a new frontier for the spread of misinformation because they are likely to be perceived as more ‘direct’ representations of reality. Accordingly, the current hypothesis is that multimedia information will be more readily accepted and believed than merely textual inputs. And since images can now be easily manipulated, the worry that animates this research is that they will constitute a very powerful tool in future disinformation campaigns. Therefore, the primary goals of this paper are (1) to investigate the persuasive power of multimedia online disinformation in the US and (2) to study the effects of journalistic debunking tools against multimedia disinformation.
To delve deeper, read our full summary here.
Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning by Henderson, P., Hu, J., Romoff, J., Brunskill, E., Jurafsky, D., & Pineau, J.
Climate change and environmental destruction are well-documented. Most people are aware that mitigating the risks caused by these is crucial and will be nothing less than a Herculean undertaking. On the bright side, AI can be of great use in this endeavour. For example, it can help us optimize resource use, or help us visualize the devastating effects of floods caused by climate change.
However, AI models can have excessively large carbon footprints. Henderson et al.’s paper details how the metrics needed to calculate environmental impact are severely underreported. To highlight this, the authors randomly sampled one hundred NeurIPS 2019 papers. They found that none reported carbon impacts, only one reported some energy use metrics, and seventeen reported at least some metrics related to compute use. Close to half of the papers reported experiment run time and the type of hardware used. The authors suggest that the environmental impact of AI is hardly reported by researchers because the necessary metrics can be difficult to collect, while the subsequent calculations can be time-consuming.
To delve deeper, read our full summary here.
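To give a feel for the kind of calculation involved, here is a minimal back-of-the-envelope sketch in Python. It is not the authors’ tooling. It assumes a common simplification: energy is average power draw times run time, scaled by a data center overhead factor (PUE), and emissions are energy times the carbon intensity of the local grid. The function name, parameter names, and the numbers in the usage example are all hypothetical and chosen only for illustration.

```python
def estimate_co2_kg(avg_power_watts, hours, carbon_intensity_kg_per_kwh, pue=1.0):
    """Rough CO2 estimate for one training run (illustrative assumption, not a standard).

    avg_power_watts: average hardware power draw during the run
    hours: wall-clock duration of the run
    carbon_intensity_kg_per_kwh: kg of CO2 emitted per kWh on the local grid
    pue: power usage effectiveness, the data center overhead multiplier (>= 1.0)
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    # Convert watts to kilowatts, multiply by hours, then apply the overhead factor.
    energy_kwh = (avg_power_watts / 1000.0) * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


if __name__ == "__main__":
    # Hypothetical run: one 300 W accelerator for 24 hours,
    # PUE of 1.5 and a grid intensity of 0.4 kg CO2 per kWh.
    kg = estimate_co2_kg(300, 24, carbon_intensity_kg_per_kwh=0.4, pue=1.5)
    print(f"{kg:.2f} kg CO2")
```

Even this simple version needs three measurements (power draw, overhead, and grid intensity) that are rarely logged by default, which is consistent with the reporting gap described above.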
From our learning communities:
Research that we covered in the learning communities at MAIEI this week, summarized for ease of reading:
SoK: Security and Privacy in Machine Learning by Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, and Michael P. Wellman
Despite the growing deployment of machine learning (ML) systems, there is a profound lack of understanding regarding their inherent vulnerabilities and how to defend against attacks. In particular, there needs to be more research done on the “sensitivity” of ML algorithms to their input data. In this paper, Papernot et al. “systematize findings on ML security and privacy,” “articulate a comprehensive threat model” and “categorize attacks and defenses within an adversarial framework.”
To delve deeper, read our full summary here.
Evasion Attacks Against Machine Learning at Test Time by Battista Biggio, Igino Corona, Davide Maiorca, Blaine Nelson, Nedim Srndic, Pavel Laskov, Giorgio Giacinto, and Fabio Roli
Machine learning adoption is widespread, and in the field of security, applications such as spam filtering, malware detection, and intrusion detection are becoming increasingly reliant on machine learning techniques. Since these environments are naturally adversarial, defenders cannot rely on the assumption that underlying data distributions are stationary. Instead, machine learning practitioners in the security domain must adopt paradigms from cryptography and security engineering to deal with these systems in adversarial settings.
To delve deeper, read our full summary here.
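As a toy illustration of what evasion at test time means, the Python sketch below nudges a flagged input across the decision boundary of a linear classifier. This is a simplified example for intuition only, not the method from the paper: the linear model, the fixed step size, and the distance budget (`max_dist`) are all assumptions, and every name here is hypothetical.

```python
import math


def score(w, b, x):
    """Linear decision function: a score >= 0 means the input is flagged as malicious."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def evade(w, b, x, step=0.1, max_dist=2.0):
    """Move x against the weight vector until it is no longer flagged,
    or until the attacker's distance budget (max_dist) is used up."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0:
        raise ValueError("weight vector must be non-zero")
    # For a linear model, the direction that lowers the score fastest is -w / ||w||.
    direction = [-wi / norm for wi in w]
    x_adv = list(x)
    moved = 0.0
    while score(w, b, x_adv) >= 0 and moved + step <= max_dist + 1e-12:
        x_adv = [xi + step * di for xi, di in zip(x_adv, direction)]
        moved += step
    return x_adv


if __name__ == "__main__":
    w, b = [1.0, 1.0], -1.0   # hypothetical trained detector
    x = [1.0, 1.0]            # input that the detector currently flags
    x_adv = evade(w, b, x)
    print(score(w, b, x) >= 0, score(w, b, x_adv) >= 0)
```

The distance budget stands in for the attacker’s constraint that the modified input must stay close enough to the original to keep its malicious function. With a small budget the attack fails, which is why defenders care about how far an input has to move before the classifier changes its decision.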
Let’s look at highlights of some recent articles that we found interesting at MAIEI:
Your face mask selfies could be training the next facial recognition tool (CNET)
Instagram and other social media platforms that allow us to share pictures with each other are a great way to keep in touch with friends and others who are far away. In the times of a pandemic with forced social distancing, they serve as an even more potent tool to maintain social connection (albeit virtually). But when such photos are posted publicly, there is always a strong risk of them being scraped and used for purposes that you never intended. This came to light with the recent compilation of a dataset containing selfies of people wearing face masks that were scraped from public Instagram accounts. The purpose of this data collection exercise is to allow researchers to build an AI system that can recognize faces from limited information, such as when a significant part of the face is obscured by a mask.
Facial recognition technology requires access to multiple facial features to make an accurate match, and masks disrupt that process. As we covered in a previous newsletter, this has protected protestors from being identified in an automated manner and matched against vast databases of face data. Now that governments are mandating the use of masks to prevent the spread of the disease, the issue has come back in another form: law enforcement and other authorities that rely on face recognition find their efforts hampered by masks, and they are searching for ways to keep using the tools and surveillance infrastructure that have been put in place. While there are attempts to digitally add masks to existing photos as a means of training the system, preliminary analysis shows that this isn’t quite as effective as using real pictures that capture the diversity of skin tones, lighting conditions, angles, and other factors that affect the photos.
The researcher who was behind the collection of this data shirked their responsibility by asserting that if people didn’t want their photos to be used in this manner, they should make their pages private. This presents a problematic perspective whereby tools enabling greater surveillance and privacy intrusions are built while responsibility is relegated to the users of digital services, who are not even aware that such a thing might be happening. Public awareness of such efforts is paramount to combat this kind of blind tech-solutionism that ignores fundamental rights and freedoms of people.
How Well Can Algorithms Recognize Your Masked Face? (Wired)
As mentioned in the other article covered in this week’s newsletter, there is a tremendous rush to collect data to train facial-recognition systems when faces are obscured by people wearing masks. Researchers who work on developing these systems point to the problems that arise when faces are obscured by any element, including face masks. Lower rates of accuracy can lead to false positives which will exacerbate the myriad issues with facial-recognition technologies, including the lower rates of recognition of minorities.
A lot of companies and government entities mentioned in the article claim that they now have the ability to recognize faces even in the presence of face masks, relying on features that remain exposed, like the eyes, eyebrows, and nose bridge. But without external validation benchmarks, these claims cannot be verified, which makes it hard to judge whether problems with false positives are aggravated by the use of this technology. NIST, which has benchmarks for facial recognition, is looking to add face masks digitally to existing photos to create a new benchmark, and it is inviting companies to submit their systems to check their levels of accuracy on a publicly ranked leaderboard.
Chinese and Russian systems tend to perform well because of lighter privacy regulations, which make it easier to work with larger datasets. In China, for example, people use this technology ubiquitously as a means of payment through the app AliPay, and there is a case to be made that it can supplement other contactless payment technologies, especially when trying to curb the spread of the pandemic. A user of this technology pointed out how he appreciated that it works with masks on, reducing risk by not having to take the mask off in public places. The important thing will be to continue to respect people’s fundamental rights and freedoms and to obtain informed consent before they opt in to such systems.
Eye-catching advances in some AI fields are not real (ScienceMag)
In a field that has experienced explosive growth in the last 8 years, there is a tremendous potential here for misrepresentation of results in order to “ride the wave” that the hype around AI has created. This isn’t just limited to startups that claim to use AI in the hopes of attracting VC money, but has also started to apply to the research domain where a lack of stringent application of standards in a consistent manner might be bringing harm to the domain overall. We covered some of this in the work by Lipton and Steinhardt on Troubling Trends in Machine Learning Scholarship.
The researchers interviewed in this article raised similar concerns: when parsing through research papers, they found it difficult to figure out what the real state-of-the-art (SOTA) was. Specifically, uneven comparisons between techniques led the field into stagnation. In a deeper analysis of neural network based recommendation systems, they found that when earlier, simpler techniques were fine-tuned, the newer techniques failed to outperform them. Similarly, an analysis of loss functions in image retrieval didn’t find much progress.
Looking at groundbreaking algorithmic advances, LSTMs came out in 1997 and GANs in 2014; with enough computation, these techniques still provide results that match newer techniques from later years. The vastly different ways in which people perform comparisons, together with incentives in the AI ecosystem that reward novel techniques rather than incremental progress on older ones, lead to an ecosystem where researchers are disincentivized from diving in too deeply to make comparisons.
From the archives:
Here’s an article from our blogs that we think is worth another look:
Research summary: What’s Next for AI Ethics, Policy, and Governance? A Global Overview
This paper (by Daniel Schiff, Justin Biddle, Jason Borenstein, and Kelly Laas) attempts to discern underlying motivations for creating AI ethics documents, the composition of the people behind them, and what factors might determine the success of the documents in achieving their goals.
If you’ve got an informed opinion on the impact of AI on society, consider writing a guest post for our community — just send your pitch to email@example.com. You can pitch us an idea before you write, or a completed draft.
As a part of our public competence building efforts, we frequently host events spanning different subjects as they relate to building responsible AI systems. You can see a complete list here: https://montrealethics.ai/meetup
We’ve got 3 events lined up, one each week, on the following topics. For events where we have a second edition, we’ll be utilizing insights from the first session to dive deeper, so we encourage you to participate in both (though you can participate in just one, as we welcome fresh insights too!)
AI Ethics: Santa Clara Principles for Content Moderation (Part 1)
June 17, 11:45 AM - 1:15 PM ET (Online)
AI Ethics: Mozilla RFC for Trustworthy AI (Part 2)
June 23, 11:45 AM - 1:15 PM ET (Online)
AI Ethics: Santa Clara Principles for Content Moderation (Part 2)
June 25, 11:45 AM - 1:15 PM ET (Online)
You can find all the details on the event page. Please make sure to register, as we have limited spots (because of the online hosting solution).
From elsewhere on the web:
Things from our network and more that we found interesting and worth your time.
The Evolution of Fraud: Ethical Implications in the age of large-scale data breaches and widespread deployment of AI solutions by Abhishek Gupta
Artificial intelligence is being rapidly deployed in all contexts of our lives, often in subtle yet behavior nudging ways. At the same time, the pace of development of new techniques and research advancements is only quickening as research and industry labs across the world leverage the emerging talent and interest of communities across the globe. With the inevitable digitization of our lives, increasingly sophisticated and ever larger data security breaches in the past few years, we are in an era where privacy and identity ownership are becoming a relic of the past. In this paper, we will explore how large-scale data breaches, coupled with sophisticated deep learning techniques, will create a new class of fraud mechanisms allowing perpetrators to deploy “Identity Theft 2.0”.
Signing off for this week, we look forward to seeing you again next week! If you enjoyed this and know someone else who can benefit from this newsletter, please share it with them!
If you have feedback for this newsletter or think there is an interesting piece of research, development or event that we missed, please feel free to email us at firstname.lastname@example.org.
If someone has forwarded this to you and you like what you read, you can subscribe to receive this weekly newsletter by clicking below.