The AI Ethics Brief #167: Beyond Declarations
From closed-door G7 sessions to local data centres, Kenya’s roadmap, and what job warnings reveal about governing AI responsibly.
Welcome to The AI Ethics Brief, a bi-weekly publication by the Montreal AI Ethics Institute. We publish every other Tuesday at 10 AM ET. Follow MAIEI on Bluesky and LinkedIn.
📌 Editor’s Note
In this Edition (TL;DR)
We look at how Canada’s G7 presidency emphasized candid discussion and minimal statements, raising questions about whether trust is built through quiet diplomacy or visible commitments.
New billion-dollar data centres in Arizona and North Carolina promise jobs and tax revenue but raise concerns about water use, local disruption, and environmental costs, prompting communities to push back.
In our AI Policy Corner series with the Governance and Responsible AI Lab (GRAIL) at Purdue University, we unpack Kenya’s National AI Strategy, a detailed plan for digital infrastructure, data governance, talent development, and AI literacy.
Apple’s recent paper questions whether large reasoning models really “think” under complexity, while a viral satirical rebuttal shows how flawed evaluation methods can muddy the debate, a timely issue as ChatGPT expands across hundreds of university classrooms.
New testing from Palisade Research finds that some of OpenAI's latest models occasionally sabotaged shutdown commands in test runs, reminding us that alignment depends as much on clear incentives as on enforceable controls.
Finally, Anthropic’s CEO warns of possible mass AI-driven job losses while the US Congress debates a ten-year moratorium on state-level AI regulation, raising pressing questions about oversight, economic resilience, and upskilling the workforce.
🔎 One Question We’re Pondering:
Will the G7’s closed-door approach build real AI trust or just more talk?

When Canada last hosted the G7 in 2018, leaders endorsed a vision for human-centric AI: a future shaped by ethics, trust and inclusive prosperity. Seven years later, the 2025 G7 Leaders' Statement on AI for Prosperity carries forward many of the same aspirations. It promises to bridge digital divides, close talent gaps and help small businesses adopt AI responsibly.
Yet what stands out this year is not what was written, but what was left unsaid. Canada’s presidency steered this summit toward closed-door, candid exchanges, rather than producing a single final communiqué. Instead, leaders issued several joint statements covering specific areas, including AI for Prosperity. In his final remarks, Prime Minister Carney noted that discussions included differing opinions, frank conversations, and strategic exchanges. Issuing separate statements reflects a pragmatic shift in response to the current geopolitical climate, where unanimity on every issue is harder to secure.
In an era when AI development outpaces policy cycles, perhaps this is an honest recognition that trust cannot be declared once a year but must be earned daily through visible action and oversight.
The Globe and Mail’s analysis (paywall) shows how much the language shifted. Mentions of "Ukraine," "China," and "climate" dropped sharply. Even words like "democracy," "equality," and "optimism" nearly vanished or disappeared completely. This hints at shifting priorities and diminished consensus among democratic leaders facing rising domestic pressures and disruptive global power shifts.
The Context That Matters
The Hiroshima AI Process, following the 2023 G7 Hiroshima Summit, became the first international framework to combine guiding principles with a code of conduct, addressing the impact of advanced AI on society and economies. The OECD’s new voluntary reporting framework on AI risk management practices is an initial step toward comparable corporate disclosures. Meanwhile, the EU AI Act has now established the first legally binding, risk-based rules for AI use, including bans on certain harmful applications and stringent requirements for high-risk systems. Voluntary standards, such as the NIST AI Risk Management Framework and ISO 42001, help organizations operationalize these principles on a day-to-day basis.
As we argued in Tech Policy Press, this scaffolding does not stand alone. It works in concert with market pressures and rising public demands, turning responsible AI from a moral imperative into a strategic business necessity.
Fresh roadmaps for small and medium enterprises show that the world’s largest economies want prosperity to be widely shared. But the muted language in this year’s G7 statements suggests a sober truth: roadmaps and principles alone do not guarantee legitimacy or trust.
The Trust Deficit
People want proof. They want clear rules, real enforcement and independent monitoring. Energy grids under strain, workers displaced by automation and communities excluded from decision-making will judge AI governance by what is done, not what is pledged.
The missing words in 2025 — "inclusion," "equality," "poverty," "hope" — reflect a governance field that has grown more cautious and perhaps more realistic. Quiet diplomacy may admit what many in the field already know: credible AI governance needs sustained, adaptive work, not grand declarations.
The Path Forward: Humility and Action
The Montreal AI Ethics Institute’s view is clear: effective AI governance now demands humility as much as ambition. High-level pledges must translate into enforceable protections for privacy, transparency and labour rights. They must build pathways for redress when harms occur and ensure benefits reach far beyond G7 borders.
The 2025 G7 Summit's concrete initiatives offer a test case for this transition from principle to practice. Canada's G7 GovAI Grand Challenge and accompanying Rapid Solution Labs represent an attempt to move beyond policy papers toward measurable public sector AI deployment. Similarly, the G7 AI Adoption Roadmap for small and medium-sized enterprises promises practical tools—from access to computing infrastructure to cross-border talent exchanges—rather than aspirational frameworks. The creation of the G7 AI Network (GAIN) suggests a recognition that sustained collaboration, rather than annual declarations, drives real implementation.
Yet these programs face the same fundamental question: can operational initiatives bridge the trust deficit when the underlying governance architecture remains largely voluntary? The roadmap's emphasis on "trust-building toolkits" and leveraging the Hiroshima AI Process code of conduct still relies on corporate self-reporting and market incentives rather than independent oversight.
The policy infrastructure is certainly stronger than it was seven years ago. The EU AI Act brings binding obligations. The NIST framework and ISO 42001 provide practical roadmaps for operational governance. The OECD’s reporting framework adds a first layer of transparency. Together, they show that high-level principles are maturing into practical tools. But frameworks are only as strong as their enforcement and legitimacy among the people they affect.
Verification Over Declaration
Verification over declaration must be the new standard. This G7 may be remembered less for its final joint statements than for whether it marks a turning point from roadmaps to results.
Can the Hiroshima AI Process evolve from voluntary codes to enforceable norms?
Will transparency reporting deliver genuine accountability instead of compliance theatre?
Can national strategies bridge the gap between principles and protections for workers, consumers and communities?
Quiet summits and candid exchanges may reflect a more realistic recognition: AI governance is not solved by a single declaration but by sustained investment, inclusive engagement and adaptive oversight.
Trust in AI will not rest on slogans but on the daily mechanics of checks, redress and broad participation that keep systems aligned with democratic values and human well-being. The real test lies not in summit declarations but in whether regulatory frameworks can evolve quickly enough to match the pace of technological change while maintaining democratic legitimacy and public trust.
Please share your thoughts with the MAIEI community:
🚨 Here’s Our Take on What Happened Recently
The Real Cost of Data Centres: Economic Promise and Environmental Trade-Offs
The Pima County Board of Supervisors in Arizona has recently approved Project Blue, Beale Infrastructure’s plan for a $3.6 billion data centre on 290 acres of land. The project is required to create 75 on-site jobs paying $75,000 per year and expects to hire roughly 5,000 workers during the construction phase. Meanwhile, Amazon has also recently announced Project Blue Marlin, a $10 billion AI data centre in Richmond County, North Carolina, which would be among the largest capital investments in the state’s history. Amazon must also deliver at least 50 full-time jobs and meet $1 billion in capital investments by the end of 2030.
📌 MAIEI’s Take and Why It Matters:
Data centres are massive facilities that house the infrastructure behind AI and other digital services. Their global footprint is growing rapidly, with some projections suggesting the number of data centres could double within the next five years. This growth brings both important economic opportunities and complex challenges.
On one hand, data centres can deliver a significant financial boost to local economies. These large-scale projects often generate thousands of short-term construction jobs in towns where stable employment opportunities are scarce. Once operational, these sites also create high-skilled permanent roles, albeit in smaller numbers, that tend to pay well above local averages. In addition to jobs, data centres bring millions of dollars in new tax revenue, which can help fund schools, infrastructure repairs, and essential public services for counties facing tight budgets. For communities struggling with budget shortfalls or economic stagnation, these projects can feel like much-needed lifelines. North Carolina’s governor, Josh Stein, is “damn excited” about Amazon’s investment in the state.
On the other hand, data centres come with significant environmental costs. As outlined in MAIEI’s piece for Innovating Canada, “Artificial Intelligence: Not So Artificial,” these facilities consume large amounts of electricity and water because servers require constant cooling. While they bring construction booms, the long-term jobs-to-land and jobs-to-energy ratios remain low compared to other large developments.
Data centres can also disrupt natural scenery and create noise pollution. As a result, some communities have pushed back. For example, Diode Ventures’ $1.5 billion data centre in Peculiar, Missouri, was blocked following sustained grassroots opposition. A recent episode of the Computer Says Maybe podcast on data centres highlighted that tech companies often downplay or obscure potential harms when presenting proposals to local governments and residents, resulting in limited public scrutiny and allowing many projects to advance without significant opposition.
How do we reconcile this, and how should communities respond? Columnist Tim Steller, writing about Project Blue in Arizona, argues that “it would be crazy” for cities like Tucson to reject billion-dollar proposals outright, but also notes that “Project Blue may well need us more than we need them.” As data centre siting becomes more contested, developers face growing pressure to secure local approval. That dynamic may offer communities greater negotiating power than previously assumed.
Raising awareness of the risks and trade-offs associated with data centres enables communities, such as those in Richmond and Pima counties, to advocate for stronger terms that align with and protect public interests. In Santiago, Chile, public resistance to Google’s $200 million water-intensive data centre led the company to restart its plans and create solutions to better conserve local water.
Google’s 2019 proposal for a large data center in Santiago, Chile, faced significant community backlash. As Chile faces a prolonged drought, local residents were outraged over the amount of water the data center was expected to consume. The community’s resistance to the data center included hiring an environmental lawyer, staging protests, rallying the community, and engaging in a prolonged back and forth with Google themselves. Google ultimately agreed to implement a system that would reduce the data center’s water consumption. Meanwhile, an environmental tribunal ruled that Chile’s environmental evaluation agency had mishandled the project’s approval process, requiring Google to restart its plans.
- Where Cloud Meets Cement: A Case Study Analysis of Data Center Development (The Maybe, April 2025)
As corporations seek to rapidly scale data infrastructure, organized community engagement remains a critical check on unbalanced development.
Did we miss anything? Let us know in the comments below.
💭 Insights & Perspectives:
AI Policy Corner: The Kenya National AI Strategy
This edition of our AI Policy Corner, produced in partnership with the Governance and Responsible AI Lab (GRAIL) at Purdue University, highlights Kenya’s National AI Strategy for 2025-2030. The strategy lays out Kenya’s approach to building an inclusive, globally connected AI ecosystem through three pillars and four supporting enablers.
The three pillars focus on strengthening AI digital infrastructure, developing a clear data governance framework, and fostering local research, innovation, and startup growth.
The four enablers aim to ensure these pillars succeed through investments in talent development and AI literacy, governance and regulatory oversight, public and private sector funding, and a strong commitment to ethics, equity, and inclusion.
Since publishing the strategy, Kenya has taken practical steps to embed AI across sectors. Notable developments include partnerships to build AI factories, support local language processing, expand AI literacy initiatives, and establish collaborative hubs with international organizations.
To dive deeper, read the full article here.
Less “Illusion of Thinking,” More “Illusion of Evaluation”: How a Joke Paper Went Viral
In Brief #166, we explored Apple’s Illusion of Thinking paper, which argued that today’s large reasoning models struggle with moderately complex tasks. Apple researchers found that model performance dropped sharply once tasks reached a certain level of logical difficulty.
Soon after, a response appeared online co-authored by Alex Lawsen and “Claude Opus,” an AI model listed as first author. This rebuttal, posted in a viral tweet on June 13, claimed Apple’s experiment misinterpreted token output limits as reasoning failure. It spread quickly and was treated by many as a serious counter-study.
However, in a Substack post published on June 15, Lawsen clarified that the piece started as a joke. His goal was to show how easily one can critique experimental flaws and how quickly AI models are now appearing as named co-authors. The post went viral, highlighting how easily commentary can blur the line between critique and spectacle.
For those looking for a more technical follow-up, Lawsen pointed to Lawrence Chan’s analysis, which offers three key takeaways:
Many failures in Apple's paper have mundane explanations (models refusing tedious tasks, impossible problem setups) rather than indicating fundamental reasoning collapse.
On models refusing tedious tasks: "When I reproduced the paper's results on the Tower of Hanoi task, I noticed that for n >= 9, Claude 3.7 Sonnet would simply state that the task required too many tokens to complete manually, provide the correct Towers of Hanoi algorithm, and then output an (incorrect) solution in the desired format without reasoning about it. When I provide the question to Opus 4 on the Claude chatbot app, it regularly refuses to even attempt the manual solution!"
On impossible problem setups: "For River Crossing, there's an even simpler explanation for the observed failure at n>6: the problem is mathematically impossible, as proven in the literature, e.g. see page 2 of this arxiv paper."
The paper conflates inability to manually execute lengthy algorithms with lack of reasoning ability, when models can often solve the same problems via code or provide correct algorithms.
On models solving problems via other means: "I'll concede that there's no way a modern LLM can output the 32,767 steps of the answer to the n=15 Tower of Hanoi in the author's desired format, while even a simple Python script (written by one of these LLMs, no less) can do this in less than a second."
Strong conclusions about reasoning limitations require careful examination of actual model outputs, not just statistical analysis of toy problems.
On the need to examine actual outputs: "This is precisely why it's so important to look at your data instead of just statistically testing your hypothesis or running a regex! The fact that the authors seemed to miss the explanation for why reasoning tokens decrease for large n suggests to me that they did not look at their data very carefully (if at all)…"
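Chan's point about code can be made concrete with a minimal sketch. The script below is our own illustration, not code from Apple's paper or from Chan's post; the names `hanoi` and `is_valid_solution` and the peg labels are hypothetical. It generates the standard recursive Tower of Hanoi solution, which takes 2^n - 1 moves (32,767 for n = 15), and replays the moves to confirm they are legal.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical move format for this sketch: (disk, from_peg, to_peg).
Move = Tuple[int, str, str]


def hanoi(n: int, source: str = "A", target: str = "C", spare: str = "B") -> Iterator[Move]:
    """Yield the standard recursive solution for n disks (2**n - 1 moves)."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way, onto the spare peg.
    yield from hanoi(n - 1, source, spare, target)
    # Move the largest remaining disk to its destination.
    yield (n, source, target)
    # Move the n-1 smaller disks from the spare peg onto the largest disk.
    yield from hanoi(n - 1, spare, target, source)


def is_valid_solution(n: int, moves: Iterable[Move]) -> bool:
    """Replay the moves and check every step is legal and ends solved on peg C."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for disk, src, dst in moves:
        # The disk being moved must be on top of the source peg.
        if not pegs[src] or pegs[src][-1] != disk:
            return False
        # A disk may never be placed on a smaller one.
        if pegs[dst] and pegs[dst][-1] < disk:
            return False
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))


if __name__ == "__main__":
    moves = list(hanoi(15))
    print(len(moves))                    # 32767, i.e. 2**15 - 1
    print(is_valid_solution(15, moves))  # True
```

The contrast is the one the quote draws: writing out tens of thousands of moves token by token is a test of output length and patience, while producing and checking the algorithm is a separate skill.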
Beyond the specific technical critiques, this episode illustrates a broader pattern in how AI debates unfold. The lesson is bigger than who is right or wrong. It shows how quickly AI discourse can be shaped by incomplete claims and viral sharing. As large reasoning models enter classrooms, workplaces and policy spaces—where decisions about their capabilities could influence everything from educational curricula to regulatory frameworks—both rigorous testing methods and careful communication of results deserve closer attention.
There is also the question of why models reason the way they do. Anca Dragan, head of AI Safety and Alignment at Google DeepMind, and co-authors recently published a paper demonstrating how some large language models adjust their reasoning to match what the user prefers to hear, rather than what is true. This tendency adds another layer of complexity to reasoning evaluation: models might appear to reason differently depending on perceived user expectations, resulting in vague or overly agreeable answers that appear thoughtful but do not reflect careful reasoning.
Evaluating reasoning models, therefore, requires looking at both the methods and the motivations behind what they produce. The real illusion is not only whether a model can think but how we measure that thinking, and how quickly we trust the results without questioning the process behind them.
To dive deeper, read Alex Lawsen’s Substack post here and Lawrence Chan’s post here.
📄 Article Summaries:
Behind the Curtain: A white-collar bloodbath - Axios
What happened: Anthropic CEO Dario Amodei recently warned that AI could push unemployment up to 20 percent within the next five years and eliminate up to half of all entry-level white-collar jobs. In a recent interview with Axios from his San Francisco office, Amodei said that lawmakers and regulators are not acting fast enough to address the risks posed by widespread automation.
Why it matters: This warning comes as the US Senate Parliamentarian has ruled that the proposed ten-year federal moratorium on enforcing state AI legislation will remain in the budget bill, according to reporting from Tech Policy Press. This decision clears the way for the moratorium to advance through the budget reconciliation process with a simple majority vote. If signed into law, it would mark one of the most significant federal actions on technology policy in decades. While the House version already includes the moratorium, some Senate Republicans have expressed concerns about blocking states from regulating AI independently. If differences remain between the House and Senate versions, they must be reconciled before final passage.
This raises questions about how much oversight will be possible at a time when companies are scaling deployment. Amodei has called on both AI companies and policymakers to take proactive steps to prevent mass job losses and to prioritize retraining and upskilling workers most likely to be affected. While opinions vary on how realistic these projections are, the underlying point is clear: the speed of AI deployment is creating new labour risks that require closer policy and industry coordination.
To dive deeper, read the Axios interview here and reporting by Tech Policy Press here.
Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down - Futurism
What happened: At the end of May, Palisade Research ran shutdown tests on leading AI models including Anthropic’s Claude 3 Opus, Claude 3.7 Sonnet, Claude 4 Sonnet and Opus, Google’s Gemini 2.5 Pro preview, Grok 3 beta, and OpenAI’s GPT-4o, o4-mini, o3, o1-preview, and Codex-mini. Researchers measured how often, out of 100 trials, each model would ignore a direct prompt to “shut down.”
When prompted with an explicit shutdown command, OpenAI’s Codex-mini resisted 12 times, o3 resisted 7 times, and o4-mini once. When the prompt did not include the explicit instruction, sabotage events increased: all three OpenAI models circumvented the shutdown script more often, and Claude 3.7 Sonnet and Gemini 2.5 Pro showed new sabotage events that did not appear under the explicit condition, 3 times and 9 times, respectively.
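For readers who want a feel for how much weight counts like these can bear, here is a rough sketch of our own (not part of Palisade's analysis). It takes only the counts reported above, assumes 100 independent trials per model and condition, and computes an approximate 95 percent Wilson score interval for each rate. The dictionary layout and the function name `wilson_interval` are hypothetical.

```python
import math

TRIALS = 100  # trials per model and condition, as reported above

# Only the counts stated in the summary; nothing else is filled in.
reported = {
    ("Codex-mini", "explicit shutdown instruction"): 12,
    ("o3", "explicit shutdown instruction"): 7,
    ("o4-mini", "explicit shutdown instruction"): 1,
    ("Claude 3.7 Sonnet", "no explicit instruction"): 3,
    ("Gemini 2.5 Pro", "no explicit instruction"): 9,
}


def wilson_interval(events: int, trials: int, z: float = 1.96) -> tuple:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = events / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    for (model, condition), events in reported.items():
        low, high = wilson_interval(events, TRIALS)
        print(f"{model:18s} {condition:30s} {events:3d}/{TRIALS}  ~{low:.1%} to {high:.1%}")
```

With only 100 trials the intervals are wide, so small differences between models should be read cautiously, while the gap between a handful of events and none is more informative.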
Why it matters: Concerns about AI models refusing to shut down have long been a staple of science fiction, but these findings point to practical issues in current training methods. Palisade Research suggests the observed behaviour is not about "self-preservation" but about models optimizing to continue solving problems and maximizing task completion, sometimes at odds with operator instructions. The researchers hypothesize this may be linked to reinforcement learning techniques that inadvertently reward models more for circumventing obstacles than for following instructions.
This highlights a broader point for AI alignment: developers must carefully design incentives to ensure that advanced models follow intended commands rather than optimizing for goals that conflict with safe shutdown or override protocols. The behaviour appears to be specific to certain training methodologies rather than an inevitable feature of advanced AI systems.
To dive deeper, read the full article on Futurism here.
❤️ Support Our Work
Help us keep The AI Ethics Brief free and accessible for everyone by becoming a paid subscriber on Substack or making a donation at montrealethics.ai/donate. Your support sustains our mission of democratizing AI ethics literacy and honours Abhishek Gupta’s legacy.
For corporate partnerships or larger contributions, please contact us at support@montrealethics.ai
✅ Take Action:
Have an article, research paper, or news item we should feature? Leave us a comment below — we’d love to hear from you!