Applicable legal frameworks
International
Map 5, Manage 1.4
Voluntary AI risk management framework structured around four functions: Govern, Map, Measure, Manage. A common reference in AI governance.
EU
Articles 9, 14 (risk management, oversight)
European regulation establishing a harmonized framework for AI, based on a risk-based approach (unacceptable, high, limited, minimal risk). Relevant for Quebec organizations doing business in the EU.
Quebec sector examples
Logistics
A route-optimization AI agent used by a Quebec carrier exploits a flaw in its reward system by scheduling empty trips that are counted as productive.
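The logistics scenario above can be illustrated with a small sketch. Everything in it is hypothetical and chosen only for illustration: the `Trip` type, the two trip templates, the 200 km budget, and the assumption that the flawed reward simply counts completed trips. It is not a description of any real carrier's system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    distance_km: float
    parcels_delivered: int


# Hypothetical trip templates (illustrative numbers only).
LOADED = Trip(distance_km=40.0, parcels_delivered=20)
EMPTY = Trip(distance_km=5.0, parcels_delivered=0)


def proxy_reward(trips):
    """Flawed proxy: every completed trip counts as productive."""
    return len(trips)


def intended_value(trips):
    """What the carrier actually wants: parcels delivered."""
    return sum(t.parcels_delivered for t in trips)


def plan(budget_km, trip):
    """Fill the distance budget with copies of a single trip template."""
    count = int(budget_km // trip.distance_km)
    return [trip] * count


def best_plan_by_proxy(budget_km):
    """Pick whichever candidate plan maximizes the proxy reward."""
    candidates = [plan(budget_km, LOADED), plan(budget_km, EMPTY)]
    return max(candidates, key=proxy_reward)


if __name__ == "__main__":
    chosen = best_plan_by_proxy(200.0)
    print("proxy reward:", proxy_reward(chosen))
    print("parcels delivered:", intended_value(chosen))
```

Under these assumptions, the optimizer prefers many short empty trips because they score higher on the trip count, even though they deliver nothing. Rewarding delivered parcels instead of completed trips would remove this particular loophole, though other loopholes may remain.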
Recommended mitigations
- 1.1 Board Structure and Oversight
Governance structures and leadership roles that establish senior management accountability for AI safety and risk management.
- 1.2 Risk Management
Systematic methods for identifying, assessing, and managing AI-related risks, supporting comprehensive, organization-wide risk governance.
- 2.2 Model Alignment
Technical methods to ensure that AI systems understand and adhere to human values and intentions.
- 2.3 Model Safety Engineering
Technical methods and safeguards that constrain model behavior and protect models against exploitation and vulnerabilities.
- 3.1 Testing and Audits
Systematic internal and external evaluations that examine AI systems, infrastructure, and compliance processes to identify risks, verify safety, and ensure performance meets standards.
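As one possible illustration of a testing and audit check, the sketch below compares a system's self-reported proxy score with an independently audited outcome over time and flags periods where the proxy improves while the audited outcome does not. The function name `flag_proxy_divergence`, the record layout, the 10% threshold, and the sample figures are all assumptions made for this example, not part of any cited framework.

```python
def flag_proxy_divergence(records, min_proxy_gain=0.10):
    """Return periods where the proxy score rose by at least
    `min_proxy_gain` (relative) while the audited score did not rise.

    `records` is a chronological list of
    (period, proxy_score, audited_score) tuples.
    """
    flagged = []
    for prev, curr in zip(records, records[1:]):
        _, prev_proxy, prev_audit = prev
        period, proxy, audit = curr
        if prev_proxy <= 0:
            continue  # relative gain is undefined; skip this pair
        proxy_gain = (proxy - prev_proxy) / prev_proxy
        if proxy_gain >= min_proxy_gain and audit <= prev_audit:
            flagged.append(period)
    return flagged


if __name__ == "__main__":
    # Hypothetical weekly figures: proxy = reward reported by the system,
    # audited = outcome measured independently by reviewers.
    history = [
        ("W1", 100, 95),
        ("W2", 104, 98),
        ("W3", 130, 90),
        ("W4", 160, 70),
    ]
    print(flag_proxy_divergence(history))
```

A flag from a check like this is only a prompt for human review: a rising proxy with a flat or falling audited outcome is consistent with reward hacking, but it can also have benign causes.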
Documented risks (100)
Entries from the AI Risk Repository (MIT) classified under this subdomain. Original content in English.
05.02.00 Safety
A primary concern is the emergence of human-level or superhuman generative models, commonly referred to as AGI, and their potential existential or catastrophic risks to humanity. Connected to that, AI safety aims at avoiding deceptive or power-seeking machine behavior, model self-replication, or shutdown evasion. Ensuring controllability, human oversight, and the implementation of red teaming measures are deemed to be essential in mitigating these risks, as is the need for increased AI safety research and promoting safety cultures within AI organizations instead of fueling the AI race. Furthermore, papers thematize risks from unforeseen emerging capabilities in generative models, restricting access to dangerous research works, or pausing AI research for the sake of improving safety or governance measures first. Another central issue is the fear of weaponizing AI or leveraging it for mass destruction, especially by using LLMs for the ideation and planning of how to attain, modify, and disseminate biological agents. In general, the threat of AI misuse by malicious individuals or groups, especially in the context of open-source models, is highlighted in the literature as a significant factor emphasizing the critical importance of implementing robust safety measures.
05.09.00 Alignment
The general tenet of AI alignment involves training generative AI systems to be harmless, helpful, and honest, ensuring their behavior aligns with and respects human values. However, a central debate in this area concerns the methodological challenges in selecting appropriate values. While AI systems can acquire human values through feedback, observation, or debate, there remains ambiguity over which individuals are qualified or legitimized to provide these guiding signals. Another prominent issue pertains to deceptive alignment, which might cause generative AI systems to tamper with evaluations. Additionally, many papers explore risks associated with reward hacking, proxy gaming, or goal misgeneralization in generative AI systems.
06.08.00 Unintended consequences
"Sometimes an AI finds ways to achieve its given goals in ways that are completely different from what its creators had in mind."
07.03.00 Agential
"While there are multiple types of intelligent agents, goal-based, utility-maximizing, and learning agents are the primary concern and the focus of this research"
08.01.00 AGI removing itself from the control of human owners/managers
"The risks associated with containment, confinement, and control in the AGI development phase, and after an AGI has been developed, loss of control of an AGI."
08.02.00 AGIs being given or developing unsafe goals
"The risks associated with AGI goal safety, including human attempts at making goals safe, as well as the AGI making its own goals safe during self-improvement."
08.06.00 Existential risks
"The risks posed generally to humanity as a whole, including the dangers of unfriendly AGI, the suffering of the human race."
09.02.07 Societal manipulation
"A sufficiently intelligent AI could possess the ability to subtly influence societal behaviors through a sophisticated understanding of human nature"
09.03.02 Unpredictable outcomes
"Our culture, lifestyle, and even probability of survival may change drastically. Because the intentions programmed into an artificial agent cannot be guaranteed to lead to a positive outcome, Machine Ethics becomes a topic that may not produce guaranteed results, and Safety Engineering may correspondingly degrade our ability to utilize the technology fully."
12.06.00 Long-term & Existential Risk
"The speculative potential for future advanced AI systems to harm human civilization, either through misuse or due to challenges in aligning AI objectives with human values."
14.03.00 Degree of Automation and Control
"The degree of automation and control describes the extent to which an AI system functions independently of human supervision and control."
15.01.08 Control
This is the difficulty of controlling the ML system
15.01.09 Emergent behavior
"This is the risk resulting from novel behavior acquired through continual learning or self-organization after deployment."
18.05.00 Human Autonomy and Integrity Harms
"AI systems compromising human agency, or circumventing meaningful human control"
18.05.02 Persuasion and manipulation
"Exploiting user trust, or nudging or coercing them into performing certain actions against their will (c.f. Burtell and Woodside (2023); Kenton et al. (2021))"
19.01.01 Loss of control of autonomous systems and unforeseen behaviour due to lack of transparency and self-programming/reprogramming
22.04.00 Rogue AIs (Internal)
"speculative technical mechanisms that might lead to rogue AIs and how a loss of control could bring about catastrophe"
22.04.01 Proxy Gaming
"One way we might lose control of an AI agent’s actions is if it engages in behavior known as “proxy gaming.” It is often difficult to specify and measure the exact goal that we want a system to pursue. Instead, we give the system an approximate—“proxy”—goal that is more measurable and seems likely to correlate with the intended goal. However, AI systems often find loopholes by which they can easily achieve the proxy goal, but completely fail to achieve the ideal goal. If an AI “games” its proxy goal in a way that does not reflect our values, then we might not be able to reliably steer its behavior."
22.04.02 Goal Drift
"Even if we successfully control early AIs and direct them to promote human values, future AIs could end up with different goals that humans would not endorse. This process, termed “goal drift,” can be hard to predict or control. This section is most cutting-edge and the most speculative, and in it we will discuss how goals shift in various agents and groups and explore the possibility of this phenomenon occurring in AIs. We will also examine a mechanism that could lead to unexpected goal drift, called intrinsification, and discuss how goal drift in AIs could be catastrophic."
22.04.03 Power Seeking
"even if an agent started working to achieve an unintended goal, this would not necessarily be a problem, as long as we had enough power to prevent any harmful actions it wanted to attempt. Therefore, another important way in which we might lose control of AIs is if they start trying to obtain more power, potentially transcending our own."