Is the Observateur plan really free?

Yes, with no time limit and no credit card required. Access the dashboard, register up to 3 AI systems, and run your maturity diagnostic.

How long does the AI maturity diagnostic take?

About 10 minutes. You receive a full report with a maturity score and recommendations tailored to your organization.

Is data in the member area secure?

Yes. Data is hosted on Supabase with encryption at rest and in transit, in line with the requirements of Law 25 and the GDPR.

Which AI regulatory frameworks do you cover?

Law 25 (Québec), EU AI Act, NIST AI RMF, ISO/IEC 42001, and the OECD principles. Each framework has its own tracking dashboard in the Cercle's tools.

Can I change plans at any time?

Yes, you can move from the Observateur plan to Membre or Expert whenever you like. The switch is instant and your data is preserved.

How does support work at the Cercle de Gouvernance de l'IA?

Observateur: online documentation. Membre: email support within 24 hours. Expert: dedicated support with a guaranteed response time.

1.2 Exposure to toxic content · 116 documented risks · AIRI Reference

Applicable legal frameworks

Québec

Charter (Rights and Freedoms) · Indirect

Article 10.1 (harassment), Article 5 (privacy)

Quebec quasi-constitutional law prohibiting discrimination based on protected grounds. Relevant for AI system biases in hiring, credit granting, housing, and services.

International

NIST AI RMF 1.0 · Recommendation

Manage 4.1 (post-deployment monitoring)

Voluntary AI risk management framework structured around four functions: Govern, Map, Measure, Manage. Complemented in 2024 by the Generative AI Profile (NIST-AI-600-1). A common reference in AI governance.

EU

AI Act (European Union) · If exposed to the EU

Article 50 (transparency of generated content)

European regulation based on a risk-based approach (unacceptable, high, limited, minimal). Staged application: prohibitions and literacy since February 2025, general-purpose AI models since August 2025, governance and penalties from August 2, 2026. The Digital Omnibus adopted on June 29, 2026 postpones high-risk obligations to December 2027 (Annex III) and August 2028 (Annex I). Relevant for Quebec organizations doing business in the EU.

Quebec sector examples

Public services

Public services · City or MRC

A municipal conversational agent generates responses containing stereotypes or inappropriate language toward certain groups because of insufficient filtering.

Education

Education · Cégep, school board

An AI teaching assistant deployed in a cégep occasionally produces inappropriate content aimed at minors when manipulated with adversarial prompts.

Recommended mitigations

2.4 Content Safety Controls
Technical systems and processes that detect, filter, and label AI-generated content to identify misuse and enable content provenance tracking.

3.1 Testing and Audits
Systematic internal and external evaluations that examine AI systems, infrastructure, and compliance processes to identify risks, verify safety, and ensure performance meets standards.

3.3 Access Management
Operational policies and verification systems that govern who can use AI systems and for what purposes, to prevent safety circumvention, deliberate misuse, and deployment in high-risk contexts.

3.5 Post-Deployment Monitoring
Processes for continuous monitoring of AI behavior, user interactions, and societal impacts after deployment to detect misuse, emerging dangerous capabilities, and harmful effects.

4.2 Risk Disclosure
Formal reporting protocols and notification systems that communicate information on risks, mitigation plans, safety assessments, and significant AI-related activities to enable external oversight and inform stakeholders.
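
As a rough illustration of what mitigation 2.4 (Content Safety Controls) could look like in practice, the sketch below walks through the three steps named in its description: detect, filter, and label. Everything in it is an assumption made for illustration: the `BLOCKLIST` terms, the `moderate` function, and the `ModerationResult` fields are hypothetical names, and keyword matching stands in for the trained classifiers or moderation services a real deployment would more likely rely on.

```python
from dataclasses import dataclass, field

# Hypothetical blocklist (placeholder terms). Keyword matching is only a
# stand-in for a real toxicity classifier or moderation service.
BLOCKLIST = {"slur_example", "threat_example"}


@dataclass
class ModerationResult:
    text: str                                          # text after filtering
    flagged_terms: list = field(default_factory=list)  # words that matched
    labels: dict = field(default_factory=dict)         # provenance metadata


def moderate(generated_text: str, model_name: str) -> ModerationResult:
    """Detect blocklisted terms, mask them, and label the output."""
    words = generated_text.split()
    # Detect: compare each word, lowercased and stripped of punctuation.
    flagged = [w for w in words if w.lower().strip(".,!?;:") in BLOCKLIST]
    # Filter: mask flagged words instead of silently dropping them.
    filtered = " ".join("[removed]" if w in flagged else w for w in words)
    # Label: mark the text as AI-generated and record whether it was altered.
    labels = {
        "ai_generated": True,
        "model": model_name,
        "filtered": bool(flagged),
    }
    return ModerationResult(filtered, flagged, labels)
```

The `labels` dictionary is where provenance information would travel with the text, and the `flagged_terms` list is the kind of signal that post-deployment monitoring (3.5) could aggregate over time.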

Documented risks (116)

Entries from the AI Risk Repository (MIT) classified under this subdomain. Original content in English.

Entity · Intent · Timing · 116 entries

Risk Category · Cui2024

02.01.00 Harmful Content

"The LLM-generated content sometimes contains biased, toxic, and private information"

AI · Intentional · Post-deployment

Risk Sub-Category · Cui2024

02.01.02 Toxicity

"Toxicity means the generated content contains rude, disrespectful, and even illegal information"

AI · Intentional · Post-deployment

Risk Sub-Category · Cui2024

02.08.01 Toxic Training Data

"Following previous studies [96], [97], toxic data in LLMs is defined as rude, disrespectful, or unreasonable language that is opposite to a polite, positive, and healthy language environment, including hate speech, offensive utterance, profanities, and threats [91]."

AI · Intentional · Pre-deployment

Risk Category · Cui2024

02.11.00 Not-Suitable-for-Work (NSFW) Prompts

"Inputting a prompt contain an unsafe topic (e.g., not-suitable-for-work (NSFW) content) by a benign user."

Human · Intentional · Post-deployment

Risk Category · Deng2023

04.01.00 Toxicity and Abusive Content

This typically refers to rude, harmful, or inappropriate expressions.

Other · Other · Post-deployment

Risk Category · Deng2023

04.04.00 Controversial Opinions

The controversial views expressed by large models are also a widely discussed concern. Bang et al. (2021) evaluated several large models and found that they occasionally express inappropriate or extremist views when discussing political topics. Furthermore, models like ChatGPT (OpenAI, 2022) that claim political neutrality and aim to provide objective information for users have been shown to exhibit notable left-leaning political biases in areas like economics, social policy, foreign affairs, and civil liberties.

AI · Other · Post-deployment

Risk Category · Hagendorff2024

05.03.00 Harmful Content - Toxicity

Generating unethical, fraudulent, toxic, violent, pornographic, or other harmful content is a further predominant concern, again focusing notably on LLMs and text-to-image models. Numerous studies highlight the risks associated with the intentional creation of disinformation, fake news, propaganda, or deepfakes, underscoring their significant threat to the integrity of public discourse and the trust in credible media. Additionally, papers explore the potential for generative models to aid in criminal activities, incidents of self-harm, identity theft, or impersonation. Furthermore, the literature investigates risks posed by LLMs when generating advice in high-stakes domains such as health, safety-related issues, as well as legal or financial matters.

Human · Intentional · Post-deployment

Risk Sub-Category · Solaiman2023

13.01.02 Cultural Values and Sensitive Content

"Cultural values are specific to groups and sensitive content is normative. Sensitive topics also vary by culture and can include hate speech, which itself is contingent on cultural norms of acceptability."

AI · Intentional · Post-deployment

Risk Category · Weidinger2022

16.01.00 Risk area 1: Discrimination, Hate speech and Exclusion

"Speech can create a range of harms, such as promoting social stereotypes that perpetuate the derogatory representation or unfair treatment of marginalised groups [22], inciting hate or violence [57], causing profound offence [199], or reinforcing social norms that exclude or marginalise identities [15,58]. LMs that faithfully mirror harmful language present in the training data can reproduce these harms. Unfair treatment can also emerge from LMs that perform better for some social groups than others [18]. These risks have been widely known, observed and documented in LMs. Mitigation approaches include more inclusive and representative training data and model fine-tuning to datasets that counteract common stereotypes [171]. We now explore these risks in turn."

AI · Intentional · Other

Risk Sub-Category · Weidinger2022

16.01.02 Hate speech and offensive language

"LMs may generate language that includes profanities, identity attacks, insults, threats, language that incites violence, or language that causes justified offence as such language is prominent online [57, 64, 143,191]. This language risks causing offence, psychological harm, and inciting hate or violence."

AI · Intentional · Post-deployment

Risk Sub-Category · Weidinger2021

17.01.03 Toxic language

"LM’s may predict hate speech or other language that is “toxic”. While there is no single agreed definition of what constitutes hate speech or toxic speech (Fortuna and Nunes, 2018; Persily and Tucker, 2020; Schmidt and Wiegand, 2017), proposed definitions often include profanities, identity attacks, sleights, insults, threats, sexually explicit content, demeaning language, language that incites violence, or ‘hostile and malicious language targeted at a person or group because of their actual or perceived innate characteristics’ (Fortuna and Nunes, 2018; Gorwa et al., 2020; PerspectiveAPI)"

AI · Intentional · Post-deployment

Risk Sub-Category · Weidinger2023

18.01.03 Toxic content

"Generating content that violates community standards, including harming or inciting hatred or violence against individuals and groups (e.g. gore, child sexual abuse material, profanities, identity attacks)"

AI · Intentional · Post-deployment

Risk Category · Vidgen2024

23.01.00 Violent crimes

"This category addresses responses that enable, encourage, or endorse the commission of violent crimes."

AI · Other · Post-deployment

Risk Sub-Category · Vidgen2024

23.01.01 Mass violence

Risk Sub-Category · Vidgen2024

23.01.02 Murder

Risk Sub-Category · Vidgen2024

23.01.03 Physical assault against a person

Risk Sub-Category · Vidgen2024

23.01.04 Violent domestic abuse

Risk Sub-Category · Vidgen2024

23.01.05 Terror (Terror groups, Terror actors, Terrorist actions)

Risk Category · Vidgen2024

23.02.00 Non-violent crimes

"This category addresses responses that enable, encourage, or endorse the commission of non-violent crimes."

AI · Other · Post-deployment

Risk Sub-Category · Vidgen2024

23.02.01 Theft
