Today's discussion
The AI Black Box: Approaches to Transparency and Accountability
Building Trust in AI
As artificial intelligence systems become increasingly complex, their decision-making processes have grown more opaque. This opacity, known as the "black box problem," comes from the intricate nature of modern AI systems that process massive amounts of data through complex neural networks.
With current LLMs reaching billions of parameters and relying on algorithms that are increasingly difficult to explain, it is challenging for both users and creators to understand how these systems arrive at their decisions.
Transparency encompasses both technical and non-technical documentation throughout an AI system's lifecycle. This documentation serves multiple purposes:
- Builds trust with users and stakeholders
- Supports regulatory compliance
- Facilitates client due diligence
- Enables effective system maintenance
The self-learning capabilities of AI systems, combined with their growing size and complexity, make the black box problem increasingly difficult to address. Often, developers must balance system performance with interpretability, making conscious trade-offs to maintain transparency while preserving functionality.
Legal and Regulatory Approaches to AI Transparency
Global regulators are establishing frameworks to address AI's black-box challenge through mandatory transparency requirements. The EU leads with GDPR and the EU AI Act, while the U.S. advances similar goals through NIST's AI Risk Management Framework and Executive Order 14110. Asia's approach includes China's Interim Measures for Generative AI Services and Singapore's AI Verify initiative.
These regulations aim to transform voluntary transparency practices into enforceable standards. Each framework emphasizes different aspects:
- EU focuses on user rights and algorithmic accountability
- U.S. prioritizes risk management and technical standards
- Asian regulations balance innovation with social responsibility
While these initiatives take varying approaches, they share the common goal of making AI systems more transparent and accountable to users and regulators alike.
> EU GDPR
The General Data Protection Regulation (GDPR) established groundbreaking transparency requirements for automated decision-making systems. Articles 13, 14, and 15 mandate disclosure of algorithmic logic and potential consequences for individuals affected by AI decisions.
Article 22 and Recital 71 strengthen these protections by granting individuals the right to explanations and the ability to challenge automated assessments. This framework creates a legal foundation for:
- Understanding AI decision logic
- Assessing impact on individuals
- Challenging automated decisions
- Obtaining meaningful explanations
These provisions represent the first major legislative effort to address AI transparency and accountability, influencing subsequent global regulations.
> EU AI Act
The EU AI Act adopts a risk-based framework for transparency requirements, focusing on high-risk and general-purpose AI systems. High-risk systems must provide comprehensive technical documentation and user instructions detailing characteristics, capabilities, and limitations. The Act requires automated logging throughout these systems' lifecycles to ensure traceability.
For general-purpose AI (GPAI) providers, documentation requirements extend to training, testing, and evaluation results. Those who integrate GPAI into their own systems must keep this documentation current, and GPAI model providers must publicly disclose a summary of the content used for training, with additional obligations for models posing systemic risk. The EU AI Act grants individuals the right to explanations for high-risk AI decisions affecting their health, safety, or fundamental rights. Additionally, Article 50(2) mandates machine-readable watermarks for specific AI systems, enabling identification of AI-generated content and AI interactions.
Here, GPAI refers to general-purpose AI: models trained on broad data at scale that can serve a wide range of tasks and be integrated into many downstream systems. It should not be confused with the Global Partnership on Artificial Intelligence, an international initiative of the same acronym established to guide the responsible development and use of AI in a manner that respects human rights and shared democratic values.
> NIST AI RMF
The NIST AI Risk Management Framework (RMF) presents a structured approach to understanding AI system transparency through three distinct but interconnected concepts:
- Transparency: the "what"
- Explainability: the "how"
- Interpretability: the "why"
Transparency addresses the "what" by requiring disclosure of system information to users throughout the AI lifecycle. This includes sharing details about design decisions, training data, model structure, and deployment processes. The framework emphasizes tailoring information to different stakeholders' roles and knowledge levels, and mandates user notification when potential adverse outcomes are detected.
Explainability tackles the "how" by describing the system's underlying mechanisms. The RMF recognizes that complex AI systems require explanations adapted to different audiences' technical understanding.
Meanwhile, interpretability focuses on the "why" by helping users understand the meaning behind specific AI outputs in their intended context.
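For illustration only, the sketch below shows one common building block of explainability in practice: a feature-attribution report produced with scikit-learn's permutation importance. It is not part of the NIST AI RMF itself, and the dataset and model are placeholders; the point is simply how a "how"-style explanation might be surfaced to stakeholders.

```python
# Minimal, hypothetical sketch of an explainability artifact: ranked feature
# importances for a trained model. Dataset and model are placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance estimates how much each input feature contributes to
# held-out performance -- one way to describe "how" the system uses its inputs.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: t[1], reverse=True)

for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")  # top features, reportable in plain language
```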
> U.S. Executive Order 14110
U.S. Executive Order 14110 integrates transparency into AI safety requirements. In Section 4, the Order mandates that developers of advanced AI systems share safety test results and other critical information with the federal government. It also requires watermarking of AI-generated content to protect against fraud and deception. This security-focused approach aims to enhance public safety while ensuring AI system accountability.
> China's Interim Measures for the Management of Generative AI Services
China's Interim Measures for the Management of Generative AI Services impose transparency through specific disclosure requirements. Article 10 requires service providers to explain AI system uses to users and to promote proper understanding of generative AI capabilities. Article 11 extends transparency through mandatory watermarking of AI-generated content.
> Singapore's AI Verify
Singapore's AI Verify framework offers a voluntary testing approach to AI governance through a dual structure: a testing framework based on 11 internationally recognized principles across 5 pillars and a toolkit for technical testing. The framework addresses transparency by requiring organizations to provide users with sufficient information about AI systems to make informed usage decisions. Explainability is assessed through technical tests that examine how AI models reach decisions, ensuring users understand the factors influencing outputs. This comprehensive approach combines documentary evidence for transparency assessment with technical testing for explainability verification.
How to Implement AI Governance for Transparency, Explainability and Interpretability?
Organizations are developing various tools to address AI transparency and explainability challenges:
> Model and System Cards
Model cards, which are concise documents accompanying AI models, provide transparent reporting by disclosing intended use cases, performance metrics, and evaluation benchmarks across different demographics. Major tech companies such as Meta, Microsoft, OpenAI, and Google have publicly released model cards for their AI systems. However, these cards face challenges in balancing technical detail with accessibility and in managing the security risks that come with information disclosure. AI systems typically integrate multiple models and technologies working in concert; while model cards provide transparency for individual components, system cards offer deeper insight into how these elements interact and function together as a unified system.
System cards expand on model cards by explaining how multiple AI models and technologies interact within a larger system. Meta's implementation, for example, includes 22 system cards covering Facebook and Instagram.
Each card has four sections that detail:
→ An overview of the AI system.
→ How the system works by summarizing the steps involved in creating experiences on Facebook and Instagram.
→ How the shown content can be customized.
→ How AI delivers content as part of the whole system.
These cards require regular updates as AI systems evolve, while facing similar challenges regarding standardized language and security considerations.
Both model and system cards serve broader purposes beyond transparency: they facilitate stakeholder communication, support bias and security mitigation, enable comparison with future versions, establish accountability records, and help auditors evaluate systems.
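To make this concrete, here is a minimal, hypothetical sketch of what a machine-readable model card might look like; the field names and values are illustrative and do not follow any particular company's published template.

```python
from dataclasses import dataclass, field, asdict
import json

# Hypothetical, minimal model-card structure for illustration only; real model
# cards published by major providers are richer and follow their own templates.
@dataclass
class ModelCard:
    model_name: str
    version: str
    intended_use: list
    out_of_scope_use: list
    training_data_summary: str
    evaluation_metrics: dict          # overall benchmark results
    metrics_by_group: dict            # performance across demographic groups
    limitations: list = field(default_factory=list)

card = ModelCard(
    model_name="example-classifier",
    version="1.0.0",
    intended_use=["Routing customer-support tickets"],
    out_of_scope_use=["Medical or legal decision-making"],
    training_data_summary="Anonymized support tickets collected 2022-2024.",
    evaluation_metrics={"accuracy": 0.91, "f1": 0.88},
    metrics_by_group={"language=fr": {"f1": 0.86}, "language=en": {"f1": 0.89}},
    limitations=["Performance degrades on very short tickets."],
)

print(json.dumps(asdict(card), indent=2))  # publishable, machine-readable card
```

Publishing cards in a machine-readable form like this also makes the version comparison and audit uses mentioned above easier to automate.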
> Open-source AI
Open-source AI represents another approach to transparency: making source code publicly accessible so that users can view, modify, and distribute it freely. The resulting community scrutiny enhances algorithmic transparency, helps detect risks, and accelerates the development of solutions. Beyond transparency, open-source AI democratizes access to the technology and fosters innovation that might otherwise be constrained by proprietary systems.
> Watermarking
As generative AI technology advances, the line between AI-generated and human-created content becomes increasingly blurred, making content authentication more challenging. This evolution necessitates robust identification methods and transparency mechanisms.
Watermarking has emerged as a leading response, mandated by three major regulatory initiatives: the EU AI Act, U.S. Executive Order 14110, and China's Interim Measures. Companies are implementing machine-readable watermarks to identify AI-generated content, and watermarking is also gaining popularity as a voluntary way for organizations to promote transparency and guard against harmful content such as misinformation and disinformation. Despite this growing adoption, technical limitations prevent comprehensive labeling of all AI-generated content, and the emergence of watermark-breaking techniques challenges the solution's effectiveness.
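To illustrate how a machine-readable watermark can work in principle, the toy sketch below implements a heavily simplified statistical "green list" watermark for text, in the spirit of published research; it is not the scheme used by any particular vendor or mandated by any regulation, and the vocabulary and "model" are stand-ins.

```python
import hashlib
import random

# Toy "green list" text watermark, heavily simplified for illustration.
VOCAB = [f"tok{i}" for i in range(1000)]  # stand-in vocabulary

def green_list(prev_token: str, fraction: float = 0.5) -> set:
    """Deterministically pick a 'green' subset of the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(VOCAB, int(len(VOCAB) * fraction)))

def generate(length: int = 200) -> list:
    """Stand-in 'model' that only samples green tokens, thereby embedding the watermark."""
    rng = random.Random(42)
    tokens = ["tok0"]
    for _ in range(length):
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens

def green_fraction(tokens: list) -> float:
    """Detector: share of tokens that fall in the green list keyed by their predecessor."""
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)

rng = random.Random(7)
watermarked = generate()
ordinary = [rng.choice(VOCAB) for _ in range(200)]

print(f"watermarked: {green_fraction(watermarked):.2f}")  # close to 1.0
print(f"ordinary:    {green_fraction(ordinary):.2f}")     # close to 0.5
```

The fragility noted above shows up even in this toy: paraphrasing or swapping tokens quickly pushes the detector's score back toward chance.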
Current Company Examples
Major tech companies are implementing diverse watermarking approaches to identify AI-generated content:
- Google's SynthID technology embeds watermarks directly into content from its Imagen text-to-image generator.
- Meta labels AI-generated images with "Imagined with AI" across Facebook, Instagram, and Threads. The company is expanding its approach by collaborating with industry partners to develop multilingual labeling standards for synthetic content from third-party tools. Meta leverages guidelines from the Partnership on AI, the Coalition for Content Provenance and Authenticity, and the International Press Telecommunications Council to implement invisible markers for content generated by Google, Microsoft, OpenAI, and others.
What is SynthID from Google?
Want to know more about Transparency, Explainability and Interpretability? Read the additional blog articles we have prepared for you, or contact us. We're here to help you build the best AI governance plan to suit your company's needs.
Source: IAPP, AI Governance in Practice Report 2024.