All Public Outputs – AI Standards Lab

Open Problems in AI Incident Governance

July 7, 2026

Research Papers

AI systems may produce failures after deployment that pre-deployment safety assessments do not anticipate. Managing these failures requires adequate AI incident governance, encompassing sound definitions, taxonomies, monitoring practices, reporting mechanisms, and incident analysis. We examine existing frameworks from regulatory bodies (including the EU AI Act, California’s SB 53, and New York’s RAISE Act) and independent…
Read more
NIST AI 800-2: Our Recommendations on Benchmark Lifecycle and Deprecation

April 9, 2026

Consultation Inputs

We submitted feedback to NIST on its draft AI 800-2 on Best Practices for Automated Benchmark Evaluations. We recommend that the draft treat benchmarks as active tools requiring lifecycle management, not static instruments. Our submission covers deprecation criteria, versioning, saturation, annotation quality, semantic drift, and the risks of relying on popular but flawed benchmarks. Our…
Read more
Recommendations for the EU AI Act Digital Omnibus Trilogue

April 7, 2026

EU AI Act • Reports

We have published a report analysing the Council and European Parliament positions going into the EU AI Act Omnibus trilogue. We make several recommendations for the parties engaged in the trilogue. Our main concern in making these recommendations is that, while it is positive if the Omnibus can remove some administrative burdens, it should not…
Read more
Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem

March 19, 2026

Research Papers

AI regulation assigns distinct obligations to providers of AI models and AI systems, but the lack of clear, consistent definitions for “AI model” and “AI system” creates ambiguity across the value chain. This paper surveys the definitions used in academic literature and regulatory documents, and proposes conceptual and operational definitions for drawing a principled boundary…
Read more
Recommendations on the European Parliament Amendments to the EU AI Act in the Digital Omnibus

March 5, 2026

EU AI Act • Reports

We have published a report analysing some of the 750+ AI Act amendments that were proposed by the European Parliament in the context of the EU AI Act Omnibus. We provide a first analysis of these amendments, highlighting specific ones that we either welcome or oppose, based on our area of expertise.
Read more
A Scorecard for the Quality of AI Evaluations

February 23, 2026

Reports

We have published a working draft of a Quality Scorecard for AI Evaluations, a standards-based framework for assessing the reliability, validity, and rigour of AI evaluations. The scorecard provides structured scoring across five dimensions and a classification system to match evaluations to appropriate governance and deployment contexts.
Read more
Our Feedback on the First Draft Code of Practice on Transparency of AI-Generated Content

January 28, 2026

Transparency Code of Practice

We provided feedback on the First Draft Code of Practice on Transparency of AI-Generated Content, addressing feasibility concerns, proportionality for SMEs, and operational clarity across marking, detection, and disclosure requirements.
Read more
Recommendations on the Digital Omnibus Amendments to the EU AI Act

January 23, 2026

EU AI Act • Reports

We analysed the Commission’s Digital Omnibus proposals for the AI Act, highlighting concerns with Article 6(4) database deletion, Article 75(1) enforcement centralisation, and Article 4a data processing rules, whilst proposing targeted amendments to address critical regulatory gaps.
Read more
Our Input to the European Commission on the Future of European Standardisation

December 18, 2025

EU AI Act-related Consultations

We provided input to the European Commission consultation on the future of European Standardisation. This includes a detailed proposal for a new, more efficient, and more inclusive process for writing harmonised standards to support digital and green legislation.
Read more
We presented a poster at the AI and Societal Robustness Conference

December 12, 2025

Presentations

Rokas Gipiškis (AI Standards Lab) and Rebecca Scholefield presented the poster “AI Incident Reporting: Pipeline and Principles” at the AI and Societal Robustness Conference in Cambridge, organised by the UK AI Forum. The work examines post-deployment AI incidents through an end-to-end pipeline spanning definitions and taxonomies, monitoring, reporting, and downstream analysis (including multi-causal approaches and…
Read more
Agentic Product Maturity Ladder V0.1

December 1, 2025

Co-authored Research Papers

MLCommons releases the Agentic Product Maturity Ladder V0.1, a systematic framework defining six progressive maturity levels (R0–R5) for benchmarking AI agent reliability. Initial assessment of four task domains shows no agents yet meet thresholds for product-level capability benchmarking.
Read more
Our Input to the European Commission on the Reporting of Serious AI Incidents

November 11, 2025

EU AI Act-related Consultations

We provided feedback to the European Commission’s consultation on Article 73 of the AI Act concerning serious incident reporting for high-risk AI systems. Our submission addressed definitional clarity, practical implementation challenges, edge-case scenarios, and coordination between overlapping EU reporting frameworks.
Read more
Our Input to the European Commission on the Digital Simplification Package and Omnibus

October 14, 2025

EU AI Act-related Consultations

We provided input to the European Commission’s Digital Omnibus consultation on the EU AI Act. Based on our Code of Practice and CEN-CENELEC participation, we addressed high-risk AI classification issues, GPAI provider obligations, and standards development delays, recommending grace periods for smaller entities and refined classification criteria.
Read more
Safety Frameworks and Standards: A comparative analysis to advance risk management of frontier AI

October 9, 2025

Co-authored Research Papers

This research memo compares Frontier Safety Frameworks with international risk management standards. FSFs offer frontier-specific innovations like capability thresholds but often leave key considerations implicit. Standards provide systematic rigor but weren’t designed for frontier AI. The paper shows how integrating both approaches can advance frontier AI risk management.
Read more
An Analysis of the GPAI model guidelines published by the European Commission

July 29, 2025

EU AI Act • Reports

On July 18, 2025, the European Commission published its first Guidelines on the scope of obligations for providers of general-purpose AI models under the AI Act. In this post, we provide an analysis of these guidelines. As these obligations for providers go into force on 2 August 2025, we decided that a timely publication of…
Read more
Our Input to the European Commission on High-Risk AI System Classification

July 22, 2025

EU AI Act-related Consultations

We provided input to the European Commission on high-risk AI system classification. Our response addresses safety component definitions, value chain obligations, and proposals for expanding high-risk classifications to cover addictive and autonomous agentic systems.
Read more
Deprecating Benchmarks: Criteria and Framework

July 8, 2025

Research Papers

As AI models rapidly advance, many benchmarks become outdated or flawed yet continue to be used, inflating performance claims and obscuring safety concerns. This paper introduces criteria and a framework for deprecating inadequate benchmarks, with recommendations for developers, policymakers, and governance actors on how to maintain rigorous evaluation standards.
Read more
White paper: GPAI model providers and modifiers and their obligations under the AI Act

May 26, 2025

EU AI Act • Reports

Authors: Ze Shen Chin and Koen Holtman

This white paper describes what we think is a possible approach to classifying GPAI model providers and determining their obligations. We may expand on these ideas in the future. Editor’s update: The European Commission published its guidelines on the scope of obligations for providers of general-purpose AI models…
Read more
Our input to the European Commission on General-Purpose AI Guidelines

May 22, 2025

EU AI Act-related Consultations

We provided input to the European Commission on general-purpose AI guidelines, addressing potential gaps in compute-based classification thresholds, downstream modifier obligations, and low-compute modifications that can significantly affect model safety.
Read more
Mapping Industry Practices to the EU AI Act’s GPAI Code of Practice Safety and Security Measures

April 21, 2025

Co-authored Research Papers • GPAI Code of Practice

A comparative analysis of safety and security measures in the EU AI Act’s GPAI Code of Practice against voluntary commitments from leading AI companies. This paper examines public documents from over a dozen leading AI companies including OpenAI, Anthropic, Google DeepMind, Microsoft, Meta and Amazon to identify industry precedent for Commitments II.1–II.16.
Read more
Our Feedback on the Third Draft of the General-Purpose AI Code of Practice

March 30, 2025

GPAI Code of Practice

As an active participant in the General-Purpose AI Code of Practice (GPAI CoP) drafting process, we submitted detailed technical recommendations on the third draft. Our feedback covers all four working groups: copyright and transparency, risk assessment, risk mitigation, and governance. We also provided additional technical analysis on specific issues pertaining to systemic risk definitions…
Read more
We presented at IASEAI and AI Safety Connect

February 7, 2025

Presentations

AI Standards Lab presented a comprehensive catalogue of risk sources and risk management measures for general-purpose AI systems at IASEAI and AI Safety Connect in Paris. The research, released under public domain licence, supports implementation of the EU AI Act and provides resources for AI providers, standards bodies, researchers, policymakers, and regulators.
Read more
Our Feedback on the Second Draft of the General-Purpose AI Code of Practice

January 15, 2025

GPAI Code of Practice

We provided technical feedback to the European Commission on the second draft of the General-Purpose AI Code of Practice. Our feedback covers all four working groups: copyright and transparency, risk assessment, risk mitigation, and governance. We also provided additional technical analysis on specific issues pertaining to risk tiers.
Read more
Our Feedback on the First Draft of the General-Purpose AI Code of Practice

November 27, 2024

GPAI Code of Practice

We provided technical feedback to the European Commission on the first draft of the General-Purpose AI Code of Practice. We also provided additional technical analysis on risk taxonomy, risk terminology and risk sources, requirements for scientific rigour, and AI control and red teaming, and we converted the existing Code into a process flowchart.
Read more
Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

October 30, 2024

Research Papers

A comprehensive, public domain catalog of AI risks and safety measures designed to support global AI regulation and standards development. This resource documents risk sources and management measures across the entire AI lifecycle, from development through deployment.
Read more
Our Feedback to the EU AI Office’s Multi-Stakeholder Consultation on Trustworthy General-purpose AI Models in the Context of the AI Act

September 18, 2024

GPAI Code of Practice

We submitted comprehensive responses to the EU AI Office consultation on the AI Act’s Code of Practice for general-purpose AI models, including a 150-page technical contribution detailing GPAI risk sources and risk management measures for direct inclusion in the Code.
Read more