-
NIST AI 800-2: Our Recommendations on Benchmark Lifecycle and Deprecation
We submitted feedback to NIST on its draft AI 800-2 on Best Practices for Automated Benchmark Evaluations. We recommend that the draft treat benchmarks as active tools requiring lifecycle management, not static instruments. Our submission covers deprecation criteria, versioning, saturation, annotation quality, semantic drift, and the risks of relying on popular but flawed benchmarks. Our…
-
Recommendations for the EU AI Act Digital Omnibus Trilogue
We have published a report analysing the Council and European Parliament positions going into the EU AI Act Omnibus trilogue. We make several recommendations for the parties engaged in the trilogue. Our main concern in making these recommendations is that, while it is positive if the Omnibus can remove some administrative burdens, it should not…
-
Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem
AI regulation assigns distinct obligations to providers of AI models and AI systems, but the lack of clear, consistent definitions for “AI model” and “AI system” creates ambiguity across the value chain. This paper surveys the definitions used in academic literature and regulatory documents, and proposes conceptual and operational definitions for drawing a principled boundary…
-
Recommendations on the European Parliament Amendments to the EU AI Act in the Digital Omnibus
We have published a report analysing some of the 750+ AI Act amendments that were proposed by the European Parliament in the context of the EU AI Act Omnibus. We provide a first analysis of these amendments, highlighting specific ones that we either welcome or oppose, based on our area of expertise.
-
A Scorecard for the Quality of AI Evaluations
We have published a working draft of a Quality Scorecard for AI Evaluations, a standards-based framework for assessing the reliability, validity, and rigour of AI evaluations. The scorecard provides structured scoring across five dimensions and a classification system to match evaluations to appropriate governance and deployment contexts.
-
Our Feedback on the First Draft Code of Practice on Transparency of AI-Generated Content
We provided feedback on the First Draft Code of Practice on Transparency of AI-Generated Content, addressing feasibility concerns, proportionality for SMEs, and operational clarity across marking, detection, and disclosure requirements.
-
Recommendations on the Digital Omnibus Amendments to the EU AI Act
We analysed the Commission’s Digital Omnibus proposals for the AI Act, highlighting concerns with Article 6(4) database deletion, Article 75(1) enforcement centralisation, and Article 4a data processing rules, whilst proposing targeted amendments to address critical regulatory gaps.
-
We presented a poster at the AI and Societal Robustness Conference
Rokas Gipiškis (AI Standards Lab) and Rebecca Scholefield presented the poster “AI Incident Reporting: Pipeline and Principles” at the AI and Societal Robustness Conference in Cambridge, organised by the UK AI Forum. The work examines post-deployment AI incidents through an end-to-end pipeline spanning definitions and taxonomies, monitoring, reporting, and downstream analysis (including multi-causal approaches and…
-
An Analysis of the GPAI model guidelines published by the European Commission
On July 18, 2025, the European Commission published its first Guidelines on the scope of obligations for providers of general-purpose AI models under the AI Act. In this post, we provide an analysis of these guidelines. As these obligations for providers go into force on 2 August 2025, we decided that a timely publication of…
-
Deprecating Benchmarks: Criteria and Framework
As AI models rapidly advance, many benchmarks become outdated or flawed yet continue to be used, inflating performance claims and obscuring safety concerns. This paper introduces criteria and a framework for deprecating inadequate benchmarks, with recommendations for developers, policymakers, and governance actors on how to maintain rigorous evaluation standards.
-
White paper: GPAI model providers and modifiers and their obligations under the AI Act
Authors: Ze Shen Chin and Koen Holtman. This white paper describes what we think is a possible approach to classifying GPAI model providers and determining their obligations. We may expand on these ideas in the future. Editor’s update: The European Commission published its guidelines on the scope of obligations for providers of general-purpose AI models…
-
Our Feedback on the Third Draft of the General-Purpose AI Code of Practice
As an active participant in the General-Purpose AI Code of Practice (GPAI CoP) drafting process, we submitted detailed technical recommendations on the third draft. Our feedback covers all four working groups: copyright and transparency, risk assessment, risk mitigation, and governance. We also provided additional technical analysis on specific issues pertaining to systemic risk definitions…
-
We presented at IASEAI and AI Safety Connect
AI Standards Lab presented a comprehensive catalogue of risk sources and risk management measures for general-purpose AI systems at IASEAI and AI Safety Connect in Paris. The research, released under a public domain licence, supports implementation of the EU AI Act and provides resources for AI providers, standards bodies, researchers, policymakers, and regulators.
-
Our Feedback on the Second Draft of the General-Purpose AI Code of Practice
We provided technical feedback to the European Commission on the second draft of the General-Purpose AI Code of Practice. Our feedback covers all four working groups: copyright and transparency, risk assessment, risk mitigation, and governance. We also provided additional technical analysis on specific issues pertaining to risk tiers.
-
Our Feedback on the First Draft of the General-Purpose AI Code of Practice
We provided technical feedback to the European Commission on the first draft of the General-Purpose AI Code of Practice. We also provided additional technical analysis on risk taxonomy, risk terminology and risk sources, requirements for scientific rigour, and AI control and red teaming, and we converted the existing Code into a process flowchart.
-
Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems
A comprehensive, public domain catalogue of AI risks and safety measures designed to support global AI regulation and standards development. This resource documents risk sources and management measures across the entire AI lifecycle, from development through deployment.
-
Our Feedback to the EU AI Office’s Multi-Stakeholder Consultation on Trustworthy General-purpose AI Models in the Context of the AI Act
We submitted comprehensive responses to the EU AI Office consultation on the AI Act’s Code of Practice for general-purpose AI models, including a 150-page technical contribution detailing GPAI risk sources and risk management measures for direct inclusion in the Code.
