Research Papers and Reports

AI Standards Lab publishes research on AI safety engineering, AI governance, and standards development. This list includes both lab publications and papers co-authored by our team members.

  • Recommendations for the EU AI Act Digital Omnibus Trilogue

    We have published a report analysing the Council and European Parliament positions going into the EU AI Act Omnibus trilogue. We make several recommendations for the parties engaged in the trilogue. Our main concern in making these recommendations is that, while it is positive if the Omnibus can remove some administrative burdens, it should not…


  • Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem

    AI regulation assigns distinct obligations to providers of AI models and AI systems, but the lack of clear, consistent definitions for “AI model” and “AI system” creates ambiguity across the value chain. This paper surveys the definitions used in academic literature and regulatory documents, and proposes conceptual and operational definitions for drawing a principled boundary…


  • Recommendations on the European Parliament Amendments to the EU AI Act in the Digital Omnibus

    We have published a report analysing some of the 750+ AI Act amendments that were proposed by the European Parliament in the context of the EU AI Act Omnibus. We provide a first analysis of these amendments, highlighting specific ones that we either welcome or oppose, based on our area of expertise.


  • A Scorecard for the Quality of AI Evaluations

    We have published a working draft of a Quality Scorecard for AI Evaluations, a standards-based framework for assessing the reliability, validity, and rigour of AI evaluations. The scorecard provides structured scoring across five dimensions and a classification system to match evaluations to appropriate governance and deployment contexts.


  • Recommendations on the Digital Omnibus Amendments to the EU AI Act

    We analysed the Commission’s Digital Omnibus proposals for the AI Act, highlighting concerns with Article 6(4) database deletion, Article 75(1) enforcement centralisation, and Article 4a data processing rules, whilst proposing targeted amendments to address critical regulatory gaps.


  • Agentic Product Maturity Ladder V0.1

    MLCommons releases the Agentic Product Maturity Ladder V0.1, a systematic framework defining six progressive maturity levels (R0–R5) for benchmarking AI agent reliability. Initial assessment of four task domains shows no agents yet meet thresholds for product-level capability benchmarking.


  • Safety Frameworks and Standards: A comparative analysis to advance risk management of frontier AI

    This research memo compares Frontier Safety Frameworks (FSFs) with international risk management standards. FSFs offer frontier-specific innovations like capability thresholds but often leave key considerations implicit. Standards provide systematic rigour but weren’t designed for frontier AI. The paper shows how integrating both approaches can advance frontier AI risk management.


  • An Analysis of the GPAI model guidelines published by the European Commission

    On 18 July 2025, the European Commission published its first Guidelines on the scope of obligations for providers of general-purpose AI models under the AI Act. In this post, we provide an analysis of these guidelines. As these obligations for providers go into force on 2 August 2025, we decided that a timely publication of…


  • Deprecating Benchmarks: Criteria and Framework

    As AI models rapidly advance, many benchmarks become outdated or flawed yet continue to be used, inflating performance claims and obscuring safety concerns. This paper introduces criteria and a framework for deprecating inadequate benchmarks, with recommendations for developers, policymakers, and governance actors on how to maintain rigorous evaluation standards.


  • White paper: GPAI model providers and modifiers and their obligations under the AI Act

    Authors: Ze Shen Chin and Koen Holtman. This white paper describes what we think is a possible approach to classifying GPAI model providers and determining their obligations. We may expand on these ideas in the future. Editor’s update: The European Commission published its guidelines on the scope of obligations for providers of general-purpose AI models…
