Tag: Co-authored Research Papers
-
Agentic Product Maturity Ladder V0.1
MLCommons releases the Agentic Product Maturity Ladder V0.1, a systematic framework defining six progressive maturity levels (R0–R5) for benchmarking AI agent reliability. An initial assessment of four task domains shows that no agents yet meet the thresholds for product-level capability benchmarking.
-
Safety Frameworks and Standards: A comparative analysis to advance risk management of frontier AI
This research memo compares Frontier Safety Frameworks (FSFs) with international risk management standards. FSFs offer frontier-specific innovations like capability thresholds but often leave key considerations implicit. Standards provide systematic rigor but weren’t designed for frontier AI. The memo shows how integrating both approaches can advance frontier AI risk management.
-
Mapping Industry Practices to the EU AI Act’s GPAI Code of Practice Safety and Security Measures
A comparative analysis of safety and security measures in the EU AI Act’s GPAI Code of Practice against voluntary commitments from leading AI companies. This paper examines public documents from over a dozen companies, including OpenAI, Anthropic, Google DeepMind, Microsoft, Meta and Amazon, to identify industry precedent for Commitments II.1–II.16.
