
EU AI Office proposes external testing for large-scale models in first draft of AI code of conduct

Summary

The European Commission has released the first draft of the Code of Conduct for providers of general-purpose AI (GPAI) models. The code is meant to facilitate applying the EU AI Act's rules to such models, and the Commission can approve it and give it general validity across the EU.

The draft, prepared by independent experts, contains strict requirements for GPAI models with so-called systemic risk. This covers models trained with more than 10^25 FLOPs of compute, a threshold that, as far as we know, GPT-4 has already exceeded. Under the current draft, such models would have to be reported to the EU two weeks before training begins (Sub-Measure 20.1).
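For a sense of scale, training compute is often estimated with the rule of thumb FLOPs ≈ 6 × parameters × training tokens. The sketch below applies that heuristic with made-up model sizes; neither the formula nor the numbers come from the draft code:

```python
# Rough estimate of training compute using the common
# FLOPs ~= 6 * parameters * training tokens heuristic.
# The figures below are illustrative assumptions, not
# disclosed numbers for any specific model.

THRESHOLD_FLOPS = 1e25  # systemic-risk threshold in the draft code

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

# Hypothetical frontier-scale run: 1T parameters, 10T training tokens.
estimate = training_flops(params=1e12, tokens=10e12)
print(f"Estimated training compute: {estimate:.2e} FLOPs")
print("Above systemic-risk threshold:", estimate > THRESHOLD_FLOPS)
```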

Comprehensive Safety Frameworks for AI Systems

The code provides for two central documents: the “Safety and Security Framework” (SSF) and the “Safety and Security Report” (SSR). The SSF is the overarching framework that sets out the basic risk management guidelines. It includes four main components:

  1. Risk identification and analysis with detailed methods for identifying systemic risks
  2. Safety measures such as behavioral modifications of models and protective measures during deployment
  3. Security measures to protect model weights and assets as well as access control
  4. Assessment procedures for continuous review of measures

The SSR, on the other hand, is the concrete documentation tool for each individual model. It contains:

  • Detailed risk analyses before and after implementation of protective measures
  • Assessments of the effectiveness of all safety measures
  • Cost-benefit analyses and scientific method descriptions
  • Internal or external test results

Both documents are closely interlinked: the SSF provides the framework and guidelines according to which the SSRs are created. The SSRs in turn document the concrete implementation and provide insights that are incorporated into updates of the SSF. This interplay is intended to ensure continuous improvement of safety measures.
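To make this relationship concrete, here is a minimal sketch of how the two documents could be modeled as data structures. The field names are hypothetical and merely mirror the components listed above; the draft prescribes documents, not data formats:

```python
from dataclasses import dataclass, field

@dataclass
class SafetySecurityFramework:
    """SSF: a provider's overarching risk-management guidelines."""
    risk_identification_methods: list[str]  # component 1
    safety_measures: list[str]              # component 2, e.g. behavioral modifications
    security_measures: list[str]            # component 3, e.g. protecting model weights
    assessment_procedures: list[str]        # component 4
    version: int = 1

@dataclass
class SafetySecurityReport:
    """SSR: concrete documentation for one individual model."""
    model_name: str
    framework_version: int                  # the SSF version it was written against
    pre_mitigation_risks: str
    post_mitigation_risks: str
    effectiveness_assessments: list[str]
    test_results: list[str]                 # internal or external
    lessons_learned: list[str] = field(default_factory=list)

def fold_back(ssf: SafetySecurityFramework, ssr: SafetySecurityReport) -> None:
    """Feed a report's insights back into the framework, as the draft intends."""
    ssf.assessment_procedures.extend(ssr.lessons_learned)
    ssf.version += 1
```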

EU AI Office to Externally Test GPAI Models with Systemic Risk

A new feature in the draft code: for GPAI models with systemic risk, external tests are to be carried out by the AI Office and third parties. The text states: “Signatories will ensure sufficient independent expert testing before deployment of general-purpose AI models with systemic risk, such as by the AI Office and appropriate third-party evaluators, in accordance with AI Office guidance where available, to more accurately assess risks and mitigations, and to provide assurance to external actors. This may also include a review of appropriate elements of the evidence collected by the Signatory.” (Sub-Measure 17.1)

Such external audits are not yet provided for in the AI Act itself. Recital 114 merely states that providers of GPAI models with systemic risk should carry out risk assessments “including conducting and documenting adversarial testing of models, also, as appropriate, through internal or independent external testing.”

This raises the question of who would even be capable of testing and evaluating the most complex AI models. Does the AI Office have the necessary expertise? The draft code leaves this open for now.

The proposal is also contentious because in-depth testing of complex models inevitably yields far-reaching technical insight into the systems under examination. External evaluators would need the expertise to test leading frontier technology while keeping their findings to themselves.

EU Commission Could Make External Tests Mandatory

The call for external tests in the code could have far-reaching consequences. Under Recital 117 of the AI Act, the EU Commission can declare the code of conduct binding EU-wide by means of an implementing act. This would also give the external tests provided for therein the force of law (at least that is my interpretation).

Alternatively, the Commission could issue its own rules for implementing the obligations if the code is not completed in time or is deemed unsuitable by the AI Office. External auditing could thus become mandatory either through the code or directly through a Commission decision.

This would be a significant tightening compared to the original AI Act. The recitals do provide that providers can demonstrate compliance by “adequate alternative means” if they do not wish to rely on the code, but how that option would work in practice remains unclear.

Strict Copyright Rules and Protection Against Piracy

Another focus of the code is copyright. Providers must put in place a policy to comply with EU copyright law, which includes respecting the reservations of rights holders who do not want their content used to train AI models.

As a technical means to this end, providers should honor the industry standard robots.txt, which lets website operators specify which content crawlers may access. Providers that also operate search engines must not make content harder to find simply because its owner used robots.txt to opt out of AI training.
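In practice, a crawler could check these reservations with Python's standard-library robots.txt parser. The user-agent token and URLs below are hypothetical examples, not names from the draft:

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Minimal sketch of a crawler honoring robots.txt reservations.
# "ExampleAIBot" is a made-up user-agent token; real training crawlers
# publish their own tokens so operators can target them, e.g.:
#   User-agent: ExampleAIBot
#   Disallow: /

def may_crawl_for_training(user_agent: str, url: str) -> bool:
    """Return True only if the site's robots.txt permits this agent."""
    parts = urlparse(url)
    parser = RobotFileParser()
    parser.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    parser.read()  # fetches and parses the site's robots.txt
    return parser.can_fetch(user_agent, url)

if may_crawl_for_training("ExampleAIBot", "https://example.com/article"):
    print("Allowed: content may be collected for training.")
else:
    print("Reserved: rights holder opted out; skip this page.")
```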

In addition, providers should take measures to exclude piracy websites from their crawling activities, for example based on the EU Commission’s “Counterfeit and Piracy Watch List.”
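Such an exclusion could be implemented as a simple domain blocklist in the crawl pipeline; the domains below are placeholders, not entries from the actual watch list:

```python
from urllib.parse import urlparse

# Placeholder blocklist; in practice this would be populated from
# sources such as the Commission's Counterfeit and Piracy Watch List.
PIRACY_BLOCKLIST = {"pirated-books.example", "warez-mirror.example"}

def filter_crawl_targets(urls: list[str]) -> list[str]:
    """Drop URLs whose host appears on the piracy blocklist."""
    return [u for u in urls if urlparse(u).hostname not in PIRACY_BLOCKLIST]

candidates = [
    "https://news.example/article-1",
    "https://pirated-books.example/download/42",
]
print(filter_crawl_targets(candidates))  # keeps only the news URL
```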

Next, the draft code will be discussed with around 1,000 stakeholders in four thematic working groups. Based on their feedback, the experts are to further develop and refine the code.
