How the AI Act's scientific advisory layer makes evaluation standards and yardsticks concrete
Important development: The EU AI Act's Scientific Panel, consisting of up to 60 independent experts, will begin advising on GPAI, systemic risks, and evaluation methods in 2026. This scientific advisory layer will help shape model classification, risk thresholds, and testing frameworks in practice.
Why This Advisory Layer Reaches Beyond Technical Expertise
In Brussels, AI governance is being built layer by layer. Alongside the AI Office, which drives the day-to-day implementation of the AI Act, Europe is establishing a scientific advisory layer designed to provide the technical foundation for policy and supervision. Recent analyses confirm that the Scientific Panel is expected to consist of up to 60 independent experts serving two-year terms, who will advise the AI Office from 2026 on general-purpose AI (GPAI), systemic risks, and methods for evaluation and market surveillance. This composition follows the recruitment round that opened in summer 2025. This is precisely where technical depth meets policy: model classification, risk thresholds, and testing frameworks are conceived here before they reach market surveillance and organizations. (Tech Policy Press)
What Exactly Is the Scientific Panel?
The Scientific Panel is a group of independent experts selected by the European Commission to support the AI Office and national authorities in implementing and enforcing the AI Act. The foundation is anchored in the regulation: members are selected based on current scientific and technical expertise, serve in a personal capacity, and must be independent of providers of AI systems or GPAI models. The panel comprises up to 60 experts, with safeguards for geographic distribution and balance. Members are appointed for two years, with the possibility of renewal. (artificialintelligenceact.eu)
Composition and Working Method of the Scientific Panel
Key details:
- Up to 60 independent experts with scientific and technical expertise
- Two-year terms with possibility of renewal
- Personal capacity: no organizational representation
- Independence from providers of AI systems and GPAI models required
- Geographic distribution and balance ensured
- Focus areas: GPAI, systemic risks, evaluation methods, market surveillance
In June 2025, the Commission published an official call for candidates. The accompanying Q&A explained that the panel will support implementation and enforcement, focusing on GPAI, evaluation methodologies, cross-border market surveillance, and emerging risks. With the application deadline closed in September 2025, selection and appointment are expected to follow toward 2026, when the first opinions are also anticipated. (digital-strategy.ec.europa.eu)
Why This Layer Matters in the Brussels Architecture
The AI Act introduces several governance layers. The AI Office serves as the executive core within the Commission, now with more than 125 staff members and further growth expected. Alongside it sit an AI Board with representatives of the Member States and an Advisory Forum for stakeholders. The Scientific Panel adds a technical-scientific pillar whose steering is not political but methodological and substantive. The goal is consistency: the same terminology, the same testing methods, and the same burden of proof across the Union. (digital-strategy.ec.europa.eu)
It is precisely on these points that fragmentation is currently the greatest counterforce. For providers and deployers, the difference between "ready" and "not ready" is often not ambition, but whether clear and reproducible evaluation frameworks exist.
Three gaps the Scientific Panel can bridge:
- From law to practice: Translating broad legal obligations into concretely testable requirements
- Common language: A uniform conceptual framework for risks currently experienced as heterogeneous
- Academic to operational: A bridge between academic state-of-the-art and the pragmatics of supervision and product development
Recent reporting emphasizes that the panel explicitly focuses on GPAI, systemic risks, and evaluation methods. This reduces room for divergent interpretations in sector-specific applications. (Tech Policy Press)
From Principles to Yardsticks: What Changes in Practice
Those working with foundation models or GPAI know how difficult it is to translate abstract due-diligence obligations into demonstrable conformity. Consider the question of which evaluations are sufficient to substantiate model behavior. The panel becomes the forum where such questions are operationalized. In practice, three developments can be expected.
First, expect a set of reference frameworks for evaluation. Not as separate benchmarks, but as coherent methodologies aligned with the risk-based approach in the law. A model classification that looks not only at input-output behavior, but also at modality, scale, adaptability, and context of use, requires different evidence than is currently standard. This demands datasheets that go beyond dataset inventories and better document the traceability, repeatability, and edge cases of evaluations. Expect guidance toward reproducible experiments, including protocols for red teaming, capability discovery, and stress tests.
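To make "reproducible experiments" concrete, here is a minimal sketch in Python of what a self-describing evaluation-run record could look like. All names and fields are illustrative assumptions, not a prescribed format; the point is that model version, data hash, seed, and configuration travel together and can be fingerprinted for an audit trail.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict

@dataclass
class EvaluationRun:
    """One reproducible evaluation run: everything needed to repeat it.
    Field names are illustrative, not a regulatory format."""
    model_id: str        # e.g. "example-org/example-model"
    model_version: str   # exact checkpoint or API revision evaluated
    dataset_ref: str     # name or URI of the evaluation dataset
    dataset_sha256: str  # hash of the data actually used
    protocol: str        # e.g. "red-teaming", "capability-discovery"
    seed: int            # random seed, so runs can be repeated exactly
    config: dict = field(default_factory=dict)  # decoding params, limits, etc.

    def fingerprint(self) -> str:
        """Stable hash over the full run description, for audit trails."""
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

run = EvaluationRun(
    model_id="example-org/example-model",
    model_version="2025-09-01",
    dataset_ref="internal/adversarial-prompts-v3",
    dataset_sha256="3f2a...",  # placeholder
    protocol="red-teaming",
    seed=42,
    config={"temperature": 0.0, "max_tokens": 512},
)
print(run.fingerprint())  # log this next to the results
```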
Second, systemic risks get a workable threshold. Until now, "systemic" has often been used associatively, for example for models that are widely deployed or drive an ecosystem. But for supervision to work, a testable profile is needed: which capabilities, which scale indicators, which dependencies, which potential amplification mechanisms, and which externalities. An advisory framework from the Scientific Panel can help quantify thresholds, including indicators for monitoring in production environments. Recent analyses frame it this way: the panel is precisely where those thresholds receive methodical elaboration. (Tech Policy Press)
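What such a testable profile might look like, as a hedged sketch: the indicators and threshold values below are illustrative placeholders, except that the 10^25 FLOP training-compute trigger echoes the presumption already written into the AI Act; the panel's actual framework may look quite different.

```python
from dataclasses import dataclass

@dataclass
class SystemicRiskProfile:
    """Hypothetical indicator set for one GPAI model."""
    training_compute_flop: float       # scale indicator
    monthly_active_deployments: int    # breadth of downstream use
    critical_sector_integrations: int  # dependencies in e.g. health, energy
    amplification_channels: int        # e.g. autonomous tool use, open APIs

# Illustrative thresholds; only the compute figure mirrors the AI Act's
# 10^25 FLOP presumption for systemic risk, the rest are invented.
HYPOTHETICAL_THRESHOLDS = {
    "training_compute_flop": 1e25,
    "monthly_active_deployments": 10_000,
    "critical_sector_integrations": 3,
    "amplification_channels": 2,
}

def exceeded_indicators(profile: SystemicRiskProfile) -> list:
    """Names of indicators at or above their (illustrative) thresholds."""
    return [
        name for name, limit in HYPOTHETICAL_THRESHOLDS.items()
        if getattr(profile, name) >= limit
    ]
```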
Third, market surveillance becomes more predictable. National authorities differ in experience with AI evaluations and model inspections. A shared methodology set, co-designed by the Scientific Panel, makes it easier to achieve cross-border consistency. This applies not only to GPAI providers but also to high-risk applications where third parties integrate models into products or services. The expectation is that the panel will develop formats for reports that supervisors in all Member States can read and reuse. Such formats also require a clear separation between confidential model information and publicly accountable disclosure, so innovation can continue without supervision becoming toothless. The Commission has explicitly pointed to contributions to evaluation methodologies and cross-border supervision in the panel's recruitment materials. (digital-strategy.ec.europa.eu)
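The split between confidential model information and public accountability could be enforced structurally in the report format itself. A minimal sketch, assuming hypothetical field names; actual formats will have to come from Brussels:

```python
from dataclasses import dataclass, asdict

# Fields reserved for supervisors; everything else is publishable.
CONFIDENTIAL_FIELDS = {"training_data_sources", "red_team_findings_raw"}

@dataclass
class EvaluationReport:
    model_id: str
    evaluation_summary: str     # publicly accountable
    metrics: dict               # aggregate scores, publishable
    training_data_sources: str  # confidential
    red_team_findings_raw: str  # confidential

    def view(self, audience: str) -> dict:
        """'public' strips confidential fields; 'supervisor' keeps all."""
        data = asdict(self)
        if audience == "public":
            return {k: v for k, v in data.items() if k not in CONFIDENTIAL_FIELDS}
        return data
```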
The Position Relative to Norms and Standards
The opinions of the Scientific Panel do not stand alone. In European practice, regulation, harmonized standards, and supervisory guidance work together. The panel's opinions can thus form the bridge between the open norms in the AI Act and the technical implementation through standards under CEN/CENELEC and international standards. Where a standard specifies a process or measurement method, the panel can explain which method fits which risk contour. This makes it easier to connect with the "presumption of conformity" once harmonized standards become available. Several reports from this fall emphasize that precisely this coupling between method and risk profile will take shape in the months leading up to 2026. (Tech Policy Press)
Timeline: From Call to Influence
Key Milestones for the Scientific Panel:
- June 2025: official call for candidates published
- September 2025: application deadline closed
- Toward 2026: selection and appointment of the panel
- 2026: first opinions expected
This timing aligns with the phased entry into force of the AI Act and the growth of the AI Office. For organizations, this means 2025 is the year of preparation and 2026 is the year when a recognizable line in evaluations and reporting becomes visible. (digital-strategy.ec.europa.eu)
What This Means for Foundation Model Providers
For GPAI providers, a clearer playing field emerges. Where much interpretation is still needed on which capability evaluations suffice, the panel is expected to set priorities: which risks to address first, which experiments count as the minimum, and which documentation can be reused. Benchmarking gains more coherence, with emphasis on the explainability of measurements and the prevention of metric-gaming. More importantly, the conversation with supervisors becomes more substantive: reproducible test results, not marketing claims, will soon form the starting point. Recent reporting emphasizes that this is the arena where evaluation standards and yardsticks truly take shape. (Tech Policy Press)
At the same time, it is wise to anticipate questions about systemic risks. Models with broad downstream impact will need to demonstrate how they limit risk amplification. Think of mechanisms for capability containment, policies for model enrichment in the chain, and procedures for timely correction of harmful emergent properties. The panel's opinions are expected to provide guidance on threshold values and what "demonstrating" means in practice.
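What "timely correction" could mean operationally, as an illustrative sketch: monitor the rate of harmful outputs in production against the baseline measured during evaluation and trigger a correction procedure on significant drift. The 2x tolerance below is a placeholder, not a regulatory value.

```python
def should_trigger_correction(harmful_rate: float,
                              baseline_rate: float,
                              tolerance: float = 2.0) -> bool:
    """Flag a correction procedure when the observed rate of harmful outputs
    drifts well above the rate measured during pre-deployment evaluation."""
    return harmful_rate > tolerance * baseline_rate

# Evaluation measured 0.1% harmful outputs; production now shows 0.35%.
print(should_trigger_correction(harmful_rate=0.0035, baseline_rate=0.001))  # True
```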
What This Means for Deployers in Public and Private Sectors
Deployers mainly gain predictability. If evaluation methods and reporting formats become more uniform, internal assessments, such as AI impact assessments or procurement files, can better align with supervisors' expectations. This helps with tenders, vendor due diligence, and accountability toward management and society. Moreover, a common conceptual framework increases the transferability of audit findings, so lessons learned travel between sectors more quickly.
For healthcare, education, mobility, and safety-critical domains, this delivers concrete advantage. There, the pressure to show that evaluations are robust and repeatable is highest. A European methodology set, supported by the Scientific Panel and promoted by the AI Office, reduces the chance of divergent requirements in different Member States. The Commission has explicitly stated that the panel is also intended to support enforcement and increase consistency. (digital-strategy.ec.europa.eu)
How to Prepare Now
Three Preparation Steps for Organizations
1. Inventory your model and use-case portfolio
Start with an overview in light of GPAI and high-risk obligations. Map which evaluations you already have, which are reproducible, and which gaps exist. Document which datasets, prompts, adversarial scenarios, and red-teaming results you use. Ensure your experimental setup is repeatable and that you properly log versions, configurations, and boundary conditions.
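Such a gap analysis can be automated over the inventory. A minimal sketch with an invented artifact list and portfolio; which artifacts will actually be required is an assumption pending panel guidance:

```python
# Assumed minimum set of artifacts for a repeatable evaluation (illustrative).
REQUIRED_ARTIFACTS = [
    "model_version", "dataset_hash", "prompt_set",
    "adversarial_scenarios", "red_team_report", "seed",
]

# Illustrative portfolio: per use case, the evaluation artifacts on file.
portfolio = {
    "chatbot-frontend": {"model_version": "v4", "dataset_hash": "sha256:...", "seed": 7},
    "doc-summarizer": {"model_version": "v2"},
}

def evaluation_gaps(portfolio: dict) -> dict:
    """Per use case, list the artifacts still missing."""
    return {
        name: [a for a in REQUIRED_ARTIFACTS if a not in artifacts]
        for name, artifacts in portfolio.items()
    }

for use_case, missing in evaluation_gaps(portfolio).items():
    print(f"{use_case}: missing {missing if missing else 'nothing'}")
```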
2. Build a documentation layer with European terminology
The same term can mean slightly different things in internal documents, vendor documentation, and supervisory reporting. Work toward an internal dictionary aligned with the concepts used by the AI Office and the Scientific Panel. Create a reporting skeleton that you can later fill according to formats coming from Brussels.
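What such a dictionary and skeleton could look like, as a sketch: the left-hand internal terms are invented examples, the right-hand terms are actual AI Act vocabulary, and the skeleton's sections are assumptions to be replaced once formats arrive from Brussels.

```python
# Left: illustrative internal jargon. Right: AI Act terminology.
TERMINOLOGY = {
    "ML system": "AI system",
    "foundation model": "general-purpose AI model (GPAI)",
    "vendor": "provider",
    "internal user": "deployer",
}

# Reporting skeleton with placeholder sections, to be filled per future formats.
REPORT_SKELETON = {
    "model_identification": None,
    "intended_purpose": None,
    "risk_classification": None,
    "evaluation_methodology": None,
    "results_and_limitations": None,
    "post_market_monitoring": None,
}

def normalize(text: str) -> str:
    """Replace internal jargon with AI Act terminology before reporting."""
    for internal, official in TERMINOLOGY.items():
        text = text.replace(internal, official)
    return text

print(normalize("The vendor evaluated the foundation model for the internal user."))
```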
3. Actively follow the selection and appointment process
The call and Q&A provide a good picture of scope and expectations. Once the first work programs or consultations appear, you will want to respond quickly. Consider submitting practical cases or sharing evaluation results representative of your domain.
If you already work with external auditors or technical due diligence, involve them in designing your evaluation setup. They know which questions recur and which evidence makes the difference. The Commission has clearly outlined the scope: contributions to evaluation methods, GPAI advice, and cross-border supervision. There lie concrete hooks for organizations to share knowledge and provide feedback. (digital-strategy.ec.europa.eu)
The Undercurrent: One European Measurement Culture
Those mapping the governance architecture in Brussels see a clear movement. The AI Office builds capacity and drives implementation. The AI Board keeps Member States aligned. The Advisory Forum brings stakeholder experience in. And the Scientific Panel adds the necessary methodological backbone. The joint goal is not more paper, but less noise. One measurement culture, so providers know what to demonstrate and supervisors know what to expect.
For foundation model providers, this is the moment to invest in evaluation discipline. For deployers in sectors with high expectations, this is the moment to recalibrate internal governance toward reproducible tests and clear reporting. 2026 then becomes not the year when everyone must reinvent how to test, but the year when Europe finally makes explicit how quality and risk in AI become visible and discussable. The contours are now clear in official announcements and recent analyses. (digital-strategy.ec.europa.eu)
Key message for organizations: Begin now with establishing reproducible evaluation processes and documentation aligned with European terminology. When the Scientific Panel starts issuing opinions in 2026, you can then directly align with uniform methodologies instead of having to retrofit afterwards.
Sources and Further Reading
- Tech Policy Press: Europe's Advanced AI Strategy Depends on a Scientific Panel: Who Will Make the Cut? - Analysis of the role and composition of the Scientific Panel
- Artificial Intelligence Act EU: Article 68: Scientific Panel of Independent Experts - Legal basis and tasks of the panel in the AI Act
- European Commission: Commission seeks experts for AI Scientific Panel - Official announcement of call for experts (June 16, 2025)
- European Commission: Questions and answers (Q&A) on the call for establishment of AI Scientific Panel - Explanatory Q&A on the call
- European Commission: European AI Office | Shaping Europe's digital future - Information about the AI Office and its growth
- Tech Policy Press: Global Digital Policy Roundup: September 2025 - Overview of developments in AI governance and standards