The largest global AI safety report ever: The International AI Safety Report 2026 was published on 3 February 2026. Led by Turing Award winner Yoshua Bengio and supported by more than 100 experts from more than 30 countries, the report provides a scientific foundation for decision-makers worldwide.
A Report No One Can Ignore
Some reports you read. Others you have to read. The International AI Safety Report 2026 falls into the second category. Not because it is alarmist, but because it does exactly what good science is supposed to do: gather facts, acknowledge uncertainty, and lay a foundation for serious policy.
On 3 February 2026, an international panel of more than 100 AI experts published the second International AI Safety Report. The panel was led by Yoshua Bengio, Turing Award winner and one of the founding figures of modern deep learning. Nominations came from more than 30 countries and international organisations. The result is the largest global collaboration on AI safety ever assembled.
The report deliberately makes no policy recommendations. Instead, it synthesises the scientific evidence around three central questions: what can general-purpose AI (GPAI) do today and how is it evolving, what risks does it bring, and what safeguards exist?
For organisations working with AI, and for anyone trying to make sense of the EU AI Act, this report provides indispensable context.
What Can AI Do Today and Tomorrow?
The foundation for any risk analysis is understanding what AI systems can actually do. The report paints an impressive but also nuanced picture.
AI systems now perform a broad range of tasks: communicating fluently in multiple languages, writing and debugging computer code, generating realistic images and video, and solving graduate-level mathematics problems. Scientists increasingly use GPAI for literature reviews, data analysis, and experimental design. So-called "reasoning" models, which work through multiple solution paths before choosing an answer, are performing better on complex tasks in mathematics, biochemistry, and scientific research.
At the same time, the report is honest about limitations. Models are less reliable when tasks involve many steps. They still produce hallucinations. They struggle with interaction with the physical world. And they perform worse in less common languages and cultural contexts.
AI agents, systems that plan, reason, and use tools to complete tasks autonomously, receive particular attention. Agents have already demonstrated the ability to complete complex software tasks with minimal human oversight, but they cannot yet handle a broad range of complex tasks or sustain long-term planning. For now, the report concludes, agents complement humans rather than replace them.
That "for now" is deliberate. Development is moving fast.
Three Categories of Risk
The report organises emerging risks into three categories: malicious use, malfunctions, and systemic risks. This structure makes it easier to think concretely about what can go wrong and how.
Malicious Use: From Deepfakes to Bioterrorism
The most direct risks come from deliberate harmful intent. On cybersecurity, the report finds that GPAI can help attackers by identifying software vulnerabilities and writing exploit code. Criminal groups and state-affiliated actors are already actively using GPAI in their operations. The current role of AI in attacks is largely limited to preparatory stages, but the scale at which this can occur is growing rapidly.
Biological and chemical risks deserve special attention. The report finds that GPAI systems can provide access to laboratory instructions, help troubleshoot experimental procedures, and lower technical barriers to developing dangerous materials. How much this increases real-world risk remains uncertain because of practical barriers, but the technical threshold is lower. And that is precisely the problem with asymmetric threats: even a marginal reduction in the barrier can have consequences.
The report also documents growing misuse of AI-generated content for scams, fraud, extortion, and the production of non-consensual intimate imagery. Deepfakes are becoming more realistic and harder to detect, and disproportionately target women and girls.
Malfunctions: When AI Gets It Wrong
Not every risk comes from bad intent. Current AI systems can fail unpredictably: fabricating information, producing flawed code, providing misleading medical advice. No combination of current methods eliminates all failures entirely. The report warns that AI agents can compound these reliability risks, because they operate with greater autonomy and human intervention is less straightforward.
The report also addresses scenarios in which AI systems operate outside anyone's control: systems that evade oversight, execute long-term plans, and resist attempts to shut them down. Experts are divided on the likelihood of such scenarios. Current systems show early signs of such behaviour, but are far from capable of it.
Systemic Risks: Broader Societal Effects
The third category concerns broad societal effects. On labour markets, effects so far are mixed: reduced demand for easily substitutable work like writing and translation, and increased demand for complementary skills. Newer research shows no significant effects on overall employment, though junior workers in AI-exposed occupations are vulnerable.
A notable finding concerns human autonomy. The report cites a study finding that clinicians' tumour detection rate during colonoscopy was 6% lower after several months of working with AI assistance. More broadly, "automation bias" is a growing concern: people rely too strongly on AI output, even when it is wrong.
How Do You Manage These Risks?
On risk management, the report describes what exists and is honest about shortcomings. The fundamental challenge: the AI landscape changes rapidly, but evidence about risks and effective mitigations emerges slowly. Acting too early may entrench ineffective interventions; waiting too long leaves society vulnerable.
One approach the report supports is "defense-in-depth": multiple layers of safeguards that together reduce the chance a single failure leads to significant harm. Capability evaluations, technical safeguards, monitoring, and incident response working in combination.
The report notes that 12 companies published or updated Frontier AI Safety Frameworks in 2025. But there is still no unified approach. Documentation, incident reporting, risk registers, and transparency reports exist as separate practices, without a coordinated structure.
Open-weight models present a distinct challenge: their safeguards can be more easily removed, use is harder to monitor, and once released, model weights cannot be recalled.
What Does This Mean for Organisations in the EU?
The report is not an EU AI Act document. It is broader. But for European organisations, it provides an essential lens for understanding the risk logic behind the EU AI Act.
The EU AI Act classifies systems based on risk. That classification is not arbitrary: it is rooted in precisely the kinds of harm this report describes, such as autonomous decision-making in high-risk domains, inadequate transparency, insufficient human oversight, and risky cybersecurity applications. Reading this report, you understand better why the EU AI Act demands what it demands.
Concretely, teams can learn from this report in three areas.
On risk analysis: use the report's three-part framework (misuse, malfunctions, systemic) as a structure for your own risk assessment. Which category of threats is relevant to your specific AI applications?
On cybersecurity and GPAI: if you deploy general-purpose AI in security-sensitive environments, awareness of the dual-use challenge is essential. The same capabilities that help attackers also help defenders, and your policies should address both sides.
On human oversight: the finding about clinicians and colonoscopy is a powerful reminder that human oversight does not happen automatically. Oversight must be designed, not assumed.
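The three-part framework described above can be sketched as a simple risk register. This is a minimal sketch under assumptions: the field names and example entries are hypothetical illustrations, not content from the report; only the three category labels come from the article.

```python
# A minimal risk register organised by the report's three categories
# (malicious use, malfunctions, systemic risks). Example entries and
# mitigations are hypothetical, for illustration only.
from dataclasses import dataclass
from enum import Enum
from collections import defaultdict

class Category(Enum):
    MALICIOUS_USE = "malicious use"
    MALFUNCTION = "malfunction"
    SYSTEMIC = "systemic"

@dataclass
class Risk:
    category: Category
    description: str
    mitigation: str

register = [
    Risk(Category.MALICIOUS_USE, "prompt-injected data exfiltration",
         "input filtering and output monitoring"),
    Risk(Category.MALFUNCTION, "hallucinated citations in reports",
         "human review before publication"),
    Risk(Category.SYSTEMIC, "over-reliance on AI recommendations",
         "periodic AI-free spot checks"),
]

# Group the register by category to check coverage: an empty category
# is a prompt to ask whether that class of threat really does not apply.
by_category: dict[Category, list[Risk]] = defaultdict(list)
for risk in register:
    by_category[risk.category].append(risk)

for category in Category:
    print(f"{category.value}: {len(by_category[category])} risk(s)")
```

The grouping step is the useful part in practice: it turns the report's taxonomy into a coverage check on your own assessment, making gaps in any one category immediately visible.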
The India AI Impact Summit later in February 2026 uses the report as a starting point for international policy discussions. The conclusions are likely to resurface in regulation and standards for years to come.
A Scientific Foundation for Serious Policy
The International AI Safety Report 2026 is neither a doom scenario nor a marketing document. It is a carefully constructed scientific synthesis of what we know, what we do not know, and where the limits of our knowledge lie.
Yoshua Bengio and his panel have accomplished something harder than it looks: assembling global scientific consensus on a technology that moves so fast that science can barely keep up. The result is a reference document that anyone working seriously with AI should know.
You can download the full report here. It is extensive, but the summary and section introductions are accessible and well worth reading.