AI has already remade how organizations build software and run business operations. Now it’s moving into power grids, pipelines, water treatment plants, and transportation systems: places where a bad decision doesn’t just interrupt work but affects real equipment and real people.
A new joint guidance document from NSA, CISA, Australia’s ACSC, and several international partners tries to get ahead of that shift. Its main argument is simple: AI in OT isn’t an extension of IT. It’s part of the operational stack, and failures can trigger equipment damage, outages, or real-world safety incidents. OT was built for stability; AI was built to adapt. Bridging that gap is the work ahead.
The New AI-Native Threat Surface
AI introduces weaknesses that don’t look like the software flaws OT teams are used to patching. Models can drift, misinterpret noisy inputs, or lean on training data that never reflected real operating conditions. Those same weaknesses also give attackers room to operate, because they don’t need a code bug if they can influence the data a model learns from or the sensor values it relies on.
Small data manipulations can have outsized consequences. Feed a model slightly biased pressure readings and it may “correct” a stable system into an unstable one. Autonomous decision loops can make that risk even more dangerous. Once AI is allowed to act without a human in the chain, a poisoned input can become an operational event. Recent AI-native issues—such as vulnerabilities where models became unintended access paths—show how easily behavior can diverge from design intent. In OT environments, that divergence lands on physical equipment.
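To make that concrete, here is a minimal, purely illustrative Python sketch (not drawn from the guidance, and with made-up numbers): a simple proportional controller that trusts a slowly biased pressure reading will steadily push the real process away from its setpoint while every value it sees still looks healthy.

```python
# Illustrative only: a proportional controller tracking a pressure setpoint.
# A small, slowly growing bias injected into the sensor reading makes the
# controller "correct" the true process away from the setpoint, even though
# each individual reading still looks plausible.

SETPOINT = 100.0   # desired pressure (arbitrary units)
GAIN = 0.5         # proportional gain

true_pressure = 100.0
for step in range(60):
    bias = 0.05 * step                       # adversarial drift: +0.05 per step
    measured = true_pressure + bias          # what the controller/model sees
    correction = GAIN * (SETPOINT - measured)
    true_pressure += correction              # actuator acts on the real process

    if step % 15 == 0:
        print(f"step={step:2d}  measured={measured:6.2f}  true={true_pressure:6.2f}")

# The measured value hovers near the setpoint, so dashboards look healthy,
# while the true pressure is steadily pushed away from it.
```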
“We have seen evidence where AI models can degrade or behave unexpectedly over time, particularly in OT environments where the consequences of bad decisions can cause physically material outcomes,” said Denis Calderone, CRO & COO at Suzu Labs. “A miscalibrated algorithm in a financial system costs money, while a miscalibrated algorithm controlling industrial processes can cost lives.”
Principle 1: Understand AI and Its Unique Risk Profile
The first principle in the guidance calls for broad AI literacy. Operators, engineers, and leadership all need enough familiarity with AI behavior to recognize when a system is sliding into unsafe territory. Model drift is a good example. In an IT setting, drift might mean a recommender goes off target. In OT, drift can nudge a temperature-control process several percentage points away from normal, with no obvious alert until the deviation becomes dangerous.
Opaque decision-making adds another layer of uncertainty. Traditional control systems offer traceable logic, but deep-learning models don’t. When a model drives a process out of bounds, operators need to understand what might have caused it even if they can’t read the model’s internal reasoning.
Training data further compounds the problem. OT data is messy, incomplete, and sometimes unreliable. If that becomes the foundation for an operational model, the system inherits those flaws. And if adversaries influence the data pipeline, the model can fail in ways that bypass traditional defenses.
Principle 2: Consider Whether AI Belongs in OT at All
The second principle asks a question many organizations skip: does this process actually need AI? Efficiency claims are easy to make, but OT environments depend on predictable, repeatable behavior. AI doesn’t work that way, and introducing it into systems built on fixed logic can create new risks without delivering meaningful gains.
The guidance pushes operators to justify each use case. If the goal isn’t clear, or if the failure modes outweigh the benefits, AI doesn’t belong in the loop. In many cases, that means keeping AI outside the control layer altogether. Let models analyze OT data, flag anomalies, or predict maintenance needs, but keep final actions under deterministic control logic. That separation preserves safety and still gives operators the advantage of AI-driven insight.
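The sketch below, with assumed limit values and function names, shows one way that separation could look in practice: the model only proposes a setpoint, and deterministic logic clamps the proposal to engineering limits and a maximum rate of change before anything reaches the actuator.

```python
# Hypothetical sketch of the separation described above: an AI model may only
# *recommend* a setpoint; a deterministic gate enforces hard engineering limits
# and rate-of-change rules before the value reaches the actuator.

SAFE_MIN, SAFE_MAX = 60.0, 80.0   # assumed engineering limits (deg C)
MAX_STEP = 1.5                    # assumed max allowed change per control cycle

def gate_setpoint(current: float, proposed: float) -> float:
    """Deterministic control logic: clamp the AI proposal to safe bounds
    and limit how fast the setpoint can move."""
    clamped = min(max(proposed, SAFE_MIN), SAFE_MAX)
    delta = max(min(clamped - current, MAX_STEP), -MAX_STEP)
    return current + delta

# Example: the model proposes an aggressive jump; the gate only permits a
# bounded, gradual move that plant logic and operators can reason about.
current_setpoint = 70.0
ai_proposal = 95.0   # out of bounds and far too large a jump
print(gate_setpoint(current_setpoint, ai_proposal))   # -> 71.5
```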
Principle 3: Governance and Assurance Frameworks Must Come First
Before any model reaches production, organizations need governance that looks more like safety engineering than standard MLOps. OT requires documented testing, strict change control, and auditable model behavior.
Validation is the starting point. Models that perform well on historical data can behave unpredictably when exposed to real-world noise. OT teams already know how to test equipment under stress; AI needs the same treatment. Version control also matters. Models evolve as data shifts, and an unnoticed update can alter how a system responds. Operators need a clear lineage of what model was used, when it changed, and why.
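A minimal sketch of that lineage, assuming a simple local JSONL audit log and placeholder file names, might record a hash of the deployed artifact along with when it changed, why, and who approved it.

```python
# Hypothetical sketch of the lineage described above: an append-only record of
# which model version is deployed, when it changed, and why.

import hashlib
import json
import time
from pathlib import Path

LINEAGE_LOG = Path("model_lineage.jsonl")   # assumed local audit log

def record_deployment(model_path: str, reason: str, approved_by: str) -> dict:
    """Append an entry with a hash of the model artifact to the lineage log."""
    digest = hashlib.sha256(Path(model_path).read_bytes()).hexdigest()
    entry = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "model_file": model_path,
        "sha256": digest,
        "reason": reason,
        "approved_by": approved_by,
    }
    with LINEAGE_LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

# Example usage (file name and reason are placeholders):
# record_deployment("boiler_model_v3.onnx", "retrained on Q3 sensor data", "ops-lead")
```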
Continuous monitoring completes the framework. A model may behave safely on day one and drift months later. Without tracking its inputs and outputs over time, operators may miss early warning signs that matter in OT.
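As a rough illustration, and assuming thresholds a plant would have to tune for itself, a monitoring hook could compare the rolling average of a model’s outputs against the range it was validated on and raise a flag when the two diverge.

```python
# Illustrative sketch (assumed window size and threshold): a rolling check that
# flags when a model's recent outputs drift away from its validation baseline.

from collections import deque
from statistics import mean

class DriftMonitor:
    def __init__(self, baseline_mean: float, baseline_std: float,
                 window: int = 200, z_limit: float = 3.0):
        self.baseline_mean = baseline_mean
        self.baseline_std = baseline_std
        self.recent = deque(maxlen=window)
        self.z_limit = z_limit

    def observe(self, value: float) -> bool:
        """Return True when the rolling average sits more than z_limit baseline
        standard deviations away from the validation baseline."""
        self.recent.append(value)
        if len(self.recent) < self.recent.maxlen:
            return False                      # not enough data yet
        drift = abs(mean(self.recent) - self.baseline_mean)
        return drift / max(self.baseline_std, 1e-9) > self.z_limit

# Feed every model output (or key input) through observe(); a True result is a
# cue to alert operators or drop to deterministic fallback, not to self-heal.
```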
Principle 4: Design for Safety, Security, and Human Oversight
Even with guardrails, AI shouldn’t be left to operate OT systems on its own. Human oversight must remain central. Operators need the authority to override model decisions, pause automation, or shut systems down when behavior looks wrong. Human experience still counts, as a seasoned operator can spot a suspicious reading faster than any dashboard.
Systems also need predictable failure behavior. If a model faults, produces conflicting outputs, or starts working from suspect data, the process should fall back to a safe state such as manual control, a frozen setting, or an isolated observation mode, and that fallback should buy operators time to step in. AI shouldn’t have a straight path to high-impact controls; it needs to move through hardened layers that can stop or slow bad decisions.
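Here is one hedged sketch of that fallback behavior, with assumed names and thresholds: a supervisory wrapper that holds the last known-good setting whenever the model’s outputs are missing, conflicting, or based on stale data, and tells the operator why.

```python
# Hypothetical sketch of the fallback behavior described above: if the model
# returns nothing, returns conflicting values, or its input looks stale, the
# supervisor holds the last known-good setting and reports the reason.

LAST_GOOD_SETPOINT = 70.0          # assumed safe "frozen" setting
MAX_DISAGREEMENT = 2.0             # allowed spread between redundant model outputs
MAX_INPUT_AGE_S = 5.0              # data older than this is treated as stale

def supervise(model_outputs: list[float], input_age_seconds: float) -> tuple[float, str]:
    """Return (setpoint, mode). Falls back to the frozen setting on any doubt."""
    global LAST_GOOD_SETPOINT
    if not model_outputs or input_age_seconds > MAX_INPUT_AGE_S:
        return LAST_GOOD_SETPOINT, "fallback: stale or missing data"
    if max(model_outputs) - min(model_outputs) > MAX_DISAGREEMENT:
        return LAST_GOOD_SETPOINT, "fallback: conflicting model outputs"
    LAST_GOOD_SETPOINT = sum(model_outputs) / len(model_outputs)
    return LAST_GOOD_SETPOINT, "normal"

print(supervise([70.4, 70.6], input_age_seconds=1.0))   # -> (70.5, 'normal')
print(supervise([66.0, 74.0], input_age_seconds=1.0))   # -> falls back to 70.5
```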
“AI is making us faster, more efficient, and hopefully safer,” said Trey Ford, Chief Strategy and Trust Officer at Bugcrowd. “The ‘what’ and ‘why,’ however, are the challenges we need to focus on when asking, ‘Is this implementation fit for purpose?’ Specifically, when we give automated agents autonomy (true agency), we need to operationalize how humans stay in the loop, tune, and troubleshoot these capabilities over time.”
What Operators Should Do Now
The guidance leaves operators with a few things they can tackle right away. First, take stock of the environment. Most OT teams already wrestle with incomplete asset maps and old equipment, so an AI readiness check forces everyone to see where the data actually comes from and which processes are brittle.
From there, pull the right people into the same lane. OT, IT, and the data folks rarely make decisions together, but AI cuts across all three. Data quality needs attention too. If the inputs are unreliable—or worse, manipulated—any model you build on top of them will behave unpredictably. And whatever model makes it into the environment should be treated with the same caution you’d apply to any high-impact operational component. Test it under stress, keep track of which version is running, and watch it closely once it’s live.
AI Will Redefine Infrastructure—If We Build It Responsibly
AI will eventually play a large role in how infrastructure operates. It can help plants adapt faster, detect equipment failures earlier, and manage systems that have grown too complex for manual oversight alone. But the benefits only hold if safety comes first. This guidance marks the beginning of a shift toward AI-specific operational standards—rules that sit alongside today’s engineering and safety requirements. The challenge now is turning those principles into practice before AI becomes another weak point in environments that can’t afford surprises.