How We Built HIPAA, GDPR, and EU AI Act Compliance Into an AI Agent System
Most AI agent platforms treat compliance the way most startups treat security: something to address after product-market fit. In regulated industries, this approach is fatal. Not metaphorically fatal — actually fatal to the deployment. A healthcare AI system that cannot produce a HIPAA-compliant audit trail will never make it past a compliance review, regardless of how impressive its clinical reasoning might be.
When we designed Surrogate OS, we made a foundational decision that shaped every architectural choice that followed: compliance is not a feature to be added. It is the substrate on which everything else is built.
This post is a technical deep dive into how we implemented regulatory compliance across six frameworks, why we made the specific architectural decisions we did, and what we learned along the way.
Why Compliance Cannot Be Phase 2
The temptation to defer compliance is understandable. Regulatory requirements are complex, they vary by jurisdiction and industry, and implementing them properly is expensive in terms of engineering effort. The standard startup playbook says to validate the core value proposition first, then layer on compliance requirements once you have traction.
This playbook fails for AI agents in regulated industries for three specific reasons.
Architectural contamination. Compliance requirements affect data flow, storage patterns, API design, and logging infrastructure. If you build a system without compliance constraints and then try to retrofit them, you end up with a dual architecture — the original design plus a compliance overlay that fights against it at every turn. We have seen this pattern in enterprise software for decades. It produces brittle, expensive-to-maintain systems.
Audit trail integrity. A retroactively added audit trail is inherently suspect from a regulatory perspective. If the logging infrastructure was not present from the beginning, there is no way to prove that historical operations were compliant. Regulators understand this. An audit trail that starts six months after deployment raises more questions than it answers.
Trust erosion. Healthcare systems, financial institutions, and legal organizations make deployment decisions based on trust assessments. If your compliance story is "we are working on it," you will not get past the initial evaluation. These organizations need to see compliance built into the architecture, not bolted onto the side.
The Six Frameworks
Surrogate OS currently implements compliance controls for six regulatory frameworks. Each imposes distinct requirements, and there is meaningful overlap that we exploit for implementation efficiency.
HIPAA (Health Insurance Portability and Accountability Act)
HIPAA governs the handling of Protected Health Information (PHI) in the United States. For an AI agent system, the key requirements are:
- Access controls — Role-based access to any data that could identify a patient. Every access must be authenticated and authorized.
- Audit trails — Every access to, modification of, or transmission of PHI must be logged with timestamp, actor identity, action type, and data involved.
- Minimum necessary standard — AI agents must access only the minimum data required to perform their assigned task.
- Breach notification — The system must detect and report unauthorized access within defined timeframes.
In Surrogate OS, HIPAA compliance is enforced at the SOP node level. Each node in a surrogate's procedure graph that handles PHI is tagged with HIPAA requirements. The runtime engine verifies compliance at each node transition — checking that the accessing surrogate has appropriate authorization, that only minimum necessary data fields are being accessed, and that the audit log entry is written before the operation proceeds.
GDPR (General Data Protection Regulation)
GDPR applies to any system processing personal data of EU residents. The requirements that most affect AI agent systems are:
- Lawful basis for processing — Every data processing operation must have a documented legal basis.
- Data subject rights — Right to access, rectification, erasure, and portability of personal data.
- Data Protection Impact Assessments — Required for high-risk processing activities, which AI-driven decision-making almost always qualifies as.
- Data minimization — Similar to HIPAA's minimum necessary standard but broader in scope.
Our implementation tracks data lineage through the entire surrogate operational pipeline. Every piece of personal data carries metadata indicating its lawful processing basis, retention period, and applicable data subject rights. When a data subject exercises a right (such as erasure), the system can trace every location where that individual's data exists within the surrogate's memory and operational logs.
EU AI Act
The EU AI Act is the most directly relevant regulation for AI agent systems, and its high-risk AI system requirements are substantial:
- Risk management system — Continuous identification and mitigation of risks throughout the AI system lifecycle.
- Data governance — Training, validation, and testing datasets must meet quality criteria.
- Technical documentation — Comprehensive documentation of the system's purpose, capabilities, limitations, and performance metrics.
- Record-keeping — Automatic logging of events throughout the AI system's lifetime.
- Transparency — Users must be informed they are interacting with an AI system.
- Human oversight — The system must support effective human oversight, including the ability to interrupt or override AI decisions.
- Accuracy, robustness, and cybersecurity — Appropriate levels of all three, maintained throughout the system lifecycle.
Surrogate OS implements these requirements through its SOP architecture. The procedure graph serves as the risk management system — each node has defined risk levels, mitigation strategies, and escalation triggers. The transparency requirement is handled at the interface layer, where every surrogate interaction begins with a clear disclosure of AI identity. Human oversight is built into the SOP execution engine, which supports real-time intervention and override at any point in a procedure.
SOC 2, FDA SaMD, and FINRA
The remaining three frameworks — SOC 2 for data security controls, FDA Software as a Medical Device guidelines for clinical AI applications, and FINRA for financial services AI — each add specific requirements that overlap significantly with the three frameworks above. Our implementation treats the six frameworks as a compliance matrix, where shared requirements are implemented once and mapped to all applicable frameworks.
Ed25519 SOP Signing Architecture
The most distinctive compliance mechanism in Surrogate OS is cryptographic SOP signing. Every standard operating procedure — the complete directed acyclic graph of nodes, transitions, validation criteria, and compliance annotations — is signed using Ed25519 digital signatures.
The choice of Ed25519 was deliberate. It provides strong security guarantees with fast signature generation and verification, compact signature sizes (64 bytes), and deterministic signatures that simplify testing and auditing. Unlike RSA, Ed25519 has no known padding oracle attacks and is resistant to timing-based side-channel attacks.
The signing process works as follows:
-
The complete SOP definition is serialized into a canonical JSON representation. We use a deterministic serialization that guarantees identical byte output for semantically identical SOPs, regardless of property ordering or whitespace.
-
The serialized SOP is hashed using SHA-512 (which is part of the Ed25519 specification).
-
The hash is signed using the organization's private key, producing a 64-byte signature.
-
The signature, public key identifier, timestamp, and SOP version are stored alongside the SOP definition.
When a surrogate executes an SOP, the runtime engine verifies the signature before beginning execution. If the signature is invalid — indicating the SOP has been modified since signing — execution is blocked and a compliance alert is generated.
Hash Chaining for Revision History
SOP signing alone proves that a procedure has not been tampered with since it was signed. But regulators also need to verify the history of changes. We implement this through hash chaining.
Each SOP revision includes the cryptographic hash of the previous version in its signed content. This creates a tamper-evident chain:
SOP v1: sign(content_v1)
SOP v2: sign(content_v2 + hash(SOP v1))
SOP v3: sign(content_v3 + hash(SOP v2))
To verify the complete history, you walk the chain backward. If any intermediate version has been modified or deleted, the hash chain breaks at that point. This provides mathematical proof of the complete revision history without requiring trust in any single storage system.
The hash chain also includes the identity of the person who authorized each revision and a timestamp from a trusted time source. This produces a complete chain of custody: who changed what, when, and in what order.
Immutable Audit Trails
Every compliance-relevant action in Surrogate OS generates an audit log entry. These entries are structured, indexed, and immutable.
Immutability is enforced at multiple levels:
- Application level — The audit logging API is append-only. There is no update or delete endpoint.
- Database level — Audit tables use database-level constraints to prevent modification of existing records.
- Integrity verification — Each audit entry includes a hash of the previous entry, creating a hash chain similar to the SOP revision history. Any tampering with historical audit records breaks the chain.
An audit entry includes:
- Timestamp (from a trusted source, not the application clock)
- Actor identity (surrogate ID, user ID, or system process ID)
- Action type (a controlled vocabulary mapped to regulatory categories)
- Resource affected (with appropriate identifiers)
- Outcome (success, failure, or escalation)
- Applicable compliance frameworks (HIPAA, GDPR, etc.)
- Contextual data (request parameters, decision factors, confidence scores)
- Hash chain pointer (hash of the previous entry)
The audit trail is queryable by any dimension — you can retrieve all actions by a specific surrogate, all actions affecting a specific data subject (for GDPR right-of-access requests), all actions within a time window, or all actions flagged under a specific compliance framework.
Bias Detection Architecture
The EU AI Act requires that high-risk AI systems include mechanisms for detecting and mitigating bias. Our bias detection system operates at the operational level, monitoring surrogate behavior for statistical anomalies across demographic dimensions.
The system works through three mechanisms:
Distribution monitoring. For every measurable decision dimension (triage priority, response time, escalation frequency, recommendation type), the system maintains running statistical distributions segmented by available demographic variables. When the distribution for any demographic segment deviates significantly from the overall distribution, a bias alert is generated.
Comparative analysis. The system periodically runs comparative analyses across surrogates performing similar roles. If one surrogate shows different behavioral patterns from its peers when controlling for case mix, this is flagged for human review.
Counterfactual testing. On a configurable schedule, the system generates counterfactual test cases — identical scenarios with only demographic variables modified — and runs them through the surrogate's decision logic. Divergent outcomes indicate potential bias in the underlying decision process.
Bias alerts are routed to designated compliance officers and include the statistical evidence, affected demographic dimensions, time period, and recommended investigation steps. The system does not automatically modify surrogate behavior in response to bias detection — that decision remains with human oversight.
Federated Learning With Differential Privacy
Organizations running multiple surrogates across different locations face a tension: surrogates would benefit from shared operational learnings, but sharing operational data across locations may violate data residency requirements or expose sensitive information.
Surrogate OS addresses this with federated learning using differential privacy.
Federated aggregation. Operational learnings are computed locally at each facility. Only aggregated model updates — not raw data — are transmitted to a central coordinator. The coordinator combines updates from all participating facilities and distributes the combined learning back.
Differential privacy guarantees. Before any local update leaves a facility, calibrated Gaussian noise is added. The noise is calibrated to provide a specific privacy budget (epsilon value), which provides a mathematical guarantee about the maximum information that can be inferred about any individual data point from the aggregated output.
The privacy budget is configurable per organization and per regulatory framework. A deployment subject to HIPAA might use a stricter epsilon than a deployment handling only non-clinical administrative data.
This architecture means a hospital network can run ER triage surrogates at twenty locations, have all of them benefit from the collective operational experience of the network, and provide mathematical proof that no individual patient's data was exposed in the process.
Architecture Decision: Compliance as a Service
One of the most important architectural decisions was how compliance logic relates to the rest of the system. We evaluated three approaches:
Middleware approach — Compliance checks as middleware in the request/response pipeline. Rejected because this creates a bypass risk (any code path that skips the middleware skips compliance) and makes it difficult to enforce compliance at the SOP node level.
Embedded approach — Compliance logic embedded directly in the SOP execution engine. Rejected because it creates tight coupling between operational logic and compliance logic, making it difficult to add new frameworks or modify compliance rules without touching the core engine.
Service approach (selected) — Compliance as an independent service with a well-defined API. The SOP execution engine calls the compliance service at each node transition. The compliance service evaluates the operation against all applicable frameworks and returns an allow/deny/flag decision.
The service approach provides several advantages: compliance rules can be updated independently of the SOP engine, new frameworks can be added without modifying core code, the compliance service can be tested in isolation, and the service boundary provides a natural audit point.
The compliance service maintains its own data store for framework definitions, rule configurations, and audit records. It exposes APIs for compliance verification, audit trail queries, bias monitoring results, and compliance reporting.
What We Learned
Building compliance into an AI agent system from the beginning taught us several things that may be useful to others working in this space.
Compliance requirements are better engineering constraints than they appear. The impulse is to view regulations as obstacles. In practice, requirements like audit trails, access controls, and bias monitoring produce a more robust, observable, and maintainable system. The Surrogate OS codebase is better software because of its compliance architecture, not in spite of it.
Cryptographic primitives are underused in AI governance. The AI governance conversation is heavy on policy and light on mechanism. Cryptographic signing, hash chaining, and verifiable computation provide concrete, mathematically provable governance guarantees. We would like to see more AI platforms adopt these approaches.
Open source and compliance are complementary. The ability to audit source code, verify compliance implementations, and run independent security assessments is enormously valuable in regulated contexts. Open source is not a risk factor for compliance — it is an enabler.
Surrogate OS is available on GitHub under the MIT license. We welcome contributions, particularly from engineers and compliance professionals with domain expertise in healthcare, financial services, and legal technology.