AI Governance · 9 min read

EU AI Act Technical Readiness: What Developers Need to Know Before August 2026

A practical engineering guide to the EU AI Act: risk tier classification, high-risk system requirements, and concrete implementations for logging, transparency, and deployment gating.

Why August 2026 Is A Real Deadline

The EU AI Act entered into force in August 2024. Provisions for prohibited practices took effect in February 2025. General-purpose AI model obligations started in August 2025. And the full set of high-risk system requirements becomes enforceable on 2 August 2026. That is the date most product teams should be paying attention to, because it is the one that imposes the biggest engineering burden.

This post is not legal advice. We have lawyers for that; you should too. What this post covers is what the regulation asks of software systems in language an engineer can act on, and how to implement the common obligations without turning your codebase into a compliance wasteland.

The Four Risk Tiers

The Act classifies AI systems into four tiers, each with different obligations.

Tier | Definition | What you must do
--- | --- | ---
Unacceptable | Social scoring, real-time biometric ID in public, manipulative systems | Prohibited: don't build it
High-risk | Use cases listed in Annex III (employment, credit, education, critical infrastructure, law enforcement, etc.) | Full set of requirements: documentation, oversight, logging, monitoring
Limited risk | Chatbots, generative AI, emotion recognition | Transparency obligations
Minimal risk | Everything else | Voluntary codes of conduct

Most startup teams will fall into one of two categories: limited-risk for a customer-facing LLM feature, or high-risk if you touch HR, credit, education, or other Annex III domains. If you are in any doubt, err toward high-risk.

High-Risk: What It Actually Demands

A high-risk AI system must, under Articles 9-15 and the associated provider obligations, implement at minimum:

  1. A risk management system (ongoing, documented, updated as the system evolves).
  2. Data and data governance practices — quality, representativeness, bias analysis.
  3. Technical documentation sufficient to allow an auditor to assess conformity.
  4. Record-keeping (automated logs throughout the lifecycle).
  5. Transparency and information to deployers.
  6. Human oversight — the system must be designed so that natural persons can meaningfully supervise it.
  7. Accuracy, robustness, cybersecurity.
  8. A quality management system.
  9. Post-market monitoring.
  10. Incident reporting for serious incidents.

Let's walk through the ones that show up in code.

Record-Keeping: Prompt And Response Logging

Article 12 mandates automatic logging throughout the operation of a high-risk AI system. For an LLM application this means every prompt, every response, every tool call, every decision. The logs must allow tracing the system's behavior back to inputs.

A minimal logging middleware for a FastAPI app:

from __future__ import annotations

import hashlib
import time
import uuid
from datetime import datetime, timezone

from fastapi import FastAPI, Request
from pydantic import BaseModel, Field
import structlog

logger = structlog.get_logger()

app = FastAPI()


class LLMInteraction(BaseModel):
    interaction_id: str = Field(default_factory=lambda: str(uuid.uuid4()))
    timestamp_utc: str = Field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    system_id: str
    system_version: str
    model: str
    model_version: str
    user_id_hash: str
    input_hash: str
    input_tokens: int
    output_tokens: int
    latency_ms: int
    safety_flags: list[str] = Field(default_factory=list)
    human_review_requested: bool = False


def hash_user_id(raw: str) -> str:
    salt = b"eu-ai-act-log-salt-v1"
    return hashlib.sha256(salt + raw.encode()).hexdigest()[:16]


@app.middleware("http")
async def log_llm_interaction(request: Request, call_next):
    start = time.monotonic()
    response = await call_next(request)
    latency = int((time.monotonic() - start) * 1000)

    # The /v1/generate handler is expected to have set input_hash, token
    # counts, and any safety flags on request.state before returning
    # (see the endpoint sketch below).
    if request.url.path.startswith("/v1/generate"):
        interaction = LLMInteraction(
            system_id="hr-screening",
            system_version=request.app.state.version,
            model="claude-sonnet-4.6",
            model_version="20260115",
            user_id_hash=hash_user_id(request.headers.get("x-user-id", "anon")),
            input_hash=request.state.input_hash,
            input_tokens=request.state.input_tokens,
            output_tokens=request.state.output_tokens,
            latency_ms=latency,
            safety_flags=getattr(request.state, "safety_flags", []),
        )
        logger.info("llm.interaction", **interaction.model_dump())
    return response

Two notes. First, we hash user identifiers with a salt — you are not required to pseudonymize in the logs themselves, but you almost always want to for GDPR reasons. Second, we log token counts rather than raw content. The Act requires record-keeping but not necessarily full content retention; you can store hashes or summaries if you have a policy that says so. Pick one and document it.
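
For completeness, here is the endpoint side of that contract, continuing the app from the middleware example above. GenerateRequest and run_model are illustrative stand-ins, and the whitespace token counts are placeholders for your tokenizer:

from pydantic import BaseModel


class GenerateRequest(BaseModel):
    prompt: str


@app.post("/v1/generate")
async def generate(request: Request, body: GenerateRequest) -> dict[str, str]:
    # Populate the fields the logging middleware reads from request.state.
    request.state.input_hash = hashlib.sha256(body.prompt.encode()).hexdigest()[:32]
    request.state.input_tokens = len(body.prompt.split())  # placeholder count
    completion = await run_model(body.prompt)  # stand-in for the real model call
    request.state.output_tokens = len(completion.split())  # placeholder count
    request.state.safety_flags = []
    return {"text": completion}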

Retention

There is no single retention number in the Act. Article 19 requires providers to keep the automatically generated logs "for a period appropriate to the intended purpose" of the system, and for at least six months unless other EU or national law provides otherwise. We default clients to 12 months in hot storage, 5 years in cold storage, with documented deletion on user request for personal data components.
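
One way to make that policy auditable is to encode it in version control next to the model card. This RetentionPolicy model is our own convention, not a format the Act prescribes:

from __future__ import annotations

from pydantic import BaseModel, Field


class RetentionPolicy(BaseModel):
    # Article 19 sets a floor of roughly six months; every other number
    # here is internal policy, not a figure from the Act.
    hot_storage_days: int = Field(default=365, ge=183)
    cold_storage_days: int = 365 * 5
    delete_personal_data_on_request: bool = True
    deletion_sla_days: int = 30  # internal SLA, our choice
    policy_document_url: str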

Technical Documentation: The Model Card

Annex IV lists what the technical documentation must cover. For a model-backed system the most efficient format is a model card kept in version control next to the code.

A Pydantic model registry entry that doubles as a model card:

from __future__ import annotations

from datetime import date
from typing import Literal

from pydantic import BaseModel, Field, HttpUrl


class IntendedPurpose(BaseModel):
    description: str
    deployers: list[str]
    geographic_scope: list[str]
    not_intended_for: list[str] = Field(default_factory=list)


class DataGovernance(BaseModel):
    training_data_sources: list[str]
    training_data_licensing: list[str]
    bias_evaluation: str
    bias_metrics: dict[str, float]
    pii_handling: str


class HumanOversight(BaseModel):
    oversight_type: Literal["human-in-the-loop", "human-on-the-loop", "human-in-command"]
    override_mechanism: str
    escalation_path: str


class PostMarketMonitoring(BaseModel):
    metrics: list[str]
    alerting: str
    review_cadence_days: int


class ModelCard(BaseModel):
    system_id: str
    name: str
    provider: str
    version: str
    risk_tier: Literal["unacceptable", "high", "limited", "minimal"]
    annex_iii_category: str | None = None
    intended_purpose: IntendedPurpose
    data_governance: DataGovernance
    human_oversight: HumanOversight
    monitoring: PostMarketMonitoring
    known_limitations: list[str]
    accuracy_estimate: float = Field(ge=0, le=1)
    robustness_notes: str
    last_reviewed: date
    reviewed_by: list[str]
    documentation_url: HttpUrl

Store one of these per system in a model-cards/ directory. CI validates that every system referenced by production code has a card, and that the card has been reviewed in the last 180 days.

Human Oversight: Designing For It

Article 14 requires that high-risk systems can be effectively overseen by natural persons. For developers, this usually reduces to three questions:

  1. Can the human see what the system is doing and why?
  2. Can the human intervene?
  3. Can the human override the output before it has an effect?

Concretely: if your system auto-rejects job applications, it cannot be a fire-and-forget pipeline. A human must see the decision, see the reasoning, and be able to override it before the candidate is notified.

A simple pattern: the AI produces a recommendation with a confidence and a reason; the workflow queues it for human review; only after explicit approval is the action taken.

from __future__ import annotations

from typing import Literal

from pydantic import BaseModel


class Recommendation(BaseModel):
    candidate_id: str
    decision: Literal["proceed", "reject", "uncertain"]
    confidence: float
    reasoning: str
    model_version: str
    generated_at: str


class ReviewOutcome(BaseModel):
    recommendation: Recommendation
    reviewer_id: str
    reviewer_decision: Literal["approved", "overridden", "escalated"]
    reviewer_notes: str
    reviewed_at: str

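The gate itself can be a small function. notify_candidate and queue_for_senior_review below are placeholders for whatever your workflow engine provides; the logger is the structlog instance from earlier:

def apply_review(outcome: ReviewOutcome) -> None:
    # Nothing reaches the candidate until a named human has signed off.
    if outcome.reviewer_decision == "approved":
        notify_candidate(outcome.recommendation)  # placeholder downstream action
    elif outcome.reviewer_decision == "overridden":
        # The reviewer's call replaces the model's; keep both in the audit trail.
        logger.info("oversight.override", **outcome.model_dump())
    else:  # "escalated"
        queue_for_senior_review(outcome)  # placeholder escalation queue
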
The auditor will expect to see examples of overridden outcomes. If 100% of recommendations are approved as-is, either your model is perfect or your oversight is theater.

Transparency: What The User Sees

Article 50 covers transparency obligations for limited-risk systems. The key ones:

  • Users must be informed they are interacting with an AI system (unless obvious).
  • AI-generated or manipulated content (deepfakes, synthetic images) must be labeled.
  • People subjected to emotion recognition or biometric categorization must be informed.
  • Providers of systems that generate synthetic content (including general-purpose models) must mark outputs as artificially generated in a machine-readable format, where technically feasible.

In practice, this is a small UI change. A persistent badge near the chat interface: "You're talking to an AI assistant. Responses may be inaccurate." For generated images: a metadata tag and a visible indicator.
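
One way to keep the disclosure from silently disappearing is to ship it with every API response and have the frontend render the badge from it. A minimal sketch; the field names and wording here are ours:

from pydantic import BaseModel


class GenerateResponse(BaseModel):
    text: str
    ai_generated: bool = True  # the frontend renders the disclosure badge from this
    disclosure: str = "You're talking to an AI assistant. Responses may be inaccurate."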

Deployment Gating In CI

The cleanest way to enforce these obligations in day-to-day engineering is a CI check that refuses to promote code to production if the associated model card is missing, stale, or has unresolved gaps.

name: AI Compliance Gate

on:
  pull_request:
    paths:
      - "apps/**"
      - "model-cards/**"

jobs:
  gate:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install pydantic==2.9 pyyaml==6.0
      - name: Validate model cards
        run: python scripts/validate_model_cards.py
      - name: Check referenced systems have cards
        run: python scripts/check_coverage.py
      - name: Enforce staleness
        run: python scripts/check_staleness.py --max-days 180

And the validation script is a few lines:

from __future__ import annotations

import pathlib
import sys
from datetime import date, timedelta

import yaml

from compliance.model_card import ModelCard

root = pathlib.Path("model-cards")
max_age = timedelta(days=180)
today = date.today()

errors: list[str] = []

for path in root.glob("*.yaml"):
    raw = yaml.safe_load(path.read_text())
    try:
        card = ModelCard.model_validate(raw)
    except Exception as exc:
        errors.append(f"{path}: {exc}")
        continue
    if today - card.last_reviewed > max_age:
        errors.append(f"{path}: stale, last reviewed {card.last_reviewed}")
    if card.risk_tier == "high" and not card.annex_iii_category:
        errors.append(f"{path}: high-risk card missing Annex III category")

if errors:
    print("Compliance gate failed:")
    for e in errors:
        print(f"  - {e}")
    sys.exit(1)

print(f"All {len(list(root.glob('*.yaml')))} model cards valid.")

Post-Market Monitoring

Article 72 requires a post-market monitoring plan — not just metrics, but a documented process for how you collect, analyze, and act on real-world performance data. Your existing observability stack is 80% of the answer. The other 20% is a scheduled review where a named person looks at the data, writes down what they saw, and files it. Quarterly is the floor; monthly is better for newer systems.
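
The filed review can be as simple as a validated record in the same repository. A sketch; the field names are our convention, not a format from the Act:

from __future__ import annotations

from datetime import date

from pydantic import BaseModel


class MonitoringReview(BaseModel):
    system_id: str
    period_start: date
    period_end: date
    reviewer: str  # a named person, per the monitoring plan
    metrics_reviewed: list[str]
    findings: str
    actions: list[str]
    filed_on: date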

Serious incidents, meaning cases where the system causes harm or credibly could have, must be reported to the national market surveillance authority within 15 days of your becoming aware of them, sooner for the most severe cases (Article 73). Your incident runbook needs a branch for this: "is this reportable under the EU AI Act?" If yes, the compliance team gets paged alongside engineering.
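
In the runbook, that branch can start as a short checklist function. The criteria below paraphrase the Act's serious-incident triggers; treat a True as "page compliance", not as a legal determination:

def maybe_reportable_under_ai_act(incident: dict) -> bool:
    # Paraphrased triggers from Article 73 and the serious-incident
    # definition; compliance makes the final call on notification.
    triggers = (
        incident.get("death_or_serious_harm_to_health", False),
        incident.get("serious_disruption_of_critical_infrastructure", False),
        incident.get("infringement_of_fundamental_rights", False),
        incident.get("serious_harm_to_property_or_environment", False),
    )
    return any(triggers)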

A Pragmatic Readiness Checklist

  • All AI systems classified into risk tiers
  • High-risk systems have model cards covering Annex IV
  • Prompt/response logging in place with defined retention
  • Human oversight designed into every high-risk workflow
  • Transparency disclosures in UI for limited-risk systems
  • CI gate preventing deployment of uncarded systems
  • Incident runbook includes AI Act reporting branch
  • Post-market monitoring plan documented and scheduled

Next Steps

The August 2026 deadline is close enough that product teams should be treating this as a this-quarter problem, not a this-year problem. The implementation pattern is not exotic — it's a registry, some middleware, a CI gate, and a review cadence — but it is boring enough that it will not get done unless it's scheduled. Book the work now. If you want help mapping your systems to the Act and building the gating pipeline, get in touch.

filed under: eu-ai-act · compliance · ai · regulation