Most AI compliance guides on the open web are written by lawyers, not engineers. They explain what the law says; they don't tell you what to actually build. This guide is the inverse — written by the engineering group that ships HIPAA-compliant AI for fintech and healthcare clients, GDPR-compliant systems for European customers, and PDPA-compliant deployments in Singapore. We'll cover what each framework actually demands of your architecture, the patterns we use to satisfy them, and a pre-production checklist you can hand to your auditor. NKKTech operates under ISO 9001:2015 (quality management) and ISO 22301:2019 (business continuity) certifications, so the compliance posture isn't theoretical — every pattern below is what we actually deploy. This isn't legal advice; check with your privacy counsel before shipping. But it's the engineering-side playbook your privacy counsel doesn't have.
Why AI Compliance Is Different From Software Compliance
Traditional software compliance (HIPAA security rule, GDPR, etc.) was written for systems that store and process data deterministically: you can audit what data is in your database, who accessed it, and what was returned. AI systems break those assumptions in four ways.
First, training data leakage. A model trained on PII can memorize and regurgitate it later — GDPR's right-to-deletion becomes architecturally hard when "forgetting" a user requires retraining the model. The fix: don't train on raw PII; use synthetic data, federated learning, or differential privacy. Or accept that fine-tuned models are subject to the same retention rules as databases (they ARE databases of patterns).
Second, inference-time exposure. Sending a user's medical record to OpenAI for summarization is a data transfer — it goes to OpenAI's servers, lives in their logs, and may be used for model improvement (depending on your account type). HIPAA, GDPR, and APPI all treat this as a regulated transfer. Solutions: use providers that contractually exclude training (OpenAI's API with zero-retention policy, Azure OpenAI with HIPAA BAA, AWS Bedrock with strict regional controls), or self-host open-source models.
Third, the right to explanation. Several frameworks (GDPR Article 22, the EU AI Act, parts of HIPAA in the US healthcare context) give users the right to know why an AI made a decision affecting them. "Because the model said so" doesn't satisfy this. You need explainability infrastructure: retrieval logs for RAG systems ("the AI used these source documents"), reasoning traces for agents ("these tool calls produced this answer"), and where possible, feature-attribution methods (SHAP, LIME) for classification models.
Fourth, non-determinism. The same input on the same model on different days can produce different outputs (temperature, model updates, hidden state). For auditable processes (financial decisions, medical recommendations), you need to either pin model versions and disable sampling (temperature=0, deterministic decoding) or accept that the audit trail must capture the actual output, not assume reproducibility.
Keep these four differences in mind through every framework below. They're why "we're already HIPAA-compliant for our database" doesn't transfer to your new AI feature.
HIPAA for AI Systems (US Healthcare)
HIPAA covers Protected Health Information (PHI) — any identifiable health information held by a covered entity (provider, payer, clearinghouse) or its business associates. If your AI touches PHI, you're either a covered entity or a business associate, and HIPAA applies.
The security rule has three pillars: administrative safeguards (policies, training, risk assessments), physical safeguards (facility access, workstation security), and technical safeguards (access controls, audit logs, integrity controls, transmission security). For AI specifically:
Business Associate Agreement (BAA). Any third-party LLM provider that processes PHI on your behalf needs a signed BAA. As of 2026, this is straightforward for Azure OpenAI Service (Microsoft signs BAAs), AWS Bedrock (with a BAA on the underlying account), and Google Cloud Vertex AI (with a BAA). OpenAI does NOT cover most direct API customers with a BAA, so unless you have one signed with OpenAI you cannot lawfully send PHI to the OpenAI public API. Anthropic offers a BAA for Claude on enterprise tiers; check current status. Self-hosted models (Llama or Mistral on your own AWS/GCP/Azure account, with a BAA on that account) are always an option.
Access controls. Every PHI query must be authenticated and authorized. The AI agent itself is a software identity; it should have least-privilege access scoped to only the PHI needed for the specific task. Patient-level access controls (a model serving multiple patients must ensure that user X cannot see patient Y's records) must be enforced at the retrieval layer, not left to the LLM.
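A minimal sketch of that retrieval-layer check, assuming each retrieved chunk carries the ID of the patient it belongs to. `Chunk`, `authorized_chunks`, and `build_context` are hypothetical names for illustration, not part of any particular vector database API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    patient_id: str
    text: str


def authorized_chunks(candidates, allowed_patient_ids):
    """Drop every retrieved chunk the caller is not entitled to see."""
    return [c for c in candidates if c.patient_id in allowed_patient_ids]


def build_context(candidates, allowed_patient_ids):
    """Build the prompt context only from chunks that passed the check."""
    kept = authorized_chunks(candidates, allowed_patient_ids)
    return "\n".join(c.text for c in kept)


# The retriever may return chunks for several patients; the filter runs
# before anything is placed in the prompt.
retrieved = [
    Chunk("patient-x", "A1c 6.1% on 2025-11-02"),
    Chunk("patient-y", "Allergic to penicillin"),
]
context = build_context(retrieved, allowed_patient_ids={"patient-x"})
print(context)
```

Because the filter runs before prompt assembly, a prompt injection cannot talk the model into revealing a record it never received.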
Audit logging. Every PHI access by the AI must be logged with: timestamp, user identity (who initiated the request), patient identity affected, what data was accessed, what data was generated/transferred. Logs must be tamper-evident (write-once-read-many or signed/hashed) and retained for 6 years.
De-identification. The safe harbor method (remove 18 specific identifiers) or expert determination are the two paths. For AI workflows where de-identified data is sufficient (research, model evaluation, training synthetic models), de-identify upstream and treat the downstream system as non-PHI.
Real-world architecture pattern we deploy: PHI never leaves the BAA boundary. A patient question goes through a privacy gateway that strips identifiers (replace name with PATIENT_X, replace MRN with REF_001), the de-identified question goes to the LLM, the de-identified response goes through a re-identification step that re-attaches identifiers from a local mapping table held in the BAA environment. The LLM provider never sees actual PHI; the patient gets a personalized response.
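The gateway round trip can be sketched as follows. This is an illustration under strong assumptions, not a production de-identifier: it assumes the identifiers for the current patient are already known and registered, and it uses plain string replacement, whereas a real gateway would also need detection of identifiers it has not been told about. `PrivacyGateway` and its methods are hypothetical names.

```python
class PrivacyGateway:
    """Swaps registered identifiers for placeholders and restores them later.

    The mapping lives only in this object, which is assumed to run inside
    the BAA environment.
    """

    def __init__(self):
        self._placeholder_by_value = {}
        self._value_by_placeholder = {}

    def register(self, real_value, placeholder):
        self._placeholder_by_value[real_value] = placeholder
        self._value_by_placeholder[placeholder] = real_value

    def deidentify(self, text):
        # Longest values first, so a short value cannot clip a longer one.
        for real in sorted(self._placeholder_by_value, key=len, reverse=True):
            text = text.replace(real, self._placeholder_by_value[real])
        return text

    def reidentify(self, text):
        for placeholder in sorted(self._value_by_placeholder, key=len, reverse=True):
            text = text.replace(placeholder, self._value_by_placeholder[placeholder])
        return text


gateway = PrivacyGateway()
gateway.register("Jane Doe", "PATIENT_X")
gateway.register("MRN 4481920", "REF_001")

# Only `outbound` crosses the BAA boundary.
outbound = gateway.deidentify("Jane Doe (MRN 4481920) asks about her statin dose.")

# Stand-in for the provider's reply, which only ever contains placeholders.
reply_from_llm = "PATIENT_X should discuss the dose with her clinician (REF_001)."
inbound = gateway.reidentify(reply_from_llm)
print(outbound)
print(inbound)
```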
GDPR for AI Systems (EU)
GDPR applies to personal data of EU residents regardless of where your company is based. If you serve EU users — or even monitor their behavior from outside the EU — you're in scope. GDPR's AI implications are broader than HIPAA's because "personal data" is more expansive than "PHI."
Lawful basis. Every processing operation needs a lawful basis: consent, contract, legal obligation, vital interests, public task, or legitimate interests. For AI inference, contract or legitimate interests are usually the basis (you're providing a service the user signed up for). For training on user data, explicit consent is usually required — and consent must be specific, informed, and freely revocable. "I agree to terms" buried in a registration form is NOT valid consent for AI training.
Data minimization. Only collect and process what's needed for the stated purpose. For AI, this rules out the common pattern of "send the entire user profile to the LLM in case it's useful." Send only the fields the task requires.
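One way to enforce this in code is a per-task field allowlist applied just before the prompt is built. The sketch below assumes a flat user profile dictionary; the task names and field names are hypothetical.

```python
# Hypothetical allowlist: which profile fields each task may send to the LLM.
TASK_FIELDS = {
    "summarize_ticket": {"ticket_body", "product"},
    "draft_renewal_email": {"first_name", "plan", "renewal_date"},
}


def minimal_payload(task, profile):
    """Return only the fields the given task is allowed to use.

    An unregistered task raises KeyError on purpose: new tasks must declare
    their fields before they can send anything.
    """
    allowed = TASK_FIELDS[task]
    return {key: value for key, value in profile.items() if key in allowed}


profile = {
    "first_name": "Ana",
    "email": "ana@example.com",
    "plan": "pro",
    "renewal_date": "2026-03-01",
    "date_of_birth": "1990-05-17",
}
payload = minimal_payload("draft_renewal_email", profile)
print(payload)
```

The allowlist also doubles as documentation of purpose: it records, per task, which personal data fields were judged necessary.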
Right to access. Users can request all data you hold on them, including how it's been used. For RAG systems, this means logging which user data was retrieved for which queries. For chatbots, this means logging conversation history with the ability to export. Practical implementation: tag every retrieval and inference operation with user_id; expose an export endpoint that returns the user's interaction history.
Right to erasure ("right to be forgotten"). When a user requests deletion, you must erase their personal data without undue delay and respond within one month (extendable by up to two further months for complex requests). Two architectural challenges: (1) data in vector indexes: you need a deletion pipeline that removes the user's embeddings from your vector DB and any caches; (2) data the model was trained on: if you fine-tuned on the user's data, complete erasure is essentially impossible without retraining. Avoid fine-tuning on customer data unless you have a clear retention policy or contractual carve-out.
Right to explanation (Article 22). Users have the right not to be subject to fully automated decisions with legal or similarly significant effects unless they consent or it's necessary for a contract. If your AI makes such decisions (loan approvals, hiring, medical triage), provide: (a) meaningful information about the logic involved, (b) the significance and consequences, and (c) a way for the user to contest the decision and request human review.
Data Processing Agreement (DPA). All LLM providers processing personal data on your behalf are processors and need a DPA. OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex AI all provide DPAs.
EU data residency. For some EU clients (especially in regulated sectors), data must stay in the EU. Practical implementations: Azure OpenAI in West Europe / Sweden Central regions, AWS Bedrock in Frankfurt / Ireland, self-host models on EU-region cloud. OpenAI's standard API processes data in the US — unsuitable for strict EU residency requirements.
PIPEDA for AI Systems (Canada)
PIPEDA (Personal Information Protection and Electronic Documents Act) governs how private-sector organizations in Canada handle personal information during commercial activities. It's less prescriptive than GDPR but applies similar principles: consent, limited collection, accuracy, safeguards, openness, individual access, challenging compliance.
Key AI implications. Meaningful consent is the foundation — for AI uses that aren't obvious to the average user (training on their data, automated decision-making, sharing with US providers), consent must be specific and easy to understand. Buried boilerplate doesn't count.
Cross-border transfers. Canada doesn't restrict transfers to other jurisdictions, but you must maintain accountability — meaning your contract with the US-based LLM provider must impose equivalent privacy obligations. Standard DPAs from major US providers (OpenAI, Anthropic, Azure OpenAI) generally satisfy this for PIPEDA purposes, but document it in your privacy policy and your processor list.
Province-specific laws. Quebec's Law 25 (Loi 25, with most provisions in force since September 2023) is significantly stricter than federal PIPEDA and closer to GDPR in scope. It requires privacy impact assessments before deploying "systems that use personal information," disclosure of automated decision-making, the right to demand human review, and stricter rules on cross-border transfers (Quebec data sent outside Quebec needs its own privacy impact assessment). If you serve Quebec users, treat your AI system as if GDPR applied.
British Columbia and Alberta have their own PIPA laws that apply within those provinces; mostly aligned with federal PIPEDA but with additional employer-specific provisions.
Federal Bill C-27, which contained the proposed Artificial Intelligence and Data Act (AIDA), died on the order paper when Parliament was prorogued in January 2025. A successor bill may revisit its requirements for "high-impact" AI systems, including bias mitigation, transparency, and oversight. Check current status when planning multi-year AI roadmaps.
Practical defaults that satisfy PIPEDA: clear privacy policy mentioning AI use and provider names, DPAs with all LLM providers, granular user controls (opt-out of training, opt-out of automated decisions where applicable), Quebec users routed through a separate compliance pathway with PIA documentation. We've shipped this pattern for Canadian fintech and HR-tech clients without significant friction.
PDPA for AI Systems (Singapore)
Singapore's Personal Data Protection Act (PDPA) is straightforward by international standards but takes AI seriously. Amendments effective 2021 added accountability obligations, data breach notification, and rules around "derived personal data" (data inferred from other data — which AI models routinely produce).
Consent obligation. PDPA requires consent for collection, use, and disclosure of personal data. Deemed consent (the user provides data while engaging with a service that obviously needs it) covers most inference-time AI uses. Express consent is needed for training, marketing-derived uses, or transfers to overseas locations not on the deemed-adequate list.
Purpose limitation. The purposes for which personal data is collected must be specified at the time of collection. "For AI improvement" is too vague; "to train models that improve search relevance, identify fraud patterns, and personalize recommendations" is specific enough.
Access and correction. Users can request access to their personal data and correction of errors. For AI systems, this means logging what data was used in what inferences (similar to GDPR), but the bar is lower — Singapore's regulator hasn't yet enforced detailed audit trails for inference operations the way EU regulators have.
Transfer Limitation. Personal data can only be transferred outside Singapore to jurisdictions with comparable protection, OR with the user's express consent for the transfer. The PDPC publishes a list of recognized jurisdictions; transfers to US providers such as OpenAI are typically covered by contractual clauses in the provider's DPA rather than by binding corporate rules, which apply to transfers within a corporate group. Document the legal basis in your data flow map.
Data Breach Notification. Breaches affecting 500 or more individuals OR likely to cause significant harm must be notified to the PDPC within 3 calendar days of assessing the breach as notifiable, and to affected individuals "as soon as practicable." Your AI architecture should support breach detection (anomalous data access patterns, leaked credentials, model-output exfiltration patterns) and have a 72-hour playbook ready.
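The notification rule above is simple enough to encode directly in an incident-response tool. A minimal sketch, assuming the clock starts when the breach is assessed as notifiable (confirm the trigger with counsel); `is_notifiable` and `pdpc_deadline` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

SCALE_THRESHOLD = 500            # affected individuals, as stated above
PDPC_WINDOW = timedelta(days=3)  # notification window, as stated above


def is_notifiable(affected_individuals, significant_harm_likely):
    """Either condition on its own makes the breach notifiable."""
    return affected_individuals >= SCALE_THRESHOLD or significant_harm_likely


def pdpc_deadline(assessed_at):
    """Deadline counted from the assessment time (an assumption, see lead-in)."""
    return assessed_at + PDPC_WINDOW


assessed = datetime(2026, 3, 2, 9, 0, tzinfo=timezone.utc)
if is_notifiable(affected_individuals=620, significant_harm_likely=False):
    print("Notify PDPC by", pdpc_deadline(assessed).isoformat())
```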
The Model AI Governance Framework (second edition, 2020, with a companion framework for generative AI published in 2024) is non-binding but expected in practice. It covers AI ethics, internal governance, human-AI interaction, and operations management. Major Singapore enterprises (banks, healthcare) expect their vendors to align with it. If you're selling AI into Singapore's regulated sectors, building against this framework is effectively mandatory.
NKKTech operates a Singapore branch office (NKKTech Global Pte. Ltd., 18 Sin Ming Lane #07-13, Singapore 573960), so PDPA isn't theoretical for us — we're a Singapore-registered data controller for our own operations and a processor for our enterprise clients in Singapore.
APPI for AI Systems (Japan)
Japan's Act on the Protection of Personal Information (APPI) was significantly amended in 2022 to align with GDPR-style protections including extra-territorial application, breach notification, and rights of subjects. Major Japanese enterprises take APPI compliance seriously; Western vendors selling into Japan must address it directly.
Key requirements relevant to AI.
Consent for sensitive personal information. APPI distinguishes "sensitive personal information" (race, creed, social status, medical history, criminal record, victim status) which requires explicit consent for any handling. AI systems that infer or work with these categories need consent at the inference level, not just data collection.
Utilization purpose specification. Similar to PDPA's purpose limitation — the purpose for which personal data is used must be specified and publicly notified. "For AI services" is acceptable if your privacy notice specifies what those services are.
Cross-border data transfer. APPI requires equivalent protection at the destination OR specific user consent. For US transfers, the user must be informed of the destination country (US), the legal basis (consent or contract), and the protections in place. Standard DPAs with OpenAI, Anthropic, Microsoft, and Google address this, but your privacy notice must clearly disclose it. Japan's regulator, the Personal Information Protection Commission (PPC), has been increasingly explicit about wanting transparency for AI provider use.
Mandatory breach notification. Like GDPR/PDPA, breaches must be notified — within "a reasonable period" for affected individuals, and quickly for the PPC.
Pseudonymization. APPI's 2022 amendment introduced "pseudonymized personal information" as a category with reduced obligations. For AI training and analytics, pseudonymizing personal data (replacing direct identifiers with reversible tokens held separately) unlocks more flexibility while still respecting privacy. Japanese clients often expect pseudonymization as a baseline.
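A minimal sketch of token-based pseudonymization as described above: direct identifiers are swapped for random tokens, and the mapping is held in a separate component. `TokenVault` and `pseudonymize` are hypothetical names, and the in-memory dictionaries stand in for a separately secured store.

```python
import secrets


class TokenVault:
    """Holds the identifier-to-token mapping, apart from the analytics data."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        # The same value always maps to the same token, so joins still work.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token):
        return self._value_by_token[token]


def pseudonymize(record, direct_identifiers, vault):
    """Replace only the named identifier fields; leave other fields as-is."""
    return {
        key: vault.tokenize(value) if key in direct_identifiers else value
        for key, value in record.items()
    }


vault = TokenVault()
record = {"name": "Sato Yuki", "email": "yuki@example.jp", "purchase_total": 12800}
safe = pseudonymize(record, {"name", "email"}, vault)
print(safe)
```

Only the pseudonymized record travels to training or analytics; re-identification requires access to the vault, which can sit behind stricter access controls.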
The Personal Information Protection Commission (PPC) issued AI-specific guidance in 2024 covering AI training data, automated decision-making, and the use of public LLMs. Japanese enterprise procurement often requires evidence of alignment with this guidance.
Practical pattern for Japanese enterprise AI: data residency in Japan or APPI-recognized jurisdictions (US is recognized under specific frameworks), pseudonymization upstream of AI processing where feasible, clear consent flows for sensitive data uses, Japanese-language privacy notices listing all AI processors, breach response playbook in Japanese.
EU AI Act — What Just Changed
The EU AI Act entered into force in 2024 with staggered application. As of 2026, several key provisions are now in effect or coming online soon. This is the world's first comprehensive AI regulation and affects any company offering AI products in the EU regardless of where they're based.
Risk categories. The Act classifies AI systems into four categories.
Unacceptable risk (banned): social scoring, manipulation, exploitation, real-time biometric ID in public spaces (with limited exceptions). Banned outright as of February 2025.
High risk: AI in critical infrastructure, employment decisions, credit scoring, law enforcement, education, healthcare. Subject to extensive obligations: risk management, data governance, transparency, human oversight, accuracy/robustness/cybersecurity requirements. High-risk obligations are in force from August 2026.
Limited risk: AI systems with transparency obligations (chatbots, emotion recognition, deepfakes). Users must be informed they're interacting with AI.
Minimal risk: most consumer applications. Voluntary codes of conduct.
General Purpose AI (GPAI) model providers (OpenAI, Anthropic, Mistral, etc.) have separate obligations from August 2025: technical documentation, copyright policy disclosure, training data summary publication, and for the most capable models ("systemic risk" tier — currently a small list including GPT-4 class and above), model evaluation, adversarial testing, incident reporting.
What this means for AI builders. If your AI does anything that might be classified as high-risk — and the list is broad (hiring filters, credit scoring, medical triage, exam grading, infrastructure management) — you're in scope for the August 2026 obligations. The core requirements include: risk management system, data governance (training data quality, bias mitigation), technical documentation, automatic logging of operation, transparency to users, human oversight provisions, accuracy/robustness measures, cybersecurity, post-market monitoring.
If you're using a third-party LLM as the foundation of your AI product (most production AI), you can rely on the provider's GPAI compliance for the foundation model — but YOU are still on the hook for your application-layer compliance with high-risk obligations if applicable.
Prohibited practices to check now: social scoring systems, predictive policing based purely on profiling, untargeted scraping of facial images for biometric databases, emotion recognition in workplaces or schools (with limited safety exceptions), biometric categorization to infer race/political views/sexual orientation/religion. Quick audit: does your product do any of these, even partially? If so, stop now.
Cross-Border Data Transfer and Model Hosting
Where your data physically goes during AI processing is one of the most consequential architectural decisions for compliance. Most defaults are wrong.
Default OpenAI API. Data is processed in the US. By default, API data is not used to train OpenAI's models and is retained for up to 30 days for abuse monitoring before deletion; zero data retention is a separate arrangement that must be agreed with OpenAI. Standard DPA available. NOT HIPAA-eligible on standard accounts (no BAA). Acceptable for most use cases outside healthcare; verify contractual terms for EU residency requirements.
Azure OpenAI Service. Data stays in the Azure region you choose (West Europe, North Europe, France, Sweden Central available for EU residency). BAA available for HIPAA. DPA included. Acceptable for most regulated EU and US healthcare workloads. This is our default recommendation for any HIPAA-touching AI workload.
AWS Bedrock. Same architectural story — data stays in the AWS region you choose. BAA available with HIPAA-eligible accounts. EU regions (Frankfurt, Ireland) available. Recommended when your application already runs on AWS and you want one less vendor relationship.
Google Cloud Vertex AI. Regional residency similar to Azure/AWS. BAA available. Strong choice for multi-model needs (Gemini + open-source models hosted on Vertex).
Anthropic API. As of 2026, available directly and via AWS Bedrock + Google Vertex. Direct API offers zero-retention for enterprise; BAA available for enterprise tier. Bedrock/Vertex versions get the regional residency of those platforms.
Self-hosted open-source models. Llama 3.1 70B+, Mistral Large, Qwen 2.5, DeepSeek-V3 all run on H100/H200 hardware. Self-host on your own cloud account in the region of your choice. Full data residency control. No third-party provider in the data flow. Operational burden is significant — you're now running GPU infrastructure and dealing with availability, scaling, and model updates yourself. Right answer when: data residency is non-negotiable, regulator won't accept third-party processors, query volume justifies the GPU spend (10k+ queries/day typically), or you have specialized fine-tuning requirements.
Practical decision tree. Take the first branch that matches. (1) Is this HIPAA-touching? → Azure OpenAI or Bedrock with BAA. (2) Is this strict EU residency? → Azure OpenAI West Europe / Sweden Central, or self-hosted on AWS Frankfurt. (3) Is this PDPA Singapore? → Bedrock Singapore region (ap-southeast-1) or Azure OpenAI Singapore. (4) Is this high-volume B2C with cost sensitivity? → Default OpenAI API with appropriate DPA. (5) Anything else → OpenAI standard or Anthropic standard with DPA.
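The same tree as a function, which is handy as a guard in a deployment checklist script. This is a sketch: `choose_hosting` is a hypothetical name, branches (4) and (5) are merged because both resolve to a standard API with a DPA, and the questions are evaluated in the order listed, so the first match wins.

```python
def choose_hosting(hipaa=False, strict_eu_residency=False, pdpa_singapore=False):
    """Return the hosting option from the decision tree; first match wins."""
    if hipaa:
        return "Azure OpenAI or AWS Bedrock, with a BAA"
    if strict_eu_residency:
        return "Azure OpenAI (West Europe / Sweden Central) or self-hosted on AWS Frankfurt"
    if pdpa_singapore:
        return "AWS Bedrock (ap-southeast-1) or Azure OpenAI Singapore"
    # Branches (4) and (5): high-volume B2C and everything else.
    return "OpenAI or Anthropic standard API, with a DPA"


print(choose_hosting(hipaa=True, strict_eu_residency=True))
print(choose_hosting(pdpa_singapore=True))
print(choose_hosting())
```

A workload that is both HIPAA-touching and EU-resident takes the first branch here, so the chosen provider still needs an EU region configured.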
Audit Logs, Right to Explanation, Right to Deletion
Three operational capabilities most AI systems lack at launch but must support for compliance. Build them in upfront.
Audit logs. Every AI operation that touches personal data should be logged with: timestamp (UTC, ISO 8601), user identity (authenticated user ID), session identity, request ID (for correlating across services), input data (hashed or referenced — don't log raw PHI), retrieved context (chunk IDs or source document refs), model used (provider + model name + version), output generated (hashed or stored), latency per stage, tokens used.
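As a sketch, the fields listed above map onto a small record type. The field names, the model identifier string, and the choice of SHA-256 for hashing inputs and outputs are assumptions for illustration.

```python
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def sha256_hex(text):
    """Hash content so the log can reference it without storing it raw."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AuditRecord:
    user_id: str
    session_id: str
    input_sha256: str
    retrieved_chunk_ids: list
    model: str                # provider + model name + version
    output_sha256: str
    latency_ms: dict          # latency per stage
    tokens_used: int
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


record = AuditRecord(
    user_id="user-123",
    session_id="sess-9",
    input_sha256=sha256_hex("What was my last lab result?"),
    retrieved_chunk_ids=["doc-17#3", "doc-17#4"],
    model="example-provider/example-model@2026-01-15",  # hypothetical identifier
    output_sha256=sha256_hex("Your last result was within range."),
    latency_ms={"retrieval": 42, "generation": 810},
    tokens_used=356,
)
line = json.dumps(asdict(record), sort_keys=True)  # one JSON object per log line
print(line)
```

Note that a hash of a short or guessable input can be reversed by brute force, so for PHI a reference to an access-controlled store is the safer of the two options.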
Log storage. Tamper-evident is required for HIPAA, recommended for everything else. Use append-only storage (S3 with object lock, or a write-once log table with cryptographic hashing of previous entries). Retention: 6 years for HIPAA, 6 months minimum for most GDPR purposes, longer if the data may be subject to litigation. Encrypt at rest with provider-managed keys (KMS, Cloud HSM).
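The "cryptographic hashing of previous entries" option can be sketched as a hash chain: each entry stores the hash of the one before it, so editing any earlier entry breaks verification. This is an in-memory illustration with hypothetical function names; a real deployment would persist entries to append-only storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log, payload):
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify(log):
    """Recompute every hash in order; any edit or reordering fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"user": "u1", "action": "read", "record": "r-10"})
append(log, {"user": "u2", "action": "read", "record": "r-11"})
append(log, {"user": "u1", "action": "export", "record": "r-10"})
print(verify(log))

log[1]["payload"]["action"] = "none"  # simulate tampering
print(verify(log))
```

A chain like this detects modification but not truncation of the tail, so it is usually paired with storing the latest hash somewhere the log writer cannot alter.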
Right to access / data export. Build an API endpoint that, given a user ID, returns all their interaction history with the AI: queries they made, retrievals run on their behalf, decisions made about them, model outputs returned. Under GDPR you have one month to respond. Pre-build the query so you are not assembling it under deadline pressure.
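The core of such an endpoint is a query over events that were tagged with the user ID at write time. A minimal sketch, assuming events are dictionaries with `user_id` and `kind` keys (hypothetical field names) and leaving the HTTP layer and authentication out:

```python
from collections import defaultdict


def export_user_history(events, user_id):
    """Group one user's events by kind, for a data export response."""
    history = defaultdict(list)
    for event in events:
        if event["user_id"] == user_id:
            # Drop the redundant user_id from each exported item.
            item = {k: v for k, v in event.items() if k != "user_id"}
            history[event["kind"]].append(item)
    return dict(history)


events = [
    {"user_id": "u1", "kind": "query", "at": "2026-01-10T09:00:00Z", "text": "Loan status?"},
    {"user_id": "u1", "kind": "retrieval", "at": "2026-01-10T09:00:01Z", "chunk_id": "doc-4#2"},
    {"user_id": "u2", "kind": "query", "at": "2026-01-10T09:05:00Z", "text": "Reset password"},
    {"user_id": "u1", "kind": "decision", "at": "2026-01-10T09:00:02Z", "outcome": "escalated"},
]
export = export_user_history(events, "u1")
print(sorted(export))
```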
Right to explanation. For automated decisions, log: the input features the model considered, the model version that made the decision, the output, and (where the model architecture allows) feature attributions. For LLM-based decisions, this means the system prompt, retrieved context, and full response — the model itself isn't interpretable but the inputs to it are. Provide a user-facing explanation page that shows: "Your application was reviewed by an AI assistant on [date] using these criteria: [list]. The relevant information considered was: [summary]. You can request human review at: [link]."
Right to erasure / deletion. Architectural requirement: every personal data record must be deletable, including derivative data. For RAG indexes, deletion means removing the user's embeddings and source documents from the vector DB AND any cache layers. For chat history, deletion means hard-deleting (not soft-delete) after the user request. For fine-tuned models: if you fine-tuned on the user's data, you usually cannot remove the influence without retraining; structure your fine-tuning workflows to use synthetic or aggregated data only, OR commit to retraining on deletion requests at high volume (rare in B2B; sometimes mandatory for B2C with high GDPR exposure).
Deletion request workflow: user submits request → 30-day SLA starts → automated deletion job removes user's data from primary database, vector DB, cache, logs (or pseudonymizes logs if logs must be retained for legal reasons) → audit trail of the deletion is generated and retained → user is notified of completion. We provide deletion SLA monitoring as part of every production AI deployment.
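The workflow above can be sketched as a job that calls a delete function per store and emits a receipt for the audit trail. The store names, `run_erasure`, and `make_store` are hypothetical; the in-memory lists stand in for the primary database, vector DB, and cache.

```python
from datetime import date, timedelta

SLA = timedelta(days=30)  # the SLA named in the workflow above


def run_erasure(user_id, stores, requested_on, completed_on):
    """Delete the user's data from every store and return a receipt.

    `stores` maps a store name to a callable that deletes the user's rows
    and returns how many were removed.
    """
    deleted = {name: delete(user_id) for name, delete in stores.items()}
    return {
        "user_id": user_id,  # kept only as proof the request was honored
        "requested_on": requested_on.isoformat(),
        "completed_on": completed_on.isoformat(),
        "within_sla": completed_on <= requested_on + SLA,
        "deleted": deleted,
    }


def make_store(rows):
    """In-memory stand-in for a real store; returns its delete function."""
    def delete(user_id):
        before = len(rows)
        rows[:] = [row for row in rows if row["user_id"] != user_id]
        return before - len(rows)
    return delete


primary = [{"user_id": "u1"}, {"user_id": "u2"}]
vectors = [{"user_id": "u1"}, {"user_id": "u1"}]
cache = []
stores = {
    "primary_db": make_store(primary),
    "vector_db": make_store(vectors),
    "cache": make_store(cache),
}
receipt = run_erasure("u1", stores, date(2026, 1, 5), date(2026, 1, 20))
print(receipt)
```

Registering every store in one mapping makes it harder to forget a new cache or index when the architecture grows.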
Pre-Production Compliance Checklist
Before launching an AI feature that touches personal data, walk through this checklist. Hand it to your privacy counsel; tick off in order.
Data Inventory. Document every category of personal data the AI will process. Map data flows end to end: source system → preprocessing → AI provider → output destination → log storage. Identify cross-border transfers.
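A data flow map kept as data rather than as a diagram can be checked automatically. The sketch below assumes each hop is annotated with the country where it runs; `Hop` and `cross_border_transfers` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    system: str
    region: str  # country code where this hop processes or stores data


def cross_border_transfers(flow):
    """Return each consecutive pair of hops that sits in different regions."""
    return [(a, b) for a, b in zip(flow, flow[1:]) if a.region != b.region]


flow = [
    Hop("source_system", "SG"),
    Hop("preprocessing", "SG"),
    Hop("ai_provider", "US"),
    Hop("log_storage", "SG"),
]
transfers = cross_border_transfers(flow)
for a, b in transfers:
    print(f"{a.system} ({a.region}) -> {b.system} ({b.region})")
```

Each reported pair is a transfer that needs a documented legal basis.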
Lawful Basis (GDPR/PDPA/APPI/PIPEDA). Document the lawful basis for each processing purpose. Update privacy notices to reflect AI use. Implement consent flows where required (training, marketing-derived uses).
Provider Contracts. DPA signed with each AI provider in scope. BAA signed where HIPAA applies. Subprocessor list maintained and disclosed in your privacy notice. Provider's data retention policy reviewed and acceptable.
Residency. Architectural review of where each personal data field physically goes during processing. Documented data flow diagram. Where strict residency applies (EU, Singapore healthcare, Japanese financial sector), regional model deployment configured.
Access Controls. Authentication and authorization layer in front of AI endpoints. Role-based access for sensitive operations. Least-privilege scoping for AI agent access to data sources.
Logging. Audit logs implemented for all PII-touching operations. Log retention policy set and matches the longest applicable framework (6 years for HIPAA). Logs tamper-evident.
Rights Operationalized. Access (data export) endpoint built. Erasure pipeline tested end-to-end including vector DB, cache, and log redaction. Explanation page built for any automated decisions. Human review path defined for high-stakes decisions.
Security. Encryption in transit (TLS 1.2+) and at rest. Secrets management (no API keys in code). Vulnerability scanning (SAST + DAST) in CI. Penetration testing scheduled annually.
Breach Response. Incident response plan written, including 72-hour notification timelines. Detection alerts configured (anomalous data access, exfiltration patterns). Tabletop exercise run with the team.
Vendor Posture. ISO 27001 or SOC 2 type II certification from your AI providers verified. Subprocessor changes monitored (sign up for change notifications).
Documentation. Privacy impact assessment (PIA) completed where required (Quebec, EU AI Act high-risk, Japanese sensitive data). Records of processing maintained (Article 30 GDPR). Internal AI governance policy adopted (aligns with Singapore's Model AI Governance Framework expectation).
This checklist takes 1-2 weeks to work through with privacy counsel and engineering together. It's the difference between launching with confidence and launching with a 72-hour breach-notification timer ticking before you've even reached steady state.
If you'd like NKKTech to ship the AI infrastructure already compliant with these frameworks, book a free 30-minute discovery call. Our ISO 9001:2015 quality and ISO 22301:2019 business continuity certifications are renewed annually under independent audit (TQC CGLOBAL); compliance discipline is built into how we ship every project, not bolted on at the end.

10+ years building AI systems for Toyota, Sony, and Rakuten in Japan. Founded NKKTech in 2018 with a senior-only engineering model.