Data Governance and Compliance — System Design Space

Compliance becomes useful when it moves from an external review into architectural constraints on data and processes.

The chapter shows how GDPR, Federal Law No. 152-FZ, data lineage, personal-data handling, access control, and change auditing shape data models, flows, retention, residency, and team workflows.

In design reviews, it helps teams discuss data minimization, auditability, and regulatory constraints as system-design concerns rather than a late legal overlay.

Practical value of this chapter

Design in practice

Design data classification, ownership, retention, deletion, and access control as architecture, not as documentation after release.

Decision quality

Validate whether lineage, legal basis, access rights, and retention rules can be proven when the system is reviewed.

Interview articulation

Frame the answer around the data route: collection, storage, processing, transfer, access, deletion, and audit.

Trade-off framing

Make constraint costs explicit: product convenience, analytics speed, storage cost, deletion complexity, and regulatory risk.

Context

Security Engineering Overview

Data governance only works together with security, SRE, and data-platform processes.


Data governance and compliance fail when they remain a stack of legal documents: by the time the audit arrives, it turns out that no one implemented them in the code or the data stores. In architecture they are concrete technical decisions: data classification, access control, lineage, personal-data handling, retention, deletion, and a provable audit trail. The bar is simple: the system has to stay product-friendly and still withstand regulatory review, instead of falling apart on the first request from a regulator.

Governance principles

Data classification first

Classify every dataset by sensitivity and regulatory category, for example public, personal, financial, health, and other regulated data. Without an explicit class, controls attach to the service rather than to the data, so the same personal data ends up protected differently across services, which is exactly the gap an audit finds.
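
One way to make the class drive the controls is to attach it to the dataset and derive the required controls from it. The sketch below is a minimal illustration, not a policy: the class names, the control names, and the `REQUIRED_CONTROLS` baseline are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    PUBLIC = "public"
    PERSONAL = "personal"
    FINANCIAL = "financial"
    HEALTH = "health"


# Hypothetical control baseline per class; real requirements come from your policy.
REQUIRED_CONTROLS = {
    DataClass.PUBLIC: frozenset(),
    DataClass.PERSONAL: frozenset({"encrypt_at_rest", "access_log"}),
    DataClass.FINANCIAL: frozenset({"encrypt_at_rest", "access_log", "tokenize"}),
    DataClass.HEALTH: frozenset({"encrypt_at_rest", "access_log", "field_encryption"}),
}


@dataclass(frozen=True)
class Dataset:
    name: str
    data_class: DataClass
    controls: frozenset


def missing_controls(ds: Dataset) -> frozenset:
    """Controls the dataset's class requires but the dataset does not declare."""
    return REQUIRED_CONTROLS[ds.data_class] - ds.controls


orders = Dataset("orders", DataClass.FINANCIAL, frozenset({"encrypt_at_rest"}))
print(sorted(missing_controls(orders)))  # ['access_log', 'tokenize']
```

Because the check keys on the data class rather than on the owning service, two services holding the same class of data are measured against the same baseline.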

Least privilege by design

Broad access “so teams aren’t blocked” is a list of people you will have to explain during a breach investigation. Scope rights tightly: role- or attribute-based authorization, short-lived credentials, just-in-time access, and recurring access review.
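
Just-in-time access can be sketched as a grant that carries a reason and an expiry, so nothing is permitted once the window closes. The `Grant` type, the one-hour default, and the example ticket reference are hypothetical; a real system would back this with its identity provider and an audit log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    user: str
    dataset: str
    reason: str          # recorded so the grant can be explained later
    expires_at: datetime


def grant_jit(user: str, dataset: str, reason: str, now: datetime,
              ttl: timedelta = timedelta(hours=1)) -> Grant:
    """Issue a time-boxed grant instead of a standing permission."""
    return Grant(user, dataset, reason, now + ttl)


def is_allowed(grants: list, user: str, dataset: str, now: datetime) -> bool:
    """Allow only if an unexpired grant matches this user and dataset."""
    return any(
        g.user == user and g.dataset == dataset and now < g.expires_at
        for g in grants
    )


t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
grants = [grant_jit("alice", "customers_pii", "hypothetical ticket reference", t0)]
print(is_allowed(grants, "alice", "customers_pii", t0 + timedelta(minutes=30)))  # True
print(is_allowed(grants, "alice", "customers_pii", t0 + timedelta(hours=2)))     # False
```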

Traceability

When a regulator asks where a specific value came from, “let’s check the code” is not an answer. Keep lineage for critical transformations: who changed the data, when, from which source, and where it flowed next. The audit trail should make that answerable.
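
A minimal sketch of such a trail, assuming each dataset has a single producing step: every transformation records who, when, from where, and to where, and a backwards walk answers the origin question. `LineageEvent`, `LineageLog`, and the dataset names are illustrative; an in-memory list stands in for durable, tamper-evident storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageEvent:
    actor: str        # who changed the data
    at: datetime      # when
    source: str       # from which source
    target: str       # where it flowed next
    transform: str    # what was done


class LineageLog:
    def __init__(self) -> None:
        self._events = []

    def record(self, event: LineageEvent) -> None:
        self._events.append(event)  # the class exposes no update or delete method

    def origin_of(self, dataset: str) -> list:
        """Walk backwards from a dataset through the events that produced it."""
        chain, current, seen = [], dataset, set()
        while current not in seen:  # guard against cycles in bad data
            seen.add(current)
            step = next((e for e in self._events if e.target == current), None)
            if step is None:
                break
            chain.append(step)
            current = step.source
        return chain


log = LineageLog()
ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
log.record(LineageEvent("etl-job", ts, "crm.contacts", "staging.contacts", "dedupe"))
log.record(LineageEvent("etl-job", ts, "staging.contacts", "mart.customers", "join"))
print([e.source for e in log.origin_of("mart.customers")])
# ['staging.contacts', 'crm.contacts']
```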

Retention and deletion

Deletion driven by a quarterly ticket will reliably miss a copy in a backup or replica, and that copy will surface during a review. Automate retention and deletion across stores, pipelines, backups, and replicas.
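
The sketch below shows the shape of an automated sweep: a retention period per data class, and a deletion task for every store where a copy can live. The periods, store names, and record layout are assumptions for illustration; how a given backup system actually removes or expires data is outside this example.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data class, in days.
RETENTION_DAYS = {"personal": 365, "financial": 5 * 365}

# Every place a copy can live; backups and replicas are listed explicitly.
STORES = ("primary", "replica", "warehouse", "backup")


def expired(record: dict, today: date) -> bool:
    limit = timedelta(days=RETENTION_DAYS[record["data_class"]])
    return today - record["created"] > limit


def deletion_plan(records: list, today: date) -> list:
    """One (store, record_id) task per store for every expired record."""
    return [
        (store, r["id"])
        for r in records if expired(r, today)
        for store in STORES
    ]


records = [
    {"id": "u1", "data_class": "personal", "created": date(2022, 1, 1)},
    {"id": "u2", "data_class": "personal", "created": date(2024, 1, 1)},
]
print(deletion_plan(records, date(2024, 6, 1)))
# [('primary', 'u1'), ('replica', 'u1'), ('warehouse', 'u1'), ('backup', 'u1')]
```

Listing the stores in one place means a new replica or archive has to be added to the sweep deliberately, rather than being remembered by whoever picks up the ticket.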

Multi-region / Global Systems

Compliance affects data placement, regional constraints, and cross-region routing.


Regulatory framework: GDPR, Federal Law No. 152-FZ, and others

GDPR (EU)

Focus: Legal basis, data subject rights, breach notification, data minimization, and privacy by design.

In architecture this means data subject request workflows, processing maps, audit trails, regional restrictions, and transparent consent flows.

Federal Law No. 152-FZ (Russia)

Focus: Legal grounds for personal-data processing, protection requirements, organizational measures, and storage or processing constraints.

At the platform level this means a personal-data inventory, cross-border transfer controls, segmented environments, and access logging.

CCPA/CPRA, PCI DSS, HIPAA, and others

Focus: Consumer privacy rights (CCPA/CPRA) and industry requirements for payment-card data (PCI DSS), health data (HIPAA), and other regulated data.

You need domain-specific controls: tokenization, field-level encryption, strict retention policies, and audit logs that are ready for review.

Personal-data lifecycle

Collection: collect only required fields and record the processing purpose for each personal-data class.
Storage: use encryption at rest, separate keys, and tailored policies for highly sensitive attributes.
Processing: use pseudonymization, masking, and log redaction in analytics and non-production environments.
Transfer: protect channels with mutual TLS, data contracts, and explicit controls for cross-region movement.
Access: grant rights through just-in-time access, MFA for administrators, and recurring access review.
Deletion: verify cascading deletion across downstream copies, serving datasets, backups, and archives.
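
For the processing step, pseudonymization and masking can be sketched with the standard library. A keyed hash keeps identifiers stable for joins in analytics without exposing the raw value; the key shown is a placeholder, and key management is assumed to live elsewhere. The result is pseudonymous rather than anonymous, since anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash: stable for joins, not reversible without the key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


key = b"demo-key-not-for-production"  # placeholder; load from a secret store
a = pseudonymize("alice@example.com", key)
b = pseudonymize("alice@example.com", key)
print(a == b, len(a))                    # True 64
print(mask_email("alice@example.com"))   # a***@example.com
```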

Data lineage as an architectural control

Data catalog

A shared catalog records datasets, data owners, data contracts, criticality, and sensitivity tags.
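
A catalog is only useful as a control if gaps in it are detectable. As a rough sketch, assuming a hypothetical `CatalogEntry` shape, the check below reports which critical datasets are missing an owner, a sensitivity tag, or a retention period.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogEntry:
    name: str
    critical: bool
    owner: Optional[str] = None
    sensitivity: Optional[str] = None
    retention_days: Optional[int] = None


def catalog_gaps(entries: list) -> dict:
    """Map each critical dataset to the required fields it is missing."""
    gaps = {}
    for e in entries:
        if not e.critical:
            continue
        missing = [f for f in ("owner", "sensitivity", "retention_days")
                   if getattr(e, f) is None]
        if missing:
            gaps[e.name] = missing
    return gaps


entries = [
    CatalogEntry("payments", True, owner="billing-team", sensitivity="financial"),
    CatalogEntry("tmp_scratch", False),
]
print(catalog_gaps(entries))  # {'payments': ['retention_days']}
```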

End-to-end lineage graph

Links from sources through transformations to serving layers show who a change will hit — which consumers and which regulatory obligations — before it ships to prod.
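
Impact analysis on a lineage graph is a reachability question. The sketch below uses a breadth-first walk over a hypothetical edge map (dataset names are invented for the example) to list everything downstream of a source before a change ships.

```python
from collections import deque

# Hypothetical lineage edges: dataset -> datasets derived from it.
EDGES = {
    "crm.contacts": ["staging.contacts"],
    "staging.contacts": ["mart.customers", "ml.features"],
    "mart.customers": ["dashboard.revenue"],
}


def downstream(start: str, edges: dict) -> list:
    """Breadth-first walk returning every dataset reachable from start."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(downstream("crm.contacts", EDGES))
# ['staging.contacts', 'mart.customers', 'ml.features', 'dashboard.revenue']
```

Joining the result against catalog sensitivity tags would turn the list of affected datasets into a list of affected obligations.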

Automated policy checks

CI/CD and data-platform checks prevent personal data from being exported into unauthorized environments — a violation is caught before rollout, not after a complaint.
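
Such a check can be as small as comparing the sensitivity tags on each export with what the target environment is allowed to hold. The environment names, tag names, and export record layout below are assumptions; a CI job would fail the build when the returned list is non-empty.

```python
# Hypothetical policy: sensitivity tags allowed per target environment.
ALLOWED_TAGS = {
    "prod": {"public", "internal", "personal"},
    "staging": {"public", "internal"},
    "dev": {"public"},
}


def check_exports(exports: list) -> list:
    """Return human-readable violations; an empty list means the check passes."""
    violations = []
    for ex in exports:
        # Unknown environments allow nothing, so they fail closed.
        allowed = ALLOWED_TAGS.get(ex["target_env"], set())
        blocked = sorted(set(ex["tags"]) - allowed)
        if blocked:
            violations.append(
                f"{ex['dataset']} -> {ex['target_env']}: blocked tags {blocked}"
            )
    return violations


exports = [
    {"dataset": "customers", "target_env": "dev", "tags": ["personal"]},
    {"dataset": "docs", "target_env": "dev", "tags": ["public"]},
]
print(check_exports(exports))
# ["customers -> dev: blocked tags ['personal']"]
```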

Evidence for audits and incidents

Data access events and lineage logs should quickly answer which regulated data was affected and where it is located.

Compliance-ready architecture checklist

Every critical dataset has an owner, a sensitivity tag, and a retention policy.

Personal data in logs, metrics, and traces is masked or removed by default.

There is an automated path for data subject requests: access, correction, and deletion.

Lineage is available to security, product, and compliance stakeholders instead of being locked inside a platform team that everyone has to queue behind for extracts.

Compliance reviews are built into release governance instead of being performed manually after release.
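
As an illustration of the "masked by default" item in the checklist above, a `logging.Filter` can rewrite messages before any handler sees them. This sketch covers only email addresses, using a deliberately simple pattern; other identifiers would need their own rules, and the placeholder text is an arbitrary choice.

```python
import logging
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmails(logging.Filter):
    """Rewrite each record's message so email addresses are replaced."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first, then redact, so values passed as args are covered too.
        record.msg = EMAIL_RE.sub("[REDACTED_EMAIL]", record.getMessage())
        record.args = ()  # the message is already formatted
        return True


record = logging.LogRecord(
    "app", logging.INFO, "demo.py", 0, "login by %s", ("alice@example.com",), None
)
RedactEmails().filter(record)
print(record.getMessage())  # login by [REDACTED_EMAIL]
```

Attaching the filter to handlers rather than to a single logger is usually the safer default, because a filter on a logger does not apply to records propagated from its child loggers.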

Typical antipatterns

Collecting excess personal data “for later” without a clear purpose and legal basis.

Storing sensitive data in telemetry or logs without masking and redaction.

Ignoring backups and archives in the data deletion strategy.

Maintaining lineage manually in spreadsheets instead of an automated graph.

Treating compliance as a legal-only responsibility without architectural constraints in code and platform controls.

References

This chapter is not legal advice. Validate requirements in your jurisdiction with your legal and compliance stakeholders.

Related chapters

Security Engineering Overview - Provides the secure-by-design baseline that governance policies and compliance checks depend on.
Identification -> AuthN -> AuthZ - Directly related to controlling personal-data access, privileged actions, and auditable authorization flows.
Encryption, Keys and TLS - Covers cryptographic protections for data at rest and in transit required by many regulatory frameworks.
Data Pipeline / ETL / ELT Architecture - Extends governance into transformation pipelines through lineage visibility and data-quality controls.
Multi-region / Global Systems - Explains data residency, regional constraints, and cross-border flows in global systems.