Core concepts¶

This page explains the fundamental concepts that form the foundation of TrustDeck: domains, pseudonyms, algorithms, projects, and entities. These concepts define the data model and operational behavior of the pseudonymization system.

For details on API usage, see the Swagger UI.
For security and authorization, see Authentication (OIDC/JWT).

Overview¶

TrustDeck organizes pseudonymization around five core concepts:

Concept	Purpose	Key code entities (examples)
Domains	Configuration units that define pseudonymization rules and form hierarchical structures	`Domain`, `DomainDTO`, `DomainDBAccessService`, `DomainRESTController`
Pseudonyms	Identifier-to-pseudonym mappings scoped to specific domains	`Pseudonym`, `PseudonymDTO`, `PseudonymDBAccessService`, `PseudonymRESTController`
Algorithms	Technical parameters for pseudonym generation (algorithm type, alphabet, length, etc.)	`Algorithm`, `AlgorithmDTO`, `PseudonymizationFactory`, `Pseudonymizer`
Projects	Entity management containers	`Project`, `ProjectDTO`, `ProjectDBService`, `ProjectRESTController`
Entities	Representations of real-world persons, samples, and other objects	`EntityType`, `EntityInstance`, `EntityTypeRESTController`, `EntityInstanceRESTController`, `EntityTypeDTO`, `EntityInstanceDTO`, `EntityTypeDBService`, `EntityInstanceDBService`

Domain-pseudonym-algorithm relationship (high level)¶

A domain groups semantically similar pseudonyms and defines the ground rules for managing them.
A pseudonym is created within exactly one domain and stores the mapping for a specific identifier (together with its idType).
An algorithm describes how pseudonyms are generated for that domain, and with which parameters.

In practice:

Domains reference an algorithm configuration.
Pseudonyms are created/read within a domain context.
Algorithm choice and parameters can be inherited from the parent domain.

Domains¶

A domain is a configuration unit that defines how pseudonyms are managed. Each domain encapsulates:

Validity period (start/end dates)
Enforcement rules (whether to enforce validity constraints)
Prefix (prepended to generated pseudonyms)
Hierarchical relationship (optional parent domain)

Domain hierarchy¶

Domains form a tree structure. A domain may have a parent domain (superdomain) and can inherit configuration values from that parent.

Inheritance mechanism¶

When a domain is created with a parent, it can inherit properties. Each inheritable property has a corresponding *Inherited boolean flag.

Property	Inherited flag	Default if not specified
validFrom	validFromInherited	Parent’s validFrom or now
validTo	validToInherited	Parent’s validTo or derived from validityTime
enforceStartDateValidity	enforceStartDateValidityInherited	Parent’s value or true
enforceEndDateValidity	enforceEndDateValidityInherited	Parent’s value or true
algorithm	algorithmInherited	Parent’s algorithm or RANDOM
alphabet	alphabetInherited	Parent’s alphabet or A–Z
pseudonymLength	pseudonymLengthInherited	Parent’s length or 16
paddingCharacter	paddingCharacterInherited	Parent’s character or "0"
addCheckDigit	addCheckDigitInherited	Parent’s value or true
lengthIncludesCheckDigit	lengthIncludesCheckDigitInherited	Parent’s value or false
multiplePsnAllowed	multiplePsnAllowedInherited	Parent’s value or false
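
The fallback order in the table can be illustrated with a minimal sketch. It assumes that an explicitly provided value always wins, that the parent's value is used next, and that the built-in default applies last. The function name `resolve` and the dictionary layout are hypothetical and not part of TrustDeck; whether the inherited flag stays false when a default is applied is also an assumption.

```python
# Fallback defaults for a subset of the properties in the table above.
DEFAULTS = {"algorithm": "RANDOM", "pseudonymLength": 16, "paddingCharacter": "0"}


def resolve(prop, requested, parent):
    """Return (value, inherited) for one inheritable property.

    requested: values given explicitly when creating the domain
    parent:    the parent's already resolved values, or None for a root domain
    """
    if requested.get(prop) is not None:
        return requested[prop], False      # explicit value wins
    if parent is not None and parent.get(prop) is not None:
        return parent[prop], True          # taken over from the parent
    return DEFAULTS[prop], False           # no parent value: built-in default


parent = {"algorithm": "CONSECUTIVE", "pseudonymLength": 32, "paddingCharacter": "0"}
child = {p: resolve(p, {"pseudonymLength": 12}, parent) for p in DEFAULTS}
print(child)
```

In this sketch the child keeps its own pseudonymLength of 12 and takes the algorithm and padding character from the parent, with the inherited flag set accordingly.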

Domain creation¶

Domains are created through REST endpoints. Typical patterns:

Standard domain creation: simplified creation with essential properties
Complete domain creation: specify all domain properties explicitly

Domains apply defaults if values are not provided (e.g., default algorithm, alphabet, pseudonym length, etc.).

Pseudonyms¶

A pseudonym is a mapping between an identifier and a generated pseudonym string, always scoped to a specific domain.

A pseudonym record typically contains:

Identifier (identifier + idType)
Pseudonym (usually generated by TrustDeck, but can also be supplied by the caller)
Validity period (validFrom, validTo)
Inheritance flags (whether validity was inherited from domain)
Domain reference (domainId)

Identifier structure¶

Identifiers are represented as a combination of:

identifier (the actual identifying string)
idType (the identifier type, e.g., EHR_ID, SSN, INSURANCE_ID)

Example (conceptual):

identifier: "123456"
idType: "PATIENT_ID"

Pseudonym validity and inheritance¶

Pseudonyms inherit their validity period from the domain unless values are provided explicitly. If the domain's enforcement flags (enforceStartDateValidity, enforceEndDateValidity) are set, a pseudonym's validity period is automatically kept within the domain's: it cannot start before the domain's validFrom or end after the domain's validTo. Typical behavior:

validFrom:
- uses provided value if given (subject to enforcement rules)
- otherwise inherits from domain
validTo:
- can be provided directly or derived from a validityTime parameter
- may be capped by domain validity if enforcement is enabled
inheritance flags:
- set to true when value was inherited from domain
- set to false when explicitly provided

Example (conceptual):

Domain validTo = 2035-01-01 (enforced)
Pseudonym request validTo = 2036-01-01
Result validTo may be capped to 2035-01-01
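
The capping rule from this example could look roughly like the following sketch. The function `resolve_valid_to` is hypothetical, and how the inheritance flag is set for a capped value is an assumption (here it stays false because a value was provided).

```python
from datetime import date


def resolve_valid_to(requested, domain_valid_to, enforce_end_date_validity):
    """Return (validTo, inherited) for a new pseudonym."""
    if requested is None:
        return domain_valid_to, True       # nothing provided: inherit from domain
    if enforce_end_date_validity and requested > domain_valid_to:
        # Capped to the domain's validTo; flag handling here is an assumption.
        return domain_valid_to, False
    return requested, False


valid_to, inherited = resolve_valid_to(date(2036, 1, 1), date(2035, 1, 1), True)
print(valid_to, inherited)  # 2035-01-01 False
```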

Multiple pseudonyms per identifier¶

The domain property multiplePsnAllowed controls whether multiple pseudonyms can exist for the same identifier + idType in the same domain:

false (default): enforce 1:1 mapping
true: allow 1:n mappings

This can be helpful when, for example, a patient has multiple x-ray images of the same modality and you want to generate a unique pseudonym for each image.

Batch pseudonym operations¶

TrustDeck supports batch creation of pseudonyms via dedicated endpoints (see Swagger UI). Batch operations typically return a per-item status such as:

INSERTION_SUCCESS
INSERTION_DUPLICATE_IDENTIFIER
INSERTION_DUPLICATE_PSEUDONYM
INSERTION_ERROR
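
How the per-item statuses relate to multiplePsnAllowed can be sketched with an in-memory store for a single domain. This is an illustration only: the function `insert_batch`, the item layout with a `psn` key, and the order in which the checks run are assumptions, not the actual TrustDeck implementation.

```python
def insert_batch(store, items, multiple_psn_allowed=False):
    """Insert items into one domain's store and return one status per item.

    store maps (identifier, idType) -> list of pseudonyms.
    """
    results = []
    for item in items:
        try:
            key = (item["identifier"], item["idType"])
            psn = item["psn"]
        except KeyError:
            results.append("INSERTION_ERROR")            # malformed item
            continue
        if any(psn in psns for psns in store.values()):
            results.append("INSERTION_DUPLICATE_PSEUDONYM")
        elif key in store and not multiple_psn_allowed:
            results.append("INSERTION_DUPLICATE_IDENTIFIER")  # 1:1 mapping enforced
        else:
            store.setdefault(key, []).append(psn)
            results.append("INSERTION_SUCCESS")
    return results


items = [
    {"identifier": "123456", "idType": "EHR_ID", "psn": "D-abcd"},
    {"identifier": "123456", "idType": "EHR_ID", "psn": "D-efgh"},
    {"identifier": "654321", "idType": "EHR_ID", "psn": "D-abcd"},
    {"identifier": "999999"},
]
print(insert_batch({}, items))
print(insert_batch({}, items, multiple_psn_allowed=True))
```

With the default setting the second item is rejected as a duplicate identifier; with multiplePsnAllowed set to true it is stored as a second pseudonym for the same identifier.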

Cross-domain pseudonym linking¶

TrustDeck can link pseudonyms across domains by traversing the domain hierarchy and matching identifier/pseudonym relationships along the path. This makes it possible to retrieve the pseudonym that is linked to a given one in the same domain tree but on a different level.

The input is a source domain plus either an identifier + idType or a pseudonym.
The user also specifies a target domain.
The endpoint then tries to find the pseudonym in the target domain that is chained to the given one.

Example:

Input:
- identifier: 123456
- idType: EHR_ID
- sourceDomain: Domain
- targetDomain: GrandChildDomain
Assumed domain structure: Domain > ChildDomain > GrandChildDomain
Example pseudonym chain:
- Domain Domain:
  - identifier: 123456
  - idType: EHR_ID
  - Pseudonym: D-abcd
- Domain ChildDomain:
  - identifier: D-abcd
  - idType: Domain_PSN
  - Pseudonym: CD-1234
- Domain GrandChildDomain:
  - identifier: CD-1234
  - idType: ChildDomain_PSN
  - Pseudonym: GCD-a1b2
Output:
- identifier: CD-1234
- idType: ChildDomain_PSN
- Pseudonym: GCD-a1b2
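
The traversal in this example can be sketched as follows. The data mirrors the chain above; the in-memory layout, the function names, and the `<domain name>_PSN` idType convention (inferred from the example) are assumptions. The sketch only walks from a source domain down to one of its descendants and starts from an identifier, not from a pseudonym.

```python
# Each domain maps (identifier, idType) -> pseudonym.
RECORDS = {
    "Domain": {("123456", "EHR_ID"): "D-abcd"},
    "ChildDomain": {("D-abcd", "Domain_PSN"): "CD-1234"},
    "GrandChildDomain": {("CD-1234", "ChildDomain_PSN"): "GCD-a1b2"},
}
PARENT = {"ChildDomain": "Domain", "GrandChildDomain": "ChildDomain"}


def path_down(source, target):
    """Domains from source to target (inclusive), or None if target is not below source."""
    path = [target]
    while path[-1] != source:
        parent = PARENT.get(path[-1])
        if parent is None:
            return None
        path.append(parent)
    return path[::-1]


def linked_record(identifier, id_type, source, target):
    """Follow the pseudonym chain from source to target; None if the chain breaks."""
    path = path_down(source, target)
    if path is None:
        return None
    record = None
    for domain in path:
        psn = RECORDS[domain].get((identifier, id_type))
        if psn is None:
            return None
        record = {"identifier": identifier, "idType": id_type, "psn": psn}
        # The pseudonym becomes the identifier looked up in the next domain.
        identifier, id_type = psn, f"{domain}_PSN"
    return record


print(linked_record("123456", "EHR_ID", "Domain", "GrandChildDomain"))
```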

Pseudonymization algorithms¶

An algorithm defines the technical rules for generating pseudonyms. TrustDeck supports multiple algorithm types, for example:

Algorithm	Characteristics
MD5	Cryptographic (but broken) hash
SHA1/2/3	Cryptographic hash
BLAKE3	Cryptographic hash
xxHASH	Fast non-cryptographic hash
RANDOM	Random generation from a configurable alphabet
CONSECUTIVE	Sequential numbering

Algorithm parameters (examples)¶

Common parameters include:

Alphabet (for random-style generation or check digit calculation)
Pseudonym length (how long the output pseudonym should be, excluding any prefix)
Salt (for hashing algorithms)
Padding character
Check digit settings (whether enabled, and whether it counts toward length)
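
To show how these parameters might interact for a hash-based algorithm, here is a minimal sketch. It is not TrustDeck's implementation: the function `hash_pseudonym`, the way salt, identifier, and idType are concatenated, the use of uppercase hex output, and left-side padding are all assumptions made for illustration.

```python
import hashlib


def hash_pseudonym(identifier, id_type, salt, length=16, padding_char="0", prefix=""):
    """Derive a deterministic pseudonym of exactly `length` characters plus prefix."""
    data = f"{salt}{identifier}{id_type}".encode("utf-8")
    digest = hashlib.sha256(data).hexdigest().upper()   # 64 hex characters
    # Truncate to the requested length; pad on the left if the digest is shorter.
    body = digest[:length].rjust(length, padding_char)
    return prefix + body


psn = hash_pseudonym("123456", "EHR_ID", salt="example-salt", length=16, prefix="D-")
print(psn, len(psn))
```

The same input and salt always produce the same pseudonym, while a different salt yields an unrelated value. The prefix is not counted toward the pseudonym length.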

Self configuration of algorithms¶

Users often want the shortest possible pseudonyms for their use case, since shorter strings are less error-prone when handled manually.

When selecting a randomness-based algorithm, the user can let the algorithm configure itself. To do so, the user provides three values: an estimate of how many pseudonyms the domain needs to hold (e.g., 100 million for pseudonymizing persons in a large country), the probability with which pseudonymization should succeed (collisions become more likely as the number of random pseudonyms grows), and the alphabet from which the random strings are drawn.

The algorithm then calculates the minimum pseudonym length that satisfies these settings, so generated pseudonyms are only as long as they need to be.
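
One way such a length calculation could work is sketched below. This is an assumption about the approach, not TrustDeck's actual formula: it reads "success" as "no collision among all generated pseudonyms" and uses the birthday (union) bound, under which n random draws from a space of N = alphabet_size^length values collide with probability at most n(n-1)/(2N). The function name `min_length` is hypothetical.

```python
from fractions import Fraction


def min_length(expected_count, success_probability, alphabet_size):
    """Smallest length whose collision bound n(n-1)/(2N) stays within 1 - p."""
    if not 0 < success_probability < 1:
        raise ValueError("success_probability must be strictly between 0 and 1")
    if alphabet_size < 2:
        raise ValueError("alphabet needs at least two characters")
    max_failure = 1 - Fraction(str(success_probability))
    pairs = Fraction(expected_count * (expected_count - 1), 2)
    length, space = 1, alphabet_size
    while pairs / space > max_failure:   # bound still too high: add one character
        length += 1
        space *= alphabet_size
    return length


# 100 million pseudonyms, 99 % success probability, alphabet A-Z
print(min_length(100_000_000, 0.99, 26))  # 13
```

Exact fractions are used instead of floating point so that the comparison is not affected by rounding for very large spaces.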

Check digits (optional)¶

TrustDeck can apply a check digit (based on Luhn mod n) depending on domain configuration. This can improve detection of transcription errors when pseudonyms are manually handled.
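
For reference, the generic Luhn mod N algorithm can be sketched as follows. This shows the textbook algorithm only; how TrustDeck maps characters to code points and where it places the check character are not specified here, so the alphabet order and the appended position are assumptions.

```python
def luhn_mod_n_check_char(body, alphabet):
    """Compute the Luhn mod N check character for `body` over `alphabet`."""
    n = len(alphabet)
    total, factor = 0, 2
    for ch in reversed(body):               # start at the rightmost character
        addend = factor * alphabet.index(ch)
        factor = 1 if factor == 2 else 2    # alternate between 2 and 1
        total += addend // n + addend % n   # sum of the base-n digits
    return alphabet[(n - total % n) % n]


def luhn_mod_n_is_valid(value, alphabet):
    """Check a string whose last character is the check character."""
    n = len(alphabet)
    total, factor = 0, 1
    for ch in reversed(value):
        addend = factor * alphabet.index(ch)
        factor = 1 if factor == 2 else 2
        total += addend // n + addend % n
    return total % n == 0


digits = "0123456789"
print(luhn_mod_n_check_char("7992739871", digits))   # 3

letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
psn = "ABCD" + luhn_mod_n_check_char("ABCD", letters)
print(psn, luhn_mod_n_is_valid(psn, letters))         # ABCDQ True
print(luhn_mod_n_is_valid("BBCDQ", letters))          # False (one character mistyped)
```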

Projects¶

A project is a container for entity management in the KING module. Projects provide:

Organizational boundaries for entities
Metadata (name, abbreviation, start/end dates)
Configuration flags (e.g., whether it is used to store entities or pseudonyms)
Access control via Keycloak roles/groups (depending on deployment)

Project–domain relationship¶

Projects and domains are separate concepts:

Domains: pseudonymization containers
Projects: entity management containers

Projects may reference domains when a project is configured to store pseudonyms.

Entities¶

Entities are part of the KING module and represent real-world objects you want to register and track (e.g., persons, biosamples, devices, documents). TrustDeck distinguishes between:

Entity types: the schema/blueprint (what fields an entity has, validation rules, semantics)
Entity instances: the actual stored records (data that conforms to an entity type)

In other words: EntityType = definition, EntityInstance = data.

Entity types¶

An entity type defines:

a stable type name / identifier (e.g., Patient, Sample, StudySubject)
a schema describing the allowed payload fields and constraints
metadata (description, versioning information, etc., depending on your setup)

Think of an entity type as a JSON Schema-like contract that describes what “valid entity data” looks like.

To avoid recreating the same kinds of entities over and over, TrustDeck distinguishes two kinds of entity types: base types and project-specific entity types.

Base types can only be created by authorized personnel, such as administrators or PIs. Examples are a person entity, a biosample entity, or a device entity. Base types should not define too many attributes, as projects might not need some of them. A project that wants to use a person entity defines its own project-specific person entity by extending the base type and adding the attributes that are specific to the project.
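
Extending a base type can be pictured as merging attribute definitions. The sketch below is conceptual: the dictionary layout, the function `extend_entity_type`, the base attribute `dateOfBirth`, and the rule that a project-specific type may not redefine base attributes are all assumptions for illustration.

```python
# Hypothetical base type with a deliberately small attribute set.
BASE_PERSON = {"name": "BasePerson", "attributes": {"dateOfBirth": "date"}}


def extend_entity_type(base, name, extra_attributes):
    """Create a project-specific type: base attributes plus project attributes."""
    clashes = set(base["attributes"]) & set(extra_attributes)
    if clashes:
        raise ValueError(f"attributes already defined by base type: {sorted(clashes)}")
    return {
        "name": name,
        "extends": base["name"],
        "attributes": {**base["attributes"], **extra_attributes},
    }


patient = extend_entity_type(
    BASE_PERSON,
    "Patient",
    {"patientID": "string", "caseID": "string", "admissionDate": "date"},
)
print(sorted(patient["attributes"]))
```

The base type itself is left unchanged, so other projects can extend it with a different set of attributes.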

Entity instances¶

An entity instance is a concrete record of a specific entity, created under a project and associated with a chosen entity type.

An entity instance typically contains:

an instance identifier (unique ID, can be generated by TrustDeck)
a reference to the entity type (which schema it follows)
the payload/data (a JSON matching the type schema)
lifecycle metadata (created/updated timestamps, status flags such as active/deleted, depending on implementation)
optional links to pseudonyms/domains (depending on project configuration)

Projects and entities¶

Entity instances live inside projects. Projects act as containers and boundaries for:

access control (who may read/write entities in that project)
organization (grouping of related entities)
optional storage behavior (whether to store entities and/or associated pseudonyms)

Projects and domains are separate concepts:

Projects organize entities.
Domains organize pseudonym mappings.

A project may be configured to store pseudonyms, which can be used to attach pseudonyms to entity instances or to derive pseudonyms during ingestion (depending on your workflow).

Example workflow (conceptual)¶

Create a project:
- MyStudyProject
Define an entity type (schema/blueprint):
- Create EntityType named Patient based on BasePerson
- Include fields like patientID, caseID, mainDiagnosis, admissionDate, etc.
Register entity instances in the project:
- Create instance of type Patient with payload (JSON)
- Retrieve/list instances for downstream processing

Example payload (conceptual):

{
  ...
  "patientID": "PAT-123456",
  "caseID": "A4B3C2D1",
  "mainDiagnosis": "ICD10GM-M54.5",
  "admissionDate": "2026-01-01"
}
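
Before an instance is stored, its payload is checked against the entity type. A minimal sketch of such a check for the fields shown above follows. It assumes that all four fields are required strings and that admissionDate is an ISO 8601 date; the function `validate_patient` is hypothetical and does not reflect TrustDeck's actual validation logic.

```python
from datetime import date

REQUIRED_FIELDS = ("patientID", "caseID", "mainDiagnosis", "admissionDate")


def validate_patient(payload):
    """Return a list of problems; an empty list means the payload is accepted."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not isinstance(payload.get(field), str):
            problems.append(f"{field}: missing or not a string")
    if isinstance(payload.get("admissionDate"), str):
        try:
            date.fromisoformat(payload["admissionDate"])
        except ValueError:
            problems.append("admissionDate: not an ISO 8601 date")
    return problems


ok = {
    "patientID": "PAT-123456",
    "caseID": "A4B3C2D1",
    "mainDiagnosis": "ICD10GM-M54.5",
    "admissionDate": "2026-01-01",
}
print(validate_patient(ok))  # []
print(validate_patient({"patientID": "PAT-1", "mainDiagnosis": "x", "admissionDate": "01.01.2026"}))
```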