
Core concepts

This page explains the fundamental concepts that form the foundation of TrustDeck: domains, pseudonyms, algorithms, projects, and entities. These concepts define the data model and operational behavior of the pseudonymization system.

For details on API usage, see the Swagger UI.
For security and authorization, see Authentication (OIDC/JWT).


Overview

TrustDeck organizes pseudonymization around five core concepts:

| Concept | Purpose | Key code entities (examples) |
|---|---|---|
| Domains | Configuration units that define pseudonymization rules and form hierarchical structures | Domain, DomainDTO, DomainDBAccessService, DomainRESTController |
| Pseudonyms | Identifier-to-pseudonym mappings scoped to specific domains | Pseudonym, PseudonymDTO, PseudonymDBAccessService, PseudonymRESTController |
| Algorithms | Technical parameters for pseudonym generation (algorithm type, alphabet, length, etc.) | Algorithm, AlgorithmDTO, PseudonymizationFactory, Pseudonymizer |
| Projects | Entity management containers | Project, ProjectDTO, ProjectDBService, ProjectRESTController |
| Entities | Representations of real life persons, samples, objects, ... | EntityType, EntityInstance, EntityTypeRESTController, EntityInstanceRESTController, EntityTypeDTO, EntityInstanceDTO, EntityTypeDBService, EntityInstanceDBService |

Domain-pseudonym-algorithm relationship (high level)

A domain groups semantically similar pseudonyms and defines the ground rules for managing them.
A pseudonym is created within exactly one domain and stores the mapping for a specific identifier (+ type).
An algorithm describes how the pseudonym is generated (and with which parameters) for that domain.

In practice:

  • Domains reference an algorithm configuration.
  • Pseudonyms are created/read within a domain context.
  • Algorithm choice and parameters can be inherited from the parent domain.

Domains

A domain is a configuration unit that defines how pseudonyms are managed. Each domain encapsulates:

  • Validity period (start/end dates)
  • Enforcement rules (whether to enforce validity constraints)
  • Prefix (prepended to generated pseudonyms)
  • Hierarchical relationship (optional parent domain)

Domain hierarchy

Domains form a tree structure. A domain may have a parent domain (superdomain) and can inherit configuration values from that parent.

Inheritance mechanism

When a domain is created with a parent, it can inherit properties. Each inheritable property has a corresponding *Inherited boolean flag.

| Property | Inherited flag | Default if not specified |
|---|---|---|
| validFrom | validFromInherited | Parent’s validFrom or now |
| validTo | validToInherited | Parent’s validTo or derived from validityTime |
| enforceStartDateValidity | enforceStartDateValidityInherited | Parent’s value or true |
| enforceEndDateValidity | enforceEndDateValidityInherited | Parent’s value or true |
| algorithm | algorithmInherited | Parent’s algorithm or RANDOM |
| alphabet | alphabetInherited | Parent’s alphabet or A–Z |
| pseudonymLength | pseudonymLengthInherited | Parent’s length or 16 |
| paddingCharacter | paddingCharacterInherited | Parent’s character or "0" |
| addCheckDigit | addCheckDigitInherited | Parent’s value or true |
| lengthIncludesCheckDigit | lengthIncludesCheckDigitInherited | Parent’s value or false |
| multiplePsnAllowed | multiplePsnAllowedInherited | Parent’s value or false |
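
The fallback order in the table (explicit value, then parent value, then built-in default) can be pictured with a minimal Python sketch. This is an illustration only, not TrustDeck's implementation: the helper `resolve_domain` and the dictionary layout are hypothetical, only a subset of properties is covered, and the assumption that a built-in default is flagged as not inherited is ours.

```python
from typing import Optional

# Built-in defaults for a subset of the properties listed in the table.
DEFAULTS = {
    "algorithm": "RANDOM",
    "pseudonymLength": 16,
    "paddingCharacter": "0",
    "addCheckDigit": True,
    "lengthIncludesCheckDigit": False,
    "multiplePsnAllowed": False,
}


def resolve_domain(requested: dict, parent: Optional[dict]) -> dict:
    """Hypothetical helper: fill unspecified properties from parent or defaults."""
    resolved = {}
    for prop, default in DEFAULTS.items():
        if requested.get(prop) is not None:
            # An explicitly provided value always wins.
            resolved[prop] = requested[prop]
            resolved[prop + "Inherited"] = False
        elif parent is not None:
            # Otherwise take the parent's value and mark it as inherited.
            resolved[prop] = parent[prop]
            resolved[prop + "Inherited"] = True
        else:
            # No parent: fall back to the built-in default (assumed: not inherited).
            resolved[prop] = default
            resolved[prop + "Inherited"] = False
    return resolved


root = resolve_domain({"pseudonymLength": 8}, parent=None)
child = resolve_domain({"algorithm": "CONSECUTIVE"}, parent=root)
print(child["pseudonymLength"], child["pseudonymLengthInherited"])
```

Here the child keeps its own algorithm but takes the pseudonym length from its parent, with the matching *Inherited flag set.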

Domain creation

Domains are created through REST endpoints. Typical patterns:

  • Standard domain creation: simplified creation with essential properties
  • Complete domain creation: specify all domain properties explicitly

Domains apply defaults if values are not provided (e.g., default algorithm, alphabet, pseudonym length, etc.).


Pseudonyms

A pseudonym is a mapping between an identifier and a generated pseudonym string, always scoped to a specific domain.

A pseudonym record typically contains:

  • Identifier (identifier + idType)
  • Pseudonym (generally generated by TrustDeck, but can also be given)
  • Validity period (validFrom, validTo)
  • Inheritance flags (whether validity was inherited from domain)
  • Domain reference (domainId)

Identifier structure

Identifiers are represented as a combination of:

  • identifier (the actual identifying string)
  • idType (the identifier type, e.g., EHR_ID, SSN, INSURANCE_ID)

Example (conceptual):

identifier: "123456"
idType: "PATIENT_ID"

Pseudonym validity and inheritance

Pseudonyms inherit their validity period from the domain unless one is explicitly specified. When the respective enforcement flags are set in the domain, TrustDeck can automatically ensure that a pseudonym's validity does not start before or end after the domain's validity. Typical behavior:

  • validFrom:

    • uses provided value if given (subject to enforcement rules)
    • otherwise inherits from domain
  • validTo:

    • can be provided directly or derived from a validityTime parameter
    • may be capped by domain validity if enforcement is enabled
  • inheritance flags:

    • set to true when value was inherited from domain
    • set to false when explicitly provided

Example (conceptual):

  • Domain validTo = 2035-01-01 (enforced)
  • Pseudonym request validTo = 2036-01-01
  • Result validTo may be capped to 2035-01-01
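
The capping example above can be written down as a small Python sketch. The function `resolve_valid_to` is hypothetical and only mirrors the behavior described on this page; in particular, treating a capped value as "not inherited" is an assumption.

```python
from datetime import date
from typing import Optional, Tuple


def resolve_valid_to(
    requested: Optional[date], domain_valid_to: date, enforce_end: bool
) -> Tuple[date, bool]:
    """Return (validTo, validToInherited) for a new pseudonym (illustrative only)."""
    if requested is None:
        # Nothing provided: inherit the domain's end date.
        return domain_valid_to, True
    if enforce_end and requested > domain_valid_to:
        # Provided value exceeds the domain's validity: cap it.
        # Assumption: a capped value still counts as explicitly provided.
        return domain_valid_to, False
    return requested, False


capped = resolve_valid_to(date(2036, 1, 1), date(2035, 1, 1), enforce_end=True)
inherited = resolve_valid_to(None, date(2035, 1, 1), enforce_end=True)
print(capped, inherited)
```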

Multiple pseudonyms per identifier

The domain property multiplePsnAllowed controls whether multiple pseudonyms can exist for the same identifier + idType in the same domain:

  • false (default): enforce 1:1 mapping
  • true: allow 1:n mappings

This can be helpful when, for example, a patient has multiple x-ray images of the same modality and you want to generate a unique pseudonym for each image.

Batch pseudonym operations

TrustDeck supports batch creation of pseudonyms via dedicated endpoints (see Swagger UI). Batch operations typically return a per-item status such as:

  • INSERTION_SUCCESS
  • INSERTION_DUPLICATE_IDENTIFIER
  • INSERTION_DUPLICATE_PSEUDONYM
  • INSERTION_ERROR
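
To illustrate how per-item statuses and the multiplePsnAllowed setting interact, here is a minimal in-memory sketch in Python. It does not call the TrustDeck API. The function `insert_batch`, the store layout, and the order in which the duplicate checks run are assumptions made for illustration.

```python
def insert_batch(store, items, multiple_psn_allowed=False):
    """Insert items into an in-memory store and return one status per item.

    store maps (identifier, idType) to a list of pseudonyms (hypothetical layout).
    """
    used = {psn for psns in store.values() for psn in psns}
    statuses = []
    for item in items:
        try:
            key = (item["identifier"], item["idType"])
            psn = item["pseudonym"]
        except (KeyError, TypeError):
            # Malformed item: report an error and continue with the rest.
            statuses.append("INSERTION_ERROR")
            continue
        if key in store and not multiple_psn_allowed:
            statuses.append("INSERTION_DUPLICATE_IDENTIFIER")
        elif psn in used:
            statuses.append("INSERTION_DUPLICATE_PSEUDONYM")
        else:
            store.setdefault(key, []).append(psn)
            used.add(psn)
            statuses.append("INSERTION_SUCCESS")
    return statuses


batch = [
    {"identifier": "123456", "idType": "EHR_ID", "pseudonym": "D-abcd"},
    {"identifier": "123456", "idType": "EHR_ID", "pseudonym": "D-efgh"},
    {"identifier": "999", "idType": "EHR_ID", "pseudonym": "D-abcd"},
    {"identifier": "42"},
]
statuses = insert_batch({}, batch)
for item, status in zip(batch, statuses):
    print(item.get("identifier"), status)
```

With multiple_psn_allowed=True, the second item would be accepted as an additional pseudonym for the same identifier.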

Cross-domain pseudonym linking

TrustDeck can support linking pseudonyms across domains by traversing the domain hierarchy and matching identifier/pseudonym relationships along a path. This enables retrieving a pseudonym linked to another one in the same domain tree but on a different level.

  • The input is a source domain plus either an identifier + idType or a pseudonym
  • The user specifies a target domain
  • The endpoint then tries to find the pseudonym in the target domain that is chained to the given one

Example:

  • Input:
    • identifier: 123456
    • idType: EHR_ID
    • sourceDomain: Domain
    • targetDomain: GrandChildDomain
  • Assumed domain structure: Domain > ChildDomain > GrandChildDomain
  • Example pseudonym chain:
    • Domain Domain:
      • identifier: 123456
      • idType: EHR_ID
      • Pseudonym: D-abcd
    • Domain ChildDomain:
      • identifier: D-abcd
      • idType: Domain_PSN
      • Pseudonym: CD-1234
    • Domain GrandChildDomain:
      • identifier: CD-1234
      • idType: ChildDomain_PSN
      • Pseudonym: GCD-a1b2
  • Output:
    • identifier: CD-1234
    • idType: ChildDomain_PSN
    • Pseudonym: GCD-a1b2
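
The chain in this example can be followed with a short Python sketch. It uses hypothetical in-memory dictionaries instead of the real service, only handles the downward direction (source above target), and assumes the `<DomainName>_PSN` idType convention seen in the example.

```python
# Hypothetical in-memory data mirroring the example above.
PARENT = {"GrandChildDomain": "ChildDomain", "ChildDomain": "Domain", "Domain": None}
RECORDS = {
    "Domain": {("123456", "EHR_ID"): "D-abcd"},
    "ChildDomain": {("D-abcd", "Domain_PSN"): "CD-1234"},
    "GrandChildDomain": {("CD-1234", "ChildDomain_PSN"): "GCD-a1b2"},
}


def path_down(source, target):
    """List the domains from source down to target by walking up from target."""
    path = [target]
    while path[-1] != source:
        parent = PARENT.get(path[-1])
        if parent is None:
            raise ValueError("target domain is not below the source domain")
        path.append(parent)
    return list(reversed(path))


def find_linked(identifier, id_type, source, target):
    """Follow the pseudonym chain and return the record found in the target domain."""
    record = None
    for domain in path_down(source, target):
        psn = RECORDS[domain].get((identifier, id_type))
        if psn is None:
            return None  # chain is broken at this level
        record = {"identifier": identifier, "idType": id_type, "pseudonym": psn}
        # The pseudonym of this level is the identifier on the next level.
        identifier, id_type = psn, f"{domain}_PSN"
    return record


linked = find_linked("123456", "EHR_ID", "Domain", "GrandChildDomain")
print(linked)
```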

Pseudonymization algorithms

An algorithm defines the technical rules for generating pseudonyms. TrustDeck supports multiple algorithm types, for example:

| Algorithm | Characteristics |
|---|---|
| MD5 | Cryptographic (but broken) hash |
| SHA1/2/3 | Cryptographic hash |
| BLAKE3 | Cryptographic hash |
| xxHASH | Fast non-cryptographic hash |
| RANDOM | Random generation from a configurable alphabet |
| CONSECUTIVE | Sequential numbering |

Algorithm parameters (examples)

Common parameters include:

  • Alphabet (for random-style generation or check digit calculation)
  • Pseudonym length (how long the output pseudonym should be, excluding any prefix)
  • Salt (for hashing algorithms)
  • Padding character
  • Check digit settings (whether enabled, and whether it counts toward length)
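
As a rough illustration of how these parameters shape the output, the following Python sketch builds pseudonyms in the style of the RANDOM and CONSECUTIVE algorithms. Both functions are hypothetical stand-ins, not TrustDeck code; in particular, padding on the left side is an assumption.

```python
import secrets


def consecutive_pseudonym(counter, length, padding_character="0", prefix=""):
    """Sequential number, padded to the configured length, plus the domain prefix."""
    # Assumption: padding is applied on the left.
    return prefix + str(counter).rjust(length, padding_character)


def random_pseudonym(alphabet, length, prefix=""):
    """Random string over the configured alphabet, plus the domain prefix."""
    return prefix + "".join(secrets.choice(alphabet) for _ in range(length))


print(consecutive_pseudonym(42, 8, "0", "D-"))
print(random_pseudonym("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 16, "D-"))
```

In both cases the prefix is added on top of the configured length, matching the note that the pseudonym length excludes the prefix.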

Self configuration of algorithms

Often, users want the shortest possible pseudonyms for their use case, since shorter strings are less error-prone when handled manually.

When selecting a randomness-based algorithm, the user can let the algorithm configure itself. To do so, the user provides three values: an estimated number of pseudonyms that should be available in the domain (e.g., 100 million for pseudonymizing persons in a large country); the probability with which pseudonymization should succeed, since collisions can occur when generating a large number of random pseudonyms; and the alphabet from which the random strings are generated.

The algorithm then calculates the minimum pseudonym length required to satisfy these settings, so that generated pseudonyms are only as long as they need to be.
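
One way to picture such a calculation is the birthday-problem approximation: with n pseudonyms drawn from a space of N = |alphabet|^length strings, the probability of no collision is roughly exp(-n(n-1)/(2N)). The Python sketch below solves this for the smallest length. Whether TrustDeck uses this exact formula is not stated here, so treat `min_pseudonym_length` as a hypothetical illustration.

```python
import math


def min_pseudonym_length(expected_count, success_probability, alphabet_size):
    """Smallest length whose string space keeps collisions unlikely enough.

    Uses the approximation P(no collision) ~ exp(-n(n-1) / (2N)).
    """
    if not 0 < success_probability < 1:
        raise ValueError("success_probability must be strictly between 0 and 1")
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    if expected_count < 2:
        return 1  # a single pseudonym cannot collide
    # Required number of possible strings N for the requested probability.
    required_space = expected_count * (expected_count - 1) / (
        2 * -math.log(success_probability)
    )
    length = 1
    while alphabet_size ** length < required_space:
        length += 1
    return length


# 100 million pseudonyms, 99 % success probability, alphabet A-Z
print(min_pseudonym_length(100_000_000, 0.99, 26))  # prints 13
```

Because the formula is an approximation, the result is a sizing estimate, not a strict guarantee.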

Check digits (optional)

TrustDeck can apply a check digit (based on Luhn mod n) depending on domain configuration. This can improve detection of transcription errors when pseudonyms are manually handled.
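
For readers unfamiliar with Luhn mod n, the following Python sketch shows the generic textbook algorithm over an arbitrary alphabet. It is not taken from TrustDeck's code, and details such as where the check character is placed may differ in the actual implementation.

```python
def luhn_mod_n_check_character(text, alphabet):
    """Compute the Luhn mod n check character for text over the given alphabet."""
    n = len(alphabet)
    factor = 2
    total = 0
    # Process characters from right to left, doubling every second one.
    for ch in reversed(text):
        addend = factor * alphabet.index(ch)
        factor = 1 if factor == 2 else 2
        # Sum the "digits" of the addend in base n.
        total += addend // n + addend % n
    return alphabet[(n - total % n) % n]


def luhn_mod_n_is_valid(text_with_check, alphabet):
    """Validate a string whose last character is the check character."""
    n = len(alphabet)
    factor = 1
    total = 0
    for ch in reversed(text_with_check):
        addend = factor * alphabet.index(ch)
        factor = 1 if factor == 2 else 2
        total += addend // n + addend % n
    return total % n == 0


digits = "0123456789"
check = luhn_mod_n_check_character("7992739871", digits)
print(check)  # prints 3, the classic Luhn example
print(luhn_mod_n_is_valid("7992739871" + check, digits))
```

A single mistyped character changes the checksum, which is what lets the check character catch typical transcription errors.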


Projects

A project is a container for entity management in the KING module. Projects provide:

  • Organizational boundaries for entities
  • Metadata (name, abbreviation, start/end dates)
  • Configuration flags (e.g., whether it is used to store entities or pseudonyms)
  • Access control via Keycloak roles/groups (depending on deployment)

Project–domain relationship

Projects and domains are separate concepts:

  • Domains: pseudonymization containers
  • Projects: entity management containers

Projects may reference domains when a project is configured to store pseudonyms.

Entities

Entities are part of the KING module and represent real-world objects you want to register and track (e.g., persons, biosamples, devices, documents). TrustDeck distinguishes between:

  • Entity types: the schema/blueprint (what fields an entity has, validation rules, semantics)
  • Entity instances: the actual stored records (data that conforms to an entity type)

In other words: EntityType = definition, EntityInstance = data.

Entity types

An entity type defines:

  • a stable type name / identifier (e.g., Patient, Sample, StudySubject)
  • a schema describing the allowed payload fields and constraints
  • metadata (description, versioning information, etc., depending on your setup)

Think of an entity type as a JSON Schema-like contract that describes what “valid entity data” looks like.

To minimize having to recreate the same kind of entities over and over again, TrustDeck distinguishes two kinds of entity types: base types and project-specific entity types.

Base types can only be created by authorized personnel, such as administrators or PIs. Examples of base types are a person entity, a biosample entity, or a device entity. Base types should not define too many attributes, because projects might not need some of them. When a project wants to use a person entity, it can define its own project-specific person entity by extending the base type and adding attributes that are specific to the project.
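
The relationship between a base type and a project-specific type can be sketched in Python as a simple attribute merge. The dictionary layout, the attribute names of BasePerson, and the rule that a project type may not redefine base attributes are all assumptions made for illustration; the actual schema format is defined by TrustDeck.

```python
def extend_entity_type(base, name, extra_attributes):
    """Create a project-specific type from a base type (illustrative only)."""
    clash = set(base["attributes"]) & set(extra_attributes)
    if clash:
        # Assumption: project types add attributes but do not redefine base ones.
        raise ValueError(f"attributes already defined by base type: {sorted(clash)}")
    return {
        "name": name,
        "baseType": base["name"],
        "attributes": {**base["attributes"], **extra_attributes},
    }


# Hypothetical base type with deliberately few attributes.
base_person = {"name": "BasePerson", "attributes": {"givenName": "string", "familyName": "string"}}

patient = extend_entity_type(
    base_person,
    "Patient",
    {"patientID": "string", "admissionDate": "date"},
)
print(sorted(patient["attributes"]))
```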


Entity instances

An entity instance is a concrete record of a specific entity, created under a project and associated with a chosen entity type.

An entity instance typically contains:

  • an instance identifier (unique ID, can be generated by TrustDeck)
  • a reference to the entity type (which schema it follows)
  • the payload/data (a JSON matching the type schema)
  • lifecycle metadata (created/updated timestamps, status flags such as active/deleted, depending on implementation)
  • optional links to pseudonyms/domains (depending on project configuration)

Projects and entities

Entity instances live inside projects. Projects act as containers and boundaries for:

  • access control (who may read/write entities in that project)
  • organization (grouping of related entities)
  • optional storage behavior (whether to store entities and/or associated pseudonyms)

Projects and domains are separate concepts:

  • Projects organize entities.
  • Domains organize pseudonym mappings.

A project may be configured to store pseudonyms, which can be used to attach pseudonyms to entity instances or to derive pseudonyms during ingestion (depending on your workflow).


Example workflow (conceptual)

  1. Create a project:

    • MyStudyProject
  2. Define an entity type (schema/blueprint):

    • Create EntityType named Patient based on BasePerson
    • Include fields like patientID, caseID, mainDiagnosis, admissionDate, etc.
  3. Register entity instances in the project:

    • Create instance of type Patient with payload (JSON)
    • Retrieve/list instances for downstream processing

Example payload (conceptual):

{
  ...
  "patientID": "PAT-123456",
  "caseID": "A4B3C2D1",
  "mainDiagnosis": "ICD10GM-M54.5",
  "admissionDate": "2026-01-01"
}
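
Before registering an instance, a client might check a payload against the fields named in this example. The Python sketch below is a hand-written check, not TrustDeck's schema validation; the function `validate_patient` and the rule that every field is a string with an ISO-formatted admissionDate are assumptions.

```python
from datetime import date

REQUIRED_FIELDS = ("patientID", "caseID", "mainDiagnosis", "admissionDate")


def validate_patient(payload):
    """Return a list of problems found in a Patient payload (empty if none)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not isinstance(payload.get(field), str):
            errors.append(f"{field}: missing or not a string")
    admission = payload.get("admissionDate")
    if isinstance(admission, str):
        try:
            date.fromisoformat(admission)
        except ValueError:
            errors.append("admissionDate: not an ISO date")
    return errors


payload = {
    "patientID": "PAT-123456",
    "caseID": "A4B3C2D1",
    "mainDiagnosis": "ICD10GM-M54.5",
    "admissionDate": "2026-01-01",
}
print(validate_patient(payload))
```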