Pseudo-IDs

Every object generated by Pseudata has a unique, deterministic pseudo-id—a special UUID v8 that encodes three pieces of information: the world-seed, the type-sequence, and the object’s index in its PseudoArray.

[Figure: PseudoID bit layout]

Unlike random UUIDs, pseudo-ids are:

Deterministic - Same across all languages and executions
Sortable - Organized by world, then type, then creation order
Self-describing - Contains metadata about the object’s origin
RFC 9562 compliant - Standard UUID v8 format for vendor-specific data

Technical Limits:

World-Seeds: 64-bit (18.4 quintillion unique worlds)
Type-Sequences: 16-bit (65,536 different types)
Index Range: 40-bit (1.1 trillion objects per type)
Layout Versions: 2-bit (4 values: the current layout plus 3 reserved for future encoding schemes)

These limits are designed to handle virtually any real-world use case.
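The figures above follow directly from the field widths. As a quick arithmetic check in Python (the names below are illustrative only and are not part of the Pseudata API):

```python
# Field widths as listed under "Technical Limits".
FIELD_BITS = {"world_seed": 64, "layout": 2, "type_seq": 16, "index": 40}

# Number of distinct values each field can hold.
capacity = {name: 2 ** bits for name, bits in FIELD_BITS.items()}

# A UUID has 128 bits; v8 reserves 4 for the version and 2 for the variant.
payload_bits = 128 - 4 - 2

print(capacity["world_seed"])  # 18446744073709551616
print(capacity["type_seq"])    # 65536
print(capacity["index"])       # 1099511627776
print(capacity["layout"])      # 4
print(sum(FIELD_BITS.values()) == payload_bits)  # True
```

The four fields together use exactly the 122 bits that UUID v8 leaves available.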

Components

World-Seed

The world-seed is the base seed you provide when creating a PseudoArray. It defines the entire “world” or “universe” of generated data.

// Different arrays, same world
users := pseudata.NewUserArray(42)
addresses := pseudata.NewAddressArray(42)

userID := users.At(0).ID()
addrID := addresses.At(0).ID()

// Both PseudoIDs contain worldSeed=42
// They belong to the same "world"

UserArray users = new UserArray(42L);  // worldSeed = 42
User user = users.at(1000);
String id = user.id();
// This PseudoID will contain worldSeed=42

users = UserArray(42)  # worldSeed = 42
user = users[1000]
pseudo_id = user.id()  # avoids shadowing Python's built-in id()
# This PseudoID will contain worldSeed=42

const users = new UserArray(42n);  // worldSeed = 42
const user = users.at(1000);
const id = user.id();
// This PseudoID will contain worldSeed=42

Use cases:

Separate production, staging, and test data
Isolate data by tenant or customer
Group related test scenarios

Type-Sequence

The type-sequence (typeSeq) identifies the type of object. Each complex type has a predefined sequence number:

Type	TypeSeq	Description
User	101	User objects (OIDC-compliant)
Address	110	Address objects (locale-aware)
Custom Types	1024+	Reserved for custom types

Note: TypeSeq uses 16 bits (0-65,535), with built-in types using 0-1023 and custom types starting from 1024.
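The ranges in the note can be expressed as a small check. This is a sketch only: `type_seq_range` is a hypothetical helper, not a Pseudata function, and it assumes nothing beyond the two ranges stated above.

```python
BUILT_IN_MAX = 1023     # built-in types use 0-1023
TYPE_SEQ_MAX = 0xFFFF   # 16-bit field: 0-65,535


def type_seq_range(type_seq: int) -> str:
    """Return "built-in" or "custom" for a typeSeq that fits in 16 bits."""
    if not 0 <= type_seq <= TYPE_SEQ_MAX:
        raise ValueError(f"typeSeq {type_seq} does not fit in 16 bits")
    return "built-in" if type_seq <= BUILT_IN_MAX else "custom"


print(type_seq_range(101))   # built-in
print(type_seq_range(1024))  # custom
```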

// Different types have different typeSeq values
users := pseudata.NewUserArray(42)      // typeSeq = 101
addresses := pseudata.NewAddressArray(42)  // typeSeq = 110

userID := users.At(100).ID()      // Contains typeSeq=101
addressID := addresses.At(100).ID()  // Contains typeSeq=110

// PseudoIDs are different even at same index!

// Different types have different typeSeq values
UserArray users = new UserArray(42L);      // typeSeq = 101
AddressArray addresses = new AddressArray(42L);  // typeSeq = 110

String userId = users.at(100).id();      // Contains typeSeq=101
String addressId = addresses.at(100).id();  // Contains typeSeq=110

// PseudoIDs are different even at same index!

# Different types have different typeSeq values
users = UserArray(42)      # typeSeq = 101
addresses = AddressArray(42)  # typeSeq = 110

user_id = users[100].id()      # Contains typeSeq=101
address_id = addresses[100].id()  # Contains typeSeq=110

# PseudoIDs are different even at same index!

// Different types have different typeSeq values
const users = new UserArray(42n);      // typeSeq = 101
const addresses = new AddressArray(42n);  // typeSeq = 110

const userId = users.at(100).id();      // Contains typeSeq=101
const addressId = addresses.at(100).id();  // Contains typeSeq=110

// PseudoIDs are different even at same index!

This means you can identify an object’s type-sequence just by looking at its pseudo-id.

Index Position

The index is the position of the object in its PseudoArray. It represents the “creation order” within that type.

users := pseudata.NewUserArray(42)

first := users.At(0)       // index = 0
second := users.At(1)      // index = 1
thousandth := users.At(1000)  // index = 1000

id1 := first.ID()      // Contains index=0
id2 := second.ID()     // Contains index=1
id1000 := thousandth.ID()  // Contains index=1000

UserArray users = new UserArray(42L);

User first = users.at(0);       // index = 0
User second = users.at(1);      // index = 1
User thousandth = users.at(1000);  // index = 1000

String id1 = first.id();      // Contains index=0
String id2 = second.id();     // Contains index=1
String id1000 = thousandth.id();  // Contains index=1000

users = UserArray(42)

first = users[0]       # index = 0
second = users[1]      # index = 1
thousandth = users[1000]  # index = 1000

id1 = first.id()      # Contains index=0
id2 = second.id()     # Contains index=1
id1000 = thousandth.id()  # Contains index=1000

const users = new UserArray(42n);

const first = users.at(0);       // index = 0
const second = users.at(1);      // index = 1
const thousandth = users.at(1000);  // index = 1000

const id1 = first.id();      // Contains index=0
const id2 = second.id();     // Contains index=1
const id1000 = thousandth.id();  // Contains index=1000

The index can range from 0 to 1,099,511,627,775 (40 bits), supporting over 1.1 trillion unique objects per type.

How Pseudo-IDs Work

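Pseudo-ids use the standard RFC 9562 UUID v8 format. The world-seed, layout bits, type-sequence, and index fill the 122 bits that UUID v8 leaves for custom data (the bits a random UUID v4 would fill with random values):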

UUID Format: xxxxxxxx-xxxx-8xxx-yxxx-xxxxxxxxxxxx
             └─────────── encoded data ──────────┘
                         8 = version (v8)
                          y = variant (8-b)
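Because the version and variant sit at fixed positions, any standard UUID parser can confirm that a string has the v8 shape. A minimal sketch using Python's standard `uuid` module (`is_uuid_v8` is a hypothetical helper, not part of Pseudata):

```python
import uuid


def is_uuid_v8(text: str) -> bool:
    """True if text parses as a UUID with version 8 and the RFC variant."""
    try:
        parsed = uuid.UUID(text)
    except ValueError:
        return False
    return parsed.variant == uuid.RFC_4122 and parsed.version == 8


print(is_uuid_v8("00000000-0000-8002-a806-5000000003e8"))  # True
print(is_uuid_v8("not-a-uuid"))                            # False
```

This only checks the envelope. It says nothing about whether the remaining bits were produced by Pseudata.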

Bit Distribution:

64 bits: World-seed (full uint64 range)
2 bits: Layout bits (reserved for future layouts, currently always 0)
16 bits: Type-sequence (0-65,535)
40 bits: Index (0-1.1 trillion)

Visual Pattern:

SSSSSSSS-SSSS-8SSS-vSTT-TTIIIIIIIIII

S = WorldSeed bits
T = TypeSeq bits
I = Index bits
v = Variant nibble + Layout bits (reserved)
8 = Version (UUID v8)

Example:

Input:  worldSeed=42, typeSeq=101, index=1000
Output: 00000000-0000-8002-a806-5000000003e8
        └─ world ──┘ v │└ type+index ──┘
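To make the packing concrete, here is a minimal Python sketch of one way to place the four fields around the version and variant bits. It is not the library's implementation. It assumes the fields are packed contiguously in the order listed under "Bit Distribution", most significant first. The library's exact bit positions may differ, so this sketch is not guaranteed to reproduce Pseudata's IDs byte for byte. `encode_sketch` and `decode_sketch` are hypothetical names.

```python
import uuid

SEED_BITS, LAYOUT_BITS, TYPE_BITS, INDEX_BITS = 64, 2, 16, 40
LOW_MASK = (1 << 62) - 1  # the 62 payload bits that follow the variant


def encode_sketch(world_seed: int, type_seq: int, index: int, layout: int = 0) -> str:
    """Pack the fields into a UUID v8 string (assumed bit positions)."""
    for value, bits in ((world_seed, SEED_BITS), (layout, LAYOUT_BITS),
                        (type_seq, TYPE_BITS), (index, INDEX_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    # Build the 122-bit payload: seed | layout | type | index.
    payload = world_seed
    payload = (payload << LAYOUT_BITS) | layout
    payload = (payload << TYPE_BITS) | type_seq
    payload = (payload << INDEX_BITS) | index
    high = payload >> 74             # first 48 payload bits
    mid = (payload >> 62) & 0xFFF    # next 12 bits, after the version nibble
    low = payload & LOW_MASK         # last 62 bits, after the variant bits
    value = (high << 80) | (0x8 << 76) | (mid << 64) | (0b10 << 62) | low
    return str(uuid.UUID(int=value))


def decode_sketch(text: str) -> tuple[int, int, int, int]:
    """Inverse of encode_sketch: (world_seed, layout, type_seq, index)."""
    value = uuid.UUID(text).int
    # Drop the version and variant bits to recover the 122-bit payload.
    payload = ((value >> 80) << 74) | (((value >> 64) & 0xFFF) << 62) | (value & LOW_MASK)
    index = payload & ((1 << INDEX_BITS) - 1)
    type_seq = (payload >> INDEX_BITS) & ((1 << TYPE_BITS) - 1)
    layout = (payload >> (INDEX_BITS + TYPE_BITS)) & ((1 << LAYOUT_BITS) - 1)
    world_seed = payload >> (INDEX_BITS + TYPE_BITS + LAYOUT_BITS)
    return world_seed, layout, type_seq, index


pseudo_id = encode_sketch(42, 101, 1000)
print(uuid.UUID(pseudo_id).version)  # 8
print(decode_sketch(pseudo_id))      # (42, 0, 101, 1000)
```

The round trip is the point of the sketch: once the version and variant bits are removed, the remaining 122 bits split back into the original fields without loss.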

Sort Order

PseudoIDs are designed to sort hierarchically:

Primary: World-seed (groups by world)
Secondary: Type-sequence (groups types within a world)
Tertiary: Index (creation order within type)

// Go - Demonstrating sort order
id1 := EncodeID(42, 101, 1000)  // World 42, Users, Index 1000
id2 := EncodeID(42, 101, 1001)  // World 42, Users, Index 1001
id3 := EncodeID(42, 110, 1000)  // World 42, Addresses, Index 1000
id4 := EncodeID(43, 101, 1000)  // World 43, Users, Index 1000

// Sort order (lexicographical):
// id1 < id2  (same world/type, index 1000 < 1001)
// id2 < id3  (same world, type 101 < 110)
// id3 < id4  (world 42 < 43)

This makes PseudoIDs perfect for:

Database indexing - Natural clustering by world and type
Analytics - Group queries by world or type
Debugging - Find all objects from a specific world
Testing - Isolate test data by world seed
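The sort property can be checked without the library. The sketch below uses a hypothetical `pack` helper that assumes the fields are laid out in the order seed, layout, type, index, with the layout bits set to 0. It then compares plain string sorting against sorting by (world, type, index).

```python
import uuid
from itertools import product

LOW_MASK = (1 << 62) - 1


def pack(world_seed: int, type_seq: int, index: int) -> str:
    """Assumed layout: seed | 2 layout bits (0) | typeSeq | index, as UUID v8."""
    payload = (world_seed << 58) | (type_seq << 40) | index
    value = (((payload >> 74) << 80) | (0x8 << 76)
             | (((payload >> 62) & 0xFFF) << 64)
             | (0b10 << 62) | (payload & LOW_MASK))
    return str(uuid.UUID(int=value))


# Every combination of two worlds, two types, and two indexes.
fields = list(product([42, 43], [101, 110], [1000, 1001]))
ids = {pack(*f): f for f in fields}

# Sorting the ID strings gives the same order as sorting the tuples.
by_string = [ids[s] for s in sorted(ids)]
print(by_string == sorted(fields))  # True
```

This works because the version and variant bits are constant and every ID is fixed-width lowercase hex, so string order matches the numeric order of the packed fields.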

Using Pseudo-IDs

Accessing Pseudo-IDs

Every generated object exposes its pseudo-id through an id() method (ID() in Go):

import (
  "fmt"

  "github.com/pseudata/pseudata"
)

users := pseudata.NewUserArray(42)
user := users.At(1000)

pseudoID := user.ID()
fmt.Println(pseudoID)  // "00000000-0000-8002-a806-5000000003e8"

import dev.pseudata.UserArray;
import dev.pseudata.User;

UserArray users = new UserArray(42L);
User user = users.at(1000);

String pseudoId = user.id();
System.out.println(pseudoId);  // "00000000-0000-8002-a806-5000000003e8"

from pseudata import UserArray

users = UserArray(42)
user = users[1000]

pseudo_id = user.id()
print(pseudo_id)  # "00000000-0000-8002-a806-5000000003e8"

import { UserArray } from '@pseudata/core';

const users = new UserArray(42n);
const user = users.at(1000);

// Get the object's unique PseudoID
const pseudoId = user.id();
console.log(pseudoId);  // "00000000-0000-8002-a806-5000000003e8"

Creating Pseudo-IDs Without Pseudo-Arrays

You can generate PseudoIDs directly using utility functions:

import "github.com/pseudata/pseudata"

pseudoID := pseudata.EncodeID(
  42,     // worldSeed
  101,    // typeSeq (Users)
  1000,   // index
)

import dev.pseudata.IDUtils;

String pseudoId = IDUtils.encodeId(
  42L,       // worldSeed
  101,       // typeSeq (Users)
  1000L      // index
);

from pseudata.id_utils import encode_id

pseudo_id = encode_id(
  world_seed=42,
  type_seq=101,  # Users
  index=1000
)

import { encodeId } from '@pseudata/core';

const pseudoId = encodeId(
  42n,     // worldSeed
  101,     // typeSeq (Users)
  1000n    // index
);
// "00000000-0000-8002-a806-5000000003e8"

Use cases for direct encoding:

Generate test fixtures
Create migration scripts
Replay specific scenarios
Generate IDs for external systems

Decoding Pseudo-IDs

Extract the components from any pseudo-id:

import (
  "fmt"

  "github.com/pseudata/pseudata"
)

components, err := pseudata.DecodeID("00000000-0000-8002-a806-5000000003e8")
if err == nil {
  fmt.Printf("World: %d\n", components.WorldSeed)  // 42
  fmt.Printf("Type: %d\n", components.TypeSeq)     // 101
  fmt.Printf("Index: %d\n", components.Index)      // 1000
}

import dev.pseudata.IDUtils;

IDUtils.IDComponents c = IDUtils.decodeId("00000000-0000-8002-a806-5000000003e8");
System.out.printf("World: %d, Type: %d, Index: %d%n",
  c.worldSeed, c.typeSeq, c.index);

from pseudata.id_utils import decode_id

components = decode_id("00000000-0000-8002-a806-5000000003e8")
if components:
  print(f"World: {components.world_seed}")    # 42
  print(f"Type: {components.type_seq}")       # 101
  print(f"Index: {components.index}")         # 1000

import { decodeId } from '@pseudata/core';

const components = decodeId("00000000-0000-8002-a806-5000000003e8");
if (components) {
  console.log(`World: ${components.worldSeed}`);    // 42
  console.log(`Type: ${components.typeSeq}`);       // 101
  console.log(`Index: ${components.index}`);        // 1000
}

Practical Use Cases

Database Primary Keys

Use PseudoIDs as primary keys for deterministic, reproducible databases:

// TypeScript - Generate consistent primary keys
const users = new UserArray(productionSeed);

for (let i = 0; i < 10000; i++) {
  const user = users.at(i);
  await db.insert("users", {
    id: user.id(), // Deterministic PseudoID
    name: user.name(),
    email: user.email(),
  });
}

Benefits:

Same IDs across test runs
Easy to reference in test assertions
Natural clustering in database indexes

Cross-Language Test Fixtures

Generate matching test data across different services:

# Python backend service
from pseudata import UserArray

users = UserArray(42)
test_user = users[1000]
test_user_id = test_user.id()  # "00000000-0000-8002-a806-5000000003e8"

// TypeScript frontend test
import { UserArray } from "@pseudata/core";

const users = new UserArray(42n);
const testUser = users.at(1000);
const testUserId = testUser.id(); // "00000000-0000-8002-a806-5000000003e8"

// Same PseudoID! Can test cross-service interactions
await expect(page.locator(`[data-user-id="${testUserId}"]`)).toBeVisible();

Debugging Production Issues

Decode PseudoIDs in production logs to understand data origin:

// Go - Production debugging
suspiciousID := "00000000-0000-8002-a806-5000000003e8"
components, err := pseudata.DecodeID(suspiciousID)
if err != nil {
  log.Fatalf("not a valid PseudoID: %v", err)
}

log.Printf("Issue analysis:")
log.Printf("  Environment seed: %d", components.WorldSeed)  // 42
log.Printf("  Object type: %d", components.TypeSeq)         // 101 (User)
log.Printf("  Array position: %d", components.Index)        // 1000

// Reproduce the exact object
users := pseudata.NewUserArray(components.WorldSeed)
problematicUser := users.At(int(components.Index))

Multi-Tenant Applications

Use different world seeds per tenant:

// Java
long tenant1Seed = 1001;
long tenant2Seed = 2002;

UserArray tenant1Users = new UserArray(tenant1Seed);
UserArray tenant2Users = new UserArray(tenant2Seed);

// PseudoIDs naturally contain tenant information
String t1UserId = tenant1Users.at(0).id();  // Contains seed=1001
String t2UserId = tenant2Users.at(0).id();  // Contains seed=2002

// Can identify tenant from any PseudoID
IDUtils.IDComponents c = IDUtils.decodeId(t1UserId);
long tenantSeed = c.worldSeed;  // 1001

Analytics & Reporting

Query data by world or type:

# Python - Analytics example
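from collections import defaultdict
from pseudata.id_utils import decode_id

# Per-world counters: each world starts with a zero count and an empty type set
metrics = defaultdict(lambda: {'count': 0, 'types': set()})

# Analyze production PseudoIDs (production_user_ids is an iterable of ID strings)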
for user_id in production_user_ids:
  components = decode_id(user_id)
  if components:
    metrics[components.world_seed]['count'] += 1
    metrics[components.world_seed]['types'].add(components.type_seq)

# Report: "World 42 has 5000 Users, World 43 has 3000 Users"

Cross-Language Consistency

PseudoIDs are identical across all languages for the same inputs:

// Same world, type, and index → Same PseudoID everywhere

// TypeScript
new UserArray(42n).at(1000).id()
// → "00000000-0000-8002-a806-5000000003e8"

// Python
UserArray(42)[1000].id()
// → "00000000-0000-8002-a806-5000000003e8"

// Java
new UserArray(42L).at(1000).id()
// → "00000000-0000-8002-a806-5000000003e8"

// Go
NewUserArray(42).At(1000).ID()
// → "00000000-0000-8002-a806-5000000003e8"

This consistency is guaranteed by:

Shared test vectors (fixtures/id_test_vectors.json)
Identical bit-packing algorithms
Comprehensive cross-language test suites
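A shared-vector check can be sketched as follows. The JSON shape is an assumption made for illustration. The real fixtures/id_test_vectors.json may be structured differently, and `failing_vectors` is a hypothetical helper. The single vector reuses the example shown earlier on this page.

```python
import json
from typing import Callable

# Hypothetical vector format; the real fixtures file may differ.
VECTORS_JSON = """
[
  {"worldSeed": 42, "typeSeq": 101, "index": 1000,
   "id": "00000000-0000-8002-a806-5000000003e8"}
]
"""


def failing_vectors(encode: Callable[[int, int, int], str], raw: str) -> list[dict]:
    """Return the vectors for which encode() disagrees with the expected id."""
    return [
        vector for vector in json.loads(raw)
        if encode(vector["worldSeed"], vector["typeSeq"], vector["index"]) != vector["id"]
    ]
```

In each language's test suite, `encode` would be that port's own encoder. An empty result means the port agrees with every shared vector.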

Best Practices

Use Meaningful World Seeds

Choose world seeds that reflect your environments:

const DEV_SEED = 1n;
const STAGING_SEED = 2n;
const PRODUCTION_SEED = 42n;
const TEST_SEED = 9999n;

// Clear separation of data
const devUsers = new UserArray(DEV_SEED);
const stagingUsers = new UserArray(STAGING_SEED);

Document Type Sequences

Maintain a registry of your type sequences:

const (
    TypeSeqUsers     = 101
    TypeSeqAddresses = 110
    TypeSeqOrders    = 201
    TypeSeqPayments  = 202
    // Custom types start at 1024
    TypeSeqMyCustom  = 10000
)

Decode for Observability

Add PseudoID decoding to your logging:

# Python - Enhanced logging
import logging
from pseudata.id_utils import decode_id

def log_user_action(user_id: str, action: str):
    components = decode_id(user_id)
    if components:
        logging.info(
            f"Action: {action}, "
            f"World: {components.world_seed}, "
            f"Type: {components.type_seq}, "
            f"Index: {components.index}"
        )

Validate PseudoIDs in Production

Ensure PseudoIDs come from expected worlds:

// Java - Validation
public boolean isValidProductionUser(String userId) {
    IDUtils.IDComponents c = IDUtils.decodeId(userId);
    return c.worldSeed == PRODUCTION_SEED
        && c.typeSeq == TypeSeq.USERS;
}

Comparison: PseudoID vs Random UUID

Feature	Random UUID	PseudoID
Deterministic	❌ Different each time	✅ Same across languages
Cross-language	❌ Random everywhere	✅ Identical everywhere
Sortable	❌ Random order	✅ World → Type → Index
Decodable	❌ No metadata	✅ Extract components
Use with PseudoArrays	❌ Not tied to data	✅ Directly from objects
RFC compliant	✅ Yes (RFC 4122)	✅ Yes (RFC 9562)
Best for	Unique identifiers	Traceable, reproducible IDs