Pseudata

One seed. Every language. Same data.


Stop Fighting Inconsistent Test Data

Your frontend expects Alice. Your backend returns Bob.
Different faker libraries = Different realities.

Pseudata solves this by making mock data generation a standardized algorithm—not a language-specific implementation.

📖 Read the full story: Why faker.js and Faker Don’t Agree

Key Features

Cross-Language Consistency

Same seed = same data across all languages. No more integration test mismatches. Learn more →

Infinite Scale

$O(1)$ instant access to billions of records with zero memory overhead. Learn more →

Stateless Relations

Create deterministic relationships without a database. Navigate entities in $O(1)$ time. Learn more →

Smart Locale Loading

Minimal default, scale globally. Load only the specific regions and data you need. Learn more →

Same Seed, Same Data—At Any Scale

Pseudata is the only faker library that guarantees both cross-language consistency and $O(1)$ infinite scale.

❌ Traditional Fakers

Different data across languages

# Python (Faker)
from faker import Faker
fake = Faker()
Faker.seed(42)
fake.name()  # → "Brett Davis"

// TypeScript (faker.js)
import { faker } from "@faker-js/faker";
faker.seed(42);
faker.person.fullName(); // → "Miss Dora Kiehn"

Sequential access only

# To reach item 1,000,000 you must generate every item before it - O(n)
Faker.seed(42)
for i in range(1_000_000):
    name = fake.name()

✅ Pseudata

Identical data across all languages

# Python
users = UserArray(42)
user = users.at(0)
print(user.name)  # → "John Smith"

// TypeScript
const users = new UserArray(42);
const user = users.at(0);
console.log(user.name); // → "John Smith"

Direct access to any index - $O(1)$

# Jump directly to the billionth record
users = UserArray(42)
user = users.at(1_000_000_000)  # Instant

Why This Matters

Cross-Language Consistency enables reliable integration testing across polyglot systems. When your Go backend and TypeScript frontend both generate User[1000] with seed 42, you get the same test data—eliminating the integration test mismatches that plague microservices architectures.

Direct $O(1)$ Access unlocks scenarios impossible with traditional fakers:

Load testing - Each test worker accesses its own range (worker 1 uses indices 0-999, worker 2 uses 1000-1999) without coordination
Parallel processing - Split a billion-record dataset across processes instantly, no sequential generation required
Sparse testing - Test edge cases at indices 1, 1000, 1000000, 1000000000 without generating intermediate records
Reproducible demos - Jump to the “interesting” user at index 42857 every time
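
One way to picture how index-addressable generation can work is to derive each record from a hash of the seed and the index, instead of advancing a shared random stream. The sketch below is illustrative only and is not Pseudata's actual algorithm: the SplitMix64-style mixer, the tiny name tables, and the helper names `user_at` and `worker_range` are all assumptions made for this example.

```python
"""Illustrative sketch of index-addressable mock data (not Pseudata's real algorithm)."""

MASK64 = (1 << 64) - 1

# Hypothetical, tiny lookup tables; a real library would ship far larger ones.
FIRST_NAMES = ("Alice", "Bob", "Carol", "Dave")
LAST_NAMES = ("Smith", "Jones", "Lee", "Garcia")


def mix64(x: int) -> int:
    """SplitMix64-style mixer: integer-only arithmetic, so it ports to any language."""
    x = (x + 0x9E3779B97F4A7C15) & MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


def user_at(seed: int, index: int) -> dict:
    """Build record `index` directly from (seed, index); no earlier record is generated."""
    h = mix64(mix64(seed & MASK64) ^ (index & MASK64))
    first = FIRST_NAMES[h % len(FIRST_NAMES)]
    last = LAST_NAMES[(h >> 32) % len(LAST_NAMES)]
    return {"index": index, "name": f"{first} {last}"}


def worker_range(worker: int, size: int) -> range:
    """Indices owned by a 1-based worker: worker 1 gets 0..size-1, worker 2 the next slice."""
    start = (worker - 1) * size
    return range(start, start + size)


if __name__ == "__main__":
    # Sparse access: each lookup costs the same, however large the index.
    for i in (1, 1_000, 1_000_000, 1_000_000_000):
        print(user_at(42, i))
    # Each worker derives its own slice without coordinating with the others.
    print(worker_range(2, 1000))
```

Because every record depends only on `(seed, index)`, any process in any language that implements the same integer arithmetic and the same lookup tables would arrive at the same record, which is the property the scenarios above rely on.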

See It In Action

User[1000] with seed 42 is always the same in every language:

[Diagram: seed 42 flows to the Go, Java, Python, and TypeScript implementations, all producing the identical output "John Smith".]

package main

import (
    "fmt"

    "github.com/pseudata/pseudata"
)

func main() {
    users := pseudata.NewUserArray(42)
    user := users.At(1000)
    fmt.Println(user.Name)  // → "John Smith"
    fmt.Println(user.Email) // → "john.smith@example.com"
}

import dev.pseudata.UserArray;

UserArray users = new UserArray(42);
User user = users.at(1000);
System.out.println(user.getName());  // → "John Smith"
System.out.println(user.getEmail()); // → "john.smith@example.com"

from pseudata import UserArray

users = UserArray(42)
user = users.at(1000)
print(user.name)  # → "John Smith"
print(user.email) # → "john.smith@example.com"

import { UserArray } from "@pseudata/core";

const users = new UserArray(42);
const user = users.at(1000);
console.log(user.name);  // → "John Smith"
console.log(user.email); // → "john.smith@example.com"

Stateless Relationships

Navigate complex data graphs in $O(1)$ time without requiring a database.

Traditional mock data is “flat”—users and groups have no inherent connection unless you manually link them. Pseudo links use a bit-coordinate system to bake relationships directly into the IDs themselves.

[Diagram: Stateless Relationships]

Relational data that exists mathematically

By treating a 40-bit index as a coordinate (island, neighborhood, and connector), you can instantly calculate the ID of a related entity.

Zero Lookups: No database queries or cache hits required.
Bidirectional: If User A is in Group B, Group B “knows” it contains User A through the same shared bit pattern.
Stateless: The relationship is defined by deterministic logic, not by stored state.
Shard-Aware: The island component ensures related entities hash to the same partition in distributed systems, keeping relationships co-located.
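
The coordinate idea can be sketched with plain bit packing. The layout below is an assumption made for illustration (20 island bits, 17 neighborhood bits, and 3 connector bits, 40 in total); the real PseudoLink layout and the meaning of its constructor arguments may differ, and `encode`, `decode`, `group_of`, and `members_of` are hypothetical helpers rather than the library's API.

```python
"""Illustrative bit-coordinate packing (assumed layout, not the PseudoLink implementation)."""

# Assumed split of a 40-bit index: island | neighborhood | connector.
ISLAND_BITS = 20
NEIGHBORHOOD_BITS = 17
CONNECTOR_BITS = 3


def encode(island: int, neighborhood: int, connector: int) -> int:
    """Pack the three coordinates into a single 40-bit index."""
    if not 0 <= island < (1 << ISLAND_BITS):
        raise ValueError("island out of range")
    if not 0 <= neighborhood < (1 << NEIGHBORHOOD_BITS):
        raise ValueError("neighborhood out of range")
    if not 0 <= connector < (1 << CONNECTOR_BITS):
        raise ValueError("connector out of range")
    shifted_island = island << (NEIGHBORHOOD_BITS + CONNECTOR_BITS)
    return shifted_island | (neighborhood << CONNECTOR_BITS) | connector


def decode(index: int) -> tuple:
    """Unpack an index into (island, neighborhood, connector)."""
    connector = index & ((1 << CONNECTOR_BITS) - 1)
    neighborhood = (index >> CONNECTOR_BITS) & ((1 << NEIGHBORHOOD_BITS) - 1)
    island = index >> (NEIGHBORHOOD_BITS + CONNECTOR_BITS)
    return island, neighborhood, connector


def group_of(user_index: int) -> int:
    """User -> group: keep island and neighborhood, use connector 0 (assumed convention)."""
    island, neighborhood, _ = decode(user_index)
    return encode(island, neighborhood, 0)


def members_of(group_index: int) -> list:
    """Group -> users: enumerate every connector slot in the same neighborhood."""
    island, neighborhood, _ = decode(group_index)
    return [encode(island, neighborhood, c) for c in range(1 << CONNECTOR_BITS)]


if __name__ == "__main__":
    alice = encode(1, 1000, 5)
    group = group_of(alice)
    print(decode(alice), decode(group))
    print(alice in members_of(group))  # the relation can be walked in both directions
```

Both directions are pure arithmetic on the index, so nothing has to be stored or looked up. Placing the island in the highest bits also means related indices share a common prefix, which is one way a partitioner could keep them together.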

import "github.com/pseudata/pseudata"

users := pseudata.NewUserArray(42)
groups := pseudata.NewGroupArray(42)
link := pseudata.NewPseudoLink(17, 3)

// Alice in neighborhood 1000
aliceIdx := link.Encode(1, 1000, 0)
alice := users.At(aliceIdx)

// Find Alice's group (same coordinates, different entity type)
groupIdx := link.Resolve(aliceIdx, 0)
group := groups.At(groupIdx)

import dev.pseudata.UserArray;
import dev.pseudata.GroupArray;
import dev.pseudata.PseudoLink;

UserArray users = new UserArray(42);
GroupArray groups = new GroupArray(42);
PseudoLink link = new PseudoLink(17, 3);

// Alice in neighborhood 1000
long aliceIdx = link.encode(1, 1000, 0);
User alice = users.at(aliceIdx);

// Find Alice's group (same coordinates, different entity type)
long groupIdx = link.resolve(aliceIdx, 0);
Group group = groups.at(groupIdx);

from pseudata import UserArray, GroupArray, PseudoLink

users = UserArray(42)
groups = GroupArray(42)
link = PseudoLink(17, 3)

# Alice in neighborhood 1000
alice_idx = link.encode(1, 1000, 0)
alice = users.at(alice_idx)

# Find Alice's group (same coordinates, different entity type)
group_idx = link.resolve(alice_idx, 0)
group = groups.at(group_idx)

import { UserArray, GroupArray, PseudoLink } from '@pseudata/core';

const users = new UserArray(42n);
const groups = new GroupArray(42n);
const link = new PseudoLink(17, 3);

// Alice in neighborhood 1000
const aliceIdx = link.encode(1, 1000, 0);
const alice = users.at(aliceIdx);

// Find Alice's group (same coordinates, different entity type)
const groupIdx = link.resolve(aliceIdx, 0);
const group = groups.at(groupIdx);

Learn more about Stateless Relations →

Built for Global Scale

Traditional faker libraries load all locale data at once or require complex configuration. Pseudata uses a compositional architecture that eliminates duplication and enables precise control.

Compositional Architecture

The diagram below shows how bundle modules are composed from locale modules, which are themselves composed from atomic modules (general, country, language). Box labels are relative import paths from @pseudata/core/resources/ (omitted for readability).

Composition uses shallow copies of resource data—no duplication, just references. This makes creating new bundles extremely cheap: adding a region or custom locale combination has minimal overhead.

[Diagram: Locale Architecture]

Data is organized into four layers:

General - Shared by all locales (email domains)
Language - Linguistic data (months, weekdays, word lists)
Country - Geographic standards (address formats, phone patterns)
Locale - Cultural specificity (cities, names, streets)

Canadian Example: Both en_CA and fr_CA share the same country data (Canadian address formats, phone patterns) but use different language data. English months are defined once in en/ and reused by both en_US and en_CA—zero duplication.
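
The layering can be pictured as a shallow merge of small dictionaries. The sketch below is an assumption-laden illustration, not Pseudata's resource format: the `compose` helper, the key names, the sample values, and the contents of the `BUNDLE_NA` mapping are all hypothetical.

```python
"""Illustrative locale composition by shallow merge (hypothetical data and helper names)."""

# Atomic modules: each piece of data is defined exactly once.
GENERAL = {"email_domains": ("example.com", "example.org")}
LANG_EN = {"months": ("January", "February", "March")}
LANG_FR = {"months": ("janvier", "février", "mars")}
COUNTRY_US = {"phone_pattern": "(###) ###-####"}
COUNTRY_CA = {"phone_pattern": "###-###-####"}


def compose(*layers: dict) -> dict:
    """Shallow-merge layers; later layers win, and values are shared rather than copied."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Locale modules: general + language + country + locale-specific data.
EN_US = compose(GENERAL, LANG_EN, COUNTRY_US, {"cities": ("Seattle", "Austin")})
EN_CA = compose(GENERAL, LANG_EN, COUNTRY_CA, {"cities": ("Toronto", "Vancouver")})
FR_CA = compose(GENERAL, LANG_FR, COUNTRY_CA, {"cities": ("Montréal", "Québec")})

# A bundle is just a mapping from locale code to an already-composed locale.
BUNDLE_NA = {"en_US": EN_US, "en_CA": EN_CA, "fr_CA": FR_CA}

if __name__ == "__main__":
    # en_US and en_CA point at the very same months object: no duplication.
    print(EN_US["months"] is EN_CA["months"])
    # en_CA and fr_CA share country data but not language data.
    print(EN_CA["phone_pattern"] is FR_CA["phone_pattern"])
    print(EN_CA["months"] is FR_CA["months"])
```

Since `compose` copies only references, adding another locale or bundle costs a handful of dictionary entries regardless of how large the underlying word lists are.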

Regional Bundles

Choose bundles that match your needs:

Geographic: NA, EU, APAC, MEA, SA
Business regions: AMER, EMEA
Cultural groups: LATAM, DACH
Complete: World (all locales)

By default, only the en_US locale loads. Import regional bundles as you expand.

Modern build tools automatically tree-shake unused bundles. You only ship what you use.

Using Locales

By default, Pseudata loads only the US bundle (en_US locale). This means you can start using Pseudata immediately without any configuration:

import "github.com/pseudata/pseudata"

users := pseudata.NewUserArray(42)

import dev.pseudata.UserArray;

UserArray users = new UserArray(42);

from pseudata import UserArray

users = UserArray(42)

import { UserArray } from '@pseudata/core';

const users = new UserArray(42n);

Import a bundle and pass it via options to use locales from that region.

import (
    "github.com/pseudata/pseudata"
    "github.com/pseudata/pseudata/resources/bundles/na"
)

users := pseudata.NewUserArray(42, pseudata.WithResources(na.Resources))

import dev.pseudata.UserArray;
import dev.pseudata.resources.bundles.na.Resources;

UserArray users = new UserArray(42, Resources.INSTANCE);

from pseudata import UserArray
from pseudata.resources.bundles.na import resources_na

users = UserArray(42, resources=resources_na)

import { UserArray } from '@pseudata/core';
import { Resources } from '@pseudata/core/resources/bundles/na';

const users = new UserArray(42n, { resources: Resources });

Learn more about locale architecture →