
Pseudata

One seed. Every language. Same data.

Your frontend expects Alice. Your backend returns Bob.
Different faker libraries = Different realities.

Pseudata solves this by making mock data generation a standardized algorithm—not a language-specific implementation.

📖 Read the full story: Why faker.js and Faker Don’t Agree

Cross-Language Consistency

Same seed = same data across all languages. No more integration test mismatches. Learn more →

Infinite Scale

O(1) instant access to billions of records with zero memory overhead. Learn more →

Stateless Relations

Create deterministic relationships without a database. Navigate entities in O(1) time. Learn more →

Smart Locale Loading

Minimal default, scale globally. Load only the specific regions and data you need. Learn more →

Pseudata is the only faker library that guarantees both cross-language consistency and O(1) infinite scale.

Different data across languages

# Python (Faker)
from faker import Faker
Faker.seed(42)
fake = Faker()
fake.name()  # → "Brett Davis"

// TypeScript (faker.js)
import { faker } from "@faker-js/faker";
faker.seed(42);
faker.person.fullName(); // → "Miss Dora Kiehn"

Sequential access only

# To reach item 1,000,000 you must generate every item before it: O(n)
from faker import Faker

Faker.seed(42)
fake = Faker()
for _ in range(1_000_000):
    name = fake.name()

Cross-Language Consistency enables reliable integration testing across polyglot systems. When your Go backend and TypeScript frontend both generate User[1000] with seed 42, you get the same test data—eliminating the integration test mismatches that plague microservices architectures.

Direct O(1) Access unlocks scenarios impossible with traditional fakers:

  • Load testing - Each test worker accesses its own range (worker 1 uses indices 0-999, worker 2 uses 1000-1999) without coordination
  • Parallel processing - Split a billion-record dataset across processes instantly, no sequential generation required
  • Sparse testing - Test edge cases at indices 1, 1000, 1000000, 1000000000 without generating intermediate records
  • Reproducible demos - Jump to the “interesting” user at index 42857 every time
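The scenarios above all rely on deriving a record from (seed, index) alone. The following is a minimal Go sketch of that idea, not Pseudata's actual algorithm: it assumes a SplitMix64-style mixing function, and the names `mix`, `nameAt`, `workerRange`, and the `firstNames` sample list are hypothetical.

```go
package main

import "fmt"

// mix is the SplitMix64 finalizer, a stateless 64-bit mixing function.
// It is a stand-in here; the source does not say which function Pseudata uses.
func mix(x uint64) uint64 {
	x += 0x9E3779B97F4A7C15
	x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9
	x = (x ^ (x >> 27)) * 0x94D049BB133111EB
	return x ^ (x >> 31)
}

// firstNames is illustrative sample data only.
var firstNames = []string{"Alice", "Bob", "Carol", "Dave"}

// nameAt computes the record at index i directly from (seed, i).
// No earlier record is generated, so the cost does not depend on i.
func nameAt(seed, i uint64) string {
	h := mix(seed ^ mix(i))
	return firstNames[h%uint64(len(firstNames))]
}

// workerRange returns the half-open index range [start, end) for a worker.
// Workers are numbered from 1, matching the load-testing example above.
func workerRange(worker, perWorker uint64) (start, end uint64) {
	return (worker - 1) * perWorker, worker * perWorker
}

func main() {
	start, end := workerRange(2, 1000)
	fmt.Println(start, end) // 1000 2000

	// Sparse access: jump straight to a distant index, repeatably.
	a := nameAt(42, 1_000_000_000)
	b := nameAt(42, 1_000_000_000)
	fmt.Println(a == b) // true
}
```

Because every lookup is a pure function of its inputs, workers need no shared state: each one only has to know its own range.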

User[1000] with seed 42 = Always the same across every language:

Diagram showing seed 42 flowing to Go, Java, Python, and TypeScript implementations, all producing identical output: John Smith

package main
import (
	"fmt"
	"github.com/pseudata/pseudata"
)

func main() {
	users := pseudata.NewUserArray(42)
	user := users.At(1000)
	fmt.Println(user.Name)  // → "John Smith"
	fmt.Println(user.Email) // → "john.smith@example.com"
}

Navigate complex data graphs in O(1) time without requiring a database.

Traditional mock data is “flat”—users and groups have no inherent connection unless you manually link them. Pseudo links use a bit-coordinate system to bake relationships directly into the IDs themselves.

Stateless Relationships Diagram

Relational data that exists mathematically

By treating a 40-bit index as a coordinate (island, neighborhood, and connector), you can instantly calculate the ID of a related entity.

  • Zero Lookups: No database queries or cache hits required.
  • Bidirectional: If User A is in Group B, Group B “knows” it contains User A through the same shared bit pattern.
  • Stateless: The relationship is defined by deterministic logic, not by stored state.
  • Shard-Aware: The island component ensures related entities hash to the same partition in distributed systems, keeping relationships co-located.
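To make the bit-coordinate idea concrete, here is a small Go sketch. It is an illustration under stated assumptions, not Pseudata's real layout: the source only says the index is 40 bits wide, so the 16/16/8 field split, the `Coord` type, and the `Pack`, `Unpack`, `GroupOf`, and `InGroup` helpers are all hypothetical.

```go
package main

import "fmt"

// Hypothetical field widths that add up to 40 bits.
const (
	connectorBits    = 8
	neighborhoodBits = 16
	islandBits       = 16

	connectorMask    = (1 << connectorBits) - 1
	neighborhoodMask = (1 << neighborhoodBits) - 1
	islandMask       = (1 << islandBits) - 1
)

// Coord is the decoded form of a 40-bit index.
type Coord struct {
	Island, Neighborhood, Connector uint64
}

// Pack encodes a coordinate as island | neighborhood | connector.
func Pack(c Coord) uint64 {
	return (c.Island&islandMask)<<(neighborhoodBits+connectorBits) |
		(c.Neighborhood&neighborhoodMask)<<connectorBits |
		(c.Connector & connectorMask)
}

// Unpack decodes an index back into its three fields.
func Unpack(idx uint64) Coord {
	return Coord{
		Island:       (idx >> (neighborhoodBits + connectorBits)) & islandMask,
		Neighborhood: (idx >> connectorBits) & neighborhoodMask,
		Connector:    idx & connectorMask,
	}
}

// GroupOf derives a related ID by clearing the connector bits (assumed scheme).
func GroupOf(userIdx uint64) uint64 {
	return userIdx &^ uint64(connectorMask)
}

// InGroup checks membership from the group's side using the same bit pattern.
func InGroup(userIdx, groupIdx uint64) bool {
	return GroupOf(userIdx) == groupIdx
}

func main() {
	user := Pack(Coord{Island: 3, Neighborhood: 7, Connector: 5})
	group := GroupOf(user)

	fmt.Println(Unpack(user))         // {3 7 5}
	fmt.Println(InGroup(user, group)) // true
}
```

Both directions are plain bit arithmetic on the ID, which is why no lookup table or stored state is involved in this sketch.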

Learn more about Stateless Relations →

Traditional faker libraries load all locale data at once or require complex configuration. Pseudata uses a compositional architecture that eliminates duplication and enables precise control.

The diagram below shows how bundle modules are composed from locale modules, which are themselves composed from atomic modules (general, country, language). Box labels are relative import paths from @pseudata/core/resources/ (omitted for readability).

Composition uses shallow copies of resource data—no duplication, just references. This makes creating new bundles extremely cheap: adding a region or custom locale combination has minimal overhead.

Locale Architecture Diagram

Data is organized into four intelligent layers:

  • General - Shared by all locales (email domains)
  • Language - Linguistic data (months, weekdays, word lists)
  • Country - Geographic standards (address formats, phone patterns)
  • Locale - Cultural specificity (cities, names, streets)

Canadian Example: Both en_CA and fr_CA share the same country data (Canadian address formats, phone patterns) but use different language data. English months are defined once in en/ and reused by both en_US and en_CA—zero duplication.
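The Canadian example can be sketched in Go as follows. This is only an illustration of composition by reference: the `General`, `Language`, `Country`, and `Locale` types and all sample values are hypothetical and do not reflect Pseudata's actual resource types.

```go
package main

import "fmt"

// Atomic modules (hypothetical shapes with sample data).
type General struct{ EmailDomains []string }
type Language struct{ Months []string }
type Country struct{ PhonePattern string }

// Locale points at atomic modules, so composing a locale
// copies references rather than the underlying data.
type Locale struct {
	General  *General
	Language *Language
	Country  *Country
	Cities   []string // locale-specific layer
}

var (
	general = &General{EmailDomains: []string{"example.com"}}
	en      = &Language{Months: []string{"January", "February"}}
	fr      = &Language{Months: []string{"janvier", "février"}}
	us      = &Country{PhonePattern: "+1 ###-###-####"}
	ca      = &Country{PhonePattern: "+1 ###-###-####"}
)

// Each locale is a cheap combination of shared modules.
var locales = map[string]Locale{
	"en_US": {general, en, us, []string{"Austin"}},
	"en_CA": {general, en, ca, []string{"Toronto"}},
	"fr_CA": {general, fr, ca, []string{"Montréal"}},
}

func main() {
	// en_US and en_CA reuse the very same English language module.
	fmt.Println(locales["en_US"].Language == locales["en_CA"].Language) // true
	// en_CA and fr_CA reuse the very same Canadian country module.
	fmt.Println(locales["en_CA"].Country == locales["fr_CA"].Country) // true
}
```

Adding another locale in this sketch costs one small struct of pointers, which mirrors the claim that new combinations have minimal overhead.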

Choose bundles that match your needs:

  • Geographic: NA, EU, APAC, MEA, SA
  • Business regions: AMER, EMEA
  • Cultural groups: LATAM, DACH
  • Complete: World (all locales)

By default, only the US locale loads. Import regional bundles as you expand.

Modern build tools automatically tree-shake unused bundles. You only ship what you use.

By default, Pseudata loads only the US bundle (en_US locale). This means you can start using Pseudata immediately without any configuration:

import "github.com/pseudata/pseudata"
users := pseudata.NewUserArray(42)

Import a bundle and pass it via options to use locales from that region.

import (
	"github.com/pseudata/pseudata"
	"github.com/pseudata/pseudata/resources/bundles/na"
)
users := pseudata.NewUserArray(42, pseudata.WithResources(na.Resources))

Learn more about locale architecture →