Pseudo-Arrays
A pseudo-array is an infinite, deterministic array that generates elements on demand using pseudo-random algorithms. Each index always produces the same element for a given world-seed and type-sequence, with O(1) access and zero memory overhead.
The key insight: instead of storing billions of records in memory, pseudo-arrays calculate each element when you access it. The same index always returns the same element because it uses the deterministic PCG32 algorithm with hierarchical seeding: Generator(worldSeed, typeSeq).Advance(index).
In traditional systems, generating test data means creating and storing arrays in memory or databases. With pseudo-arrays, you can instantly access the billionth user or the trillionth address without iterating or storing any data - it calculates on demand and is mathematically guaranteed to be consistent across all languages.
How It Works
PseudoArray is an abstract base class that concrete types (like UserArray, AddressArray) extend. Each subclass implements the generate() method to define how to create an element at a specific index.
The Generation Process:
- Create a pseudo-array with a world-seed and type-sequence
- Access any index using .at(index)
- Generate the element using Generator(worldSeed, typeSeq).Advance(index)
- Return the generated element (not stored, just calculated)
Key Properties:
- Deterministic: Same worldSeed, typeSeq, and index always produce the same element
- Infinite: Access any index from 0 to 2^40-1 (over 1 trillion elements)
- Zero memory: Each element is calculated on-demand, not stored
- O(1) access: Direct index access without iteration
- Cross-language consistent: Same element in Go, Java, Python, TypeScript and more
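The documented index range (0 to 2^40-1) fits comfortably inside JavaScript's safe-integer range, so a caller can validate an index before passing it to at(). The sketch below is one way to do that; the helper name normalizeIndex and the constant MAX_INDEX are hypothetical and not part of the Pseudata API.

```typescript
// Hypothetical helper (not part of the Pseudata API): validates an index
// against the documented range 0 to 2^40-1 and returns it as a number.
const MAX_INDEX = 2 ** 40 - 1; // 1_099_511_627_775, well below 2^53

function normalizeIndex(index: number | bigint): number {
  if (typeof index === "bigint") {
    // bigint input: compare in bigint space, then convert losslessly
    if (index < 0n || index > BigInt(MAX_INDEX)) {
      throw new RangeError(`index out of range: ${index}`);
    }
    return Number(index);
  }
  // number input: must be a non-negative integer within the range
  if (!Number.isInteger(index) || index < 0 || index > MAX_INDEX) {
    throw new RangeError(`index out of range: ${index}`);
  }
  return index;
}
```

A call such as users.at(normalizeIndex(i)) would then reject negative, fractional, or oversized indices before any generation happens.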
PseudoArray Class
The interface below shows the PseudoArray base class API. All concrete array types extend this class.
The reference implementation that follows it shows how the base class works.
Interface:

    /**
     * PseudoArray<T>
     * Infinite deterministic array of items of type T.
     * Provides constant-time random access to deterministically generated items
     * without storing them in memory.
     */
    declare abstract class PseudoArray<T> {
      /**
       * @param worldSeed - Base seed for deterministic generation.
       * @param typeSeq - Type sequence identifier (e.g., 101 for Users, 110 for Addresses).
       */
      constructor(worldSeed: number | bigint, typeSeq: number | bigint);

      /**
       * Returns the item at the specified index.
       * The same index always produces the same item.
       * @param index - Array index (0 to 2^40-1).
       * @returns Generated item at the specified index.
       */
      public at(index: number | bigint): T;

      /**
       * Subclasses must implement this to define item generation logic.
       * @param worldSeed - Base seed for the array.
       * @param typeSeq - Type sequence identifier.
       * @param index - Array index position.
       * @returns Generated item of type T.
       */
      protected abstract generate(
        worldSeed: number | bigint,
        typeSeq: number | bigint,
        index: number
      ): T;
    }

Implementation:

    /**
     * PseudoArray<T>
     * Infinite deterministic array of items of type T.
     */
    export abstract class PseudoArray<T> {
      protected worldSeed: number | bigint;
      protected typeSeq: number | bigint;

      /**
       * @param worldSeed - Base seed for deterministic generation.
       * @param typeSeq - Type sequence identifier.
       */
      constructor(worldSeed: number | bigint, typeSeq: number | bigint) {
        this.worldSeed = worldSeed;
        this.typeSeq = typeSeq;
      }

      /**
       * Returns the item at the specified index.
       * @param index - Array index (non-negative).
       * @returns Generated item at the specified index.
       */
      public at(index: number | bigint): T {
        const idx = Number(index);
        return this.generate(this.worldSeed, this.typeSeq, idx);
      }

      /**
       * Subclasses implement this to define generation logic.
       * @param worldSeed - Base seed for the array.
       * @param typeSeq - Type sequence identifier.
       * @param index - Array index position.
       * @returns Generated item of type T.
       */
      protected abstract generate(
        worldSeed: number | bigint,
        typeSeq: number | bigint,
        index: number
      ): T;
    }

Concrete Implementations
PseudoArray is abstract - you use concrete implementations like UserArray and AddressArray.
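To illustrate how a concrete type plugs into the base class, here is a minimal sketch of a custom subclass. DiceArray, its typeSeq of 1024, and the mix32 helper are all hypothetical. mix32 is a simple integer hash standing in for the library's PCG32 generator, so its values will not match anything Pseudata produces.

```typescript
// Minimal copy of the base class from the reference implementation above.
abstract class PseudoArray<T> {
  constructor(
    protected worldSeed: number | bigint,
    protected typeSeq: number | bigint
  ) {}

  public at(index: number | bigint): T {
    return this.generate(this.worldSeed, this.typeSeq, Number(index));
  }

  protected abstract generate(
    worldSeed: number | bigint,
    typeSeq: number | bigint,
    index: number
  ): T;
}

// Hypothetical stand-in for PCG32: a 32-bit integer hash
// (the murmur3 finalizer). Returns an unsigned 32-bit value.
function mix32(x: number): number {
  x = Math.imul(x ^ (x >>> 16), 0x85ebca6b);
  x = Math.imul(x ^ (x >>> 13), 0xc2b2ae35);
  return (x ^ (x >>> 16)) >>> 0;
}

// Hypothetical custom type: one six-sided die roll per index.
class DiceArray extends PseudoArray<number> {
  constructor(worldSeed: number | bigint) {
    super(worldSeed, 1024); // the type table lists 1024+ under "Custom"
  }

  protected generate(
    worldSeed: number | bigint,
    typeSeq: number | bigint,
    index: number
  ): number {
    // Reduce each input to 32 bits, then fold them together one at a time.
    const seed = Number(BigInt(worldSeed) & 0xffffffffn);
    const seq = Number(BigInt(typeSeq) & 0xffffffffn);
    const hi = Math.floor(index / 0x100000000); // upper bits of a 40-bit index
    const lo = index >>> 0;                     // lower 32 bits
    const h = mix32(mix32(mix32(mix32(seed) ^ seq) ^ hi) ^ lo);
    return (h % 6) + 1; // map to 1..6
  }
}
```

Because generate() is a pure function of its three inputs, new DiceArray(42n).at(7) returns the same value on every call and in every instance, which is the property the built-in types rely on.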
UserArray
Generates OIDC-compliant user objects with names, emails, avatars, and temporal data.

Go:

    import "github.com/pseudata/pseudata"

    users := pseudata.NewUserArray(42) // worldSeed = 42, typeSeq = 101

    user := users.At(1000)
    fmt.Println(user.Name)  // "John Smith"
    fmt.Println(user.Email) // "john.smith@example.com"
    fmt.Println(user.ID())  // PseudoID: "0000002a-0000-8000-a065-00000003e8"

    // Access any index instantly
    billionthUser := users.At(1_000_000_000) // O(1)

Java:

    import dev.pseudata.UserArray;

    UserArray users = new UserArray(42L); // worldSeed = 42, typeSeq = 101

    User user = users.at(1000);
    System.out.println(user.name());  // "John Smith"
    System.out.println(user.email()); // "john.smith@example.com"
    System.out.println(user.id());    // PseudoID: "0000002a-0000-8000-a065-00000003e8"

    // Access any index instantly
    User billionthUser = users.at(1_000_000_000); // O(1)

Python:

    from pseudata import UserArray

    users = UserArray(42)  # worldSeed = 42, typeSeq = 101

    user = users[1000]
    print(user.name)   # "John Smith"
    print(user.email)  # "john.smith@example.com"
    print(user.id())   # PseudoID: "0000002a-0000-8000-a065-00000003e8"

    # Access any index instantly
    billionth_user = users[1_000_000_000]  # O(1)

TypeScript:

    import { UserArray } from '@pseudata/core';

    const users = new UserArray(42n); // worldSeed = 42, typeSeq = 101

    const user = users.at(1000);
    console.log(user.name);  // "John Smith"
    console.log(user.email); // "john.smith@example.com"
    console.log(user.id());  // PseudoID: "0000002a-0000-8000-a065-00000003e8"

    // Access any index instantly
    const billionthUser = users.at(1_000_000_000); // O(1) - no iteration needed

AddressArray
Generates locale-aware address objects with street, city, state, and postal code.

Go:

    addresses := pseudata.NewAddressArray(42)

    address := addresses.At(500)
    fmt.Println(address.StreetAddress) // "123 Main St"
    fmt.Println(address.City)          // "Springfield"
    fmt.Println(address.PostalCode)    // "12345"

Java:

    AddressArray addresses = new AddressArray(42L);

    Address address = addresses.at(500);
    System.out.println(address.streetAddress()); // "123 Main St"
    System.out.println(address.city());          // "Springfield"
    System.out.println(address.postalCode());    // "12345"

Python:

    addresses = AddressArray(42)

    address = addresses[500]
    print(address.street_address)  # "123 Main St"
    print(address.city)            # "Springfield"
    print(address.postal_code)     # "12345"

TypeScript:

    import { AddressArray } from '@pseudata/core';

    const addresses = new AddressArray(42n); // worldSeed = 42, typeSeq = 110

    const address = addresses.at(500);
    console.log(address.streetAddress); // "123 Main St"
    console.log(address.city);          // "Springfield"
    console.log(address.postalCode);    // "12345"

Usage Patterns
Single Item Access
Access any index directly with O(1) complexity:

    const users = new UserArray(42n);

    const user1 = users.at(0);         // First user
    const user2 = users.at(999);       // 1000th user
    const user3 = users.at(1_000_000); // User at index 1,000,000 - instant access!

Cross-Language Testing
Same data across all languages:

    // Frontend (TypeScript)
    const users = new UserArray(42n);
    const testUser = users.at(1000);
    // testUser.id() = "0000002a-0000-8000-a065-00000003e8"

    # Backend (Python)
    users = UserArray(42)
    test_user = users[1000]
    # test_user.id() = "0000002a-0000-8000-a065-00000003e8"
    # Same ID, same data!

Load Testing
Each worker accesses its own index range without coordination:

    // Worker 1: indices 0-999
    for (let i = 0; i < 1000; i++) {
      const user = users.at(i);
      // Process user...
    }

    // Worker 2: indices 1000-1999
    for (let i = 1000; i < 2000; i++) {
      const user = users.at(i);
      // Process user...
    }

    // No conflicts, no coordination needed

Benefits
Zero Memory Overhead
Traditional array:

    const users = [];
    for (let i = 0; i < 1_000_000; i++) {
      users.push(generateUser(i)); // Stores in memory
    }
    // Memory usage: ~500MB for 1M users

PseudoArray:

    const users = new UserArray(42n);
    const user = users.at(1_000_000); // Calculates on demand
    // Memory usage: ~100 bytes (just the worldSeed and typeSeq)

Infinite Scale
    const users = new UserArray(42n);

    // All of these work instantly:
    users.at(1_000);             // Thousandth user
    users.at(1_000_000);         // Millionth user
    users.at(1_000_000_000);     // Billionth user
    users.at(1_099_511_627_775); // Maximum index (2^40-1)

    // No pre-generation, no memory issues

Deterministic
    const users1 = new UserArray(42n);
    const users2 = new UserArray(42n);

    users1.at(500).name === users2.at(500).name;   // true
    users1.at(500).email === users2.at(500).email; // true
    users1.at(500).id() === users2.at(500).id();   // true

    // Same seed = same data, always

Cross-Language Consistency
Go:

    NewUserArray(42).At(1000).Name
    // "John Smith"

Java:

    new UserArray(42L).at(1000).name()
    // "John Smith"

Python:

    UserArray(42)[1000].name
    # "John Smith"

TypeScript:

    new UserArray(42n).at(1000).name
    // "John Smith"

All languages produce identical data!
When to Use
Pseudo-arrays are designed for test data, mock data, and development scenarios where deterministic, repeatable data is valuable.
Ideal for:
- Unit testing: Consistent test data across test runs
- Integration testing: Same data across frontend and backend tests
- Load testing: Generate billions of entities without memory overhead
- QA/Demos: Reproducible scenarios across environments
- Development: Realistic data without database dependencies
- Documentation: Consistent examples in code samples
Not suitable for:
- Production databases: Use real data from your database
- User-generated content: This is synthetic, not real user data
- Unique constraints: Generated data may have collisions (e.g., email uniqueness not guaranteed)
- Compliance requirements: Not suitable where real anonymized data is required
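Because uniqueness is not guaranteed, a test that depends on a unique field can first scan the index range it uses for duplicates. The sketch below assumes a hypothetical helper named findDuplicateKeys. It accepts any accessor function, so it is demonstrated with a stand-in generator (fakeUserAt) instead of a real UserArray.

```typescript
// Hypothetical helper (not part of the Pseudata API): scans indices
// [start, end) and returns every key that occurs more than once,
// mapped to the indices that produced it.
function findDuplicateKeys<T>(
  at: (index: number) => T,
  key: (item: T) => string,
  start: number,
  end: number
): Map<string, number[]> {
  const seen = new Map<string, number[]>();
  for (let i = start; i < end; i++) {
    const k = key(at(i));
    const hits = seen.get(k);
    if (hits) {
      hits.push(i);
    } else {
      seen.set(k, [i]);
    }
  }
  // Keep only the keys that were produced by two or more indices.
  const dupes = new Map<string, number[]>();
  seen.forEach((hits, k) => {
    if (hits.length > 1) dupes.set(k, hits);
  });
  return dupes;
}

// Stand-in generator for illustration: emails repeat every 5 indices.
const fakeUserAt = (i: number) => ({ email: `user${i % 5}@example.com` });
const dupes = findDuplicateKeys(fakeUserAt, (u) => u.email, 0, 7);
// dupes: "user0@example.com" -> [0, 5], "user1@example.com" -> [1, 6]
```

With a real array the call would look like findDuplicateKeys((i) => users.at(i), (u) => u.email, 0, 10_000). Unlike the array itself, the scan does hold one entry per distinct key in memory, so it is best limited to the range a test actually touches.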
Key advantage: Deterministic, infinite-scale data with zero memory overhead - perfect for testing and development where consistency matters more than randomness.
Deep Dive
Hierarchical Seeding
Pseudo-arrays use the PCG32 algorithm with hierarchical seeding:
Formula: Generator(worldSeed, typeSeq).Advance(index)
Why this works:
- PCG32 streams: Each typeSeq is a separate stream, ensuring no overlap
- Advance operation: Jumps directly to the correct state for any index
- Deterministic: Mathematical guarantee of same output for same inputs
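The two ingredients in the list above, stream selection and jump-ahead, can be sketched with the publicly documented PCG32 (XSH-RR) step and the standard logarithmic jump-ahead for linear congruential generators. How Pseudata maps worldSeed and typeSeq onto the PCG32 state and stream is not specified on this page, so the seeding in this Pcg32 class (seed as initial state, typeSeq as stream selector, following the PCG reference seeding routine) is an assumption, and its outputs are not claimed to match the library's.

```typescript
const MASK64 = (1n << 64n) - 1n;
const MULT = 6364136223846793005n; // default 64-bit multiplier used by PCG

class Pcg32 {
  private state: bigint;
  private readonly inc: bigint;

  // Assumed seeding, mirroring the PCG reference seeding routine.
  constructor(seed: bigint, stream: bigint) {
    this.inc = ((stream << 1n) | 1n) & MASK64; // increment must be odd
    this.state = 0n;
    this.next();
    this.state = (this.state + seed) & MASK64;
    this.next();
  }

  // One PCG32 XSH-RR step: advance the LCG, then permute the old state.
  next(): number {
    const old = this.state;
    this.state = (old * MULT + this.inc) & MASK64;
    const xorshifted = Number((((old >> 18n) ^ old) >> 27n) & 0xffffffffn);
    const rot = Number(old >> 59n); // top 5 bits select the rotation
    return ((xorshifted >>> rot) | (xorshifted << ((32 - rot) & 31))) >>> 0;
  }

  // Jump ahead by `delta` steps using O(log delta) multiplications.
  advance(delta: bigint): this {
    let curMult = MULT;
    let curPlus = this.inc;
    let accMult = 1n;
    let accPlus = 0n;
    while (delta > 0n) {
      if ((delta & 1n) === 1n) {
        accMult = (accMult * curMult) & MASK64;
        accPlus = (accPlus * curMult + curPlus) & MASK64;
      }
      curPlus = ((curMult + 1n) * curPlus) & MASK64;
      curMult = (curMult * curMult) & MASK64;
      delta >>= 1n;
    }
    this.state = (accMult * this.state + accPlus) & MASK64;
    return this;
  }
}

// Jumping straight to index 1000 on the stream for typeSeq 101:
const direct = new Pcg32(42n, 101n).advance(1000n).next();
```

With indices capped at 2^40-1, the loop in advance() runs at most 40 times, which is consistent with treating index access as constant time.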
Type Sequence Identifiers
Each concrete type has a unique type-sequence identifier:
| Type | TypeSeq | Purpose |
|---|---|---|
| User | 101 | OIDC-compliant user objects |
| Address | 110 | Locale-aware address objects |
| Custom | 1024+ | Reserved for future types |
This ensures:
- UserArray(42).at(100) and AddressArray(42).at(100) are independent
- No overlap between types even with same worldSeed and index
- Each type has its own “stream” of random numbers
Integration with Pseudo-IDs
Every element generated by pseudo-arrays has a Pseudo ID that encodes its position:
    const users = new UserArray(42n); // worldSeed = 42, typeSeq = 101
    const user = users.at(1000);      // index = 1000

    user.id();
    // "0000002a-0000-8000-a065-00000003e8"
    //  └world─┘          └─typeSeq+index─┘
    //     42               101      1000

The PseudoID encodes:
- World-seed: 42
- Type-sequence: 101 (User)
- Index: 1000
This means you can decode a PseudoID to find the exact array position:
    import { UserArray, decodeId } from '@pseudata/core';

    const components = decodeId(user.id());
    // components.worldSeed = 42
    // components.typeSeq = 101
    // components.index = 1000

    // Reconstruct the exact same user
    const users = new UserArray(components.worldSeed);
    const sameUser = users.at(components.index);
    // sameUser.id() === user.id() // true

Integration with Pseudo-Links
You can use pseudo links with the PseudoLink class (see reference) to create indices with relational properties:
    import { UserArray, PseudoLink } from '@pseudata/core';

    const link = new PseudoLink(17, 3);
    const users = new UserArray(42n);

    // Create user at specific coordinates
    const userIndex = link.spawn(
      1,    // island: 1
      1000, // neighborhood: 1000
      0     // connector: 0
    );

    const user = users.at(userIndex);

    // Find related users in same neighborhood
    for (let slot = 0; slot <= link.maxConnectors(); slot++) {
      const relatedIndex = link.resolve(userIndex, slot);
      const relatedUser = users.at(relatedIndex);
      // Related users share neighborhood and island bits
    }

This combines the infinite scale of pseudo-arrays with the relational capabilities of pseudo-links.
See Also
- Pseudo-IDs - Learn about the deterministic IDs that identify each element
- Pseudo-Links - Use coordinate-based indices for relational test data
- Models - See all available types and their properties
- Primitives - Understand the generator functions used to create element properties
© 2025 Pseudata Project. Open Source under Apache License 2.0.