Benchmarks

Pseudata’s pseudo-arrays provide effectively constant-time access to any element, regardless of its position in the sequence. Traditional arrays must allocate memory for every element, and sequential generators must iterate through all previous elements. Pseudo-arrays instead compute each element independently, in what is $O(1)$ time for practical purposes.

This page demonstrates real-world performance characteristics across different array types and access patterns.

Access Times

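The following benchmarks measure the time to access individual elements at various positions, from index $0$ to index $2^{40}-1$ (roughly 1.1 trillion). $T$ is the access time at index $i$, and $\Delta T$ is the difference from the access time at index $0$. All tests were performed on a standard development laptop.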

UserArray

UserArray.At(i) in Go

$i$	$T$	$\Delta T$
$0$	$5.551\mu s$
$1$	$5.68\mu s$	$0.129\mu s$
$2^{8}$	$5.75\mu s$	$0.199\mu s$
$2^{16}$	$6.292\mu s$	$0.741\mu s$
$2^{32}$	$7.073\mu s$	$1.522\mu s$
$2^{40}-1$	$7.973\mu s$	$2.422\mu s$

AddressArray

AddressArray.At(i) in Go

$i$	$T$	$\Delta T$
$0$	$2.793\mu s$
$1$	$2.866\mu s$	$0.073\mu s$
$2^{8}$	$2.893\mu s$	$0.100\mu s$
$2^{16}$	$2.988\mu s$	$0.195\mu s$
$2^{32}$	$3.516\mu s$	$0.723\mu s$
$2^{40}-1$	$3.941\mu s$	$1.148\mu s$

The tables above are representative examples demonstrating access time characteristics—Pseudata includes pseudo-arrays for many more entity types beyond User and Address.

Analysis

As the tables show, access time stays in the same range whether retrieving the first element or the one at index $2^{40}-1$: UserArray goes from $5.551\mu s$ to $7.973\mu s$, and AddressArray from $2.793\mu s$ to $3.941\mu s$. Index position therefore has only a small impact on performance. In practice this is effectively $O(1)$ behavior, and any element of a trillion-element dataset is reachable in a few microseconds.

Understanding the ΔT

The $\Delta T$ column is the access time at index $i$ minus the access time at index $0$. It reflects the seeking time: the overhead of advancing the PCG32 generator to the target index. This seeking operation is technically $O(\log n)$, because the generator’s advance() method uses binary exponentiation to jump to any position.
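
The jump-ahead idea can be sketched for a plain 64-bit linear congruential generator, which is the state-transition function underlying PCG32. This is a minimal illustration and not Pseudata's own code: the names `step` and `advance` are chosen here, the constants are the default multiplier and increment from the PCG reference implementation, and PCG32's output permutation is left out.

```python
MASK = (1 << 64) - 1                   # keep all arithmetic in 64 bits
MULT = 6364136223846793005             # PCG reference default multiplier
INC = 1442695040888963407              # PCG reference default increment (odd)


def step(state: int) -> int:
    """Advance the LCG by exactly one position."""
    return (state * MULT + INC) & MASK


def advance(state: int, delta: int) -> int:
    """Advance the LCG by `delta` positions using O(log delta) multiplications."""
    acc_mult, acc_plus = 1, 0          # identity transform: s -> 1*s + 0
    cur_mult, cur_plus = MULT, INC     # transform for 2**k steps, starting at k = 0
    while delta > 0:
        if delta & 1:                  # this bit of delta is set: compose it in
            acc_mult = (acc_mult * cur_mult) & MASK
            acc_plus = (acc_plus * cur_mult + cur_plus) & MASK
        # Square the current transform: 2**k steps becomes 2**(k+1) steps
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK
        cur_mult = (cur_mult * cur_mult) & MASK
        delta >>= 1
    return (acc_mult * state + acc_plus) & MASK


# Jumping 1,000 positions matches stepping 1,000 times one by one.
stepped = 42
for _ in range(1000):
    stepped = step(stepped)
assert advance(42, 1000) == stepped
```

The loop runs once per bit of `delta`, so a jump of $2^{40}-1$ positions takes 40 iterations rather than a trillion single steps.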

Total access time = Seeking time ( $\Delta T$ ) + Generation time

Seeking time ( $\Delta T$ ): $O(\log n)$, growing for UserArray from about $0.1\mu s$ at index $1$ to about $2.4\mu s$ at index $2^{40}-1$
Generation time: $O(1)$ and constant regardless of index; this is the bulk of the access time

The $O(\log n)$ seeking cost grows so slowly that practical access time is effectively $O(1)$. Even at index $2^{40}-1$ (over 1 trillion), seeking adds about $2.4\mu s$ for UserArray and $1.1\mu s$ for AddressArray, which is negligible for real-world use cases.
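
The split between baseline and seeking time can be read directly off the UserArray table. The short sketch below recomputes the seeking overhead and its share of the total access time. The function and variable names are illustrative, and the numbers are copied from the table above rather than newly measured.

```python
# (index, T in microseconds) for UserArray.At(i), copied from the table above
USER_TIMES = [
    (0, 5.551), (1, 5.68), (2**8, 5.75),
    (2**16, 6.292), (2**32, 7.073), (2**40 - 1, 7.973),
]


def seek_overhead(times):
    """Return (index, delta_T, seek share of T) relative to the index-0 row."""
    baseline = times[0][1]
    rows = []
    for index, total in times[1:]:
        delta = total - baseline               # time attributed to seeking
        rows.append((index, round(delta, 3), delta / total))
    return rows


for index, delta, share in seek_overhead(USER_TIMES):
    print(f"i={index:>13}  delta_T={delta:.3f} us  seek share={share:.1%}")
```

For the largest index this attributes roughly 30% of the access time to seeking, with the remainder spent on generation.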

To put this in perspective: Pseudata’s $\Delta T$ seeking overhead is measured in microseconds (millionths of a second), whereas a traditional sequential generator would take seconds or tens of seconds to iterate through millions of records, a difference of roughly six orders of magnitude.
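
A back-of-the-envelope estimate makes the comparison concrete. The sketch below assumes, purely for illustration, that a sequential generator spends the same $5.551\mu s$ per element that UserArray.At(0) took in the table above. Real sequential libraries may be faster or slower per element, so treat the results as orders of magnitude only.

```python
PER_ELEMENT_US = 5.551   # assumption: per-element cost equal to UserArray.At(0)


def sequential_seconds(index: int, per_element_us: float = PER_ELEMENT_US) -> float:
    """Estimated time to reach `index` by generating every element up to it."""
    return (index + 1) * per_element_us / 1_000_000


for index in (999_999, 2**32, 2**40 - 1):
    seconds = sequential_seconds(index)
    print(f"index {index}: ~{seconds:,.1f} s (~{seconds / 86_400:,.2f} days)")
```

Under that assumption, a million elements cost a few seconds of iteration, while reaching the last index in the table would take tens of days, against a measured seeking overhead of $2.422\mu s$.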

Performance Characteristics

The slight variations in access time are due to:

Data complexity: UserArray generates more fields than AddressArray
Seeking overhead: Logarithmic growth, but imperceptible in practice (nanoseconds to microseconds)
CPU cache effects: Minor variations from system state
String operations: Name generation involves more allocations than simple numeric fields

Key Takeaways

No memory overhead: Accessing element 1 trillion requires the same memory as accessing element 1
No sequential dependency: Jump directly to any index without computing intermediate values
Predictable performance: Access time is determined by entity complexity, not position
Scalable testing: Load test with billions of records without infrastructure costs

Methodology

All benchmarks use Go’s standard testing.B benchmark framework with the following settings:

Iterations: Automatically determined by Go benchmark runner for statistical significance
Warmup: Initial iterations excluded from timing
System: Standard development laptop (no dedicated benchmark hardware)
Isolation: Each benchmark runs independently

Comparison

To understand the practical impact of $O(1)$ access, consider how Pseudata compares to traditional data generation approaches.

Sequential Generators

Traditional faker libraries must iterate through all previous elements:

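# To reach element 1,000,000 with a sequential generator (here: the Faker library)
from faker import Faker

Faker.seed(42)
fake = Faker()
for _ in range(1_000_000):  # O(n): every earlier element is generated and discarded
    user = fake.profile()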

With Pseudata:

# Direct access - O(1)
users = UserArray(42)
user = users.at(1_000_000)  # Instant O(1) access

Memory-Based Arrays

Pre-allocated arrays require memory proportional to size:

# Traditional array for 1 million users
users = [generate_user(i) for i in range(1_000_000)] # High memory usage

With Pseudata:

# Pseudo-array for 1 trillion users
users = UserArray(42)
user = users.at(1_000_000_000_000)  # No memory overhead
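
To illustrate why a pseudo-array needs no storage proportional to its length, here is a toy sketch. It is not Pseudata's implementation, which seeks a PCG32 generator as described earlier on this page. The class `ToyUserArray`, its fields, and the hash-based derivation are hypothetical stand-ins, chosen so that the example runs with only the standard library.

```python
import hashlib


class ToyUserArray:
    """Hypothetical pseudo-array: elements are derived from (seed, index) on demand."""

    __slots__ = ("seed",)               # the only stored state is the seed

    FIRST_NAMES = ("Ada", "Grace", "Alan", "Edsger", "Barbara", "Donald")

    def __init__(self, seed: int) -> None:
        self.seed = seed

    def at(self, index: int) -> dict:
        """Compute element `index` directly, without touching earlier elements."""
        digest = hashlib.sha256(f"{self.seed}:{index}".encode()).digest()
        value = int.from_bytes(digest[:8], "big")
        return {
            "id": index,
            "name": self.FIRST_NAMES[value % len(self.FIRST_NAMES)],
            "age": 18 + (value >> 8) % 60,
        }


users = ToyUserArray(42)
first = users.at(0)
far = users.at(1_000_000_000_000)       # no list of a trillion users is ever built
assert far == ToyUserArray(42).at(1_000_000_000_000)   # same seed, same element
```

Because every element is a pure function of the seed and the index, two processes that share a seed see the same data without exchanging or storing it.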