Skip to content

Add a New Primitive

Primitives are fundamental data generation methods that form the building blocks of Pseudata. Each primitive must be implemented identically across all supported languages to maintain deterministic cross-language consistency.

Primitives follow the interface generation path:

Code Generation Flow

interface-emitter.js generates the interface signatures in each language from your TypeSpec definition. You then implement the actual logic in each SDK.

Add your method signature to typespec/src/primitives.tsp:

interface Primitives {
// ... existing methods ...
/**
* Generates a hexadecimal color code.
* Format: "#RRGGBB" with uppercase letters.
* @returns A color string like "#FF5733"
*/
hexColor(): string;
}

The TypeSpec interface defines:

  • Method name: Must be camelCase
  • Parameters: Strongly typed (string, int32, float32, etc.)
  • Return type: Must match across all languages
  • Documentation: Used in generated interfaces

Generate the interface definitions:

Terminal window
cd typespec
npm run generate

This updates the interface files in each SDK:

  • Go: sdks/go/primitives.go (interface)
  • Java: sdks/java/src/main/java/dev/pseudata/Primitives.java (interface)
  • Python: sdks/python/pseudata/primitives.py (Protocol)
  • TypeScript: sdks/typescript/src/primitives.ts (interface)

Add implementation to primitives_impl.go:

func (p *PrimitivesImpl) HexColor() string {
rng := p.rng()
r := rng.Intn(256)
g := rng.Intn(256)
b := rng.Intn(256)
return fmt.Sprintf("#%02X%02X%02X", r, g, b)
}

Store rng() in a variable if you need multiple random values. Each call to p.rng() creates a new generator at the same position:

// WRONG - Returns same value twice!
x := p.rng().Intn(10)
y := p.rng().Intn(10)
// CORRECT - Returns different values
rng := p.rng()
x := rng.Intn(10)
y := rng.Intn(10)

This is the most common bug when implementing primitives.

Add implementation to sdks/java/src/main/java/dev/pseudata/PrimitivesImpl.java:

@Override
public String hexColor() {
Generator rng = rng();
int r = rng.intn(256);
int g = rng.intn(256);
int b = rng.intn(256);
return String.format("#%02X%02X%02X", r, g, b);
}

Unsigned integers: Java doesn’t have unsigned types. Use Integer.toUnsignedLong() for modulo operations:

// For deterministic operations matching Go's uint32
long value = Integer.toUnsignedLong(rng.uint32()) % max;

Resource access: Use property syntax:

String[] names = RESOURCES.get(locale).givenMaleNames;

Add implementation to sdks/python/pseudata/primitives_impl.py:

def hex_color(self) -> str:
"""Generates a hexadecimal color code."""
rng = self._rng()
r = rng.intn(256)
g = rng.intn(256)
b = rng.intn(256)
return f"#{r:02X}{g:02X}{b:02X}"

Method names: Use snake_case to follow Python conventions:

  • TypeSpec hexColor() → Python hex_color()

Resource access: Use dictionary syntax:

names = RESOURCES[locale]["givenMaleNames"]

Type hints: Always include return type hints for IDE support.

Add implementation to sdks/typescript/src/primitives-impl.ts:

hexColor(): string {
const rng = this.rng();
const r = rng.intn(256);
const g = rng.intn(256);
const b = rng.intn(256);
const toHex = (n: number) => n.toString(16).padStart(2, '0').toUpperCase();
return `#${toHex(r)}${toHex(g)}${toHex(b)}`;
}

Large integers: Use BigInt for values > 53 bits:

// For worldSeed operations
const seed = BigInt(worldSeed);

Resource access: Use property syntax:

const names = RESOURCES[locale].givenMaleNames;

Test all implementations produce identical output for the same seed:

Terminal window
# Go
cd /workspaces/pseudata/pseudata-poc
go test -run TestPrimitives
# Java
cd sdks/java
mvn test
# Python
cd sdks/python
pytest
# TypeScript
cd sdks/typescript
npm test

All tests must pass with matching outputs across languages.

Pseudata uses fixture-based testing to ensure cross-language consistency. See the Testing Guide for a comprehensive explanation of why fixtures are critical and how they work.

Primitive fixtures test individual method outputs for specific seeds:

{
"testCases": [
{
"name": "hexColor_basic",
"worldSeed": 42,
"typeSeq": 0,
"index": 0,
"expected": "#A3D5F1"
},
{
"name": "hexColor_different_seed",
"worldSeed": 100,
"typeSeq": 0,
"index": 0,
"expected": "#7F2A9B"
}
]
}

Add fixture tests for primitives that:

  • Have complex logic (multiple RNG calls, conditionals)
  • Use resources (locale-specific data)
  • Involve formatting or string manipulation
  • Are critical to data generation

Simple primitives like nextInt() may not need dedicated fixtures if they’re already covered by model/array fixtures.

  1. Add test cases to Go test file:
func TestHexColorFixtures(t *testing.T) {
cases := []struct {
worldSeed uint64
typeSeq uint64
index int
expected string
}{
{42, 0, 0, "#A3D5F1"},
{100, 0, 0, "#7F2A9B"},
}
for _, tc := range cases {
p := NewPrimitivesImpl(tc.worldSeed, tc.typeSeq, tc.index)
got := p.HexColor()
assert.Equal(t, tc.expected, got)
}
}
  1. Run go test -update to generate fixture JSON

  2. Other languages automatically load and test against these fixtures

The most common mistake:

// WRONG - Each rng() call resets to same position
func (p *PrimitivesImpl) BadExample() string {
x := p.rng().Intn(10)
y := p.rng().Intn(10) // Same value as x!
return fmt.Sprintf("%d-%d", x, y)
}
// CORRECT - Store rng once
func (p *PrimitivesImpl) GoodExample() string {
rng := p.rng()
x := rng.Intn(10)
y := rng.Intn(10) // Different value
return fmt.Sprintf("%d-%d", x, y)
}

Different languages use different syntax:

# Python - dictionary access
names = RESOURCES[locale]["givenMaleNames"]
// Java - property access
String[] names = RESOURCES.get(locale).givenMaleNames;
// TypeScript - property access
const names = RESOURCES[locale].givenMaleNames;

Keep types consistent across languages:

TypeSpecGoJavaPythonTypeScript
int32int64intintnumber
int64int64longintnumber
float32float32floatfloatnumber
stringstringStringstrstring
booleanboolbooleanboolboolean

Each language has its own naming convention:

  • Go: PascalCase for exported methods
  • Java: camelCase for methods
  • Python: snake_case for methods
  • TypeScript: camelCase for methods

The code generators handle this automatically, but be aware when reading generated code.

Always use proper string/rune handling for multi-byte characters:

// WRONG - Corrupts UTF-8
first := name[0:1]
// CORRECT - Preserves UTF-8
runes := []rune(name)
first := string(runes[0:1])

After implementing in all supported languages, use your primitive in TypeSpec models:

model Product {
@generator("id")
id: string;
@generator("hexColor")
color: string;
}
model Product {
@generator("id")
id: string;
@template("Item #{digit(4)} - {hexColor}")
product_code: string;
}

For primitives with parameters, pass them in the decorator:

model User {
@generator("id")
id: string;
@generator("probability", 0.75)
email_verified: boolean;
}

Many primitives select from locale-specific resources:

func (p *PrimitivesImpl) CityName() string {
data := p.resources()
cities := data.Cities
return p.Element(cities)
}

The Element() primitive handles the random selection deterministically.

Some primitives need to format values according to locale rules:

func (p *PrimitivesImpl) PhoneNumber() string {
data := p.resources()
format := data.PhoneFormat
// Generate digits and format according to locale
rng := p.rng()
// ... generate replacements
return p.formatString(format, replacements)
}

Ensure formatting logic is identical across all languages to maintain determinism.