Add a New Primitive
Primitives are fundamental data generation methods that form the building blocks of Pseudata. Each primitive must be implemented identically across all supported languages to maintain deterministic cross-language consistency.
Code Generation Flow
Primitives follow the interface generation path:
interface-emitter.js generates the interface signatures in each language from your TypeSpec definition. You then implement the actual logic in each SDK.
Step 1: Add to TypeSpec Interface
Add your method signature to typespec/src/primitives.tsp:

    interface Primitives {
      // ... existing methods ...

      /**
       * Generates a hexadecimal color code.
       * Format: "#RRGGBB" with uppercase letters.
       * @returns A color string like "#FF5733"
       */
      hexColor(): string;
    }

The TypeSpec interface defines:
- Method name: Must be camelCase
- Parameters: Strongly typed (string, int32, float32, etc.)
- Return type: Must match across all languages
- Documentation: Used in generated interfaces
Step 2: Run Code Generation
Generate the interface definitions:

    cd typespec
    npm run generate

This updates the interface files in each SDK:
- Go: sdks/go/primitives.go (interface)
- Java: sdks/java/src/main/java/dev/pseudata/Primitives.java (interface)
- Python: sdks/python/pseudata/primitives.py (Protocol)
- TypeScript: sdks/typescript/src/primitives.ts (interface)
Step 3: Implement in Go
Add implementation to primitives_impl.go:

    func (p *PrimitivesImpl) HexColor() string {
        rng := p.rng()
        r := rng.Intn(256)
        g := rng.Intn(256)
        b := rng.Intn(256)
        return fmt.Sprintf("#%02X%02X%02X", r, g, b)
    }

Critical Pattern: RNG State
Store the result of p.rng() in a variable if you need multiple random values. Each call to p.rng() creates a new generator at the same position:

    // WRONG - Returns same value twice!
    x := p.rng().Intn(10)
    y := p.rng().Intn(10)

    // CORRECT - Draws successive values from one generator
    rng := p.rng()
    x := rng.Intn(10)
    y := rng.Intn(10)

This is the most common bug when implementing primitives.
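The same pitfall applies in every SDK. The sketch below reproduces it in Python with a toy class. Everything in it is an assumption made for illustration: `ToyPrimitives`, the SHA-256 key derivation, and the use of `random.Random` are stand-ins, not Pseudata's actual generator.

```python
import hashlib
import random


class ToyPrimitives:
    """Toy stand-in for a primitives implementation (not Pseudata's real class)."""

    def __init__(self, world_seed: int, type_seq: int, index: int) -> None:
        self._key = f"{world_seed}:{type_seq}:{index}".encode()

    def _rng(self) -> random.Random:
        # Every call derives a brand-new generator from the same key,
        # so each one starts at the same position in its sequence.
        seed = int.from_bytes(hashlib.sha256(self._key).digest()[:8], "big")
        return random.Random(seed)

    def bad_pair(self) -> tuple[int, int]:
        # WRONG: two fresh generators, both at position 0.
        return self._rng().randrange(10), self._rng().randrange(10)

    def good_pair(self) -> tuple[int, int]:
        # CORRECT: one generator, advanced twice.
        rng = self._rng()
        return rng.randrange(10), rng.randrange(10)
```

In this sketch `bad_pair()` always returns two equal numbers, while `good_pair()` returns the first and second draws of a single stream. Those two draws can still coincide by chance, so a test should compare against fixtures rather than assert that they differ.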
Step 4: Implement in Java
Add implementation to sdks/java/src/main/java/dev/pseudata/PrimitivesImpl.java:

    @Override
    public String hexColor() {
        Generator rng = rng();
        int r = rng.intn(256);
        int g = rng.intn(256);
        int b = rng.intn(256);
        return String.format("#%02X%02X%02X", r, g, b);
    }

Java-Specific Considerations
Unsigned integers: Java has no unsigned int type. Use Integer.toUnsignedLong() before modulo operations:

    // For deterministic operations matching Go's uint32
    long value = Integer.toUnsignedLong(rng.uint32()) % max;

Resource access: Use property syntax:

    String[] names = RESOURCES.get(locale).givenMaleNames;

Step 5: Implement in Python
Add implementation to sdks/python/pseudata/primitives_impl.py:

    def hex_color(self) -> str:
        """Generates a hexadecimal color code."""
        rng = self._rng()
        r = rng.intn(256)
        g = rng.intn(256)
        b = rng.intn(256)
        return f"#{r:02X}{g:02X}{b:02X}"

Python-Specific Considerations
Method names: Use snake_case to follow Python conventions:
- TypeSpec hexColor() → Python hex_color()
Resource access: Use dictionary syntax:

    names = RESOURCES[locale]["givenMaleNames"]

Type hints: Always include return type hints for IDE support.
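Because the generated Python interface is a Protocol, return type hints also let a type checker confirm that an implementation matches it. The following is a minimal sketch under stated assumptions: the `Primitives` excerpt, `SketchPrimitives`, and `describe` are hypothetical, and `random.Random` stands in for Pseudata's generator, so the output will not match real fixture values.

```python
import random
from typing import Protocol


class Primitives(Protocol):
    """Hypothetical excerpt of the generated interface."""

    def hex_color(self) -> str: ...


class SketchPrimitives:
    """Illustrative implementation; random.Random is a stand-in generator."""

    def __init__(self, seed: int) -> None:
        self._seed = seed

    def _rng(self) -> random.Random:
        # A fresh generator per call, always starting from the same seed.
        return random.Random(self._seed)

    def hex_color(self) -> str:
        rng = self._rng()  # store once, then draw the three channels
        r, g, b = (rng.randrange(256) for _ in range(3))
        return f"#{r:02X}{g:02X}{b:02X}"


def describe(p: Primitives) -> str:
    # Accepts any object that structurally satisfies the Protocol.
    return f"color={p.hex_color()}"
```

`SketchPrimitives` never inherits from `Primitives`. Structural typing is enough for `describe(SketchPrimitives(42))` to type-check, provided the method name and return annotation line up.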
Step 6: Implement in TypeScript
Add implementation to sdks/typescript/src/primitives-impl.ts:

    hexColor(): string {
      const rng = this.rng();
      const r = rng.intn(256);
      const g = rng.intn(256);
      const b = rng.intn(256);
      const toHex = (n: number) => n.toString(16).padStart(2, '0').toUpperCase();
      return `#${toHex(r)}${toHex(g)}${toHex(b)}`;
    }

TypeScript-Specific Considerations
Large integers: Use BigInt for integers that need more than 53 bits, since number cannot represent them exactly:

    // For worldSeed operations
    const seed = BigInt(worldSeed);

Resource access: Use property syntax:

    const names = RESOURCES[locale].givenMaleNames;

Step 7: Verify Consistency
Test that all implementations produce identical output for the same seed:

    # Go
    cd /workspaces/pseudata/pseudata-poc
    go test -run TestPrimitives

    # Java
    cd sdks/java
    mvn test

    # Python
    cd sdks/python
    pytest

    # TypeScript
    cd sdks/typescript
    npm test

All tests must pass with matching outputs across languages.
Fixture-Based Testing
Pseudata uses fixture-based testing to ensure cross-language consistency. See the Testing Guide for a comprehensive explanation of why fixtures are critical and how they work.
Primitive Fixture Structure
Primitive fixtures test individual method outputs for specific seeds:

    {
      "testCases": [
        {
          "name": "hexColor_basic",
          "worldSeed": 42,
          "typeSeq": 0,
          "index": 0,
          "expected": "#A3D5F1"
        },
        {
          "name": "hexColor_different_seed",
          "worldSeed": 100,
          "typeSeq": 0,
          "index": 0,
          "expected": "#7F2A9B"
        }
      ]
    }

When to Add Primitive Fixtures
Add fixture tests for primitives that:
- Have complex logic (multiple RNG calls, conditionals)
- Use resources (locale-specific data)
- Involve formatting or string manipulation
- Are critical to data generation
Simple primitives like nextInt() may not need dedicated fixtures if they’re already covered by model/array fixtures.
Creating Primitive Fixtures
- Add test cases to the Go test file:

    func TestHexColorFixtures(t *testing.T) {
        cases := []struct {
            worldSeed uint64
            typeSeq   uint64
            index     int
            expected  string
        }{
            {42, 0, 0, "#A3D5F1"},
            {100, 0, 0, "#7F2A9B"},
        }

        for _, tc := range cases {
            p := NewPrimitivesImpl(tc.worldSeed, tc.typeSeq, tc.index)
            got := p.HexColor()
            assert.Equal(t, tc.expected, got)
        }
    }

- Run go test -update to generate fixture JSON
- Other languages automatically load and test against these fixtures
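To illustrate what loading and testing against a fixture can look like in a non-Go SDK, here is a minimal sketch. It assumes the JSON shape shown in the fixture structure above. `check_fixtures` and the `generate` callback are hypothetical names, not part of Pseudata's test harness, and the fixture is inlined as a string instead of read from a file.

```python
import json
from typing import Callable

# Inlined copy of the example fixture; a real test would read it from disk.
FIXTURE_JSON = """
{
  "testCases": [
    {"name": "hexColor_basic", "worldSeed": 42, "typeSeq": 0,
     "index": 0, "expected": "#A3D5F1"},
    {"name": "hexColor_different_seed", "worldSeed": 100, "typeSeq": 0,
     "index": 0, "expected": "#7F2A9B"}
  ]
}
"""


def check_fixtures(raw: str, generate: Callable[[int, int, int], str]) -> list[str]:
    """Return the names of test cases whose output differs from the fixture."""
    failures = []
    for case in json.loads(raw)["testCases"]:
        got = generate(case["worldSeed"], case["typeSeq"], case["index"])
        if got != case["expected"]:
            failures.append(case["name"])
    return failures
```

A test would pass a function that builds the primitives object from `(worldSeed, typeSeq, index)` and calls the method under test, then assert that the returned list is empty. Collecting names instead of failing on the first mismatch makes it easier to see whether one seed or every seed diverges.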
Common Pitfalls
1. RNG State Bug
The most common mistake:

    // WRONG - Each rng() call resets to same position
    func (p *PrimitivesImpl) BadExample() string {
        x := p.rng().Intn(10)
        y := p.rng().Intn(10) // Same value as x!
        return fmt.Sprintf("%d-%d", x, y)
    }

    // CORRECT - Store rng once
    func (p *PrimitivesImpl) GoodExample() string {
        rng := p.rng()
        x := rng.Intn(10)
        y := rng.Intn(10) // Different value
        return fmt.Sprintf("%d-%d", x, y)
    }

2. Resource Access Patterns
Different languages use different syntax:

    # Python - dictionary access
    names = RESOURCES[locale]["givenMaleNames"]

    // Java - property access
    String[] names = RESOURCES.get(locale).givenMaleNames;

    // TypeScript - property access
    const names = RESOURCES[locale].givenMaleNames;

3. Type Mappings
Keep types consistent across languages:
| TypeSpec | Go | Java | Python | TypeScript |
|---|---|---|---|---|
| int32 | int64 | int | int | number |
| int64 | int64 | long | int | number |
| float32 | float32 | float | float | number |
| string | string | String | str | string |
| boolean | bool | boolean | bool | boolean |
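Several of these mappings land on a wider native type. Python's int is unbounded and its float is a 64-bit double, so arithmetic that wraps or rounds in a fixed-width language will not do so automatically in Python. The helpers below are a sketch of one way to narrow values explicitly. The function names are hypothetical, and whether a given primitive needs this depends on its logic.

```python
import struct


def to_uint32(n: int) -> int:
    """Keep only the low 32 bits, as an unsigned value."""
    return n & 0xFFFFFFFF


def to_int32(n: int) -> int:
    """Wrap an unbounded int to signed 32-bit two's complement."""
    n &= 0xFFFFFFFF
    return n - 0x1_0000_0000 if n >= 0x8000_0000 else n


def to_float32(x: float) -> float:
    """Round a 64-bit float to the nearest representable float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]
```

For instance, `to_float32(0.1)` differs slightly from `0.1`, which is the kind of discrepancy that surfaces when one SDK computes in single precision and another in double.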
4. String Case Conventions
Each language has its own naming convention:
- Go: PascalCase for exported methods
- Java: camelCase for methods
- Python: snake_case for methods
- TypeScript: camelCase for methods
The code generators handle this automatically, but be aware when reading generated code.
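As a rough illustration of the renaming involved, the sketch below converts a camelCase TypeSpec name into the Python and Go forms. These helpers are assumptions written for this guide, not the emitter's actual code, and they deliberately ignore edge cases such as acronyms.

```python
import re


def camel_to_snake(name: str) -> str:
    """hexColor -> hex_color (Python method names)."""
    # Insert an underscore at each lowercase-or-digit to uppercase boundary.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def camel_to_pascal(name: str) -> str:
    """hexColor -> HexColor (exported Go method names)."""
    return name[:1].upper() + name[1:]
```

Java and TypeScript keep the TypeSpec name unchanged, so only two conversions are needed.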
5. UTF-8 Handling
Always use proper string/rune handling for multi-byte characters:

    // WRONG - Slices bytes, so a multi-byte first character is cut in half
    first := name[0:1]

    // CORRECT - Converts to runes first, preserving UTF-8
    runes := []rune(name)
    first := string(runes[0:1])

Usage in Models
After implementing in all supported languages, use your primitive in TypeSpec models:
Direct Generator

    model Product {
      @generator("id")
      id: string;

      @generator("hexColor")
      color: string;
    }

Template Composition

    model Product {
      @generator("id")
      id: string;

      @template("Item #{digit(4)} - {hexColor}")
      product_code: string;
    }

Conditional Logic
For primitives with parameters, pass them in the decorator:

    model User {
      @generator("id")
      id: string;

      @generator("probability", 0.75)
      email_verified: boolean;
    }

Advanced: Primitives with Resources
Many primitives select from locale-specific resources:

    func (p *PrimitivesImpl) CityName() string {
        data := p.resources()
        cities := data.Cities
        return p.Element(cities)
    }

The Element() primitive handles the random selection deterministically.
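One plausible shape for such a selection helper is a single bounded draw used as an index. The sketch below assumes this approach. `element`, the `CITIES` sample data, and the use of `random.Random` are illustrative stand-ins, not Pseudata's implementation.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")

CITIES = ["Lisbon", "Osaka", "Nairobi", "Quito"]  # made-up sample data


def element(rng: random.Random, items: Sequence[T]) -> T:
    """Pick one item using a single bounded draw from the generator."""
    if not items:
        raise ValueError("cannot select from an empty sequence")
    return items[rng.randrange(len(items))]
```

With this design, the result is identical across languages only if every SDK derives the index the same way and stores the resource list in the same order, so resource ordering is part of the contract.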
Advanced: Primitives with Formatting
Some primitives need to format values according to locale rules:

    func (p *PrimitivesImpl) PhoneNumber() string {
        data := p.resources()
        format := data.PhoneFormat
        // Generate digits and format according to locale
        rng := p.rng()
        // ... generate replacements
        return p.formatString(format, replacements)
    }

Ensure formatting logic is identical across all languages to maintain determinism.
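The guide does not show the pattern syntax that formatString expects, so the following sketch assumes a simple hypothetical convention in which each `#` is replaced by one random digit. It is meant only to show why the order of draws matters. `format_digits` is an invented name, and `random.Random` stands in for the real generator.

```python
import random


def format_digits(pattern: str, rng: random.Random) -> str:
    """Replace each '#' with one digit drawn from rng, scanning left to right."""
    out = []
    for ch in pattern:
        # Only placeholders consume a draw; literal characters pass through.
        out.append(str(rng.randrange(10)) if ch == "#" else ch)
    return "".join(out)
```

If one SDK filled the placeholders right to left, or drew a digit for literal characters as well, the same seed would produce a different string, which is the kind of divergence fixtures are meant to catch.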
© 2025 Pseudata Project. Open Source under Apache License 2.0.