Your Mock Data Lies
Why faker.js and Faker Don’t Agree
Modern software development is inherently polyglot. A typical stack might have a Go microservice handling authentication, a Python service running analytics, a TypeScript frontend, a Swift iOS app, and a Dart Android app.
When developers write tests or build demos for these separate pieces, they reach for the standard tool in each ecosystem: faker.js for the frontend, gofakeit for the backend, Faker (Python) for the data scripts.
These are all fantastic libraries. They all allow deterministic seeding.
But they are all different.
Seeding faker.js with 42 and Faker (Python) with 42 results in two completely different realities. The frontend expects “Alice,” but the backend returns “Bob.”
```python
# Python backend (Faker)
from faker import Faker

fake = Faker()
Faker.seed(42)
print(fake.name())  # → "Brett Davis"
```

```typescript
// TypeScript frontend (faker.js)
import { faker } from "@faker-js/faker";

faker.seed(42);
console.log(faker.person.fullName()); // → "Miss Dora Kiehn"
```

Same seed. Different data. Integration chaos.
Why? Because each library makes its own choices about the random number generator, how a seed initializes it, how raw numbers map to values, and which word lists it draws from. Same seed, different algorithm, different data.
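The effect is easy to reproduce in isolation. The sketch below feeds the same seed to two unrelated generators: Python's built-in `random.Random` and a small hand-written linear congruential generator. Neither is claimed to be what faker.js or Faker uses internally, and the `NAMES` list is a made-up stand-in for a library's word list. The point is only that a seed means nothing outside the algorithm that consumes it.

```python
import random

# Hypothetical word list standing in for a library's name data.
NAMES = ["Alice", "Bob", "Carol", "Dave"]


def lcg_stream(seed: int, count: int) -> list[int]:
    """32-bit linear congruential generator (Numerical Recipes constants)."""
    state = seed & 0xFFFFFFFF
    values = []
    for _ in range(count):
        state = (1664525 * state + 1013904223) & 0xFFFFFFFF
        values.append(state)
    return values


def builtin_stream(seed: int, count: int) -> list[int]:
    """32-bit values from Python's built-in generator, seeded the same way."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


def pick_names(stream: list[int]) -> list[str]:
    """Map raw numbers to names using the high bits of each value."""
    return [NAMES[(value >> 16) % len(NAMES)] for value in stream]


if __name__ == "__main__":
    # Each generator is deterministic on its own, but the two disagree.
    print(pick_names(lcg_stream(42, 5)))
    print(pick_names(builtin_stream(42, 5)))
```

Each function returns the same list every time it is called with the same seed, yet the two lists have nothing in common with each other.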
This fragmentation forces teams to either:
- Hardcode static JSON files (which are heavy and hard to maintain).
- Write intricate “bridge” scripts just to sync mock data across services.
- Accept that frontend and backend tests run in parallel universes.
What if mock data generation used a standardized algorithm instead of library-specific implementations?
Quick Comparison
| Feature | faker.js / Faker | Pseudata |
|---|---|---|
| Cross-Language Consistency | ❌ Different data per language | ✅ Identical data across all languages |
| String Seeds | 🟡 Varies by implementation | ✅ Consistent hash-to-seed conversion |
| Random Number Generator | 🟡 Library-specific implementations | ✅ Standardized algorithm (PCG32) |
| Multi-Locale Support | ✅ Yes | ✅ Yes |
| Best For | 🎯 Single-language projects | 🎯 Polyglot systems, integration testing |
The bottom line: Traditional faker libraries are excellent for single-language development. But the moment your system spans multiple languages, you need a standardized algorithm, not just another library.
Introducing Pseudata
Pseudata is an open-source library (Apache 2.0) designed to solve the Polyglot Data Problem.
The goal is simple but ambitious: To create an algorithm specification for mock data generation that produces identical results in every programming language.
The Vision: Seed 42 + Index 1000 should result in the exact same User object, down to the pixel in the avatar, whether it is accessed in Python, Go, Java, or TypeScript/JavaScript.
How It Works in Practice
```go
package main

import (
	"fmt"

	"github.com/pseudata/pseudata"
)

func main() {
	users := pseudata.NewUserArray(42)
	user := users.At(1000)
	fmt.Println(user.Name)  // → "John Smith"
	fmt.Println(user.Email) // → "john.smith@example.com"
}
```

```java
import dev.pseudata.UserArray;

UserArray users = new UserArray(42);
User user = users.at(1000);
System.out.println(user.getName());  // → "John Smith"
System.out.println(user.getEmail()); // → "john.smith@example.com"
```

```python
from pseudata import UserArray

users = UserArray(42)
user = users.at(1000)
print(user.name)   # → "John Smith"
print(user.email)  # → "john.smith@example.com"
```

```typescript
import { UserArray } from "pseudata";

const users = new UserArray(42);
const user = users.at(1000);
console.log(user.name);  // → "John Smith"
console.log(user.email); // → "john.smith@example.com"
```

Same seed. Same index. Same data. Every language.
The Architecture: Virtual & Stateless
To achieve this universal consistency, Pseudata ignores the traditional “list of random words” approach and uses a strict mathematical architecture based on the PCG32 algorithm.
1. The “Virtual Array”
Instead of generating a list of 1,000 objects and storing them in memory, Pseudata implements Virtual Arrays.
Data is calculated on-the-fly using a hierarchical seeding strategy:
ObjectSeed = WorldSeed + Index
This allows O(1) random access without holding the array in memory. A developer can request User[5,000,000] instantly, without generating any of the items that precede it. Most current libraries struggle to do this efficiently, because they draw values sequentially from a single generator.
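A minimal sketch of the idea, assuming only what is stated above: a PCG32 generator and the rule ObjectSeed = WorldSeed + Index. Everything else here is hypothetical and for illustration only, including the `VirtualUserArray` class, the tiny word lists, the order in which fields are drawn, and the use of stream 0. It is not Pseudata's actual implementation.

```python
from dataclasses import dataclass

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
PCG_MULTIPLIER = 6364136223846793005  # multiplier from the PCG32 reference code

# Hypothetical word lists; a real library would ship much larger ones.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]
LAST_NAMES = ["Lovelace", "Hopper", "Torvalds", "Hamilton"]


class PCG32:
    """PCG32 (XSH-RR) with 64-bit state and 32-bit output."""

    def __init__(self, seed: int, stream: int = 0) -> None:
        self.state = 0
        self.inc = ((stream << 1) | 1) & MASK64  # increment must be odd
        self.next_u32()
        self.state = (self.state + seed) & MASK64
        self.next_u32()

    def next_u32(self) -> int:
        old = self.state
        self.state = (old * PCG_MULTIPLIER + self.inc) & MASK64
        xorshifted = (((old >> 18) ^ old) >> 27) & MASK32
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & MASK32


@dataclass(frozen=True)
class User:
    name: str
    email: str


class VirtualUserArray:
    """Stores only the world seed; every element is computed on demand."""

    def __init__(self, world_seed: int) -> None:
        self.world_seed = world_seed

    def at(self, index: int) -> User:
        # ObjectSeed = WorldSeed + Index, wrapped to 64 bits.
        rng = PCG32((self.world_seed + index) & MASK64)
        first = FIRST_NAMES[rng.next_u32() % len(FIRST_NAMES)]
        last = LAST_NAMES[rng.next_u32() % len(LAST_NAMES)]
        return User(f"{first} {last}", f"{first}.{last}@example.com".lower())


if __name__ == "__main__":
    users = VirtualUserArray(42)
    print(users.at(5_000_000))  # computed directly, no earlier items generated
```

Because `at` builds a fresh generator from the object seed on every call, the cost of a lookup does not depend on the index, and asking for the same index twice always returns the same user.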
For convenience, Pseudata includes a SeedFrom utility function that converts strings into deterministic numeric seeds, allowing developers to use memorable identifiers like "demo-2025" instead of raw numbers.
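The source does not say which hash SeedFrom uses, so the sketch below substitutes 64-bit FNV-1a purely as an illustration. The function name `seed_from` and the UTF-8 encoding step are assumptions. What matters for cross-language use is that the hash is fully specified and operates on bytes, so every implementation arrives at the same number.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = (1 << 64) - 1


def seed_from(text: str) -> int:
    """Hash a string to a 64-bit seed with FNV-1a (an illustrative choice)."""
    value = FNV64_OFFSET_BASIS
    for byte in text.encode("utf-8"):
        value ^= byte
        value = (value * FNV64_PRIME) & MASK64
    return value


if __name__ == "__main__":
    print(seed_from("demo-2025"))
```

Python's built-in `hash()` would not work for this purpose: string hashes are randomized per process by default, so the same identifier would map to a different seed on every run.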
2. Consistency by Default
Pseudata enforces strict schema consistency across languages. A User object generated in Java will have the exact same field names and value formats as one generated in Python. This eliminates the “works on my machine” class of bugs caused by subtle data structure mismatches between services.
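One way a team might check this kind of consistency between services is to compare fingerprints of a canonical serialization. The sketch below is an assumption about how such a check could look, not part of Pseudata. The helper names and the record fields are hypothetical, with the field values borrowed from the example earlier in this post.

```python
import hashlib
import json


def canonical_json(record: dict) -> str:
    """Serialize with sorted keys and fixed separators so the output is stable."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def fingerprint(record: dict) -> str:
    """SHA-256 of the canonical form; matching fingerprints indicate matching records."""
    return hashlib.sha256(canonical_json(record).encode("utf-8")).hexdigest()


# Hypothetical fixture: the record each implementation should emit for one (seed, index).
expected = {"name": "John Smith", "email": "john.smith@example.com"}

# The same record as another service might build it, with a different key order.
observed = {"email": "john.smith@example.com", "name": "John Smith"}

# A record whose field name has drifted, the kind of mismatch the check should catch.
drifted = {"fullName": "John Smith", "email": "john.smith@example.com"}

if __name__ == "__main__":
    print(fingerprint(expected) == fingerprint(observed))  # → True
    print(fingerprint(expected) == fingerprint(drifted))   # → False
```

Each language implementation can emit the same canonical form for a handful of (seed, index) pairs, and a CI job only has to compare short hashes.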
3. Multi-Locale Support
Pseudata supports 15 locales across 3 tiers, from English and Chinese to Arabic and Vietnamese. Each locale provides culturally appropriate names, addresses, and geographic data, making it suitable for building and testing globally aware applications.
Why This Matters Now
Micro-frontends, microservices, edge computing: the industry is doubling down on distributed architectures. But we’re still mocking data like it’s 2010.
With Pseudata:
- QA engineers can report a bug in “User #8821”, and the backend developer can instantly reproduce that user’s state locally without a database dump.
- Sales teams can present demos where the data looks consistent across the web dashboard and the mobile app.
- Load-testing scripts in Python can verify the exact data rendered by a Node.js server.
Current Status
Pseudata is currently in active development, with working implementations in four languages.
The initial release will support Go, Java, Python, and TypeScript, with five additional languages planned (C#, Rust, Swift, Dart, PHP). The core architecture is established and demonstrably consistent across all current implementations.
Getting Started
⚠️ Development Status: Pseudata is currently in intensive initial development. There are no publicly available releases or installable versions yet. The core architecture is being finalized across all four languages.
While the libraries are not yet released, you can follow development progress and explore the technical architecture:
Installation (when released):

```bash
go get github.com/pseudata/pseudata
```

```xml
<dependency>
  <groupId>dev.pseudata</groupId>
  <artifactId>pseudata</artifactId>
  <version>0.1.0</version>
</dependency>
```

```bash
pip install pseudata
```

```bash
npm install pseudata
```

Explore Now:
What’s Next?
The roadmap is organized into focused phases:
Phase 1 (Current): Core Language Support
- Finalize Go, Java, Python, and TypeScript implementations
- Ensure 100% cross-language consistency
- Comprehensive test coverage
- Initial stable release (v1.0)
Phase 2: Backend Expansion
- C# for .NET ecosystems
- Rust for high-performance systems
- PHP for web applications
Phase 3: Mobile Platforms
- Swift for iOS development
- Dart for Flutter cross-platform apps
Phase 4: Advanced Features
- Additional data types (Products, Companies, Financial data)
- Custom schema definitions
- Plugin architecture for domain-specific generators
- Performance optimizations
Join the Mission
If you’ve ever struggled with inconsistent mock data across your polyglot stack, Pseudata is being built for you.
The project is open source (Apache 2.0) and welcomes contributors who care about:
- Writing tests that work the same way everywhere
- Building demos with consistent data
- Making mock data generation a solved problem
How to Contribute:
- Star the repositories to show support
- Report bugs or inconsistencies you find
- Suggest new features or data types
- Contribute implementations in new languages
- Improve documentation and examples
Let’s fix mock data. Together.
© 2025 Pseudata Project. Open Source under Apache License 2.0.