Real patterns preserved · Real identities protected
We learn patterns from your data, but never copy actual records. Share synthetic data freely; it can't be traced back.
No credit card · No setup · Works in your browser
CSV or JSON files with automatic schema detection
CTGAN, TVAE, or schema-based generation, with differential privacy (DP) options
Quality metrics and downloadable synthetic datasets
Features
Built-in differential privacy with configurable budget controls. Your data patterns are learned, not copied.
Train ML models on your data, or use schema-based generation to create data instantly without training.
Define your columns and types, get realistic data in seconds. No training, no waiting. Perfect for prototyping.
Compare your synthetic data against the original. Get clear scores for statistical similarity and privacy.
Automatic detection of names, emails, SSNs, and other sensitive fields before you generate.
Export privacy reports and model cards as PDFs for HIPAA, GDPR, and audit documentation.
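To give a feel for what schema-based generation can look like, here is a minimal sketch in Python. It is not the product's implementation: the `SCHEMA` dictionary, the type labels, and the `generate_rows` helper are all hypothetical names chosen for illustration. The idea is only that each declared column type maps to a small random value generator, so no training step is needed.

```python
import random
import string

# Hypothetical schema: column name -> type label. These labels are
# illustrative assumptions, not the product's actual type names.
SCHEMA = {
    "user_id": "int",
    "signup_score": "float",
    "plan": "category",
    "code": "string",
}


def generate_rows(schema, n, seed=0, categories=("free", "pro", "team")):
    """Return n rows of random values matching each column's declared type."""
    rng = random.Random(seed)  # seeded so repeated calls give the same rows
    makers = {
        "int": lambda: rng.randint(1, 10_000),
        "float": lambda: round(rng.uniform(0.0, 1.0), 3),
        "category": lambda: rng.choice(categories),
        "string": lambda: "".join(rng.choices(string.ascii_lowercase, k=8)),
    }
    return [{col: makers[kind]() for col, kind in schema.items()} for _ in range(n)]


rows = generate_rows(SCHEMA, 5)
for row in rows:
    print(row)
```

A sketch like this produces type-correct but statistically naive values. Matching the distributions of a real dataset is what the trained models are for.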
How It Works
Two ways to generate synthetic data: pick what fits your needs
Train ML models on your existing data for realistic synthetic output
Upload CSV/JSON. Schema auto-detected, PII flagged.
Choose CTGAN, TVAE, or Copula. Configure privacy settings.
Create synthetic data. View quality metrics and export.
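The upload step above mentions schema auto-detection and PII flagging. A rough sketch of how that kind of profiling could work is shown below, using only the Python standard library. The function names (`infer_type`, `flag_pii`, `profile_csv`), the regular expressions, and the sample data are assumptions made for illustration; the product's actual detection rules are not described on this page and are likely more thorough.

```python
import csv
import io
import re

# Simplified patterns, assumed for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def infer_type(values):
    """Return 'int', 'float', or 'string' depending on what every value parses as."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for v in values:
                caster(v)
            return name
        except ValueError:
            continue
    return "string"


def flag_pii(values):
    """Return a PII label if every value in the column matches a known pattern."""
    if values and all(EMAIL_RE.match(v) for v in values):
        return "email"
    if values and all(SSN_RE.match(v) for v in values):
        return "ssn"
    return None


def profile_csv(text):
    """Build a per-column report of inferred type and PII flag from CSV text."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return {}
    report = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        report[col] = {"type": infer_type(values), "pii": flag_pii(values)}
    return report


# Made-up sample data for the sketch.
SAMPLE = (
    "age,email,ssn,score\n"
    "34,a@example.com,123-45-6789,0.5\n"
    "51,b@example.org,987-65-4321,1.25\n"
)
report = profile_csv(SAMPLE)
for column, info in report.items():
    print(column, info)
```

Pattern matching of this kind only catches well-formatted values. Free-text names and addresses generally need dictionary or model-based detection.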
Security
Your data never leaves your control. Configurable privacy budgets, full audit trails, and compliance-ready exports keep your team protected.
FAQ
Create privacy-preserving synthetic data in minutes. Free to use, open source, and self-hostable.