Understanding Marginal Accuracy in Population Synthesis

What is Marginal Accuracy?

Marginal accuracy measures how well a synthetic population matches the target distributions of individual variables (like age, gender, or education) without considering their combinations.

Example Scenario

Given these target distributions for a population of 1,000:

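| Variable | Category | Target Count | Target % |
|----------|----------|--------------|----------|
| Age      | 18-25    | 300          | 30%      |
|          | 26-35    | 700          | 70%      |
| Gender   | Male     | 450          | 45%      |
|          | Female   | 550          | 55%      |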
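
To make the definition concrete, here is a minimal Python sketch of a marginal check against the targets above. The function names and the synthetic population are hypothetical, and the joint cell sizes (140, 160, 320, 380) are invented purely for illustration.

```python
from collections import Counter

# Target counts from the example scenario (population of 1,000).
TARGETS = {
    "age": {"18-25": 300, "26-35": 700},
    "gender": {"Male": 450, "Female": 550},
}

def marginal_counts(population, variable):
    """Count how many records fall in each category of one variable."""
    return Counter(person[variable] for person in population)

def marginal_errors(population, targets):
    """Return (synthetic count - target count) for every category."""
    errors = {}
    for variable, target in targets.items():
        counts = marginal_counts(population, variable)
        errors[variable] = {cat: counts.get(cat, 0) - n for cat, n in target.items()}
    return errors

# Hypothetical synthetic population; the cell sizes are made up.
population = (
    [{"age": "18-25", "gender": "Male"}] * 140
    + [{"age": "18-25", "gender": "Female"}] * 160
    + [{"age": "26-35", "gender": "Male"}] * 320
    + [{"age": "26-35", "gender": "Female"}] * 380
)

errors = marginal_errors(population, TARGETS)
print(errors)
```

In this made-up population the age marginal matches exactly, while the gender marginal is off by 10 people in each category. Each variable is checked on its own, so the check says nothing about how age and gender combine.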

How Methods Achieve Marginal Accuracy

| Method                              | Marginal Accuracy                | Mechanism                               | Rating   |
|-------------------------------------|----------------------------------|-----------------------------------------|----------|
| Deterministic Reweighting           | Perfect for 1-2 dimensions       | Direct weight adjustment per variable   | Perfect  |
| Iterative Proportional Fitting (IPF) | High for all dimensions          | Iterative proportional adjustments      | High     |
| Conditional Probabilities           | Approximate                      | Sampling from conditional distributions | Medium   |
| Simulated Annealing                 | Variable (depends on parameters) | Optimization toward targets             | Variable |
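
The "iterative proportional adjustments" used by IPF can be sketched for the two-variable example as follows. This is a minimal pure-Python illustration, not a production implementation: the `ipf` function, its parameters, and the seed table are assumptions introduced here, and the seed stands in for a hypothetical reference sample.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=100):
    """Fit a 2-D table to row and column targets by alternating rescaling."""
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Scale each row so that it sums to its row target.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [v * target / row_sum for v in table[i]]
        # Scale each column so that it sums to its column target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match; stop once the rows match as well.
        row_gap = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if row_gap < tol:
            break
    # If max_iter is reached first, the last (unconverged) table is returned.
    return table

# Rows are age groups (18-25, 26-35); columns are gender (Male, Female).
seed = [[20.0, 10.0], [30.0, 40.0]]  # hypothetical reference sample
fitted = ipf(seed, row_targets=[300, 700], col_targets=[450, 550])

for row in fitted:
    print([round(v, 1) for v in row])
```

The rescaling only changes row and column multipliers, so the fitted table keeps the seed's odds ratio. The association between age and gender therefore comes from the reference data, while the totals come from the targets.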

Tradeoffs with Joint Distributions

Methods that prioritize perfect marginal accuracy often sacrifice realistic joint distributions:

| Method        | Marginal Accuracy | Joint Distribution Quality |
|---------------|-------------------|----------------------------|
| Deterministic | ✅ Perfect        | ❌ Poor                    |
| IPF           | ✅ High           | ⚠️ Moderate                |
| Conditional   | ⚠️ Approximate    | ✅ Best                    |
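
A small sketch shows why perfect marginals do not pin down the joint distribution. Both tables below are hypothetical joint counts (rows are age groups, columns are gender) built to match the example targets exactly, yet they describe very different populations.

```python
def margins(table):
    """Return (row sums, column sums) of a 2-D count table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return rows, cols

def male_share_by_age(table):
    """Share of males (first column) within each age group (row)."""
    return [row[0] / sum(row) for row in table]

# Hypothetical joint tables: rows = (18-25, 26-35), columns = (Male, Female).
independent = [[135, 165], [315, 385]]  # gender independent of age
skewed = [[300, 0], [150, 550]]         # every 18-25 person is male

print(margins(independent))            # ([300, 700], [450, 550])
print(margins(skewed))                 # ([300, 700], [450, 550])
print(male_share_by_age(independent))  # [0.45, 0.45]
print(male_share_by_age(skewed)[0])    # 1.0
```

A metric that looks only at marginals scores both tables as perfect, so joint quality has to be assessed separately.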

Key Considerations

  1. Precision Needs: Health policy models may need perfect marginals, while marketing simulations may prioritize realistic combinations
  2. Variable Importance: Some variables (like age in healthcare) may need higher marginal accuracy than others
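  3. Convergence: IPF converges to the marginal targets when the constraints are compatible, meaning every margin sums to the same population total and the seed table has no zero cells that make a target unreachable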
  4. Sample Size: Conditional methods need sufficient reference data to estimate probabilities accurately
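
The sample-size point can be illustrated with a minimal sketch of conditional sampling: age is drawn from its target marginal, then gender is drawn from P(gender | age) estimated from reference data. The function names, the reference records, and the fixed random seed are all assumptions made for this example.

```python
import random
from collections import Counter

def estimate_conditionals(reference):
    """Estimate P(gender | age) from (age, gender) reference records."""
    by_age = {}
    for age, gender in reference:
        by_age.setdefault(age, Counter())[gender] += 1
    return {
        age: {g: n / sum(counts.values()) for g, n in counts.items()}
        for age, counts in by_age.items()
    }

def synthesize(n, age_shares, conditionals, rng):
    """Draw age from its marginal, then gender from P(gender | age)."""
    ages = list(age_shares)
    age_weights = [age_shares[a] for a in ages]
    people = []
    for _ in range(n):
        age = rng.choices(ages, weights=age_weights)[0]
        genders = list(conditionals[age])
        weights = [conditionals[age][g] for g in genders]
        people.append((age, rng.choices(genders, weights=weights)[0]))
    return people

# Hypothetical reference data: 100 records per age group.
reference = (
    [("18-25", "Male")] * 40 + [("18-25", "Female")] * 60
    + [("26-35", "Male")] * 45 + [("26-35", "Female")] * 55
)
conditionals = estimate_conditionals(reference)

rng = random.Random(0)  # fixed seed so the run is repeatable
people = synthesize(1000, {"18-25": 0.30, "26-35": 0.70}, conditionals, rng)
age_counts = Counter(age for age, _ in people)
gender_counts = Counter(gender for _, gender in people)
print(age_counts, gender_counts)
```

With these made-up conditionals the expected male share is 0.30 × 0.40 + 0.70 × 0.45 = 0.435, not the 45% target, and sampling noise is added on top. The gender marginal is only as accurate as the reference data behind the estimated probabilities, and a smaller reference sample makes those estimates noisier.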

Measuring Marginal Accuracy

Common metrics include: