Joint distributions capture how variables co-occur in a population, while marginal distributions describe each variable on its own. A joint distribution over age, gender, and education assigns a probability to each combination of values, for example:

Age | Gender | Education | Probability |
---|---|---|---|
18-25 | Female | Degree | 12% |
26-35 | Male | School | 8% |
18-25 | Male | Degree | 5% |

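
To make the difference between joint and marginal distributions concrete, here is a minimal Python sketch. The two joint tables `joint_a` and `joint_b` and the helper `marginal` are hypothetical and are not taken from the figures above; they are chosen only so that both tables share the same marginals while describing different co-occurrence patterns.

```python
from collections import defaultdict

# Hypothetical joint distributions over (gender, education).
# joint_a: gender and education are strongly associated.
joint_a = {
    ("Female", "Degree"): 0.40,
    ("Female", "School"): 0.10,
    ("Male", "Degree"): 0.10,
    ("Male", "School"): 0.40,
}
# joint_b: gender and education are independent.
joint_b = {
    ("Female", "Degree"): 0.25,
    ("Female", "School"): 0.25,
    ("Male", "Degree"): 0.25,
    ("Male", "School"): 0.25,
}


def marginal(joint, axis):
    """Sum joint probabilities over every variable except the one at `axis`."""
    totals = defaultdict(float)
    for combo, p in joint.items():
        totals[combo[axis]] += p
    # Round to suppress floating-point noise before comparing.
    return {value: round(p, 10) for value, p in totals.items()}


# Both tables produce the same gender and education marginals ...
print(marginal(joint_a, 0), marginal(joint_b, 0))
print(marginal(joint_a, 1), marginal(joint_b, 1))
# ... even though the joint tables themselves differ.
print(joint_a == joint_b)
```

A method that only checks the marginals cannot tell these two populations apart, which is why matching marginals does not guarantee realistic combinations.
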

Method | Joint Distribution Preservation | Visualization |
---|---|---|
Deterministic Reweighting | Only matches single-variable marginals well; may distort natural relationships | ✅ Age ✅ Gender ✅ Education ❌ Combinations |
Iterative Proportional Fitting (IPF) | Better at preserving 2-way relationships; 3+ way interactions might still be off | ✅ Age×Gender ✅ Gender×Education ⚠️ Age×Gender×Education |
Conditional Probabilities | Explicitly models multi-way dependencies; best preserves realistic combinations | ✅ Age×Gender×Education ✅ Natural clustering |

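
As an illustration of the IPF row in the table, the following is a minimal two-dimensional sketch in plain Python. The function name `ipf`, the seed table, and the target marginals are all assumptions made for this example. A real application would typically use more variables and an established library.

```python
def ipf(seed, row_targets, col_targets, iterations=200, tol=1e-9):
    """Rescale a 2-D seed table until row and column sums match the targets.

    Assumes all seed cells are positive and both target lists sum to the
    same total.
    """
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(iterations):
        # Row step: scale each row to its target sum.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            table[i] = [v * target / row_sum for v in table[i]]
        # Column step: scale each column to its target sum.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / col_sum
        # Columns are exact after the column step, so check the rows.
        if all(abs(sum(table[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return table


# Hypothetical seed counts: rows = gender (Female, Male),
# columns = education (Degree, School).
seed = [[40.0, 10.0], [10.0, 40.0]]
row_targets = [0.6, 0.4]  # hypothetical target gender marginal
col_targets = [0.3, 0.7]  # hypothetical target education marginal

fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(v, 4) for v in row])
```

Because IPF only multiplies rows and columns by scaling factors, the cross-product ratio of the seed (here 40×40 / (10×10) = 16) carries over to the fitted table. That is the sense in which IPF keeps the two-way association of the seed while hitting the marginal targets. Interactions that are absent from the seed or the targets are not recovered.
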
A real population might show:
Conditional probability methods will maintain these natural relationships, while simpler methods might artificially flatten them to hit marginal targets.
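
One way to picture the conditional probability approach is sequential sampling: draw age first, then gender given age, then education given age and gender. The sketch below assumes that ordering and uses entirely hypothetical probability tables (`p_age`, `p_gender_given_age`, `p_edu_given_age_gender`); it is not a description of any particular tool.

```python
import random

# Hypothetical conditional probability tables.
p_age = {"18-25": 0.4, "26-35": 0.6}
p_gender_given_age = {
    "18-25": {"Female": 0.55, "Male": 0.45},
    "26-35": {"Female": 0.50, "Male": 0.50},
}
p_edu_given_age_gender = {
    ("18-25", "Female"): {"Degree": 0.6, "School": 0.4},
    ("18-25", "Male"): {"Degree": 0.4, "School": 0.6},
    ("26-35", "Female"): {"Degree": 0.5, "School": 0.5},
    ("26-35", "Male"): {"Degree": 0.3, "School": 0.7},
}


def draw(dist, rng):
    """Draw one value from a {value: probability} mapping."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]


def sample_person(rng):
    """Sample (age, gender, education), each conditioned on the earlier draws."""
    age = draw(p_age, rng)
    gender = draw(p_gender_given_age[age], rng)
    education = draw(p_edu_given_age_gender[(age, gender)], rng)
    return age, gender, education


def joint_probability(age, gender, education):
    """Chain rule: P(age) * P(gender | age) * P(education | age, gender)."""
    return (
        p_age[age]
        * p_gender_given_age[age][gender]
        * p_edu_given_age_gender[(age, gender)][education]
    )


rng = random.Random(0)  # fixed seed so the run is repeatable
population = [sample_person(rng) for _ in range(10_000)]

combo = ("18-25", "Female", "Degree")
print("model probability:", joint_probability(*combo))
print("sampled share:", population.count(combo) / len(population))
```

Since every draw is conditioned on the values already chosen, the three-way combinations in the synthetic population follow the modeled joint distribution rather than being a by-product of matching separate marginals.
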