Synthetic Population Generation Using Simulated Annealing

Objective

To create a synthetic population for each census area by combining individual-level survey data (Understanding Society) with aggregated census data using Simulated Annealing (SA). Each synthetic population is selected to match the census demographic distributions as closely as possible while preserving the individual-level characteristics of the survey respondents.

Methodology

1. Data Preparation

2. Parallel Simulated Annealing

graph TD
    E[NumPy Array] -->|Shared Data| F[Parallel SA Threads]
    D[SA Constraints] --> F
    F -->|Best-Fit IDs| G[Synthetic Population per Area]
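The step in this diagram can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes each survey respondent is encoded as a single integer category code in a NumPy array, and that each area's constraint is a vector of target counts per category. The names `fitness`, `anneal_area`, and `run_parallel`, the geometric cooling schedule, and the "swap one individual" move are all assumptions made for the example.

```python
import math
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def fitness(codes, selected, target):
    """Total absolute error between selected category counts and census targets.

    Assumes every value in `codes` lies in the range [0, len(target)).
    """
    counts = np.bincount(codes[selected], minlength=len(target))
    return int(np.abs(counts - target).sum())


def anneal_area(codes, target, seed=0, steps=5000, t0=2.0, cooling=0.999):
    """Simulated annealing for one area: returns (best row indices, best error)."""
    rng = np.random.default_rng(seed)
    size = int(target.sum())  # population size required by the census
    current = rng.integers(0, len(codes), size=size)  # sample with replacement
    cur_err = fitness(codes, current, target)
    best, best_err = current.copy(), cur_err
    temp = t0
    for _ in range(steps):
        if best_err == 0:
            break  # perfect fit found
        candidate = current.copy()
        # Move: replace one selected individual with a random survey respondent.
        candidate[rng.integers(size)] = rng.integers(len(codes))
        cand_err = fitness(codes, candidate, target)
        delta = cand_err - cur_err
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_err = candidate, cand_err
            if cur_err < best_err:
                best, best_err = current.copy(), cur_err
        temp *= cooling  # geometric cooling (an assumed schedule)
    return best, best_err


def run_parallel(codes, targets_by_area, workers=4):
    """Run one SA search per area; every thread reads the same shared array."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            area: pool.submit(anneal_area, codes, target, seed=i)
            for i, (area, target) in enumerate(targets_by_area.items())
        }
        return {area: fut.result() for area, fut in futures.items()}
```

Threads can read the same array without copying it, which is why the sketch only ever reads from `codes`. Note that under CPython's global interpreter lock, threads running a Python-level loop like this one do not execute simultaneously on multiple cores, so the achievable speed-up depends on how the real fitness function is implemented.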

3. Population Reconstruction

graph LR
    G[Synthetic Population IDs] --> H[Link to Survey Data]
    H --> I[Final Synthetic Dataset]
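A minimal sketch of the linking step, assuming the survey is available as a mapping from respondent ID to that respondent's attributes. The function name `reconstruct` and the `survey_by_id` structure are hypothetical, chosen only for illustration.

```python
def reconstruct(selected_ids, survey_by_id):
    """Look up each selected ID in the survey data.

    An ID that SA selected more than once yields one output row per selection.
    """
    return [{"ID": pid, **survey_by_id[pid]} for pid in selected_ids]


# Illustrative survey records (assumed layout).
survey_by_id = {
    1: {"Age": 25, "Gender": "Female"},
    2: {"Age": 35, "Gender": "Male"},
    3: {"Age": 40, "Gender": "Female"},
}

population = reconstruct([1, 3, 2], survey_by_id)
for row in population:
    print(row["ID"], row["Age"], row["Gender"])
```

Because the SA step only passes IDs around, the full survey attributes are joined in once at the end rather than being copied into every candidate solution.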

Example Output (Area 1):
Census Requirements:
- Females (20-40) = 2
- Males (20-40) = 1

SA Solution:
Selected IDs = [1, 3, 2]

Final Population:
ID | Age | Gender
1 | 25 | Female
3 | 40 | Female
2 | 35 | Male
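
The Area 1 example can be checked mechanically. The sketch below recounts the final population by gender and age band and compares the counts to the census requirements. The helper names `band` and `total_error` are hypothetical, and the band is assumed to be inclusive at both ends, since the respondent aged 40 is counted under "20-40" above.

```python
from collections import Counter


def band(age):
    # Assumption: the band includes both endpoints (age 40 counts as "20-40").
    return "20-40" if 20 <= age <= 40 else "other"


population = [
    {"ID": 1, "Age": 25, "Gender": "Female"},
    {"ID": 3, "Age": 40, "Gender": "Female"},
    {"ID": 2, "Age": 35, "Gender": "Male"},
]
requirements = {("Female", "20-40"): 2, ("Male", "20-40"): 1}


def total_error(population, requirements):
    """Sum of absolute differences between observed and required counts."""
    counts = Counter((p["Gender"], band(p["Age"])) for p in population)
    keys = set(counts) | set(requirements)
    return sum(abs(counts.get(k, 0) - requirements.get(k, 0)) for k in keys)


print(total_error(population, requirements))  # 0: the solution meets both requirements
```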

Full System Workflow

graph TD
    A[Survey Data] --> B(Encode Categories)
    C[Census Data] -->|Category Definitions| B
    C --> D(SA Constraints)
    B --> E[Shared NumPy Array]
    E --> F[Parallel SA Threads]
    D --> F
    F --> G[Synthetic Populations]
    G --> H[Link IDs to Survey Data]
    H --> I[Final Synthetic Dataset]
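
The first half of this workflow (encoding the survey with the census category definitions and deriving the SA constraints) might look like the sketch below. The `CATEGORIES` list, the `(id, age, gender)` row layout, and the choice to drop respondents who fall outside every census category are assumptions for illustration; the real category definitions come from the census data.

```python
import numpy as np

# Hypothetical census category definitions: (gender, age_low, age_high), inclusive.
CATEGORIES = [
    ("Female", 20, 40),
    ("Male", 20, 40),
]


def encode(gender, age):
    """Return the index of the census category a respondent falls in, or -1."""
    for code, (g, lo, hi) in enumerate(CATEGORIES):
        if gender == g and lo <= age <= hi:
            return code
    return -1


def build_arrays(survey_rows):
    """Encode (id, age, gender) rows; drop respondents outside every category."""
    coded = [(pid, encode(gender, age)) for pid, age, gender in survey_rows]
    kept = [(pid, code) for pid, code in coded if code >= 0]
    ids = np.array([pid for pid, _ in kept], dtype=np.int64)
    codes = np.array([code for _, code in kept], dtype=np.int64)
    return ids, codes


def constraints(census_counts):
    """Turn {(gender, lo, hi): count} into a target vector aligned with CATEGORIES."""
    return np.array([census_counts.get(c, 0) for c in CATEGORIES], dtype=np.int64)
```

Keeping `ids` and `codes` as parallel arrays means the SA step can work on compact integer codes while the IDs remain available for linking back to the survey afterwards.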

Key Enhancements

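Consistent Categorization: The survey data is encoded with the same category definitions as the census, so survey records and census counts are directly comparable.
Parallel Workflow: The encoded survey data is held in a shared NumPy array, a format that is easy to share among threads and straightforward for the fitness function to evaluate.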
