In market sizing and forecasting, we’re often tasked with estimating large data tables with limited data or only tables of subtotals. For example, one might know the number of people employed in the a country by age range, as well as the gender breakdown of total employment without age. Using iterative proportion fitting with these two data series, one can estimate the number of people employed by age range AND gender.
Iterative proportion fitting (IPF) has been used in market analysis for over 30 years. This package integrates the methodology, along with some more expansions of this technique, into the tidyverse in R.
The ipfitr package for R allows the user to input multiple high-level summary/aggregate data tables and create a single, full-dimension data table estimating individual cell values using iterative proportion fitting.
The ipfitr package performs iterative proportion fitting on a seed datatable, continuously scaling values to each target until the seed values sum to every supplied target. If no seed is supplied, the function begins with a seed of 1 for every value.
Changing the seed will result in different final results (see Examples 1a and 2 in the demo). For this reason, it’s best to include as much information as you can about the relative values of the cells in the seed.
This package is built using the tidyverse packages.
The functions in ipfitr can be run independently, but are more useful as part of a %>% workflow.
For the versions available, see the tags on this repository.
See also the list of contributors who participated in this project.