calibration
Define the calibration class
Classes
| Name | Description |
|---|---|
| Calibration | A class to handle calibration of HPVsim simulations. Uses the Optuna hyperparameter optimization library. |
Calibration
calibration.Calibration(
sim,
datafiles,
calib_pars=None,
genotype_pars=None,
hiv_pars=None,
fit_args=None,
extra_sim_result_keys=None,
par_samplers=None,
n_trials=None,
n_workers=None,
total_trials=None,
name=None,
db_name=None,
estimator=None,
keep_db=None,
storage=None,
rand_seed=None,
sampler=None,
label=None,
die=False,
verbose=True,
)
A class to handle calibration of HPVsim simulations. Uses the Optuna hyperparameter optimization library (optuna.org), which must be installed separately (via pip install optuna).
Note: running a calibration does not guarantee a good fit! You must ensure that you run for a sufficient number of iterations, have enough free parameters, and that the parameters have wide enough bounds. Please see the tutorial on calibration for more information.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| sim | Sim | the simulation to calibrate | required |
| datafiles | list | list of datafile strings to calibrate to | required |
| calib_pars | dict | a dictionary of the parameters to calibrate, in the format dict(key1=[best, low, high]) | None |
| genotype_pars | dict | a dictionary of the genotype-specific parameters to calibrate, in the format dict(genotype=dict(key1=[best, low, high])) | None |
| hiv_pars | dict | a dictionary of the HIV-specific parameters to calibrate, in the format dict(key1=[best, low, high]) | None |
| extra_sim_result_keys | list | list of result strings to store | None |
| fit_args | dict | a dictionary of options that are passed to sim.compute_fit() to calculate the goodness-of-fit | None |
| par_samplers | dict | an optional mapping from parameters to the Optuna sampler to use for choosing new points for each; by default, suggest_float | None |
| n_trials | int | the number of trials per worker | None |
| n_workers | int | the number of parallel workers (default: maximum) | None |
| total_trials | int | if n_trials is not supplied, it is calculated by dividing this number by n_workers | None |
| name | str | the name of the database (default: ‘hpvsim_calibration’) | None |
| db_name | str | the name of the database file (default: ‘hpvsim_calibration.db’) | None |
| keep_db | bool | whether to keep the database after calibration (default: false) | None |
| storage | str | the location of the database (default: sqlite) | None |
| rand_seed | int | if provided, use this random seed to initialize Optuna runs (for reproducibility) | None |
| label | str | a label for this calibration object | None |
| die | bool | whether to stop if an exception is encountered | False |
| verbose | bool | whether to print details of the calibration | True |
| kwargs | dict | passed to hpv.Calibration() | |
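The [best, low, high] format described above is easy to get wrong (for example, by swapping the best value and the lower bound). The following is a minimal sketch of a sanity check that could be run before building a calibration. The helper name check_calib_pars is hypothetical and is not part of HPVsim; the parameter names are taken from the examples on this page.

```python
def check_calib_pars(calib_pars):
    """Hypothetical helper (not part of HPVsim): validate {name: [best, low, high]}."""
    problems = []
    for name, spec in calib_pars.items():
        if len(spec) != 3:
            problems.append(f"{name}: expected [best, low, high], got {spec!r}")
            continue
        best, low, high = spec
        if not (low <= best <= high):
            problems.append(f"{name}: best={best} is outside [{low}, {high}]")
    return problems


# Same shapes as the calib_pars and genotype_pars arguments described above
calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
genotype_pars = dict(hpv16=dict(prog_time=[3, 3, 10]))

problems = check_calib_pars(calib_pars)
for genotype, gpars in genotype_pars.items():
    # Genotype parameters are nested one level deeper, keyed by genotype
    problems += [f"{genotype}.{msg}" for msg in check_calib_pars(gpars)]

print(problems)
```

An empty list means every entry has three values and its best guess lies within its bounds.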
Returns
| Name | Type | Description |
|---|---|---|
| | Calibration | A Calibration object |
Example:
import hpvsim as hpv
sim = hpv.Sim(pars, genotypes=[16, 18])  # pars: sim parameters defined elsewhere
calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
calib = hpv.Calibration(sim, calib_pars=calib_pars,
                        datafiles=['test_data/south_africa_hpv_data.xlsx',
                                   'test_data/south_africa_cancer_data.xlsx'],
                        total_trials=10, n_workers=4)
calib.calibrate()
calib.plot()
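The example passes total_trials=10 and n_workers=4, which do not divide evenly. As a rough sketch of the arithmetic, assuming the per-worker count is rounded up (this page does not state the rounding rule, so treat it as an assumption and check the HPVsim source for the actual behavior):

```python
import math


def trials_per_worker(total_trials, n_workers):
    # Assumption: round up, so that at least total_trials trials run overall
    return math.ceil(total_trials / n_workers)


n_workers = 4
n_trials = trials_per_worker(total_trials=10, n_workers=n_workers)
print(n_trials)              # 3 trials per worker
print(n_trials * n_workers)  # 12 trials in total under this assumption
```

If an exact trial count matters, passing n_trials directly avoids any dependence on the rounding.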
Methods
| Name | Description |
|---|---|
| calibrate | Actually perform calibration. |
| get_full_pars | Make a full pardict from the subset of regular sim parameters, genotype parameters, and hiv parameters used in calibration |
| make_study | Make a study, deleting one if it already exists |
| parse_study | Parse the study into a data frame – called automatically |
| plot | Plot the calibration results |
| remove_db | Remove the database file if keep_db is false and the path exists. |
| run_sim | Create and run a simulation |
| run_trial | Define the objective for Optuna |
| run_workers | Run multiple workers in parallel |
| sim_to_sample_pars | Convert sim pars to sample pars |
| to_json | Convert the data to JSON. |
| trial_pars_to_sim_pars | Create genotype_pars and pars dicts from the trial parameters. |
| trial_to_sim_pars | Take in an optuna trial and sample from pars, after extracting them from the structure they’re provided in |
| update_dict_pars | Function to update parameters from nested dict to nested dict’s value |
| update_dict_pars_from_trial | Function to update parameters from nested dict to trial parameter’s value |
| update_dict_pars_init_and_bounds | Function to update initial parameters and parameter bounds from a trial pars dict |
| worker | Run a single worker |
calibrate
calibration.Calibration.calibrate(
calib_pars=None,
genotype_pars=None,
hiv_pars=None,
verbose=True,
load=True,
tidyup=True,
**kwargs,
)
Actually perform calibration.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| calib_pars | dict | if supplied, overwrite stored calib_pars | None |
| verbose | bool | whether to print output from each trial | True |
| kwargs | dict | if supplied, overwrite stored run_args (n_trials, n_workers, etc.) | {} |
get_full_pars
calibration.Calibration.get_full_pars(
sim=None,
calib_pars=None,
genotype_pars=None,
hiv_pars=None,
)
Make a full pardict from the subset of regular sim parameters, genotype parameters, and hiv parameters used in calibration
make_study
calibration.Calibration.make_study()
Make a study, deleting one if it already exists
parse_study
calibration.Calibration.parse_study(study)
Parse the study into a data frame – called automatically
plot
calibration.Calibration.plot(
res_to_plot=None,
fig_args=None,
axis_args=None,
data_args=None,
show_args=None,
do_save=None,
fig_path=None,
do_show=True,
plot_type='sns.boxplot',
**kwargs,
)
Plot the calibration results
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| res_to_plot | int | number of results to plot. if None, plot them all | None |
| fig_args | dict | passed to pl.figure() | None |
| axis_args | dict | passed to pl.subplots_adjust() | None |
| data_args | dict | ‘width’, ‘color’, and ‘offset’ arguments for the data | None |
| do_save | bool | whether to save | None |
| fig_path | str or filepath | filepath to save to | None |
| do_show | bool | whether to show the figure | True |
| kwargs | dict | passed to hpv.options.with_style(); see that function for choices | {} |
remove_db
calibration.Calibration.remove_db()
Remove the database file if keep_db is false and the path exists.
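The rule described here (delete only when keep_db is false and the file exists) can be sketched with the standard library alone. This is an illustrative stand-in, not HPVsim's implementation; the free function remove_db and its return value are assumptions made for the example.

```python
import os
import tempfile


def remove_db(db_name, keep_db=False):
    """Hypothetical stand-in: delete db_name unless keep_db is set; return True if removed."""
    if not keep_db and os.path.exists(db_name):
        os.remove(db_name)
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "hpvsim_calibration.db")
    open(db_path, "w").close()                    # create an empty placeholder file

    kept = remove_db(db_path, keep_db=True)       # keep_db set: nothing is deleted
    still_there = os.path.exists(db_path)
    removed = remove_db(db_path, keep_db=False)   # file exists: it is deleted
    again = remove_db(db_path, keep_db=False)     # file already gone: a no-op
```

Checking for existence first makes the call safe to repeat, which matters when cleanup may run more than once.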
run_sim
calibration.Calibration.run_sim(
calib_pars=None,
genotype_pars=None,
hiv_pars=None,
label=None,
return_sim=False,
)
Create and run a simulation
run_trial
calibration.Calibration.run_trial(trial, save=True)
Define the objective for Optuna
run_workers
calibration.Calibration.run_workers()
Run multiple workers in parallel
sim_to_sample_pars
calibration.Calibration.sim_to_sample_pars()
Convert sim pars to sample pars
to_json
calibration.Calibration.to_json(filename=None, indent=2, **kwargs)
Convert the data to JSON.
trial_pars_to_sim_pars
calibration.Calibration.trial_pars_to_sim_pars(
trial_pars=None,
which_pars=None,
return_full=True,
)
Create genotype_pars and pars dicts from the trial parameters. Note: not used during self.calibrate.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| trial_pars | dict | dictionary of parameters from a single trial. If not provided, best parameters will be used | None |
| return_full | bool | whether to return a unified par dict ready for use in a sim, or the sim pars and genotype pars separately | True |
Example:
import hpvsim as hpv
sim = hpv.Sim(genotypes=[16, 18])
calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
genotype_pars = dict(hpv16=dict(prog_time=[3, 3, 10]))
calib = hpv.Calibration(sim, calib_pars=calib_pars, genotype_pars=genotype_pars,
                        datafiles=['test_data/south_africa_hpv_data.xlsx',
                                   'test_data/south_africa_cancer_data.xlsx'],
                        total_trials=10, n_workers=4)
calib.calibrate()
new_pars = calib.trial_pars_to_sim_pars()  # Returns best parameters from calibration in a format ready for sim running
sim.update_pars(new_pars)
sim.run()
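To make the return_full distinction concrete, here is a minimal pure-Python sketch of regrouping a flat set of trial values into either one unified dict or separate sim and genotype dicts. It is not HPVsim's implementation: the function name, the "<genotype>_<key>" naming of trial values, and the "genotype_pars" entry in the unified dict are all assumptions made for illustration.

```python
def trial_pars_to_nested(trial_pars, calib_pars, genotype_pars, return_full=True):
    """Hypothetical stand-in for regrouping flat trial values into sim-style dicts."""
    # Plain sim parameters are looked up by their own name
    sim_pars = {key: trial_pars[key] for key in calib_pars}
    # Assumption: genotype parameters are stored as "<genotype>_<key>" in the trial
    geno_pars = {
        genotype: {key: trial_pars[f"{genotype}_{key}"] for key in keys}
        for genotype, keys in genotype_pars.items()
    }
    if return_full:
        # One dict, with the genotype values nested under a "genotype_pars" entry
        return {**sim_pars, "genotype_pars": geno_pars}
    return sim_pars, geno_pars


calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
genotype_pars = dict(hpv16=dict(prog_time=[3, 3, 10]))
# Made-up values standing in for one trial's sampled parameters
trial_pars = dict(beta=0.08, hpv_control_prob=0.7, hpv16_prog_time=4.5)

full = trial_pars_to_nested(trial_pars, calib_pars, genotype_pars)
sim_only, geno_only = trial_pars_to_nested(trial_pars, calib_pars, genotype_pars, return_full=False)
print(full)
```

The unified form is convenient for passing straight to a sim, while the separate form is easier to inspect or modify one group at a time.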
trial_to_sim_pars
calibration.Calibration.trial_to_sim_pars(pardict=None, trial=None)
Take in an optuna trial and sample from pars, after extracting them from the structure they’re provided in
update_dict_pars
calibration.Calibration.update_dict_pars(name_pars, value_pars)
Function to update parameters from nested dict to nested dict’s value
update_dict_pars_from_trial
calibration.Calibration.update_dict_pars_from_trial(name_pars, value_pars)
Function to update parameters from nested dict to trial parameter’s value
update_dict_pars_init_and_bounds
calibration.Calibration.update_dict_pars_init_and_bounds(
initial_pars,
par_bounds,
target_pars,
)
Function to update initial parameters and parameter bounds from a trial pars dict
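As an illustration of separating initial values from bounds, the sketch below walks a possibly nested dict of [best, low, high] entries and returns two dicts with the same shape. It is a simplified, hypothetical helper: unlike the method above, it returns new dicts instead of updating initial_pars and par_bounds in place, and it makes no claim about how HPVsim stores these structures internally.

```python
def split_init_and_bounds(target_pars):
    """Hypothetical helper: split {name: [best, low, high]} into initial values and bounds."""
    initial, bounds = {}, {}
    for key, val in target_pars.items():
        if isinstance(val, dict):
            # Nested dicts (e.g. per-genotype parameters) are handled recursively
            initial[key], bounds[key] = split_init_and_bounds(val)
        else:
            best, low, high = val
            initial[key] = best
            bounds[key] = (low, high)
    return initial, bounds


target_pars = dict(
    beta=[0.05, 0.010, 0.20],
    hpv16=dict(prog_time=[3, 3, 10]),
)
initial_pars, par_bounds = split_init_and_bounds(target_pars)
print(initial_pars)
print(par_bounds)
```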
worker
calibration.Calibration.worker()
Run a single worker