calibration

Define the calibration class

Classes

Name	Description
Calibration	A class to handle calibration of HPVsim simulations. Uses the Optuna hyperparameter optimization library.

Calibration

calibration.Calibration(
    sim,
    datafiles,
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
    fit_args=None,
    extra_sim_result_keys=None,
    par_samplers=None,
    n_trials=None,
    n_workers=None,
    total_trials=None,
    name=None,
    db_name=None,
    estimator=None,
    keep_db=None,
    storage=None,
    rand_seed=None,
    sampler=None,
    label=None,
    die=False,
    verbose=True,
)

A class to handle calibration of HPVsim simulations. Uses the Optuna hyperparameter optimization library (optuna.org), which must be installed separately (via pip install optuna).

Note: running a calibration does not guarantee a good fit! You must ensure that you run for a sufficient number of iterations, have enough free parameters, and that the parameters have wide enough bounds. Please see the tutorial on calibration for more information.

Parameters

Name	Description	Default
sim (Sim)	the simulation to calibrate	required
datafiles (list)	list of datafile strings to calibrate to	required
calib_pars (dict)	a dictionary of the parameters to calibrate of the format dict(key1=[best, low, high])	`None`
genotype_pars (dict)	a dictionary of the genotype-specific parameters to calibrate of the format dict(genotype=dict(key1=[best, low, high]))	`None`
hiv_pars (dict)	a dictionary of the HIV-specific parameters to calibrate of the format dict(key1=[best, low, high])	`None`
extra_sim_result_keys (list)	list of result strings to store	`None`
fit_args (dict)	a dictionary of options that are passed to sim.compute_fit() to calculate the goodness-of-fit	`None`
par_samplers (dict)	an optional mapping from parameters to the Optuna sampler to use for choosing new points for each; by default, suggest_float	`None`
n_trials (int)	the number of trials per worker	`None`
n_workers (int)	the number of parallel workers (default: maximum)	`None`
total_trials (int)	if n_trials is not supplied, calculate it by dividing this number by n_workers	`None`
name (str)	the name of the database (default: ‘hpvsim_calibration’)	`None`
db_name (str)	the name of the database file (default: ‘hpvsim_calibration.db’)	`None`
keep_db (bool)	whether to keep the database after calibration (default: false)	`None`
storage (str)	the location of the database (default: sqlite)	`None`
rand_seed (int)	if provided, use this random seed to initialize Optuna runs (for reproducibility)	`None`
label (str)	a label for this calibration object	`None`
die (bool)	whether to stop if an exception is encountered (default: false)	`False`
verbose (bool)	whether to print details of the calibration	`True`
kwargs (dict)	passed to hpv.Calibration()	
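
The relationship between n_trials, n_workers, and total_trials described in the table can be sketched in plain Python. This is an illustration only, not HPVsim's own code: the helper name `trials_per_worker` is hypothetical, and rounding up is an assumption, since the table only says that the total is divided by the number of workers.

```python
import math

def trials_per_worker(total_trials, n_workers, n_trials=None):
    """Hypothetical helper: resolve how many trials each worker runs."""
    if n_trials is not None:
        return n_trials  # An explicit n_trials takes precedence
    if n_workers < 1:
        raise ValueError('n_workers must be at least 1')
    # Assumption: round up so that at least total_trials are run overall
    return math.ceil(total_trials / n_workers)

print(trials_per_worker(total_trials=10, n_workers=4))
```

With rounding up, the workers together may run slightly more than total_trials; rounding down would instead run slightly fewer.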

Returns

Name	Type	Description
		A Calibration object

Example:

import hpvsim as hpv

pars = dict(n_agents=5e3, start=1980, end=2020)  # Illustrative sim parameters (assumed values)
sim = hpv.Sim(pars, genotypes=[16, 18])
calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])  # [best, low, high]
calib = hpv.Calibration(sim, calib_pars=calib_pars,
                        datafiles=['test_data/south_africa_hpv_data.xlsx',
                                   'test_data/south_africa_cancer_data.xlsx'],
                        total_trials=10, n_workers=4)
calib.calibrate()
calib.plot()
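
Because a calibration run can be long, it may help to check the [best, low, high] convention used for calib_pars before starting. The sketch below is a standalone check in plain Python; `check_calib_pars` is a hypothetical helper and is not part of HPVsim.

```python
def check_calib_pars(calib_pars):
    """Hypothetical check: each entry is [best, low, high] with low <= best <= high."""
    problems = []
    for key, spec in calib_pars.items():
        if len(spec) != 3:
            problems.append(f'{key}: expected 3 values, got {len(spec)}')
            continue
        best, low, high = spec
        if not (low <= best <= high):
            problems.append(f'{key}: best={best} is outside [{low}, {high}]')
    return problems

calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
print(check_calib_pars(calib_pars))  # An empty list means no problems were found
```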

Methods

Name	Description
calibrate	Actually perform calibration.
get_full_pars	Make a full pardict from the subset of regular sim parameters, genotype parameters, and hiv parameters used in calibration
make_study	Make a study, deleting one if it already exists
parse_study	Parse the study into a data frame – called automatically
plot	Plot the calibration results
remove_db	Remove the database file if keep_db is false and the path exists.
run_sim	Create and run a simulation
run_trial	Define the objective for Optuna
run_workers	Run multiple workers in parallel
sim_to_sample_pars	Convert sim pars to sample pars
to_json	Convert the data to JSON.
trial_pars_to_sim_pars	Create genotype_pars and pars dicts from the trial parameters.
trial_to_sim_pars	Take in an optuna trial and sample from pars, after extracting them from the structure they’re provided in
update_dict_pars	Function to update parameters from nested dict to nested dict’s value
update_dict_pars_from_trial	Function to update parameters from nested dict to trial parameter’s value
update_dict_pars_init_and_bounds	Function to update initial parameters and parameter bounds from a trial pars dict
worker	Run a single worker

calibrate

calibration.Calibration.calibrate(
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
    verbose=True,
    load=True,
    tidyup=True,
    **kwargs,
)

Actually perform calibration.

Parameters

Name	Type	Description	Default
calib_pars	dict	if supplied, overwrite stored calib_pars	`None`
verbose	bool	whether to print output from each trial	`True`
kwargs	dict	if supplied, overwrite stored run_args (n_trials, n_workers, etc.)	`{}`

get_full_pars

calibration.Calibration.get_full_pars(
    sim=None,
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
)

Make a full pardict from the subset of regular sim parameters, genotype parameters, and hiv parameters used in calibration

make_study

calibration.Calibration.make_study()

Make a study, deleting one if it already exists

parse_study

calibration.Calibration.parse_study(study)

Parse the study into a data frame – called automatically

plot

calibration.Calibration.plot(
    res_to_plot=None,
    fig_args=None,
    axis_args=None,
    data_args=None,
    show_args=None,
    do_save=None,
    fig_path=None,
    do_show=True,
    plot_type='sns.boxplot',
    **kwargs,
)

Plot the calibration results

Parameters

Name	Type	Description	Default
res_to_plot	int	number of results to plot; if None, plot them all	`None`
fig_args	dict	passed to pl.figure()	`None`
axis_args	dict	passed to pl.subplots_adjust()	`None`
data_args	dict	‘width’, ‘color’, and ‘offset’ arguments for the data	`None`
do_save	bool	whether to save	`None`
fig_path	str or `filepath`	filepath to save to	`None`
do_show	bool	whether to show the figure	`True`
kwargs	dict	passed to `hpv.options.with_style()`; see that function for choices	`{}`

remove_db

calibration.Calibration.remove_db()

Remove the database file if keep_db is false and the path exists.
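
The behavior described here amounts to a guarded file deletion. A minimal standalone sketch follows, assuming the database is a single file on disk; the function below is a hypothetical stand-in, not the method's actual source.

```python
import os
import tempfile

def remove_db(db_name, keep_db=False):
    """Hypothetical standalone version: delete db_name unless keep_db is set."""
    if not keep_db and os.path.exists(db_name):
        os.remove(db_name)
        return True  # The file was removed
    return False  # Nothing was removed

# Demonstrate on a throwaway file rather than a real calibration database
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'hpvsim_calibration.db')
    open(path, 'w').close()
    kept = remove_db(path, keep_db=True)      # File is left in place
    removed = remove_db(path, keep_db=False)  # File is deleted
    still_exists = os.path.exists(path)
```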

run_sim

calibration.Calibration.run_sim(
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
    label=None,
    return_sim=False,
)

Create and run a simulation

run_trial

calibration.Calibration.run_trial(trial, save=True)

Define the objective for Optuna
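
An Optuna objective takes a trial, asks it for parameter values, and returns a number to minimize. The sketch below shows that shape without requiring Optuna or HPVsim: `FakeTrial` is a stand-in that only mimics the `suggest_float(name, low, high)` call, and `toy_mismatch` is a made-up goodness-of-fit. In an actual calibration the mismatch would come from running the simulation and comparing it to data via sim.compute_fit().

```python
import random

class FakeTrial:
    """Stand-in for an Optuna trial; only mimics suggest_float(name, low, high)."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.params = {}

    def suggest_float(self, name, low, high):
        value = self.rng.uniform(low, high)
        self.params[name] = value
        return value

def toy_mismatch(pars, target):
    """Hypothetical goodness-of-fit: sum of squared differences from a target."""
    return sum((pars[key] - target[key]) ** 2 for key in target)

def run_trial(trial, calib_pars, target):
    """Sample each parameter within its [low, high] bounds and return the mismatch."""
    sampled = {}
    for key, (best, low, high) in calib_pars.items():
        sampled[key] = trial.suggest_float(key, low, high)
    return toy_mismatch(sampled, target)

calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
target = dict(beta=0.08, hpv_control_prob=0.7)  # Made-up values for illustration
trial = FakeTrial(seed=1)
mismatch = run_trial(trial, calib_pars, target)
print(trial.params, mismatch)
```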

run_workers

calibration.Calibration.run_workers()

Run multiple workers in parallel

sim_to_sample_pars

calibration.Calibration.sim_to_sample_pars()

Convert sim pars to sample pars

to_json

calibration.Calibration.to_json(filename=None, indent=2, **kwargs)

Convert the data to JSON.

trial_pars_to_sim_pars

calibration.Calibration.trial_pars_to_sim_pars(
    trial_pars=None,
    which_pars=None,
    return_full=True,
)

Create genotype_pars and pars dicts from the trial parameters. Note: not used during self.calibrate.

Parameters

Name	Type	Description	Default
trial_pars	dict	dictionary of parameters from a single trial. If not provided, best parameters will be used	`None`
return_full	bool	whether to return a unified par dict ready for use in a sim, or the sim pars and genotype pars separately	`True`

Example:

import hpvsim as hpv

sim = hpv.Sim(genotypes=[16, 18])
calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
genotype_pars = dict(hpv16=dict(prog_time=[3, 3, 10]))
calib = hpv.Calibration(sim, calib_pars=calib_pars, genotype_pars=genotype_pars,
                        datafiles=['test_data/south_africa_hpv_data.xlsx',
                                   'test_data/south_africa_cancer_data.xlsx'],
                        total_trials=10, n_workers=4)
calib.calibrate()
new_pars = calib.trial_pars_to_sim_pars()  # Best parameters from calibration, in a format ready for running a sim
sim.update_pars(new_pars)
sim.run()
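
To illustrate the idea of separating sim parameters from genotype parameters, here is a minimal sketch in plain Python. It assumes, purely for illustration, that trial parameters arrive as a flat dict in which genotype-specific entries are prefixed with the genotype name (for example `hpv16_prog_time`); that naming scheme and the helper `split_trial_pars` are hypothetical and may not match how HPVsim stores trial parameters.

```python
def split_trial_pars(trial_pars, genotypes):
    """Hypothetical: split flat trial parameters into sim pars and nested genotype pars."""
    sim_pars = {}
    genotype_pars = {}
    for name, value in trial_pars.items():
        for genotype in genotypes:
            prefix = genotype + '_'
            if name.startswith(prefix):
                # Assumed naming scheme: '<genotype>_<parameter>'
                genotype_pars.setdefault(genotype, {})[name[len(prefix):]] = value
                break
        else:
            sim_pars[name] = value  # No genotype prefix matched

    return sim_pars, genotype_pars

trial_pars = {'beta': 0.07, 'hpv_control_prob': 0.8, 'hpv16_prog_time': 4.5}
sim_pars, genotype_pars = split_trial_pars(trial_pars, genotypes=['hpv16', 'hpv18'])
print(sim_pars, genotype_pars)
```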

trial_to_sim_pars

calibration.Calibration.trial_to_sim_pars(pardict=None, trial=None)

Take in an optuna trial and sample from pars, after extracting them from the structure they’re provided in

update_dict_pars

calibration.Calibration.update_dict_pars(name_pars, value_pars)

Function to update parameters from nested dict to nested dict’s value
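
One plausible reading of this description is a recursive update, where values in one nested dict overwrite the same-keyed values in another. The sketch below shows that general technique only; `update_nested` and the key `placeholder_par` are hypothetical, and the method's actual behavior may differ.

```python
def update_nested(target, source):
    """Hypothetical: overwrite values in target with same-keyed values from source, recursing into dicts."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            update_nested(target[key], value)  # Descend into matching sub-dicts
        else:
            target[key] = value  # Replace (or add) a leaf value
    return target

# 'placeholder_par' is a made-up key used to show that untouched entries survive
genotype_pars = dict(hpv16=dict(prog_time=3, placeholder_par=1), hpv18=dict(prog_time=5))
updates = dict(hpv16=dict(prog_time=4.5))
result = update_nested(genotype_pars, updates)
print(result)
```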

update_dict_pars_from_trial

calibration.Calibration.update_dict_pars_from_trial(name_pars, value_pars)

Function to update parameters from nested dict to trial parameter’s value

update_dict_pars_init_and_bounds

calibration.Calibration.update_dict_pars_init_and_bounds(
    initial_pars,
    par_bounds,
    target_pars,
)

Function to update initial parameters and parameter bounds from a trial pars dict

worker

calibration.Calibration.worker()

Run a single worker