calibration

Define the calibration class

Classes

Name Description
Calibration A class to handle calibration of HPVsim simulations. Uses the Optuna hyperparameter optimization library.

Calibration

calibration.Calibration(
    sim,
    datafiles,
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
    fit_args=None,
    extra_sim_result_keys=None,
    par_samplers=None,
    n_trials=None,
    n_workers=None,
    total_trials=None,
    name=None,
    db_name=None,
    estimator=None,
    keep_db=None,
    storage=None,
    rand_seed=None,
    sampler=None,
    label=None,
    die=False,
    verbose=True,
)

A class to handle calibration of HPVsim simulations. Uses the Optuna hyperparameter optimization library (optuna.org), which must be installed separately (via pip install optuna).

Note: running a calibration does not guarantee a good fit! You must run for a sufficient number of iterations, have enough free parameters, and give those parameters wide enough bounds. Please see the tutorial on calibration for more information.

Parameters

Name Type Description Default
sim Sim the simulation to calibrate required
datafiles list list of datafile strings to calibrate to required
calib_pars dict a dictionary of the parameters to calibrate, of the format dict(key1=[best, low, high]) None
genotype_pars dict a dictionary of the genotype-specific parameters to calibrate, of the format dict(genotype=dict(key1=[best, low, high])) None
hiv_pars dict a dictionary of the HIV-specific parameters to calibrate, of the format dict(key1=[best, low, high]) None
extra_sim_result_keys list list of result strings to store None
fit_args dict a dictionary of options that are passed to sim.compute_fit() to calculate the goodness-of-fit None
par_samplers dict an optional mapping from parameters to the Optuna sampler to use for choosing new points for each; by default, suggest_float None
n_trials int the number of trials per worker None
n_workers int the number of parallel workers (default: maximum) None
total_trials int if n_trials is not supplied, calculate it by dividing this number by n_workers None
name str the name of the database (default: ‘hpvsim_calibration’) None
db_name str the name of the database file (default: ‘hpvsim_calibration.db’) None
keep_db bool whether to keep the database after calibration (default: false) None
storage str the location of the database (default: sqlite) None
rand_seed int if provided, use this random seed to initialize Optuna runs (for reproducibility) None
label str a label for this calibration object None
die bool whether to stop if an exception is encountered False
verbose bool whether to print details of the calibration True
kwargs (dict) passed to hpv.Calibration() required
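
The [best, low, high] convention and the relationship between total_trials, n_workers, and n_trials can be sketched in plain Python. The helpers check_calib_pars and trials_per_worker are hypothetical names used only for illustration; they are not part of HPVsim, and rounding up when dividing total_trials by n_workers is an assumption rather than documented behaviour.

```python
import math

def check_calib_pars(calib_pars):
    """Verify every entry is [best, low, high] with low <= best <= high."""
    for key, spec in calib_pars.items():
        if len(spec) != 3:
            raise ValueError(f"{key}: expected [best, low, high], got {spec}")
        best, low, high = spec
        if not (low <= best <= high):
            raise ValueError(f"{key}: best={best} lies outside [{low}, {high}]")
    return True

def trials_per_worker(total_trials, n_workers, n_trials=None):
    """Use n_trials if given; otherwise split total_trials across the workers.

    Rounding up is an assumption made for this sketch.
    """
    if n_trials is not None:
        return n_trials
    return math.ceil(total_trials / n_workers)

calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[0.9, 0.5, 1])
check_calib_pars(calib_pars)
print(trials_per_worker(total_trials=10, n_workers=4))  # prints 3
```

A check like this can catch a misordered entry (for example [low, best, high]) before a long calibration run is started.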

Returns

Name Type Description
A Calibration object

Example:

import hpvsim as hpv

# pars is assumed to be a dict of simulation parameters defined beforehand
sim = hpv.Sim(pars, genotypes=[16, 18])
calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[.9, 0.5, 1])
calib = hpv.Calibration(sim, calib_pars=calib_pars,
                        datafiles=['test_data/south_africa_hpv_data.xlsx',
                                   'test_data/south_africa_cancer_data.xlsx'],
                        total_trials=10, n_workers=4)
calib.calibrate()  # Run the Optuna trials
calib.plot()       # Plot the calibration results

Methods

Name Description
calibrate Actually perform calibration.
get_full_pars Make a full pardict from the subset of regular sim parameters, genotype parameters, and HIV parameters used in calibration
make_study Make a study, deleting one if it already exists
parse_study Parse the study into a data frame – called automatically
plot Plot the calibration results
remove_db Remove the database file if keep_db is false and the path exists.
run_sim Create and run a simulation
run_trial Define the objective for Optuna
run_workers Run multiple workers in parallel
sim_to_sample_pars Convert sim pars to sample pars
to_json Convert the data to JSON.
trial_pars_to_sim_pars Create genotype_pars and pars dicts from the trial parameters.
trial_to_sim_pars Take in an optuna trial and sample from pars, after extracting them from the structure they’re provided in
update_dict_pars Function to update parameters from nested dict to nested dict’s value
update_dict_pars_from_trial Function to update parameters from nested dict to trial parameter’s value
update_dict_pars_init_and_bounds Function to update initial parameters and parameter bounds from a trial pars dict
worker Run a single worker

calibrate
calibration.Calibration.calibrate(
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
    verbose=True,
    load=True,
    tidyup=True,
    **kwargs,
)

Actually perform calibration.

Parameters
Name Type Description Default
calib_pars dict if supplied, overwrite stored calib_pars None
verbose bool whether to print output from each trial True
kwargs dict if supplied, overwrite stored run_args (n_trials, n_workers, etc.) {}

get_full_pars
calibration.Calibration.get_full_pars(
    sim=None,
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
)

Make a full pardict from the subset of regular sim parameters, genotype parameters, and HIV parameters used in calibration

make_study
calibration.Calibration.make_study()

Make a study, deleting one if it already exists

parse_study
calibration.Calibration.parse_study(study)

Parse the study into a data frame – called automatically

plot
calibration.Calibration.plot(
    res_to_plot=None,
    fig_args=None,
    axis_args=None,
    data_args=None,
    show_args=None,
    do_save=None,
    fig_path=None,
    do_show=True,
    plot_type='sns.boxplot',
    **kwargs,
)

Plot the calibration results

Parameters
Name Type Description Default
res_to_plot int number of results to plot; if None, plot them all None
fig_args dict passed to pl.figure() None
axis_args dict passed to pl.subplots_adjust() None
data_args dict ‘width’, ‘color’, and ‘offset’ arguments for the data None
do_save bool whether to save None
fig_path str or filepath filepath to save to None
do_show bool whether to show the figure True
kwargs dict passed to hpv.options.with_style(); see that function for choices {}

remove_db
calibration.Calibration.remove_db()

Remove the database file if keep_db is false and the path exists.
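
The conditional removal described here can be sketched as follows. This is a standalone illustration using only the standard library; the free function remove_db below is a stand-in and not the method's actual implementation, and returning a boolean is an assumption made so the behaviour is visible.

```python
import os
import tempfile

def remove_db(db_name, keep_db=False):
    """Delete db_name unless keep_db is set; return True if a file was removed."""
    if not keep_db and os.path.exists(db_name):
        os.remove(db_name)
        return True
    return False

# Demonstrate on a throwaway file rather than a real calibration database
fd, path = tempfile.mkstemp(suffix='.db')
os.close(fd)
print(remove_db(path, keep_db=True))  # False: the file is kept
print(remove_db(path))                # True: the file is removed
print(remove_db(path))                # False: nothing left to remove
```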

run_sim
calibration.Calibration.run_sim(
    calib_pars=None,
    genotype_pars=None,
    hiv_pars=None,
    label=None,
    return_sim=False,
)

Create and run a simulation

run_trial
calibration.Calibration.run_trial(trial, save=True)

Define the objective for Optuna
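
As a rough illustration of what an Optuna objective looks like, the sketch below samples each parameter between its low and high bounds and returns a mismatch to be minimised. StubTrial, toy_model, and the target value are hypothetical stand-ins so the example runs without Optuna or HPVsim installed; only the suggest_float(name, low, high) call mirrors the real Optuna trial interface.

```python
import random

class StubTrial:
    """Stand-in for an Optuna trial exposing only suggest_float (illustration only)."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.params = {}

    def suggest_float(self, name, low, high):
        value = self.rng.uniform(low, high)
        self.params[name] = value
        return value

def toy_model(pars):
    """Hypothetical stand-in for running a simulation; returns a single 'result'."""
    return 100 * pars['beta']

def objective(trial, calib_pars, target=10.0):
    """Sample each parameter within its bounds and return the mismatch to minimise."""
    pars = {}
    for name, (best, low, high) in calib_pars.items():
        pars[name] = trial.suggest_float(name, low, high)
    return abs(toy_model(pars) - target)

calib_pars = dict(beta=[0.05, 0.010, 0.20])
trial = StubTrial(seed=1)
mismatch = objective(trial, calib_pars)
print(trial.params, mismatch)
```

With Optuna installed, a real trial object supplied by study.optimize() would take the place of StubTrial.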

run_workers
calibration.Calibration.run_workers()

Run multiple workers in parallel
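
The division of work between workers can be pictured with the sketch below, in which each of n_workers workers runs n_trials trials. It uses a thread pool from the standard library purely for illustration; this is a swapped-in technique, not a description of how HPVsim parallelises, and the worker function here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(worker_id, n_trials):
    """Hypothetical worker: 'run' n_trials trials and return their labels."""
    return [f'worker{worker_id}-trial{i}' for i in range(n_trials)]

def run_workers(n_workers, n_trials):
    """Start one task per worker and collect the results in worker order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(worker, w, n_trials) for w in range(n_workers)]
        return [f.result() for f in futures]

results = run_workers(n_workers=4, n_trials=3)
print(sum(len(r) for r in results))  # prints 12
```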

sim_to_sample_pars
calibration.Calibration.sim_to_sample_pars()

Convert sim pars to sample pars

to_json
calibration.Calibration.to_json(filename=None, indent=2, **kwargs)

Convert the data to JSON.

trial_pars_to_sim_pars
calibration.Calibration.trial_pars_to_sim_pars(
    trial_pars=None,
    which_pars=None,
    return_full=True,
)

Create genotype_pars and pars dicts from the trial parameters. Note: not used during self.calibrate.

Parameters
Name Type Description Default
trial_pars dict dictionary of parameters from a single trial; if not provided, the best parameters will be used None
return_full bool whether to return a unified par dict ready for use in a sim, or the sim pars and genotype pars separately True

Example:

import hpvsim as hpv

sim = hpv.Sim(genotypes=[16, 18])
calib_pars = dict(beta=[0.05, 0.010, 0.20], hpv_control_prob=[.9, 0.5, 1])
genotype_pars = dict(hpv16=dict(prog_time=[3, 3, 10]))
calib = hpv.Calibration(sim, calib_pars=calib_pars, genotype_pars=genotype_pars,
                    datafiles=['test_data/south_africa_hpv_data.xlsx',
                               'test_data/south_africa_cancer_data.xlsx'],
                    total_trials=10, n_workers=4)
calib.calibrate()
new_pars = calib.trial_pars_to_sim_pars() # Returns best parameters from calibration in a format ready for sim running
sim.update_pars(new_pars)
sim.run()
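
The difference between a unified parameter dict and separate sim and genotype dicts can be illustrated with a small standalone sketch. It assumes, purely for illustration, that genotype-specific trial parameters are named '<genotype>_<parameter>'; that naming scheme and the helper split_trial_pars are hypothetical and not taken from HPVsim.

```python
def split_trial_pars(trial_pars, genotypes):
    """Split flat trial parameters into sim-level and genotype-level dicts.

    Assumes genotype parameters are named '<genotype>_<parameter>'.
    """
    sim_pars, genotype_pars = {}, {}
    for name, value in trial_pars.items():
        for g in genotypes:
            prefix = g + '_'
            if name.startswith(prefix):
                # Strip the genotype prefix and nest the value under that genotype
                genotype_pars.setdefault(g, {})[name[len(prefix):]] = value
                break
        else:
            # No genotype prefix matched, so treat it as a sim-level parameter
            sim_pars[name] = value
    return sim_pars, genotype_pars

trial_pars = {'beta': 0.08, 'hpv_control_prob': 0.7, 'hpv16_prog_time': 4.2}
sim_pars, genotype_pars = split_trial_pars(trial_pars, genotypes=['hpv16', 'hpv18'])
print(sim_pars)       # {'beta': 0.08, 'hpv_control_prob': 0.7}
print(genotype_pars)  # {'hpv16': {'prog_time': 4.2}}
```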

trial_to_sim_pars
calibration.Calibration.trial_to_sim_pars(pardict=None, trial=None)

Take in an optuna trial and sample from pars, after extracting them from the structure they’re provided in

update_dict_pars
calibration.Calibration.update_dict_pars(name_pars, value_pars)

Function to update parameters from nested dict to nested dict’s value
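
One plausible reading of this description is a recursive update in which values from one nested dict overwrite the matching keys of another. The sketch below shows that idea in plain Python; update_nested is a hypothetical helper, the parameter names are illustrative, and raising an error on unknown keys is an assumption rather than documented behaviour.

```python
import copy

def update_nested(base, updates):
    """Return a copy of base with values from updates written in at matching nested keys."""
    new = copy.deepcopy(base)
    for key, val in updates.items():
        if key not in new:
            raise ValueError(f'Parameter {key} is not present in the target dict')
        if isinstance(val, dict) and isinstance(new[key], dict):
            new[key] = update_nested(new[key], val)  # Recurse into nested dicts
        else:
            new[key] = val
    return new

base = dict(beta=0.05, hpv16=dict(prog_time=3, rel_beta=1.0))
updates = dict(hpv16=dict(prog_time=5))
print(update_nested(base, updates))
# {'beta': 0.05, 'hpv16': {'prog_time': 5, 'rel_beta': 1.0}}
```

Because the input is deep-copied, the original dict is left unchanged.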

update_dict_pars_from_trial
calibration.Calibration.update_dict_pars_from_trial(name_pars, value_pars)

Function to update parameters from nested dict to trial parameter’s value

update_dict_pars_init_and_bounds
calibration.Calibration.update_dict_pars_init_and_bounds(
    initial_pars,
    par_bounds,
    target_pars,
)

Function to update initial parameters and parameter bounds from a trial pars dict

worker
calibration.Calibration.worker()

Run a single worker