csv.population¶
Population Weighting of Measurement Values¶
ode_step_size¶
The only integrand, Sincidence, does not use the dismod_at ODE. Thus this step size is only used when approximating averages with respect to age and time. It is very small so that the predictions are very accurate.
ode_step_size = 0.01
csv_file¶
This dictionary is used to hold the data corresponding to the csv files for this example:
csv_file = dict()
node.csv¶
For this example the root node, n0, has no children.
csv_file['node.csv'] = \
'''node_name,parent_name
n0,
'''
option_fit.csv¶
This example uses the default value for the options that are not listed below:
| Name | Value |
|---|---|
| random_seed | chosen using the current seconds reported by the Python time module. |
| compress_interval | use zero so that no intervals get compressed. |
| tolerance_fixed | this is set small, 1e-8, so we can check accuracy. |
| ode_step_size | step size used to approximate averages w.r.t. age and time. |
The population covariate is used to weight the data; see population in the covariate.csv table. It does not matter whether it is listed in absolute_covariates, because it does not appear in the covariate column of the mulcov.csv table.
import time
random_seed = str( int( time.time() ) )
csv_file['option_fit.csv'] = 'name,value\n'
csv_file['option_fit.csv'] += f'random_seed,{random_seed}\n'
csv_file['option_fit.csv'] += f'compress_interval,0.0 0.0\n'
csv_file['option_fit.csv'] += f'tolerance_fixed,1e-8\n'
csv_file['option_fit.csv'] += f'ode_step_size,{ode_step_size}\n'
option_predict.csv¶
This example uses the default value for all the options in option_predict.csv.
csv_file['option_predict.csv'] = 'name,value\n'
covariate.csv¶
This example has one covariate, population. Other cause mortality, omega, is constant and equal to 0.02. The population depends on age and sex but not time. The female (male) population decreases (increases) with age. This is unrealistic for the male case but is set this way just to make sure that the different population weights have a different effect.
female_population_age_0 = 1e4
female_population_age_100 = 1e3
csv_file['covariate.csv'] = \
'node_name,sex,age,time,omega,population\n' + \
'n0,female,0,2000,0.02,' + str(female_population_age_0) + '\n' \
'n0,female,100,2000,0.02,' + str(female_population_age_100) + '\n' \
'n0,male,0,2000,0.02,' + str(female_population_age_100) + '\n' \
'n0,male,100,2000,0.02,' + str(female_population_age_0) + '\n'
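The covariate table gives the population only at ages 0 and 100. The following is a minimal sketch of how values in between can be obtained, assuming linear interpolation in age (this is what the population function in the Source Code section does). The helper name interpolate_population is hypothetical and is not part of at_cascade.

```python
# Sketch: linear interpolation of the population between ages 0 and 100.
# The endpoint values match covariate.csv above; the helper name is hypothetical.
female_population_age_0 = 1e4
female_population_age_100 = 1e3


def interpolate_population(sex, age):
    """Population at a given age, linear between the age 0 and age 100 values."""
    if sex == 'female':
        pop_0, pop_100 = female_population_age_0, female_population_age_100
    elif sex == 'male':
        # The male endpoints are the female endpoints swapped.
        pop_0, pop_100 = female_population_age_100, female_population_age_0
    else:
        raise ValueError(f'unexpected sex: {sex}')
    return (pop_0 * (100.0 - age) + pop_100 * age) / 100.0


for age in (0.0, 50.0, 100.0):
    print(age, interpolate_population('female', age),
          interpolate_population('male', age))
```

The two sexes agree at age 50 and differ everywhere else, which is why the same age interval can yield different weighted averages for female and male data.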
fit_goal.csv¶
This example only fits node n0.
csv_file['fit_goal.csv'] = \
'''node_name
n0
'''
predict_integrand.csv¶
For this example we want to know the values of Sincidence. (Note that Sincidence is a direct measurement of iota.)
csv_file['predict_integrand.csv'] = \
'''integrand_name
Sincidence
'''
prior.csv¶
We define three priors:
| Name | Description |
|---|---|
| uniform_-1_1 | a uniform distribution on [ -1, 1 ] |
| uniform_eps_1 | a uniform distribution on [ 1e-6, 1 ] |
| gauss_01 | a Gaussian with mean 0 and standard deviation 1 |
csv_file['prior.csv'] = \
'''name,lower,upper,mean,std,density
uniform_-1_1,-1.0,1.0,0.5,,uniform
uniform_eps_1,1e-6,1.0,0.5,,uniform
gauss_01,,,0.0,1.0,gaussian
'''
parent_rate.csv¶
The only non-zero rates are omega and iota (omega is known and specified by the covariate.csv file). The model for iota is linear w.r.t. age and constant w.r.t. time. Its value prior is uniform_eps_1 and its dage prior is gauss_01. It does not have any dtime priors because there are no time differences between grid values.
csv_file['parent_rate.csv'] = \
'''rate_name,age,time,value_prior,dage_prior,dtime_prior,const_value
iota,0.0,2000,uniform_eps_1,gauss_01,,
iota,100.0,2000,uniform_eps_1,gauss_01,,
'''
child_rate.csv¶
There are no children (hence no random effects) in this example.
csv_file['child_rate.csv'] = \
'''rate_name,value_prior
'''
mulcov.csv¶
There are no covariate multipliers in this example:
csv_file['mulcov.csv'] = \
'''covariate,type,effected,value_prior,const_value
'''
data_in.csv¶
The only integrand for this example is Sincidence (a direct measurement of iota). For each sex value (female, male, and both) the age intervals are [0, 50] and [50, 100]. The time intervals are all the same, [2000, 2000]. The measurement standard deviation is 0.001 (during the fitting) and none of the data is held out.
header = 'data_id, integrand_name, node_name, sex, age_lower, age_upper, '
header += 'time_lower, time_upper, meas_value, meas_std, hold_out, '
header += 'density_name, eta, nu'
csv_file['data_in.csv'] = header + \
'''
0, Sincidence, n0, female, 0, 50, 2000, 2000, 0.0000, 0.001, 0, gaussian, ,
1, Sincidence, n0, female, 50, 100, 2000, 2000, 0.0000, 0.001, 0, gaussian, ,
2, Sincidence, n0, male, 0, 50, 2000, 2000, 0.0000, 0.001, 0, gaussian, ,
3, Sincidence, n0, male, 50, 100, 2000, 2000, 0.0000, 0.001, 0, gaussian, ,
4, Sincidence, n0, both, 0, 50, 2000, 2000, 0.0000, 0.001, 0, gaussian, ,
5, Sincidence, n0, both, 50, 100, 2000, 2000, 0.0000, 0.001, 0, gaussian, ,
'''
csv_file['data_in.csv'] = csv_file['data_in.csv'].replace(' ', '')
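The table above is padded with spaces for readability, and the replace call then strips them. Whether at_cascade.csv.read_table would tolerate the padding is not stated here; the sketch below only uses Python's standard csv module to illustrate what such spaces do when a file is parsed with default settings. The two-line text is a made-up fragment, not the actual data_in.csv.

```python
import csv
import io

# A two-line CSV written with spaces after the commas, as in the table above.
text = 'data_id, sex, age_lower\n0, female, 0\n'

# Parsed as-is, the spaces become part of the field names and values.
row_raw = next(csv.DictReader(io.StringIO(text)))
print(list(row_raw.keys()))

# After removing the spaces, the field names are the expected ones.
row_clean = next(csv.DictReader(io.StringIO(text.replace(' ', ''))))
print(list(row_clean.keys()))
```

Removing every space is safe in this example only because none of the field values contains a space.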
The measurement value meas_value is 0.0000 above and gets replaced by the following code:
row['meas_value'] = average_Sincidence(sex, age_lower, age_upper)
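To see why the population weighting matters, the following sketch compares a population-weighted average of iota with an unweighted one over the female age interval [0, 50]. It assumes the linear iota and linear female population used in this example and a discrete average on an evenly spaced age grid; the function names and the n_step default are illustrative choices, not part of at_cascade.

```python
# Sketch: population-weighted average of a linear iota over an age interval.
# The rate and population values match this example; function names are hypothetical.

def iota(age):
    """True incidence: 0.01 at age 0, 0.03 at age 100, linear in between."""
    return (0.01 * (100.0 - age) + 0.03 * age) / 100.0


def female_population(age):
    """Female population: 1e4 at age 0, 1e3 at age 100, linear in between."""
    return (1e4 * (100.0 - age) + 1e3 * age) / 100.0


def weighted_average(rate, weight, age_lower, age_upper, n_step=5000):
    """Discrete weighted average on an evenly spaced age grid."""
    step = (age_upper - age_lower) / n_step
    ages = [age_lower + i * step for i in range(n_step + 1)]
    total_weight = sum(weight(a) for a in ages)
    return sum(weight(a) * rate(a) for a in ages) / total_weight


weighted = weighted_average(iota, female_population, 0.0, 50.0)
unweighted = weighted_average(iota, lambda age: 1.0, 0.0, 50.0)
print(f'weighted = {weighted:.6f}, unweighted = {unweighted:.6f}')
```

Because the female population is largest at the younger ages, where iota is smallest, the weighted average comes out below the unweighted one. For male data on the same interval the weighting pulls in the opposite direction.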
Source Code¶
import at_cascade
#
# no_effect_iota
def no_effect_iota(age) :
    age_0 = 0.01
    age_100 = 0.03
    iota = ( age_0 * (100.0 - age) + age_100 * (age - 0.0) ) / 100.0
    return iota
#
# population
def population(sex, age) :
    male_population_age_0 = female_population_age_100
    male_population_age_100 = female_population_age_0
    if sex == 'female' :
        age_0 = female_population_age_0
        age_100 = female_population_age_100
    elif sex == 'male' :
        age_0 = male_population_age_0
        age_100 = male_population_age_100
    else :
        assert sex == 'both'
        age_0 = (female_population_age_0 + male_population_age_0) / 2
        age_100 = (female_population_age_100 + male_population_age_100) / 2
    pop = ( age_0 * (100.0 - age) + age_100 * (age - 0.0) ) / 100.0
    return pop
#
# average_Sincidence
def average_Sincidence(sex, age_lower, age_upper) :
    if age_lower == age_upper :
        return no_effect_iota(age_lower)
    #
    max_step = ode_step_size
    n_step = int( (age_upper - age_lower) / max_step ) + 1
    step_size = (age_upper - age_lower) / n_step
    #
    average = 0.0
    sum_pop = 0.0
    for i_step in range(n_step + 1) :
        age = age_lower + i_step * step_size
        pop = population(sex, age)
        iota = no_effect_iota(age)
        average += pop * iota
        sum_pop += pop
    average /= sum_pop
    return average
#
# main
def main() :
    #
    # fit_dir
    fit_dir = 'build/example/csv'
    at_cascade.empty_directory(fit_dir)
    #
    # write csv files
    for name in csv_file :
        file_name = f'{fit_dir}/{name}'
        file_ptr = open(file_name, 'w')
        file_ptr.write( csv_file[name] )
        file_ptr.close()
    #
    # data_in.csv
    float_format = '{0:.5g}'
    file_name = f'{fit_dir}/data_in.csv'
    table = at_cascade.csv.read_table( file_name )
    for row in table :
        sex = row['sex']
        node_name = row['node_name']
        integrand_name = row['integrand_name']
        age_lower = float( row['age_lower'] )
        age_upper = float( row['age_upper'] )
        assert integrand_name == 'Sincidence'
        #
        # BEGIN_MEAS_VALUE
        row['meas_value'] = average_Sincidence(sex, age_lower, age_upper)
        # END_MEAS_VALUE
    at_cascade.csv.write_table(file_name, table)
    #
    # fit
    at_cascade.csv.fit(fit_dir)
    #
    # predict
    at_cascade.csv.predict(fit_dir)
    #
    #
    # predict_table
    file_name = f'{fit_dir}/fit_predict.csv'
    predict_table = at_cascade.csv.read_table(file_name)
    #
    # row
    for row in predict_table :
        assert row['integrand_name'] == 'Sincidence'
        assert row['node_name'] == 'n0'
        age = float( row['age'] )
        iota = float( row['avg_integrand'] )
        check = no_effect_iota(age)
        rel_error = (iota - check) / check
        if abs(rel_error) >= 1e-4 :
            print(f'age={age}, iota={iota}, check={check}, rel_error={rel_error}')
        assert abs(rel_error) < 1e-4
#
#
if __name__ == '__main__' :
    main()
    print('population.py: OK')