one_at_function

Example That Directly Measures One Age Time Function

For this example everything is constant in time so the functions below do not depend on time.

cov_reference Table

This example specifies the cov_reference_table in its call to create_all_node_db using

   all_node_database = f'{result_dir}/all_node.db'
   at_cascade.create_all_node_db(
      all_node_database       = all_node_database,
      option_all              = option_all,
      cov_reference_table     = cov_reference_table,
   )

It also checks that the table has been set properly using

   connection = dismod_at.create_connection(
      all_node_database, new = False, readonly = True
   )
   check_table = dismod_at.get_table_dict(connection, 'cov_reference')
   connection.close()
   assert len(check_table) == len(cov_reference_table)
   for check_id in range( len(check_table) ) :
      row       = cov_reference_table[check_id]
      check_row = check_table[check_id]
      assert row == check_row

Nodes

The following is a diagram of the node tree for this example. The root_node is n0, the fit_goal_set is {n3, n4}, and the leaf nodes are {n3, n4, n5, n6}:

            n0
      /-----/\-----\
    n1             (n2)
   /  \            /  \
(n3)  (n4)       n5    n6

fit_goal_set

fit_goal_set = { 'n3', 'n4' }

Rates

The only non-zero dismod_at rate for this example is iota; i.e., we choose iota to represent the function that we are estimating. (We could have used rho or chi, but not omega, for this purpose.) We use iota_n(a, n, I) to denote the value of iota as a function of age a, node number n, and income I.

Covariate

The only covariate for this example is income. Its reference value is the average income corresponding to the fit_node.

r_n

We use r_n for the reference value of income at node n. The code below sets this reference using the name avg_income:

avg_income       = { 'n3':1.0, 'n4':2.0, 'n5':3.0, 'n6':4.0 }
avg_income['n2'] = ( avg_income['n5'] + avg_income['n6'] ) / 2.0
avg_income['n1'] = ( avg_income['n3'] + avg_income['n4'] ) / 2.0
avg_income['n0'] = ( avg_income['n1'] + avg_income['n2'] ) / 2.0
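
As a quick check of the arithmetic above, the following sketch repeats the definition and prints the resulting reference values. The print loop is only for illustration and is not part of the example source.

```python
# Leaf values are chosen directly; each interior value is the
# average of the values for its two children.
avg_income       = { 'n3':1.0, 'n4':2.0, 'n5':3.0, 'n6':4.0 }
avg_income['n2'] = ( avg_income['n5'] + avg_income['n6'] ) / 2.0
avg_income['n1'] = ( avg_income['n3'] + avg_income['n4'] ) / 2.0
avg_income['n0'] = ( avg_income['n1'] + avg_income['n2'] ) / 2.0

# Illustration only: list the reference income r_n for every node
for node in sorted(avg_income) :
   print(node, avg_income[node])
```

The interior values work out to r_1 = 1.5, r_2 = 3.5, and r_0 = 2.5.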

alpha

We use alpha for the rate_value covariate multiplier that multiplies income. This multiplier affects the value of iota. The true value for alpha (used when simulating the data) is

alpha_true = - 0.2

Random Effects

For each node, there is a random effect on iota that is constant in age and time. Note that the leaf nodes have the random effect for the node above them as well as their own random effect.

s_n

We use s_n to denote the sum of the random effects for node n. The code below sets this sum using the name sum_random:

size_level1      = 0.2
size_level2      = 0.2
sum_random       = { 'n0': 0.0, 'n1': size_level1, 'n2': -size_level1 }
sum_random['n3'] = sum_random['n1'] + size_level2
sum_random['n4'] = sum_random['n1'] - size_level2
sum_random['n5'] = sum_random['n2'] + size_level2
sum_random['n6'] = sum_random['n2'] - size_level2
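
The sketch below repeats this definition and prints each sum together with exp(s_n), the factor by which it scales iota relative to n0. The print loop and the use of math.exp here are for illustration only.

```python
from math import exp

# Each leaf sum is the effect of its parent plus or minus its own effect.
size_level1      = 0.2
size_level2      = 0.2
sum_random       = { 'n0': 0.0, 'n1': size_level1, 'n2': -size_level1 }
sum_random['n3'] = sum_random['n1'] + size_level2
sum_random['n4'] = sum_random['n1'] - size_level2
sum_random['n5'] = sum_random['n2'] + size_level2
sum_random['n6'] = sum_random['n2'] - size_level2

# Illustration only: s_n and the corresponding multiplier on iota
for node in sorted(sum_random) :
   print(node, sum_random[node], exp( sum_random[node] ))
```

With both sizes equal to 0.2, the sums for n4 and n5 cancel to zero, while n3 and n6 have sums 0.4 and -0.4.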

Simulated Data

Random Seed

The random seed can be used to reproduce results. If the original value of this setting is zero, the clock is used to get a random seed. The actual value of random_seed is always printed.

import time
import random
random_seed = 0
if random_seed == 0 :
   random_seed = int( time.time() )
random.seed(random_seed)
print('one_at_function: random_seed = ', random_seed)
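
The following minimal sketch shows why printing the seed is enough to reproduce a run: seeding the generator with the same value yields the same sequence of draws. The helper name draw is hypothetical and is not part of the example source.

```python
import random
import time

def draw(seed, n = 3) :
   # Hypothetical helper: seed the generator, then draw n uniform values
   random.seed(seed)
   return [ random.uniform(0.0, 1.0) for _ in range(n) ]

# Same logic as the example: zero means choose a seed using the clock
random_seed = 0
if random_seed == 0 :
   random_seed = int( time.time() )
print('random_seed = ', random_seed)

# Re-seeding with the printed value reproduces the same draws
first  = draw(random_seed)
second = draw(random_seed)
print(first == second)
```

To repeat a particular run, replace the zero above with the value that run printed.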

rate_true(rate, a, t, n, c)

For rate equal to iota, this is the true value for rate in node n at age a, time t, and covariate values c. The covariate values are a list in the same order as the covariate table. The value t is not used by this function for this example.

from math import exp
def rate_true(rate, a, t, n, c) :
   # income is the only covariate; effects are relative to the n0 reference
   income = c[0]
   s_n    = sum_random[n]
   r_0    = avg_income['n0']
   effect = s_n + alpha_true * ( income - r_0 )
   if rate == 'iota' :
      return (1 + a / 100) * 1e-2 * exp(effect)
   return 0.0
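
As an illustration, the sketch below evaluates this function at age 50 for n0 and for n3, each at its own reference income. It copies only the values it needs from the definitions of avg_income, sum_random, and alpha_true; the variable names iota_root and iota_n3 are introduced here for illustration.

```python
from math import exp

# Values copied from the definitions earlier on this page
alpha_true = -0.2
avg_income = { 'n0': 2.5, 'n3': 1.0 }
sum_random = { 'n0': 0.0, 'n3': 0.4 }

def rate_true(rate, a, t, n, c) :
   income = c[0]
   s_n    = sum_random[n]
   r_0    = avg_income['n0']
   effect = s_n + alpha_true * ( income - r_0 )
   if rate == 'iota' :
      return (1 + a / 100) * 1e-2 * exp(effect)
   return 0.0

# At n0 and its reference income the effect is zero
iota_root = rate_true('iota', 50.0, None, 'n0', [ avg_income['n0'] ])

# At n3 the effect is 0.4 - 0.2 * (1.0 - 2.5) = 0.7
iota_n3   = rate_true('iota', 50.0, None, 'n3', [ avg_income['n3'] ])

print(iota_root)
print(iota_n3 / iota_root)
```

The first value is 1.5 * 1e-2 and the ratio is exp(0.7), so the random effect and the lower income both raise iota for n3.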

y_i

The only simulated integrand for this example is Sincidence, which is a direct measurement of iota. (If we had used a different rate to represent the function we are estimating, we would use the corresponding direct measurement of that rate.) This data is simulated without any noise; i.e., the i-th measurement is simulated as y_i = rate_true('iota', a_i, None, n_i, [ I_i ]) where a_i is the age, n_i is the node, and I_i is the income for the i-th measurement. The data is modeled as having noise even though there is no simulated noise.
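
Written out using the definition of rate_true and the symbols s_n, r_n, and alpha introduced above, the simulated value is as follows (here alpha stands for alpha_true and r_0 for the reference income at n0):

```latex
y_i = \left( 1 + \frac{a_i}{100} \right) \cdot 10^{-2} \cdot
      \exp \left[ s_{n_i} + \alpha \, ( I_i - r_0 ) \right]
```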

n_i

Data is only simulated for the leaf nodes; i.e., each n_i is in the set { n3, n4, n5, n6 }. Since the data does not have any noise, the data residuals are a measure of how good the fit is for the nodes in the fit_goal_set.

a_i

For each leaf node, data is generated on the following age_grid:

age_grid = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0 ]

I_i

For each leaf node and each age in age_grid, data is generated for the following income_grid:

random_income = False
income_grid   = dict()
for node in [ 'n3', 'n4', 'n5', 'n6' ] :
   max_income  = 2.0 * avg_income[node]
   if random_income :
      n_income_grid = 10
      income_grid[node] = \
         [ random.uniform(0.0, max_income) for j in range(n_income_grid) ]
      income_grid[node] = sorted( income_grid[node] )
   else :
      n_income_grid = 3
      d_income_grid = max_income / (n_income_grid - 1)
      income_grid[node] = [ j * d_income_grid for j in range(n_income_grid) ]

Note that the check of the fit for the nodes in the fit_goal_set expects much more accuracy when the income grid is not chosen randomly.
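
The sketch below puts the pieces together for the case where random_income is False: it builds the evenly spaced income grid and then one noise-free measurement for each leaf node, age, and income. The list name data and its dictionary keys are assumptions made for this illustration; they are not the data table layout used by the example.

```python
from math import exp

# Values copied from the definitions earlier on this page
alpha_true = -0.2
avg_income = { 'n0': 2.5, 'n3': 1.0, 'n4': 2.0, 'n5': 3.0, 'n6': 4.0 }
sum_random = { 'n3': 0.4, 'n4': 0.0, 'n5': 0.0, 'n6': -0.4 }
age_grid   = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0 ]

def rate_true(rate, a, t, n, c) :
   effect = sum_random[n] + alpha_true * ( c[0] - avg_income['n0'] )
   if rate == 'iota' :
      return (1 + a / 100) * 1e-2 * exp(effect)
   return 0.0

# Evenly spaced income grid from zero to twice the node average
n_income_grid = 3
income_grid   = dict()
for node in [ 'n3', 'n4', 'n5', 'n6' ] :
   max_income        = 2.0 * avg_income[node]
   d_income_grid     = max_income / (n_income_grid - 1)
   income_grid[node] = [ j * d_income_grid for j in range(n_income_grid) ]

# One noise-free measurement per (node, age, income) combination
data = list()
for node in income_grid :
   for age in age_grid :
      for income in income_grid[node] :
         value = rate_true('iota', age, None, node, [ income ])
         row   = { 'node':node, 'age':age, 'income':income, 'value':value }
         data.append(row)

print( income_grid['n3'] )   # [0.0, 1.0, 2.0]
print( len(data) )           # 4 nodes * 6 ages * 3 incomes = 72
```

Note that the middle point of each evenly spaced grid is the average income for that node.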

Parent Rate Smoothing

This is the iota smoothing used for the fit_node. This smoothing uses the age_grid and one time point. There are no dtime priors because there is only one time point.

Value Prior

The following is the value prior used for the root_node

      {  'name':    'parent_value_prior',
         'density': 'gaussian',
         'lower':   iota_50 / 10.0,
         'upper':   iota_50 * 10.0,
         'mean':    iota_50,
         'std':     iota_50 * 10.0,
         'eta':     iota_50 * 1e-3,
      }

The mean and standard deviation are only used for the root_node. The create_shift_db routine replaces them for other nodes.
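
The value iota_50 is not defined on this page. The sketch below assumes it is the true iota at age 50 for n0 at its reference income, which is 0.015 under the rate_true function shown earlier; with that assumption it checks that the prior mean lies between the limits and reports how wide the limits are.

```python
# Assumption: iota_50 is the true iota at age 50 for n0 at its
# reference income; (1 + 50 / 100) * 1e-2 = 0.015
iota_50 = 0.015

parent_value_prior = {
   'name':    'parent_value_prior',
   'density': 'gaussian',
   'lower':   iota_50 / 10.0,
   'upper':   iota_50 * 10.0,
   'mean':    iota_50,
   'std':     iota_50 * 10.0,
   'eta':     iota_50 * 1e-3,
}

# The mean must lie between the lower and upper limits
prior = parent_value_prior
assert prior['lower'] <= prior['mean'] <= prior['upper']

# The limits span about two orders of magnitude around the mean
limit_ratio = prior['upper'] / prior['lower']
print(limit_ratio)
```

Because the standard deviation is as large as the upper limit, this prior is weak and mostly serves to keep iota within the limits.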

dage Prior

The following is the dage prior used for the fit_node:

      {  'name':    'parent_dage_prior',
         'density': 'log_gaussian',
         'mean':    0.0,
         'std':     3.0,
         'eta':     iota_50 * 1e-3,
      }

Child Rate Smoothing

This is the smoothing used for the random effect for each child of the fit_node. There are no dage or dtime priors because there is only one age and one time point in this smoothing.

Value Prior

The following is the value prior used for the children of the fit_node:

      {  'name':    'child_value_prior',
         'density': 'gaussian',
         'mean':    0.0,
         'std':     10.0,
      }

Alpha Smoothing

This is the smoothing used for alpha which multiplies the income covariate. There is only one age and one time point in this smoothing so it does not have dage or dtime priors.

Value Prior

The following is the value prior used for this smoothing:

      {  'name':    'alpha_value_prior',
         'density': 'gaussian',
         'lower':   - 10 * abs(alpha_true),
         'upper':   + 10 * abs(alpha_true),
         'std':     + 10 * abs(alpha_true),
         'mean':    0.0,
      }

The mean and standard deviation are only used for the root_node. The create_shift_db routine replaces them for other nodes.

Checking The Fit

The results of the fit are checked by check_cascade_node using the avgint_table that was created by the root_node_db routine. The node_id for each row is replaced by the node_id for the fit being checked. The check_cascade_node routine then uses this table to check that fit against the truth.
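
The following is a minimal sketch of what checking a fit against the truth can look like: compare each fitted value to the true value and require a small relative error. The function names, the tolerance, and the fitted values are made up for illustration; this is not the implementation of check_cascade_node.

```python
def relative_error(fit_value, true_value) :
   # Hypothetical helper: signed relative difference from the truth
   return fit_value / true_value - 1.0

def check_fit(fit_list, true_list, rel_tol) :
   # Hypothetical check: every fitted value is within rel_tol of the truth
   for fit_value, true_value in zip(fit_list, true_list) :
      if abs( relative_error(fit_value, true_value) ) > rel_tol :
         return False
   return True

# True iota at ages 0, 20, 40 when the effect is zero
true_list = [ (1 + age / 100) * 1e-2 for age in [0.0, 20.0, 40.0] ]

# Made-up fitted values for illustration
good_fit = [ 1.0001 * value for value in true_list ]
bad_fit  = [ 1.1    * value for value in true_list ]

ok_good = check_fit(good_fit, true_list, rel_tol = 1e-3)
ok_bad  = check_fit(bad_fit,  true_list, rel_tol = 1e-3)
print(ok_good, ok_bad)
```

Since the simulated data has no noise, a tight tolerance of this kind is reasonable when the income grid is not chosen randomly.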

Child                 Title
one_at_function.py    one_at_function: Python Source Code