no_ode_xam

View page source

Example Using no_ode_fit To Initialize Optimization

This example uses mtexcess data during a no_ode_fit and then holds it out during the actual estimation. This is meant to simulate the case where mtexcess is obtain form other data to help initialize the optimization (and without this smart initialization the optimization would fail). For this example everything is constant in time so the functions below do not depend on time.

Nodes

The following is a diagram of the node tree for this example. The root_node is n0, the fit_goal_set and the leaf node set are both {n2, n3} for this example:

        n0
  /-----/\-----\
n1              n2

fit_goal_set

fit_goal_set = { 'n1', 'n2' }

Rates

The non-zero dismod_at rates for this example are iota, chi, and omega. We use iota(a, n, I) , chi(a, n, I) to denote the value for iota and chi as a function of age a, node number n, and income I. We use omega(a, n) to denote the value for omega as a function of age a and node n.

Covariate

There is one covariate for this example, income. The reference value for income is the average income corresponding to the root_node.

I_n

We use I_n for the reference value of income at node n. The code below sets this reference using the name avg_income:

avg_income       = { 'n1':1.0, 'n2':2.0 }
avg_income['n0'] = ( avg_income['n1'] + avg_income['n2'] ) / 2.0

alpha

We use alpha_iota ( alpha_chi ) for the rate_value covariate multiplier which multiplies income and affects iota (chi). The true value for these multipliers (used which simulating the data) is

alpha_true = { 'iota':- 0.25, 'chi':-0.15 }
alpha_true_max_abs = 0.3

Random Effects

For each node, there is a random effect on iota and chi that is constant in age and time. Note that the leaf nodes have random effect for the node above them as well as their own random effect.

R_n

We use R_n to denote the random effects for node n. The code below sets this value:

random_effect_true  = { 'n0': 0.0, 'n1': 0.3, 'n2': -0.3}

Simulated Data

Random Seed

The random seed can be used to reproduce results. If the original value of this setting is zero, the clock is used get a random seed. The actual value or random_seed is always printed.

random_seed = 0
if random_seed == 0 :
   random_seed = int( time.time() )
random.seed(random_seed)
print('no_ode_xam: random_seed = ', random_seed)

rate_true(rate, a, t, n, c)

For rate equal to iota, chi, and omega, this is the true value for rate in node n at age a, time t, and covariate values c. The covariate values are a list in the same order as the covariate table. The values t and c[1] are not used by this function for this example.

def rate_true(rate, a, t, n, c) :
   if rate in [ 'pini', 'rho' ] :
      return 0.0
   income   = c[0]
   R_n      = random_effect_true[n]
   I_n      = avg_income['n0']
   if rate == 'iota' :
      effect   = R_n + alpha_true[rate] * ( income - I_n )
      return (1 + a / 100) * 1e-3 * math.exp( effect  )
   if rate == 'chi' :
      effect   = R_n + alpha_true[rate] * ( income - I_n )
      # chi is constant up to second age grid point because prevalence
      # cannot determine chi at age zero.
      aa = max(a, age_grid[1] )
      return (1 + aa / 100) * 1e-2 * math.exp( effect )
   if rate == 'omega' :
      # omega random effect is constrained for each node so to get an
      # exact match it cannot depend on income, but it can depend on
      # age and time
      effect   = R_n +  a / 10000
      return (1 + a / 100) * 1e-2 * math.exp( effect)
   assert False

The true model for chi is constant below the second age grid point because it is not possible to determine chi at age zero from Sincidence and prevalence measurements.

y_i

The simulated integrands for this example are mtexcess, Sincidence, and prevalence. The data is simulated without any noise but it is modeled as having noise. In addition, the mtexcess data is only used for the no_ode_fit and is held out during actual fits. This simulates the case where the mtexcess data was constructed from the other data in order to help initialize the optimization.

n_i

Data is only simulated for the leaf nodes; i.e., each n_i is in the set { n3, n4, n5, n6 }. Since the data does not have any nose, the data residuals are a measure of how good the fit is for the nodes in the fit_goal_set.

a_i

For each leaf node, data is generated on the following age_grid:

age_grid = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0 ]

I_i

For each leaf node and each age in age_grid, data is generated for the following income_grid:

random_income = False
income_grid   = dict()
for node in [ 'n1', 'n2' ] :
   max_income  = 2.0 * avg_income[node]
   n_income_grid = 3
   d_income_grid = max_income / (n_income_grid - 1)
   income_grid[node] = [ j * d_income_grid for j in range(n_income_grid) ]

Note that the check of the fit for the nodes in the fit_goal_set expects much more accuracy when the income grid is not chosen randomly.

Omega Constraints

The omega_constraint routine is used to set the value of omega in the parent and child nodes.

Parent Rate Smoothing

The parent smoothings use the true value of iota and chi at age 50 and in the root_node:

   covariate_list = [ avg_income['n0'] ]
   iota_50        = rate_true('iota', 50.0, None, 'n0', covariate_list )
   chi_50         = rate_true('chi',  50.0, None, 'n0', covariate_list )

iota and chi

This is the smoothing used in the root_node model for the rates. Note that the value part of this smoothing is only used for the root_node. This smoothing uses the age_gird and one time point. There are no dtime priors because there is only one time point. The smoothing for chi does not use the first age grid point, age zero, because it is not possible to determine chi at age zero from Sincidence and prevalence measurements.

Value Prior

The following is the value prior used for the root_node

      {   'name':    'parent_iota_value_prior',
         'density': 'uniform',
         'lower':   iota_50 / 10.0,
         'upper':   iota_50 * 10.0,
         'mean':    iota_50,
         'std' :    iota_50 * 10.0,
         'eta':     iota_50 * 1e-3,
      }
      {   'name':    'parent_chi_value_prior',
         'density': 'uniform',
         'lower':   chi_50 / 10.0,
         'upper':   chi_50 * 10.0,
         'mean':    chi_50,
         'std':     chi_50 * 10.0,
         'eta':     chi_50 * 1e-3,
      }

The mean and standard deviation are only used for the root_node. The create_shift_db routine replaces them for other nodes.

dage Prior

The following is the dage prior used for the root_node:

      {   'name':    'parent_dage_prior',
         'density': 'log_gaussian',
         'mean':    0.0,
         'std':     4.0,
         'eta':     min(iota_50 , chi_50 ) * 1e-3,
      }

Child Rate Smoothing

This is the smoothing used for the random effect for each child of the root_node. There are no dage or dtime priors because there is only one age and one time point in this smoothing.

Value Prior

The following is the value prior used for the children of the root_node:

      {   'name':    'child_value_prior',
         'density': 'gaussian',
         'mean':    0.0,
         'std':     1.0,
      }

Alpha Smoothing

This is the smoothing used for alpha which multiplies the income covariate. There is only one age and one time point in this smoothing so it does not have dage or dtime priors.

Value Prior

The following is the value prior used for this smoothing:

      {   'name':    'alpha_value_prior',
         'density': 'gaussian',
         'lower':   - 10 * alpha_true_max_abs,
         'upper':   + 10 * alpha_true_max_abs,
         'std'  :   10 * alpha_true_max_abs,
         'mean':    0.0,
      }

The mean and standard deviation are only used for the root_node. The create_shift_db routine replaces them for other nodes.

Checking The Fit

The results of the fit are checked by check_cascade_node using the avgint_table that was created by the root_node_db routine. The node_id for each row is replaced by the node_id for the fit being checked. routine uses these tables to check that fit against the truth.

Child

Title

no_ode_xam.py

no_ode_xam: Python Source Code