--------------------------------------------
lines 14-1056 of file: at_cascade/csv/fit.py
--------------------------------------------

{xrst_begin csv.fit}
{xrst_spell
  avg
  avgint
  bnd
  cen
  const
  cov
  cpus
  cv
  dage
  dtime
  eigen
  ipopt
  iter
  meas
  mtexcess
  mtother
  mtwith
  mul
  num
  pini
  pos
  relrisk
  sincidence
  sqlite
  std
  underbars
  rcond
}

Fit a CSV Specified Cascade
###########################

Prototype
*********
{xrst_literal
    # BEGIN_FIT
    # END_FIT
}

Example
*******
:ref:`csv.break_fit_pred-name` .

fit_dir
*******
This string is the directory name where the csv files
are located.

max_node_depth
**************
This is the number of generations below the root node that are included;
see :ref:`job_descendant@Node Depth Versus Job Depth`
and note that sex is the :ref:`option_all_table@split_covariate_name` .
If max_node_depth is zero,  only the root node will be included.
If max_node_depth is None,  the root node and all its descendants are included.

Input Files
***********

option_fit.csv
==============
This csv file has two columns,
one called ``name`` and the other called ``value``.
The rows of this table are documented below by the name column.
If an option name does not appear, or the corresponding value is empty,
the default value is used for the option.
The final value for each of the options is reported in the file
:ref:`csv.fit@Output Files@option_fit_out.csv` .
Because each option has a default value,
new options are added in such a way that
previous option_fit.csv files are still valid.
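
As an illustration, the following sketch writes a small option_fit.csv
using the Python ``csv`` module and reads it back.
The temporary directory and the option values are assumptions chosen only
for this example; any option that is omitted keeps its default value.

```python
import csv
import os
import tempfile

# Hypothetical fit directory used only for this sketch.
fit_dir = tempfile.mkdtemp()

# Example option values; options that are omitted use their defaults.
option_fit = {
    'max_fit'       : '500',
    'ode_step_size' : '5.0',
    'refit_split'   : 'false',
    'age_avg_split' : '0.1 1.0',
}

file_name = os.path.join(fit_dir, 'option_fit.csv')
with open(file_name, 'w', newline='') as file_obj :
    writer = csv.DictWriter(file_obj, fieldnames = [ 'name', 'value' ])
    writer.writeheader()
    for name, value in option_fit.items() :
        writer.writerow( { 'name' : name, 'value' : value } )

# Read the file back to check what was written.
with open(file_name, newline='') as file_obj :
    rows = list( csv.DictReader(file_obj) )
```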

absolute_covariates
-------------------
This is a space separated list of the names of the absolute covariates.
The reference value for an absolute covariate is always zero.
(The reference value for a relative covariate is its average for the
location that is being fit.)
The default value for *absolute_covariates* is the empty string; i.e.,
there are no absolute covariates.
The covariate named ``one`` is automatically created and is always
absolute and should not be in this list.

age_avg_split
-------------
This string contains a space separated list of float values
(there is one or more spaces between each float value).
Each float value is an age at which to split the integration of both the
ODE and the average of an integrand over an interval.
The default for this value is the empty string; i.e.,
no extra age splitting over the uniformly spaced grid specified by
:ref:`csv.fit@Input Files@option_fit.csv@ode_step_size`.

asymptotic_rcond_lower
----------------------
This float is a lower bound for an approximate
reciprocal condition number of the Hessian of the fixed effects objective.
This Hessian is used as an approximation for the information matrix when
using the ``asymptotic`` or ``censor_asymptotic``
:ref:`csv.fit@Input Files@option_fit.csv@sample_method` .
This option must be between zero and one and its default value is zero.
If the approximate reciprocal condition number is less than
*asymptotic_rcond_lower*, the asymptotic sample method will fail.

balance_sex
-----------
This is a boolean option.
The subsample of the data with size
:ref:`csv.fit@Input Files@option_fit.csv@max_fit` always attempts to
balance the child nodes; i.e.,
get an equal number of data values for each child of the node currently being fit.
If *balance_sex* is true, the selection will also try to balance
the sex covariate values; i.e.,
get an equal amount of male and female data for each child node.

bound_random
------------
This float option specifies a bound on the random effects.
Sometimes the initial fixed effects are very far from truth and
the random effects try to compensate with large values.
This bound can stabilize the optimization in this case.
It is the intention that this bound not be active
at the final value for the fixed effects.
The default value for this option is infinity; i.e., no bound.

child_prior_dage
----------------
This option is true or false.
If it is false, no dage priors are created for the child jobs.
The default value for this option is true.
See the :ref:`create_shift_db@Problem` for a discussion of
why you may want to use this option.

child_prior_dtime
-----------------
This option is true or false.
If it is false, no dtime priors are created for the child jobs.
The default value for this option is true.
See the :ref:`create_shift_db@Problem` for a discussion of
why you may want to use this option.

child_prior_std_factor
----------------------
This factor multiplies the parent fit posterior standard deviation for the
value priors during a child fit (except for the covariate multipliers).
If it is greater (less) than one, the child prior standard deviations are
larger (smaller) than indicated by the posterior corresponding to the parent fit.
The default value for this option is 2.0.

child_prior_std_factor_mulcov
-----------------------------
This factor multiplies the parent fit posterior standard deviation for the
value priors for the covariate multipliers.
The default value for this option is *child_prior_std_factor* .

compress_interval
-----------------
This string contains two float values separated by one or more spaces.
The first (second) float value is called *age_size* ( *time_size* ).
The default for this option is that both *age_size* and *time_size* are 100.

#. If for a :ref:`csv.fit@Input Files@data_in.csv` row,
   *age_upper* - *age_lower*  <= *age_size* ,
   the age average for that data is approximated by its value at age
   ( *age_upper* + *age_lower* ) / 2.
#. If for a data_in.csv row,
   *time_upper* - *time_lower*  <= *time_size* ,
   the time average for that data is approximated by its value at time
   ( *time_upper* + *time_lower* ) / 2.
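
The compression rule can be sketched as follows.
This is only an illustration; it assumes the approximation point is the
midpoint of the interval, and the function name and the option value
``10 5`` are hypothetical.

```python
def compress_interval(lower, upper, size) :
    """Return the (lower, upper) limits after applying the compression rule."""
    if upper - lower <= size :
        # Interval is small enough: replace it by its midpoint.
        midpoint = (lower + upper) / 2.0
        return (midpoint, midpoint)
    # Interval is too wide to compress: keep the original limits.
    return (lower, upper)

# Hypothetical compress_interval option value '10 5'.
age_size, time_size = 10.0, 5.0
age_interval  = compress_interval(20.0, 25.0, age_size)
time_interval = compress_interval(1990.0, 2000.0, time_size)
```

In this example the age interval is compressed to the single age 22.5,
while the time interval is wider than *time_size* and is left unchanged.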

covariate_reference
-------------------
This string is either ``data_in.csv`` or ``covariate.csv`` .
If it is ``data_in.csv`` the reference value for each
(sex, node, covariate) is the average of the covariate
corresponding to the data that is fit for that (sex, node) .
If it is ``covariate.csv`` the reference value for each
(sex, node, covariate) is the average of the
values in covariate.csv that are for that sex, node, and covariate.
The default value for this option is ``data_in.csv`` .
See :ref:`csv.shock_cov@covariate_reference` in the csv.shock
for an example use of this option.

freeze_type
-----------
This option specifies the type of covariate multiplier freeze that is done.
It is either ``mean`` or ``posterior`` and its default is ``mean`` .
If :ref:`csv.fit@Input Files@option_fit.csv@refit_split` is false,
the freeze fit is the only fit at the root level.
If *refit_split* is true,
the freeze fit is the second fit at the root level; i.e.,
the fit directly after the sex split.
Note that in general the cascade can freeze the covariate multipliers
at any level; see :ref:`option_all_table@freeze_type`
in the option_all table.

mean
....
If the freeze_type is ``mean`` ,
the mean (optimal value) for the covariate multipliers,
determined by the freeze fit,
is used as the lower and upper limit
for fits that are descendant of the freeze fit.
Note that if the lower and upper limits are equal, the corresponding
model variable is treated as if it has no uncertainty.


posterior
.........
If the freeze_type is ``posterior`` ,
the posterior distribution for the covariate multipliers,
determined by the freeze fit,
is used as the prior for all the descendants of the freeze fit.
This enables one to account for the uncertainty of covariate multiplier values.


hold_out_integrand
------------------
This string contains a space separated list of integrand names.
These integrands are held out from all the fits except for the
:ref:`no_ode_fit-name` .
The no_ode_fit is used to initialize the rates.
You can use this option to hold out direct measurements of the
rates that are only intended to help with the initialization
(are not real data).
The following is a list of the rates and corresponding integrand
that is a direct measurement of the rate:

.. csv-table::
    :widths: auto
    :header-rows: 1

    Rate,Integrand
    iota,Sincidence
    rho,remission
    chi,mtexcess

The default value for *hold_out_integrand* is the empty string; i.e.,
all of the data is real data and is included in the fits.

max_abs_effect
--------------
This float option specifies an extra bound on the
absolute value of the covariate multipliers,
except for the measurement noise multipliers.
To be specific, the bound on the covariate multiplier is as large as possible
under the condition

    | *mul_bnd* * ( *cov_value* - *cov_ref* ) | <= *max_abs_effect*

where *mul_bnd* is the non-negative covariate multiplier bound,
*cov_value* is a data table value of the covariate,
and *cov_ref* is the reference value for the covariate.
It is an extra bound because it is in addition to the priors for a
covariate multiplier.
The default value for this option is 2.
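
The following sketch shows one way to compute such a bound.
It reads the condition as an upper limit on the absolute covariate effect,
so the largest admissible bound is *max_abs_effect* divided by the largest
absolute difference between a covariate value and its reference.
The function name and the covariate values are hypothetical.

```python
def mulcov_bound(max_abs_effect, cov_values, cov_ref) :
    """Largest mul_bnd with |mul_bnd * (cov_value - cov_ref)| <= max_abs_effect."""
    max_diff = max( abs(value - cov_ref) for value in cov_values )
    if max_diff == 0.0 :
        # The covariate never differs from its reference: no bound is needed.
        return float('inf')
    return max_abs_effect / max_diff

# Made-up covariate values in the data and a made-up reference value.
mul_bnd = mulcov_bound(2.0, [ 0.5, 1.0, 3.0 ], 1.0)
```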

max_fit
-------
This integer is the maximum number of data values to fit per integrand.
If for a particular fit an integrand has more than this number of
data values, a subsample of this size is randomly selected.
There is an exception to this rule: the three fits for the root node
(corresponding to sex equal to female, both and male)
use twice this number of values per integrand.
This is because the sex covariate multiplier is frozen after the both fit
and the other covariate multipliers are frozen after the female and male fits.
The default value for *max_fit* is 250.
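
A minimal sketch of this kind of subsampling for one integrand follows.
It is not the at_cascade implementation (which also balances child nodes);
the function name and the data identifiers are hypothetical.

```python
import random

def subsample(data_ids, max_fit, rng) :
    """Randomly select at most max_fit of the data_ids for one integrand."""
    data_ids = list(data_ids)
    if len(data_ids) <= max_fit :
        # Few enough values: all of them are fit.
        return data_ids
    # Too many values: choose max_fit of them without replacement.
    return sorted( rng.sample(data_ids, max_fit) )

rng      = random.Random(123)
selected = subsample(range(1000), 250, rng)
```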

max_fit_parent
--------------
If this integer is greater than or equal to zero,
*max_fit* only applies to the child data for a fit,
and *max_fit_parent* is the maximum number of data values for the parent.
The default value for *max_fit_parent* is minus one, in which case
*max_fit* applies to all the data for a fit.
Note that data corresponding to the parent node
will not be used when fitting any of its descendants.

max_num_iter_fixed
------------------
This integer is the maximum number of Ipopt iterations to try before
giving up on fitting the fixed effects.
The default value for *max_num_iter_fixed* is 100.

max_number_cpu
--------------
This integer is the maximum number of cpus (processes) to use.
It must be greater than zero. If it is one, the jobs are run
sequentially, more output is printed to the screen, and the program
can be cleanly stopped with a control-C.
The default value for this option is
{xrst_code py}
    max_number_cpu = max(1, multiprocessing.cpu_count() - 1)
{xrst_code}

minimum_meas_cv
---------------
This float must be non-negative (greater than or equal to zero).
It specifies a lower bound on the standard deviation for each measured data
value as a fraction of the measurement value.
The default value for *minimum_meas_cv* is zero.
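
As a sketch, and assuming the lower bound is applied by taking a maximum,
the standard deviation actually used for a data value could be computed
as follows. The function name is hypothetical.

```python
def effective_meas_std(meas_value, meas_std, minimum_meas_cv) :
    """Measurement standard deviation after applying the minimum_meas_cv bound."""
    return max( meas_std, minimum_meas_cv * abs(meas_value) )

# The bound is active when meas_std is small relative to the measurement.
small_std = effective_meas_std(4.0, 0.5, 0.25)
# The bound is not active when meas_std is already large enough.
large_std = effective_meas_std(4.0, 2.0, 0.25)
```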


no_ode_ignore
-------------
This is a space separated list of rate and integrand names.
It specifies which integrands are ignored during a :ref:`no_ode_fit-name` .
The priors for the following variables will not be changed by no_ode_fit:

#. The rate names in *no_ode_ignore* .
#. The covariate multipliers that affect the rates in *no_ode_ignore*.
#. The covariate multipliers that affect measurement values for
   the integrands in *no_ode_ignore* .

all
...
In the special case where *no_ode_ignore* is ``all`` ,
the no_ode fit is not run and none of the priors are changed before the
:ref:`glossary@root_node` fit.

no_ode_fit
----------
If this is true (false) a :ref:`no_ode_fit-name` is (is not)
used to get better values for the fixed effects prior means.
The default value for *no_ode_fit* is true.

number_sample
-------------
This is the number of independent samples of the posterior distribution
for the fitted variables to generate (for each fit).

#. This sampled posterior is used to
   create priors for the children of the node being fit.
#. When splitting, the samples are used to create priors for the
   same node at the new split covariate values.
#. These samples are also used by :ref:`csv.predict-name`
   to create posterior predictions for any function of the fitted variables.

The default value for this option is 20.
(You can get 1000 MCMC samples by just repeating each of the 20
independent samples 50 times.)

ode_method
----------
The default for *ode_method* is ``iota_pos_rho_zero`` (see below).

no_ode
......
The *ode_method* value does not matter for the following integrands:
``Sincidence`` ,
``remission`` ,
``mtexcess`` ,
``mtother`` ,
``mtwith`` ,
``relrisk`` ,
``mulcov_`` *mulcov_id* .
If all of your integrands are in the set above, you can use
``no_ode`` as the *ode_method* and avoid having to worry about
constraining certain rates to be positive or zero.

2DO
,,,
This ode_method does not currently work in the context of csv.fit
because csv.fit automatically requests the prevalence integrand for
predicting values of pini.
This should either be fixed or no_ode should be removed from the
possible ode_method values.

trapezoidal
...........
If *ode_method* is ``trapezoidal`` ,
a trapezoidal method is used to approximate the ODE solution.
Like ``no_ode``, you do not have to worry about constraining
certain rates to be positive or zero when using the trapezoidal method.

iota_zero_rho_zero
..................
If *ode_method* is ``iota_zero_rho_zero`` ,
the smoothing for
*iota* and *rho* must always have lower and upper limit zero.
In this case an eigen vector method is used to approximate the ODE solution.

iota_pos_rho_zero
.................
If *ode_method* is ``iota_pos_rho_zero`` ,
the smoothing for
*iota* must always have lower limit greater than zero and for
*rho* lower and upper limit zero.
In this case an eigen vector method is used to approximate the ODE solution.

iota_zero_rho_pos
.................
If *ode_method* is ``iota_zero_rho_pos`` ,
the smoothing for
*rho* must always have lower limit greater than zero and for
*iota* lower and upper limit zero.
In this case an eigen vector method is used to approximate the ODE solution.

iota_pos_rho_pos
................
If *ode_method* is ``iota_pos_rho_pos`` ,
the smoothing for
*iota* and *rho*
must always have lower limit greater than zero.
In this case an eigen vector method is used to approximate the ODE solution.

ode_step_size
-------------
This float must be positive (greater than zero).
It specifies the step size in age and time to use when solving the ODE.
It is also used as the step size for approximating average integrands
over age-time intervals.
The smaller *ode_step_size*, the more computation is required to
approximate the ODE solution and the average integrands.
Finer resolution for specific ages can be achieved using the
:ref:`csv.fit@Input Files@option_fit.csv@age_avg_split` option.
The default value for this option is 10.0.
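
The interaction between *ode_step_size* and *age_avg_split* can be
pictured with the following sketch.
The actual grid is constructed inside dismod_at; this example only assumes
a uniform grid whose spacing is at most *ode_step_size*, with the split
ages added as extra grid points. The function name and the age range are
hypothetical.

```python
import math

def age_grid(age_min, age_max, ode_step_size, age_avg_split) :
    """Uniform grid with spacing <= ode_step_size plus the extra split ages."""
    n_interval = max( 1, math.ceil( (age_max - age_min) / ode_step_size ) )
    step       = (age_max - age_min) / n_interval
    grid       = { age_min + i * step for i in range(n_interval + 1) }
    # age_avg_split is a space separated list of floats.
    for age in age_avg_split.split() :
        if age_min < float(age) < age_max :
            grid.add( float(age) )
    return sorted(grid)

grid = age_grid(0.0, 100.0, 10.0, '0.1 1.0')
```

Here the two split ages refine the first ten year interval without
changing the spacing anywhere else.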

perturb_optimization_scale
--------------------------
This is the standard deviation of the log of a random multiplier
that perturbs the scaling point; see
:ref:`option_all_table@perturb_optimization_scale` .
The default value for this option is 0.3.

perturb_optimization_start
--------------------------
This is the standard deviation of the log of a random multiplier
that perturbs the starting point; see
:ref:`option_all_table@perturb_optimization_start` .
The default value for this option is 0.1.
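
The two perturbation options can be illustrated by the sketch below.
It assumes the log of the multiplier is normally distributed with mean zero
and the given standard deviation; the function name, the seed, and the
value being perturbed are hypothetical.

```python
import math
import random

def perturb(value, sigma, rng) :
    """Multiply value by exp(z) where z is normal with mean 0 and std sigma."""
    log_multiplier = rng.gauss(0.0, sigma)
    return value * math.exp(log_multiplier)

rng   = random.Random(0)
start = perturb(0.01, 0.1, rng)   # perturbed starting point
scale = perturb(0.01, 0.3, rng)   # perturbed scaling point
```

Because the multiplier is the exponential of a random value, a positive
variable stays positive after the perturbation.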

quasi_fixed
-----------
If this boolean option is true,
a quasi-Newton method is used to optimize the fixed effects.
Otherwise a Newton method is used.
The Newton method uses second derivatives of the objective
and hence requires more work per iteration but it can often attain
much more accuracy in the final solution.
The default value for *quasi_fixed* is true.

random_seed
-----------
This integer is used to seed the random number generator.
The default value for this option is
{xrst_code py}
    random_seed = int( time.time() )
{xrst_code}

refit_split
-----------
#. If this boolean is true,
   there are female, male, and both fits at the root level.
   The both fit is used for the female and male priors.
   The female and male fits are used for the priors below the root level.
#. If *refit_split* is false,
   there is no female or male fit at the root level and
   the both fit is used for the priors below the root level.
#. The default value for this option is true.

Multiplier Freeze
.................
If *refit_split* is true,
the covariate multipliers are frozen after the sex split; i.e.,
after the separate female and male fits at the root level.
If *refit_split* is false,
the covariate multipliers are frozen after the both fit at the root level.

root_node_name
--------------
This string is the name of the root node.
The default for *root_node_name* is the top root of the entire node tree.
Only the root node and its descendants will be fit.
Sometimes it is useful to set :ref:`csv.fit@max_node_depth` to zero
and change *root_node_name* to a particular node that the
cascade is having trouble fitting. This can greatly speed up model building.

root_node_sex
-------------
This is either ``female`` , ``male`` , or ``both``.
If it is ``both``, then the ``female`` and ``male`` directories
occur directly below the directory for the root node; i.e.,
the sexes are split just after fitting the root node.
If it is not ``both``, there is no ``female`` or ``male`` directory
directly below the directory for the root node and all of the fits
are for the *root_node_sex* .

sample_method
-------------
This string specifies the :ref:`option_all_table@sample_method` .
It must be ``asymptotic`` , ``censor_asymptotic`` or ``simulate``
and its default value is ``asymptotic`` .

shared_memory_prefix
--------------------
This string is added to the front of the name of the shared
memory objects used to run the cascade in parallel.
No two cascades can run at the same time with the same shared memory prefix.
If a cascade does not terminate cleanly, you may have to clear the
shared memory before you can run it again; see :ref:`clear_shared-name` .
The default value for this option is your user name ($USER) with spaces
replaced by underbars.
If the USER environment variable is not defined,
the value ``none`` is used for this default.
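
The default described above can be sketched as follows.
The function name is hypothetical and this is not necessarily how
at_cascade computes the default.

```python
import os

def default_shared_memory_prefix(environ) :
    """User name with spaces replaced by underbars, or 'none' if USER is unset."""
    user = environ.get('USER')
    if user is None :
        return 'none'
    return user.replace(' ', '_')

# Default prefix for the current environment.
prefix = default_shared_memory_prefix(os.environ)
```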

tolerance_fixed
---------------
is the tolerance for convergence of the fixed effects optimization problem.
This is relative to one and its default value is 1e-4.

node.csv
========
This file has the same description as the simulate
:ref:`csv.simulate@Input Files@node.csv` file.

covariate.csv
=============
This csv file has the same description as the simulate
:ref:`csv.simulate@Input Files@covariate.csv` file.

Compression
-----------
The :ref:`csv.covariate_same-name` routine is used to detect when
two (node, sex) pairs have the same values for a covariate.
In addition, csv.fit detects when a covariate is constant with respect to
age or time or both.
If many (node_name, sex) pairs have the same values for a covariate,
or do not depend on age or time,
this can result in a large savings in the size of the root node database
and the amount of memory required by dismod_at.
This depends on the values you choose in covariate.csv.
The following summary of this savings is printed when csv.fit is run::

    csv.fit: create_root_database: covariate counts
    number (node, sex, covariate) combinations = ...
    number of corresponding weights            = ...
    number that are constant w.r.t. age        = ...
    number that are constant w.r.t. time       = ...
    number that are constant w.r.t. both       = ...


population
----------
If this table has a covariate called ``population`` ,
it is also used to weight the data as a function of age and time; e.g.,
see :ref:`csv.population-name` .
This function is different for each sex and location.

#. The :ref:`csv.simulate-name` routine does not yet do this data weighting.

#. No population weighting is used during the predictions in
   :ref:`csv.predict@Output Files@fit_predict.csv` because these predictions
   are for a single (age, time) point and not a rectangular (age, time) region.

Both Sexes
..........
The population weighting, and covariate value,
for data with sex equal to ``both`` is the
average of the ``female`` and ``male`` populations.
One might think the ``both`` population would be the sum of the
female and male populations but this would make the population covariate
different than all the other covariates (which use the average of the
female and male values for both).

fit_goal.csv
============
If a :ref:`csv.simulate@Input Files@node.csv@node_name` is in this table,
and the node is a descendant of the root node,
it will be included in the fit.
All the ancestors of goal nodes, up to the root node, are also fit.

#. This is different from the :ref:`glossary@fit_goal_set` which only
   contains nodes that are descendants of the root node.

#. A fit_goal.csv file that only has its header line is the same
   as one that contains all the nodes in the node table.

#. If you only have one node in this file, at_cascade will do a drill
   from the root node to the goal node.

node_name
---------
Is the name of a node in the fit goal set.
Each such node must be a descendant of the root node.

predict_integrand.csv
=====================
This is the list of integrands at which predictions are made
and stored in :ref:`csv.predict@Output Files@fit_predict.csv` .

integrand_name
--------------
This string is the name of one of the prediction integrands.
You can use the integrand name ``mulcov_0`` , ``mulcov_1`` , ...
which corresponds to the first , second , ...
covariate multiplier in the mulcov.csv file.


{xrst_comment ---------------------------------------------------------------}

prior.csv
=========
This csv file has the following columns:

name
----
is a string containing the name of this prior.
No two priors can have the same name.

density
-------
is one of the following strings:
uniform,
gaussian, cen_gaussian, log_gaussian,
laplace, cen_laplace, log_laplace.
(Only these densities are included, so far, so that we do not have to
worry about the degrees of freedom.)

mean
----
is a float containing the mean for the density
for this prior (before truncation).
If density is uniform, this value is only used for starting
and scaling the optimization.
This column must appear and its value cannot be empty.

std
---
is a float containing the standard deviation for the density
for this prior (before truncation).
If density is uniform, this value is not used and can be empty.
If all the densities are uniform, this column is optional.

eta
---
is a float specifying the offset for
the log_gaussian and log_laplace densities.
If the density is not log_gaussian or log_laplace,
this value is not used and can be empty.
If none of the densities are log_gaussian or log_laplace,
this column is optional.

lower
-----
is a float containing the lower limit for the truncated density
for this prior.
This column is optional,
if it does not appear or its value is empty, there is no lower bound.

upper
-----
is a float containing the upper limit for the truncated density
for this prior.
This column is optional,
if it does not appear or its value is empty, there is no upper bound.
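
For example, the following sketch builds the text of a small prior.csv
that uses all of the columns above.
The prior names and numeric values are made up for illustration;
empty strings are used where a column does not apply to a density.

```python
import csv
import io

columns = [ 'name', 'density', 'mean', 'std', 'eta', 'lower', 'upper' ]

# Hypothetical priors: (name, density, mean, std, eta, lower, upper).
prior_list = [
    ( 'uniform_eps_1', 'uniform',      '0.01', '',    '',     '1e-6', '1.0' ),
    ( 'gauss_01',      'gaussian',     '0.0',  '1.0', '',     '',     ''    ),
    ( 'log_gauss_4',   'log_gaussian', '0.0',  '4.0', '1e-7', '',     ''    ),
]

# No two priors can have the same name.
name_list = [ prior[0] for prior in prior_list ]
assert len(name_list) == len( set(name_list) )

string_io = io.StringIO()
writer    = csv.writer(string_io)
writer.writerow(columns)
writer.writerows(prior_list)
prior_csv_text = string_io.getvalue()
```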

{xrst_comment ---------------------------------------------------------------}

parent_rate.csv
===============
This file specifies the prior for the root node parent rates.
These are no effect rates; i.e., no random or covariate effects are
included in these rates.
For each value of *rate_name*,
this file must have a rectangular grid in *age* and *time* .

rate_name
---------
is a string containing the name for the non-zero rates
(except for omega which is specified by covariate.csv).

age
---
is a float containing the age for this grid point.

time
----
is a float containing the time for this grid point.

value_prior
-----------
is a string containing the name of the value prior for this grid point.
Either *value_prior* or *const_value* must be non-empty but not both.
The standard deviation for a value prior is always in the same units as
the mean for the prior, even when the prior is log-scaled.

dage_prior
----------
is a string containing the name of the dage prior for this grid point.
If dage_prior is empty, there is no prior for the forward age difference
of this rate at this grid point.
This prior cannot be censored.
If a dage prior is log-scaled,
the standard deviation is for the difference w.r.t age
of the offset log transform of the corresponding model variable.
Otherwise,
the standard deviation is for the difference w.r.t age
of the corresponding model variable.

dtime_prior
-----------
is a string containing the name of the dtime prior for this grid point.
If dtime_prior is empty, there is no prior for the forward time difference
of this rate at this grid point.
This prior cannot be censored.
If a dtime prior is log-scaled,
the standard deviation is for the difference w.r.t time
of the offset log transform of the corresponding model variable.
Otherwise,
the standard deviation is for the difference w.r.t time
of the corresponding model variable.


const_value
-----------
is a float specifying a constant value for this grid point or the empty string.
This is equivalent to the upper and lower limits being equal to this value.
Either *const_value* or *value_prior* must be non-empty but not both.
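
The two rules above (a rectangular grid for each rate, and exactly one of
*value_prior* and *const_value* being non-empty) can be checked with a
sketch like the following.
The function name, the prior name, and the rows are hypothetical.

```python
import itertools

def check_parent_rate(rows) :
    """Check the grid and value rules for a list of parent_rate.csv rows."""
    by_rate = dict()
    for row in rows :
        # Exactly one of value_prior and const_value must be non-empty.
        if (row['value_prior'] == '') == (row['const_value'] == '') :
            raise ValueError('need exactly one of value_prior, const_value')
        point = ( float(row['age']), float(row['time']) )
        by_rate.setdefault(row['rate_name'], []).append(point)
    for rate_name, points in by_rate.items() :
        age_set  = { age  for (age, time) in points }
        time_set = { time for (age, time) in points }
        grid     = set( itertools.product(age_set, time_set) )
        # Rectangular: every (age, time) combination appears exactly once.
        if len(points) != len(grid) or set(points) != grid :
            raise ValueError(f'{rate_name} grid is not rectangular')
    return True

# Made-up rows: iota on a 2 by 2 grid in age and time.
rows = [
    { 'rate_name' : 'iota', 'age' : age, 'time' : time,
      'value_prior' : 'uniform_eps_1', 'const_value' : '' }
    for age in [ '0', '100' ] for time in [ '1990', '2020' ]
]
ok = check_parent_rate(rows)
```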

{xrst_comment ---------------------------------------------------------------}

child_rate.csv
==============
This csv file specifies the prior for the child rate effects
pini, iota, rho and chi. These are random effects.
(The parent and child priors for omega are created automatically
using the :ref:`csv.simulate@Input Files@covariate.csv@omega` column
in the :ref:`csv.fit@Input Files@covariate.csv` file. )

rate_name
---------
this string is the name of this rate and is one of the following:
pini, iota, rho, chi .
If one of these rates does not appear in child_rate.csv ,
that rate has no random effects.

value_prior
-----------
is a string containing the name of the value prior for this child rate effect.
The child rate effects are constant in age and time
(this is a limitation of csv.fit).

Note that the child rate effects are in log of rate space.
In other words, if :math:`u` is a child rate effect and :math:`p(a, t)` is the
corresponding parent rate as a function of age and time,
the corresponding child rate as a function of age and time :math:`c(a, t)` is

.. math::

    c(a,t) = \exp(u) p(a,t)
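
A direct transcription of this formula is shown below.
The function name and the numeric values are hypothetical.

```python
import math

def child_rate(u, parent_rate) :
    """Child rate c = exp(u) * p for child rate effect u and parent rate p."""
    return math.exp(u) * parent_rate

# An effect of zero leaves the parent rate unchanged.
same_rate   = child_rate(0.0, 0.01)
# An effect of log(2) doubles the parent rate.
double_rate = child_rate(math.log(2.0), 0.01)
```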

{xrst_comment ---------------------------------------------------------------}

mulcov.csv
==========
This csv file specifies the covariate multipliers.

covariate
---------
this string is the name of the covariate for this multiplier.
The covariate
``one`` is an absolute covariate that is always equal to one and
``sex`` is the splitting covariate
( ``sex`` is sex name in :ref:`csv.module@sex_name2value` ).
All the other covariates are specified by
:ref:`csv.fit@Input Files@covariate.csv`.
If one of these covariates appears in the
:ref:`csv.fit@Input Files@option_fit.csv@absolute_covariates` list it is an
absolute covariate.
The other covariates in covariate.csv are
:ref:`relative covariates<glossary@Relative Covariate>` .
For relative covariates,
the average of the covariate
(for the current node and sex being fit)
is subtracted before it is multiplied by a multiplier.

type
----
This string is rate_value, meas_value, or meas_noise.

rate_value
..........
The multiplier times the covariate affects the rate
in the effected column; i.e.
the exponential of the product multiplies the rate.

meas_value
..........
The multiplier times the covariate affects the model for the integrand
in the effected column; i.e.
the exponential of the product multiplies the model for the integrand.

meas_noise
..........
The multiplier times the covariate affects the model for the
measurement noise for the integrand in the effected column.
To be more specific, the product is added to the standard deviation for
measurements for the integrand.
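
The effect of the multiplier types can be sketched as follows.
This follows the descriptions above and assumes the covariate difference
is the covariate value minus its reference;
the function names and numeric values are hypothetical.

```python
import math

def multiply_effect(value, multiplier, cov_value, cov_ref) :
    """rate_value or meas_value: exp of the product multiplies the value."""
    return value * math.exp( multiplier * (cov_value - cov_ref) )

def add_noise_effect(meas_std, multiplier, cov_value, cov_ref) :
    """meas_noise: the product is added to the measurement standard deviation."""
    return meas_std + multiplier * (cov_value - cov_ref)

rate_no_effect = multiply_effect(0.01, 0.0, 3.0, 1.0)
rate_adjusted  = multiply_effect(0.01, 0.5, 3.0, 1.0)
std_adjusted   = add_noise_effect(0.1, 0.05, 3.0, 1.0)
```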

effected
--------
is the name of the integrand or rate affected by this multiplier;
see type above.

value_prior
-----------
is a string containing the name of the value prior
for this covariate multiplier.
Note that the covariate multipliers are constant in age and time
(this is a limitation of csv.fit).
Either *value_prior* or *const_value* must be non-empty but not both.

const_value
-----------
is a float specifying a constant value for this covariate multiplier
or the empty string.
This is equivalent to the upper and lower limits being equal to this value.
Either *value_prior* or *const_value* must be non-empty but not both.

{xrst_comment ---------------------------------------------------------------}

data_in.csv
===========
This csv file specifies the data set
with each row corresponding to one data point.

Optional Columns
----------------
The following columns are optional and the empty string is used
for all the rows of a column that does not appear:
meas_std, eta, nu, sample_size.

data_id
-------
is an :ref:`csv.module@Notation@Index Column` for data_in.csv.
This is necessary so that the dismod_at data table data_id values correspond
to the data_in.csv data_id values.


integrand_name
--------------
This string is a dismod_at integrand name; e.g. ``Sincidence``.

density_name
------------
This string is one of the following dismod_at density names:

.. csv-table::

    gaussian, cen_gaussian, log_gaussian, cen_log_gaussian
    laplace,  cen_laplace,  log_laplace,  cen_log_laplace
    students,            ,  log_students,
    binomial,,,


node_name
---------
This string identifies the node corresponding to this data point.

sex
---
This string is the sex name for this data point.

age_lower
---------
This float is the lower age limit for this data row.

age_upper
---------
This float is the upper age limit for this data row.

time_lower
----------
This float is the lower time limit for this data row.

time_upper
----------
This float is the upper time limit for this data row.

meas_value
----------
This float is the measured value for this data point.

meas_std
--------
This float is the standard deviation of the measurement noise
for this data point.
This standard deviation is always in the same units as
the data, even when the density is log-scaled.

binomial
........
The *meas_std*
must be empty when the density is binomial.
In this case the standard deviation corresponding to a measurement
is a function of the sample size and the model for the mean of the data.
This requires that the model for the mean of the data is positive; i.e.,
greater than zero.

eta
---
This float is the offset in the log transformation for the log densities
(it can be empty if this is not a log density).

nu
--
This float is the degrees of freedom for the students densities
(it can be empty if this is not a students density).

sample_size
-----------
This float should be empty if the density is not binomial.
Otherwise, it is the sample size for a binomial distribution
(see :ref:`csv.binomial-name` for an example):

.. csv-table::
    :widths: auto

    y,is the meas_value for this data
    n,is the sample size
    k,is the counts in the binomial distribution; k = y * n .
    p,is the success rate; p is the mean of y

The log of the binomial density function is:

.. math::

    \log {n \choose k} + k \log(p) + (n-k) \log(1 - p)

We suggest using a gaussian approximation of the binomial when p * n
is greater than 5.
This approximation will be faster and less likely to have evaluation issues
during the optimization.
If you do not have a good idea as to the value of p,
use a gaussian when k = y * n is greater than 5.
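
The log density above and the suggested rule of thumb can be sketched in
Python as follows.
The function names are hypothetical, and the log of the binomial
coefficient is evaluated with ``math.lgamma`` so that k = y * n
does not have to be an integer.

```python
import math

def log_binomial(y, n, p) :
    """Log of the binomial density for meas_value y, sample size n, mean p."""
    k          = y * n
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)

def suggest_gaussian(n, p) :
    """True when the gaussian approximation is suggested; i.e., p * n > 5."""
    return p * n > 5.0

# One success in two trials with p = 0.5 has probability 0.5.
log_density = log_binomial(0.5, 2, 0.5)
```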

hold_out
--------
This integer is one (zero) if this data point is held out (not held out)
from the fit.

{xrst_comment ---------------------------------------------------------------}

Output Files
************

root.db
=======
This is the dismod_at sqlite database corresponding to the root node for
the cascade.

all_node.db
===========
This is the at_cascade sqlite all node database for the cascade.

dismod.db
=========
1. There is a subdirectory of the :ref:`csv.fit@fit_dir` with the
   name of the root node. The ``dismod.db`` file in this directory is
   the `dismod_at_database`_ corresponding to the fit and predictions for
   the root node fit for both sexes.
2. The root node directory has a ``female`` and ``male`` subdirectory.
   These directories contain the ``dismod.db`` database for
   the root node fit of the corresponding sex.
3. For each node between the root node and the
   :ref:`fit_goal nodes <csv.fit@Input Files@fit_goal.csv>` ,
   and for the ``female`` and ``male`` sex, there is a directory.
   This is directly below the directory for its parent node and same sex.
   It contains the ``dismod.db`` database for the corresponding fit.

.. _dismod_at_database: https://dismod-at.readthedocs.io/latest/database.html

option_fit_out.csv
==================
This is a copy of :ref:`csv.fit@Input Files@option_fit.csv` with the defaults
filled in for missing values.

fit_predict.csv
===============
These are the predictions for all of the nodes at the age, time and
covariate values specified in covariate.csv.
The prediction is done using the optimal variable values.

avgint_id
---------
Each avgint_id corresponds to a different value for age, time, or
integrand in the sam_predict file.
The age and time values come from the covariate.csv file.
The integrands come from the predict_integrand.csv file.

integrand_name
--------------
is the integrand for this prediction and is one of the integrand names
in predict_integrand.csv.

avg_integrand
-------------
This float is the mode value for the average of the integrand,
with covariate and other effects but without measurement noise.

node_name
---------
is the node name for this sample and
cycles through the nodes in covariate.csv.

age
---
is the age for this prediction and is one of
the ages in covariate.csv.

time
----
is the time for this prediction and is one of
the times in covariate.csv.

sex
---
is the sex name for this data point; i.e., female, both, or male.

covariate_names
---------------
The rest of the columns are covariate names and contain the value
of the corresponding covariate in covariate.csv.

sam_predict.csv
===============
This is a sampling of the predictions for all of the nodes at the age, time and
covariate values specified in covariate.csv.
It has the same columns as fit_predict.csv (see above) plus
an extra column named sample_index.

sample_index
------------
For each sample_index value, there is a complete set of all the values
in the fit_predict.csv table.
A different (independent) sample of the model variables
from their posterior distribution is used to do the predictions for
each sample index.
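
One common use of these samples is to summarize the posterior uncertainty
of each prediction.
The sketch below groups the rows by (avgint_id, node_name, sex) and
computes the mean and standard deviation over the sample indices.
The file content is made up for this example and only includes the
columns that the summary needs.

```python
import csv
import io
import statistics

# Made-up sam_predict.csv content (a subset of its columns).
sam_predict_text = '''avgint_id,sample_index,node_name,sex,avg_integrand
0,0,n0,female,0.010
0,1,n0,female,0.030
1,0,n0,female,0.200
1,1,n0,female,0.400
'''

def summarize(rows) :
    """Map each prediction to the (mean, std) of avg_integrand over samples."""
    by_prediction = dict()
    for row in rows :
        key   = ( row['avgint_id'], row['node_name'], row['sex'] )
        value = float( row['avg_integrand'] )
        by_prediction.setdefault(key, []).append(value)
    return {
        key : ( statistics.mean(values), statistics.stdev(values) )
        for key, values in by_prediction.items()
    }

rows    = list( csv.DictReader( io.StringIO(sam_predict_text) ) )
summary = summarize(rows)
```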

{xrst_end csv.fit}