Introduction to TETRAD
Tetrad Overview
What is Tetrad?
Tetrad is a program for
- creating,
- simulating data from,
- estimating,
- testing,
- predicting with,
- and searching for
causal/statistical models.
The aim of the program is to provide sophisticated methods in a friendly interface
requiring very little statistical sophistication of the user and no programming
knowledge. It is not intended to replace flexible statistical programming systems
such as Matlab, S-Plus or R. Tetrad is freeware that performs many of the functions
of commercial programs such as Netica, Hugin, LISREL and EQS, as well as
many discovery functions that these commercial programs do not perform.
Tetrad is unique in the suite of principled search ("exploration," "discovery")
algorithms it provides--for example its ability to search when there may be
unobserved confounders of measured variables, to search for models of latent
structure, and to search for linear feedback models--and in the ability to calculate
predictions of the effects of interventions or experiments based on a model.
All of its search procedures are "pointwise consistent"--they are guaranteed
to converge almost certainly to correct information about the true structure
in the large sample limit, provided that structure and the sample data satisfy
various commonly made (but not always true!) assumptions.
Tetrad is limited to models of categorical data (which can also be used for
ordinal data), to linear models ("structural equation models") with a Normal
probability distribution, and to a very limited class of time series models.
The Tetrad programs describe causal models
in three distinct parts or stages: a picture, representing a directed graph
specifying hypothetical causal relations among the variables; a specification
of the family of probability distributions and kinds of parameters associated
with the graphical model; and a specification of the numerical values of those
parameters.
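The three stages can be illustrated with a minimal sketch in plain Python. It does not use Tetrad or its API; the variable names (X, Y, Z), the helper functions and every numerical value are assumptions made up for illustration. The sketch builds a small linear Gaussian model stage by stage and then draws a few simulated cases from it.

```python
import random

# Stage 1: the graph, as a map from each variable to its parents (X -> Y -> Z).
# Variables are listed parents-first so they can be simulated in this order.
graph = {"X": [], "Y": ["X"], "Z": ["Y"]}

# Stage 2: the parametric family. Assumed here: a linear model with Gaussian
# noise, so each edge needs a coefficient and each variable a noise std. dev.
def parameter_slots(graph):
    """List the free parameters the graph implies under the linear Gaussian family."""
    slots = [("coef", parent, child)
             for child, parents in graph.items() for parent in parents]
    slots += [("noise_sd", variable) for variable in graph]
    return slots

# Stage 3: numerical values for those parameters (illustrative values only).
params = {
    ("coef", "X", "Y"): 0.8,
    ("coef", "Y", "Z"): -0.5,
    ("noise_sd", "X"): 1.0,
    ("noise_sd", "Y"): 1.0,
    ("noise_sd", "Z"): 1.0,
}

def simulate(graph, params, n, seed=0):
    """Draw n cases: each variable is a linear function of its parents plus noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for variable, parents in graph.items():
            mean = sum(params[("coef", p, variable)] * row[p] for p in parents)
            row[variable] = mean + rng.gauss(0.0, params[("noise_sd", variable)])
        rows.append(row)
    return rows

data = simulate(graph, params, n=5)
for row in data:
    print({name: round(value, 3) for name, value in row.items()})
```

Keeping the stages separate means the same graph can be paired with a different family (for example, categorical variables with conditional probability tables) without redrawing it, and the same family can be given different numerical values.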
The program and its search algorithms have been developed over several years
with support from the National Aeronautics and Space Administration and the
Office of Naval Research. Joseph Ramsey has implemented most of the program,
with substantial assistance from Frank Wimberly. Executable and source code
for all versions of Tetrad IV, and this manual, are copyrighted, 2004, by
Clark Glymour, Richard Scheines, Peter Spirtes and Joseph Ramsey. The program
may be freely downloaded and used without permission of the copyright holders,
who reserve the right to alter the program at any time without notification.
The Tetrad suite of programs permits the user to do any of the following:
- Generate a graphical statistical/causal model of any of the following kinds:
  - Models for categorical data (Bayes networks);
  - Models for continuous data with variables having a Gaussian (Normal) joint
    probability distribution;
  - Models for a limited class of time series representing genetic regulatory
    networks.
- Estimate parameters of models of the following kinds:
  - Models for categorical data in which all variables are recorded in the
    data (no "latent" variables);
  - Models for continuous data with or without latent variables.
- Test the fit of models of any of the kinds whose parameters can be
estimated (see the estimation item above).
- Simulate data from a model of any of the kinds that can be generated
(see the first item above).
- Update models of categorical data;
i.e., compute the probability of any variable in the model conditional on
any set of values for other variables in the model.
- Predict the probability of a variable
in a model (without latent variables) from interventions that fix or randomize
values for any set of other variables in the model.
- Search for models:
  - Of categorical data with or without latent variables;
  - Of continuous, Gaussian data with or without latent variables.
- Compare graphical features of two
models.
- Find alternative models statistically
equivalent to any given model without latent variables.
- Select variables within a dataset
for classifying values of cases of another variable in the dataset.
- Classify new (or old) cases using
the variables selected in the previous item.
- Assess the accuracy of classification.
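Two items in the list above are easy to confuse: updating (conditioning on an observed value) and predicting the effect of an intervention that fixes a value. The following minimal sketch in plain Python contrasts them on a three-variable categorical model. It does not use Tetrad; the graph (Z -> X, Z -> Y, X -> Y, all binary and all measured), the function names and every probability are assumptions chosen for illustration.

```python
# Assumed model: Z -> X, Z -> Y, X -> Y, all variables binary and measured.
p_z = {0: 0.5, 1: 0.5}                                     # P(Z = z)
p_x_given_z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}   # p_x_given_z[z][x]
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5,
                 (1, 0): 0.3, (1, 1): 0.7}                 # P(Y = 1 | x, z)

def p_y(y, x, z):
    """P(Y = y | X = x, Z = z) read from the table above."""
    p1 = p_y1_given_xz[(x, z)]
    return p1 if y == 1 else 1.0 - p1

def joint(z, x, y):
    """Joint probability from the factorization P(z) P(x | z) P(y | x, z)."""
    return p_z[z] * p_x_given_z[z][x] * p_y(y, x, z)

def update(y, x):
    """Updating: P(Y = y | X = x), where X = x is merely observed."""
    numerator = sum(joint(z, x, y) for z in (0, 1))
    denominator = sum(joint(z, x, yy) for z in (0, 1) for yy in (0, 1))
    return numerator / denominator

def intervene(y, x):
    """Intervention: X is set to x from outside, so the factor P(x | z) is dropped."""
    return sum(p_z[z] * p_y(y, x, z) for z in (0, 1))

print("P(Y=1 | X=1 observed):", round(update(1, 1), 3))
print("P(Y=1 | X set to 1):  ", round(intervene(1, 1), 3))
```

With these made-up numbers the observational probability is 0.62 and the interventional probability is 0.5. The two differ because observing X = 1 also carries information about the common cause Z, whereas fixing X by intervention does not.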