Search Algorithms: MIM Build
Introduction
MIM Build stands for Multiple Indicator Model Build. It is one of
the three algorithms in Tetrad designed to build pure
measurement/structural models (the others are the Build Pure Clusters algorithm
and the Purify algorithm).
MIM Build should be used to learn causal relationships among latent
variables when the measurement model is
given in advance but the structural model is unknown.
The MIM Build algorithm also assumes that the underlying (unknown)
data-generating process is linear. If the user strongly
suspects that the latents or indicators are non-linearly related, MIM Build should not be used. The algorithm also assumes
that the latents do not have other hidden common causes.
All observed
variables are assumed to be continuous, and therefore the current
implementation of the algorithm accepts only continuous data sets as
input. For general information about model building algorithms, consult
the Search
Algorithms page.
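The assumptions above (linear relations, continuous variables, and indicators that each measure a single latent) can be illustrated with a small simulation. This is only a sketch in plain Python, not Tetrad's own simulator: the function name `simulate`, the latent chain L0 -> L1 -> L2, the coefficients and the noise levels are all hypothetical choices made for illustration, and they are not taken from the graph used in this tutorial.

```python
import random


def simulate(n_rows, seed=0):
    """Draw rows from a linear model with 3 latents and 9 pure indicators.

    Hypothetical structure (an assumption for illustration only):
    L0 -> L1 -> L2, with X1-X3 measuring L0, X4-X6 measuring L1
    and X7-X9 measuring L2.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        # Structural model: every latent is a linear function of its
        # parent plus independent Gaussian noise.
        l0 = rng.gauss(0.0, 1.0)
        l1 = 0.8 * l0 + rng.gauss(0.0, 1.0)
        l2 = 0.8 * l1 + rng.gauss(0.0, 1.0)
        latents = [l0, l1, l2]

        # Measurement model: each indicator has exactly one latent parent
        # and its own independent error term (a "pure" indicator).
        row = {}
        for i in range(9):
            parent = latents[i // 3]
            row[f"X{i + 1}"] = 1.0 * parent + rng.gauss(0.0, 0.5)
        rows.append(row)
    return rows


if __name__ == "__main__":
    data = simulate(5, seed=42)
    for row in data:
        print(", ".join(f"{name}={value:.2f}" for name, value in row.items()))
```

Only the observed columns X1 to X9 are returned; the latents are discarded, which mirrors the situation MIM Build is designed for, where the latent variables are never measured directly.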
Create a new Search node as
described in the Search
Algorithms page. To follow this tutorial,
use the following graph to generate a simulated continuous data set:
When the MIM Build algorithm is chosen from the Search box, a window
appears for specifying search parameters.
The parameters that are used by MIM Build can be specified in this
window. The parameters are as follows:
- depErrorsAlpha value: if you choose the PC search in the
"Choice of algorithm" combo box, MIM Build uses statistical hypothesis
tests to generate models automatically. The depErrorsAlpha value
parameter is the significance level at which such tests accept
or reject the constraints that compose the final output. The default value
is 0.05, but the user may want to experiment with different depErrorsAlpha
values to see how sensitive the output is to this setting for the
data at hand.
- number of clusters: MIM Build needs a pure
measurement model specified in advance. The measurement model is
defined by a set of clusters of variables, where each cluster
represents a set of pure indicators of a single latent. In this box,
the user specifies how many latents there are in the measurement model
based on prior knowledge. In our example, let's use three clusters.
- edit cluster assignments: once the number of
latents is specified, the user should now determine which variables in
the data set should be clustered together. When this button is clicked,
the following dialog box appears:
In this example, we want to enter the measurement model that we
know is the correct one by assumption. In other words, variables X1, X2
and X3 should be clustered together, since they are pure indicators of
the same latent. Variables X4, X5 and X6 form another cluster, and the
same holds for X7, X8 and X9. To assign a variable to a cluster,
click the respective combo box and choose the cluster that shows
up in the list. For example, click the X4 combo box and choose Cluster
1. Do the same for X5 and X6. For variables X7, X8 and X9, choose
Cluster 2. The final outcome should be as follows:
- algorithm: MIM Build is actually a family of
algorithms for the problem of learning structural models. Currently, we
offer two alternatives, both corresponding to search algorithms for the
case with no latent variables: GES and PC. The PC
version can be slower and less robust than GES, but it can help
indicate whether the assumption of no extra hidden common causes among the
latents holds (the appearance of double-directed edges suggests that
the assumption may be violated).
- view background knowledge: this button gives
access to a background knowledge editor
that is analogous to the one used in most search algorithms, but with
one difference: instead of entering background knowledge about observed
variables (in the MIM Build case, all background knowledge about observed
variables boils down to the specification of a measurement model), the
user here enters prior knowledge about causal relations of latent
variables. Latents are denoted by the label _Lx, where x
is the number of the respective cluster. In our example, the latent
parent of X7, X8 and X9 is referred to as _L2. Note: use of
background knowledge is not implemented for GES yet.
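The cluster assignment and latent labels described in the parameters above can be pictured as a simple mapping. The sketch below is plain Python, not Tetrad code; the names `assignment`, `clusters_from_assignment` and `latent_label` are hypothetical. It assumes clusters are numbered from 0, so that X1, X2 and X3 belong to Cluster 0, which is what the tutorial's use of Cluster 1 and Cluster 2 for the other two groups suggests.

```python
# Variable -> cluster index, following the tutorial's example.
# Assumption: cluster numbering starts at 0.
assignment = {
    "X1": 0, "X2": 0, "X3": 0,
    "X4": 1, "X5": 1, "X6": 1,
    "X7": 2, "X8": 2, "X9": 2,
}


def clusters_from_assignment(assignment, n_clusters):
    """Group variable names by cluster index, rejecting invalid indices."""
    clusters = [[] for _ in range(n_clusters)]
    for variable, index in assignment.items():
        if not 0 <= index < n_clusters:
            raise ValueError(f"{variable} assigned to unknown cluster {index}")
        clusters[index].append(variable)
    return clusters


def latent_label(index):
    """Label of the latent behind a cluster, in the _Lx form used above."""
    return f"_L{index}"


clusters = clusters_from_assignment(assignment, n_clusters=3)

# Each observed variable mapped to the label of its single latent parent.
latent_of = {var: latent_label(idx) for var, idx in assignment.items()}

if __name__ == "__main__":
    for index, members in enumerate(clusters):
        print(latent_label(index), "->", ", ".join(members))
```

Background knowledge about the structural model would then be expressed over the labels in `latent_of` (for example `_L2`) rather than over the observed variables themselves.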
Execute the search as explained in the Search Algorithms page.
MIM Build returns a CPDAG over latent
variables that is completely analogous to the one produced by a PC Search or a GES Search. The
same interpretation used in such algorithms can be applied to MIM Build
output.