Search Algorithms: Purify

Introduction

Entering Purify parameters

Interpreting the output


Introduction

Purify is one of the three algorithms in Tetrad designed to build pure measurement/structural models (the others are the MIM Build algorithm and the Build Pure Clusters algorithm).

Purify should be used to select indicators of a given measurement model such that the selected indicators form a pure measurement model. In other words, the user specifies a set of clusters of indicators, where each cluster contains indicators of an assumed latent variable. The task of Purify is to discard any indicator that is impure, i.e., that may share additional common causes with other indicators, or that is a direct cause of other indicators.

The Purify algorithm assumes that the population can be described as a measurement/structural model where observed variables are linear indicators of the unknown latents, and that the given measurement model is correct, but perhaps impure. Notice that linearity among latents is not necessary (although it will be necessary for the MIM Build algorithm) and latents do not need to be continuous.

All variables are assumed to be continuous, and therefore the current implementation of the algorithm accepts only continuous data sets as input. For general information about model building algorithms, consult the Search Algorithms page.
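
The purity assumption can be made concrete with a small simulation. The sketch below is not Tetrad code: it draws data from a hypothetical linear model with one latent and four indicators, optionally adds a direct edge X3 → X4 (an impurity of the kind described above), and computes the three tetrad differences of the sample covariance matrix. In a pure one-factor linear model all three differences vanish in the population. The function names, the unit loadings, and the sample size are assumptions chosen for illustration.

```python
import random


def simulate(n, impurity=0.0, seed=0):
    """Draw n samples of four indicators X1..X4 of a single latent.

    `impurity` is the coefficient of a hypothetical direct edge X3 -> X4;
    0.0 gives a pure one-factor measurement model.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.gauss(0, 1)
        x1 = latent + rng.gauss(0, 1)
        x2 = latent + rng.gauss(0, 1)
        x3 = latent + rng.gauss(0, 1)
        x4 = latent + impurity * x3 + rng.gauss(0, 1)
        rows.append((x1, x2, x3, x4))
    return rows


def covariance(rows):
    """Unbiased sample covariance matrix as a list of lists."""
    n = len(rows)
    k = len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(k)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


def tetrad_differences(cov, i, j, k, l):
    """Return the three tetrad differences among variables i, j, k, l."""
    a = cov[i][j] * cov[k][l]
    b = cov[i][k] * cov[j][l]
    c = cov[i][l] * cov[j][k]
    return (a - b, a - c, b - c)


pure = tetrad_differences(covariance(simulate(20000)), 0, 1, 2, 3)
impure = tetrad_differences(covariance(simulate(20000, impurity=0.8)), 0, 1, 2, 3)
print("pure model:  ", pure)
print("impure model:", impure)
```

With `impurity=0.0` all three differences should be close to zero up to sampling error. With a nonzero coefficient, the two differences that involve the covariance of X3 and X4 should move away from zero, while the third should stay near zero.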


Entering Purify parameters

Create a new Search node as described in the Search Algorithms page. To follow this tutorial, use the following graph to generate a simulated continuous data set:

Notice that, in this example, X4, X5 and X7 are in impure relations. Notice also that X4 is not an impurity anymore when X7 is removed, but X5 and X7 cannot be made pure, since they are indicators of two latents.

When the Purify algorithm is chosen from the Search Object combo box, the following window appears:

The parameters that are used by Purify can be specified in this window. The parameters are as follows:

  • depErrorsAlpha value: Purify uses statistical hypothesis tests in order to generate models automatically. The depErrorsAlpha value parameter is the significance level at which these tests accept or reject the constraints that determine the final output. The default value is 0.05, but the user may want to experiment with different depErrorsAlpha values to see how sensitive the results for her data are to this choice.
  • number of clusters: Purify needs a measurement model specified in advance. The measurement model is defined by a set of clusters of variables, where each cluster represents a set of pure indicators of a single latent. In this box, the user specifies how many latents there are in the measurement model, based on prior knowledge. In our example, assuming we know the true measurement model, let's use three clusters.
  • edit cluster assignments: this is identical to the cluster editor of the MIM Build algorithm. Consult its documentation for details. In our example, we should create the following clustering:

  • statistical test: as stated before, automated model building is done by testing statistical hypotheses. Purify provides two basic statistical tests that can be used. Wishart's Tetrad test assumes that the given variables follow a multivariate normal distribution. Bollen's Tetrad test does not make this assumption. However, it needs to compute a matrix of fourth moments, which can be time consuming. It is also less robust against sampling variability than Wishart's test if the data actually follow a multivariate normal distribution.
  • default mode: there are two different strategies used by Purify. In the Impure by default mode, the algorithm does not assume that the user believes the measurement model is pure, and therefore will try to find constraints that guarantee that an indicator is pure with respect to other indicators. If it fails to find a condition by which indicator A is pure with respect to indicator B, then A will be marked as impure with respect to B. In the Pure by default mode, the algorithm assumes that the given measurement model is pure. It will try to find constraints that guarantee that an indicator is impure with respect to other indicators. If it fails to find a condition by which indicator A is impure with respect to indicator B, then A will be marked as pure with respect to B.
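
The difference between the two default modes can be sketched in a few lines. This is an illustration only, not Tetrad's implementation: `mark_pairs` and the `found_condition` callback are hypothetical names, and the callback stands in for whatever statistical tests establish the non-default status of a pair of indicators.

```python
from itertools import combinations


def mark_pairs(indicators, found_condition, impure_by_default):
    """Label every pair of indicators as 'pure' or 'impure'.

    found_condition(a, b) is a hypothetical callback that returns True when
    the tests found a condition establishing the NON-default status:
    evidence of purity in Impure by default mode, evidence of impurity in
    Pure by default mode.
    """
    if impure_by_default:
        default, alternative = "impure", "pure"
    else:
        default, alternative = "pure", "impure"
    return {
        (a, b): alternative if found_condition(a, b) else default
        for a, b in combinations(indicators, 2)
    }


# Hypothetical test outcome: a condition was found only for the pair (X1, X2).
found = lambda a, b: (a, b) == ("X1", "X2")
print(mark_pairs(["X1", "X2", "X3"], found, impure_by_default=True))
print(mark_pairs(["X1", "X2", "X3"], found, impure_by_default=False))
```

The same test outcome leads to opposite labels for the pairs where no condition was found, which is why the choice of mode matters most when the tests are inconclusive.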

Execute the search as explained in the Search Algorithms page.


Interpreting the output

Although a given measurement model may have many different pure submodels, the Purify algorithm has a deterministic output: it discards indicators that violate constraints, following an order determined by the number of constraints that are violated by each indicator. It returns a pure measurement model. In our example, the outcome should be as follows if the sample is representative of the population:
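
One way to picture this removal order is a greedy loop that repeatedly drops the indicator involved in the most violated constraints. The sketch below is an assumption about how such an ordering could work, not a description of Tetrad's actual procedure. The function name `greedy_purify`, the alphabetical tie-breaking rule, and the list of violations are all hypothetical.

```python
from collections import Counter


def greedy_purify(indicators, violations):
    """Remove indicators until no violated constraint remains.

    `violations` is a list of sets, each holding the indicators involved in
    one violated constraint. Ties are broken alphabetically so the output
    is deterministic.
    """
    remaining = set(indicators)
    active = [set(v) for v in violations]
    removed = []
    while active:
        counts = Counter(x for v in active for x in v)
        # Most violations first; alphabetical order among equals.
        worst = min(counts, key=lambda x: (-counts[x], x))
        removed.append(worst)
        remaining.discard(worst)
        # A constraint no longer counts once one of its indicators is gone.
        active = [v for v in active if worst not in v]
    return sorted(remaining), removed


indicators = ["X%d" % i for i in range(1, 10)]
# Hypothetical violated constraints, loosely modeled on the tutorial example.
violations = [{"X4", "X7"}, {"X7", "X8"}, {"X5", "X2"}, {"X5", "X9"}]
kept, removed = greedy_purify(indicators, violations)
print("removed:", removed)
print("kept:   ", kept)
```

In this made-up input, X4 appears in a violated constraint only together with X7, so it survives once X7 has been removed.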

Edges with circles at the endpoints are added only to distinguish latent variables from the indicators. Purify does not make any claims about the causal relationships among latent variables (this is the role of the MIM Build algorithm). The labels given to the latent variables are arbitrary.

Sometimes some latents will not have any indicators left. As an important side note, if a cluster has only two variables, Purify cannot find any condition by which the two indicators in this cluster can be considered pure. If the Impure by default mode is chosen, such indicators will always be removed.

 




