docs.javahelp.manual.boxes.search.svarfci.html Maven / Gradle / Ivy
Search Algorithms: SvarFCI
Search Algorithms: svarFCI
The svarFCI algorithm is a version of FCI for time series data. See the FCI documentation for a
description of the FCI algorithm, which allows for unmeasured (hidden, latent) variables in the
data-generating process and produces a PAG (partial ancestral graph).
svarFCI takes as input a time lag data set, i.e., a data set which includes time series observations
of variables X1, X2, X3, ..., and their lags X1:1, X2:1, X3:1, ..., X1:2, X2:2, X3:2, ... and so on.
X1:n is the nth-lag of the variable X1. To create a time lag data set from a standard tabular data
set (i.e., a matrix of observations of X1, X2, X3, ...), use the create time lag data function in
the data manipulation toolbox. The user will be prompted to specify the number of lags (n), and a
new data set will be created with the above naming convention. The new sample size will be
equal to the original sample size minus n.
Just like the FCI algorithm, SvarFCI operates by checking conditional independence facts. So, the
algorithm can handle continuous or discrete data and Gaussian or non-Gaussian distributions
according to the independence test selected. Just as with FCI, the user must choose a value for
the tuning parameter depErrorsAlpha which determines a cut-off threshold for conditional independence
tests.
The difference between SvarFCI and FCI is that SvarFCI will automatically respect the time order of
the variables and impose a repeating structure. Firstly, it puts lagged variables in appropriate tiers
so, e.g., X3:2 can cause X3:1 and X3 but X3:1 cannot cause X3:2 and X3 cannot cause either
X3:1 or X3:2. Also, it will assume that the causal structure is the same across time, so that if X1
is independent of X2 given X3:1, then also X1:1 is independent of X2:1 given X3:2, and so on
for additional lags if they exist. When some edge is removed as the result of a conditional
independence test, all similar (or homologous) edges are also removed. Note that this
implementation does not carry out the independence tests for homologous sets of variables, it
just assumes that the appropriate lagged independencies hold and removes appropriate edges. (If
X1 is independent of X2 given X3:1, but X1:1 is not independent of X2:1 given X3:3 due to
finite sample effects at some cut-off threshold, this discrepancy will not be detected by the
algorithm: the edge between X1:1 and X2:1 would be removed.)
See Enter and Hoyer (2010) for reference. Note that Enter and Hoyer assume no causal
connections among contemporaneous variables, i.e., variables at the same lag; SvarFCI as
implemented here does allow for contemporaneous causal connections (unless explicitly
forbidden by user-provided background knowledge).
References:
- Entner, D., & Hoyer, P. O. (2010). On causal discovery from time series data using FCI.
- Proceedings of the Fifth European Workshop on Probabilistic Graphical Models, 121-128.