All Downloads are FREE. Search and download functionalities are using the official Maven repository.

docs.javahelp.manual.boxes.search.fci.html Maven / Gradle / Ivy

The newest version!



    Search Algorithms: FCI
    


Search Algorithms: FCI

The FCI algorithm is designed to search for causal explanations of observational or mixed observational and experimental data in which is may be assumed that the true causal graph is acyclic, but there may be unrecorded (hidden, latent) common causes of variables in the data set, or in which there may be sample selection bias. Sample selection bias occurs when the values of two or more recorded variables influence the probability that a unit is sampled. (It is also assumed that no relationship between variables in the data is deterministic--see PCD.)

The algorithm operates by asking a conditional independence oracle to make judgements about the independence of pairs of variables (e.g., X, Z) conditional on sets of variables (e.g., {Y}). Conditional indepedence tests are available for datasets that consist either entirely of continuous variables or entirely of discrete variables; hence, datasets of these types can be used as input to the algorithm. As a way of getting one's head around how the algorithm should behave in the ideal, when independence tests always give correct answers, one may also use a DAG as an input to the algorithm, in which case graphical m-separation will be substituted for an actual independence test.

In the case where a continuous dataset is used as input, the available conditional independence tests assume that the direct causal influence of any variable on any other is linear and that the distribution of each variable is Normal.

Some of the above assumptions are not testable using observational data. They should come from prior knowledge or partial experiments.

FCI is operated by the user exactly as is PC. The differences are in the interpretation of the output. The output of FCI is a partial ancestral graph (PAG). It gives partial information about which variables are or are not drect or indirect causes and effects of other variables.

An edge between two variables in the output, however the ends of the edge are marked, indicates that there is a causal pathway--a direct cause in one direction or the other or a common cause--connecting the two variables, that does not contain any other observed variable. It does not necessarily mean that in the true causal graph, the connected variables have a direct causal connection.  An edge of any kind between two measured variables implies that the variables are not independent conditional on any set of measured variables.

If there is a edge from X to Y that is unmarked--a tail of an arrow-- then X is a cause of Y.  X may not, however, be a direct cause of Y.

If there is an edge from X to Y that has an arrowhead directed into Y, then Y is not a cause--not an ancestor--of X.

If there is an edge with two arrowheads connecting X and Y, then there is an unrecorded common cause of X and Y

If an edge end  is marked with an "o" the algorithm cannot determine whether there should or should not be an arrowhead at that edge end.


Here is pseudocode for the implementation of the FCI algorithm used in Tetrad:

Given: Independence test I over variables v1,...,vn.


Step A:


Form new empty PAG G with variables from I. Fully connect G using 
unoriented (o-o) edges.


Step B:


Run a Fast Adjacency Search on G using I.


Step C:


Orient colliderDiscovery in G, as follows:

For all nodes B:
 For each pair of nodes A,C adjacent to B:
   If A and C are not adjacent:
      If A and C are d-connected conditional on B:
         Orient A-->B<--C as a collider.

Step D:

Form a Sepset matrix using a possible d-sep search.
Then reorient all edges as unoriented.

Step CI C:

Orient unshielded triples, as follows:

For all nodes B:
 For each pair of nodes A,C adjacent to B:
    If A and C are not adjacent:
       If A and C are d-connected conditional on B:
          Orient A-->B<--C as a collider.
       Else:
         Do nothing (effectively marking A---B---C as a noncollider)

Step CI D:

Apply orientation rules until no more orientations are possible.

Rules to use: double triangle, discriminating paths, away from collider, away 
from ancestor, away from cycle.

Definitions of Orientation Rules:

Double triangle rule:

If D*-oB, A*->B<-**C and A**-**D**-**C is a noncollider, then D**->B.

For all nodes B:
 possible A: nodes into B with arrow
 possible C: nodes into B with arrow
 possible D: nodes into B with circle

 For all possible D:
    For all possible A:
       For all possible C:
          If A != C and required conditions hold:
             Orient D*->B.


Discriminating paths rule:

The triangles that must be oriented this way (won't be done by another rule) 
all look like the ones below, where the dots are a collider path from L to A 
with each modelNode on the path (except L) a parent of C.

             B

            xo           x is either an arrowhead or a circle

           /  \

          v    v
    L....A --> C



For all nodes B
 possible A: nodes out from B with arrow and into B with arrow or circle.
 possible C: nodes out from B with arrow and into B with circle.

 For all possible A:
    For all possible C:
       If A is a parent of C:
          Find a collider path back from A.
          If path found and if path endpoint is d-sep from C conditional on B:
             Set C<--B.
          else,
             Set A<->B and B<->C.


Away from collider rule:

If A*->Bo-oC and not A*-**C, then orient A**->B-->C. (Orient either circle 
if present, don't need both.)


Away from ancestor rule:

If A*-oC and either A-->B*->C or A*->B-->C, then orient A*->C.


Away from cycle rule:

If Ao->C and A-->B-->C, then orient A-->C.


Pseudocode for FCI:

Given: Independence test I over variables v1,...,vn.

Step A:

Form new empty PAG G with variables from I. Fully connect G using 
unoriented (o-o) edges.

Step B:

Run a Fast Adjacency Search on G using I.

Step C:

Orient colliderDiscovery in G, as follows:

For all nodes B:
 For each pair of nodes A,C adjacent to B:
   If A and C are not adjacent:
      If A and C are d-connected conditional on B:
         Orient A-->B<--C as a collider.

Step D:

Form a Sepset matrix using a possible d-sep search.
Then reorient all edges as unoriented.

Step CI C:

Orient unshielded triples, as follows:

For all nodes B:
 For each pair of nodes A,C adjacent to B:
    If A and C are not adjacent:
       If A and C are d-connected conditional on B:
          Orient A-->B<--C as a collider.
       Else:
         Do nothing (effectively marking A---B---C as a noncollider)

Step CI D:

Apply orientation rules until no more orientations are possible.

Rules to use: double triangle, discriminating paths, away from collider, 
away from ancestor, away from cycle.


Definitions of Orientation Rules:

Double triangle rule:

If D*-oB, A*->B<-**C and A**-**D**-**C is a noncollider, then D**->B.

For all nodes B:
 possible A: nodes into B with arrow
 possible C: nodes into B with arrow
 possible D: nodes into B with circle

 For all possible D:
    For all possible A:
       For all possible C:
          If A != C and required conditions hold:
             Orient D*->B.


Discriminating paths rule:

The triangles that must be oriented this way (won't be done by another rule) all 
look like the ones below, where the dots are a collider path from L to A with each 
modelNode on the path (except L) a parent of C.

             B

            xo           x is either an arrowhead or a circle

           /  \

          v    v

    L....A --> C



For all nodes B
 possible A: nodes out from B with arrow and into B with arrow or circle.
 possible C: nodes out from B with arrow and into B with circle.

 For all possible A:
    For all possible C:
       If A is a parent of C:
          Find a collider path back from A.
          If path found and if path endpoint is d-sep from C conditional on B:
             Set C<--B.
          else,
             Set A<->B and B<->C.


Away from collider rule:

If A*->Bo-oC and not A*-**C, then orient A**->B-->C. (Orient either circle if 
present, don't need both.)

Away from ancestor rule:

If A*-oC and either A-->B*->C or A*->B-->C, then orient A*->C.

Away from cycle rule:

If Ao->C and A-->B-->C, then orient A-->C. 

Note: Zhang (2006) supplies an orientation rule set for FCI that is both arrow-complete and tail-complete; this is not currently implemented.





© 2015 - 2024 Weber Informatics LLC | Privacy Policy