Search Algorithms: FCI

The FCI algorithm is designed to search for causal explanations of observational or mixed observational and
    experimental data in which it may be assumed that the true causal graph is acyclic, but there may be unrecorded
    (hidden, latent) common causes of variables in the data set, or in which there may be sample selection bias. Sample
    selection bias occurs when the values of two or more recorded variables influence the probability that a unit is
    sampled. (It is also assumed that no relationship between variables in the data is deterministic; see PCD.)
The algorithm operates by asking a conditional independence oracle to make judgements about the independence of pairs
    of variables (e.g., X, Z) conditional on sets of variables (e.g., {Y}). Conditional independence tests are available
    for datasets that consist either entirely of continuous variables or entirely of discrete variables; hence, datasets
    of these types can be used as input to the algorithm. To see how the algorithm behaves in the ideal case, when
    independence tests always give correct answers, one may also use a DAG as input to the algorithm, in which case
    graphical m-separation is substituted for an actual independence test.
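
As an illustration of the DAG-as-oracle idea, the following minimal Python sketch answers independence queries from a graph rather than from data. It assumes the DAG is encoded as a dictionary mapping each node to the set of its parents (a hypothetical encoding, not Tetrad's), and it checks d-separation, with which m-separation coincides on a DAG. The function names are illustrative only.

```python
from itertools import combinations


def ancestors_of(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def d_separated(parents, x, y, z):
    """True if x and y are d-separated given the set z in the DAG.

    Uses the moralization criterion: restrict to the ancestors of
    {x, y} and z, connect co-parents, drop edge directions, delete z,
    and test whether x can still reach y.
    """
    z = set(z)
    keep = ancestors_of(parents, {x, y} | z)
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:                       # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):   # "marry" parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    seen, stack = {x}, [x]
    while stack:                           # search that never enters z
        n = stack.pop()
        if n == y:
            return False
        for m in adj[n]:
            if m not in seen and m not in z:
                seen.add(m)
                stack.append(m)
    return True


chain = {"Y": {"X"}, "Z": {"Y"}}           # X --> Y --> Z
collider = {"Y": {"X", "Z"}}               # X --> Y <-- Z
print(d_separated(chain, "X", "Z", {"Y"}))     # True
print(d_separated(collider, "X", "Z", {"Y"}))  # False
```

In the chain, conditioning on Y separates X from Z; in the collider, conditioning on Y connects them. This is the kind of answer an ideal independence test would give.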
In the case where a continuous dataset is used as input, the available conditional independence tests assume that the
    direct causal influence of any variable on any other is linear and that the distribution of each variable is
    Normal. 
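
A common test under these linear, Normal assumptions is Fisher's Z test of a partial correlation. The sketch below is not taken from Tetrad; it is a minimal standard-library version that assumes a single conditioning variable, and the function names and the default significance level of 0.05 are choices made here for illustration.

```python
import math
from statistics import NormalDist


def pearson(a, b):
    """Sample Pearson correlation of two equal-length, non-constant lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def partial_corr(x, z, y):
    """Correlation of x and z after linearly removing y from both."""
    r_xz, r_xy, r_zy = pearson(x, z), pearson(x, y), pearson(z, y)
    return (r_xz - r_xy * r_zy) / math.sqrt((1 - r_xy ** 2) * (1 - r_zy ** 2))


def fisher_z_independent(x, z, y, alpha=0.05):
    """Return (judged_independent, p_value) for x and z given y."""
    n = len(x)
    if n <= 4:
        raise ValueError("need more than 4 samples with one conditioning variable")
    r = partial_corr(x, z, y)
    r = max(min(r, 0.999999), -0.999999)       # keep the log finite
    fisher_z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - 1 - 3) * abs(fisher_z)  # n - |S| - 3 with |S| = 1
    p_value = 2 * (1 - NormalDist().cdf(stat))
    return p_value > alpha, p_value
```

The test judges the pair independent when the p-value exceeds alpha, so a smaller alpha removes more edges during the search.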
Some of the above assumptions are not testable using observational data. They should come from
    prior knowledge or partial experiments.

    

    FCI is operated by the user exactly as PC is. The differences are in
    the interpretation of the output. The output of FCI is a partial ancestral graph (PAG). It gives
    partial information about which variables are or are not direct or
    indirect causes and effects of other variables.

    

    An edge between two variables in the output, however the ends of the
    edge are marked, indicates that there is a causal pathway (a direct
    cause in one direction or the other, or a common cause) connecting the
    two variables that does not contain any other observed variable. It
    does not necessarily mean that in the true causal graph the connected
    variables have a direct causal connection. An edge of any kind between
    two measured variables implies that the variables are not independent
    conditional on any set of measured variables.



    If there is an edge from X to Y whose X end is unmarked (the tail of an
    arrow), then X is a cause of Y. X may not, however, be a direct cause of Y.

    

    If there is an edge from X to Y that has an arrowhead directed into Y,
    then Y is not a cause--not an ancestor--of X.

    

    If there is an edge with two arrowheads connecting X and Y, then there
    is an unrecorded common cause of X and Y.

    

    If an edge end is marked with an "o", the algorithm cannot
    determine whether there should or should not be an arrowhead at that
    edge end.
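
The reading rules above can be collected into a small lookup. The following Python sketch is only an illustration: the mark names "tail", "arrow" and "circle" and the function interpret_edge are hypothetical, not part of Tetrad.

```python
def interpret_edge(x, y, mark_at_x, mark_at_y):
    """Read one PAG edge between x and y, given the mark at each end.

    Marks are "tail", "arrow" or "circle" (an encoding assumed here).
    """
    notes = []
    ends = ((x, y, mark_at_x, mark_at_y), (y, x, mark_at_y, mark_at_x))
    for near, far, near_mark, far_mark in ends:
        if near_mark == "tail" and far_mark == "arrow":
            notes.append(f"{near} is a cause of {far}, possibly indirect")
        if near_mark == "arrow":
            notes.append(f"{near} is not a cause of {far}")
        if near_mark == "circle":
            notes.append(f"undetermined whether there is an arrowhead at {near}")
    if mark_at_x == "arrow" and mark_at_y == "arrow":
        notes.append(f"{x} and {y} have an unrecorded common cause")
    return notes


print(interpret_edge("X", "Y", "tail", "arrow"))
# ['X is a cause of Y, possibly indirect', 'Y is not a cause of X']
```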

    

    

    Here is pseudocode for the implementation of the FCI algorithm used in Tetrad:
Given: Independence test I over variables v1,...,vn.


Step A:


Form new empty PAG G with variables from I. Fully connect G using 
unoriented (o-o) edges.


Step B:


Run a Fast Adjacency Search on G using I.
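
The Fast Adjacency Search is not spelled out on this page. As a rough sketch only, assuming it behaves like the adjacency phase of PC (start from the complete graph of Step A and remove an edge as soon as some conditioning set drawn from the current neighbours makes the pair independent), Steps A and B might look as follows in Python. The names adjacency_search, independent and sepset are illustrative, and independent(x, y, s) stands for whatever test or oracle plays the role of I.

```python
from itertools import combinations


def adjacency_search(nodes, independent):
    """Return (adj, sepset) after a PC-style adjacency search."""
    # Step A: fully connect the graph.
    adj = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    depth = 0
    while True:
        tested_any = False
        for x in nodes:
            for y in sorted(adj[x]):            # sorted() copies, so removal is safe
                others = sorted(adj[x] - {y})
                if len(others) < depth:
                    continue                    # not enough neighbours at this depth
                tested_any = True
                for s in combinations(others, depth):
                    if independent(x, y, set(s)):
                        # Step B: drop the edge and remember what separated the pair.
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(s)
                        break
        if not tested_any:
            break
        depth += 1
    return adj, sepset


# Toy oracle for the chain X --> Y --> Z: only X and Z given Y are independent.
def chain_oracle(a, b, s):
    return {a, b} == {"X", "Z"} and "Y" in s


adj, sepset = adjacency_search(["X", "Y", "Z"], chain_oracle)
print(adj)  # {'X': {'Y'}, 'Y': {'X', 'Z'}, 'Z': {'Y'}} (set element order may vary)
```
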


Step C:


Orient colliders in G, as follows:

For all nodes B:
 For each pair of nodes A,C adjacent to B:
   If A and C are not adjacent:
      If A and C are d-connected conditional on B:
         Orient A-->B<--C as a collider.

Step D:

Form a Sepset matrix using a possible d-sep search.
Then reorient all edges as unoriented.

Step CI C:

Orient unshielded triples, as follows:

For all nodes B:
 For each pair of nodes A,C adjacent to B:
    If A and C are not adjacent:
       If A and C are d-connected conditional on B:
          Orient A-->B<--C as a collider.
       Else:
         Do nothing (effectively marking A---B---C as a noncollider)
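
A common way to carry out the check in this step is to consult the separating set recorded when the edge between A and C was removed: if B is not in that set, the triple is oriented as a collider. The Python sketch below assumes that approach, and assumes a hypothetical encoding in which marks[(a, b)] is the mark at the b end of the edge between a and b. It is an illustration, not Tetrad's code.

```python
from itertools import combinations


def orient_colliders(adj, sepset):
    """Orient unshielded colliders; return the endpoint marks."""
    # Every edge starts unoriented: a circle at both ends (o-o).
    marks = {(a, b): "circle" for a in adj for b in adj[a]}
    for b in adj:
        for a, c in combinations(sorted(adj[b]), 2):
            if c in adj[a]:
                continue                      # shielded triple, skip
            # A pair with no recorded separating set is treated as if
            # its set were empty (a simplification made here).
            if b not in sepset.get(frozenset((a, c)), set()):
                marks[(a, b)] = "arrow"       # arrowhead at B on the A-B edge
                marks[(c, b)] = "arrow"       # arrowhead at B on the C-B edge
            # Otherwise do nothing: the triple is an implicit noncollider.
    return marks


adj = {"X": {"Y"}, "Y": {"X", "Z"}, "Z": {"Y"}}
collider_marks = orient_colliders(adj, {frozenset(("X", "Z")): set()})
chain_marks = orient_colliders(adj, {frozenset(("X", "Z")): {"Y"}})
print(collider_marks[("X", "Y")], chain_marks[("X", "Y")])  # arrow circle
```
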

Step CI D:

Apply orientation rules until no more orientations are possible.

Rules to use: double triangle, discriminating paths, away from collider, away 
from ancestor, away from cycle.

Definitions of Orientation Rules:

Double triangle rule:

If D*-oB, A*->B<-*C and A*-*D*-*C is a noncollider, then D*->B.

For all nodes B:
 possible A: nodes into B with arrow
 possible C: nodes into B with arrow
 possible D: nodes into B with circle

 For all possible D:
    For all possible A:
       For all possible C:
          If A != C and required conditions hold:
             Orient D*->B.


Discriminating paths rule:

The triangles that must be oriented this way (won't be done by another rule) 
all look like the ones below, where the dots are a collider path from L to A 
with each node on the path (except L) a parent of C.

             B
            xo           x is either an arrowhead or a circle
           /  \
          v    v
    L....A --> C



For all nodes B:
 possible A: nodes out from B with arrow and into B with arrow or circle.
 possible C: nodes out from B with arrow and into B with circle.

 For all possible A:
    For all possible C:
       If A is a parent of C:
          Find a collider path back from A.
          If path found and if path endpoint is d-separated from C conditional on B:
             Set C<--B.
          Else:
             Set A<->B and B<->C.


Away from collider rule:

If A*->Bo-oC and not A*-*C, then orient A*->B-->C. (Orient either circle 
if present, don't need both.)


Away from ancestor rule:

If A*-oC and either A-->B*->C or A*->B-->C, then orient A*->C.


Away from cycle rule:

If Ao->C and A-->B-->C, then orient A-->C.
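
To make the "apply until no more orientations are possible" loop concrete, here is a minimal Python sketch of the away from cycle rule applied to a fixed point. It assumes a hypothetical encoding in which marks[(a, b)] holds the mark ("tail", "arrow" or "circle") at the b end of the edge between a and b; the function names are illustrative and the other rules are omitted.

```python
def is_directed(marks, a, b):
    """True if the edge between a and b is a --> b (tail at a, arrowhead at b)."""
    return marks.get((a, b)) == "arrow" and marks.get((b, a)) == "tail"


def away_from_cycle(marks):
    """If A o-> C and A --> B --> C, orient A --> C; repeat until stable."""
    nodes = sorted({n for edge in marks for n in edge})
    changed = True
    while changed:
        changed = False
        for a in nodes:
            for c in nodes:
                # Look for A o-> C: arrowhead at C, circle at A.
                if marks.get((a, c)) != "arrow" or marks.get((c, a)) != "circle":
                    continue
                if any(is_directed(marks, a, b) and is_directed(marks, b, c)
                       for b in nodes):
                    marks[(c, a)] = "tail"   # the circle at A becomes a tail
                    changed = True
    return marks


marks = {
    ("A", "B"): "arrow", ("B", "A"): "tail",    # A --> B
    ("B", "C"): "arrow", ("C", "B"): "tail",    # B --> C
    ("A", "C"): "arrow", ("C", "A"): "circle",  # A o-> C
}
away_from_cycle(marks)
print(marks[("C", "A")])  # tail
```

A full implementation would run every rule inside the same loop, since an orientation made by one rule can enable another.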



Note: Zhang (2006) supplies an orientation rule set for FCI that is both arrow-complete and tail-complete; this
    is not currently implemented.