


    



    
        Dirichlet Bayes Instantiated Model 
    

Description of the Model
A Dirichlet Bayes Instantiated Model (Dirichlet Bayes IM) is an alternative to a Bayes
    Instantiated Model that represents the distribution over each row in each conditional probability table as a
    Dirichlet distribution. A Dirichlet distribution is a probability distribution over the parameters of a multinomial
    distribution. That is, it is a probability distribution over the probabilities of a list of "cells" that
    are (a) mutually exclusive and (b) have probabilities summing to 1.0. Each row in a conditional probability table
    satisfies this criterion, so we build an alternative Bayes net representation row by row out of distributions
    defined in this way.
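For reference, the standard density of a Dirichlet distribution over one row can be written as follows. The symbols are notation introduced here for illustration and do not appear elsewhere in this manual: K is the number of cells in the row, p_i is the probability of cell i, and alpha_i is the Dirichlet parameter for cell i.

```latex
f(p_1, \dots, p_K ; \alpha_1, \dots, \alpha_K)
  = \frac{\Gamma\!\left(\sum_{i=1}^{K} \alpha_i\right)}{\prod_{i=1}^{K} \Gamma(\alpha_i)}
    \prod_{i=1}^{K} p_i^{\alpha_i - 1},
\qquad p_i \ge 0, \quad \sum_{i=1}^{K} p_i = 1, \quad \alpha_i > 0 .
```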
A specific Dirichlet model for a list of cells is given by specifying a Dirichlet parameter for each cell.
    The maximum likelihood probability for each cell is then the Dirichlet parameter for that cell
    divided by the sum of the Dirichlet parameters for all of the cells in the list. In the simple case, these Dirichlet
    parameters will just be cell counts, considered as real numbers. We will therefore usually refer to
    Dirichlet parameters as pseudocounts. In the more general case, these pseudocounts can in fact be any
    positive real numbers.
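The ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the program's own code; the function name row_probabilities and the example row are hypothetical.

```python
def row_probabilities(pseudocounts):
    """Divide each pseudocount in one table row by the row total.

    Hypothetical helper for illustration; not part of the program's API.
    """
    if any(c <= 0 for c in pseudocounts):
        raise ValueError("pseudocounts must be positive real numbers")
    total = sum(pseudocounts)
    return [c / total for c in pseudocounts]


# One row with three cells and pseudocounts 2.0, 1.0 and 1.0 (row total 4.0).
row = [2.0, 1.0, 1.0]
probs = row_probabilities(row)
print(probs)  # [0.5, 0.25, 0.25]
```

The same ratio is also the mean of the Dirichlet distribution for that row, so the displayed probabilities always sum to 1.0 within a row.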
A Dirichlet Bayes Instantiated Model is constructed using a Bayes Parametric Model,
    just like a Bayes Instantiated Model. The main differences are:

    Instead of conditional probability tables, the Dirichlet Bayes Instantiated Model contains tables of
        pseudocounts, as described above.
    
     Dirichlet Bayes Instantiated Models are used as inputs to a Dirichlet
        Estimator, rather than as inputs to an ML Bayes Estimator.
    



    How to Construct a Dirichlet Bayes IM
To construct a Dirichlet Bayes IM, first construct a DAG, then a Bayes PM, and then add an IM box to the workspace,
    with an arrow from the Bayes PM to the IM.

Fill in the Graph box and the PM box, as explained in Bayes Parameterized Model.
    For instance, you might end up with a graph that looks like this (the categories for X1 are shown).

Now double-click the IM box. You get a choice of models; choose Dirichlet Bayes Instantiated Model:

When you click OK, you are offered a choice. You may either initialize the parameters of your Dirichlet Bayes net
    manually (i.e., fill them in one by one, by hand), or fill them in randomly.

We choose "Manually." We now get a dialog that looks like the following:

There are two tabs in the dialog that comes up next, "Probabilities" and "Pseudocounts." Let us
    consider "Pseudocounts" first. Pseudocounts are displayed in tables, one for each variable, with the same
    structure as conditional probability tables in Bayes IMs. Each pseudocount is a positive real number; in this case
    they are all initialized to 1.0. The sum of the pseudocounts in each row is shown in the rightmost column.
Turning now to the "Probabilities" tab, we have a table in the form of a conditional probability table that
    displays maximum likelihood probabilities for each cell of each Dirichlet distribution (row) in the model. These
    probabilities are calculated by dividing each pseudocount value in the previous display by the sum of pseudocounts
    in that row. In order not to lose information, the total count for each row is displayed in the "Probabilities"
    tab as well. To recover pseudocounts, simply multiply the probability of a cell by the "total count" in
    the rightmost column.
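The round trip between the two tabs can be sketched as follows. This is a minimal illustration under assumed names (to_probability_row and to_pseudocount_row are hypothetical, not functions of the program); it only shows that keeping the row total alongside the probabilities is enough to recover the pseudocounts.

```python
def to_probability_row(pseudocounts):
    """Return (probabilities, total count) for one row of pseudocounts."""
    total = sum(pseudocounts)
    return [c / total for c in pseudocounts], total


def to_pseudocount_row(probabilities, total_count):
    """Recover pseudocounts by multiplying each probability by the total count."""
    return [p * total_count for p in probabilities]


# Hypothetical row with pseudocounts 3.0, 1.0 and 4.0 (row total 8.0).
pseudocounts = [3.0, 1.0, 4.0]
probs, total = to_probability_row(pseudocounts)
recovered = to_pseudocount_row(probs, total)

print(probs, total)  # probabilities plus the "total count" column
print(recovered)     # matches the original pseudocounts up to rounding
```

Without the total count, the rows [3.0, 1.0, 4.0] and [6.0, 2.0, 8.0] would display identical probabilities, which is why the total is shown on both tabs.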

[Note: the right-click popup menus for randomization do not yet behave correctly. This needs to be fixed.]
 
Old text:


    If you choose a Dirichlet Bayes Instantiated Model, you will be putting an initial (or prior) Dirichlet probability
    distribution over the conditional probability of each value of each variable conditional on the values of its
    parent variables: a probability distribution over conditional probability distributions. The probability
    distribution over the conditional probability distributions implies an "all probabilities considered" probability
    for each value of each variable conditional on its parents' values. Such Dirichlet distributions can be specified by
    pseudocounts, essentially a kind of fictional database. A Dirichlet Bayes IM may be set up manually (all values
    set by hand) or set up automatically as a symmetric prior in which all pseudocounts for all cells are set to a
    single value that you specify. Such Dirichlet distributions are called "symmetric" because, with such a choice of
    pseudocounts, the distribution function itself is symmetric with respect to permutation of the variables.
    (If all pseudocounts are set to 1.0, the distribution function is completely flat and
    therefore uninformative. If all pseudocounts are set to 0.5, the resulting distribution is known as a Jeffreys prior
    and has connections to information theory.)
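A symmetric prior of this kind can be sketched as a table in which every cell holds the same pseudocount. The function name and its parameters below are hypothetical, chosen only for illustration; the program's own data structures may differ.

```python
def symmetric_pseudocount_table(num_rows, num_categories, value=1.0):
    """Build a pseudocount table with every cell set to the same value.

    num_rows is the number of parent-value combinations and num_categories
    is the number of values of the variable (both hypothetical parameters).
    """
    if value <= 0:
        raise ValueError("pseudocounts must be positive real numbers")
    # Build each row separately so that editing one row leaves the others alone.
    return [[value] * num_categories for _ in range(num_rows)]


# A variable with 3 categories and 4 parent-value combinations,
# with every pseudocount set to 0.5.
table = symmetric_pseudocount_table(num_rows=4, num_categories=3, value=0.5)
row_totals = [sum(row) for row in table]
print(table[0])    # [0.5, 0.5, 0.5]
print(row_totals)  # [1.5, 1.5, 1.5, 1.5]
```

Whatever value is chosen, every cell in a row has the same probability (here one third); the value only changes the row total, that is, how much weight the prior carries.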