hirdparty.forester.1.039.source-code.phyloxml.xsd

Applications and software libraries for evolutionary biology and comparative genomics research





















   
       phyloXML is an XML language to describe evolutionary trees and associated data. Version: 1.10.
         License: dual-licensed under the LGPL or Ruby's License. Copyright (c) 2008-2011 Christian M Zmasek.
   
   
   
   
   
      
          'phyloxml' is the name of the root element. Phyloxml contains an arbitrary number of
            'phylogeny' elements (each representing one phylogeny) possibly followed by elements from other namespaces.
         
      
      
         
         
      
   
   
   
      
          Element Phylogeny is used to represent a phylogeny. The required attribute 'rooted' is used
            to indicate whether the phylogeny is rooted or not. The attribute 'rerootable' can be used to indicate that
            the phylogeny is not allowed to be rooted differently (e.g. because it is associated with root-dependent
            data, such as gene duplications). The attribute 'type' can be used to indicate the type of phylogeny (e.g.
            'gene tree'). It is recommended to use the attribute 'branch_length_unit' if the phylogeny has branch
            lengths. Element clade is used in a recursive manner to describe the topology of a phylogenetic tree.
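As an illustration of the recursive clade structure, the following minimal sketch parses a small hand-written document with Python's standard library and walks the nested 'clade' elements. The sample document, the helper name leaf_names, and the 'branch_length_unit' value are assumptions made for this example; only the element and attribute names come from the description above.

```python
import xml.etree.ElementTree as ET

# Namespace used by phyloXML documents.
NS = {"phy": "http://www.phyloxml.org"}

# Hand-written sample document (illustrative only).
DOC = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true" rerootable="false" branch_length_unit="substitutions/site">
    <name>example</name>
    <clade>
      <clade branch_length="0.1"><name>A</name></clade>
      <clade branch_length="0.2"><name>B</name></clade>
    </clade>
  </phylogeny>
</phyloxml>"""


def leaf_names(clade):
    """Recursively collect the names of clades that have no sub-clades."""
    children = clade.findall("phy:clade", NS)
    if not children:
        name = clade.find("phy:name", NS)
        return [name.text if name is not None else None]
    names = []
    for child in children:
        names.extend(leaf_names(child))
    return names


root = ET.fromstring(DOC)
phylogeny = root.find("phy:phylogeny", NS)
# xs:boolean accepts both "true" and "1" as true values.
is_rooted = phylogeny.get("rooted") in ("true", "1")
leaves = leaf_names(phylogeny.find("phy:clade", NS))
```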
      
      
         
         
         
         
         
         
         
         
         
         
      
      
      
      
      
   
   
   
      
          Element Clade is used in a recursive manner to describe the topology of a phylogenetic tree.
            The parent branch length of a clade can be described either with the 'branch_length' element or the
            'branch_length' attribute (it is not recommended to use both at the same time, though). Usage of the
            'branch_length' attribute allows for a less verbose description. Element 'confidence' is used to indicate
            the support for a clade/parent branch. Element 'events' is used to describe such events as gene-duplications
            at the root node/parent branch of a clade. Element 'width' is the branch width for this clade (including
            parent branch). Both 'color' and 'width' elements apply to the whole clade unless overwritten in
            sub-clades. Attribute 'id_source' is used to link other elements to a clade (on the xml-level).
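Because a branch length may appear either as an attribute or as a child element, a reader has to check both places. The sketch below is one possible way to do this; the function name branch_length and the choice to prefer the attribute when both are present are assumptions of this example, not rules stated by the schema.

```python
import xml.etree.ElementTree as ET

NS = {"phy": "http://www.phyloxml.org"}


def branch_length(clade):
    """Return the parent branch length of a clade, or None if absent.

    The attribute form is checked first (an arbitrary choice for this sketch).
    """
    attr = clade.get("branch_length")
    if attr is not None:
        return float(attr)
    elem = clade.find("phy:branch_length", NS)
    if elem is not None and elem.text:
        return float(elem.text)
    return None


# Compact form: branch length as an attribute.
compact = ET.fromstring(
    '<clade xmlns="http://www.phyloxml.org" branch_length="0.25"/>'
)
# Verbose form: branch length as a child element.
verbose = ET.fromstring(
    '<clade xmlns="http://www.phyloxml.org">'
    "<branch_length>0.25</branch_length></clade>"
)
bare = ET.fromstring('<clade xmlns="http://www.phyloxml.org"/>')
```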
         
      
      
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
      
      
      
      
   
   
   
      
          Element Taxonomy is used to describe taxonomic information for a clade. Element 'code' is
            intended to store UniProt/Swiss-Prot style organism codes (e.g. 'APLCA' for the California sea hare 'Aplysia
            californica') or other styles of mnemonics (e.g. 'Aca'). Element 'authority' is used to keep the authority,
            such as 'J. G. Cooper, 1863', associated with the 'scientific_name'. Element 'id' is used for a unique
            identifier of a taxon (for example '6500' with 'ncbi_taxonomy' as 'provider' for the California sea hare).
            Attribute 'id_source' is used to link other elements to a taxonomy (on the xml-level).
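The California sea hare example can be written out as a 'taxonomy' element and read back with a few lines of Python. This is a sketch only: the child element order, the 'scientific_name' and 'id_source' values, and the helper name read_taxonomy are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"phy": "http://www.phyloxml.org"}

# Illustrative taxonomy element built from the values mentioned in the text.
TAXONOMY = """<taxonomy xmlns="http://www.phyloxml.org" id_source="t1">
  <id provider="ncbi_taxonomy">6500</id>
  <code>APLCA</code>
  <scientific_name>Aplysia californica</scientific_name>
  <authority>J. G. Cooper, 1863</authority>
</taxonomy>"""


def read_taxonomy(elem):
    """Collect the simple text fields and the id/provider pair of a taxonomy."""
    def text(tag):
        child = elem.find(f"phy:{tag}", NS)
        return child.text if child is not None else None

    id_elem = elem.find("phy:id", NS)
    return {
        "id_source": elem.get("id_source"),
        "provider": id_elem.get("provider") if id_elem is not None else None,
        "id": id_elem.text if id_elem is not None else None,
        "code": text("code"),
        "scientific_name": text("scientific_name"),
        "authority": text("authority"),
    }


taxon = read_taxonomy(ET.fromstring(TAXONOMY))
```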
      
      
         
         
         
         
         
         
         
         
         
      
      
   
   
      
         
      
   
   
      
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
      
   
   
   
      
          Element Sequence is used to represent a molecular sequence (protein, DNA, or RNA) associated
            with a node. 'symbol' is a short symbol for the sequence (at most 20 characters, e.g. 'ACTM'), whereas
            'name' is used for the full name (e.g. 'muscle Actin'). 'gene_name' can be used when protein and gene names differ.
            'location' is used for the location of a sequence on a genome/chromosome. The actual sequence can be stored with the
            'mol_seq' element. Attribute 'type' is used to indicate the type of sequence ('dna', 'rna', or 'protein').
            One intended use for 'id_ref' is to link a sequence to a taxonomy (via the taxonomy's 'id_source') in case
            of multiple sequences and taxonomies per node.
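The 'id_ref' to 'id_source' link can be resolved with a simple lookup. The sketch below assumes a clade carrying two taxonomies and one sequence; the identifier values 'tax_a' and 'tax_b' and the helper name taxonomy_for are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

NS = {"phy": "http://www.phyloxml.org"}

# Illustrative clade with two taxonomies and one sequence that points at one of them.
CLADE = """<clade xmlns="http://www.phyloxml.org">
  <taxonomy id_source="tax_a"><code>APLCA</code></taxonomy>
  <taxonomy id_source="tax_b"><code>Aca</code></taxonomy>
  <sequence type="protein" id_ref="tax_a">
    <symbol>ACTM</symbol>
    <name>muscle Actin</name>
  </sequence>
</clade>"""


def taxonomy_for(sequence, clade):
    """Return the taxonomy whose id_source matches the sequence's id_ref."""
    ref = sequence.get("id_ref")
    if ref is None:
        return None
    for taxonomy in clade.findall("phy:taxonomy", NS):
        if taxonomy.get("id_source") == ref:
            return taxonomy
    return None


clade = ET.fromstring(CLADE)
sequence = clade.find("phy:sequence", NS)
symbol = sequence.find("phy:symbol", NS).text
linked = taxonomy_for(sequence, clade)
linked_code = linked.find("phy:code", NS).text
```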
      
      
         
         
         
         
         
         
         
         
         
         
         
      
      
      
      
   
   
      
         
      
   
   
      
          Element 'mol_seq' is used to store molecular sequences. The 'is_aligned' attribute is used
            to indicate that this molecular sequence is aligned with all other sequences in the same phylogeny for
            which 'is_aligned' is true as well (which, in most cases, means that gaps were introduced, and that all
            sequences for which 'is_aligned' is true must have the same length).
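The equal-length expectation for aligned sequences can be checked mechanically. The following is a minimal sketch, assuming hand-written sample data and a hypothetical helper named aligned_lengths_consistent; it is not part of the schema or of any library.

```python
import xml.etree.ElementTree as ET

MOL_SEQ = "{http://www.phyloxml.org}mol_seq"

# Illustrative phylogeny: two aligned sequences (with gaps) and one unaligned.
GOOD = """<phylogeny xmlns="http://www.phyloxml.org" rooted="true">
  <clade>
    <clade><sequence><mol_seq is_aligned="true">AC-GTA</mol_seq></sequence></clade>
    <clade><sequence><mol_seq is_aligned="true">ACCGTA</mol_seq></sequence></clade>
    <clade><sequence><mol_seq is_aligned="false">ACG</mol_seq></sequence></clade>
  </clade>
</phylogeny>"""

# Same data, but one aligned sequence is too short.
BAD = GOOD.replace("AC-GTA", "ACGTA")


def aligned_lengths_consistent(phylogeny):
    """True if all mol_seq elements flagged as aligned share one length."""
    lengths = {
        len((m.text or "").strip())
        for m in phylogeny.iter(MOL_SEQ)
        if m.get("is_aligned") in ("true", "1")
    }
    return len(lengths) <= 1


good_ok = aligned_lengths_consistent(ET.fromstring(GOOD))
bad_ok = aligned_lengths_consistent(ET.fromstring(BAD))
```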
      
      
         
            
         
      
   
   
      
         
         
         
      
   
   
   
      
          Element Accession is used to capture the local part in a sequence identifier (e.g. 'P17304'
            in 'UniProtKB:P17304', in which case the 'source' attribute would be 'UniProtKB'). 
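Splitting a prefixed identifier into the 'source' attribute and the local part might look like the sketch below. The function name make_accession is hypothetical, and the phyloXML namespace is omitted from the generated element to keep the example short.

```python
import xml.etree.ElementTree as ET


def make_accession(identifier):
    """Build an accession element from a 'source:local' identifier string."""
    source, sep, local = identifier.partition(":")
    if not sep or not source or not local:
        raise ValueError(f"expected 'source:local', got {identifier!r}")
    elem = ET.Element("accession", {"source": source})
    elem.text = local
    return elem


acc = make_accession("UniProtKB:P17304")
xml_text = ET.tostring(acc, encoding="unicode")
```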
      
      
         
            
            
         
      
   
   
   
      
          Used to store accessions to additional resources. 
       
      
         
      
    
   
   
      
          This is used to describe the domain architecture of a protein. Attribute 'length' is the total
            length of the protein.
      
      
         
      
      
   
   
      
          To represent an individual domain in a domain architecture. The name/unique identifier is
            described via the 'id' attribute. 'confidence' can be used to store, for example, E-values.
      
      
         
            
            
            
            
         
      
   
   
   
      
          Events at the root node of a clade (e.g. one gene duplication). 
      
      
         
         
         
         
         
      
   
   
      
         
         
         
         
         
         
      
   
   
   
      
          The names and/or counts of binary characters present, gained, and lost at the root of a
            clade. 
      
      
         
         
         
         
      
      
      
      
      
      
   
   
      
         
      
   
   
   
      
          A literature reference for a clade. It is recommended to use the 'doi' attribute instead of
            the free text 'desc' element whenever possible. 
      
      
         
      
      
   
   
   
      
          The annotation of a molecular sequence. It is recommended to annotate by using the optional
            'ref' attribute (some examples of acceptable values for the ref attribute: 'GO:0008270',
            'KEGG:Tetrachloroethene degradation', 'EC:1.1.1.1'). Optional element 'desc' allows for a free text
            description. Optional element 'confidence' is used to state the type and value of support for an annotation.
            Similarly, optional attribute 'evidence' is used to describe the evidence for an annotation as free text
            (e.g. 'experimental'). Optional element 'property' allows for further, typed and referenced annotations from
            external resources.
      
      
         
         
         
         
      
      
      
      
      
   
   
   
      
          Property allows for typed and referenced properties from external resources to be attached
            to 'Phylogeny', 'Clade', and 'Annotation'. The value of a property is its mixed (free text) content.
            Attribute 'datatype' indicates the type of a property and is limited to xsd-datatypes (e.g. 'xsd:string',
            'xsd:boolean', 'xsd:integer', 'xsd:decimal', 'xsd:float', 'xsd:double', 'xsd:date', 'xsd:anyURI'). Attribute
            'applies_to' indicates the item to which a property applies (e.g. 'node' for the parent node of a clade,
            'parent_branch' for the parent branch of a clade). Attribute 'id_ref' allows a property to be attached
            specifically to one element (on the xml-level). Optional attribute 'unit' is used to indicate the unit of
            the property. An example: <property datatype="xsd:integer" ref="NOAA:depth" applies_to="clade"
            unit="METRIC:m"> 200 </property> 
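Reading such a property and converting its text content according to 'datatype' could be sketched as follows. The converter table and the function name read_property are assumptions of this example and cover only a few of the xsd datatypes listed above; anything else is left as a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical converter table for a handful of xsd datatypes.
CONVERTERS = {
    "xsd:integer": int,
    "xsd:decimal": float,
    "xsd:float": float,
    "xsd:double": float,
    "xsd:boolean": lambda s: s in ("true", "1"),
}


def read_property(elem):
    """Return the attributes of a property plus its typed value."""
    raw = (elem.text or "").strip()
    convert = CONVERTERS.get(elem.get("datatype"), str)
    return {
        "ref": elem.get("ref"),
        "applies_to": elem.get("applies_to"),
        "unit": elem.get("unit"),
        "value": convert(raw),
    }


# The property example from the text, with the phyloXML namespace added.
prop = ET.fromstring(
    '<property xmlns="http://www.phyloxml.org" datatype="xsd:integer" '
    'ref="NOAA:depth" applies_to="clade" unit="METRIC:m"> 200 </property>'
)
depth = read_property(prop)
```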
      
      
      
      
      
      
   
   
      
         
      
   
   
      
         
         
         
         
         
         
      
   
   
      
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
         
      
   
   
   
      
          A uniform resource identifier. In general, this is expected to be a URL (for example, to
            link to an image on a website, in which case the 'type' attribute might be 'image' and 'desc' might be
            'image of a California sea hare'). 
      
      
         
            
            
         
      
   
   
   
      
          A general purpose confidence element. For example this can be used to express the bootstrap
            support value of a clade (in which case the 'type' attribute is 'bootstrap').
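A clade may carry more than one confidence value, so a reader typically selects by 'type'. The sketch below assumes a hand-written clade with made-up support values and a hypothetical helper named bootstrap_support.

```python
import xml.etree.ElementTree as ET

NS = {"phy": "http://www.phyloxml.org"}

# Illustrative clade with two confidence values of different types.
CLADE = """<clade xmlns="http://www.phyloxml.org">
  <confidence type="probability">0.97</confidence>
  <confidence type="bootstrap">89</confidence>
</clade>"""


def bootstrap_support(clade):
    """Return the first confidence value of type 'bootstrap', or None."""
    for conf in clade.findall("phy:confidence", NS):
        if conf.get("type") == "bootstrap" and conf.text:
            return float(conf.text)
    return None


support = bootstrap_support(ET.fromstring(CLADE))
```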
      
      
         
            
            
         
      
   
   
   
      
          A general purpose identifier element. It allows the provider (or authority) of an
            identifier to be indicated.
      
      
         
            
         
      
   
   
   
      
          The geographic distribution of the items of a clade (species, sequences), intended for
            phylogeographic applications. The location can be described by free text in the 'desc' element, by
            the coordinates of one or more 'Points' (similar to the 'Point' element in Google's KML format), by
            'Polygons', or by a combination of these.
      
      
         
         
         
      
   
   
      
          The coordinates of a point with an optional altitude (used by element 'Distribution').
            The required attribute 'geodetic_datum' indicates the geodetic datum used (also called 'map datum';
            for example, Google's KML uses 'WGS84'). Attribute 'alt_unit' is the unit for the altitude (e.g. 'meter').
         
      
      
         
         
         
      
      
      
   
   
      
          A polygon defined by a list of 'Points' (used by element 'Distribution').
         
      
      
         
      
   
   
   
      
          A date associated with a clade/node. Its value can be numerical by using the 'value' element
            and/or free text with the 'desc' element (e.g. 'Silurian'). If a numerical value is used, it is recommended
            to employ the 'unit' attribute to indicate the type of the numerical value (e.g. 'mya' for 'million years
            ago'). The elements 'minimum' and 'maximum' are used to indicate a range/confidence interval.
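A 'date' element combining free text, a numerical value, and a range could be read as in the sketch below. The numbers are illustrative placeholders rather than authoritative dates, and the class CladeDate, the helper read_date, and the consistency check are assumptions of this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

NS = {"phy": "http://www.phyloxml.org"}

# Illustrative date element; the numeric values are placeholders.
DATE = """<date xmlns="http://www.phyloxml.org" unit="mya">
  <desc>Silurian</desc>
  <value>425</value>
  <minimum>419</minimum>
  <maximum>444</maximum>
</date>"""


@dataclass
class CladeDate:
    unit: Optional[str]
    desc: Optional[str]
    value: Optional[float]
    minimum: Optional[float]
    maximum: Optional[float]

    def range_is_consistent(self):
        """True if minimum <= value <= maximum for the fields that are present."""
        bounds = [b for b in (self.minimum, self.value, self.maximum) if b is not None]
        return bounds == sorted(bounds)


def read_date(elem):
    """Parse a date element into a CladeDate, leaving missing fields as None."""
    def text(tag):
        child = elem.find(f"phy:{tag}", NS)
        return child.text.strip() if child is not None and child.text else None

    def number(tag):
        raw = text(tag)
        return float(raw) if raw is not None else None

    return CladeDate(
        unit=elem.get("unit"),
        desc=text("desc"),
        value=number("value"),
        minimum=number("minimum"),
        maximum=number("maximum"),
    )


clade_date = read_date(ET.fromstring(DATE))
```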
      
      
         
         
         
         
      
      
   
   
   
      
          This indicates the color of a clade when rendered (the color applies to the whole clade
            unless overwritten by the color(s) of sub clades).
      
      
         
         
         
      
   
   
   
      
          This is used to express a typed relationship between two sequences. For example it could be
            used to describe an orthology (in which case attribute 'type' is 'orthology'). 
      
      
         
      
      
      
      
      
   
   
      
         
         
         
         
         
         
         
         
      
   
   
   
      
          This is used to express a typed relationship between two clades. For example it could be
            used to describe multiple parents of a clade.