org.nmdp.ngs.sra.xsd.SRA.common.xsd
Submitter-designated name of the SRA document of this type. At a minimum, the alias should
be unique throughout the submission of this document type. If center_name is specified, the name should
be unique across all submissions of this document type from that center.
Owner authority of this document and namespace for submitter's name of this document.
If not provided, then the submitter is regarded as "Individual" and document resolution
can only happen within the submission.
Broker authority of this document. If not provided, then the broker is considered "direct".
The document's accession as assigned by the Home Archive.
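The alias scoping rules above can be sketched as a uniqueness check. This is only an illustration, not the Archive's actual validation: the function name find_duplicate_aliases and the dict keys doc_type, alias and center_name are assumptions made for this example, and the input list stands for whatever set of documents the alias has to be unique within.

```python
from collections import Counter


def find_duplicate_aliases(documents):
    """Return the (doc_type, center_name, alias) keys that occur more than once.

    Each document is a dict with the hypothetical keys 'doc_type', 'alias'
    and, optionally, 'center_name'. A missing center_name is stored as None,
    meaning the alias only needs to be unique within this submission.
    """
    counts = Counter(
        (doc["doc_type"], doc.get("center_name"), doc["alias"])
        for doc in documents
    )
    duplicates = [key for key, count in counts.items() if count > 1]
    # Sort with None mapped to "" so keys with and without a center compare.
    return sorted(
        duplicates,
        key=lambda key: tuple("" if part is None else part for part in key),
    )
```

The same alias may appear under different document types without conflict, because uniqueness is stated per document type.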
Identifies a record by name that is known within the namespace defined by attribute "refcenter".
Use this field when referencing an object for which an accession has not yet been issued.
The center namespace of the attribute "refname". When absent, the namespace is assumed to be the current submission.
Identifies a record by its accession. The scope of resolution is the entire Archive.
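The three reference attributes imply different resolution scopes. A minimal sketch of that decision follows. The function name resolution_scope and the returned scope labels are hypothetical, and giving the accession precedence when both an accession and a refname are present is an assumption, not something the descriptions above state.

```python
def resolution_scope(accession=None, refname=None, refcenter=None):
    """Return a (scope, identifier) pair describing where a reference resolves.

    Assumption: an accession, when present, wins over refname.
    """
    if accession:
        # Accessions are resolved across the entire Archive.
        return ("archive", accession)
    if refname:
        if refcenter:
            # The name is looked up in the namespace of the given center.
            return ("center:" + refcenter, refname)
        # No refcenter: the namespace is the current submission.
        return ("submission", refname)
    raise ValueError("a reference needs either an accession or a refname")
```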
A string value that constrains the domain of named
identifiers (namespace).
Set of record identifiers.
A primary identifier in the INSDC namespace.
A secondary identifier in the INSDC namespace.
An identifier from a public non-INSDC resource.
A submitter provided identifier.
A universally unique identifier that requires no namespace.
INSDC controlled vocabulary of permitted cross references.
Please see http://www.insdc.org/db_xref.html . For example, FLYBASE.
Accession in the referenced database. For example, FBtr0080008 (in FLYBASE).
Text label to display for the link.
Text label to display for the link.
The internet service link (file:, http:, ftp:, etc).
Reusable attributes to encode tag-value pairs with optional units.
Name of the attribute.
Value of the attribute.
Optional scientific units.
Reusable external links type to encode URL links, Entrez links, and db_xref links.
Text label to display for the link.
The internet service link (file:, http:, ftp:, etc).
NCBI controlled vocabulary of permitted cross references. Please see http://www.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi? .
Numeric record id meaningful to the NCBI Entrez system.
Accession string meaningful to the NCBI Entrez system.
How to label the link.
The SPOT_DESCRIPTOR specifies how to decode the individual reads of interest from the
monolithic spot sequence. The spot descriptor contains aspects of the experimental design,
platform, and processing information. There are two methods of specification: one
is an index into a table of typical decodings, the other is an exact specification.
Number of base/color calls, cycles, or flows per
spot (raw sequence length or flow length including all
application and technical tags and mate pairs, but not including
gap lengths). This value will be platform dependent, library
dependent, and possibly run dependent. Variable length platforms
will still have a constant flow/cycle length.
READ_INDEX starts at 0 and is incremented for each sequential READ_SPEC within a SPOT_DECODE_SPEC.
READ_LABEL is a name for this tag and can be used on output to determine the read name, for example F or R.
There are various methods for ordering the reads on the spot.
The read is located beginning at the offset or cycle relative to another read.
This choice is appropriate, for example, when specifying a read
that follows one or more variable-length expected sequences.
Specify the read index that precedes this read.
Specify the read index that follows this read.
The location of the read start in terms of base count (1 is beginning of spot).
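As an illustration of the base-coordinate option, the sketch below splits a spot into labeled reads. It is not the Archive's decoder: the function name split_spot and the tuple layout are hypothetical, and it assumes that each read runs until the next read starts and that the last read runs to the end of the spot.

```python
def split_spot(spot, read_specs):
    """Split a spot sequence into reads using 1-based start coordinates.

    read_specs is a list of (read_index, read_label, base_coord) tuples.
    Assumption: a read ends where the next read (by READ_INDEX) begins.
    """
    ordered = sorted(read_specs, key=lambda spec: spec[0])  # by READ_INDEX
    reads = {}
    for pos, (_index, label, base_coord) in enumerate(ordered):
        start = base_coord - 1  # coordinate 1 is the first base of the spot
        if pos + 1 < len(ordered):
            end = ordered[pos + 1][2] - 1
        else:
            end = len(spot)
        reads[label] = spot[start:end]
    return reads
```

For a ten-base spot with a forward read starting at base 1 and a reverse read starting at base 6, split_spot("ACGTACGTTT", [(0, "F", 1), (1, "R", 6)]) yields five bases for each label.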
A set of choices of expected basecalls for the current read. The read will be zero-length if none is found.
The element's body contains a basecall; its attributes describe the meaning of this read and the matching rules.
When a match occurs, the read will be tagged with this group membership.
Minimum number of matches to trigger identification.
Maximum number of mismatches.
Where the match should occur. Changes the rules on how min_match and max_mismatch are counted.
Only @max_mismatch influences the matching process.
Both matches and mismatches are counted.
When @max_mismatch is exceeded, it is not a match.
When @min_match is reached, a match is declared.
Both matches and mismatches are counted.
When @max_mismatch is exceeded, it is not a match.
When @min_match is reached, a match is declared.
Specify whether the spot should have a default length for this tag if the expected base cannot be matched.
Specify an optional starting point for tag (base offset from 1).
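One possible reading of the matching rules above is sketched here. It is an interpretation, not the Archive's implementation. The function name matches_expected is hypothetical, and the edge names "full", "start" and "end" are assumed labels for the three cases described. For the two edge-anchored cases the sketch compares position by position from that edge and stops at whichever threshold is hit first.

```python
def matches_expected(read, expected, min_match, max_mismatch, match_edge="full"):
    """Return True if `read` matches the `expected` basecall under the rules.

    "full": only max_mismatch matters; the whole expected basecall is compared.
    "start"/"end": matches and mismatches are both counted, walking inward
    from the chosen edge (assumed alignment).
    """
    if match_edge == "full":
        if len(read) < len(expected):
            return False
        mismatches = sum(1 for r, e in zip(read, expected) if r != e)
        return mismatches <= max_mismatch
    if match_edge == "start":
        pairs = zip(read, expected)
    elif match_edge == "end":
        pairs = zip(reversed(read), reversed(expected))
    else:
        raise ValueError("unknown match_edge: %r" % (match_edge,))

    matches = 0
    mismatches = 0
    for r, e in pairs:
        if r == e:
            matches += 1
            if matches >= min_match:
                return True  # min_match reached: match is declared
        else:
            mismatches += 1
            if mismatches > max_mismatch:
                return False  # max_mismatch exceeded: not a match
    return False
```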
The PLATFORM specifies the sequencing platform.
Sequencers based on capillary electrophoresis technology manufactured by LifeTech (formerly
Applied Biosystems).
Tells the Archive who will execute the sample demultiplexing operation.
There shall be no sample demultiplexing at the level of assigning individual reads to sample pool members.
The submitter has assigned individual reads to sample pool members by providing individual files
containing reads with the same member assignment.
The PipelineType identifies the sequence or tree of actions to
process the sequencing data.
Lexically ordered value that allows for the pipe section to be hierarchically ordered. The float primitive data type is
used to allow for pipe sections to be inserted later on.
STEP_INDEX of the previous step in the workflow. Set to NIL if this is the first pipe section.
Name of the program or process for primary analysis. This may include a test or condition
that leads to branching in the workflow.
Version of the program or process for primary analysis.
Notes about the program or process for primary analysis.
Name of the processing pipeline section.
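The step indexing scheme can be illustrated with a short sketch that orders pipe sections and checks their back-references. The function name order_pipe_sections and the dict keys step_index, prev_step_index and program are hypothetical. Parsing the index as a float follows the description above, which is what lets a step such as 1.5 be inserted between steps 1 and 2.

```python
def order_pipe_sections(sections):
    """Return program names ordered by STEP_INDEX, validating previous-step links.

    Each section is a dict with the hypothetical keys 'step_index',
    'prev_step_index' ('NIL' for the first section) and 'program'.
    """
    ordered = sorted(sections, key=lambda s: float(s["step_index"]))
    known = {s["step_index"] for s in sections}
    for section in ordered:
        prev = section["prev_step_index"]
        if prev != "NIL" and prev not in known:
            raise ValueError("unknown previous step index: %r" % (prev,))
    return [s["program"] for s in ordered]
```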
Reference assembly details.
A standard genome assembly.
A recognized name for the genome assembly.
Identifies the genome assembly
using an accession number and a sequence version.
Other genome assembly.
Description of the genome
assembly.
A link to the genome
assembly.
Text label to display for the
link.
The internet service link
(file:, http:, ftp:, etc).
Reference assembly and sequence details.
Reference assembly details.
Reference sequence details.
A recognized name for the
reference sequence.
Accession.version, with the version being mandatory.
This is how the reference sequence is labeled in the submission file(s).
It is equivalent to the SQ label in BAM.
Optional when the submitted file uses an INSDC accession.version.
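Because the version part is mandatory, a consumer may want to reject bare accessions early. The sketch below is one way to do that. The function name split_accession_version and the regular expression are assumptions: the pattern accepts letters, digits and underscores before the dot and an integer version after it, which may be narrower or wider than what the Archive actually accepts.

```python
import re

# Assumed shape: <accession>.<integer version>, e.g. "NC_000001.11".
_ACCESSION_VERSION = re.compile(r"^(?P<accession>[A-Za-z0-9_]+)\.(?P<version>[0-9]+)$")


def split_accession_version(value):
    """Split 'accession.version' into (accession, version) or raise ValueError."""
    match = _ACCESSION_VERSION.match(value)
    if match is None:
        raise ValueError("expected accession.version, got %r" % (value,))
    return match.group("accession"), int(match.group("version"))
```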
Generic processing pipeline specification.
Processing directives tell the Sequence Read Archive how to
treat the input data, if any treatment is requested.