Query - Available Data Sources
The BioExtract Server gives access to data sources containing nucleotide and protein sequence data for animal and plant species. A list of the current data sources with a brief description of each is given below.
Nucleotide Sequences
- EMBL-Bank
- EMBL-Bank (Last Release): The full quarterly release of
all EMBL-Bank entries except MGA and EMBL-CDS entries.
- EMBL-Bank (Update since last release): All EMBL-Bank entries
created or updated after the latest EMBL-Bank release except
CON, MGA or EMBL-CDS entries.
- EMBL-Bank (Deleted Entries): Entries no longer present in the
latest EMBL-Bank release.
- EMBL-Bank (Coding Sequence): Full release of all EMBL-CDS
entries.
- NCBI Nucleotide Databases
- Nucleotide (nuccore): Contains all nucleotide sequences not in EST
or GSS.
- EST (Expressed Sequence Tags): Contains short single-pass reads of cDNA (transcript) sequences.
- GSS (Genome Survey Sequences): Contains short single-pass reads of genomic DNA.
- Nucleotide: Contains the sequence data in GenBank,
EMBL and DDBJ, including all of nuccore, EST and GSS.
Protein Sequences
- NCBI Protein Database: Contains translated coding regions of DNA
sequences in EMBL/GenBank/DDBJ, as well as protein sequences
submitted to PIR, SwissProt, PRF, and PDB.
- UniRef: The UniProt NREF (UniProt Reference Clusters) database. In
the UniRef90 and UniRef50 databases, no pair of sequences in the
representative set has more than 90% or 50% mutual sequence identity,
respectively. The UniRef100 database presents identical sequences and
sub-fragments as a single entry.
- UniProtKB: The UniProt Knowledgebase, a complete annotated protein
sequence database.
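The UniRef identity thresholds described above can be illustrated with a small sketch. This is not how UniRef itself is built: the `identity` function is a toy position-by-position comparison without alignment, and `pick_representatives` and `collapse_identical` are hypothetical helper names introduced only for this example. It shows a greedy selection in which no two kept sequences exceed the chosen identity threshold, and a UniRef100-style step that merges exact duplicates and sub-fragments.

```python
def identity(a: str, b: str) -> float:
    """Toy identity: matching positions divided by the longer length.
    Assumption: no alignment is performed (real pipelines align first)."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def pick_representatives(seqs, threshold):
    """Greedy selection: keep a sequence only if its identity to every
    already-kept representative does not exceed the threshold."""
    reps = []
    # Longest first, so longer sequences become representatives.
    for s in sorted(seqs, key=len, reverse=True):
        if all(identity(s, r) <= threshold for r in reps):
            reps.append(s)
    return reps


def collapse_identical(seqs):
    """UniRef100-style toy step: merge exact duplicates and sub-fragments
    (substrings of a longer kept sequence) into a single entry."""
    kept = []
    # dict.fromkeys removes duplicates while preserving input order.
    for s in sorted(dict.fromkeys(seqs), key=len, reverse=True):
        if not any(s in k for k in kept):
            kept.append(s)
    return kept


# Made-up example sequences, 20 residues each.
a = "MKTAYIAKQRQISFVKSHFS"
b = "MKTAYIAKQRQISFVKSHFA"  # 19/20 positions match a (0.95)
c = "MKTAYIAKQRQIAAAAAAAA"  # 12/20 positions match a (0.60)
d = "G" * 20                # no positions match a

reps90 = pick_representatives([a, b, c, d], 0.9)
reps50 = pick_representatives([a, b, c, d], 0.5)
merged = collapse_identical([a, a, "TAYIAK", d])
```

With these inputs, the 90% threshold drops only `b`, the 50% threshold drops both `b` and `c`, and the UniRef100-style step folds the duplicate of `a` and the fragment `"TAYIAK"` into the entry for `a`.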
Plant-specific nucleotides and proteins
- Miscellaneous
- GB-PLN (DNA): GenBank plant nucleotide sequence data comprising the entire
PLN division from NCBI.
- GB-PLN (protein): GenBank plant protein sequence data comprising the entire
PLN division from NCBI.
- Viridiplantae and Viridiplantae Protein: GB-PLN DNA and protein sequences for green
plants. Data are updated monthly from NCBI.