Query - Selecting Data Sources


What is a data source?

Data sources are collections of nucleotide and protein sequence data you can search through. The BioExtract Server provides access to the following data sources:

Where are BioExtract Server's data sources located?

Data sources can be found in the Query tab within the Available Data Sources box. They are arranged into different groups.

An introduction to selecting data sources

There are three icons to become familiar with when selecting data sources. Those three icons are:

BioExtract Server allows its users to select data sources in three ways: all at once, by group or sub-group, or individually.

You can select any number of data sources at the same time. Several groups can be selected together, and individual data sources can be selected from within one group or across several groups.

How to select all data sources

To select all data sources at once, click the checkbox next to the All group name. All of BioExtract Server's data sources, both nucleotide and protein, will appear in the adjacent box on the right. Deselecting the All checkbox will remove any data sources currently selected.

How to select a group of data sources

To select every data source in a group for querying, click the checkbox next to the name of the group.

For example, selecting the checkbox next to the Protein Sequences group will cause all of its data sources to appear in the adjacent box on the right for querying.

Groups contain sub-groups that can also be selected for querying. If you open the Protein Sequences group by clicking the plus sign, you can select a portion of its data sources by selecting one of its sub-groups. For example, clicking the checkbox next to the sub-group UniRef selects it, and all of UniRef's data sources will then appear in the adjacent box on the right for querying.

How to select an individual data source

To select an individual data source, click the checkbox next to its name.

For example, to select the data source GB-PLN (DNA), open the Miscellaneous group by clicking the plus sign, then select the checkbox next to GB-PLN (DNA). Once the checkbox is selected, only this data source will appear in the adjacent box.
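The three selection methods described above can be pictured as operations on a tree of groups, sub-groups, and data sources. The sketch below is only an illustration in Python, not BioExtract Server's implementation. The tree contains only names mentioned on this page; placing UniRef100 inside the UniRef sub-group and treating All as the parent of every other group are assumptions, and the function names are hypothetical.

```python
# Hypothetical model of the Available Data Sources box.
# A dict is a group or sub-group; None marks an individual data source.
# Assumptions: "All" contains every group, and UniRef100 sits inside UniRef.
TREE = {
    "All": {
        "Protein Sequences": {
            "UniProt KB": None,
            "UniRef": {"UniRef100": None},
        },
        "Miscellaneous": {"GB-PLN (DNA)": None},
    }
}


def find(tree, name):
    """Return {name: subtree} for the first node called `name`, or None."""
    for key, child in tree.items():
        if key == name:
            return {key: child}
        if isinstance(child, dict):
            found = find(child, name)
            if found is not None:
                return found
    return None


def leaves(tree):
    """Collect every individual data source below the given nodes."""
    result = set()
    for key, child in tree.items():
        if child is None:
            result.add(key)
        else:
            result |= leaves(child)
    return result


def select(*checked):
    """Return the data sources selected by ticking the named checkboxes."""
    selected = set()
    for name in checked:
        node = find(TREE, name)
        if node is None:
            raise KeyError(name)
        selected |= leaves(node)
    return selected
```

In this model, `select("All")` returns every data source in the tree, `select("UniRef")` returns only the sources under that sub-group, and `select("UniRef", "GB-PLN (DNA)")` combines a sub-group with an individual source from another group.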

An important caution

Every time you select a data source for querying, certain search fields will be grayed out and become unusable.

For example, if you were to select UniProt KB, a data source located within the Protein Sequences group, then two of the twelve available search fields, Qualifier and Feature Key, would become unusable in queries.

If you were to select UniRef100 at the same time, the number of available search fields would drop to four, leaving you with All Text, ID, Accession, and Species as usable search fields.

If you are not satisfied with the results of a query made using multiple data sources, retry the same query with fewer data sources selected. This will give you greater flexibility in which search fields you can use while querying.
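One way to reason about this behavior is to treat the usable search fields as the intersection of the fields each selected data source supports. The Python sketch below makes that idea concrete; it is an assumption about how the narrowing works, not a description of the server's code. Only six of the twelve search fields are named on this page, so the other six appear as generic placeholders, and the field set given for UniRef100 on its own is assumed from the combined result described above.

```python
# Search fields named on this page.
NAMED = {"All Text", "ID", "Accession", "Species", "Qualifier", "Feature Key"}
# Placeholders for the six search fields this page does not name.
UNNAMED = {f"(unnamed field {i})" for i in range(1, 7)}
ALL_FIELDS = NAMED | UNNAMED  # twelve fields in total

SUPPORTED = {
    # From the text: Qualifier and Feature Key become unusable.
    "UniProt KB": ALL_FIELDS - {"Qualifier", "Feature Key"},
    # Assumption: UniRef100 alone supports exactly these four fields.
    "UniRef100": {"All Text", "ID", "Accession", "Species"},
}


def usable_fields(selected_sources):
    """Return the search fields supported by every selected data source."""
    fields = set(ALL_FIELDS)
    for source in selected_sources:
        fields &= SUPPORTED[source]
    return fields
```

Under these assumptions, selecting UniProt KB alone leaves ten usable fields, and adding UniRef100 leaves four. Removing a data source from the selection can only keep the set of usable fields the same or enlarge it, which is why retrying with fewer data sources gives more flexibility.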