KEGG is an integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. Genomic and chemical information represents the molecular building blocks of life in the genomic and chemical spaces, respectively, and systems information represents functional aspects of the biological systems, such as the cell and the organism, that are built from the building blocks. KEGG has been widely used as a reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.
Gitools can retrieve information about KEGG pathways and translate KEGG identifiers into other identifier categories using the Ensembl database. You can therefore download pathway modules for gene symbols or for Affymetrix platform probe sets, as long as the translation information is available in Ensembl. This information can be retrieved for any organism available in the KEGG and Ensembl databases.
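The translation idea described above can be sketched as a simple lookup. This is only an illustration, not the Gitools implementation: the function name `translate_modules`, the dictionary layout, and the example identifiers are all hypothetical, and the sketch assumes that genes without a translation are simply left out.

```python
def translate_modules(modules, id_map):
    """Re-express each module's members in another identifier category.

    modules: dict mapping a pathway ID to an iterable of KEGG gene IDs.
    id_map:  dict mapping a KEGG gene ID to an iterable of target IDs
             (one gene may map to zero, one, or several targets).
    """
    translated = {}
    for pathway, genes in modules.items():
        targets = set()
        for gene in genes:
            # Assumption: genes with no known translation are dropped.
            targets.update(id_map.get(gene, ()))
        # Sort so the result is deterministic.
        translated[pathway] = sorted(targets)
    return translated


# Made-up example data, not real KEGG or Ensembl records.
modules = {"pathway_A": ["gene1", "gene2"], "pathway_B": ["gene2", "gene3"]}
id_map = {"gene1": ["SYM1"], "gene2": ["SYM2", "SYM2B"]}
print(translate_modules(modules, id_map))
```

Note that "gene3" has no entry in the mapping, so "pathway_B" ends up with fewer members than it has in the source. This is why a module can shrink when the translation information in Ensembl is incomplete.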
In the first step, select the modules to download (only pathways are available for now) and the version of Ensembl to use for translating identifiers.
The wizard retrieves all organisms available in the KEGG and Ensembl databases and shows them in a selection box.
You must select one before continuing. A filter box lets you narrow the long list of organisms by keyword.
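As a rough sketch of how such keyword filtering can work, the function below keeps only the names that contain every keyword typed, ignoring case. The exact matching rule used by the Gitools filter box is not documented here, so this behavior, the function name `filter_organisms`, and the sample names are assumptions.

```python
def filter_organisms(organisms, query):
    """Return the organism names that contain every keyword in `query`.

    Matching is case-insensitive; an empty query keeps the full list.
    """
    keywords = query.lower().split()
    return [
        name for name in organisms
        if all(keyword in name.lower() for keyword in keywords)
    ]


# Illustrative list only; the real list comes from KEGG and Ensembl.
organisms = [
    "Homo sapiens (human)",
    "Mus musculus (mouse)",
    "Rattus norvegicus (rat)",
]
print(filter_organisms(organisms, "mus"))
```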
In this step, select the identifier category for the genes or other elements of interest. Only the identifiers available for the selected organism in either the KEGG or Ensembl databases are shown.
On this page you can select the name prefix for the generated files, the folder where they will be created, and the file format.
Once all parameters are set and you click the Finish button, the wizard downloads data from the different sources in several steps, which depend on the module category, organism, and feature identifiers selected.
The wizard generates two files: one with the module information and another with the module annotations. The module file contains the lists of genes or features of interest related to each pathway, and the annotations file contains the textual description of each pathway.
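To make the relationship between the two files concrete, here is a minimal sketch that writes both from in-memory data. The file names, the `.tsv` extension, and the two-column tab-separated layout are assumptions chosen for illustration. The actual names and layout depend on the prefix, folder, and file format selected in the wizard, and `export_modules` is a hypothetical helper, not part of Gitools.

```python
import csv
from pathlib import Path


def export_modules(folder, prefix, modules, annotations):
    """Write a modules file and an annotations file, return both paths.

    modules:     dict mapping a pathway ID to a list of member IDs.
    annotations: dict mapping a pathway ID to its textual description.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: <prefix>_modules.tsv and <prefix>_annotations.tsv
    modules_path = folder / f"{prefix}_modules.tsv"
    annotations_path = folder / f"{prefix}_annotations.tsv"

    # Modules file: one "member <TAB> pathway" pair per line.
    with open(modules_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        for pathway, members in modules.items():
            for member in members:
                writer.writerow([member, pathway])

    # Annotations file: one "pathway <TAB> description" pair per line.
    with open(annotations_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        for pathway, description in annotations.items():
            writer.writerow([pathway, description])

    return modules_path, annotations_path
```

The pathway ID is the key shared by both files, so the description of any module can be looked up in the annotations file.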