Human and Mouse Gene Catalogue
If you use any of the gene lists below, please cite:
Nato, AQ and Matise, TC. 2012. Human and Mouse Gene Catalogue. http://compgen.rutgers.edu/gene.shtml. [Date accessed].
As part of our project on schizophrenia candidate gene regions, we created files containing gene information by parsing flat files from NCBI. These files have also proved useful in our collaborative projects (e.g., devQTL), and several colleagues have asked whether we had files containing information for each gene, so we decided to make them publicly available. Each file contains 17 columns with the following headers:
- GeneID. Unique identifier assigned by NCBI to each gene record in Entrez Gene
- NumPos. Number of positions of a particular gene in the reference assembly
- Code. Alphabetically-assigned letters for each unique position of a particular gene (starting from A)
- NumEntries. Number of entries of a particular gene based on the different accessions
- Gen_Acc. Genomic nucleotide accession.version
- Gen_gi. Genomic nucleotide gi
- Status. The status of the RefSeq gene
- GeneName. Default symbol for the gene
- Chr. Chromosome on which the gene is located
- MapLocation. Map location of a particular gene
- PStart. Start physical position on the genomic accession
- PEnd. End physical position on the genomic accession
- Orientation. Orientation of the gene on the genomic accession
- Type. The type assigned to the gene
- Synonyms. Aliases (set of unofficial symbols) for the gene (these are bar-delimited)
- dbXrefs. Set of identifiers in other databases for the gene (bar-delimited; format is database:value)
- Description. Descriptive name for the gene
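The column layout above can be read with a short script. The following is a minimal sketch, not the authors' own tooling. It assumes the files are tab-delimited with a header row and that an empty field may be written as "-"; neither is stated on this page. The function names and the sample row are hypothetical and for illustration only.

```python
import csv
import io

# Column headers as listed above.
COLUMNS = [
    "GeneID", "NumPos", "Code", "NumEntries", "Gen_Acc", "Gen_gi", "Status",
    "GeneName", "Chr", "MapLocation", "PStart", "PEnd", "Orientation",
    "Type", "Synonyms", "dbXrefs", "Description",
]


def split_bar(field):
    """Split a bar-delimited field; '' or '-' is treated as empty (assumption)."""
    if field in ("", "-"):
        return []
    return field.split("|")


def parse_dbxrefs(field):
    """Turn 'database:value|database:value' into (database, value) pairs.

    Only the first colon separates database from value, so values that
    themselves contain a colon are kept whole.
    """
    pairs = []
    for item in split_bar(field):
        database, _, value = item.partition(":")
        pairs.append((database, value))
    return pairs


def read_genes(handle):
    """Yield one dict per row, with list-valued and integer fields converted."""
    # QUOTE_NONE: quote characters in descriptions are kept as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        row["Synonyms"] = split_bar(row["Synonyms"])
        row["dbXrefs"] = parse_dbxrefs(row["dbXrefs"])
        row["PStart"] = int(row["PStart"])
        row["PEnd"] = int(row["PEnd"])
        yield row


# Invented sample row (not real gene data), used only to exercise the parser.
SAMPLE = "\t".join(COLUMNS) + "\n" + "\t".join([
    "1", "1", "A", "1", "NC_000000.0", "0", "REVIEWED", "GENE1", "1",
    "1p00", "100", "200", "+", "protein-coding", "G1|GN1",
    "DB1:123|DB2:DB2:45", "hypothetical example gene",
]) + "\n"

genes = list(read_genes(io.StringIO(SAMPLE)))
```

To read a downloaded file instead of the sample, pass an open file handle to `read_genes`, for example `open(path, newline="")`.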
The status indicated for each gene in these files is based on the RefSeqs maintained within the annotated genomes (main reference assemblies). The status of a RefSeq that is maintained independently of the annotated genome, or that is in the process of being mapped onto the annotated genome, may therefore differ from the status indicated for that gene in these flat files. The physical positions in these flat files have been converted to offset 1, so the first base of an accession is position 1.
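Because positions are offset 1, tools that expect 0-based coordinates need a small conversion. The sketch below assumes PStart and PEnd are both inclusive and that PStart is not greater than PEnd; this page does not confirm either point, and the helper names are hypothetical.

```python
def span_length(pstart, pend):
    """Length in bases of a 1-based, inclusive interval (assumed convention)."""
    return pend - pstart + 1


def to_zero_based_half_open(pstart, pend):
    """Convert a 1-based inclusive interval to 0-based, half-open form.

    Only the start shifts down by one; the end value is unchanged because
    the half-open end is exclusive.
    """
    return pstart - 1, pend
```

Under these assumptions, a gene with PStart 100 and PEnd 200 spans 101 bases and maps to the 0-based half-open interval (99, 200).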
Links to gene lists and dates they were downloaded from NCBI: