The GPCR NaVa database describes sequence variants within the family of human G Protein-Coupled Receptors (GPCRs). GPCRs regulate many physiological functions and are the targets of many of today's medicines. The acronym NaVa stands for Natural Variant, which means any (non-artificial) variant that occurs in humans.
The GPCR NaVa database includes:
1) rare mutations (frequency below 1%);
2) polymorphisms (frequency above 1%), including Single Nucleotide Polymorphisms (SNPs);
3) variants without estimates of allele frequency.
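The three categories above can be expressed as a small classification rule. The following is only a minimal sketch in Python, not the database's own code: the function name `classify_variant` is hypothetical, and because the text does not say where a frequency of exactly 1% belongs, the sketch assumes it counts as a polymorphism.

```python
from typing import Optional

RARE_THRESHOLD = 0.01  # the 1% allele frequency boundary named in the list above


def classify_variant(allele_frequency: Optional[float]) -> str:
    """Return the NaVa-style category for an allele frequency given as a fraction (0..1)."""
    if allele_frequency is None:
        # Category 3: no estimate of allele frequency is available.
        return "no frequency estimate"
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    # Assumption: exactly 1% is treated as a polymorphism (the source leaves this open).
    if allele_frequency < RARE_THRESHOLD:
        return "rare mutation"
    return "polymorphism"
```

For example, `classify_variant(0.002)` falls in the first category, `classify_variant(0.15)` in the second, and `classify_variant(None)` in the third.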
The GPCR NaVa database aids GPCR research by categorising and integrating information on variants from databases and scientific papers. Moreover, the GPCR NaVa database is linked with the reputable GPCRDB.
The GPCR NaVa database resulted from a joint project of the Leiden Medicinal Chemistry section of the Leiden/Amsterdam Center for Drug Research (LACDR) and the Leiden Institute of Advanced Computer Science (LIACS). We are very grateful to the Nederlandse organisatie voor Wetenschappelijk Onderzoek (NWO) for providing financial support.
Kazius J., Wurdinger K., van Iterson M., Kok J., Back T., and Ijzerman A.P. (2008) GPCR NaVa database: natural variants in human G protein-coupled receptors. Hum Mutat. 29(1), 39-44.
Seven-transmembrane-helix receptors (7-TMR), also known as G-protein-coupled receptors, are encoded by an important class of genes and act as the gateway for signal transduction induced by ligand binding. Recent progress in determining human draft sequences [2,3] has accelerated the comprehensive analysis of 7-TMRs across the whole human genome. We have developed an automated system that discovers 7-TMR genes in the whole human genome in three stages.
(I) Gene prediction stage: Genomic sequences were obtained from the human genome resources of NCBI. To maximize the number of gene candidates, we generated three kinds of sequence sets: (a) "6f-sequences", which are all possible combinations of initiation and stop codons in the six reading frames; (b) "ALN-sequences", obtained with ALN, a dynamic programming algorithm that aligns genomic sequence to known protein sequences; and (c) "GD-sequences", generated by GeneDecoder, which is based on HMMs.
(II) Screening stage: The predicted genes were passed through a filter that combines BLASTP for similarity search, HMMER and an in-house program for assigning 7-TMR-specific HMMs (PFAM domains), PROSITE patterns, and transmembrane helix (TMH) prediction tools. By carefully assessing each component, two threshold settings, best specificity and best sensitivity, were determined. Four confidence levels of the datasets were then obtained by combining the best-specificity and best-sensitivity thresholds.
(III) Quality improvement stage: Sequence redundancies were removed as follows. (1) Pairwise alignment was applied to the candidate sequences in an all-against-all fashion. (2) Sequences were linked only when they matched over > 50 amino acid residues with > 95% identity, lay on the same chromosome, and overlapped in genomic position. (3) Each transitive closure of the links was regarded as one cluster, and one representative gene was selected from each cluster.
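The redundancy step in stage (III) amounts to single-linkage clustering: link pairs that pass the thresholds, then take the transitive closure. The sketch below shows one way to do this in Python with a union-find structure. It is an illustration under stated assumptions, not the SEVENS implementation: the `Hit` record, its field names and the function names are hypothetical. How the representative gene is chosen is not stated in the text, so the sketch stops at the clusters.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment result between candidates a and b (hypothetical record)."""
    a: str
    b: str
    aligned_residues: int   # length of the matched region in amino acids
    identity: float         # percent identity over that region
    same_chromosome: bool
    overlapping: bool       # True if the genomic positions overlap


def should_link(hit: Hit) -> bool:
    """Apply the linking rule from step (2): > 50 residues, > 95% identity, same locus."""
    return (hit.aligned_residues > 50 and hit.identity > 95.0
            and hit.same_chromosome and hit.overlapping)


def cluster(ids: Iterable[str], hits: Iterable[Hit]) -> List[List[str]]:
    """Return the transitive closure of the links as sorted clusters of sequence ids."""
    ids = list(ids)
    parent = {i: i for i in ids}

    def find(x: str) -> str:
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for hit in hits:
        if should_link(hit):
            root_a, root_b = find(hit.a), find(hit.b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(list)
    for i in ids:
        groups[find(i)].append(i)
    return sorted(sorted(members) for members in groups.values())
```

With placeholder ids `s1` to `s4`, linking `s1`-`s2` and `s2`-`s3` puts all three in one cluster even though `s1` and `s3` were never linked directly, while a hit between `s3` and `s4` on different chromosomes leaves `s4` on its own.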
Applying this system to the human genome sequences (Apr. 2003), we collected 7-TMR genes at four confidence levels, ranging from 1,114 candidates at the highest specificity to 2,235 at the highest sensitivity. These are summarized in SEVENS (http://sevens.cbrc.jp/1.20/). This database aims to cover the whole "7-TMR universe", including not only known sequences but also sequences newly discovered by computational gene-finding programs. This aspect clearly distinguishes it from previous databases [10-12].
The content search button leads to a page where candidates are retrieved by the "AND" combination of (a) keyword in the nr.aa database search results, (b) chromosome number, (c) data level, (d) predicted exon number, (e) gene length, (f) protein length, (g) E-value of the sequence search against SWISS-PROT or nr.aa, (h) PROSITE motifs, and (i) Pfam domains. The search lists the 7-TMR candidate sequences in a chromosomal viewer and a list table. Each chromosome or sequence then links to the sequence analysis page. There, the chromosomal viewer shows the mapping information of the selected genes (purple), which link to their protein sequence analysis. The Result of Similarity Search part shows an alignment of the query against the SWISS-PROT and nr.aa databases using BLASTP. The Structure part shows the analysis results, with TMH prediction, PROSITE motif patterns and PFAM domains along the amino acid sequence.
We plan to maintain SEVENS with regular updates as new versions of the human genome sequence are released. Additional information (such as expression data and tertiary structure data) will be added to the database with each update. We hope these datasets will be of value to researchers engaged in 7-TMR studies.
Recent developments:
We reran the data collection process using the human genome sequences (Apr. 2003). The web pages are now more visual thanks to a chromosomal mapping viewer.
1. Watson, S. & Arkinstall, S. (1994). The G-protein Linked Receptor Facts Book, Academic Press, London.
2. International Human Genome Sequencing Consortium. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860-921.
3. Venter, J. C., et al. (2001) The sequence of the human genome. Science 291, 1304-1351.
4. Goto, O. (2000) Homology-based gene structure prediction: simplified matching algorithm using a translated codon (tron) and improved accuracy by allowing for long gaps. Bioinformatics 16, 190-202.
5. Asai, K., Itou, K., Ueno, Y. and Yada, T. (1998) Recognition of human genes by stochastic parsing, Pacific Symposium on Biocomputing 98, pp. 228-239 (PSB98, 1998).
6. Altschul, S. F., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
7. Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Howe, K. L. and Sonnhammer, E. L. (2000) The Pfam protein families database. Nucleic Acids Res. 28, 263-266.
8. Bairoch, A. (1992) Prosite: A dictionary of sites and patterns in proteins. Nucleic Acids Res. 20, 2013-2018.
9. Hirokawa, T., Boon-Chieng, S. and Mitaku, S. (1998) SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14, 378-379.
10. Horn, F., Vriend, G. & Cohen, F. E. (2001) Collecting and harvesting biological data: the GPCRDB and NucleaRDB information systems. Nucleic Acids Res. 29, 346-349.
11. Crasto, C., Marenco, L., Miller, P. and Shepherd, G. (2002) Olfactory Receptor Database: a metadata-driven automated population from sources of gene and protein sequences. Nucleic Acids Res. 30, 354-360.
12. Hodges, P. E., Carrico, P. M., Hogan, J. D., O'Neill, K. E., Owen, J. J., Mangan, M., Davis, B. P., Brooks, J. E. and Garrels, J. I. (2002) Annotating the human proteome: the Human Proteome Survey Database (HumanPSD(TM)) and an in-depth target database for G protein-coupled receptors (GPCR-PD(TM)) from Incyte Genomics. Nucleic Acids Res. 30, 137-141.
GPCRpred is a tool that uses a support vector machine based method to make GPCR family and subfamily predictions for a user-supplied query sequence.
GRIFFIN (G-protein-Receptor Interacting Feature Finding INstrument) uses a support vector machine and a hidden Markov model to predict G-protein coupled receptors (GPCRs) and G-protein coupling selectivity.
G protein-coupled receptors; expression in cell lines
G-proteins and their interaction with GPCRs
G-protein coupled receptors (GPCRs) represent one of the most important families of drug targets in pharmaceutical development. GLIDA is a novel public GPCR-related chemical genomic database that is primarily focused on the correlation of information between GPCRs and their ligands. It provides correlation data between GPCRs and their ligands, along with chemical information on the ligands, as well as access information to the various web databases regarding GPCRs. These data are connected with each other in a relational database, allowing users in the field of GPCR-related drug discovery to easily retrieve such information from either biological or chemical starting points. GLIDA includes structure similarity search functions for the GPCRs and for their ligands. Thus, GLIDA can provide correlation maps linking the searched homologous GPCRs (or ligands) with their ligands (or GPCRs). By analyzing the correlation patterns between GPCRs and ligands, we can gain more detailed knowledge about their interactions and improve drug design efforts by focusing on inferred candidates for GPCR-specific drugs. GLIDA is publicly available at http://gdds.pharm.kyoto-u.ac.jp:8081/glida. We hope that it will prove very useful for chemical genomic research and GPCR-related drug discovery.
GPCRsclass is a tool for predicting amine-binding receptors based on a protein sequence provided by the user.
PRED-GPCR is a tool which queries user-supplied sequences against a database of HMMs corresponding to G-protein coupled receptor (GPCR) families in order to determine which GPCR family the query sequence most resembles.
GRIS, the Glycoprotein-hormone Receptors Information System, is dedicated to the collection and dissemination of data on the three best-known glycoprotein-hormone receptors (GpHRs): the thyrotropin receptor (TSHR), the follitropin receptor (FSHR) and the lutropin/choriogonadotropin receptor (LHR/CGR).
These receptors are members of the rhodopsin-like G protein-coupled receptor (GPCR) family.
GRIS collects, organises and presents to its visitors a wide range of heterogeneous data on GpHRs: sequences, models, mutational data, etc.
The focus is mainly on mutational data and on transferring these data to structural information, i.e. the models.
The good thing for bioinformaticians is that all mutant information can be obtained as a flat text file. The link to the text file is on the mutant details page.
A flat text file has the advantage of being human and machine readable. Every flat text file of a mutant you find in GRIS is formatted in the same way, allowing for easy processing by programmers who want to make further use of the mutant data available in GRIS. Fields in a flat file in GRIS are separated by a tab character.
The order of the fields is the following (a slash (/) is used here instead of tab to separate the fields):
Mutation / Accession code / Receptor / Species / Ballesteros (TM) or GPMD (ECD) number / GPCRDB number / Number in the PDB files available from GRIS / Domain / Subdomain / Constitutivity (Y=Yes, N=No, NA=Unknown) / Expression level (% of WT, NA=Unknown) / Binding affinity (U=Unchanged, I=Increased, D=Decreased, NA=Unknown) / Additional comments / PubMed number(s) (separated by a hyphen (-) when more than one) / Original full length sequence / Aligned sequence
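Because the fields are tab-separated and always in the same order, a record can be read with a few lines of code. The following is a minimal sketch in Python, assuming one mutant record per line with exactly the sixteen fields listed above. The dictionary keys, the function name `parse_gris_record` and the sample line are illustrative placeholders, not part of GRIS or real GRIS data.

```python
from typing import Dict, List, Union

# Keys chosen for this sketch; they follow the field order given above.
FIELDS = [
    "mutation", "accession_code", "receptor", "species",
    "ballesteros_or_gpmd_number", "gpcrdb_number", "pdb_number",
    "domain", "subdomain", "constitutivity", "expression_level",
    "binding_affinity", "comments", "pubmed_ids",
    "full_length_sequence", "aligned_sequence",
]


def parse_gris_record(line: str) -> Dict[str, Union[str, List[str]]]:
    """Split one tab-separated mutant record into a dictionary keyed by field name."""
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, found {len(values)}")
    record: Dict[str, Union[str, List[str]]] = dict(zip(FIELDS, values))
    # PubMed numbers are joined by a hyphen when there is more than one.
    record["pubmed_ids"] = [p for p in values[FIELDS.index("pubmed_ids")].split("-") if p]
    return record


# Placeholder record with made-up values, only to show the shape of a line.
SAMPLE_LINE = "\t".join([
    "X000Y", "ACC0000", "TSHR", "Human", "0.00", "000", "000",
    "TM", "TM0", "Y", "NA", "U", "placeholder comment",
    "00000001-00000002", "MAAA", "M-AA",
])

record = parse_gris_record(SAMPLE_LINE)
```

Only the PubMed field is split on hyphens, so gap characters in the aligned sequence are left untouched. Values such as `NA`, `Y` or `U` are kept as strings here; converting them to richer types is left to the caller.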