Genome of the pea aphid symbiont Buchnera sp. APS
Genomic database for the pea aphid (Acyrthosiphon pisum)
Plant-parasitic nematodes are important pests of crop plants worldwide and are among the most difficult animals to identify. Identification based on sequences of the nuclear ribosomal DNA (rDNA) cistron (the 18S, 28S and 5.8S RNA genes and the internal transcribed spacers ITS1 and ITS2) is becoming a popular tool. Each repeating rDNA cistron is arranged as follows: an external transcribed spacer (ETS), the small ribosomal subunit (18S) gene, internal transcribed spacer 1 (ITS1), the 5.8S gene, internal transcribed spacer 2 (ITS2), the large ribosomal subunit (28S) gene, and lastly an intergenic spacer region (IGS). Sequences from nuclear ribosomal RNA repeats have been used to demonstrate the identity of isolates from various hosts and to unravel the relationships of cryptic and complex species whose published taxonomy is confused. In addition, the availability of these sequences allows the study of phylogenetic relationships among nematodes and a more complete understanding of their biology as agricultural pests.
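The ordering of regions within one rDNA repeat can be sketched as a small data structure. This is only an illustration: the leading external region is labelled `external` here, and the region lengths in `EXAMPLE_LENGTHS` are hypothetical placeholders, not measured values for any nematode.

```python
# Region order within one rDNA repeat, following the description above.
CISTRON_ORDER = ["external", "18S", "ITS1", "5.8S", "ITS2", "28S", "IGS"]

# Hypothetical lengths in base pairs, chosen only to make the example run.
EXAMPLE_LENGTHS = {
    "external": 500,
    "18S": 1700,
    "ITS1": 400,
    "5.8S": 160,
    "ITS2": 300,
    "28S": 3500,
    "IGS": 1000,
}


def region_coordinates(lengths):
    """Return {region: (start, end)} using 0-based, end-exclusive coordinates."""
    coords = {}
    position = 0
    for name in CISTRON_ORDER:
        end = position + lengths[name]
        coords[name] = (position, end)
        position = end
    return coords


def extract_region(repeat_sequence, lengths, name):
    """Slice one named region out of a full repeat sequence."""
    start, end = region_coordinates(lengths)[name]
    return repeat_sequence[start:end]


if __name__ == "__main__":
    for region, (start, end) in region_coordinates(EXAMPLE_LENGTHS).items():
        print(f"{region}: {start}-{end}")
```

With real data the boundaries would come from annotation or alignment rather than from fixed lengths; the sketch only shows how the fixed order of regions translates into coordinates.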
PPNEMA is an integrated, web-based bioinformatic resource for plant-parasitic nematodes. It consists of a database of ribosomal cistron sequences from various species of plant-parasitic nematodes, grouped by nematode genus, and a search system that allows data to be extracted by both text and pattern searching. The database contains 2405 sequences from 26 different genera, organised in 208 multi-aligned groups. Sequences extracted from primary databases were analysed with the CleanUP software to obtain a non-redundant set. Within each genus group, multiple alignments were produced for the ribosomal genes and for fragments of those genes, using both the ClustalW and DIALIGN programs and guided by the Caenorhabditis elegans genome as reference sequence. The database is implemented in the MySQL DBMS. PPNEMA also contains a newly developed tool for characterising an anonymous sequence by comparing it with groups of multi-aligned sequences by means of pattern-searching approaches. A system allowing the submission of new sequences is under development. PPNEMA is freely available at http://www.ppnema.uniba.it.
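The idea of comparing an anonymous sequence with multi-aligned groups by pattern searching can be illustrated with a minimal sketch. This is not PPNEMA's actual algorithm, which is not described here. It assumes a simple scheme in which each aligned group is turned into a regular expression and the query is scanned for it. The function names, group names and toy sequences are all hypothetical.

```python
import re


def alignment_to_pattern(aligned_seqs):
    """Build a regex from one multi-aligned group (equal-length, upper-case
    sequences, '-' for gaps). Conserved columns become literal bases,
    variable columns become a character class, and columns that contain
    a gap become optional."""
    parts = []
    for column in zip(*aligned_seqs):
        bases = sorted(set(column) - {"-"})
        if not bases:
            continue  # column made only of gaps
        token = bases[0] if len(bases) == 1 else "[" + "".join(bases) + "]"
        if "-" in column:
            token += "?"
        parts.append(token)
    return re.compile("".join(parts))


def matching_groups(query, groups):
    """Return the names of the groups whose pattern occurs in the query."""
    query = query.upper()
    return [
        name
        for name, seqs in groups.items()
        if alignment_to_pattern(seqs).search(query)
    ]


# Toy alignments, invented for illustration only.
TOY_GROUPS = {
    "group_A": ["ACGTTGCA", "ACGTAGCA", "ACG-TGCA"],
    "group_B": ["TTGACCGA", "TTGACCGT"],
}

if __name__ == "__main__":
    print(matching_groups("GGACGTAGCATT", TOY_GROUPS))
```

A pattern built this way is strict: a single unseen substitution prevents a match. A practical tool would more likely tolerate mismatches or score partial matches.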
The PPNEMA database, in which related sequences are organised in clusters (groups), can highlight small shared sub-sequences among genera, species and populations. It thus offers the scientific community a pre-processed archive of plant-parasitic nematode sequences, useful for nematologists interested in both the characterisation of new species and phylogenetic relationships.
We would like to thank Dr. Annalisa Marsico for her contribution to the database design during her stay in our Department within the Master program in BIOINFORMATICA “Alberto del Lungo” organised at the Siena University (Italy). This work was supported by the University of Bari and by “PROGETTO DI RICERCA MIUR-PNR FIRB "Laboratorio Internazionale di Bioinformatica"”.
Biochemical pathways and enzymes in crop plants
The UK Crop Plant Bioinformatics Network (UK CropNet) was established in 1996 to harness the extensive work on genome mapping in crop plants in the UK. Since then we have published five databases from our central UK CropNet World Wide Web (WWW) server (http://ukcrop.net). Our resource facilitates the identification and manipulation of agronomically important genes by laying a foundation for comparative analysis among crop plants and model species. In addition, we have developed a number of software tools that facilitate the visualisation and analysis of our data. Many of our tools are freely available for use both with crop plant data and with data from other species.
Genomics of apple, cherry, peach, pear, raspberry, rose and strawberry
Seed development and fatty acid metabolism of oilseed crops
WhETS (Wheat Estimated Transcript Server) is a resource that combines Triticeae ESTs/mRNAs with rice genes to find the best estimate of hexaploid wheat transcript sequences for a target gene, supplemented with information on tissue distribution and likely gene structure, to aid in primer design.
The crop EST database CR-EST (http://pgrc.ipk-gatersleben.de/cr-est/) is a publicly available online resource providing access to sequence, classification, clustering, and annotation data from crop EST projects at IPK Gatersleben, Germany. CR-EST currently holds more than 200,500 sequences derived from four species: barley, wheat, pea, and potato. The barley section comprises more than one third of all known public domain ESTs in the EMBL sequence database. Sequences are organized into 41 cDNA libraries. We implemented an automatic EST preparation pipeline, including identification of chimeric clones, so that data quality can be displayed transparently. Sequences have been clustered in species-specific projects to generate a non-redundant set of ~22,600 consensus sequences and ~17,200 singletons, which forms the basis of the provided set of unigenes. A web application allows users to run BLAST searches against the CR-EST sequences and to query and retrieve data from Gene Ontology and metabolic pathway annotations, as well as sequence similarities from stored BLAST results. CR-EST also features interactive Java-based tools, such as open reading frame visualization and explorative analysis of Gene Ontology mappings applied to ESTs.
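How a clustering result splits into consensus sequences and singletons can be sketched as follows. This is an illustration under simple assumptions, not the CR-EST pipeline: it takes a hypothetical mapping from EST identifier to cluster identifier, treats every cluster with more than one member as yielding one consensus sequence, and treats every one-member cluster as a singleton.

```python
from collections import Counter


def unigene_summary(est_to_cluster):
    """Count consensus sequences (clusters with 2 or more ESTs) and
    singletons (clusters with exactly 1 EST); unigenes are their sum."""
    cluster_sizes = Counter(est_to_cluster.values())
    consensus = sum(1 for size in cluster_sizes.values() if size > 1)
    singletons = sum(1 for size in cluster_sizes.values() if size == 1)
    return {
        "consensus": consensus,
        "singletons": singletons,
        "unigenes": consensus + singletons,
    }


# Hypothetical toy assignment of six ESTs to three clusters.
TOY_ASSIGNMENT = {
    "est1": "c1",
    "est2": "c1",
    "est3": "c2",
    "est4": "c3",
    "est5": "c3",
    "est6": "c3",
}

if __name__ == "__main__":
    print(unigene_summary(TOY_ASSIGNMENT))
```

Applied to the approximate figures quoted above, the same sum gives roughly 22,600 + 17,200 = 39,800 unigenes across the species-specific projects.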
Génoplante is a collaboration between public French institutes (INRA, CIRAD, IRD and CNRS) and private companies (Biogemma, Bayer CropScience and Bioplante) that aims to develop genome analysis programs for crop species (corn, wheat, rapeseed, sunflower and pea) and model plants (Arabidopsis thaliana and rice). The outputs of these programs form a wealth of information (genomic sequence, transcriptome, proteome, allelic variability, mapping and synteny, mutation data) and tools (databases, interfaces, analysis software) that are being integrated and made public at the public bioinformatics resource centre of Génoplante: GénoPlante-Info (GPI). This stream of data and tools is regularly updated and will continue to grow over the next two years. Access to the GPI databases and tools is offered at http://genoplante-info.infobiogen.fr/.
This work is supported by the Génoplante program. We thank Benjamin Preciado and the Infobiogen team (Guy Vaysseix, Francis Capy, François Laissus, Didier Gillet, Xavier Benigni, Jean-Marc Plaza and Claire Valencien) for their support in system and network administration, and in public data and software installation and maintenance. We are very grateful to our Génoplante colleagues for the submission of their data and/or tools to GPI, their participation in many bioinformatics meetings and task forces, and for their help in defining and enriching the GPI environment. In particular we would like to express our gratitude towards Johann Joets, Nicolas Sajot, Philippe Lessard, Philippe Dufour, Philippe Leroy, Olivier Gigonzac, Evelyne James, Denis Scala, Virginie Chataigner, Pierre Rouzé, Pierre Hilson, Vincent Thareau, Véronique Brunaud, Franck Samson, Sébastien Aubourg, Alain Lecharny, Ian Small, Michel Caboche, Jean-Loup Risler, Catherine Christophe and many others.
Molecular and phenotypic information on wheat, barley, rye, triticale, and oats
TropGENE DB is a database that manages genetic and genomic information about tropical crops. The database is organised into crop specific modules. Seven modules are presently online (Banana, Cocoa, Coconut, Cotton, Oil Palm, Rice and Sugarcane). Other modules are being developed.
Each module includes data on genetic resources (agro-morphological data, parentages, allelic diversity), information on molecular markers, genetic maps, results of QTL analyses, data from physical mapping, sequences and genes, as well as corresponding references. The TropGENE DB interface has been designed to allow quick consultations as well as complex queries.
Recent developments:
We added links to phenotypic databases: the Musa Web Services Platform (see http://tropgenedb.cirad.fr/en/banana.html) and CocoaGenDB (see http://tropgenedb.cirad.fr/en/cocoa.html). The GMOD CMap viewer (the Comparative Map Viewer) has been integrated into the TropGENE DB information system.
Ruiz M, Rouard M, Raboin LM, Lartaud M, Lagoda P, Courtois B. (2004) TropGENE-DB, a multi-tropical crop information system. Nucleic Acids Res. 32:D364-D367.