Indian Journal of Physiology and Allied Sciences

Genome Annotation - State of the art

Authors : RAHUL BANIK
(Corresponding Author)
rbanik2009@gmail.com

DBT Centre for Bioinformatics, Presidency University, Kolkata

SAYAK GANGULI

DBT Centre for Bioinformatics, Presidency University, Kolkata.

Page : 90-92 , vol : 69 ,No : 3 , Year : 2015

Abstract

Genome annotation is essential for interpreting the raw sequence data produced by genome sequencing projects. The process identifies the biological significance of raw sequence data and thus places our understanding of biological processes in proper context. There are two interrelated types of genome annotation: structural and functional. Structural annotation deals with identifying genome elements (such as genes, promoters, and regulatory elements), whereas functional annotation assigns functions to these structural elements. In practice, structural annotation amounts to finding genes in genomic DNA, and it is of two types: prediction based and sequence similarity based. Prediction-based algorithms are designed to find gene structures from nucleotide sequence and composition, whereas similarity-based annotation relies on alignment with mRNA sequences (ESTs) from the same or related species and on the identification of motifs. In functional annotation, Gene Ontology (GO) plays an integral part. GO describes three attributes of a gene product: molecular function, biological process, and cellular component. The basic steps of functional annotation are a BLAST search, mapping of GO terms, annotation using an annotation rule, and finally statistical analysis of differences in GO term distribution between groups of sequences. In this study, two unannotated sequences were used as samples for functional annotation. After mapping, the GO terms generated for the sequences describe molecular function (F), biological process (P), and cellular component (C), e.g. DNA binding (F), biosynthetic process (P), and protein complex (C). Enzyme codes and KEGG pathway maps were generated for each sequence; these describe pathways such as purine and pyrimidine metabolism and the pentose phosphate pathway. Another important aspect is the evidence code distribution, i.e. the quality of annotation. Here, IEA (Inferred from Electronic Annotation) is dominant.
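
The final steps described in the abstract, summarising GO terms by category and inspecting the evidence code distribution, can be illustrated with a minimal Python sketch. The records below are hypothetical toy values chosen only to mirror the example terms named in the abstract; they are not the study's data, and the function names are illustrative assumptions rather than part of any annotation tool.

```python
from collections import Counter

# Hypothetical annotation records (NOT the study's results):
# (sequence id, GO term name, GO category, evidence code).
# Category letters follow the abstract: F = molecular function,
# P = biological process, C = cellular component.
ANNOTATIONS = [
    ("seq1", "DNA binding", "F", "IEA"),
    ("seq1", "biosynthetic process", "P", "IEA"),
    ("seq1", "protein complex", "C", "IDA"),
    ("seq2", "DNA binding", "F", "IEA"),
    ("seq2", "biosynthetic process", "P", "ISS"),
]


def category_distribution(records):
    """Count annotations per GO category (F, P, C)."""
    return Counter(category for _, _, category, _ in records)


def evidence_distribution(records):
    """Count annotations per evidence code (e.g. IEA, IDA, ISS)."""
    return Counter(code for _, _, _, code in records)


def dominant_evidence(records):
    """Return the most frequent evidence code among the records."""
    return evidence_distribution(records).most_common(1)[0][0]


if __name__ == "__main__":
    print(dict(category_distribution(ANNOTATIONS)))
    print(dict(evidence_distribution(ANNOTATIONS)))
    print(dominant_evidence(ANNOTATIONS))
```

A tally of this kind shows at a glance whether most annotations rest on electronically inferred evidence (IEA) or on curated or experimental evidence codes.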

Description

Research Article