GENOME ANNOTATION-STATE OF THE ART

RAHUL BANIK
SAYAK GANGULI

Abstract

Genome annotation is of enormous significance in interpreting the information contained in the raw sequence data produced by genome sequencing projects. The process helps identify the biological significance of raw sequence data and thus places our understanding of biological processes in proper context. There are two interrelated types of genome annotation: structural and functional. Structural annotation deals with identifying genome elements (such as genes, promoters, and regulatory elements), whereas functional annotation assigns functions to these structural elements. Structural annotation, defined as finding genes in genomic DNA, is of two types: prediction based and sequence similarity based. Prediction-based algorithms are designed to find gene structures from nucleotide sequence and composition, whereas similarity-based prediction uses alignment with mRNA sequences (ESTs) from the same or related species to identify motifs. In functional annotation, Gene Ontology (GO) plays an integral part. GO describes three attributes of gene products: molecular function, biological process, and cellular component. The basic steps of functional annotation are BLAST, mapping of GO terms, annotation using an annotation rule, and finally statistical analysis of differences in GO term distribution between groups of sequences. In this study, two unannotated sequences were used as samples for functional annotation. After the sequences were mapped, the GO terms generated described the molecular function (F), biological process (P), and cellular component (C), e.g. DNA binding (F), biosynthetic process (P), and protein complex (C). Enzyme codes and KEGG pathway maps were generated for each sequence, describing pathways such as purine and pyrimidine metabolism and the pentose phosphate pathway. Another important aspect is the evidence code distribution, i.e. the quality of annotation. Here, IEA (Inferred from Electronic Annotation) is dominant.
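
The final steps described in the abstract, summarizing GO terms by category and inspecting the evidence code distribution, can be illustrated with a minimal sketch. The records below are hypothetical and are not data from the study; the function names `summarize` and `dominant_evidence` are likewise assumptions made for illustration, and the sketch does not reproduce the annotation software used by the authors.

```python
from collections import Counter

# Hypothetical annotation records: (sequence id, GO term name, aspect, evidence code).
# Aspect letters follow the abstract: F = molecular function, P = biological process,
# C = cellular component. These rows are illustrative only, not results from the study.
ANNOTATIONS = [
    ("seq1", "DNA binding", "F", "IEA"),
    ("seq1", "biosynthetic process", "P", "IEA"),
    ("seq1", "protein complex", "C", "IDA"),
    ("seq2", "DNA binding", "F", "IEA"),
]


def summarize(annotations):
    """Count annotations per GO aspect and per evidence code."""
    by_aspect = Counter(aspect for _, _, aspect, _ in annotations)
    by_evidence = Counter(code for _, _, _, code in annotations)
    return by_aspect, by_evidence


def dominant_evidence(annotations):
    """Return the most frequent evidence code and its share of all annotations."""
    _, by_evidence = summarize(annotations)
    code, count = by_evidence.most_common(1)[0]
    return code, count / len(annotations)


if __name__ == "__main__":
    aspects, evidence = summarize(ANNOTATIONS)
    print("Annotations per aspect:", dict(aspects))
    print("Annotations per evidence code:", dict(evidence))
    print("Dominant evidence code and share:", dominant_evidence(ANNOTATIONS))
```

A high share of IEA in such a tally indicates that most annotations were assigned computationally rather than from curated experimental evidence, which is why the evidence code distribution serves as a rough indicator of annotation quality.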

Article Details

How to Cite
BANIK, R., & GANGULI, S. (2015). GENOME ANNOTATION-STATE OF THE ART. INDIAN JOURNAL OF PHYSIOLOGY AND ALLIED SCIENCES, 69(03), 90–92. https://doi.org/10.55184/ijpas.v69i03.14
Section
Research Article