Thursday, 16 August 2012

BIOINFORMATICS

Bioinformatics is a branch of biological science that deals with methods for storing, retrieving and analyzing biological data, such as nucleic acid (DNA/RNA) and protein sequences, structures, functions, pathways and genetic interactions. It generates new knowledge that is useful in fields such as drug design, and it drives the development of new software tools to create that knowledge. Bioinformatics also draws on algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, structural biology, software engineering, data mining, image processing, modeling and simulation, discrete mathematics, control and system theory, circuit theory, and statistics.
Commonly used software tools and technologies in this field include Java, XML, Perl, C, C++, Python, R, MySQL, SQL, CUDA, MATLAB, and Microsoft Excel.

Building on the recognition of the importance of information transmission, accumulation and processing in biological systems, in 1978 Paulien Hogeweg coined the term "bioinformatics" to refer to the study of information processes in biotic systems. This definition placed bioinformatics as a field parallel to biophysics and biochemistry. Examples of relevant biological information processes studied in the early days of bioinformatics are the formation of complex social interaction structures by simple behavioral rules, and the accumulation and maintenance of information in models of prebiotic evolution.

At the beginning of the "genomic revolution", the term bioinformatics was rediscovered to refer to the creation and maintenance of databases that store biological information such as nucleotide sequences and amino acid sequences. Developing this type of database involved not only design issues but also the development of complex interfaces through which researchers could access existing data as well as submit new or revised data.

In order to study how normal cellular activities are altered in different disease states, the biological data must be combined to form a comprehensive picture of these activities. Therefore, the field of bioinformatics has evolved such that the most pressing task now involves the analysis and interpretation of various types of data. This includes nucleotide and amino acid sequences, protein domains, and protein structures. The actual process of analyzing and interpreting data is referred to as computational biology. Important sub-disciplines within bioinformatics and computational biology include:
  • the development and implementation of tools that enable efficient access to, and use and management of, various types of information.
  • the development of new algorithms and statistical methods with which to assess relationships among members of large data sets, for example methods to locate a gene within a sequence, predict protein structure and/or function, and cluster protein sequences into families of related sequences.
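As a rough illustration of what "locating a gene within a sequence" can mean at its simplest, the sketch below scans a DNA string for open reading frames (ORFs): a start codon followed in the same reading frame by a stop codon. This is only a toy under stated assumptions, not how real gene finders work. It looks at the forward strand only, uses the standard start codon ATG and stop codons TAA, TAG and TGA, and the function name `find_orfs` and the `min_codons` parameter are hypothetical names chosen for this example.

```python
# Naive open-reading-frame (ORF) scan on the forward strand only.
# Assumption: standard start codon ATG and stop codons TAA/TAG/TGA.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return sorted (start, end) index pairs of ORFs.

    `end` is exclusive and the span includes the stop codon.
    `min_codons` counts every codon in the span, start and stop included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i                       # first ATG opens a candidate ORF
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGCC"))
```

Practical gene prediction also considers the reverse strand, splicing in eukaryotes, and statistical models of coding sequence, none of which this sketch attempts.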
The primary goal of bioinformatics is to increase the understanding of biological processes. What sets it apart from other approaches, however, is its focus on developing and applying computationally intensive techniques to achieve this goal. Examples include pattern recognition, data mining, machine learning algorithms, and visualization. Major research efforts in the field include sequence alignment, gene finding, genome assembly, drug design, drug discovery, protein structure alignment, protein structure prediction, prediction of gene expression and protein–protein interactions, genome-wide association studies and the modeling of evolution.

Interestingly, the term bioinformatics was coined before the "genomic revolution". Paulien Hogeweg and Ben Hesper defined the term in 1978 to refer to "the study of information processes in biotic systems". This definition placed bioinformatics as a field parallel to biophysics or biochemistry (biochemistry is the study of chemical processes in biological systems). However, its primary use since at least the late 1980s has been to describe the application of computer science and information sciences to the analysis of biological data, particularly in those areas of genomics involving large-scale DNA sequencing.

Bioinformatics now entails the creation and advancement of databases, algorithms, computational and statistical techniques and theory to solve formal and practical problems arising from the management and analysis of biological data.

Over the past few decades, rapid developments in genomic and other molecular research technologies, together with developments in information technologies, have produced a tremendous amount of information related to molecular biology. Bioinformatics is the name given to the mathematical and computing approaches used to glean an understanding of biological processes from this information.
Common activities in bioinformatics include mapping and analyzing DNA and protein sequences, aligning different DNA and protein sequences to compare them, and creating and viewing 3-D models of protein structures.
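To make "aligning sequences to compare them" a little more concrete, here is a minimal sketch of global alignment scoring by dynamic programming, in the style of the Needleman-Wunsch algorithm. It computes only the best alignment score, not the alignment itself. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions for illustration, and `global_alignment_score` is a hypothetical name, not a function from any library.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of strings a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap                   # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap                   # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap          # gap inserted in b
            left = score[i][j - 1] + gap        # gap inserted in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

In practice, protein alignments use substitution matrices and affine gap penalties rather than the flat scores assumed here, and established libraries are preferable to hand-written code.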

There are two fundamental ways of modelling a biological system (e.g., a living cell), both of which fall under bioinformatic approaches.
  • Static
    • Sequences – Proteins, Nucleic acids and Peptides
    • Structures – Proteins, Nucleic acids, Ligands (including metabolites and drugs) and Peptides
    • Interaction data among the above entities including microarray data and Networks of proteins, metabolites
  • Dynamic
    • Systems biology falls into this category, including reaction fluxes and variable concentrations of metabolites
    • Multi-agent-based modelling approaches that capture cellular events such as signalling, transcription and reaction dynamics
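As a very small example of the dynamic category, the sketch below follows the concentration of a single metabolite over time. It assumes a deliberately simple model that is not taken from the text: the metabolite is produced at a constant rate `k_in` and consumed in proportion to its concentration at rate `k_out`, so dc/dt = k_in - k_out * c. The equation is integrated with the explicit Euler method, and all names and parameter values are hypothetical.

```python
def simulate_metabolite(c0, k_in, k_out, dt=0.01, steps=2000):
    """Euler integration of dc/dt = k_in - k_out * c.

    Returns the list of concentrations, starting with c0,
    so the result has steps + 1 entries.
    """
    c = c0
    trajectory = [c]
    for _ in range(steps):
        c += dt * (k_in - k_out * c)            # one explicit Euler step
        trajectory.append(c)
    return trajectory


if __name__ == "__main__":
    traj = simulate_metabolite(c0=0.0, k_in=2.0, k_out=0.5)
    # Under this model the concentration approaches k_in / k_out.
    print(round(traj[-1], 3))
```

Realistic systems biology models couple many such equations and are usually solved with adaptive numerical integrators rather than a fixed-step Euler loop.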
A broad sub-category under bioinformatics is structural bioinformatics.
