
Certificate Program in Bioinformatics (One year)

Program Duration
:      One year, July 2021 to June 2022 (6-month training plus an optional 6-month project)

Number of slots
:       20

Schedule & Mode
:       1 hr/day (9:00 am to 10:00 am); virtual (online)

Application opens
:      May 03, 2021

Application deadline
:      June 15, 2021

Email
:      bioinfo@rgcb.res.in

Course Fee
:      6-month training: ₹30,000; 6-month project (optional): ₹20,000


Course Overview

This one-year certificate program is a platform for highly motivated students to explore bioinformatics through practical experience. It provides a solid foundation in bioinformatics through theory and hands-on training in methods and resources relevant to all major fields of biological research. The program covers sound strategies for bioinformatics analysis, computer programming, statistical analysis, data management and reproducibility. All participants will be closely mentored by RGCB faculty. Invited lectures by distinguished scientists and academicians will also be arranged.


Open to BSc/MSc/BTech/MTech students and graduates from the life science, physics, chemistry and computer science streams.


The RGCB Academic Committee will screen all applications, and potential candidates will be invited for an online interview. If more than 20 candidates are short-listed after screening, an online test will be conducted before the selection interview.

Program Fee

The total fee is 50,000 INR. No certificate will be issued without fulfilment of the curriculum and payment of the total fee. Program fees include admission, study materials, access to internal computational facilities and consumables used in the Bioinformatics Facility. They do not cover travel or local accommodation.

Accommodation (For Project students)

RGCB hostel accommodation is limited. If hostel rooms are not available, assistance will be provided to find suitable local accommodation.

Who should apply?

This certificate course is aimed at people with a background in biological sciences who have little or no experience in bioinformatics. Applicants are expected to be at an early stage of their career and interested in developing their bioinformatics skills. Essential qualifications include a first-class bachelor's degree in medical/engineering sciences or a master's degree in any branch of life science. Previous knowledge of computer programming is not required for this program.


Submit your online application here. Please note that your application will not be considered without a Statement of Purpose (why you want to be trained in this specialization) and a resume.

Course Modules

The curriculum is divided into 6 core bioinformatics modules (theory) plus an optional 6-month final dissertation. The syllabus includes:

Module 1: Fundamentals of Computer programming & Biostatistics

Introduction to Linux/UNIX environment: Unix file system; Installing & executing programs in the Linux environment; Navigating your computer from the shell; Basic command line operations; Introduction to common text editors like gedit, nedit, emacs & vi, with special emphasis on basic vi commands; Working with remote machines.

Python, Perl, R, shell scripting: Advantages of using Python; A comparison with Perl and other high-level languages; Python data types (String, List, Dictionary, Tuple and Set); Control structures and loops (if, if-else, if-elsif-else, unless, while, for and foreach) for simple data structures; Complex data structures, tuples & dictionaries; Looping through complex data structures; Useful Python libraries for biologists.

Introduction to R: Installation in Windows/Mac/Linux environments; Basic commands to store and print variables; Use of commands like read.table, read.csv and write.table to read/write data in the R console; Basic statistics (mean, standard deviation, correlation coefficient and p-value) in R; Generating simple plots on screen and/or in pdf/png/jpg; Publication-quality figures.

Module 2: Bioinformatics data resources, Biological sequence analysis, phylogeny

Biological data resources, access & management: Genomes across the tree of life; Major sequencing projects; Major centralized bioinformatics databases to store DNA, RNA & protein sequences; Major resources and services at NCBI; Web-based and command-line access to information; Navigating through major resources and services at NCBI; Overview of major web resources for the study of genomes: Ensembl, NCBI-Genome and UCSC Genome Browser.

Biological sequence analysis: Homology, similarity & identity; Scoring matrices; EMBOSS tools; NCBI BLAST programs; Evaluation of significance of results using E-value and bit score; Profile searches; HMMER; Sequence alignment programs; Different approaches to multiple sequence alignment; Best strategies for pairwise and multiple sequence alignment; Multiple sequence alignment of genomic regions; Databases of multiple sequence alignments.

Molecular phylogeny & evolution: Principles of molecular phylogeny and evolution; Stages of phylogenetic analysis; Distance-based, character-based & model-based phylogenetic inference; Maximum likelihood (ML) and Bayesian inference methods; PHYLIP, MEGA; Evaluation of phylogenetic trees; Phylogenetic networks.

Module 3: Introduction to Next Generation Sequence Analysis

Introduction to DNA Sequencing Technologies: Overview of Next-Generation Sequencing data analysis; From generating sequence data to FASTQ; Quality control; Different genome assembly programs; Multiple read alignment software programs; The SAM format & SAMtools; Variant calling, VCF format & VCFtools; Interpreting variants; Visualizing & tabulating NGS data; The Genome Analysis Toolkit (GATK); Analysis of the human genome using GATK; Storing data in public repositories; Applications of NGS.

Genome analysis: Completed genomes: Viruses, Bacteria, Archaea & Eukaryotes; Comparison of prokaryotic genomes; Plant genomes; Major genome analysis projects; ENCODE project; Finding Genes in Eukaryotic Genomes; Human Genome project; A Bioinformatics perspective on Human Disease.

Module 4: Transcriptomics and proteomics

Introduction to Microarrays and RNA-Seq: Data acquisition & analysis; Microarray data analysis with NCBI-GEO2R/Bioconductor; RNA-Seq analysis using TopHat and Cufflinks; Functional annotation of microarray/RNA-Seq data.

Bioconductor in R: Brief introduction to various Bioconductor packages for NGS; Quality assessment (packages: qrqc, seqbias, ReQON, htSeqTools, TEQC, Rolexa & ShortRead), RNA-seq (packages: DEXSeq, EDASeq, edgeR etc.), Alignment (packages: Rsubread & Biostrings), Microbiome (packages: phyloseq, DirichletMultinomial, clstutils, manta & mcaGUI), Workflows (packages: ArrayExpressHTS, Genominator, easyRNASeq, oneChannelGUI & rnaSeqMap), Database (SRAdb).

Module 5: Structural Bioinformatics & Fundamentals of drug discovery

Proteomics: Protein analysis & prediction – Principles of protein structure (primary, secondary & tertiary), Protein Data Bank (PDB), Protein structure visualization tools, Protein domains and motifs, SCOP & CATH databases; Proteomic resources.

Fundamentals of biomolecular structures: Nucleic acids and proteins; Three-dimensional structure representations and coordinate formats; Calculating solvent accessibility, membrane region and secondary structure (DSSP/STRIDE/PSIPRED/JPRED); Structure prediction by homology modeling and ab initio modeling (SWISS-MODEL/I-TASSER2); Ligand binding site prediction and docking (Vina/PatchDock); Computer-aided drug design (DOCK Blaster); Molecular dynamics and normal mode analysis of biomolecular structures (ElNemo/GROMACS); Molecular docking and molecular dynamics.

Module 6: Introduction to Data Science

Python Essentials: Anaconda - Python Distribution Installation And Setup, Jupyter Notebook, Python Basics, Data Structures, Control Statements

Basics of Data Mining: Data Preparation with Pandas, Numpy Array Functions, Data Munging With Pandas, Imputation, Outlier Analysis

Data Visualization: Matplotlib introduction, different kinds of plotting using Matplotlib

Machine Learning: Machine Learning Introduction, ML Core Concepts, Unsupervised And Supervised Learning, Clustering With K-Means, Linear Regression, Logistic Regression, K-Nearest Neighbor

Deep Learning: Introduction, TensorFlow and Keras, Convolutional Neural Network Basics

Introduction to RDBMS & MySQL for Python: Relational Database Management Systems basics, SQL introduction, Connecting to SQL databases, Fetching data with SELECT, WHERE conditions, SQL joins, SQL CRUD operations.

Training Schedule

One-hour class (9:00 am to 10:00 am)


Two online exams will be conducted. The final grade will be based on the exams, lab activities (journal presentations, assignments, discussions, etc.) and the optional final project.

For more information on the certificate program, please contact:
Office of Academic Affairs
Rajiv Gandhi Centre for Biotechnology (RGCB)
Trivandrum, Kerala- 695014
+91 471-2781247



Computational Biology & Bioinformatics Facility
Bio-Innovation Center (BIC), Rajiv Gandhi Centre for Biotechnology (RGCB)
KINFRA film park, Sainik School PO, Kazhakootam
Trivandrum, Kerala 695585


Shijulal Nelson-Sathi, PhD
Jamshaid Ali, PhD
K C. Sivakumar, MSc
Meena Vinaykumar, MCA

Invited Faculty

Each module will include at least one guest lecture.
The final list of guest faculty will be published soon.
