From NGS data through third dimension toward new agrochemicals and drugs

In association with EMBnet, we are organising a tutorial on the different aspects of NGS (Next Generation Sequencing) and drug discovery data analysis. The tutorial will take place on Monday, May 26th. It will be held at the PRABI (Pôle Rhône-Alpes de Bioinformatique) in the Grégoire Mendel building (2nd floor) on the Doua campus of University Lyon 1.

Registration for the tutorial includes one lunch at the university cafeteria in the Domus building and two coffee breaks.

Registration – 8:30 to 9:00

Morning sessions – 9:00 to 12:30

Session I: Galaxy tour

In this session we will present some applications of the Galaxy workflow manager to analyse NGS data.

Tutor: Vincent Navratil (PRABI, Université Claude Bernard – Lyon 1).

Session II: KisSplice

The session will consist of a short tour of how KisSplice can be used to analyse RNA-seq data with or without a reference genome. We will briefly explain the advantages of a local transcriptome assembler over both global transcriptome assemblers (like Trinity or Oases) and traditional mapping approaches (TopHat/Cufflinks...).

A hands-on session will then give details on how to analyse an RNA-seq dataset from the ENCODE project. We will focus on the SKNSH cell line, with and without retinoic acid stimulation. For practical reasons, we will restrict the analysis to only 10M reads in each replicate of each condition. Starting from the fastq files, we will use KisSplice to identify and quantify variants in both datasets. At the end of this step, a list of SNPs, indels and Alternative Splicing events is produced, with a quantification of each variant in each condition.

We will then use our newly developed R package KissDE to test if a variant is specific to a condition. Here we will focus on splice variants specific to the retinoic acid stimulation.
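To give a feel for what "specific to a condition" means in terms of read counts, here is a minimal sketch. It is not KissDE and does not reproduce its statistical model, which is not described on this page. As a stand-in, it computes the inclusion ratio of a splice variant in each condition and applies a two-sided Fisher exact test to read counts pooled over replicates, which ignores replicate variability. The function names and the read counts are hypothetical and chosen only for illustration.

```python
from math import comb


def inclusion_ratio(inclusion_reads, exclusion_reads):
    """Fraction of reads supporting the inclusion isoform of a variant."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads support this variant")
    return inclusion_reads / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are conditions, columns are inclusion / exclusion read counts.
    Simplifying assumption: counts are pooled over replicates.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x inclusion reads in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical pooled counts for one splicing event (not real data).
control = (30, 10)      # inclusion reads, exclusion reads without stimulation
stimulated = (12, 28)   # inclusion reads, exclusion reads with stimulation

delta = inclusion_ratio(*stimulated) - inclusion_ratio(*control)
p = fisher_exact_two_sided(*control, *stimulated)
print(f"change in inclusion ratio: {delta:+.2f}, p-value: {p:.3g}")
```

In a real analysis, every variant is tested, so the p-values would also need a multiple-testing correction, and a model that accounts for variability between replicates is preferable to pooling.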

Finally, we will use our newly developed Python package KisSplice2RefGenome to annotate each event and assign a gene name to each candidate. This last step obviously cannot be used for a non-model species where no reference genome is available.

Tutor: Vincent Lacroix (LBBE, Université Claude Bernard – Lyon 1).

Afternoon sessions – 14:00 to 18:00

Session I: HOPE

Much NGS work is directed towards finding that one point mutation that is causal for a genetic disorder. Despite NGS and massive bioinformatics efforts, this is still not an easy task, but NGS and bioinformatics have made the undoable doable, and the doable much faster.

At the end of this mutation hunt, the medical doctor is stuck with the leftover question: "I now know that the mutation H58G in the GVase protein is causal for the disease, but why?". We have designed HOPE, a software tool that answers that last question almost fully automatically. In the seminar we will discuss HOPE's possibilities and limitations against a background of what is possible today using the most modern tools in protein structure bioinformatics.

Tutor: Gert Vriend (CMBI, Nijmegen Centre for Molecular Life Sciences).

Session II: Nano-environments of protein districts – the valuation edge of structure-enabled drug discovery

Proteins function in the very dense environment of the cell interior. There they fold, execute their function with specificity, interact with other proteins and bind certain agents. For each of these activities, proteins have a dedicated "district" of their structure, which we were able to describe in terms of corresponding nano-environment properties/attributes/descriptors. I will briefly describe the procedure for identifying protein interfaces, catalytic site residues and secondary structure elements, all in terms of their respective nano-environments. Toward the end of my presentation I will address the application potential of our current results. The STING platform, STING DB and selected STING modules will be demonstrated.

Tutor: Goran Neshich (EMBRAPA, Brazilian Agricultural Research Corporation).

Session III: Macromolecular Dynamics (MD) and Computational Quantum Chemistry (CQC)

This session is CANCELLED!

This presentation will show that the dynamic properties of molecules are often paramount to understanding and predicting their behaviour and interactions; it will be fully hands-on, and will depend on the available time.

Macromolecular Dynamics

This part is optional, depending on time availability. The idea is to start by launching Zephyr and configuring a basic MD run. In so doing, a short explanation of the basic parameters will be provided (whether it should be in vacuum, in modelled solvent or in full solvent, the timescale, etc.) and, while it runs, many more parameters will be briefly mentioned, and properties of interest explained: thermodynamics, flexibility, structural optimisation, etc. This can take about half an hour and be omitted if needed, or deferred until after point 3 below.

Computational Quantum Chemistry

Students will move to Gabedit and open a small drug (likely aspirin) as well as its Wikipedia Web page. First, they'll run a minimisation and conformational search using classic MD. Here, they will learn about Simulated Annealing and conformational search.

This should be relatively quick and allow us to move to Semi-empirical Quantum Molecular Dynamics. This is likely to be slow using mopac7, and would be much faster using MOPAC2012 or FireFly, but these need licences. However, in the meantime, MD can be explained in more detail.

If MD has already been talked about, then students will stop the simulation and load a pre-computed trajectory. They will visualise the trajectory and pick one conformation on which to perform a Single Point MOPAC calculation to compute the Molecular Orbitals. Then they will visualise the HOMO and LUMO, and a short introduction to Frontier Molecular Orbital (FMO) theory will be given, comparing results against known data from Wikipedia.

After the tutorial, students should understand why it is important to study the dynamic behaviour of molecules, and how modern CQC can help characterise differences in molecular properties. Hopefully, they will be enticed to follow on with more advanced courses.

Tutor: José Valverde (CNB, Universidad Autónoma de Madrid).
