Title : Theories and Modeling of Metabolic Networks

by Jean-Pierre  MAZAT, University of Bordeaux, France
 

We will first emphasize that metabolic modelling rests on implicit hypotheses, which will be discussed. We will then show that there is a genuine theory of metabolic networks, based on:

Enzyme kinetics, a modelling venture in biology that is more than a hundred years old. It leads to complex rate equations that describe the molecular mechanisms of enzymatic catalysis. When studying metabolic networks, these complex equations may be replaced by simpler ones involving the maximal rates in both directions and the half-saturation constants for substrates and products. For the representation of allosteric or genetic regulation, Hill curves are usually satisfactory.

The representation of a metabolic network by its stoichiometry matrix. It allows the dynamics of the network to be written simply as a matrix differential equation giving the variations of the metabolites as a function of the individual reaction rates.

The decomposition of a metabolic network into elementary modes, the minimal pathways through the network. The very large number of elementary modes in a given metabolic network has to be emphasized.

Metabolic Control Theory, a theory of the sensitivity of metabolic networks in the linear vicinity of their steady states. It leads to important practical theorems such as the summation theorem, and it has resolved some paradoxical situations and apparently contradictory results in the functioning of metabolic networks.

Using these theories on simple models, we will show that it is possible to disprove some generally accepted ideas in biology.
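
As a minimal sketch of two of these ingredients, the Python fragment below writes a toy pathway X0 -> S -> X1 as dS/dt = N·v and checks the summation theorem numerically. The pathway, the rate laws and all numerical values are invented for illustration and are not taken from the course.

```python
# Toy pathway X0 -> S -> X1; X0 and X1 are held constant (invented values).
X0, X1 = 10.0, 1.0
N = [[1.0, -1.0]]  # stoichiometry matrix: row = metabolite S, columns = reactions 1 and 2

def rates(s, e):
    """Reversible mass-action rates, each proportional to its enzyme amount e[i]."""
    v1 = e[0] * (2.0 * X0 - 1.0 * s)   # X0 <-> S
    v2 = e[1] * (3.0 * s - 0.5 * X1)   # S <-> X1
    return [v1, v2]

def steady_state_flux(e, s=0.0, dt=1e-3, steps=20_000):
    """Integrate dS/dt = N . v(S) with explicit Euler, then return the flux."""
    for _ in range(steps):
        v = rates(s, e)
        s += dt * sum(N[0][j] * v[j] for j in range(len(v)))
    return rates(s, e)[0]

def flux_control_coefficient(i, e, h=1e-4):
    """Scaled sensitivity (e_i / J) * dJ/de_i by central finite differences."""
    up = list(e)
    up[i] *= 1 + h
    down = list(e)
    down[i] *= 1 - h
    j = steady_state_flux(e)
    return (steady_state_flux(up) - steady_state_flux(down)) / (2 * h * j)

enzymes = [1.0, 1.0]
coefficients = [flux_control_coefficient(i, enzymes) for i in range(2)]
print(coefficients, sum(coefficients))  # the sum should be close to 1
```

In this toy case the control of the flux is shared between the two enzymes, so neither step is a single "rate-limiting" step, which is the kind of accepted idea such small models can call into question.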

In conclusion, we will ask why biologists so rarely model their experimental situations instead of describing them in words and sentences, which quickly leads to false reasoning. One reason could be that there are few simple tools that allow biologists to build simple first models themselves.

 


Title : On the link between oscillations and negative circuits in discrete genetic regulatory networks

 by  Adrien Richard, University of Nice-Sophia Antipolis, France 

 

The biologist René Thomas conjectured, twenty years ago, that the presence of a negative circuit in the interaction graph of a genetic regulatory network is a necessary condition for the presence of sustained oscillations in the dynamics of the network. Here, we state and prove this conjecture in a general discrete framework. The set of states of the network is assumed to be the Cartesian product X of n finite intervals of integers, and the dynamics of the network is represented by an asynchronous state transition graph Γ on X, as in Thomas' modeling. We then derive from Γ an interaction graph G, and we show that the presence in Γ of a strongly connected component A that cannot be left and that is not reduced to a single state implies the presence of a negative circuit in G. This discrete version of Thomas' conjecture was previously proved by Remy, Ruet and Thieffry under the strong hypothesis that A contains a unique cycle and under some assumptions on the form of Γ.
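
The statement can be illustrated on a very small case. The Python sketch below assumes a hypothetical two-gene Boolean network (the simplest instance of the discrete framework, with each interval equal to {0, 1}). It builds the asynchronous state transition graph, looks for terminal strongly connected components with more than one state, and checks that the interaction graph contains a negative circuit. The network and the helper names are invented for illustration; this is a brute-force check on one example, not the proof technique of the talk.

```python
from itertools import product

# Hypothetical network: gene 0 is repressed by gene 1, gene 1 is activated by gene 0.
def f(x):
    return (1 - x[1], x[0])

n = 2
states = list(product((0, 1), repeat=n))

def successors(x):
    """Asynchronous updates: exactly one component moves toward f(x)."""
    fx = f(x)
    return [x[:i] + (fx[i],) + x[i + 1:] for i in range(n) if fx[i] != x[i]]

def reachable(x):
    """All states reachable from x, including x itself."""
    seen, stack = {x}, [x]
    while stack:
        for y in successors(stack.pop()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

reach = {x: reachable(x) for x in states}
# x lies in a terminal component when every state reachable from x can reach x back.
attractors = {frozenset(reach[x]) for x in states
              if all(x in reach[y] for y in reach[x])}
cyclic_attractors = [a for a in attractors if len(a) > 1]

def interaction_edges():
    """Signed edges (j, i, s): raising x_j changes f_i by s in at least one state."""
    edges = set()
    for x in states:
        for j in range(n):
            if x[j] == 0:
                y = x[:j] + (1,) + x[j + 1:]
                fx, fy = f(x), f(y)
                for i in range(n):
                    if fy[i] != fx[i]:
                        edges.add((j, i, fy[i] - fx[i]))
    return edges

def has_negative_circuit(edges):
    """Depth-first search for a simple circuit whose product of signs is negative."""
    def dfs(start, node, sign, visited):
        for (j, i, s) in edges:
            if j != node:
                continue
            if i == start and sign * s < 0:
                return True
            if i not in visited and dfs(start, i, sign * s, visited | {i}):
                return True
        return False
    return any(dfs(v, v, 1, {v}) for v in range(n))

print(cyclic_attractors, has_negative_circuit(interaction_edges()))
```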

 

 


 

Title : Formal approaches to model gene regulatory networks

by Gilles Bernot, University of Nice-Sophia Antipolis, France

 

The first part of the course will develop the basic modelling approach introduced by René Thomas (Brussels) in the 1970s. We will first explain how the space of possible gene expression levels can be decomposed into several intervals in order to obtain a discrete, qualitative description of gene networks. We will then show how this discrete approach can be formalized using formal methods from computer science. The parameters of Thomas' approach are used to build an "asynchronous" automaton from the gene interaction graph. This automaton mathematically models the dynamic behaviour of the regulatory network (the time evolution of the expression levels).
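
The two steps described here, discretizing expression levels and deriving asynchronous transitions from parameters, can be sketched as follows. The two-gene network, the thresholds and the parameter values K are all invented for illustration; only the general scheme (one gene moves one level at a time toward the target given by its current resources) follows the description above.

```python
from bisect import bisect_right

def discretize(concentration, thresholds):
    """Map a continuous level to the index of the interval it falls in."""
    return bisect_right(sorted(thresholds), concentration)

# Hypothetical network: x takes levels 0..2 and y takes levels 0..1.
# x activates y from level 1, x activates itself from level 2, y inhibits x from level 1.
def resources(state):
    """Regulators currently favouring each gene (an absent inhibitor counts as a resource)."""
    x, y = state["x"], state["y"]
    res_x = frozenset(r for r, on in (("x", x >= 2), ("y", y < 1)) if on)
    res_y = frozenset(r for r, on in (("x", x >= 1),) if on)
    return {"x": res_x, "y": res_y}

# Thomas-style parameters K[gene][resources]: the level the gene tends toward (invented).
K = {
    "x": {frozenset(): 0, frozenset({"x"}): 2,
          frozenset({"y"}): 2, frozenset({"x", "y"}): 2},
    "y": {frozenset(): 0, frozenset({"x"}): 1},
}

def successors(state):
    """Asynchronous automaton: one gene at a time moves one level toward its target."""
    res = resources(state)
    out = []
    for gene in ("x", "y"):
        target = K[gene][res[gene]]
        if target != state[gene]:
            step = 1 if target > state[gene] else -1
            out.append({**state, gene: state[gene] + step})
    return out

print(discretize(0.3, [0.5, 1.5]), successors({"x": 0, "y": 0}))
```

A state with no successor, such as {"x": 2, "y": 1} under these invented parameters, is a stable state of the automaton.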


The second part of the course will focus on current research. We will show how to use formal logic to extract unknown parameter values from the observed behaviours. We will first give a short course on formal temporal logics and their associated model checking methods. We will then show how model checking can be used to find the set of possible parameter values, i.e. the parameters that are consistent with the known qualitative behaviours. Complementary results can be used to reduce the size of this set of consistent parameters: we shall explain how to use the notion of "functionality" of a path in the interaction graph. Lastly, we will explain how some current software testing methods can be used to generate interesting "wet biology" experiments, starting from the formal descriptions of the interaction graph and the biological hypotheses under consideration. Some examples will be used to illustrate the notions defined during the course. In particular, the simple model of mucus production in Pseudomonas aeruginosa will be fully studied. Pseudomonas aeruginosa is an opportunistic bacterium that infects the lungs of patients with cystic fibrosis.
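
The idea of selecting parameters by checking a behavioural property can be sketched by brute force on a tiny case. The fragment below assumes a hypothetical Boolean network in which a activates b and b inhibits a, enumerates all 16 parameter sets, and keeps those for which the state (0, 0) can return to itself (a simple reachability property standing in for a temporal logic formula). A real model checker and a real temporal logic are not used here; the network, the property and the names are illustrative assumptions.

```python
from itertools import product

# Parameter keys: (gene, resource_present). For a the resource is "b absent",
# for b the resource is "a present".
KEYS = [("a", False), ("a", True), ("b", False), ("b", True)]

def successors(state, k):
    """Asynchronous successors of a Boolean state under parameters k."""
    a, b = state
    target_a = k["a", b == 0]
    target_b = k["b", a == 1]
    out = []
    if target_a != a:
        out.append((target_a, b))
    if target_b != b:
        out.append((a, target_b))
    return out

def can_return(start, k):
    """True if some path with at least one transition leads from start back to start."""
    seen, stack = set(), list(successors(start, k))
    while stack:
        s = stack.pop()
        if s == start:
            return True
        if s not in seen:
            seen.add(s)
            stack.extend(successors(s, k))
    return False

all_parameters = [dict(zip(KEYS, values)) for values in product((0, 1), repeat=4)]
consistent = [k for k in all_parameters if can_return((0, 0), k)]

# Optional extra filter: a resource must not lower the target level,
# which keeps only parameters compatible with the signs of the interactions.
sign_consistent = [k for k in consistent
                   if k["a", False] <= k["a", True] and k["b", False] <= k["b", True]]

print(len(all_parameters), len(consistent), sign_consistent)
```

Adding observed behaviours as further properties shrinks the consistent set in the same way, which is the role model checking plays in the approach described above.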

 


Title : The Informatics Framework in immunoinformatics

by  Marie-Paule Lefranc,  University Montpellier 2,  France

 

What is an ontology? Why was IMGT-ONTOLOGY needed? What is immunogenetics and what does IMGT-ONTOLOGY bring to that science? What is immunoinformatics and what does IMGT-ONTOLOGY bring to this field? These are a few of the questions that will be addressed. We will define four of the major IMGT-ONTOLOGY axioms (IDENTIFICATION, DESCRIPTION, CLASSIFICATION and NUMEROTATION), taking the antigen receptors as examples. These molecules, immunoglobulins (or antibodies) and T cell receptors, are key molecular components of the adaptive immune response that characterizes the vertebrate species. Their huge diversity is inherent to the particularly complex and unique molecular synthesis of the antigen receptor chains. We will describe how the IMGT-ONTOLOGY concepts have made it possible to standardize and manage immunogenetics data and to develop IMGT®, the international ImMunoGeneTics information system® (http://imgt.cines.fr), the international reference in immunogenetics and immunoinformatics. We will show that IMGT-ONTOLOGY, a must for integrative immunogenetics and immunoinformatics, is also a paradigm for any ontology in the life sciences.

 


 

Title :  Data Mining Methods and Clustering in Gene Expression Data Analysis

 by Martine Collard, University of  Nice-Sophia Antipolis, France
 

Gene expression techniques such as microarrays aim at measuring gene expression, i.e. RNA levels, for the whole transcriptome of an organism. These methods are powerful tools for studying biological processes from a transcriptome point of view. They produce large datasets that simultaneously measure the expression of thousands of genes under different experimental conditions. One major challenge with these datasets is to uncover the information potentially hidden in their large volume.

Data mining methods may be applied in various domains in order to explore large volumes of data and elicit novel knowledge useful to domain experts. They require either techniques designed specifically to fit the size of the datasets or solutions from the related fields of data analysis, statistics, machine learning and databases. One frequent data mining task consists in clustering sets of objects according to their attribute values. Many clustering methods have been designed, in particular for very large datasets such as textual corpora. On the other hand, one major objective of gene expression techniques is to discover novel information on biological systems through the analysis of gene expression data. Three typical kinds of analysis are commonly carried out: the search for genes that are differentially expressed under different conditions; the search for co-expressed genes, i.e. genes with similar expression profiles under given conditions; and the search for specific expression patterns such as temporal expression sequences. Gene expression data clustering consists in grouping genes into classes (or clusters) according to their expression profiles under given experimental conditions. Each resulting cluster is expected to contain genes that share a common expression profile. Gene expression clustering is based on the assumption that genes grouped into the same cluster should participate in a common biological process.

In this course, we will first introduce distance measures and then present two main families of clustering techniques used to elicit co-expressed genes. Hierarchical methods split an initial set of genes into a hierarchy of clusters represented as a dendrogram; Eisen [1] first highlighted the quality of the extracted clusters by matching them with a functional categorization of genes. Partitioning methods divide the initial set of genes into mutually exclusive subsets, like the well-known K-Means algorithm and its extension [2] or the SOM (Self-Organizing Maps) methods.
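
As a rough illustration of a distance measure combined with a hierarchical method, the sketch below uses a Pearson-correlation distance and a bottom-up (agglomerative) average-linkage procedure, stopping at a chosen number of clusters instead of recording the full dendrogram. The choice of distance and linkage, the gene names and the expression values are assumptions made for the example, not the specific methods of the course.

```python
from math import sqrt

def pearson_distance(a, b):
    """1 - Pearson correlation: near 0 for similar profile shapes, near 2 for opposite ones.
    Profiles with zero variance are not handled in this sketch."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        pairs = [(g, h) for g in c1 for h in c2]
        return sum(pearson_distance(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        candidates = [(i, j) for i in range(len(clusters))
                      for j in range(i + 1, len(clusters))]
        i, j = min(candidates, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(j)      # j > i, so index i is still valid
        clusters[i].extend(merged)
    return clusters

# Invented expression profiles over four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}
clusters = average_linkage(profiles, 2)
print(clusters)
```

A correlation-based distance groups genes by the shape of their profile rather than by absolute level, which is why geneA and geneB end up together despite their different magnitudes.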

We then propose to focus on the interpretation of the discovered clusters [4]. Indeed, clusters of co-expressed genes are generally quite numerous and their interpretation depends on external knowledge sources. Thus the current analysis roadmap for biologists is to pick some genes known to participate in a given biological function and to manually explore a preliminary raw clustering by association with these genes. Recent work has provided more automatic solutions, such as bi-clustering techniques [3], which search for groups of genes with common profiles under subsets of conditions only.

 

 


 

Title : Machine learning methods for high-dimensional and structured data

by  Lyhyaoui Abdelouahid, ENSAT, Morocco

 

The main aim of this session is to introduce machine learning techniques and their applications to biological data. We will place special emphasis on classification methods, distinguishing between two types of classification: supervised and unsupervised. We will present the best-known methods and techniques, in particular in the field of bioinformatics.

First part: Techniques for supervised classification. Biological data as digital symbol sequences and the information content of biological sequences. Machine-learning foundations: the probabilistic framework, Bayesian modeling and Bayesian inference. Supervised learning: neural networks, universal approximation properties, learning versus generalization, learning algorithm: back-propagation.
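
To make the back-propagation item concrete, here is a minimal sketch of a one-hidden-layer network trained by online gradient descent on a squared-error loss. The toy dataset (the logical AND of two inputs), the network size, the learning rate and the function names are all assumptions chosen for illustration; a real application to biological sequences would need a suitable input encoding, which is not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, hidden=3, lr=0.5, epochs=2000, seed=0):
    """Train a small sigmoid network and return the summed loss of each epoch."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Each weight row carries one extra entry for the bias.
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:
            # Forward pass.
            inputs = x + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, inputs))) for row in w1]
            hidden_out = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w2, hidden_out)))
            total += 0.5 * (y - target) ** 2
            # Backward pass: propagate the output error to the hidden layer.
            delta_out = (y - target) * y * (1 - y)
            delta_hid = [delta_out * w2[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            # Gradient descent updates.
            for j, v in enumerate(hidden_out):
                w2[j] -= lr * delta_out * v
            for j in range(hidden):
                for i, v in enumerate(inputs):
                    w1[j][i] -= lr * delta_hid[j] * v
        losses.append(total)
    return losses

# Invented toy dataset: the logical AND of two binary inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
losses = train(data)
print(losses[0], losses[-1])
```

The loss printed here is measured on the training examples only; the "learning versus generalization" question concerns how such a network behaves on data it has not seen.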

Second part: Unsupervised learning. Application to the analysis of gene expression data. Characteristics of gene expression data and clustering algorithms: hierarchical clustering and K-means. Neural clustering: Self-Organizing Maps.
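
A minimal sketch of K-means (Lloyd's algorithm) is given below. To keep the example reproducible it starts from the first k points instead of a random initialization, which is a simplification, and the two-dimensional data points are invented.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with a deterministic start (the first k points)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Invented data: two well-separated groups of expression-like measurements.
points = [(0.1, 0.2), (5.0, 5.1), (0.0, 0.1), (5.2, 4.9), (0.2, 0.0), (4.9, 5.0)]
labels, centers = kmeans(points, 2)
print(labels, centers)
```

Unlike a hierarchical method, K-means needs the number of clusters in advance and its result can depend on the starting centers.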

Application : Techniques for generating expression data. Pre-processing, clustering, and validation, both internal (statistical) and external (biological). Presentation of some software for clustering and for validating clustering results.

 

Training : Case study: analysis of gene expression data and other classification and prediction problems.

 


 

Title : Complexes of proteins in genetic regulatory networks

by Amal  Maurady, SMBI, Morocco

 

The study of the interactions between the proteins involved in the genetic regulation network requires a global strategy that can reflect the complexity of the network and predict new functions of the macromolecular complexes. In parallel, the assembly of the protein-protein and protein-DNA complexes formed by these proteins was identified with the Prodistin algorithm and predicted by docking. We currently use rigid-body docking programs based on the Fourier correlation approach. The database, which integrates data on the proteins involved in the regulation network from the Gene Ontology, Protein Data Bank and InterPro databases, was completed with a database of protein interactions in the regulatory system: experimental data, data from the literature and predicted interactions.

 


 

Title : Genomics Platform in Morocco

by M. EL Fahime, CNRST, Morocco