City Hall
4th IAPR International Conference on
Pattern Recognition in Bioinformatics
Sheffield, 7-9 September 2009

Tutorials

Magnus Rattray
University of Manchester

Title: Probabilistic learning and inference in computational systems biology

Abstract: Computational systems biology involves the development and application of models describing various dynamical processes that are important in the function of cells and tissues. Useful models that have been applied to these systems include probabilistic state-space models and differential equation models. The parameters and structure of these models are often poorly constrained by available experimental knowledge. Probabilistic methods are therefore important for making model-based inferences and for learning the model parameters from the available data. I will introduce the basic concepts of probabilistic inference and give examples of applications to systems biology models.



Jagath Rajapakse
Nanyang Technological University

Title: Computational Methods of Detecting Sequence Motifs

Abstract: Motifs in bio-sequences are sequence segments that are associated with an important biological function. In this tutorial, computational methods for finding motifs in bio-sequences, and recent advances in these methods, will be presented. Probabilistic methods of motif finding such as profile analysis, Expectation Maximization (EM), Multiple EM for Motif Elicitation (MEME) and Gibbs sampling will be presented and their limitations discussed. Graph-based approaches such as WINNOWER and SP-STAR, and the method of random projections, will be presented for weak motif recognition. Recent applications of evolutionary algorithms to the motif detection problem will be discussed.
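
To give a flavour of the simplest technique named above, profile analysis, the following is a minimal sketch rather than material from the tutorial itself. It builds a log-odds profile from a handful of aligned sites and slides it along a sequence to find the best-scoring window. The toy sites, the uniform background, the pseudocount and the function names `build_profile` and `best_match` are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_profile(sites, alphabet="ACGT", pseudocount=1.0):
    """Return per-column log2-odds scores against a uniform background."""
    background = 1.0 / len(alphabet)
    total = len(sites) + pseudocount * len(alphabet)
    profile = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        profile.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in alphabet
        })
    return profile


def best_match(sequence, profile):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(profile)
    scored = [
        (sum(profile[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Toy aligned sites, invented for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATATT"]
profile = build_profile(sites)
score, offset = best_match("GGCCTATAATGCGC", profile)
matched = "GGCCTATAATGCGC"[offset:offset + len(profile)]
```

EM-based methods and Gibbs sampling can be seen as ways of learning such a profile when the site positions are not known in advance; this sketch assumes they are given.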



Simon Rogers
University of Glasgow

Title: Methods and algorithms for bridging Omics data levels

Abstract: Multiple -omics data sets (for example, high-throughput mRNA and protein measurements for the same set of genes) are beginning to appear more widely within the fields of bioinformatics and computational biology. There are many tools available for the analysis of single data sets, but two (or more) sets of coupled observations present more of a challenge. In this talk I will look at some examples of the type of data that we may come across, and then describe some of the methods available, from classical statistical techniques to more recent advances from the fields of Machine Learning and Pattern Recognition. I will also attempt to describe some of the many open problems that remain in this area.



Giuseppe Jurman and Samantha Riccadonna
Bruno Kessler Foundation

Title: Machine Learning Pipelines for High-Throughput Functional Genomics: The mlpy package

Abstract: Designing and implementing a correct Data Analysis Protocol (DAP) is the key step in ensuring reproducibility and honest performance estimates for an -omics profiling experiment. In this hands-on presentation the participants will be taught how to implement all components of an analysis workflow through the scripting language Python and its dedicated library Machine Learning PY (mlpy). mlpy is an Open Source high-performance Python package making extensive use of numpy (http://scipy.org) to provide fast N-dimensional array manipulation and easy integration of C code. By addressing a classification task on demonstrative datasets, we will show how to implement a basic DAP by means of the data resampling, error evaluation, list stability and experiment landscaping tools. Along the way, we will explore some of the available algorithms for preprocessing, predictive classification and feature selection, weighting and ranking. Prerequisites: Participants are asked to bring their laptops, with the latest mlpy version installed: https://mlpy.fbk.eu. Installation instructions are available at https://mlpy.fbk.eu/data/doc/install.html.
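
As a rough idea of what a basic DAP involves, the sketch below uses only the Python standard library and deliberately does not use or imitate the mlpy API. It repeats a stratified hold-out split, ranks features on the training part only (so the test error is not biased by the selection step), classifies with a nearest-centroid rule, and records how often each feature is selected as a crude stability measure. The synthetic data, the function names and all parameter values are assumptions made for illustration.

```python
import random
from collections import Counter
from statistics import mean


def rank_features(X, y):
    """Rank feature indices by absolute difference of class means, largest first."""
    def gap(j):
        ones = [row[j] for row, label in zip(X, y) if label == 1]
        zeros = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(mean(ones) - mean(zeros))
    return sorted(range(len(X[0])), key=gap, reverse=True)


def centroid_predict(X_train, y_train, X_test, features):
    """Nearest-centroid classification using only the selected features."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [mean(row[j] for row in rows) for j in features]

    def sq_dist(row, label):
        return sum((row[j] - c) ** 2 for j, c in zip(features, centroids[label]))

    return [min((0, 1), key=lambda lab: sq_dist(row, lab)) for row in X_test]


def run_dap(X, y, n_splits=20, n_keep=2, test_fraction=0.3, seed=0):
    """Repeated stratified hold-out: mean test error and selection frequencies."""
    rng = random.Random(seed)
    by_class = {lab: [i for i, v in enumerate(y) if v == lab] for lab in (0, 1)}
    errors, picks = [], Counter()
    for _ in range(n_splits):
        test = []
        for members in by_class.values():
            rng.shuffle(members)
            test += members[: max(1, int(len(members) * test_fraction))]
        held_out = set(test)
        train = [i for i in range(len(X)) if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        # Feature selection sees the training samples only.
        features = rank_features(X_tr, y_tr)[:n_keep]
        predicted = centroid_predict(X_tr, y_tr, [X[i] for i in test], features)
        errors.append(sum(p != y[i] for p, i in zip(predicted, test)) / len(test))
        picks.update(features)
    stability = {j: picks[j] / n_splits for j in range(len(X[0]))}
    return mean(errors), stability


# Synthetic data (assumption): 5 features, only features 0 and 1 carry signal.
data_rng = random.Random(1)
y = [0] * 20 + [1] * 20
X = [[data_rng.gauss(4.0 * label if j < 2 else 0.0, 1.0) for j in range(5)]
     for label in y]
error, stability = run_dap(X, y)
```

Here `error` is the test error averaged over the splits and `stability[j]` is the fraction of splits in which feature `j` was among those selected. A real protocol would typically add nested parameter tuning and a more careful stability indicator.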