February 2007
Date of birth: August 12, 1982
Place of birth: Monfalcone (GO), Italy
Nationality: Italian
Ph.D. Computer Science (Jan 2007)
University of Pisa, Italy
M.S. Computer Science (Feb 2005 - Oct 2006)
University of Pisa, Italy
Graduated (with honors) summa cum laude
Thesis supervisors: Alessio Micheli and Antonina Starita
Thesis title: Generative Kernel Functions for Structured Data
B.S. Computer Science (Oct 2001 - Feb 2005)
University of Pisa, Italy
Graduated (with honors) summa cum laude
Visiting Student (Dec 2003-Jul 2004)
Carnegie Mellon University,
Pittsburgh, Pennsylvania, USA
School of Computer Science
Enrolled as an undergraduate student at the School of Computer Science,
attending graduate courses at the Center for Automated Learning and
Discovery (now the Machine Learning Department), the Computer Science
Department, the Language Technologies Institute and the Robotics Institute
Graduate Student Member (01/2005-Present)
Computational Intelligence and Machine Learning Group
Department of Computer Science, University of Pisa, Italy
Development of the software framework Structlab for Machine Learning
and Applied Statistics on structured domains.
Supervisors: Alessio Micheli and Antonina Starita
Web reference: http://structlab.sourceforge.net and http://ciml.di.unipi.it
Software Engineer (05/2006-09/2006)
Department of Statistics and Applied Mathematics
University of Pisa, Italy
Implementation of a simplex-based algorithm for solving subclasses of
nonlinear problems through sensitivity analysis procedures
Supervisors: Laura Carosi and Laura Martein
Programmer Analyst (06/2004-10/2004)
Alberta Ingenuity Center for Machine Learning
Edmonton, Alberta, Canada
Development and implementation of a conditional random field model for
image segmentation and tumor growth prediction, using a min-cut algorithm
for inference and a conjugate-gradient approach for parameter training.
Supervisor: Russell Greiner
Web reference: http://kingman.cs.ualberta.ca/research/projects/content/projects_template.php?num=11
Intern (02/2004-06/2004)
Carnegie Mellon University
Pittsburgh, Pennsylvania, USA
Space-invariant feature extraction from 3D brain volumes, aligned through
midsagittal plane extraction, using and extending the Insight Toolkit for
Image Segmentation and Registration.
Supervisors: Yanxi Liu and Leonid Teverovskiy
My main research interest is machine learning, in particular the design and analysis of systems that automatically adapt based on experience using mathematical models. During my studies and research I have worked with a broad range of methods, with particular emphasis on statistical models for time sequences and hierarchically structured data, such as Hidden Markov Models, Conditional Random Fields and Hidden Recursive Networks.
My principal area of research has been the integration of these models with Kernel Methods, through the study and development of Generative Kernel Functions, which aim to combine the modeling ability of generative models with the strong predictive performance of discriminative approaches.
I am mainly interested in applications to real-world structured domains, and have experience in Image Analysis, Bioinformatics and Cheminformatics.
Structlab
I developed, and am now extending, a machine learning and applied
statistics software library for structured domains. It aims to be an
easy-to-extend framework for learning experiments with structured data,
and provides a toolbox of generative and discriminative learning
methods, together with tools for loading, preprocessing,
cross-validation, and visualization. Structlab is accompanied by a
graphical user interface for setting up elaborate machine learning
experiments in a visual and intuitive way.
Web reference: http://structlab.sourceforge.net
[1] Nicotra, L., Micheli, A., Starita, A. (2004), Fisher Kernel for Tree Structured Data, Proceedings of the IEEE International Joint Conference on Neural Networks, 1917-1922
[2] Nicotra, L., Micheli, A., Starita, A. (2007), Generative Kernels for Gene Function Prediction through Phylogenetic Tree Models of Evolution (submitted)
Generative Kernel Functions for Structured Data (2006)
This thesis explores ways of combining probabilistic models and kernel methods. It presents a class of generative kernel functions that define embeddings of the input domain based on probabilistic models of the data-generating process, and then combine these models to define a similarity measure on the domain. Some of the presented kernel classes extend previously defined approaches to more structured domains, while others are entirely new formulations, in particular the class of relative probability kernels. The performance of the generative kernels is evaluated on several benchmarks: a set of simulated data, a classification problem on biological sequences, and two domains of molecules modeled as trees, namely a QSPR (Quantitative Structure-Property Relationship) analysis problem on a class of alkanes and a QSAR (Quantitative Structure-Activity Relationship) analysis problem on a class of benzodiazepines.
Mathematics and Statistics
Mathematical Analysis, Algebra, Languages and Methods of Mathematics,
Numerical Calculus, Computational Mathematics, Physical Modeling,
Probability and Statistics, Operations Research
Machine Learning
Statistical Approaches for Learning and Discovery (CALD at CMU),
Machine Learning, Machine Learning Theory (CALD at CMU), Information
Theory, Neural Networks I, Neural Networks II, Data Mining Techniques,
Intelligent Systems I, Intelligent Systems II, Applied Machine Learning
Computational Genomics and System Biology (CALD at CMU),
Bioinformatics, Natural Language Processing, Learning to Turn Words
into Data: Information Extraction & Integration (CMU), Methods
in Medical Image Analysis (Robotics Institute at CMU)
Programming
C++ (2000-2006, expert), R/Splus (2004-2006, intermediate), Java
(2001-2005, intermediate), Matlab (2003-2005, beginner), C#/.NET (2006,
beginner), Perl (2004, beginner), Fortran (2004, beginner), C
(2002-2003, beginner), Python (2006, beginner), SQL (2003, beginner),
OCaml (2003-2004, beginner)
Operating systems
Linux/Unix (2000-2006, expert), Microsoft Windows (1999-2006,
intermediate)
Software Toolkits
Root (Data
analysis, C++), ITK (Image Segmentation and Registration, C++), Torch
(Machine Learning Library, C++), Libsvm (Support vector machine, C),
Clp (Linear programming solver, C++), Bayesian Network Toolkit
(Graphical models, Matlab), Intel Probabilistic Networks Library
(graphical models, C++), Weka (Machine learning and data mining, Java),
Boost (general purpose libraries, C++)
Programming Tools
Eclipse, Emacs and Microsoft Visual Studio (IDE), Dart (testing tool),
Memproof and Valgrind (memory profilers), CVS and Subversion
(versioning tools)
Scientific Formatting Tools and Languages
LaTeX/PostScript, LyX, Gnuplot
Graphical User Interfaces Programming
wxWidgets (2006, beginner) and Gtk (2006, beginner)
Web Programming
Experience delivering content-management-based websites (especially
Drupal-based) for many organizations; HTML
Statistical Data Analysis and Visualization,
Numerical Optimization and Scientific Writing
Italian (native), English (fluent), German (intermediate), Latin (basic)
Graduation (M.S.) with Honors in Computer Science, University of Pisa, 2006
Graduation (B.S.) with Honors in Computer Science, University of Pisa, 2005
Internationalization of University Program Scholarship, Italian Ministry of Education, University and Research, 2004
Studentship and full tuition support, University of Pisa, 2001-2006
IEEE Computer Society since 2002
Associazione Italiana Intelligenza Artificiale since 2002
Association for Computing Machinery (ACM) since 2003
IEEE Computational Intelligence Society since 2004
IEEE Information Theory Society since 2006
American Association for Artificial Intelligence (AAAI) since 2004
American Statistical Association (ASA) since 2006
Available upon request.