David Picard
ETIS - ENSEA
tuesday 05 march 2013
JKernelMachines version number bumped to 2.0!
The big changes are:
friday 16 november 2012
I am organizing a special session with Philippe Gosselin at this year's ESANN conference.
Machine Learning for multimedia applications
David Picard, ETIS – ENSEA
Philippe-Henri Gosselin, INRIA Rennes (France)
In recent years, many multimedia applications have improved substantially by leveraging machine learning techniques. These applications include image and video classification, object recognition, image and video retrieval, and event detection.
However, these multimedia applications also uncover new machine learning problems in areas such as mid-level features learning, distance learning, feature combination, and so on.
This special session is intended for research papers that apply machine learning to multimedia problems. The following topics are of particular interest:
wednesday 07 march 2012
Tomorrow, Hassen will be showing us some pretty things on 3D object recognition (the talk is in French). It's open to everyone, room 384 at the ENSEA.
Statistical computing on manifolds of 3D shapes for identity and expression recognition
We propose a Riemannian framework for comparing, deforming, computing statistics on, and hierarchically organizing facial surfaces. We apply this framework to 3D face biometrics independently of facial expressions. The same framework is used to recognize expressions independently of identity. Facial surfaces are represented by a set of radial curves. In this case, the computation simplifies and the shape space of open curves reduces to a hypersphere in Hilbert space. Identity recognition is based on an elastic metric in order to cope with non-isometric deformations (those that do not preserve lengths) of facial surfaces. Expression recognition, for its part, is based on learning the energy needed to deform neutral faces to express the six universal emotions. The proposed identity recognition approach was validated on well-known benchmarks (FRGCv2, GAVAB, BOSPHORUS) and obtained results competitive with state-of-the-art methods. The expression recognition approach was tested on BU4D, a database of 3D sequences, and outperforms state-of-the-art approaches.
wednesday 01 february 2012
We (N. Thome, M. Cord, A. Rakotomamonjy and I) received the notification of acceptance for a poster presentation at ESANN 2012. This work is on learning product combinations of kernels.
The sketch is as follows: suppose you have several types of features and signatures leading to a variety of kernels (typically Gaussian kernels). This is quite a common scheme in Computer Vision. You might want to combine them, and people usually use MKL approaches (i.e. a weighted sum of kernels). However, in most cases these kernels are redundant, and you would do better with a product combination (think of the different scales in a Spatial Pyramid, or different scales of the same descriptor). The product acts like an 'AND' gate while the sum acts more like an 'OR' gate, so if your features are redundant, the product is more likely to denoise than the sum.
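To make the 'AND' versus 'OR' intuition concrete, here is a minimal sketch in plain Python. It assumes Gaussian kernels on two feature channels; the function names, the weights and the toy vectors are illustrative assumptions, not code from the paper or from JKernelMachines.

```python
import math

def gaussian_kernel(x, y, gamma):
    """Gaussian kernel exp(-gamma * ||x - y||^2) on one feature channel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sum_kernel(xs, ys, gammas, weights):
    """MKL-style combination: weighted sum of the per-channel kernels."""
    return sum(w * gaussian_kernel(x, y, g)
               for x, y, g, w in zip(xs, ys, gammas, weights))

def product_kernel(xs, ys, gammas):
    """Product combination: multiply the per-channel kernels."""
    value = 1.0
    for x, y, g in zip(xs, ys, gammas):
        value *= gaussian_kernel(x, y, g)
    return value

# Two samples described by two channels (hypothetical toy values).
# They are identical on channel 0 but far apart on channel 1.
sample_a = [[0.0, 0.0], [0.0, 0.0]]
sample_b = [[0.0, 0.0], [3.0, 4.0]]
gammas = [1.0, 1.0]

k_sum = sum_kernel(sample_a, sample_b, gammas, weights=[0.5, 0.5])
k_prod = product_kernel(sample_a, sample_b, gammas)

# The sum stays high as soon as one channel agrees ("OR"),
# the product is high only when every channel agrees ("AND").
print(f"sum: {k_sum:.4f}  product: {k_prod:.2e}")
```

With these toy values the sum still reports a similarity of about one half because one channel matches, whereas the product drops to almost zero because the other channel disagrees.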
The bad thing about this product counterpart of MKL is that it is non-convex (we have a nice proof of this). So we proposed an algorithm that finds a local optimum. While this might not be the best combination possible, it is sufficiently robust in practice to remove non-informative kernels. The good thing is that it also sets the kernel parameters without any need for cross-validation.
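One way to see why tuning the parameters can also remove a kernel, assuming Gaussian kernels: a product of Gaussian kernels is a single exponential of a weighted sum of squared distances, so a bandwidth of zero turns the corresponding kernel into the constant 1 and it drops out of the product. The sketch below only illustrates this identity; the function name and the numbers are hypothetical, and it is not the optimization algorithm from the paper.

```python
import math

def product_gaussian(sq_dists, gammas):
    """Product of Gaussian kernels from per-kernel squared distances.

    prod_m exp(-gamma_m * d_m) == exp(-sum_m gamma_m * d_m), so the
    product behaves like one Gaussian kernel with a bandwidth per channel.
    """
    return math.exp(-sum(g * d for g, d in zip(gammas, sq_dists)))

# Hypothetical squared distances between two samples for three kernels;
# the middle one plays the role of a non-informative (noisy) kernel.
sq_dists = [0.2, 7.5, 0.1]

k_all = product_gaussian(sq_dists, [1.0, 1.0, 1.0])     # all kernels active
k_pruned = product_gaussian(sq_dists, [1.0, 0.0, 1.0])  # gamma = 0 on the noisy one
k_without = product_gaussian([0.2, 0.1], [1.0, 1.0])    # noisy kernel left out

# Setting a bandwidth to zero is the same as removing that kernel.
print(k_all, k_pruned, k_without)
```

In this view, choosing the bandwidths and selecting the kernels are the same problem, which is consistent with doing both without a separate cross-validation step.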
Once I've cleaned the paper of all remaining typos and corrections as suggested by the reviewers, I'll put the code online.