friday 24 april 2015
Since we had a few publications on the topic of distributed machine learning (in particular a Neurocomputing paper on distributed PCA: "Asynchronous Gossip Principal Components Analysis"), let's talk a bit more about it. My Ph.D. student Jérôme Fellus has rolled out the first version of his libagml library, a distributed machine learning library in C++ that relies on gossip protocols.
The main page is here: http://perso-etis.ensea.fr/~jerofell/software.html
The way it works is dead simple: you have a mother class that corresponds to a node, and all you have to do is derive it to implement your specific local computation and aggregation procedures. All the networking, instantiation, etc., is handled by the library. Nice, isn't it?
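To make that concrete, here is a minimal sketch of the pattern (libagml itself is in C++; the Java class and method names below are made up for illustration and are not the actual libagml API):

import java.util.Random;

// Illustrative only: mimics the "derive a node class" design described above.
// In the real library, networking and instantiation are handled for you.
abstract class GossipNode {
    protected double estimate;

    // Local computation producing this node's current value.
    abstract void computeLocal();

    // How two nodes merge their estimates when they gossip.
    abstract void aggregate(GossipNode peer);
}

class AverageNode extends GossipNode {
    private final double localValue;

    AverageNode(double localValue) { this.localValue = localValue; }

    @Override
    void computeLocal() { estimate = localValue; }

    @Override
    void aggregate(GossipNode peer) {
        // Pairwise averaging: all nodes converge to the global mean.
        double avg = (estimate + peer.estimate) / 2.0;
        estimate = avg;
        peer.estimate = avg;
    }
}

public class GossipDemo {
    public static void main(String[] args) {
        Random rnd = new Random(42);
        AverageNode[] nodes = new AverageNode[10];
        for (int i = 0; i < nodes.length; i++) {
            nodes[i] = new AverageNode(i); // local values 0..9, true mean = 4.5
            nodes[i].computeLocal();
        }
        // Simulate random pairwise gossip exchanges.
        for (int t = 0; t < 200; t++) {
            int i = rnd.nextInt(nodes.length), j = rnd.nextInt(nodes.length);
            if (i != j) nodes[i].aggregate(nodes[j]);
        }
        System.out.println("node 0 estimate: " + nodes[0].estimate); // close to 4.5
    }
}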
thursday 17 april 2014
2014 is set to be a good year! We already have the reviews for a few papers I've been working on lately. Some are in the ML domain (an ICPR paper with Romain Negrel on supervised sparse subspace learning, an ESANN paper with Jérôme Fellus on decentralized PCA), others in CV (two journal papers in revision on low-level visual descriptors with Olivier Kihl), and one in 3D indexing with Hedi Tabia (a CVPR poster).
Other than that, I've pushed version 2.3 of jkms. I've tagged it the "density edition" since most of the changes are related to density estimators (mostly one-class SVM). I've introduced the density version of SimpleMKL, which could be useful for model selection. Basically, if you set C=1, you get a Parzen estimator, albeit one that selects its kernel from a specific set.
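To see why: when every training point ends up weighted equally (all dual variables stuck at the same bound), the one-class SVM decision function degenerates into a uniform average of kernel evaluations centered on the training points, which is exactly a Parzen window. A self-contained sketch of that limit case in plain Java (not the jkms API):

import java.util.ArrayList;
import java.util.List;

// Parzen-window density estimate: f(x) = (1/n) * sum_i k(x, x_i).
public class ParzenDemo {
    // Gaussian RBF kernel with bandwidth gamma.
    static double k(double[] a, double[] b, double gamma) {
        double d2 = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            d2 += d * d;
        }
        return Math.exp(-gamma * d2);
    }

    static double density(double[] x, List<double[]> train, double gamma) {
        double sum = 0.0;
        for (double[] xi : train) sum += k(x, xi, gamma);
        return sum / train.size(); // uniform weights 1/n
    }

    public static void main(String[] args) {
        List<double[]> train = new ArrayList<>();
        train.add(new double[]{0.0, 0.0});
        train.add(new double[]{1.0, 1.0});
        train.add(new double[]{0.5, 0.5});
        System.out.println(density(new double[]{0.4, 0.6}, train, 1.0));
    }
}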
Finally, I'll be in Brugge next week for the ESANN 2014 conference. A good way to start new projects, if anyone volunteers!
friday 29 november 2013
Here we go again; it seems I'm only alternating between new publications and updates to jkms on this page. This time it's the latter, with a new version bringing:
- Fast kernel using the Nyström approximation (with a fast active learning procedure as in (Tabia BMVC13))
- Large scale kernel SVM using the Nyström approximation
- New algorithms and better tuning in the algebra package
- Multithreading support for algebra
- Optional dependency on EJML for faster eigendecomposition (the check is done at runtime, compatible with older code)
- Revised and online Javadoc
The library can now optionally depend on EJML to accelerate the eigendecomposition. I had a lot of fun implementing some of the algorithms myself (Jacobi, QR, Householder transforms, Givens rotations, ...), which allows the library to perform all the linear algebra it needs on its own. However, it will never be competitive with dedicated libraries. So I checked the current pure Java linear algebra libraries, and EJML is probably the best out there (kudos to the people behind it). I made a simple wrapper that checks whether the library is in the classpath, and uses it if so. No older code should break because of this. If it does, email me quickly...
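The classpath check itself is a standard reflection trick. A minimal sketch of the pattern (the probed class, org.ejml.data.DenseMatrix64F, was EJML's core matrix type at the time; the surrounding wrapper is illustrative, not the actual jkms code):

// Optional-dependency pattern: probe the classpath once at startup and
// fall back to the built-in implementation if EJML is absent.
public class EigenBackend {

    private static final boolean EJML_AVAILABLE = checkEJML();

    private static boolean checkEJML() {
        try {
            Class.forName("org.ejml.data.DenseMatrix64F");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static double[][] eig(double[][] m) {
        // Delegate to EJML when present, otherwise use the pure Java code.
        return EJML_AVAILABLE ? eigWithEJML(m) : eigPureJava(m);
    }

    // Stubs standing in for the two actual implementations.
    private static double[][] eigWithEJML(double[][] m) { return m; }
    private static double[][] eigPureJava(double[][] m) { return m; }

    public static void main(String[] args) {
        System.out.println("EJML on classpath: " + EJML_AVAILABLE);
    }
}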
Next, I will wrap more things around EJML (i.e. not only eig), but I still want jkms to be totally autonomous. That is, no existing feature will ever require EJML (nor any other library).
Another new feature is a fast kernel based on the Nyström approximation, with an active learning strategy for fast training. This is part of the work I did with Hedi Tabia and presented at BMVC last September.
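For reference, the basic Nyström recipe: pick m landmark points z_1, ..., z_m, factor the small landmark kernel matrix K_mm, and embed any x as a vector phi(x) whose inner products approximately reproduce k(x, y). The sketch below uses the Cholesky factor K_mm = L L^T, so phi(x) = L^{-1} [k(x,z_1), ..., k(x,z_m)]^T, which is equivalent to the eigendecomposition route; it is a generic illustration, not the jkms implementation or its active learning strategy:

// Nyström approximation: phi(x).phi(y) = k_m(x)^T K_mm^{-1} k_m(y) ~ k(x,y).
public class NystromDemo {
    static double rbf(double[] a, double[] b) {
        double d2 = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; d2 += d * d; }
        return Math.exp(-d2);
    }

    // Cholesky factorization of a symmetric positive-definite matrix.
    static double[][] cholesky(double[][] k) {
        int m = k.length;
        double[][] l = new double[m][m];
        for (int i = 0; i < m; i++)
            for (int j = 0; j <= i; j++) {
                double s = k[i][j];
                for (int t = 0; t < j; t++) s -= l[i][t] * l[j][t];
                l[i][j] = (i == j) ? Math.sqrt(s) : s / l[j][j];
            }
        return l;
    }

    // Forward substitution: solve L y = b.
    static double[] forwardSolve(double[][] l, double[] b) {
        double[] y = new double[b.length];
        for (int i = 0; i < b.length; i++) {
            double s = b[i];
            for (int j = 0; j < i; j++) s -= l[i][j] * y[j];
            y[i] = s / l[i][i];
        }
        return y;
    }

    static double[] embed(double[] x, double[][] landmarks, double[][] l) {
        double[] km = new double[landmarks.length];
        for (int i = 0; i < landmarks.length; i++) km[i] = rbf(x, landmarks[i]);
        return forwardSolve(l, km);
    }

    public static void main(String[] args) {
        double[][] z = {{0, 0}, {1, 0}, {0, 1}}; // landmarks
        double[][] kmm = new double[z.length][z.length];
        for (int i = 0; i < z.length; i++)
            for (int j = 0; j < z.length; j++) kmm[i][j] = rbf(z[i], z[j]);
        double[][] l = cholesky(kmm);
        double[] px = embed(new double[]{0.2, 0.1}, z, l);
        double[] py = embed(new double[]{0.9, 0.1}, z, l);
        double approx = 0;
        for (int i = 0; i < px.length; i++) approx += px[i] * py[i];
        System.out.println("approx k(x,y) = " + approx
                + ", exact = " + rbf(new double[]{0.2, 0.1}, new double[]{0.9, 0.1}));
    }
}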
monday 10 june 2013
- new algorithms: SDCA (Shalev-Shwartz 2013), SAG (Le Roux 2012)
- new custom matrix kernel to handle train and test separately
- add fvec file format (a minimal reader is sketched after this list)
- add experimental package for linear algebra and corresponding processing (e.g. PCA, KPCA), use at your own risk!
- add example app to perform VOC style classification
- Lots of bug fixes
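About the fvec format: assuming it is the usual layout from the INRIA descriptor datasets (each vector stored as a little-endian int32 dimension d followed by d little-endian float32 components), a minimal standalone reader looks like this (illustrative, not the jkms I/O code):

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Minimal .fvec reader: [int32 d][float32 x 1]...[float32 x d], little-endian.
public class FvecReader {
    public static List<double[]> read(String path) throws IOException {
        List<double[]> vectors = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            while (in.available() > 0) {
                int d = Integer.reverseBytes(in.readInt()); // to little-endian
                double[] v = new double[d];
                for (int i = 0; i < d; i++)
                    v[i] = Float.intBitsToFloat(Integer.reverseBytes(in.readInt()));
                vectors.add(v);
            }
        }
        return vectors;
    }

    public static void main(String[] args) throws IOException {
        List<double[]> data = read(args[0]);
        System.out.println(data.size() + " vectors of dim "
                + (data.isEmpty() ? 0 : data.get(0).length));
    }
}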
The linear algebra package is very rough at the moment. I still find it somewhat useful for some kinds of pre-processing (like a PCA, for example). Right now, my matrix code is a bit slow. If I ever find the time to write solid matrix operations, I will add some nice features like low-rank approximations of kernels (Nyström).
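As an example of the kind of pre-processing I mean, here is a tiny PCA sketch: center the data, build the covariance matrix, and extract the leading principal direction by power iteration (generic illustration, not the jkms algebra package):

import java.util.Random;

// PCA pre-processing sketch: center, covariance, power iteration.
public class PcaDemo {
    public static void main(String[] args) {
        Random rnd = new Random(0);
        int n = 200, d = 3;
        double[][] x = new double[n][d];
        for (int i = 0; i < n; i++) {
            double t = rnd.nextGaussian();
            // Data mostly spread along the direction (1, 1, 0).
            x[i][0] = t + 0.1 * rnd.nextGaussian();
            x[i][1] = t + 0.1 * rnd.nextGaussian();
            x[i][2] = 0.1 * rnd.nextGaussian();
        }
        // Center the data.
        double[] mean = new double[d];
        for (double[] xi : x) for (int j = 0; j < d; j++) mean[j] += xi[j] / n;
        for (double[] xi : x) for (int j = 0; j < d; j++) xi[j] -= mean[j];
        // Covariance matrix.
        double[][] c = new double[d][d];
        for (double[] xi : x)
            for (int a = 0; a < d; a++)
                for (int b = 0; b < d; b++) c[a][b] += xi[a] * xi[b] / n;
        // Power iteration for the leading eigenvector of the covariance.
        double[] v = {1, 0, 0};
        for (int it = 0; it < 100; it++) {
            double[] w = new double[d];
            for (int a = 0; a < d; a++)
                for (int b = 0; b < d; b++) w[a] += c[a][b] * v[b];
            double norm = 0;
            for (double wa : w) norm += wa * wa;
            norm = Math.sqrt(norm);
            for (int a = 0; a < d; a++) v[a] = w[a] / norm;
        }
        // Expected: roughly (0.71, 0.71, 0.00) up to sign.
        System.out.printf("first PC ~ (%.2f, %.2f, %.2f)%n", v[0], v[1], v[2]);
    }
}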
Nevertheless, I suggest always picking the latest git version instead of these releases. The API is very stable now and should not change significantly, which means that any code you write today should remain supported for the next few years. Picking the latest git thus ensures you always have the bug fixes and so on (I don't cut releases just for bug fixes).
One more thing: JKernelMachines was published in JMLR last month. I encourage you to read the paper and to cite it if you ever use the code for your publications.
tuesday 05 march 2013
JKernelMachines version number bumped to 2.0!
The big changes are:
- All classes have migrated under fr.lip6.jkernelmachines.*. This breaks backward compatibility! (hence the 2.0 version number) A before/after import example follows this list.
- Separation of the core library and unit testing
- Junit testing added
- Lots of bug fixes
- Better examples, and many useless test classes removed
- A small demo script to benchmark the library was added
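For anyone upgrading, the fix is usually just a package rename in your imports. A hypothetical before/after (the exact pre-2.0 package of each class may have differed):

// Before 2.0 (assumed layout), classes lived directly under fr.lip6.*:
// import fr.lip6.classifier.LaSVM;

// From 2.0 on, everything sits under fr.lip6.jkernelmachines.*:
import fr.lip6.jkernelmachines.classifier.LaSVM;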