1. Pygbm: Experimental Gradient Boosting Machines in Python with numba.
2. Pignlproc: Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.
5. Docker Distributed: Experimental docker-compose setup to bootstrap distributed on a docker-swarm cluster.
7. Paper2ebook: Utility to restructure research papers published as US Letter or A4 PDF files, typically to remove the two-column layout.
8. Dbpediakit: Python utilities to work with the DBpedia dumps for analytics.
11. oglearn: ogrisel's utility extensions for scikit-learn.
12. corpusmaker: Clojure utilities to build training corpora for machine learning / NLP out of public Wikimedia dumps. Status: currently stalled. It will probably be reworked as cascalog scripts, and the pignlproc project is likely to replace it due to licensing constraints for future integration in Apache projects.
13. mahout: Personal development repository to prepare contributions and patches for Apache Mahout.
15. my-linux-devbox: Vagrant / Salt configuration with Ubuntu to work on projects related to the SciPy stack under Python 3 and Python 2.