Project details for Milk

Milk 0.3.1

by luispedro - September 26, 2010, 23:46:27 CET


Ratings (based on 2 votes):
  • Overall: 3 out of 5 stars
  • Features: 2.5 out of 5 stars
  • Usability: 3 out of 5 stars
  • Documentation: 3.5 out of 5 stars

milk wraps libsvm in a Pythonic way: among other things, the learned models expose their weight arrays directly to Python, the models can be pickled, and you can pass any Python function as a kernel.

It also supports k-means clustering with an implementation that is careful not to use too much memory (if your dataset fits into memory, milk can cluster it).
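As a rough illustration of how a k-means loop can stay within a small memory budget, here is a minimal pure-Python sketch. It is not milk's implementation or API; the function names (`sq_dist`, `kmeans_low_memory`) and the choice of the first k points as initial centroids are assumptions made for this example. The point it shows is that each data point is assigned on its own, so only k distances exist at any moment and no N-by-k distance matrix is ever built.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_low_memory(points, k, max_iter=100):
    """Lloyd's algorithm with running sums (hypothetical helper, not milk's API).

    Extra memory beyond the data: one assignment per point plus
    k centroids and k running sums.
    """
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = [-1] * len(points)
    dim = len(points[0])
    for _ in range(max_iter):
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        changed = False
        for i, p in enumerate(points):
            # Only k distances are computed here, then discarded.
            best = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            if best != assignments[i]:
                assignments[i] = best
                changed = True
            counts[best] += 1
            for d in range(dim):
                sums[best][d] += p[d]
        if not changed:
            break
        # Move each non-empty centroid to the mean of its points.
        for c in range(k):
            if counts[c]:
                centroids[c] = [s / counts[c] for s in sums[c]]
    return assignments, centroids


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
assignments, centroids = kmeans_low_memory(points, 2)
```

On this toy data the first three points end up in one cluster and the last three in the other. A real implementation would normally use a better initialisation and several random restarts.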

It does not have its own file format or in-memory format, which I consider a feature: it works directly on numpy arrays (or anything convertible to a numpy array) without forcing you to copy memory around. For SVMs, you can use any datatype at all, provided you supply your own kernel function.
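To make the idea of "any datatype plus your own kernel function" concrete, here is a small self-contained sketch that classifies plain Python strings. It does not use milk or libsvm: for brevity it trains a kernel perceptron rather than an SVM, and every name in it (`shared_bigrams`, `train_kernel_perceptron`, `predict`) is hypothetical. What carries over is the pattern, in which the learner only ever touches the data through the kernel callable.

```python
def shared_bigrams(a, b):
    """Hypothetical string kernel: number of distinct character bigrams
    that the two strings have in common."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    return float(len(grams_a & grams_b))


def train_kernel_perceptron(examples, labels, kernel, epochs=10):
    """Kernel perceptron (a stand-in for an SVM in this sketch).

    Labels are +1 or -1. Returns one mistake count per training example.
    """
    alphas = [0] * len(examples)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(examples, labels)):
            score = sum(a * yj * kernel(xj, x)
                        for a, xj, yj in zip(alphas, examples, labels))
            if y * score <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alphas


def predict(x, examples, labels, alphas, kernel):
    """Return +1 or -1 for a new example using the learned weights."""
    score = sum(a * yj * kernel(xj, x)
                for a, xj, yj in zip(alphas, examples, labels))
    return 1 if score > 0 else -1


# The training data are ordinary strings, never converted to arrays.
examples = ["banana", "bandana", "cabana", "kiwi", "wiki", "kiwis"]
labels = [1, 1, 1, -1, -1, -1]
alphas = train_kernel_perceptron(examples, labels, shared_bigrams)

guess_a = predict("bananas", examples, labels, alphas, shared_bigrams)
guess_b = predict("kiwifruit", examples, labels, alphas, shared_bigrams)
```

Swapping in a different kernel, or a different datatype such as graphs or sets, only requires changing the function passed as `kernel`.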

Changes to previous version:
  • fix sparse non-negative matrix factorisation
  • mean grouped classifier
  • update multi classifier to newer interface
  • documentation & testing fixes
Supported Operating Systems: Agnostic
Data Formats: None
Tags: SVM, Supervised

Other available revisions

Version Changelog Date

Added LASSO (using coordinate descent optimization). Made SVM classification (learning and applying) much faster: 2.5x speedup on yeast UCI dataset.

November 7, 2012, 13:08:28

Multiprocessing support (for cross-validation); more ways of handling multi-label problems (error-correcting output codes, trees); very significant performance increases. Many bug fixes.

September 23, 2012, 14:31:44
  • Added a new module, milk.ext.jugparallel, to interface with jug. This makes it easy to parallelise things such as n-fold cross-validation (each fold runs on its own processor) or multiple k-means random starts.

  • Add some new functions: measures.curves.precision_recall, milk.unsupervised.kmeans.select_best.kmeans.

  • Fixed a tricky bug in SDA and a few minor issues elsewhere

May 11, 2011, 04:18:53

Speed improvements. Bug fixes. Added a folds argument to nfoldcrossvalidation. Added the assign_centroids function.

March 18, 2011, 22:45:11

Fix compilation on Windows.

February 12, 2011, 23:53:03
  • Logistic regression
  • Source demos included (in source and documentation)
  • Add cluster agreement metrics
  • Fix nfoldcrossvalidation bug when using origins
February 10, 2011, 15:32:54
  • Unsupervised (1-class) kernel density modeling
  • Fix for when SDA returns empty
  • weights option to some learners
  • stump learner
  • Adaboost (result of above changes)
December 20, 2010, 19:04:15
  • fixes for 64-bit machines
November 4, 2010, 05:25:35
  • Random forest learners
  • Decision trees sped up 20x
  • Much faster gridsearch (finds optimum without computing all folds)
November 1, 2010, 02:01:11
  • fix sparse non-negative matrix factorisation
  • mean grouped classifier
  • update multi classifier to newer interface
  • documentation & testing fixes
September 26, 2010, 23:46:27
  • no scipy.weave dependency
  • flatter namespace
  • faster kmeans
  • affinity propagation (borrowed from scikits-learn & slightly improved to take less memory and time)
  • pdist()
  • more documentation
September 24, 2010, 01:24:30

Cleaned up and tested code. Removed some dependencies. Better documentation. Changed the classification interface to separate model learning from model usage.

May 21, 2010, 22:05:04

Improved Performance. Removed files from the distribution that were mistakenly included.

December 17, 2009, 18:44:18

Initial Announcement on

November 24, 2009, 00:16:42

