Project details for mldata-utils

mldata-utils 0.3.4

by sonne - August 24, 2010, 16:01:27 CET


Tools to convert data and task files to and from HDF5, and to retrieve extracts from the generated files. At some stage, it will fully encapsulate h5py for the needs of

Changes to previous version:
  • Fixed an issue with data extracts.
  • Fixed an issue when updating Task files.
  • Fixed a few issues when converting to arff/octave/matlab.
Supported Operating Systems: Posix
Data Formats: Svmlight, Matlab, Arff, Octave, Hdf, Csv
Tags: Python, Data Formats, Weka, Libsvm

Other available revisions

Version Changelog Date
  • Changed the task file format so that data splits can have a variable number of items and be put into up to 256 categories of training/validation/test/not used/...
  • Various bugfixes.
April 8, 2011, 10:02:44
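A limit of 256 categories would follow naturally from storing one byte per data item. The following is a minimal sketch of that idea only; the category names, their numbering, and the helpers encode_split and indices_of are hypothetical and are not taken from the mldata-utils task file format.

```python
# Hypothetical sketch: one byte per item allows category codes 0..255.
# The names and numbering below are assumptions for illustration,
# not the actual mldata-utils task file layout.
CATEGORIES = {"not used": 0, "training": 1, "validation": 2, "test": 3}


def encode_split(labels):
    """Pack one category code (0..255) per data item into a bytes object."""
    return bytes(CATEGORIES[label] for label in labels)


def indices_of(split, category):
    """Return the item indices assigned to the given category."""
    code = CATEGORIES[category]
    return [i for i, value in enumerate(split) if value == code]


# Each split is its own byte string, so splits may differ in length.
split_a = encode_split(["training", "training", "test", "not used"])
split_b = encode_split(["validation", "training", "test"])

print(indices_of(split_a, "training"))  # [0, 1]
print(indices_of(split_b, "test"))      # [2]
```

Python's bytes constructor rejects values outside 0..255, so the category limit is enforced by the container itself in this sketch.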
  • Various bugfixes (sparse matrix, data extraction).
  • The client API now works with the live website.
December 7, 2010, 03:06:42
  • Conversion of sparse and dense matrices of floating point or integer types, and of string lists, from/to .hdf5, octave, matlab, csv, arff is now reliable.
  • Added examples and a small test-suite.
November 7, 2010, 14:39:56
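To give a feel for what one of the conversions listed above involves, here is a small standard-library sketch that turns CSV text into ARFF text. It does not use mldata-utils; the function csv_to_arff and its type-guessing rule are assumptions made for illustration, and the quoting is simplified (attribute names containing spaces are not handled).

```python
import csv
import io


def csv_to_arff(csv_text, relation="data"):
    """Convert CSV text with a header row into ARFF text.

    Columns whose values all parse as floats are declared 'numeric';
    all other columns are declared 'string'.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]

    def is_numeric(column):
        try:
            for row in body:
                float(row[column])
            return True
        except ValueError:
            return False

    numeric = [is_numeric(i) for i in range(len(header))]

    lines = ["@relation " + relation, ""]
    for name, num in zip(header, numeric):
        lines.append("@attribute %s %s" % (name, "numeric" if num else "string"))
    lines += ["", "@data"]
    for row in body:
        # Simplified quoting: string values are wrapped in single quotes.
        cells = [v if num else "'%s'" % v.replace("'", "\\'")
                 for v, num in zip(row, numeric)]
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


arff = csv_to_arff("width,label\n1.5,cat\n2,dog\n", relation="pets")
print(arff)
```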
  • Added a fix for when data.get_correct internally receives an array of arrays of values instead of an array of values.
  • Added support for sparse matrices in data.get_correct.
August 27, 2010, 15:31:58
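The nested-array case fixed above can be pictured with a short sketch. The helper flatten_values is hypothetical and only illustrates the kind of normalisation involved; it is not the code used inside data.get_correct.

```python
def flatten_values(values):
    """Return a flat list, unwrapping one level of nesting if present.

    Accepts a plain sequence such as [1, 0, 1] as well as nested
    forms such as [[1, 0, 1]] or [[1], [0], [1]].
    """
    flat = []
    for item in values:
        if isinstance(item, (list, tuple)):
            flat.extend(item)   # unwrap an inner sequence
        else:
            flat.append(item)   # already a plain value
    return flat


print(flatten_values([1, 0, 1]))        # [1, 0, 1]
print(flatten_values([[1], [0], [1]]))  # [1, 0, 1]
print(flatten_values([[1, 0, 1]]))      # [1, 0, 1]
```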
  • Introduced task.get_test_output to get test_idx and output_variables from Task file.
  • Introduced data.get_correct() to get the 'correct' results from Data file.
  • Fixed minor issues when converting to octave/matlab.
August 25, 2010, 18:56:53
  • Fixed an issue with data extracts.
  • Fixed an issue when updating Task files.
  • Fixed a few issues when converting to arff/octave/matlab.
August 24, 2010, 16:01:27
  • task.create now includes handling of input/output_variables and train/test_idx.
  • Fixed a minor error in handling octave files.
August 21, 2010, 13:02:07
  • Restored code that had been removed from data.get_extract by mistake.
  • Added safeguard for illegal task files with no output_variables.
August 20, 2010, 10:35:46
  • Restructured the package into more modules.
  • Revamped conversion structure.
  • Fixed a bug concerning Task and output variables.
August 19, 2010, 12:42:57
  • Caught a few more error conditions when handling Tasks.
  • Temporarily removed the author from the package information because it caused an ugly error message on older Python installations.
  • Removed label_dims and improved support for input/output variables for Tasks.
  • Created new module 'data' for better encapsulation.
August 17, 2010, 12:00:51
  • Added extract function (and script) for Task datasets.
  • Moved extract function for Data from website to this tool.
  • Improved handling of Task files.
August 16, 2010, 11:52:18

Initial Announcement on

July 21, 2010, 15:03:24

Initial Announcement on

July 12, 2010, 13:33:04


Yaroslav Halchenko (on December 16, 2010, 05:47:51)

any plans for furnishing Debian package, Soeren? I see no ITP ;)
