Project details for FEAST

FEAST 1.00

by apocock - February 13, 2012, 19:00:29 CET


Rating (based on 1 vote): Overall 5/5, Features 5/5, Usability 5/5, Documentation 5/5
Description:

FEAST provides a set of implementations of information theoretic filter feature selection algorithms, and an implementation of RELIEF for comparison purposes.

This toolbox accompanies the paper "Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection", JMLR 2012.

All information theoretic algorithms are implemented in C and depend upon the supplied MIToolbox (also available separately on mloss.org). A MATLAB wrapper exposes these algorithms alongside the two that are implemented directly in MATLAB (FCBF and RELIEF).

Contains implementations of: mim, mrmr, mifs, cmim, jmi, disr, cife, icap, condred, cmi, relief, fcbf, betagamma

MATLAB Example (using "data" as our feature matrix, and "labels" as the class label vector):

size(data) %% 569 examples and 30 features

ans =

569 30

selectedIndices = feast('jmi',5,data,labels) %% selecting the top 5 features using the jmi algorithm

selectedIndices =

28 21 8 27 23

selectedIndices = feast('mrmr',10,data,labels) %% selecting the top 10 features using the mrmr algorithm

selectedIndices =

28 24 22 8 27 21 29 4 7 25

selectedIndices = feast('mifs',5,data,labels,0.7) %% selecting the top 5 features using the mifs algorithm with beta = 0.7

selectedIndices =

28 24 22 20 29

If you wish to use MIM in a C program, you can use the BetaGamma function with Beta = 0, Gamma = 0, as this is equivalent to MIM (but slower than the other implementation). MIToolbox is required to compile these algorithms, and these implementations supersede the example implementations given in that package (they have more robust behaviour when used with unexpected inputs).
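
The equivalence can be read off the criterion itself. The following is a transcription of the general beta-gamma form as I understand it from the accompanying paper, so check it against the paper before relying on it. Here X_k is a candidate feature, Y the class label, and S the set of already selected features; these symbols are notation only, not identifiers from the toolbox.

```latex
J_{\beta\gamma}(X_k) = I(X_k; Y)
  - \beta \sum_{X_j \in S} I(X_k; X_j)
  + \gamma \sum_{X_j \in S} I(X_k; X_j \mid Y)

% With \beta = 0 and \gamma = 0 both sums vanish:
J_{\beta\gamma}(X_k) = I(X_k; Y) \quad \text{(the MIM score)}
```

Setting only gamma to 0 leaves the relevance term minus a beta-weighted redundancy penalty, which is the form the mifs call above uses with beta = 0.7.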

Please cite this work by using the corresponding paper BibTeX.

Changes to previous version:

Initial Announcement on mloss.org.

Supported Operating Systems: Linux, Mac OS X, Windows
Data Formats: MATLAB
Tags: MATLAB, Feature Selection, Feature Ranking, Mutual Information, JMLR

Comments

Ryan Brauchler (on February 5, 2014, 22:57:35)

First off, I'm a huge fan of this toolbox, and generally, it works great.

I am implementing it to select features from an extremely high dimensional data set (~ 1,000 x 500,000). To do so, I placed this function into a loop where I break the initial matrix (which is held in a memmapfile to save memory) into N smaller bits and then feed that into FEAST and return the columns of the chosen features. The output of this loop is simply a 1xN cell array where each cell in the array is a list of the columns chosen by the algorithm. I am just using the mrmr algorithm in the toolbox.

After a few iterations of this loop, however, I always get an out-of-memory error from Matlab. The toolbox works great in isolation, but when you start using it multiple times in a loop, it seems there is a memory leak in the MEX file for mrmr, and I would imagine it's also in the others. Just wanted to bring the issue to your attention. I am going to focus on fixing it myself, but figured it might be faster to put it in the hands of the experts who wrote it.

Thanks!

Adam Pocock (on February 6, 2014, 02:40:32)

Hi Ryan,

Do you know if the error occurs when using FEAST from C/C++? I've had issues with Matlab before where it doesn't properly free memory allocated in MEX files (even though mxFree was called), which eventually leads to memory fragmentation, and then subsequent MEX calls fail.

What version of Matlab are you using, and how much RAM is there in the machine?

Adam

Ryan Brauchler (on February 6, 2014, 04:16:54)

Hi Adam.

Thank you so much for the prompt response.

I have yet to try running the code outside Matlab, though I could attempt to do so tomorrow. It very well could be the case that it's a Matlab fragmentation error. If that is in fact the case, is there a workaround for this issue?

I'm using Matlab R2013a with 32GB of RAM in the machine. When the array is broken up into smaller pieces, each piece is less than 3MB in size, so that shouldn't end up being an issue of physical memory limitations.

Thanks,

Ryan

Adam Pocock (on February 6, 2014, 05:22:11)

Hi Ryan,

One test you could do is see if FEAST has the same issue (in Matlab) if you just rerun it multiple times on the same partition of the dataset. That way Matlab shouldn't do any more allocation beyond what is necessary to store the features (though it will still fragment the RAM a bit). We tested it fairly heavily by running it repeatedly across multiple datasets in the same Matlab instance, but we didn't have anything quite as large as the one you are using.

Drop me an email at apocock at cs.man.ac.uk and we can look at this in a little more detail.

Thanks,

Adam
