Project details for FEAST

FEAST 1.1.1

by apocock - June 30, 2014, 01:30:23 CET

Ratings (based on 1 vote): Overall 5/5, Features 5/5, Usability 5/5, Documentation 5/5
Description:

FEAST provides a set of implementations of information theoretic filter feature selection algorithms, and an implementation of RELIEF for comparison purposes.

This toolbox accompanies the paper "Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection", JMLR 2012.

All information theoretic algorithms are implemented in C and depend upon the supplied MIToolbox v2.1 (also available separately on mloss). These algorithms have a MATLAB interface wrapper, which also includes the two algorithms implemented directly in MATLAB (FCBF and RELIEF).

A Python wrapper (PyFeast) developed by G. Ditzler & C. Morrison from Drexel University is available on GitHub.

Contains implementations of: mim, mrmr, mifs, cmim, jmi, disr, cife, icap, condred, cmi, relief, fcbf, betagamma

MATLAB Example (using "data" as our feature matrix, and "labels" as the class label vector):

size(data) %% 569 examples (rows) and 30 features (columns)

ans =

569 30

selectedIndices = feast('jmi',5,data,labels) %% selecting the top 5 features using the jmi algorithm

selectedIndices =

28 21 8 27 23

selectedIndices = feast('mrmr',10,data,labels) %% selecting the top 10 features using the mrmr algorithm

selectedIndices =

28 24 22 8 27 21 29 4 7 25

selectedIndices = feast('mifs',5,data,labels,0.7) %% selecting the top 5 features using the mifs algorithm with beta = 0.7

selectedIndices =

28 24 22 20 29

MIToolbox (v2.1) is required to compile these algorithms, and these implementations supersede the example implementations given in that package (they have more robust behaviour when used with unexpected inputs).

Please cite this work by using the corresponding paper BibTeX.

Changes to previous version:
  • Bug fixes to memory management.
  • Compatibility changes for the PyFeast Python wrapper (note: the C library now returns feature indices starting from 0; the MATLAB wrapper still returns indices starting from 1).
  • Added C version of MIM.
  • Updated internal version of MIToolbox.
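
The index-base difference noted in the changes above is easy to trip over when moving results between the C library (or a wrapper built on it) and MATLAB. The following is a minimal Python sketch of the conversion; the helper names `to_one_based` and `to_zero_based` are illustrative only and are not part of FEAST or PyFeast. The sample input is assumed to be the 0-based equivalent of the JMI result shown in the MATLAB example.

```python
def to_one_based(zero_based_indices):
    """Convert 0-based feature indices (C library convention) to 1-based (MATLAB convention)."""
    return [i + 1 for i in zero_based_indices]


def to_zero_based(one_based_indices):
    """Convert 1-based feature indices (MATLAB convention) to 0-based (C library convention)."""
    if any(i < 1 for i in one_based_indices):
        raise ValueError("1-based indices must be >= 1")
    return [i - 1 for i in one_based_indices]


# Assumed input: 0-based form of the indices 28 21 8 27 23 from the MATLAB example.
print(to_one_based([27, 20, 7, 26, 22]))  # [28, 21, 8, 27, 23]
```

Converting once, at the boundary between the two conventions, avoids off-by-one errors when the selected indices are later used to slice the feature matrix.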
Supported Operating Systems: Linux, Macosx, Windows
Data Formats: Matlab
Tags: Matlab, Feature Selection, Feature Ranking, Mutual Information, Jmlr

Other available revisions

Version 1.1.1 (June 30, 2014, 01:30:23)
  • Changelog as listed under "Changes to previous version" above.

Version 1.00 (February 13, 2012, 19:00:29)
  • Initial announcement on mloss.org.

Comments

Ryan Brauchler (on February 5, 2014, 22:57:35)

First off, I'm a huge fan of this toolbox, and generally, it works great.

I am implementing it to select features from an extremely high dimensional data set (~ 1,000 x 500,000). To do so, I placed this function into a loop where I break the initial matrix (which is held in a memmapfile to save memory) into N smaller bits and then feed that into FEAST and return the columns of the chosen features. The output of this loop is simply a 1xN cell array where each cell in the array is a list of the columns chosen by the algorithm. I am just using the mrmr algorithm in the toolbox.

After a few iterations of this loop, however, I always get an out of memory error from Matlab. The toolbox works great in isolation, but when you start using it multiple times in a loop, it seems there is a memory leak in the MEX file for mrmr, and I would imagine it's also in the others. Just wanted to bring the issue to your attention. I am going to focus on fixing it myself, but figured it might be faster to put it in the hands of the experts who wrote it.

Thanks!
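
The chunked loop described in this comment can be sketched as follows. This is a minimal Python illustration of the loop structure only, under several assumptions: the data is held as a list of feature columns, `agreement_selector` is a hypothetical stand-in for the call into FEAST (it is not mRMR), and indices are 0-based. The point it shows is that each chunk's selector returns indices local to that chunk, which must be offset to recover columns of the full matrix. It does not reproduce or address the memory error.

```python
def chunk_bounds(n_features, n_chunks):
    """Split range(n_features) into n_chunks contiguous, near-equal (start, stop) ranges."""
    base, extra = divmod(n_features, n_chunks)
    bounds, start = [], 0
    for i in range(n_chunks):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def agreement_selector(columns, labels, k):
    """Hypothetical stand-in for a feature selector: rank columns by how
    often their value equals the label, and return the top k local indices."""
    scores = [sum(a == b for a, b in zip(col, labels)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: -scores[j])
    return ranked[:k]


def select_per_chunk(columns, labels, n_chunks, k, selector=agreement_selector):
    """Run the selector on each chunk; return one list of global column indices per chunk."""
    chosen = []
    for start, stop in chunk_bounds(len(columns), n_chunks):
        local = selector(columns[start:stop], labels, min(k, stop - start))
        # Shift chunk-local indices back to positions in the full matrix.
        chosen.append([start + j for j in local])
    return chosen


labels = [0, 1, 0, 1]
columns = [
    [0, 0, 0, 0], [0, 1, 0, 1], [1, 1, 1, 1],  # chunk 0
    [1, 0, 1, 0], [0, 1, 1, 1], [0, 1, 0, 0],  # chunk 1
]
print(select_per_chunk(columns, labels, n_chunks=2, k=1))  # [[1], [4]]
```

The result has the same shape as the 1xN cell array described above: one list of chosen columns per chunk.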

Adam Pocock (on February 6, 2014, 02:40:32)

Hi Ryan,

Do you know if the error occurs when using FEAST from C/C++? I've had issues with Matlab before where it doesn't properly free memory allocated in MEX files (even though mxFree was called) which eventually leads to memory fragmentation and then subsequent MEX calls fail.

What version of Matlab are you using, and how much RAM is there in the machine?

Adam

Ryan Brauchler (on February 6, 2014, 04:16:54)

Hi Adam.

Thank you so much for the prompt response.

I have yet to try running the code outside Matlab, though I could attempt to do so tomorrow. It very well could be the case that it's a Matlab fragmentation error. If that's in fact the case, is there a workaround that overcomes this issue?

I'm using Matlab R2013a with 32GB of RAM in the machine. When the array is broken up into smaller pieces, each piece is less than 3MB in size, so that shouldn't end up being an issue of physical memory limitations.

Thanks,

Ryan

Adam Pocock (on February 6, 2014, 05:22:11)

Hi Ryan,

One test you could do is see if FEAST has the same issue (in Matlab) if you just rerun it multiple times on the same partition of the dataset. That way Matlab shouldn't do any more allocation beyond what is necessary to store the features (though it will still fragment the RAM a bit). We tested it fairly heavily by running it repeatedly across multiple datasets in the same Matlab instance, but we didn't have anything quite as large as the one you are using.

Drop me an email at apocock at cs.man.ac.uk and we can look at this in a little more detail.

Thanks,

Adam

YS L (on July 11, 2014, 03:27:02)

Hi Admin,

I just found this toolbox and am going to use it for my project. However, I still ran into the memory problem. I call the function like this: "[selectedFeatures] = feast('mrmr',200,X,Y)", where X is a (# sample) × (# feature) matrix and Y is a (# sample) × 1 binary vector (-1,1). When I call the feast function, MATLAB reports the error "Out of memory".

I downloaded the newest version of the toolbox. My MATLAB version is R2012a and the machine has 32 GB of memory. The data is quite small, around 200 × 300. Is there any suggestion to fix the problem?

Thanks a lot !

Adam Pocock (on July 11, 2014, 04:46:35)

Hi,

There are two preprocessing steps you should do before using the toolbox.

  • Ensure the data has been discretised into bins.
  • Transform the data by relabelling it (if a feature takes the values {1,100,1000}, convert them to {1,2,3}).

This should fix any out of memory errors for small datasets.

Adam
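
The two preprocessing steps in the reply above can be sketched in Python as follows. This is an illustration under stated assumptions, not the toolbox's own preprocessing: the reply does not say which discretisation to use, so equal-width binning is assumed here, and the function names `discretise` and `relabel` are hypothetical. The same logic carries over to MATLAB before calling feast.

```python
def discretise(values, n_bins):
    """Equal-width binning (an assumed scheme): map each value to a bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # a constant feature falls into a single bin
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def relabel(values, first_label=1):
    """Map the distinct values, in sorted order, to consecutive integers."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)), start=first_label)}
    return [mapping[v] for v in values]


print(relabel([1, 100, 1000, 100]))  # [1, 2, 3, 2]
print(relabel([-1, 1, 1, -1]))       # [1, 2, 2, 1]
print(relabel(discretise([0.0, 2.5, 5.0, 7.5, 10.0], n_bins=4)))  # [1, 2, 3, 4, 4]
```

Each feature column, and the label vector, would be passed through these steps separately, so that every column ends up as small consecutive integers.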

YS L (on July 11, 2014, 05:17:59)

Hi Adam,

That works! Thanks a lot

A Smith (on August 5, 2014, 03:08:58)

Adam,

I'm having some trouble using FCBF. Inside the "SU" subfunction, there seem to be two functions, h() and mi(), that are undefined. Did I not install the Toolbox correctly?

Adam Pocock (on August 5, 2014, 04:03:14)

Looks like I forgot to add a bit to my setup instructions. FCBF uses the Matlab interface to MIToolbox (unlike the rest of FEAST which compiles in the C code), so you need to add the MIToolbox folder to your Matlab path. MIToolbox contains the h.m and mi.m files which FCBF uses.

Adam
