<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>mloss.org new software</title><link>http://mloss.org</link><description>Updates and additions to mloss.org</description><language>en</language><lastBuildDate>Thu, 20 Nov 2008 01:48:19 -0000</lastBuildDate><item><title>MLPACK 0.1</title><link>http://mloss.org/software/view/152/0.1</link><description>&lt;html&gt;&lt;p&gt;MLPACK is the first comprehensive scalable machine learning library.
   Developed by the Fundamental Algorithmic and Statistical Tools
   laboratory (FASTlab), MLPACK and its core functions library FASTlib
   fill a long-standing void. Previously,
   researchers had to either (a) settle for poorly-scaling collections of
   methods implemented for academic purposes, (b) hunt down the fast but often
   hard-to-find and hard-to-apply code written by the algorithms'
   developers, or (c) reimplement solutions to their specific
   analysis problems from scratch. With MLPACK, we offer a fourth option,
   in which researchers can find the methods they need, designed
   for both speed and usability.
&lt;/p&gt;
&lt;p&gt;MLPACK currently includes efficient implementations of the following algorithms:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;p&gt;$k$-nearest neighbor classifier.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;FastICA.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Hidden Markov Models.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Information Maximization algorithm for ICA.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Kalman filter.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Kernel density estimation algorithm using series expansion.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Mixture of Gaussians using maximum likelihood and L2 error.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Naive Bayes classifier.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Nelder-Mead/Quasi-Newton optimizer.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Series expansion library for the Gaussian kernel, with $O(p^D)$ and $O(D^p)$ expansions.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Support Vector Machine classification and regression.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Sequential Minimal Optimization algorithm for SVM.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ul&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Alexander Gray,Garry Boyer,Ryan Riegel,Nikolaos Vasiloglou,Dongryeol Lee,Chip Mappus,Nishant Mehta,Hua Ouyang,Parikshit Ram,Long Tran,Wee Chin Wong</dc:creator><pubDate>Thu, 20 Nov 2008 01:48:19 -0000</pubDate><comments>http://mloss.org/software/rss/comments/152</comments><guid>http://mloss.org/software/view/152/0.1</guid><category>clustering</category><category>kernel methods</category><category>convex optimization</category><category>classifiaction</category><category>density estimation</category><category>large scale learning</category><category>kalman filter</category><category>k nearest neighbor classification</category><category>algorithms</category><category>classifiers</category><category>nips2008</category><category>kdtree</category></item><item><title>r-cran-earth 2.1-2</title><link>http://mloss.org/software/view/123/2.1-2</link><description>&lt;html&gt;&lt;p&gt;Multivariate Adaptive Regression Spline Models: Build regression models using the techniques in Friedman's papers "Fast MARS" and "Multivariate Adaptive Regression Splines". (The term "MARS" is copyrighted and thus not used in the name of the package.)
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stephen Milborrow,Trevor Hastie, Rob Tibshirani</dc:creator><pubDate>Tue, 18 Nov 2008 10:07:00 -0000</pubDate><comments>http://mloss.org/software/rss/comments/123</comments><guid>http://mloss.org/software/view/123/2.1-2</guid><category>r-cran</category></item><item><title>Adaptive Resonance Theory for Unsupervised Learning 1.0</title><link>http://mloss.org/software/view/160/1.0</link><description>&lt;html&gt;&lt;p&gt;This software package includes the ART algorithms for unsupervised learning only. It is a family of four programs based on different ART algorithms (ART 1, ART 2A, ART 2A-C and ART Distance). All of them are clustering algorithms and they are command-line programs. They are able to process only numerical continuous values (with the exception of ART 1) and cannot handle missing values. ART algorithms do not need to know a number of how many clusters should be created in advance.
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Tomas Hudik, Jan Zizka</dc:creator><pubDate>Thu, 13 Nov 2008 16:40:28 -0000</pubDate><comments>http://mloss.org/software/rss/comments/160</comments><guid>http://mloss.org/software/view/160/1.0</guid><category>clustering</category><category>online learning</category></item><item><title>DeltaLDA 0.1</title><link>http://mloss.org/software/view/161/0.1</link><description>&lt;html&gt;&lt;p&gt;This software implements the DeltaLDA model, which is a modification of the Latent Dirichlet Allocation (LDA) model.  DeltaLDA can use multiple topic mixing weight priors to jointly model multiple corpora with a shared set of topics. The inference method is Collapsed Gibbs sampling.  The program can also be used to do "standard" LDA as a special case, and is implemented as a Python C extension module.
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">David Andrzejewski</dc:creator><pubDate>Wed, 12 Nov 2008 21:53:00 -0000</pubDate><comments>http://mloss.org/software/rss/comments/161</comments><guid>http://mloss.org/software/view/161/0.1</guid><category>python</category><category>topic analysis</category><category>topic modeling</category></item><item><title>MLPY Machine Learning Py 1.2.7</title><link>http://mloss.org/software/view/66/1.2.7</link><description>&lt;html&gt;&lt;p&gt;We introduce mlpy, a high-performance Python package for predictive modeling. It makes extensive use of NumPy to provide fast N-dimensional array manipulation and easy integration of C code. Mlpy provides high level procedures that support, with few lines of code, the design of rich Data Analysis Protocols (DAPs) for predictive classification and feature selection. Methods are available for feature weighting and ranking, data resampling, error evaluation and experiment landscaping.  The package includes tools to measure stability in sets of ranked feature lists, of special interest in bioinformatics for functional genomics, for which large scale experiments with up to 10^6 classifiers have been run on Linux clusters and on the Grid.
&lt;/p&gt;
&lt;p&gt;The modular structure of mlpy makes it easy to add new algorithms to each of the seven categories into which the package is organized:
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Classification&lt;/strong&gt;. For each algorithm, distinct methods are provided for the training and testing phases (whenever possible, real-valued predictions can be obtained). The implemented algorithms belong to the families of Support Vector Machines (SVMs, four kernels available), Discriminant Analysis (DA: Fisher, Penalized and Spectral Regression) and Nearest Neighbours.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Feature weighting&lt;/strong&gt;. A total of nine methods are available to obtain weights from models such as SVMs or DAs; classifier-independent methods for weighting features are also implemented, including I-RELIEF and Discrete Wavelet Transform.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Feature ranking&lt;/strong&gt;. Two main schemas are used for selecting and ranking purposes, belonging either to the Recursive Feature Elimination or the Recursive Forward Selection family (for a total of six variants).
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Resampling methods&lt;/strong&gt;. The classification and feature ranking operations can be organized within a sampling procedure such as Textbook/Monte-Carlo cross validation (stratification over labels is available), leave-one-out, or a user-defined train/test split scheme.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Metric functions&lt;/strong&gt;. Performance can be assessed with a set of different measures, including Error, Accuracy, Matthews Correlation Coefficient and Area Under the ROC Curve. Variability can be assessed by Standard Deviation or Bootstrap Confidence Intervals.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Feature list analysis&lt;/strong&gt;. The ordered lists from the feature ranking experiments can be analyzed in terms of stability (Canberra indicator, extraction/position indicator) and an optimal list can be retrieved (Borda count).
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Landscaping tools&lt;/strong&gt;. A system of executable scripts to be used off-the-shelf to tabulate performance (e.g. Error, MCC and stability measures) on a grid of different experimental conditions by a basic DAP implementation (resampling by k-fold or Monte Carlo CV, training, feature ranking, test).
&lt;/p&gt;
&lt;p&gt;mlpy is a project developed by the MPBA research unit at FBK, the Bruno Kessler Foundation in Trento, Italy (http://mpba.fbk.eu). 
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Davide Albanese, Stefano Merler, Giuseppe Jurman, Roberto Visintainer, Cesare Furlanello</dc:creator><pubDate>Tue, 11 Nov 2008 12:13:33 -0000</pubDate><comments>http://mloss.org/software/rss/comments/66</comments><guid>http://mloss.org/software/view/66/1.2.7</guid><category>svm</category><category>classification</category><category>fda</category><category>feature weighting</category><category>irelief</category><category>rfe</category><category>feature ranking</category><category>resampling</category><category>srda</category><category>nn</category><category>dwt</category><category>pda</category><category>nips2008</category></item><item><title>dlib 17.12</title><link>http://mloss.org/software/view/83/17.12</link><description>&lt;html&gt;&lt;p&gt;A C++ toolkit containing machine learning algorithms and tools that facilitate creating complex software in C++ to solve real world problems.
&lt;/p&gt;
&lt;p&gt;The library provides efficient implementations of the following algorithms:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     support vector machines for classification
 &lt;/li&gt;

 &lt;li&gt;
     relevance vector machines for regression and classification 
 &lt;/li&gt;

 &lt;li&gt;
     reduced set approximation of SV decision surfaces
 &lt;/li&gt;

 &lt;li&gt;
     online kernel RLS regression 
 &lt;/li&gt;

 &lt;li&gt;
     online kernelized centroid estimation/one class classifier
 &lt;/li&gt;

 &lt;li&gt;
     kernel k-means clustering 
 &lt;/li&gt;

 &lt;li&gt;
     radial basis function networks
 &lt;/li&gt;

 &lt;li&gt;
     kernelized recursive feature ranking
 &lt;/li&gt;

 &lt;li&gt;
     Bayesian network inference using junction trees or MCMC
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The library also comes with extensive documentation and example programs that walk the user through the use of these machine learning techniques.
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Davis King</dc:creator><pubDate>Tue, 11 Nov 2008 03:02:31 -0000</pubDate><comments>http://mloss.org/software/rss/comments/83</comments><guid>http://mloss.org/software/view/83/17.12</guid><category>svm</category><category>classification</category><category>clustering</category><category>regression</category><category>kernel methods</category><category>matrix library</category><category>kkmeans</category><category>optimization</category><category>algorithms</category><category>exact bayesian methods</category><category>approximate inference</category><category>bayesian networks</category><category>junction tree</category></item><item><title>RL Glue and Codecs  -- Glue 3.0 RC2 and Codecs R338</title><link>http://mloss.org/software/view/151/%20--%20Glue%203.0%20RC2%20and%20Codecs%20R338</link><description>&lt;html&gt;&lt;p&gt;RL-Glue allows agents, environments, and experiments written in Java, C/C++, Matlab, Python, and Lisp to inter operate, accelerating research by promoting software re-use in the community.
&lt;/p&gt;
&lt;p&gt;Note:  This is a release candidate, not a final release.  We are actively soliciting feedback from the community about any problems with the software or documentation that can be improved before a final release in Nov/Dec 2008.
&lt;/p&gt;
&lt;p&gt;Update posted Oct 11/2008: the main RL-Glue and the C codec now use const pointers instead of structs passed by value in parameters and return types. This breaks backward compatibility; see the technical manual in docs.
&lt;/p&gt;
&lt;p&gt;Update posted Oct 8/2008: fixed a memory leak in RL-Glue, fixed the skeleton experiment build on Linux, and updated Cygwin compatibility.
&lt;/p&gt;
&lt;p&gt;Overview
   &lt;hr /&gt;
        Inspired by related psychological theory, in computer science, reinforcement learning is a sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward. Reinforcement learning algorithms attempt to find a policy that maps states of the world to the actions the agent ought to take in those states...
       -- Wikipedia Reinforcement Learning Article
&lt;/p&gt;
&lt;p&gt;RL-Glue is a set of common guidelines for the reinforcement learning community to follow to allow us to share and compare agents and environments with greater ease.  The software implementation of RL-Glue is the reusable glue interface to connect the basic parts of a learning experiment.
&lt;/p&gt;
&lt;p&gt;RL-Glue supports interaction between agents, environments, and experiment programs in two different modes.  In direct-compile mode, all three modules are written in C/C++ and compiled together into a single executable program.
&lt;/p&gt;
&lt;p&gt;In the more flexible socket mode, agents, environments, and experiments use inter-process communication through sockets, either locally on one computer or over a network or the Internet.  In socket mode, agents, environments, and experiments written in a variety of languages can interact with each other transparently.  The language-specific software that allows programs written in a particular language to connect to RL-Glue is called a codec.  We currently have codecs for:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     C/C++
 &lt;/li&gt;

 &lt;li&gt;
     Java
 &lt;/li&gt;

 &lt;li&gt;
     Python
 &lt;/li&gt;

 &lt;li&gt;
     Matlab
 &lt;/li&gt;

 &lt;li&gt;
     Lisp
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Members of the reinforcement learning community are welcome to write their own language- or project-specific codecs to use with RL-Glue.  The Lisp codec is an example of a user-contributed codec.  Codecs are currently in development to connect projects as diverse as a real-time strategy game, an Atari emulator, and a robot to RL-Glue.
&lt;/p&gt;
&lt;p&gt;The RL-Glue software project, combined with the RL-Glue codecs, is a powerful tool that allows members of the reinforcement learning community to re-use each other's agents, environments, and experiment programs to help quicken the pace of research. Before RL-Glue, most researchers implemented their own experiment protocol, making collaboration difficult.
&lt;/p&gt;
&lt;p&gt;RL-Glue has been the base for the last few reinforcement learning competitions, and that trend will continue with the 2009 Reinforcement Learning Competition.
&lt;/p&gt;
&lt;p&gt;What's new in RL-Glue 3.0
   &lt;hr /&gt;
   - A new homepage: http://glue.rl-community.org/
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;p&gt;Revamped build system (autotools) for maximum cross-platform install compatibility (Linux, Unix, MacOS, Cygwin)
   -Installing has never been simpler:
   &amp;gt;$ ./configure
   &amp;gt;$ make
   &amp;gt;$ sudo make install
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;RL-Glue now installs to /usr/local
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     Headers and libs in standard search paths&lt;ul&gt;
 &lt;li&gt;
     Compiling agents/environments/experiments has never been easier:
     &amp;gt;$ gcc MyAgent.c -lrlagent -o myAgent.exe
 &lt;/li&gt;
&lt;/ul&gt;

 &lt;/li&gt;
&lt;/ul&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Codecs for C/C++, Java, Python, MATLAB AND LISP  &amp;lt;--- MATLAB AND LISP!
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;charArray (String) observation and action types!
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;Documentation
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     General RL-Glue Overview
 &lt;/li&gt;

 &lt;li&gt;
     RL-Glue Technical Manual
 &lt;/li&gt;

 &lt;li&gt;
     A manual for each codec!
 &lt;/li&gt;
&lt;/ul&gt;

 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;History of RL-Glue
   &lt;hr /&gt;
   We can trace RL-Glue back as far as 1996, to a project by Rich Sutton and Juan Carlos Santamaria called RL-Interface.  Since then, the project has gone through several designs and languages.  Over time the objectives of the project became more ambitious: it grew from a convenient calling convention within a single language into a complete protocol that lets programs written in many different languages communicate with each other.
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Brian Tanner, Adam White, Richard S. Sutton</dc:creator><pubDate>Tue, 11 Nov 2008 01:20:00 -0000</pubDate><comments>http://mloss.org/software/rss/comments/151</comments><guid>http://mloss.org/software/view/151/ -- Glue 3.0 RC2 and Codecs R338</guid><category>control</category><category>reinforcement learning</category><category>nips2008</category></item><item><title>Model Monitor 1.0</title><link>http://mloss.org/software/view/137/1.0</link><description>&lt;html&gt;&lt;p&gt;Model Monitor is a Java toolkit for the systematic evaluation of classifiers under changes in distribution.  It provides methods for detecting distribution shifts in data, comparing the performance of multiple classifiers under shifts in distribution, and evaluating the robustness of individual classifiers to distribution change.  As such, it allows users to determine the best model (or models) for their data under a number of potential scenarios.  Additionally, Model Monitor is fully integrated with the WEKA machine learning environment,  so that a variety of commodity classifiers can be used if desired.
&lt;/p&gt;
&lt;p&gt;Some of the techniques implemented in the software come from our papers:
&lt;/p&gt;
&lt;p&gt;David A. Cieslak and Nitesh V. Chawla "Detecting Fracture Points in Classifier Performance", 7th IEEE Conference on Data Mining, pp. 123-132, 2007.
&lt;/p&gt;
&lt;p&gt;David A. Cieslak and Nitesh V. Chawla "A Framework for Monitoring Classifiers' Performance: When and Why Failure Occurs?", Knowledge and Information Systems 2008.
&lt;/p&gt;
&lt;p&gt;Interested parties can find them on our website:
&lt;/p&gt;
&lt;p&gt;http://www.nd.edu/~dial
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Troy Raeder, Nitesh V. Chawla</dc:creator><pubDate>Fri, 07 Nov 2008 18:07:20 -0000</pubDate><comments>http://mloss.org/software/rss/comments/137</comments><guid>http://mloss.org/software/view/137/1.0</guid><category>machine learning</category><category>data mining</category><category>nips2008</category><category>distribution shift</category><category>evaluation</category></item><item><title>r-cran-mboost 1.0-4</title><link>http://mloss.org/software/view/105/1.0-4</link><description>&lt;html&gt;&lt;p&gt;Model-Based Boosting: Functional gradient descent algorithms (boosting) for optimizing general loss functions utilizing componentwise least squares, either of parametric linear form or smoothing splines, or regression trees as base learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data.
&lt;/p&gt;&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Torsten Hothorn, Peter Buhlmann, Thomas Kneib, Matthias Schmid, Benjamin Hofner</dc:creator><pubDate>Fri, 07 Nov 2008 00:00:00 -0000</pubDate><comments>http://mloss.org/software/rss/comments/105</comments><guid>http://mloss.org/software/view/105/1.0-4</guid><category>r-cran</category></item><item><title>WEKA 3.5.8</title><link>http://mloss.org/software/view/16/3.5.8</link><description>&lt;html&gt;&lt;p&gt;The Weka workbench contains a collection of visualization tools and algorithms for data analysis and predictive modelling, together with graphical user interfaces for easy access to this functionality. The main strengths of Weka are that it is
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     freely available under the GNU General Public License,
 &lt;/li&gt;

 &lt;li&gt;
     very portable, because it is fully implemented in the Java programming language and thus runs on almost any computing platform,
 &lt;/li&gt;

 &lt;li&gt;
     comprehensive, with a broad collection of data preprocessing and modeling techniques, and
 &lt;/li&gt;

 &lt;li&gt;
     easy for a novice to use, thanks to the graphical user interfaces it contains.
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Weka supports several standard data mining tasks, more specifically, data preprocessing, clustering, classification, regression, visualization, and feature selection. All of Weka's techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes (normally, numeric or nominal attributes, but some other attribute types are also supported). Weka provides access to SQL databases using Java Database Connectivity and can process the result returned by a database query. It is not capable of multi-relational data mining, but there is separate software for converting a collection of linked database tables into a single table that is suitable for processing using Weka. Another important area that is currently not covered by the algorithms included in the Weka distribution is sequence modeling.
&lt;/p&gt;
&lt;p&gt;Weka's main user interface is the Explorer, but essentially the same functionality can be accessed through the component-based Knowledge Flow interface and from the command line. There is also the Experimenter, which allows the systematic comparison of the predictive performance of Weka's machine learning algorithms on a collection of datasets.
&lt;/p&gt;
&lt;p&gt;The Explorer interface has several panels that give access to the main components of the workbench. The Preprocess panel has facilities for importing data from a database, a CSV file, etc., and for preprocessing this data using a so-called filtering algorithm. These filters can be used to transform the data (e.g., turning numeric attributes into discrete ones) and make it possible to delete instances and attributes according to specific criteria. The Classify panel enables the user to apply classification and regression algorithms (indiscriminately called classifiers in Weka) to the resulting dataset, to estimate the accuracy of the resulting predictive model, and to visualize erroneous predictions, ROC curves, etc., or the model itself (if the model is amenable to visualization, as is, e.g., a decision tree). Weka contains many of the latest sophisticated methods, such as support vector machines, Gaussian processes and random forests, but also classic methods like C4.5, ANNs, bagging, boosting, etc. The Associate panel provides access to association rule learners that attempt to identify all important interrelationships between attributes in the data. The Cluster panel gives access to the clustering techniques in Weka, e.g., the simple k-means algorithm. There is also an implementation of the expectation maximization algorithm for learning a mixture of normal distributions. The next panel, Select attributes, provides algorithms for identifying the most predictive attributes in a dataset. The last panel, Visualize, shows a scatter plot matrix, where individual scatter plots can be selected, enlarged, and analyzed further using various selection operators.
&lt;/p&gt;
&lt;p&gt;What's new since version 3.5.6?
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     New classifiers: Bayesian logistic regression, discriminative multinomial naive Bayes for text classification, functional, decision table-naive Bayes hybrid, averaged one-dependence estimators with subsumption resolution, J48 with grafting and classification via clustering meta classifier
 &lt;/li&gt;

 &lt;li&gt;
     Subset by expression, reservoir sampling, RELAGGS and random subset filters. 
 &lt;/li&gt;

 &lt;li&gt;
     Latent semantic analysis attribute transformer; cost sensitive attribute selection; linear forward selection and subset size forward selection search methods. 
 &lt;/li&gt;

 &lt;li&gt;
     Improved output in Logistic, NaiveBayes, EM and SimpleKMeans. 
 &lt;/li&gt;

 &lt;li&gt;
     Plugin support for the KnowledgeFlow. 
 &lt;/li&gt;

 &lt;li&gt;
     Ability to execute knowledge flows outside of the GUI. 
 &lt;/li&gt;

 &lt;li&gt;
     Output predictions for a run of cross-validation and percentage split on the command line. 
 &lt;/li&gt;

 &lt;li&gt;
     Instance weights can now be specified in a standard ARFF file. 
 &lt;/li&gt;

 &lt;li&gt;
     GUI for Bayes net classifiers
 &lt;/li&gt;

 &lt;li&gt;
     Manhattan and Chebyshev distance functions
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/html&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Waikato Machine Learning Group</dc:creator><pubDate>Thu, 06 Nov 2008 23:01:06 -0000</pubDate><comments>http://mloss.org/software/rss/comments/16</comments><guid>http://mloss.org/software/view/16/3.5.8</guid><category>association rules</category><category>attribute selection</category><category>classification</category><category>clustering</category><category>preprocessing</category><category>regression</category></item></channel></rss>