DRVQ is a C++ library implementing dimensionality-recursive vector quantization, a fast vector quantization method for high-dimensional Euclidean spaces under arbitrary data distributions. It approximates k-means, is practically constant in data size, and applies to arbitrarily high dimensions, but it scales only to a few thousand centroids. As a by-product of training, a tree structure performs either exact or approximate quantization on the trained centroids; the approximate variant is not very precise but extremely fast.
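For reference, exact quantization against a set of trained centroids amounts to finding the nearest centroid under Euclidean distance. The following is a minimal sketch using exhaustive search. It is not DRVQ's tree-based method and does not use its API; the names `squared_distance` and `quantize_exact` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Squared Euclidean distance between two vectors of equal dimension.
template <typename T>
T squared_distance(const std::vector<T>& a, const std::vector<T>& b) {
    assert(a.size() == b.size());
    T sum = T(0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        const T d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Exact quantization by exhaustive search (hypothetical helper, not part
// of DRVQ): returns the index of the centroid nearest to x.
template <typename T>
std::size_t quantize_exact(const std::vector<T>& x,
                           const std::vector<std::vector<T>>& centroids) {
    assert(!centroids.empty());
    std::size_t best = 0;
    T best_dist = std::numeric_limits<T>::max();
    for (std::size_t k = 0; k < centroids.size(); ++k) {
        const T d = squared_distance(x, centroids[k]);
        if (d < best_dist) {
            best_dist = d;
            best = k;
        }
    }
    return best;
}
```

The cost of this baseline grows linearly with the number of centroids and the dimension, which is the cost a tree structure for quantization aims to reduce.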
The methods used are described in the research project and the original publication.
The latest stable release is available at SourceForge and the latest development version is available at GitHub, where a detailed README file describes the usage of the software, including license, requirements, installation, file formats, sample data, tools, and options. With the sample data provided and the default options, it is possible to test the code immediately as a demo.
DRVQ has only been tested with clang 3.3 and g++ 4.8.1 on Linux, but it should be straightforward to use on other platforms.
All tools are provided as very simple, script-like .cpp files of a few lines of code each, with all low-level implementation hidden in included files. Each file generates a command-line executable that can be used as a stand-alone tool for a particular job, using a rich set of command-line options.
The source code of the tools also illustrates how the library can be used, e.g. for integration into other programs. One is welcome to tune the code beyond the options offered by the tools, including the data types used. This is straightforward because the implementation uses templates or type definitions where necessary.
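As a generic illustration of why templates and type definitions make data types easy to change, the sketch below computes the mean of a set of vectors (a centroid update) with the storage type and the arithmetic type kept separate. It is not taken from DRVQ; `data_t`, `real_t`, and `mean_vector` are hypothetical names assumed only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type definitions: editing these two lines switches the
// storage type and the arithmetic precision used by calling code.
typedef unsigned char data_t;  // element type of stored vectors
typedef float real_t;          // type used for accumulation

// Mean of a non-empty set of equal-length vectors, accumulated in type Out.
template <typename In, typename Out>
std::vector<Out> mean_vector(const std::vector<std::vector<In>>& points) {
    assert(!points.empty());
    std::vector<Out> mean(points[0].size(), Out(0));
    for (const auto& p : points) {
        assert(p.size() == mean.size());
        for (std::size_t i = 0; i < p.size(); ++i) {
            mean[i] += static_cast<Out>(p[i]);
        }
    }
    for (auto& m : mean) {
        m /= static_cast<Out>(points.size());
    }
    return mean;
}
```

A call such as `mean_vector<data_t, real_t>(points)` then follows whatever the two type definitions say, so switching from byte-valued to floating-point data requires no change to the function itself.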
- Changes to previous version:
Initial Announcement on mloss.org.