Somoclu is a C++ tool for training self-organizing maps on large data sets using a high-performance cluster. It builds on MPI to distribute the workload across the nodes of the cluster. It can also accelerate training with CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. The code is released under the GNU GPLv3 licence.
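To make the idea of a sparse kernel concrete, here is a minimal pure-Python sketch of batch self-organizing map training on sparse input. It is a generic illustration, not Somoclu's code or API: the function names (`sparse_sq_dist`, `best_matching_unit`, `train_som`), the dict-based sparse format, the Gaussian neighborhood, the linear radius cooling, and the default radii are all assumptions chosen for the example. The point it shows is that distances to the dense codebook can be computed by touching only the non-zero entries of each input vector.

```python
import math
import random


def sparse_sq_dist(x, w, w_sq_norm):
    """Squared Euclidean distance between a sparse vector x (dict mapping
    index -> non-zero value) and a dense vector w, using only the non-zero
    entries of x: ||x||^2 - 2 x.w + ||w||^2."""
    x_sq_norm = sum(v * v for v in x.values())
    dot = sum(v * w[i] for i, v in x.items())
    return x_sq_norm - 2.0 * dot + w_sq_norm


def best_matching_unit(x, codebook, sq_norms=None):
    """Index of the codebook vector closest to the sparse vector x."""
    if sq_norms is None:
        sq_norms = [sum(c * c for c in w) for w in codebook]
    return min(range(len(codebook)),
               key=lambda k: sparse_sq_dist(x, codebook[k], sq_norms[k]))


def train_som(data, n_columns, n_rows, dim, epochs=10,
              radius0=None, radius_n=1.0, initial_codebook=None, seed=0):
    """Batch SOM on a rectangular, planar grid (hypothetical helper)."""
    rng = random.Random(seed)
    n_nodes = n_columns * n_rows
    if initial_codebook is None:
        codebook = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    else:
        codebook = [list(w) for w in initial_codebook]
    # Grid coordinates of each node, row by row.
    coords = [(k % n_columns, k // n_columns) for k in range(n_nodes)]
    if radius0 is None:
        radius0 = min(n_columns, n_rows) / 2.0  # assumed default
    for epoch in range(epochs):
        # Linear cooling of the neighborhood radius from radius0 to radius_n.
        radius = radius0 + (radius_n - radius0) * epoch / max(epochs - 1, 1)
        # The codebook is fixed within an epoch, so norms are computed once.
        sq_norms = [sum(c * c for c in w) for w in codebook]
        numer = [[0.0] * dim for _ in range(n_nodes)]
        denom = [0.0] * n_nodes
        for x in data:
            bx, by = coords[best_matching_unit(x, codebook, sq_norms)]
            for k, (cx, cy) in enumerate(coords):
                d2 = (cx - bx) ** 2 + (cy - by) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))  # Gaussian weight
                denom[k] += h
                for i, v in x.items():  # only non-zero entries contribute
                    numer[k][i] += h * v
        # Batch update: each node becomes a neighborhood-weighted mean.
        codebook = [[n / denom[k] for n in numer[k]] if denom[k] > 0.0
                    else codebook[k] for k in range(n_nodes)]
    return codebook


# Toy "documents" in a 6-dimensional term space, stored sparsely.
docs = [{0: 1.0, 1: 1.0}, {0: 1.0}, {4: 1.0, 5: 1.0}, {5: 1.0}]
codebook = train_som(docs, n_columns=3, n_rows=3, dim=6, epochs=5)
bmus = [best_matching_unit(x, codebook) for x in docs]
```

Because the update is computed once per epoch from accumulated sums, the per-sample work is independent across samples, which is the property that makes this formulation easy to distribute.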
- Changes to previous version:
Initial Announcement on mloss.org.
Other available revisions
Version 1.6 (January 11, 2016, 09:40:34)
- New: R wrapper integrates with kohonen package.
- New: MATLAB wrapper integrates with somtoolbox.
- New: Better handling of CUDA compilation in the Python interface.
- Changed: Throws an exception if the GPU kernel is requested but the library was compiled without it. The earlier behaviour quietly defaulted to the CPU kernel.

Version 1.5.1 (December 2, 2015, 08:18:27)
- New: Neighborhood function can be chosen between Gaussian and bubble.
- Fixed: R wrapper passes arrays with correct orientation.
- io.cpp is no longer required in the wrappers. An exception is thrown when needed.

Version 1.5 (September 30, 2015, 13:27:52)
- New: Python interface has visual capabilities.
- New: Option for hexagonal grid.
- New: Option for requesting compact support in updating the map.
- New: Python, R, and MATLAB interfaces now allow passing an initial codebook.
- Changed: Reduced memory use in calculating U-matrices.
- Changed: Build system rebuilt and simplified.

Version 1.4.1 (January 28, 2015, 13:19:36)
- Better support for ICC.
- Faster code when compiling with GCC.
- Building instructions and documentation improved.
- Bug fixes: portability for R, using native R random number generator.

Version 1.4 (September 5, 2014, 13:01:14)
- Better Windows support.
- Completed CUDA support for Python and R interfaces.
- Faster compilation by removing unnecessary flags for nvcc.
- Support for CUDA 6.5.
- Bug fixes: R version no longer needs separate code.

Version 1.3.1 (April 10, 2014, 06:41:38)
- Initial Windows support through GCC on Windows.
- Better I/O separation for the Python, R, and MATLAB interfaces.
- Bug fixes: major MPI initialization bug fixed.

Version 1.3 (March 31, 2014, 07:53:05)
- Python, R, and MATLAB interfaces added.
- Learning rate parameter included.
- Linear and exponential cooling strategies added for radius and learning rate.
- CLI interface made more user-friendly.
- Default radius depends on both X and Y of the map.
- Bug fixes: CUDA build without MPI, best matching unit passing without MPI, coordinate order in best matching unit file.

Version 1.2 (December 17, 2013, 04:31:05)
- Massive improvements in OpenMP parallelization.
- MPI libraries are no longer mandatory.
- Best matching units are saved.
- Option for specifying an initial codebook for the map.
- ESOM .lrn input format added.
- Parsing of white-space characters corrected.
- Long-named command line switches for specifying SOM dimensions.
- Fine-grained control of which interim files to save across epochs.
- Option in Makefile for building shared library.

Version 1.1.2 (November 28, 2013, 03:20:22)
Toroid maps were added. Initial radius is exposed as a parameter via the command line interface. Formats of codebook and U-matrix export are compatible with Databionic ESOM Tools for advanced visualisation. Bug fixes: codebook update with compact support was removed, NaN entry no longer appears in U-matrices.

Version 1.0 (May 14, 2013, 06:21:13)
Initial Announcement on mloss.org.