Project details for Somoclu

Somoclu 1.6.1

by peterwittek - February 22, 2016, 10:42:47 CET


Somoclu is a C++ tool for training self-organizing maps on large data sets using massively parallel resources. It relies on OpenMP for multicore execution and on MPI for distributing the workload across the nodes of a cluster. It can also accelerate training with CUDA if graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, R, and MATLAB interfaces facilitate use in data analysis. The code is released under the GNU GPLv3 licence.

Key features:

  • Fast execution by parallelization: OpenMP, MPI, and CUDA are supported.

  • Python, R, and MATLAB interfaces for the dense multicore CPU kernel.

  • Planar and toroid maps.

  • Rectangular and hexagonal grids.

  • Gaussian and bubble neighborhood functions.

  • Both dense and sparse input data are supported.

  • Large emergent maps of several hundred thousand neurons are feasible.

  • Integration with Databionic ESOM Tools.
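
The map and neighborhood options listed above can be illustrated with a short sketch. This is not Somoclu's code or its API: the function names `grid_distance` and `neighborhood` are hypothetical, only a rectangular grid is covered, and the Gaussian uses the generic textbook form, which may differ in scaling from what Somoclu implements.

```python
import math


def grid_distance(a, b, n_columns, n_rows, toroid=False):
    """Euclidean distance between two neurons on a rectangular grid.

    a and b are (column, row) pairs. On a toroid map the grid wraps
    around at the edges, so the shorter way around is taken on each axis.
    """
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    if toroid:
        dx = min(dx, n_columns - dx)
        dy = min(dy, n_rows - dy)
    return math.hypot(dx, dy)


def neighborhood(distance, radius, kind="gaussian"):
    """Weight applied to a neuron at the given grid distance from the
    best matching unit (assumed textbook definitions)."""
    if kind == "gaussian":
        # Smooth decay: 1 at the best matching unit, falling off with distance.
        return math.exp(-(distance ** 2) / (2.0 * radius ** 2))
    if kind == "bubble":
        # Hard cutoff: full update inside the radius, none outside.
        return 1.0 if distance <= radius else 0.0
    raise ValueError("kind must be 'gaussian' or 'bubble'")


if __name__ == "__main__":
    # Opposite corners of a 10 x 10 map: far apart on a planar map,
    # one step apart along each axis on a toroid map.
    planar = grid_distance((0, 0), (9, 9), 10, 10)
    wrapped = grid_distance((0, 0), (9, 9), 10, 10, toroid=True)
    for name, d in (("planar", planar), ("toroid", wrapped)):
        print(name, d,
              neighborhood(d, 2.0, "gaussian"),
              neighborhood(d, 2.0, "bubble"))
```

On a toroid map the corner neurons become close neighbors, so the border effects of a planar map do not arise. The bubble function gives every neuron within the radius the same update weight, whereas the Gaussian weights neurons by how far they are from the best matching unit.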

Changes to previous version:
  • New: Option for PCA initialization is added to the Python interface.
  • New: Clustering of the codebook with an arbitrary clustering algorithm from scikit-learn is now possible in the Python interface.
Supported Operating Systems: Linux, Windows, OS X
Data Formats: ASCII, LIBSVM, ESOM
Tags: CUDA, Self-Organizing Maps, MPI, ESOM, OpenMP

Other available revisions

Version Changelog Date
  • New: Option for PCA initialization is added to the Python interface.
  • New: Clustering of the codebook with an arbitrary clustering algorithm from scikit-learn is now possible in the Python interface.
February 22, 2016, 10:42:47
  • New: R wrapper integrates with kohonen package.
  • New: MATLAB wrapper integrates with somtoolbox.
  • New: Better handling of CUDA compilation in the Python interface.
  • Changed: Throws an exception if GPU kernel is requested, but it was compiled without it. The earlier behaviour quietly defaulted to the CPU kernel.
January 11, 2016, 09:40:34
  • New: Neighborhood function can be chosen between Gaussian and bubble.
  • Fixed: R wrapper passes arrays with correct orientation.
  • Fixed: io.cpp is no longer required in the wrappers. An exception is thrown when needed.
December 2, 2015, 08:18:27
  • New: Python interface has visual capabilities.
  • New: Option for hexagonal grid.
  • New: Option for requesting compact support in updating the map.
  • New: Python, R, and MATLAB interfaces now allow passing an initial codebook.
  • Changed: Reduced memory use in calculating U-matrices.
  • Changed: Build system rebuilt and simplified.
September 30, 2015, 13:27:52
  • Better support for ICC.
  • Faster code when compiling with GCC.
  • Building instructions and documentation improved.
  • Bug fixes: portability for R, using native R random number generator.
January 28, 2015, 13:19:36
  • Better Windows support.
  • Completed CUDA support for Python and R interfaces.
  • Faster compilation by removing unnecessary flags for nvcc.
  • Support for CUDA 6.5.
  • Bug fixes: R version no longer needs separate code.
September 5, 2014, 13:01:14
  • Initial Windows support through GCC on Windows.
  • Better I/O separation for the Python, R, and MATLAB interfaces.
  • Bug fixes: major MPI initialization bug fixed.
April 10, 2014, 06:41:38
  • Python, R, and MATLAB interfaces added.
  • Learning rate parameter included.
  • Linear and exponential cooling strategies added for radius and learning rate.
  • Command-line interface made more user-friendly.
  • Default radius depends on both X and Y of the map.
  • Bug fixes: CUDA build without MPI, best matching unit passing without MPI, coordinate order in best matching unit file.
March 31, 2014, 07:53:05
  • Massive improvements in OpenMP parallelization.
  • MPI libraries are no longer mandatory.
  • Best matching units are saved.
  • Option for specifying an initial codebook for the map.
  • ESOM .lrn input format added.
  • Parsing of white-space characters corrected.
  • Long-named command line switches for specifying SOM dimensions.
  • Fine-grained control of which interim files to save across epochs.
  • Option in Makefile for building shared library.
December 17, 2013, 04:31:05

Toroid maps were added. Initial radius is exposed as a parameter via the command line interface. Formats of codebook and U-matrix export are compatible with Databionic ESOM Tools for advanced visualisation. Bug fixes: codebook update with a compact support was removed, NaN entry no longer appears in U-matrices.

November 28, 2013, 03:20:22

Initial Announcement on

May 14, 2013, 06:21:13

