Projects that are tagged with cuda.


Somoclu 1.7.4

by peterwittek - June 6, 2017, 15:48:11 CET - 27544 views, 5002 downloads, 3 subscriptions

About: Somoclu is a massively parallel implementation of self-organizing maps. It relies on OpenMP for multicore execution and MPI for distributing the workload, and it can be accelerated by CUDA on a GPU cluster. A sparse kernel is also included, which is useful for training maps on vector spaces generated in text mining processes. In addition to a command-line interface, it provides Python, Julia, R, and MATLAB interfaces.
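As a rough illustration of what training a self-organizing map involves, the following is a minimal single-threaded sketch in plain Python. It is not Somoclu's API or implementation: the function names, the rectangular grid, the Gaussian neighborhood, and the linear decay schedules are all assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(codebook, sample):
    """Return the (row, col) of the node whose weight vector is closest to sample."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(codebook):
        for c, weights in enumerate(row):
            dist = sum((w - x) ** 2 for w, x in zip(weights, sample))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, n_rows, n_cols, epochs=20, lr0=0.5, radius0=None, seed=0):
    """Online SOM training on a rectangular grid (hypothetical helper, not Somoclu's)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if radius0 is None:
        radius0 = max(n_rows, n_cols) / 2.0
    # Random initial codebook: one weight vector per grid node.
    codebook = [[[rng.random() for _ in range(dim)] for _ in range(n_cols)]
                for _ in range(n_rows)]
    for epoch in range(epochs):
        # Assumed schedule: linear decay of learning rate and neighborhood radius.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for sample in data:
            br, bc = best_matching_unit(codebook, sample)
            for r in range(n_rows):
                for c in range(n_cols):
                    # Gaussian neighborhood weight based on grid distance to the BMU.
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2.0 * radius ** 2))
                    weights = codebook[r][c]
                    for i in range(dim):
                        weights[i] += lr * h * (sample[i] - weights[i])
    return codebook
```

The search for the best matching unit and the per-node weight updates are independent across nodes, which is the kind of work that parallel implementations distribute across cores, processes, or GPU threads.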

Changes:
  • New: Verbosity parameter in the command-line, Python, MATLAB, and Julia interfaces.
  • Changed: Calculation of U-matrix parallelized.
  • Changed: Moved feeding data to train method in the Python interface.
  • Fixed: The random seed was set to 0 for testing purposes. This is now changed to a wall-time based initialization.
  • Fixed: Sparse matrix reader made more robust.
  • Fixed: Compatibility with kohonen 3 resolved.
  • Fixed: Compatibility with Matplotlib 2 resolved.
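For readers unfamiliar with the U-matrix mentioned in the changes above, here is a minimal sketch of one common definition: each cell holds the mean Euclidean distance between a node's weight vector and those of its immediate grid neighbors. The function name and the 4-connected rectangular neighborhood are assumptions for illustration; Somoclu's own computation may differ in neighborhood and grid type.

```python
import math


def u_matrix(codebook):
    """Mean distance from each node's weights to its 4-connected grid neighbors."""
    n_rows, n_cols = len(codebook), len(codebook[0])
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < n_rows and 0 <= nc < n_cols:
                    diff2 = sum((a - b) ** 2
                                for a, b in zip(codebook[r][c], codebook[nr][nc]))
                    dists.append(math.sqrt(diff2))
            # A 1x1 map has no neighbors; report 0.0 in that case.
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result
```

Each cell depends only on the trained codebook and not on other cells of the result, so the loops over `r` and `c` can run in parallel without synchronization.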

Theano 0.9.0

by jaberg - April 10, 2017, 20:30:17 CET - 33088 views, 5560 downloads, 3 subscriptions

About: A Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Dynamically generates CPU and GPU modules for good performance. Deep Learning Tutorials illustrate deep learning with Theano.

Changes:

Theano 0.9.0 (20th of March, 2017)

Highlights (since 0.8.0):

* Better Python 3.5 support
* Better numpy 1.12 support
* Conda packages for Mac, Linux and Windows
* Support newer Mac and Windows versions
* More Windows integration:

    * Theano scripts (``theano-cache`` and ``theano-nose``) now work on Windows
    * Better support for Windows line endings in C code
    * Support for spaces in paths on Windows

* Scan improvements:

    * More scan optimizations, with faster compilation and gradient computation
    * Support for checkpoint in scan (trade off between speed and memory usage, useful for long sequences)
    * Fixed broadcast checking in scan

* Graph improvements:

    * More numerical stability by default for some graphs
    * Better handling of corner cases for theano functions and graph optimizations
    * More graph optimizations with faster compilation and execution
    * Smaller and more readable graphs

* New GPU back-end:

    * Removed warp-synchronous programming to get good results with newer CUDA drivers
    * More pooling support on GPU when cuDNN isn't available
    * Full support of ignore_border option for pooling
    * Inplace storage for shared variables
    * float16 storage
    * Using the PCI bus ID of graphics cards for a better mapping between the Theano device number and the nvidia-smi number
    * Fixed offset error in ``GpuIncSubtensor``

* Less C code compilation
* Added support for bool dtype
* Updated and more complete documentation
* Bug fixes related to merge optimizer and shape inference
* Many other bug fixes, crash fixes, and warning improvements
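The ``ignore_border`` pooling option listed under the new GPU back-end controls whether partial windows at the edge of the input are pooled or dropped. The sketch below shows that behaviour in plain Python for max pooling with a stride equal to the window size. It is an illustration only, not Theano code: the function name ``max_pool_2d`` and its signature are assumptions.

```python
def max_pool_2d(matrix, ws, ignore_border=True):
    """Max-pool a 2D list with window ws=(rows, cols) and stride equal to ws."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    wr, wc = ws
    if ignore_border:
        # Only complete windows contribute to the output.
        out_rows, out_cols = n_rows // wr, n_cols // wc
    else:
        # Ceiling division: partial windows at the border are pooled as well.
        out_rows, out_cols = -(-n_rows // wr), -(-n_cols // wc)
    pooled = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Slicing clips at the matrix edge, which yields the partial windows.
            window = [v for line in matrix[i * wr:(i + 1) * wr]
                      for v in line[j * wc:(j + 1) * wc]]
            row.append(max(window))
        pooled.append(row)
    return pooled
```

With a 5x5 input and a 2x2 window, ``ignore_border=True`` produces a 2x2 output, while ``ignore_border=False`` produces a 3x3 output that includes the clipped windows along the last row and column.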

deepdetect 0.1

by beniz - June 2, 2015, 09:25:28 CET - 2290 views, 591 downloads, 3 subscriptions

About: A Deep Learning API and server

Changes:

Initial Announcement on mloss.org.


MShadow 1.0

by antinucleon - April 10, 2014, 02:57:54 CET - 3374 views, 852 downloads, 1 subscription

About: Lightweight CPU/GPU Matrix/Tensor Template Library in C++/CUDA. Supports high-performance expansion of element-wise expressions. Code once, run smoothly on both GPU and CPU.

Changes:

Initial Announcement on mloss.org.


CXXNET 0.1

by antinucleon - April 10, 2014, 02:47:08 CET - 3735 views, 893 downloads, 1 subscription

About: CXXNET (pronounced "C plus plus net") is a neural network toolkit built on mshadow (https://github.com/tqchen/mshadow). It is yet another implementation of (convolutional) neural networks. It is written in C++, with about 1000 lines of network layer implementations, is easily configured via a config file, and can achieve state-of-the-art performance.

Changes:

Initial Announcement on mloss.org.