Examples
The challenge
Many biologically inspired AI systems are built on large systems of differential equations, with complex computations performed on many elements of the system in parallel. These systems resemble models of natural physical phenomena, such as the heat loss of a physical body or the movement of a snake across the ground. Both the systems and their building blocks apply to a wide range of tasks, including stock market prediction and live video monitoring.
One major obstacle to implementing biologically inspired AI systems is speed. Conventional single-core processors and PC clusters are suboptimal because of their inherently serial programming environments and data synchronization limitations. Low-cost parallel hardware platforms such as GPUs and multicore CPUs are now available, but they are difficult to program. In addition, migrating existing AI systems to a parallel environment poses significant technical challenges.
Example implementation: image processing
This example illustrates the speedup gained by implementing a complex, large-scale neural network system in NTP. Visual receptive fields can enhance important image information when applied across many pixels of an image in parallel.
This type of architecture is widely used in biologically inspired image processing and classification models, including geospatial image processing, face detection, and medical image analysis. These applications require high processing speed, both to handle large amounts of data and to numerically integrate the system’s equations.
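As a rough illustration of what "applying a receptive field across many pixels" means, the sketch below filters an image with a center-surround (difference-of-Gaussians) kernel in plain Python. It is not NTP code: the function names, kernel radius, and sigma values are assumptions chosen for illustration, and the source does not specify which receptive field shape is used. The point to notice is that each output pixel depends only on a small neighborhood of the input, so every pixel could be computed independently on parallel hardware.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    size = 2 * radius + 1
    kernel = [
        [
            math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def center_surround_kernel(radius=2, sigma_center=0.7, sigma_surround=1.5):
    """Narrow excitatory center minus broad inhibitory surround (assumed values)."""
    center = gaussian_kernel(radius, sigma_center)
    surround = gaussian_kernel(radius, sigma_surround)
    return [
        [c - s for c, s in zip(c_row, s_row)]
        for c_row, s_row in zip(center, surround)
    ]


def apply_receptive_field(image, kernel):
    """Apply the kernel at every pixel; borders clamp to the nearest edge pixel.

    The two outer loops are independent across (y, x), which is the part a
    data-parallel platform would distribute over many processing elements.
    """
    height, width = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(-r, r + 1):
                yy = min(max(y + ky, 0), height - 1)
                for kx in range(-r, r + 1):
                    xx = min(max(x + kx, 0), width - 1)
                    acc += kernel[ky + r][kx + r] * image[yy][xx]
            out[y][x] = acc
    return out
```

Because the center and surround are each normalized, the kernel sums to zero: uniform regions produce no response, while local contrast such as an isolated bright spot produces a positive one.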
The figure above shows how the user calls the appropriate API functions to run the network on either the CPU or the GPU (the code illustrates a specific form of the network called a recurrent competitive field, RCF). Code needs to be written only once and can run on multiple platforms, with a peak speedup of about 230X on AMD FireStream™ relative to a single-core CPU implementation. The speedup also grows with network size, indicating that larger gains can be achieved as more data is processed.
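For readers unfamiliar with a recurrent competitive field, the sketch below integrates one common textbook form of the model (usually attributed to Grossberg) with forward Euler steps in plain Python. It does not use the NTP API, which is not reproduced in this text. The equation form, the parameter names A and B, the quadratic signal function, and the step size are all assumptions made for illustration. The assumed dynamics are dx_i/dt = -A*x_i + (B - x_i)*(f(x_i) + I_i) - x_i * sum over k != i of f(x_k).

```python
def rcf_step(x, inputs, dt=0.01, A=1.0, B=1.0, f=lambda v: v * v):
    """Advance every neuron by one forward-Euler step (assumed RCF form).

    x      -- current activity of each neuron
    inputs -- external input I_i to each neuron
    A      -- passive decay rate (assumed parameter)
    B      -- upper bound on activity (assumed parameter)
    f      -- feedback signal function (quadratic here, an assumption)
    """
    signals = [f(v) for v in x]
    total = sum(signals)
    new_x = []
    # Each neuron's update needs only its own state plus one shared sum,
    # so the loop body is the natural unit of data-parallel work.
    for xi, si, Ii in zip(x, signals, inputs):
        excitation = (B - xi) * (si + Ii)   # self-excitation plus input
        inhibition = xi * (total - si)      # competition from other neurons
        new_x.append(xi + dt * (-A * xi + excitation - inhibition))
    return new_x


def run_rcf(inputs, steps=2000, dt=0.01, **params):
    """Integrate the network from rest for a fixed number of Euler steps."""
    x = [0.0] * len(inputs)
    for _ in range(steps):
        x = rcf_step(x, inputs, dt=dt, **params)
    return x


if __name__ == "__main__":
    activities = run_rcf([0.2, 0.5, 0.3])
    print(activities)
```

In this form the activities stay between 0 and B, and the neuron with the strongest input ends up with the highest activity. Numerically integrating such equations for thousands of neurons at every time step is the workload that benefits from GPU execution.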
NTP performance
Writing data-parallel AI algorithms with NTP leads to a dramatic increase in execution speed over traditional single-core CPU implementations. The third NTP prototype, Synapcardγ, runs a large visual receptive field algorithm on a network of 15,000 neurons at almost three times the performance of Neurala’s first prototype, Synapcardα, and 232X that of a single-core Opteron CPU.
For more information on beta testing and pricing, please contact us.


