By Ivan Jibaja, Senior Software Engineer, Intel Corporation
What is SIMD?
SIMD stands for Single Instruction, Multiple Data. It allows the same operation to be performed on multiple data points simultaneously, exploiting data-level parallelism. Today, a large number of processors have instructions for SIMD operations. SIMD operations can greatly benefit applications such as 3D graphics and multimedia audio/video processing.
A SIMD value has multiple lanes. This example uses a SIMD vector of length 4, whose lanes are named x, y, z, and w. The figure below shows an example of a SIMD addition. Instead of performing 4 separate additions on 4 pairs of scalars, SIMD adds all of them at once by operating on all 4 lanes simultaneously. Because fewer operations are needed to process a given data set, SIMD instructions yield higher performance and better energy efficiency than scalar operations.
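The lane-wise addition described above can be sketched in plain JavaScript. This only models the semantics: the helper addLanes and the sample values are hypothetical, and the loop stands in for what SIMD hardware would do in a single instruction.

```javascript
// Two 4-lane values; indices 0..3 play the role of lanes x, y, z, w.
const a = new Float32Array([0.5, 1.5, 2.5, 3.5]);
const b = new Float32Array([10, 20, 30, 40]);

// Scalar approach: four separate additions, one per pair of values.
const scalarSum = new Float32Array(4);
scalarSum[0] = a[0] + b[0];
scalarSum[1] = a[1] + b[1];
scalarSum[2] = a[2] + b[2];
scalarSum[3] = a[3] + b[3];

// SIMD-style approach: one logical operation applied to all four lanes.
// addLanes is a hypothetical helper, not part of any SIMD API.
function addLanes(u, v) {
  const out = new Float32Array(4);
  for (let lane = 0; lane < 4; lane++) {
    out[lane] = u[lane] + v[lane];
  }
  return out;
}

const vectorSum = addLanes(a, b);
console.log(Array.from(scalarSum)); // [10.5, 21.5, 32.5, 43.5]
console.log(Array.from(vectorSum)); // [10.5, 21.5, 32.5, 43.5]
```

Both approaches produce the same lanes; the difference on real hardware is that the second form maps to one instruction instead of four.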
Application Programming Interface (API)
The SIMD extensions are implemented in the SIMD object. Here’s a simple example performing an addition of 2 float32x4 values:
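A minimal sketch of such an addition follows. The operand values are assumptions chosen for illustration, and the small SIMD stand-in defined first is hypothetical: it models only the float32x4 constructor and add, using the call spelling from this article, so that the snippet runs in any JavaScript engine with or without native SIMD support.

```javascript
// Hypothetical stand-in for the SIMD object; it is not the polyfill itself.
// Each float32x4 value is modelled as a 4-element Float32Array.
const SIMD = {
  float32x4: Object.assign(
    (x, y, z, w) => new Float32Array([x, y, z, w]),
    { add: (u, v) => u.map((lane, i) => lane + v[i]) }
  ),
};

// Operand values are assumed for illustration.
const a = SIMD.float32x4(1.0, 2.0, 3.0, 4.0);
const b = SIMD.float32x4(5.0, 6.0, 7.0, 8.0);
const c = SIMD.float32x4.add(a, b); // lane-wise a + b

console.log(Array.from(c)); // [6, 8, 10, 12]
```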
These lines of code result in the variable c being a SIMD vector that holds the values [6.0, 8.0, 10.0, 12.0].
The API introduces the following data types:
- SIMD.float32x4: vector with length 4 where each lane holds an IEEE-754 32-bit single-precision floating-point value
- SIMD.int32x4: vector with length 4 where each lane holds a 32-bit signed integer value
- Float32x4Array: A typed array holding SIMD.float32x4 values as packed binary data
- Int32x4Array: A typed array holding SIMD.int32x4 values as packed binary data
The splat constructor creates an instance of the specified SIMD data type with all lanes set to the given scalar value s.
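As a rough illustration of what splat does, the sketch below models it with a typed array. The float32x4 object here is a hypothetical stand-in that implements only splat; it is not the real SIMD implementation.

```javascript
// Hypothetical stand-in that models only splat:
// every one of the 4 lanes receives the same scalar s.
const float32x4 = {
  splat: (s) => new Float32Array(4).fill(s),
};

const twos = float32x4.splat(2.0);
const halves = float32x4.splat(0.5);

console.log(Array.from(twos));   // [2, 2, 2, 2]
console.log(Array.from(halves)); // [0.5, 0.5, 0.5, 0.5]
```

A splatted value is handy whenever a scalar constant has to take part in a lane-wise operation, for example comparing every lane against the same threshold.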
Individual lanes are read using the getters x, y, z, and w. For example:
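A minimal sketch of lane access, assuming a hypothetical Float32x4 class that stands in for the real SIMD.float32x4 type and exposes the same four getters:

```javascript
// Hypothetical stand-in class; only the lane getters are modelled.
class Float32x4 {
  constructor(x, y, z, w) {
    this.lanes = new Float32Array([x, y, z, w]);
  }
  get x() { return this.lanes[0]; }
  get y() { return this.lanes[1]; }
  get z() { return this.lanes[2]; }
  get w() { return this.lanes[3]; }
}

const v = new Float32x4(1.0, 2.0, 3.0, 4.0);

// Read each lane through its getter and combine them as ordinary scalars.
const laneSum = v.x + v.y + v.z + v.w;
console.log(v.x, v.w);  // 1 4
console.log(laneSum);   // 10
```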
For the complete and up-to-date API, see the current polyfill source listed in the references section.
The following figure presents measurements of benchmark applications whose core computation is amenable to SIMD optimizations.
Speedup in Firefox browser
Speedup in Chromium browser
The following code is extracted from the visual Mandelbrot benchmark shown above. It computes the iteration count associated with a pixel. That iteration count is then mapped to a color for visualization.
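To make the computation concrete, here is a plain JavaScript sketch of the iteration count, written for this explanation rather than taken from the benchmark. The function names mandel and mandelx4 are hypothetical, the lanes are modelled with typed arrays of ordinary double-precision numbers instead of float32x4 values, and the inner lane loop stands in for what real SIMD code would express as single lane-wise operations plus a mask.

```javascript
// Scalar reference: count iterations of z = z*z + c until |z|^2 exceeds 4,
// capped at maxIterations.
function mandel(cRe, cIm, maxIterations) {
  let zRe = cRe, zIm = cIm, count = 0;
  for (; count < maxIterations; count++) {
    const zRe2 = zRe * zRe, zIm2 = zIm * zIm;
    if (zRe2 + zIm2 > 4.0) break;
    const newRe = zRe2 - zIm2 + cRe;
    zIm = 2.0 * zRe * zIm + cIm;
    zRe = newRe;
  }
  return count;
}

// Four-lane model: the same recurrence runs for 4 pixels at once.
// A lane that has escaped stops counting; the loop exits early once
// every lane has escaped.
function mandelx4(cRe4, cIm4, maxIterations) {
  const zRe4 = Float64Array.from(cRe4);
  const zIm4 = Float64Array.from(cIm4);
  const count4 = new Int32Array(4);
  for (let i = 0; i < maxIterations; i++) {
    let anyActive = false;
    for (let lane = 0; lane < 4; lane++) {
      const re = zRe4[lane], im = zIm4[lane];
      const re2 = re * re, im2 = im * im;
      if (re2 + im2 > 4.0) continue; // lane is masked out
      zRe4[lane] = re2 - im2 + cRe4[lane];
      zIm4[lane] = 2.0 * re * im + cIm4[lane];
      count4[lane]++;
      anyActive = true;
    }
    if (!anyActive) break;
  }
  return count4;
}

// Four sample points: two that never escape and two that escape quickly.
const cRe4 = [0.0, 2.0, 1.0, -1.0];
const cIm4 = [0.0, 2.0, 0.0, 0.0];
const counts = mandelx4(cRe4, cIm4, 100);
console.log(Array.from(counts)); // [100, 0, 2, 100]
```

The four-lane version does the work of four scalar calls per loop iteration, which is where the speedup for this kind of workload comes from; the cost is that all lanes keep iterating until the slowest one escapes.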
References
- ECMAScript strawman proposal: http://wiki.ecmascript.org/doku.php?id=strawman:simd_number
- Peter Jensen’s Mandelbrot source: https://github.com/PeterJensen/mandelbrot