Accelerating Apache Spark machine learning with Clear Linux* OS for Intel Architecture® and Intel Software Optimizations

BY Ziya Ma ON Aug 21, 2018

Clear Linux* OS for Intel® Architecture is a modular operating system built for efficiency and optimized with security and performance in mind. Incorporating modern continuous development/continuous integration practices allows Clear Linux* OS for Intel® Architecture to quickly integrate the latest technologies and platform capabilities, making it ideal for advanced use cases spanning Cloud to Edge.

One of the many Cloud segments benefiting from Clear Linux* OS for Intel® Architecture is big data analytics, specifically areas such as Machine Learning (ML). Apache Spark MLlib is one of the leading solutions for big data analysis, offering functionality for ML workloads including regression, classification, dimensionality reduction, clustering, and rule extraction.

Any improvement to Apache Spark MLlib performance has a significant impact for enterprises using this solution. Below we outline how running Apache Spark MLlib on Clear Linux* OS for Intel® Architecture with Intel software optimizations delivers an 8x performance acceleration.

The benchmark

Intel benchmarked Apache Spark MLlib performance on a generally available operating system against Clear Linux* OS for Intel® Architecture with Intel software optimizations (see the configuration table below). The comparison used spark-perf, an open suite of performance tests, running Alternating Least Squares (ALS), the most popular algorithm for building recommender systems.
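ALS factors a user-item rating matrix into low-rank user and item factor matrices by alternately solving regularized least-squares problems. The NumPy sketch below illustrates the idea on a toy dense matrix; it is NOT Spark MLlib's distributed implementation, and the matrix, rank, and regularization values are invented for the example.

```python
# Illustrative NumPy sketch of Alternating Least Squares (ALS) for
# matrix factorization, the algorithm spark-perf exercises here.
import numpy as np

def als(ratings, rank=2, reg=0.1, iterations=50):
    """Factor ratings ~ U @ V.T by alternately solving the
    regularized normal equations for U and for V."""
    n_users, n_items = ratings.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n_users, rank))
    V = rng.standard_normal((n_items, rank))
    eye = reg * np.eye(rank)
    for _ in range(iterations):
        # Fix V, solve (V.T V + reg I) U.T = V.T R.T for the user factors.
        U = np.linalg.solve(V.T @ V + eye, V.T @ ratings.T).T
        # Fix U, solve (U.T U + reg I) V.T = U.T R for the item factors.
        V = np.linalg.solve(U.T @ U + eye, U.T @ ratings).T
    return U, V

R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0]])
U, V = als(R)
print(np.round(U @ V.T, 1))  # low-rank reconstruction of R
```

Each half-step is a batch of small dense linear solves, which is exactly the kind of work an optimized BLAS/LAPACK accelerates.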

As the graph shows, the optimized software stack on Clear Linux* OS for Intel® Architecture reduced test suite completion time from 11,026 elapsed seconds to 1,361 seconds, a more than 8x improvement in training performance.
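The arithmetic behind the headline number, using the elapsed times reported above:

```python
# Speedup implied by the reported test-suite completion times.
baseline_s = 11026    # generally available OS stack, elapsed seconds
optimized_s = 1361    # Clear Linux OS + Intel software optimizations
speedup = baseline_s / optimized_s
print(f"{speedup:.1f}x")  # → 8.1x
```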

This benchmark was done on Intel® Xeon® Gold 6140 processors (codenamed Skylake); more details about the benchmark configuration are provided below. The observed performance boost will vary across different Intel processors.

Intel Software Optimizations

Benchmark testing included 10+ rounds of optimization and tuning, ranging from the software stack to training hyper-parameters. Key optimizations include:

Math Routines for Machine Learning

Intel software optimizations include math routines designed specifically to boost Machine Learning workloads, including Linear Algebra, Fast Fourier Transforms (FFT), Vector Math, and Statistics functions. Apache Spark can take advantage of these routines, delivered through the Intel® Math Kernel Library (Intel® MKL), without the need for any code changes.
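As an illustration of the four routine families named above, the NumPy calls below dispatch to an optimized native math library in much the same way Spark MLlib dispatches to Intel MKL once it is installed. NumPy stands in here only as an accessible example, not as the benchmarked stack.

```python
# One call from each routine family that an optimized math library
# such as Intel MKL accelerates.
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((64, 64))
x = rng.standard_normal(64)

y = A @ x                      # linear algebra: BLAS matrix-vector product
spectrum = np.fft.fft(x)       # fast Fourier transform
expx = np.exp(x)               # vector math: elementwise transcendental
mean, std = x.mean(), x.std()  # statistics

print(y.shape, spectrum.shape)
```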

Java 9 Enablement

The JVM plays a key role in the Apache Spark stack. Intel enabled Java 9 on Apache Spark to leverage the performance advantages of Java Development Kit (JDK) 9 features: JDK 9 was enabled to improve Java garbage collection (GC) performance, hyper-threading (HT) technology was disabled per the best-known configuration, and the ALS hyper-parameters were tuned to make full use of the acceleration.
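The hyper-threading setting interacts directly with the Spark configuration in the spec table below: spark.default.parallelism is set to the total vcore count across executors, which halves when HT is disabled. A small sketch of that arithmetic, with values taken from the configuration table:

```python
# Parallelism arithmetic behind the benchmark's Spark configuration.
executors = 4
vcores_ht_off = 35   # vcores per executor with hyper-threading off
vcores_ht_on = 70    # vcores per executor with hyper-threading on

parallelism_ht_off = executors * vcores_ht_off
parallelism_ht_on = executors * vcores_ht_on

print(parallelism_ht_off, parallelism_ht_on)  # → 140 280
```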

Generational Performance Increase on Intel® Xeon® Processors

The Intel® Xeon® Scalable processor family on the Purley platform introduces a new microarchitecture with many new features compared to the previous Intel® Xeon® processor E5-2600 v4 product family (formerly the Broadwell microarchitecture).

Chief among these features is Intel® Advanced Vector Extensions 512 (Intel® AVX-512), which doubles the vector register width from 256 to 512 bits. With these features, Apache Spark and the associated Intel software optimizations can boost performance over the 256-bit Intel® AVX2 instructions in the previous Intel Xeon processor v3 and v4 generations (codenamed Haswell and Broadwell, respectively).

Benefits to Customers

The combination of Clear Linux* OS for Intel® Architecture and the Intel software optimizations demonstrated in the tested configuration can advance Machine Learning performance, processing large data sets in less time. As ML workloads gain access to more data, they can deliver better accuracy for predictive maintenance, recommendation engines, and more.

Built with performance in mind, Clear Linux* OS for Intel® Architecture provides an ideal platform for advanced operations, including machine learning. Its fast release schedule means users can take full advantage of the latest Intel platform features and technologies spanning the software stack.

Intel software optimizations can also boost the performance of deep learning solutions on Intel architecture platforms, such as BigDL and other Intel-optimized frameworks (e.g., TensorFlow and Caffe). This can help organizations accelerate their investments in next-generation predictive analytics.

The benefits of these performance gains are clear: improved performance makes it possible to train with larger data sets, explore a larger range of the model hyper-parameter space, and train more models. Additionally, these optimizations do not require a modified version of Apache Spark or changes to Apache Spark application code, nor do they require extra or special hardware; it takes just a few steps to install Intel MKL on a cluster.

Benchmark hardware and software specs

Cluster: 1 master node + 4 worker nodes
Sockets / node: 2

HARDWARE (Cluster A and Cluster B have the same hardware configuration)

Processors: Intel® Xeon® Gold 6140
Cores / Threads: 18 / 36
Base / Turbo Frequency: 2.3 / 3.7 GHz
Memory / Node: 384 GB (12x 32 GB DDR4 DIMMs, rated and operating at 2400 MHz)
Storage / Node: 6.4 TB (8x 800 GB SATA3 SSDs)
Network: 10 Gb Ethernet

SOFTWARE

Cluster A:
  OS: Clear Linux* OS for Intel® Architecture 23950
  Kernel: 4.17.9-596.native
  Java: OpenJDK 9.0.4
  Spark: 2.4-Snapshot
  Math library: Intel MKL 2018.3.222

Cluster B:
  OS: RHEL 7.3
  Kernel: 4.17.9-1.el7.elrepo.x86_64 (4.4.145-1.el7.elrepo.x86_64)
  Java: OpenJDK 1.8.0_102
  Spark: 2.0.2
  Math library: F2JBLAS

spark-perf (both clusters): https://github.com/databricks/spark-perf

SYSTEM CONFIGURATION: SPARK

Number of executors: 4
vcores / executor: 35 (HT OFF) / 70 (HT ON)
Memory / executor: 320 GB
spark.default.parallelism: 140 (HT OFF) / 280 (HT ON)

Workload & Training Hyper-parameters

Workload: ALS in spark-perf (training)
Dataset: 40M users; 10M products; 50M ratings
Rank: 600
Regularization: 0.1
Number of iterations: 1
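A back-of-the-envelope check of the model footprint these hyper-parameters imply, assuming 64-bit double-precision factors (an illustrative assumption; MLlib's internal representation may differ):

```python
# Rough size of the ALS factor matrices for the benchmark's dataset:
# one rank-600 factor vector per user and per product.
users = 40_000_000
products = 10_000_000
rank = 600
bytes_per_value = 8  # assumed 64-bit doubles (illustrative assumption)

factor_bytes = (users + products) * rank * bytes_per_value
print(factor_bytes / 1e9)  # → 240.0 (GB, decimal)
```

At roughly 240 GB, the factor matrices alone are large but fit within the combined executor memory of 4 x 320 GB, consistent with the generous per-executor memory allocation above.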

References

  1. Clear Linux: https://clearlinux.org/
  2. MKL: https://software.intel.com/en-us/mkl
  3. MKL Wrapper: https://github.com/Intel-bigdata/mkl_wrapper_for_non_CDH

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase.  For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Intel, the Intel logo, and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others. The nominative use of third party logos serves only the purposes of description and identification.

© Intel Corporation.