Using Intel MPI 2021 with Ethernet Fabric on JARVICE™


December 7, 2021

Overview

Intel MPI 2021 represents a major improvement in scalability and reliability over previous versions, especially when using TCP over Ethernet fabric.  While tightly-coupled HPC workloads are best suited for low latency interconnects, such as InfiniBand, these capabilities are not always available.  On the other hand, Ethernet is ubiquitous, both on private clusters and on-demand private cloud infrastructure.

When developing an HPC algorithm based on the common Message Passing Interface (MPI) pattern, running at reasonable scale on commodity Ethernet-powered infrastructure is an important goal for broad adoption.  Fortunately, Intel’s MPI library provides a wide range of fabric options and performs well across them.

During a recent technical collaboration between Intel and Nimbix, specifically testing MPI on Ethernet-based interconnects in the public cloud, tightly-coupled algorithms such as Computational Fluid Dynamics (CFD) in popular solvers routinely scaled to over 1,200 cores with good performance using Intel’s 2021 version of MPI.  Older versions of Intel MPI, or even other flavors of MPI libraries on the same infrastructure, fell significantly short in comparison.  In some tests with popular solvers, the 2021 version performed reliably at approximately six times the scale of early 2019 versions.

Implementation Notes

In the following example, we’ll deploy a container packaging Intel MPI 2021.4 that can be used as a test environment or as a base image for more complex MPI-based algorithms.  Any JARVICE™ XE-based cluster can run this application, but for illustration purposes, we’ll use the Nimbix Cloud’s “us-01” zone, powered by JARVICE running on Google Cloud Platform with Ethernet interconnect, using instances based on the latest Intel Xeon Scalable processors.  Please note that Intel’s MPI is licensed under the Intel Simplified Software License.

For building the example, you’ll need a desktop or server environment running Docker, or a publicly accessible Git repository to connect to a JARVICE environment’s PushToCompute™ function.  For more details on building and deploying containers on the Nimbix Cloud, please see PushToCompute in the Nimbix help center.
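
If you build the image yourself, the workflow is the standard Docker build-and-push sequence; the registry and repository names below are placeholders for your own environment:

# build the image locally, then push it to a registry JARVICE can pull from
docker build -t registry.example.com/myuser/impi-example:2021.4 .
docker push registry.example.com/myuser/impi-example:2021.4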

The Dockerfile for the Base Image

This image uses an optimized multi-stage build.  The final image contains the compiled IMB-P2P binary from the Intel MPI Benchmarks suite.  If you later add a custom solver built against this specific MPI version, it’s best to compile it in the build stage of the Docker image (see the sketch after the Dockerfile).

Here is the Dockerfile used to build the example gcr.io/jarvice/impi-example:2021.4 container:

# build stage
FROM centos:7

WORKDIR /tmp

# install mpi-common and build tools
RUN yum -y install epel-release wget file git gcc gcc-c++ make && \
    curl -H 'Cache-Control: no-cache' \
    https://raw.githubusercontent.com/nimbix/mpi-common/master/install-mpi-common.sh \
    | bash

# install the Intel MPI standalone library in /opt/intel
# for the latest link, visit:
# https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#mpi
# Be sure to use the Linux "Offline" version!!!
# NOTE: this installer cannot be easily piped into bash directly from curl
RUN curl -o impi.sh \
    https://registrationcenter-download.intel.com/akdlm/irc_nas/18186/l_mpi_oneapi_p_2021.4.0.441_offline.sh \
    && bash ./impi.sh -a -s --eula accept --install-dir /opt/intel && rm -f impi.sh && rm -rf /opt/intel/oneapi

# build packaged "hello world" test program
WORKDIR /opt/intel/mpi/latest/test
RUN source /opt/intel/mpi/latest/env/vars.sh && mpicc -o test ./test.c

# build the IMB suite
WORKDIR /opt/intel/mpi/latest/benchmarks/imb
RUN source /opt/intel/mpi/latest/env/vars.sh && make CC=mpicc IMB-P2P

# final stage
FROM centos:7

# install image-common as a best practice for JARVICE application environments;
# note that we skip the upstream MPI packages as we'll be installing our own
# MPI later
RUN curl -H 'Cache-Control: no-cache' \
    https://raw.githubusercontent.com/nimbix/image-common/master/install-nimbix.sh \
    | bash -s -- --skip-mpi-pkg
COPY --from=0 /opt/intel /opt/intel

# license the environment so that JARVICE prompts for acceptance
RUN mkdir -p /etc/NAE && cp -f /opt/intel/mpi/latest/licensing/license.txt /etc/NAE/license.txt

# pull optimization into JARVICE platform (run this at the end of the container)
# see: https://jarvice.readthedocs.io/en/latest/docker/#best-practices
RUN mkdir -p /etc/NAE && touch /etc/NAE/{screenshot.png,screenshot.txt,license.txt,AppDef.json}
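
As noted above, a custom solver should be compiled in the build stage so it links against this exact MPI version.  Here is a minimal sketch of the extra steps; the repository URL, build commands, and install path are placeholders rather than part of the published image:

# (build stage) compile a hypothetical solver against Intel MPI 2021.4
WORKDIR /tmp/mysolver
RUN git clone https://github.com/example/mysolver.git . && \
    source /opt/intel/mpi/latest/env/vars.sh && \
    make CC=mpicc CXX=mpicxx && \
    mkdir -p /opt/mysolver/bin && cp -f mysolver /opt/mysolver/bin/

# (final stage) copy the compiled solver alongside the MPI runtime
# COPY --from=0 /opt/mysolver /opt/mysolver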

Running on a JARVICE System

Running this container requires a JARVICE account with developer access.  If you are interested in signing up for service on the Nimbix Cloud, please contact us.

Once logged in, create a new application target in the PushToCompute™ section of the portal. You may use the published test container mentioned above (gcr.io/jarvice/impi-example:2021.4) if you are not using your own build.

Click the impi_example app after creating it and select the Server endpoint to start a web shell.

For example, you can submit a 32-core session using Ethernet with the TCP fabric.

After clicking the Submit button, JARVICE will queue the job. Once it’s available to use, you’ll see a connection link in the dashboard.

Clicking the connection link will bring up a web shell, where you can source the Intel MPI environment script and run the basic “Hello World” test across all nodes.
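
Here is a minimal sketch of those commands.  The explicit I_MPI_FABRICS and FI_PROVIDER settings pin Intel MPI to the libfabric TCP provider for Ethernet; they are optional, since the library normally detects an appropriate provider on its own:

# load the Intel MPI 2021 environment into the current shell
source /opt/intel/mpi/latest/env/vars.sh

# optionally pin the fabric to the libfabric TCP provider for Ethernet
export I_MPI_FABRICS=shm:ofi
export FI_PROVIDER=tcp

# run the packaged "hello world" test with one rank per node;
# /etc/JARVICE/nodes lists one host per node in the job
mpirun -machinefile /etc/JARVICE/nodes -n $(wc -l < /etc/JARVICE/nodes) \
    /opt/intel/mpi/latest/test/test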

Use /etc/JARVICE/cores rather than /etc/JARVICE/nodes to launch one rank per core across all nodes.
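
For example, here is a sketch that runs the IMB-P2P benchmarks on all cores; it assumes the IMB-P2P binary is still where the build stage compiled it:

# one rank per core: /etc/JARVICE/cores lists one entry per core across all nodes
source /opt/intel/mpi/latest/env/vars.sh
mpirun -machinefile /etc/JARVICE/cores -n $(wc -l < /etc/JARVICE/cores) \
    /opt/intel/mpi/latest/benchmarks/imb/IMB-P2P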

The benchmarks will run until completion.

Once you have finished testing, click the red “power” button in the dashboard to shut down the job and avoid accruing additional usage charges.

Once the job status shows “Terminated,” it is no longer running.

Summary

The example container can be used to run MPI tests directly or as a base image to build custom applications based on Intel MPI 2021.4. This stack provides good performance and scalability on commodity cloud instances using Ethernet interconnect and will also automatically detect and benefit from more advanced fabrics such as InfiniBand when available.
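
As a sketch, a downstream application image can start directly from the published example; the files copied in below are placeholders for your own application:

FROM gcr.io/jarvice/impi-example:2021.4

# add your MPI application on top of the Intel MPI 2021.4 runtime
COPY mysolver /usr/local/bin/mysolver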

For more information on Intel MPI and other developer tools, visit Intel oneAPI Toolkits.  For more information on the Nimbix Cloud and JARVICE, contact us.
