Today we will look at how to utilize FPGAs to accelerate compute workloads and how to create a JARVICE™ application using the PushToCompute™ CI/CD pipeline.
The JARVICE platform is a one stop shop for FPGA kernel development, testing, and deployment. The NIMBIX App Marketplace includes the latest Xilinx SDAccel Development environment to design FPGA kernels using OpenCL, C/C++, and RTL. We have also purpose built JARVICE to handle a variety of provisioning schemes to protect 3rd party IP from the end user.
Xilinx SDAccel
SDAccel is a high-level synthesis tool that abstracts away low level details of a hardware platform. This allows software developers to leverage the power of FPGAs without extensive knowledge of the underlying platform. For more information on SDAccel, see here.
For simplicity, we will use a few OpenCL example kernels provided by Xilinx on GitHub.
OpenCL Example [vadd]
A JARVICE account is required to use the NIMBIX cloud. Sign up here
To start a new SDAccel session:
- Login into JARVICE
- Select Xilinx SDAccel Development (2018.2 XDF). Note: different from the runtime environment
- Start Desktop Mode Without FPGA
- Click on desktop preview to open session in a web browser
Open a Terminal from the Desktop and clone the SDAccel Examples
git clone --single-branch --branch 2018.2_xdf --depth=1 https://github.com/Xilinx/SDAccel_Examples cd SDAccel_Examples/getting_started/misc/vadd/src ls
You will see two types of source files:
- host.cpp := C++ source code for host application running on the CPU
- krnl_vadd.cl := OpenCL source code for accelerated kernel running on the FPGA
The host code uses the OpenCL API to interact with the accelerated kernel. Reference card. The SDAccel Development environment will compile the *.cl
code into an *.xclbin
file for the targeted FPGA platform. The *.xclbin
file contains a bitstream for the FPGA which will describe the kernels specialized architecture. This process will take several hours depending on the size of the FPGA. To aid in kernel development, the SDAccel Development environment includes a CPU and Hardware emulation mode which compiles in minutes and can run without access to a FPGA accelerator.
To build the vadd example, specify the target mode (emulation vs hardware) and target device. For example:
cd ~/SDAccel_Examples/getting_started/misc/vadd make TARGETS=sw_emu DEVICES=_u200_xdma_201820_1 all
The above command will generate vadd
and krnl_vadd.sw_emu._u200_xdma_201820_1.xclbin
for software emulation of the Xilinx Alveo u200. Prepare the software emulation environment and run vadd example:
export XCL_EMULATION_MODE=sw_emu emconfigutil --platform '_u200_xdma_201820_1' --nd 1 ./vadd
Stop your job using either shutdown from the Desktop menu (logout -> shutdown) or the shutdown button on the JARVICE dashboard
Alveo options for SDAccel
Flag | Options |
---|---|
TARGETS | sw_emu , hw_emu , hw |
DEVICES | _u200_xdma_201820_1 , _u250_xdma_201820_1 |
Additional information on using SDAccel Development environment on Nimbix cloud here
Create FPGA Application on JARVICE
This section will go over how to package the binary files generated by SDAccel into a JARVICE application. JARVICE apps use the PushToCompute CI/CD pipeline. This walk-through will use Der to containerize our application with the Xilinx Runtime and give an overview of the Appdef.json required by JARVICE.
This section is intended to use your computer. Review prerequisites here. Use App Ubuntu Linux for Intel
if running on the Nimbix cloud.
To Start, clone this repository from GitHub:
git clone https://github.com/nimbix/-tutorial
Our application will require a few SDAccel kernels to illustrate the different capabilities of JARVICE. Use the attached _u250_xdma_201820_1_golden.tar.gz
or generate the kernels using Build SDAccel bitstreams
Unpack the archive at -tutorial/docker-build/
mv _u250_xdma_201820_1_golden.tar.gz <path-to-repo>/-tutorial/docker-build/
cd <path-to-repo>/-tutorial/docker-build/
tar -xvf _u250_xdma_201820_1_golden.tar.gz
This will create an exe
and xclbin
folder
Build SDAccel bitstreams
Skip this section if using _u250_xdma_201820_1_golden.tar.gz
The build-scripts/build-xcl-examples.sh
uses the JARVICE API to submit a build job using the Xilinx SDAccel Development
environment. The default kernels to build are sum_scan
and vdotprod
from getting_started/misc
section. Update repo_path
and kernels
at the beginning of the script to select different kernel from Xilinx SDAccel Examples. The kernels
string uses |
as a delimiter.
Use the following to build the default SDAccel kernels for the Alveo u250:
./build-scripts/build-xcl-examples.sh -t hw -d _u250_xdma_201820_1 -u <jarvice-user> -k <jarvice-apikey>
Replace <jarvice-user>
and <jarvice-apikey>
with your JARVICE account. Your API key is listed at https://platform.jarvice.com under the Account
menu on the right.
Note We need to build with the hw
SDAccel flow. This job will take 6+ hours and automatically terminate
After submitting a job to JARVICE, the script will ask to copy/paste a password:
Started JARVICE job: 463856
Enter this password at prompt: MCAqHDS02aEgj3w
This will transfer and run a build script (run.sh
) on the JARVICE job. The generated files will be saved to your Vault
Job 463856 building /data/xcl_pv3/_u250_xdma_201820_1.tar.gz
Check JARVICE dashboard for status
Transfer the *.tar.gz
file to your machine using one of these methods
Create Docker container
The build-scripts/build-docker.sh
will create a Docker container for a JARVICE application utilizing a Xilinx FPGA machine type. docker-build
contains the Dockerfile
and build context for the container. A JARVICE application requires additional metadata provided by AppDef.json. The FPGA machine types will also require *.xclbin
file for each SDAccel kernel. The docker-build
directory should include exe
and xclbin
directories from Build SDAccel bitstreams.
See additional information using FPGAs on JARVICE
The Dockerfile is based from Xilinx Runtime available on DockerHub.
FROM nimbix/ubuntu-xrt:<xrt-runtime>
The FPGA *.xclbin
files are added to /opt/example
inside the container
# Test FPGA bitstream
ADD xclbin/$DSA/$XCLBIN_PROGRAM /opt/example/test.xclbin
RUN chmod 555 /opt/example/test.xclbin
# Add additional FPGA bitstream to demo removal of unused kernels
ADD xclbin/$DSA/$XCLBIN_REMOVE /opt/example/remove1.xclbin
ADD xclbin/$DSA/$XCLBIN_REMOVE /opt/example/remove2.xclbin
# Test host application
ADD exe/vdotprod /opt/example/vdotprod
The following tags in the Dockerfile
are replaced by build-scripts/build-docker.sh
:
Tag | Description | Sample Value |
---|---|---|
<xrt-runtime> |
Xilinx Runtime version | 201802.2.1.83_16.04 |
<jarvice-machine> |
JARVICE FPGA machine type | nx6u |
<xcl-dsa-name> |
SDAccel DSA | _u250_xdma_201820_1 |
The default container registry is DockerHub. build-scripts/build-docker.sh
script will build a Docker container and push it to a Docker registry.
Note Pushing to a DockerHub repository that does not exist will create a public repository
docker login
./build-scripts/build-docker.sh <docker_repo> <docker_tag>
<docker_repo>
e.g. nimbix/-tutorial
<docker_tag>
e.g. latest
PushToCompute flow
PushToCompute is the final step to create our JARVICE application.
- Login into JARVICE
- Select
PushToCompute
from the menu on the right - Login into your Docker registry from the menu on the left
- Click on the
New
app button and fill out the form
Form | Value |
---|---|
App ID | _tutorial |
Docker or Singularity Repository | Docker repository used in Create Docker container (e.g. nimbix/-tutorial:latest) |
Git Source URL (to Clone) | Leave blank |
System Architecture | Intel x86 64-bit (x86_64) |
This will create a new App card on PushToCompute that is private to your account (or team if Team Visible
was selected).
- Click on the menu on the top left of the app card
- Click on
Pull
to start pull from your Docker Registry
- Use the same menu to check on pull progress from
History
- Close the
Pull History
when you seePull completed
Test FPGA Application JARVICE
Our application is now ready to run on JARVICE. This simple application will run the vdotprod
example from the Xilinx SDAccel_Examples
repository on GitHub. The Dockerfile adds the examples files generated from SDAccel to /opt/example
.
/opt/example/remove1.xclbin
/opt/example/remove2.xclbin
/opt/example/test.xclbin
/opt/example/vdotprod
/opt/example/run-test.sh
The *.xclbin
files are used to configure the FPGA with the desired kernel, vdotprod
binary is the CPU executable, and run-test.sh
is a simple exerciser script for our example.
This application supports 3 flows: No Xclbin Protection
, Single Xclbin Protection
, and Launch vdotprod w/ Xclbin Protection
- Click on the app card from
PushToCompute
to see the different flows
No Xclbin Protection (standard)
The standard FPGA flow passes all *.xclbin
files untouched to the user's session. The FPGA can then be configured by the user via the Xilinx runtime.
Notice all the *.xclbin
files are in the user's session and are ~40MB in size. The file size indicates the FPGA bitstream is available.
Single Xclbin Protection
This flow uses a JARVICE platform feature to Protect kernel bitstream. The Protect
command in the Appdef.json sets XCLBIN_BITSTREAM_PROGRAM
to /opt/example/test.xclbin
. This instructs the JARVICE platform to configure the FPGA w/ the bitstream contained in /opt/example/test.xclbin
while provisioning a user's job. The kernel bitstream is then removed from test.xclbin
before granting a user access to the job session.
Notice the file size for test.xclbin
after the FPGA bitstream has been removed by the JARVICE platform. This prevents a user from accessing/copying a kernel bitstream from the application.
Launch vdotprod w/ Xclbin Protection
The final flow extends Single Xclbin Protection
by removing all unused *.xclbin
files. This is done by adding the XCLBIN_BITSTREAM_PROTECT
variable to the Worflow
command in Appdef.json. The |
character is used as a delimiter to specify multiple files to remove.
The vdotprod
example can be run from any flow with:
/opt/example/run-test.sh
Conclusion
This post provided a brief introduction on developing accelerated kernels for FPGAs available on the NIMBIX cloud. NIMBIX has partnered with Xilinx to offer a one stop shop to develop and deploy FPGA accelerated applications using the SDAccel Development environment and leveraging the power of the JARVICE platform. In addition to focusing on performance, we have extended JARVICE to optionally protect 3rd party IP when using FPGA accelerators. This added security enables ISVs to offer proprietary accelerated code in the Nimbix application marketplace.
Get started accelerating applications with the JARVICE platform on the Nimbix cloud today:
- Sign up here
- Alveo tutorial