A couple of weeks ago I posted about CUDA application run times using Nimbix Application Environments (NAEs). Now it is time for end-user testing and benchmarking to validate NAE run times for various workloads. We are excited to roll out our first wave of NAE hardware classes, which feature NVIDIA Tesla GPUs. We have been working with NVIDIA and providing GPU-powered computing for over three years now, but with our JARVICE compute platform we now deliver bare-metal performance on demand via a web console and API. We will also be rapidly adding new hardware classes for end-user access, including the latest NVIDIA GPUs (hint: K20s, K40s), Intel Xeon Phi, and large-memory configurations. Our hardware classes follow a general naming convention that describes the capabilities of the machine. Here is an example of one of our newly introduced classes:
This example is a Nimbix Application Environment that runs on 16 cores with 32GB of memory, has two NVIDIA Tesla M2090 GPUs, and has access to a high-speed InfiniBand interconnect. Other types may offer different numbers of GPUs, amounts of memory, CPU cores, or interconnects, each with its own hourly rate. Of course, we bill to the minute, so if an NAE is used to process a task that runs for 10 minutes, the end user sees an accrued fee of 1/6 of the hourly NAE rate.
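To make per-minute billing concrete, here is a minimal sketch of the arithmetic. The hourly rate below is a made-up placeholder, not an actual Nimbix price:

```python
# Per-minute billing: the fee accrues as (minutes used / 60) * hourly rate.
HOURLY_RATE = 6.00  # USD per hour -- hypothetical placeholder, not a real price


def accrued_fee(minutes, hourly_rate=HOURLY_RATE):
    """Fee for a task billed to the minute rather than rounded up to the hour."""
    return (minutes / 60) * hourly_rate


# A 10-minute task accrues 1/6 of the hourly rate:
print(accrued_fee(10))
```

Billing to the minute means a 10-minute task costs one sixth of what an hour-rounded billing model would charge for the same work.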
These hardware classes and the integration of new accelerator capabilities fully usher in the era of heterogeneous cloud computing, all enabled by our new JARVICE cloud computing platform.
As I mentioned in my previous post, we have been doing some early benchmarking around performance, cost, provisioning speed, and more, and we have been pleased with the results. We recently ran some n-body simulations using a GPU-accelerated sample code from the NVIDIA CUDA 5.5 toolkit and found that the Nimbix NAE outperformed similar offerings by roughly 25% on average, with gains as high as 50% depending on how the application split work between the GPU and the host. When computing time is charged by the hour, longer run times translate directly into higher costs. We will publish more explicit data on performance and cloud costs in a subsequent post.
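Since compute is billed by time, a performance gain maps directly to a cost reduction at equal hourly rates. A small sketch of that relationship, assuming "outperformed by X%" means X% higher throughput (the rates being equal is also an assumption for illustration):

```python
# Relative cost of a run that is X% faster than a baseline, assuming the
# same hourly rate on both platforms. "X% faster" is read here as X% higher
# throughput, so run time (and thus cost) shrinks to 1 / (1 + X/100) of baseline.


def relative_cost(speedup_percent):
    """Cost of the faster run as a fraction of the baseline run's cost."""
    return 1 / (1 + speedup_percent / 100)


print(round(relative_cost(25), 3))  # 25% faster -> 0.8x the baseline cost
print(round(relative_cost(50), 3))  # 50% faster -> ~0.667x the baseline cost
```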
While many of our early users are excited about the on-demand build environments that JARVICE enables, it is important to remember that the next phase of our deployment is around the corner: marrying the NACC processing API with end-user NAEs. This functionality lets large-scale HPC and batch processing extend seamlessly from end-user clusters into the Nimbix cloud. JARVICE enables users to essentially “design” their own compute node in an NAE and snapshot it. Once that is done, NAEs are assembled into an HPC cluster dynamically at runtime, scaling compute tasks to many nodes on demand with a simple mouse click or API call.
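The post does not document the API itself, so the following is only a hypothetical sketch of what programmatically requesting a dynamically assembled cluster might look like. The `build_submission` helper, the field names, and the values are all invented for illustration; they are not the actual JARVICE/NACC API schema:

```python
import json


def build_submission(nae_snapshot, node_count, command):
    """Assemble a request payload asking for `node_count` nodes to be built
    from a snapshotted NAE and run `command` as one batch task.

    Purely illustrative -- not the real JARVICE/NACC API.
    """
    return json.dumps({
        "snapshot": nae_snapshot,  # the user-designed, snapshotted compute node
        "nodes": node_count,       # cluster size assembled at runtime
        "command": command,        # the batch/HPC task to fan out
    })


payload = build_submission("my-cuda-env", 16, "mpirun ./simulate")
```

The key design point the post describes is that the snapshot, not a fixed cluster, is the unit users manage; the cluster itself only exists for the lifetime of the task.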
While we are still in beta, we certainly look forward to end-user feedback, questions, and comments, and we are truly excited to bring these new capabilities to the public cloud.