Data scientists, informatics professionals, engineers and High Performance Computing (HPC) users will all agree that data sets are getting larger and computational requirements are growing exponentially.

At the same time, cloud computing technologies and infrastructure have been transforming the IT landscape for several years now, and their adoption for High Performance Computing tasks is well under way.

Although users and system administrators of the world’s largest supercomputers would argue that they have been engaged in cloud computing for decades, few would disagree that innovations in deployment models and hardware architectures are ushering in change. For example, in today’s public clouds, an individual user can essentially click a mouse a few times and spin up a machine that can rival a TOP500 supercomputer. The challenge with this model is that provisioning and running large cloud machines is not a trivial task. It’s also not always inexpensive.

And what about efficiency?  Certainly cloud virtualization technology has been shown to increase efficiency from a hardware perspective, but does this hold true for large, computational HPC workloads?  

As computational scale grows, so too does the energy required to provide reasonable walltimes for a given problem. As any HPC machine operator knows, this is becoming increasingly important.

So in my view, part of managing efficiency in large-scale cloud HPC workloads is balancing between minimizing walltimes and reducing the energy required to do so. Nimbix achieves this by operating a public cloud supercomputer that leverages hybrid accelerators running optimized applications designed for high throughput computing. The other important component of efficiency is reducing the “time to run” applications. Energy expenditure should be minimized in the setup and configuration of public cloud resources. This can be addressed by building a web service around the workload, not the machines. Having a web service designed around the application and data, much like Salesforce, Intuit and SaaS models in general, also makes cloud supercomputers much easier to consume as a service.

To further illustrate this concept, I refer to my favorite line from the summer blockbuster film, “The Avengers,” where the character Tony Stark (Iron Man) is busy solving a big computational problem. He makes a reference to a [cloud supercomputer] where he could “clock” his job “around 600 teraflops” to get his computation done quickly. It is precisely this picture of large-scale, public supercomputers that reflects my own view of the future of cloud-based supercomputing.