March 16, 2021
The year 2020 was as eventful for high-performance computing (HPC) as it was for the rest of the world. HPC is now a staple computing environment for businesses that want to speed up product development using simulations and modeling. From designing better rockets to developing new medical devices, cutting-edge biometric identification, and much more, the range of applications for HPC keeps widening.
As pioneers in high-performance computing and a leading provider of cloud-based supercomputing, our goal is to contribute to the broader accessibility of HPC. As we keep a close watch on industry developments, we're as excited about this next phase in HPC as anybody.
Looking back on 2020, here are some of the advances in the world of HPC that excited us the most.
THE ACCELERATED SHIFT TOWARDS CLOUD-BASED HPC
The drive towards infrastructure flexibility fueled growth in cloud-based HPC, and the overall shift towards remote operations during the COVID-19 pandemic in 2020 accelerated it further.
The flexibility and cost-efficiency of the pay-as-you-go model are attractive to organizations ranging from major institutions with on-prem supercomputers to small businesses that lack such resources. SMBs often can't afford HPC infrastructure or in-house expertise, yet they still want the advantages of HPC. They are using cloud-based HPC to speed up and improve product design and to raise operating efficiency. Existing players in HPC, such as research institutions and large, science-driven industries, are incorporating cloud-based HPC to add capacity without significant expense.
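To make the pay-as-you-go trade-off concrete, here is a minimal Python sketch of a break-even comparison. Every figure in it (hardware cost, yearly operating cost, price per core-hour, time horizon) is a hypothetical placeholder chosen for illustration, not vendor pricing or data from this article, and the function names are our own.

```python
# Hypothetical break-even sketch: on-prem cluster vs. pay-as-you-go cloud HPC.
# All numbers below are illustrative assumptions, not real prices.

def on_prem_cost(capex, annual_opex, years):
    """Total cost of owning a cluster: upfront purchase plus yearly running costs."""
    return capex + annual_opex * years


def cloud_cost(core_hours_per_year, price_per_core_hour, years):
    """Total pay-as-you-go cost: you pay only for the core-hours you consume."""
    return core_hours_per_year * price_per_core_hour * years


def break_even_core_hours(capex, annual_opex, price_per_core_hour, years):
    """Yearly core-hours at which cloud spend equals on-prem spend over `years`."""
    return on_prem_cost(capex, annual_opex, years) / (price_per_core_hour * years)


# Assumed inputs (hypothetical).
CAPEX = 500_000               # upfront hardware cost, USD
ANNUAL_OPEX = 100_000         # power, cooling, staff per year, USD
PRICE_PER_CORE_HOUR = 0.05    # cloud price per core-hour, USD
YEARS = 5                     # planning horizon

threshold = break_even_core_hours(CAPEX, ANNUAL_OPEX, PRICE_PER_CORE_HOUR, YEARS)
print(f"Break-even usage: {threshold:,.0f} core-hours per year")
```

Under these assumed inputs, an organization that uses fewer core-hours per year than the printed threshold spends less on the cloud, while heavier sustained usage tilts the comparison towards owned hardware. Real comparisons would also need to account for factors such as data transfer, utilization, and hardware refresh cycles, which this sketch leaves out.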
With HPC and AI computing available to a broader market, the cloud HPC market hit $4.5 billion in 2020 and is forecast to hit $11.54 billion by 2026. The affordability of cloud-based HPC is leading companies of all sizes to reassess the potential ROI of trying HPC. Cloud HPC lowers the cost of ownership and gives organizations more flexibility to manage expenses, which accelerates adoption. Now that the cloud makes HPC resources more affordable and accessible, the challenge is to make those resources easier for organizations to use.
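For readers who want to see what that forecast implies year over year, the short Python sketch below derives the compound annual growth rate (CAGR) from the two market figures quoted above. The calculation assumes smooth compounding across the six years from 2020 to 2026, which is our simplification and not something the forecast itself states.

```python
# Implied compound annual growth rate (CAGR) from the two market figures above.
# Assumption: steady compounding over the six years between 2020 and 2026.

def cagr(start_value, end_value, years):
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


MARKET_2020 = 4.5     # USD billions, as quoted in the text
MARKET_2026 = 11.54   # USD billions, forecast quoted in the text
YEARS = 2026 - 2020

rate = cagr(MARKET_2020, MARKET_2026, YEARS)
print(f"Implied CAGR: {rate:.1%}")
```

The result works out to roughly 17% per year, meaning the market would need to grow by about a sixth every year to reach the 2026 figure.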
HPC TECHNOLOGIES AND THE FIGHT AGAINST COVID-19
Perhaps the most all-consuming story of 2020 was COVID-19. Nothing and no one escaped the impact of the pandemic. HPC engineers and scientists, however, turned their attention to ensuring that HPC contributed to the fight against COVID.
In the United States, the COVID-19 High-Performance Computing Consortium was established in March 2020. Its members include government agencies, academic institutions, and HPC industry companies. As of publication, the consortium has overseen 97 projects. An early project addressed the design and testing of a device that would allow the use of a ventilator for more than one patient. A current project uses HPC-powered causal inference models on COVID observational data sets to conduct mitigation and response studies since building randomized data pools is impossible.
In Europe, the Partnership for Advanced Computing in Europe (PRACE) shifted its resources towards battling COVID. It provided resources to the Polytechnic University of Valencia, which performed network modeling to understand how COVID spread through communities. Another project used molecular dynamics simulations to predict how the virus might mutate.
THE CONVERGENCE OF HPC AND ARTIFICIAL INTELLIGENCE (AI) WORKLOADS AND ARCHITECTURE
Increasingly, HPC and AI workloads are running on shared architecture. While educational institutions that run supercomputers have often led the way, companies keen on realizing the value and revenue that AI-driven insights can generate are not far behind. Converged workloads are also becoming more common as companies look to incorporate AI and machine learning (ML) into HPC-driven simulations and modeling.
Yet AI and HPC workloads have different resource needs. An HPC architecture optimized for compute-centric work might not cope with the unpredictable, data-expanding workloads of AI. As a result, pressure is growing for flexible infrastructure that doesn't tie either AI or HPC to fixed hardware resources. Composable infrastructure allows for dynamic resource provisioning and utilization to serve HPC, AI, and converged workloads.
THE USE OF GPU ACCELERATORS ITSELF ACCELERATES
In 2012, Titan at Oak Ridge National Laboratory (ORNL) became one of the first leading supercomputers to use GPU accelerators. By April 2020, 136 of the Top500 systems were using GPU acceleration. According to the Gartner Priority Matrix for AI, 2020, GPU accelerator use will grow by 100% over the next two to five years.
The increase in AI and ML applications is a huge driver of this growth. Another driver is the massive data sets, including unstructured data, that are getting harder to query. CPUs, which are built largely for sequential processing, struggle with the immense analytical workloads demanded. In contrast, the parallel architecture of the GPU can better handle compute-intensive applications. Workloads that would take days to complete on CPUs often complete in an hour with GPU acceleration. The enhanced performance helps manage HPC processing budgets.
One of the original uses of GPU acceleration was in deep learning inference applications. Supercomputers added GPU accelerators as they needed to run mixed workloads. Now, GPU acceleration is finding its way into companies that want to extract business insights. Building smaller data subsets so that queries can run on CPUs excludes much of the data from the analysis. GPU acceleration allows academics and businesses to take advantage of AI and ML to uncover their most valuable intelligence without forcing them to portion out their data.
SOFTWARE-DEFINED STORAGE AND ITS IMPACT ON HPC
With the broader shift to hybrid environments and the growing pace and size of AI/ML workloads, HPC engineers and architects are looking for updated storage mechanisms.
The input/output (I/O) throughput of traditional HPC storage becomes a bottleneck for high-performance analytical applications running on massive, unstructured data sets. Software-defined storage (SDS) will enable engineers to scale storage while a process is running. It can do this while unifying storage across a hybrid environment regardless of the underlying hardware. Thus, SDS can overcome the barriers presented by data silos and massive data sets.
SDS also allows for optimization of data placement within defined storage and access policies. Optimized data placement minimizes I/O operations, improving process performance and lowering cost. NERSC Advanced Technologies studied its data flows in 2019, and according to storage architect Glenn Lockwood, 15-30% of the data flow involved users moving data between storage tiers. At SC20, Lockwood proclaimed that "Tiers decelerate scientific discovery—the exact opposite of what HPC should be doing!"
Work on how SDS fits into the transition of HPC architectures from the petascale to the exascale era, such as the work BigHPC is doing, will continue.
OUR PREDICTION: IN 2021, EVERYTHING ABOUT HPC WILL KEEP GETTING BIGGER
From the data sets to the use cases, HPC became ever more prominent in 2020. As HPC resources become more affordable and accessible, their viability for new commercial applications is also growing. Perhaps the most significant HPC trend in 2020 was the spread of demand beyond the traditional academic and data science fields into commercial enterprise. Let's see where 2021 will take us next.