Nimbix Blog
Super musing about all things supercomputing
Accelerated computing affords us many luxuries: faster computing and computing on larger data sets. One aspect that is often not mentioned is the luxury of exploring multiple methods for solving a particular problem. Luxury? Hold on a minute, exploring various methods for solving a particular challenge is a necessity. If the reader indulges me in a little bit of abstraction, we find that for every problem there is a set of possible solutions. This solution space is often referred to as the event space, and this space has some interesting features and topography. I promise that’s as far as I’m going with this stuff; you can exhale now. What this means is that for a given problem there is potentially more than one solution, and for the sake of correctness, should we not evaluate either all possible solutions or at least a selection of methods to generate solutions?
This is where accelerated computing makes strong sense. If an accelerated machine can run a model in half of the time a non-accelerated machine can run the model, we now have a conundrum. Are we satisfied with the result of one model, or is it more economical, in terms of time, to run two different models in the time it would usually take to run a single model? In essence, it is the balance between volume and speed. Would I rather have two answers to evaluate or one answer to evaluate in half of the time? Similarly, increasing the acceleration further would result in many solutions to assess and the need to review all of these solutions to find the best one for our application.
One way to look at this is to assume that all models are fundamentally incorrect (after all, they are approximations, even in the simplest of systems) and running multiple models allows us to examine different solution methods that have been optimized differently or that are computed using various methods. This makes it possible to take a more holistic view of our proposed solutions, and it gives us confidence in our interpretation. Multiple solutions mitigate the possibility that a solution generated may be faulty due to the input conditions. If by chance the input data causes the analysis to go down a particular path which results in incorrect results, then a single instance can be very detrimental. The ability to run multiple cases concurrently and arrive at an array of solutions reduces this risk. The incorrect result will be an outlying data point while the majority of solutions will fall within the norm. The luxury of multiple solutions also creates the necessity to review all of these solution sets and utilize the appropriate one that works best for our application.
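The claim that a faulty run shows up as an outlier among corroborating runs can be sketched in a few lines of Python. This is a minimal illustration rather than a method taken from any particular solver: the result values are made up, and the function name `flag_outliers`, the model names, and the 3.5 cutoff are assumptions chosen for the example.

```python
from statistics import median

def flag_outliers(results, threshold=3.5):
    """Return (consensus, outliers) using the median absolute deviation.

    `results` maps a hypothetical model name to its scalar answer.
    """
    values = list(results.values())
    center = median(values)
    # Median absolute deviation: a spread estimate one bad run cannot dominate.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # More than half the runs agree exactly; anything else is suspect.
        return center, {k: v for k, v in results.items() if v != center}
    outliers = {
        name: value
        for name, value in results.items()
        # 0.6745 rescales the MAD so the score is comparable to a z-score.
        if 0.6745 * abs(value - center) / mad > threshold
    }
    return center, outliers

# Made-up peak-stress results (MPa) from five hypothetical solver setups.
runs = {
    "model_a": 101.2,
    "model_b": 99.8,
    "model_c": 100.5,
    "model_d": 100.9,
    "model_e": 142.0,
}
consensus, suspects = flag_outliers(runs)
print(consensus, suspects)
```

With a single run there is nothing to compare against, which is the point of the paragraph above: only the array of solutions makes the odd one visible.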
The notion of gaining confidence from multiple corroborating sources is often referred to as evidence-based support. This is commonly seen in decision support applications. In the decision support context, a question is postulated and evidence is collected, and from that evidence, support is either present for the conclusion or it is not. It is the support that is then evaluated for validity. Multiple solutions also pose a new problem, where the user needs to determine which solution is best for the application. This can quickly become very difficult to decipher, particularly as the tolerances grow for the variables and boundary conditions established for the analysis.
In many cases, operating in the decision support context provides more freedom for interpretation than with a single model. In fields that are based primarily on expert interpretation, the decision support model has considerable traction. One such field is the practice of medicine, in particular anatomic pathology. Pathologists are highly trained specialist physicians who have extensive specialty training (an additional five years minimum post medical school) and are trained to diagnose disease based on laboratory analysis. One of the most common pathological techniques is a histological examination of surgical specimens. If you ever have surgery where something is removed, that bit of removed tissue is sent “to the lab” for analysis. As part of that examination, the lab makes thin sections of the surgical sample, mounts them on microscope slides, and stains them with chemicals that highlight different areas in different colors. The pathologist examines and treats the slides to make a determination as to the nature of the pathology of the surgical sample. The pathologist then writes up a report and sends it to the referring physician, frequently the surgeon. Now, the pathologist is an expert diagnostician and is often the final word on whether that tumor removed from your lung is cancerous or not. If it is cancerous, that sets in motion a series of life-changing events; if not, then life continues as usual. Now, given that the pathologist is often the first to detect or confirm cancer, would you want them to make that decision based upon one point of data? Not me! I’d want them to run the full battery of tests and assays, paint the most comprehensive picture possible, and support their diagnosis.
Engineering and other consumers of high performance computing are becoming more like pathologists. A practical engineering example of where multiple solutions are reviewed is a dynamic analysis of a car crash. Engineers build a model of the car in question and test the crash using simulation. The slightest change in input will result in significantly different results. If these tests were conducted with only a single precise contact at the front end of the car, and the result was that passengers were not harmed, it would be irresponsible for the engineer to conclude that the car is safe during impact based on this single solution. Just slightly changing the impact location by a few inches up or down can cause drastic changes downstream in the analysis. Supportive information is becoming more and more critical; therefore, exploring the solution space with multiple models to support a particular conclusion will become more necessary. Accelerated computing solves this economically in that it allows a larger area to be explored in any given unit of time. So, I guess multiple solutions aren’t a luxury after all; they are a necessity.
When we think about Artificial Intelligence, we have a large array of potential models to choose from. We can imagine a rule-based engine, a neural network, or some other, more exotic method such as a Generative Adversarial Network, that classifies an input and then executes an action based upon its classification. Rarely, though, do we ever take a step back and look at the system as a whole and how the system maintains its Viability (you remember Viability from my Eight V’s…blog post). That is the topic of this blog post, system Viability, and we’re going to look at it through the lens of The Viable Systems Model.
The Viable Systems Model was proposed and refined by Stafford Beer and others (John von Neumann, Norbert Wiener, W. Ross Ashby, Alan Turing, and many others) between the 1930s and 1970s. What these theoreticians were attempting to model were the universal mechanisms for a system, any system, to be self-governing and self-perpetuating: Viable. The field of study Beer and his associates were operating in was called Cybernetics, the science of communication and automated control. You could think of Cybernetics as a contemporary and companion to Systems Science, Control Theory, and all of the other disciplines that seek to describe and understand how things (gene regulatory networks, organisms, groups, corporations, political bodies, societies, etc.) adapt and change over time. Beer was primarily concerned with corporations and economic governance, but his models and theory are general enough and universal enough to be applied within any discipline, including art and sports training.
This all sounds intuitive enough. Almost fifty years ago a group of theoreticians came up with a model of governance and feedback. How does that impact us now? Heck, in the early 1970’s computers were the size of large rooms that had special elevated floors and programs were on punch cards or paper tape. They didn’t have cell phones, or even fuzzy logic rice cookers. Polyester was cool, hair styles were generally regrettable, and the Brady Bunch was in first run.
Unlike Mike Brady’s perm and penchant for flammable petroleum-based synthetic fabrics, this theoretical system of governance has found a place within machine learning and is the cornerstone of the “Eight V’s of Big Data and Artificial Intelligence” in the area of Viability. Now, our task is to very briefly examine and explain the Viable Systems Model (VSM) within the context of Artificial Intelligence and Big Data.
The VSM is a five-system, or five-layer, model that governs or controls different aspects of an entity’s existence through time. Layers 1-3 respond to stimuli that influence activities in the “here and now,” layer 4 deals with reconfigurations for future or predictive elements that will influence the entity’s long-term viability, and layer 5 seeks to balance or buffer layers 1-3 against layer 4 (see diagram 1).
Dropping out of this level of abstraction and examining the parts individually, we look through the lens of an organism we can relate to: lions (diagram 2).
Diagram 2 – The Viable System Model, https://www.slideshare.net/issip/an-introduction-to-systems-thinking-for-tackling-wicked-problems-57502299.
System 1 – This is the activity itself that describes the system. A living system, for example, metabolizes and respires (burns food, produces waste). Lions do this, right?
System 2 – These are the communication systems within the living body, a nervous system or signaling system of some sort. Lions have nervous systems.
System 3 – This is the monitor and control system for System 1. In a living system, this system regulates simple activities like metabolic rate and respiration rate. In humans and mammals, this can be thought of as the autonomic or involuntary nervous system. Depending upon the complexity of the organism, System 3 can also encompass circadian rhythms and other innate behaviors. Lions sleep, are awake, hunt, mate, and have certain other behaviors.
System 4 – This is the first set of outwardly looking systems that take in input from the external world or milieu. These systems can be thought of as external sensors, touch, sight, hearing, and so forth. With these sensors are also rules that allow for self-preserving behaviors. For example, System 2 communicates to System 1 that the organism is running low on energy (is hungry). System 4 identifies the communication as hunger and identifies a food source and begins to eat. Lions do this very frequently, we can think of this as typical individual lion behavior.
System 5 – This is the component of the system that governs or balances System 4 activities against Systems 1-3. For example, if we look at a pride of lions, we see System 5 activities taking place with feeding priorities: young weaned cubs are higher up the feeding ladder (eating with their mothers who did the hunting) than are older cubs, who eat last. This is done to assure that the next generation of cubs can nutritionally make it to adulthood while maintaining the social order of the pride. In this case, System 5 is the lion pride dynamics that govern a group of lions and modulate their behavior. On a systemic level, we can equate System 5 with Rousseau’s Social Contract, https://en.wikipedia.org/wiki/The_Social_Contract.
Another way to think of System 5 activities is as those activities that allow the organism or entity to co-exist with other entities like it and interact within its milieu.
All of this translates directly to artificial intelligence. If we look back at the concise definition proposed by Accenture that was put forth in an earlier post, we see that artificial intelligence is defined as the ability to “sense, comprehend, and act”, and that the VSM maps directly. Systems 1-3 sense, System 4 comprehends, and System 5 balances the needs that have been sensed by Systems 1-3 with the actions proposed by System 4. If all five of these systems are tuned and trained appropriately, then there exists a system that is viable over time and can change and adapt to its environment. This is ideally what we want in an artificial intelligence. It does us very little good to develop an Artificial Intelligence that only works at time point zero or use case zero; that’s like being a lion and not understanding the concept of hunting or eating. If that is true, as a lion, your viability will be very short.
As we build AI’s, we need to keep this abstract model in mind and think about the Viability, the continued Viability, of the products that we are creating. There are very few universal truths, and one of them is that change is difficult, even for AI’s. What the VSM does is give the AI a built-in mechanism to introduce self-change in response to the inputs that it is receiving. Models need continual training to remain relevant. When viewed through the lens of the VSM, AI’s become more than just automated decision points; they become entities that adapt over time to the changing landscape of their niche. This brings us back to the utility of accelerated computing: in order to make truly viable AI’s, there needs to be continual training, and continual monitoring and modeling of the external milieu as well as of the internal response model. This continual level of self-monitoring requires accelerated computing to maintain viability; otherwise the monitoring activities overtake the AI’s ability to respond. Think of this as a modern-day “swap of death” situation. So, save your lions: use accelerated computing to enable your AI’s to be truly Viable Systems.
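To make the mapping a little more concrete, here is a toy sketch in Python. It only illustrates the sense, comprehend, and balance split described above and is not Beer's formal model: the `Lion` class, the 0.3 hunger threshold, and the simplified feeding rule are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Lion:
    energy: float        # internal state, regulated by Systems 1-3
    has_priority: bool   # True for animals that feed first in the pride

def sense(lion):
    """Systems 1-3: report internal needs in the here and now."""
    return {"hungry": lion.energy < 0.3}  # threshold is an assumption

def comprehend(needs, food_visible):
    """System 4: look outward and propose an action."""
    if needs["hungry"] and food_visible:
        return "eat"
    if needs["hungry"]:
        return "hunt"
    return "rest"

def balance(lion, proposed, others_feeding):
    """System 5: weigh the proposal against the rules of the pride."""
    if proposed == "eat" and others_feeding and not lion.has_priority:
        return "wait"  # lower-priority animals eat last
    return proposed

def step(lion, food_visible, others_feeding):
    """One pass through the loop: sense, comprehend, then balance."""
    needs = sense(lion)
    proposal = comprehend(needs, food_visible)
    return balance(lion, proposal, others_feeding)

older_cub = Lion(energy=0.1, has_priority=False)
print(step(older_cub, food_visible=True, others_feeding=True))
```

The design point is that System 5 never senses or proposes anything itself; it only vetoes or passes along what the lower systems produce.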
By Tom McNeill
Artificial intelligence is really nothing new; it is the ability for a machine to “sense, comprehend, and act”, according to an Accenture publication. The real use for AI is in wading through the increasing volumes of data that are being generated on a daily basis and automating responses to signals from that data. Let’s look at one of the first commercial uses for fuzzy logic, a form of AI: the fuzzy logic rice cooker by Zojirushi. You select your type of rice, you put in water, set it and forget it. The rice cooker has sensors that monitor temperature and humidity, and it adjusts the temperature and cook time accordingly, ensuring well-cooked rice. What it is really doing is automating and adjusting the cooking process of a food product that has been cooked for thousands of years. In short, people know how to cook rice; the Zojirushi product has a model for cooking rice well and implements an automation for making rice.
So, if we take our rice cooker analogy a step further, we find that the cooker is only capable of making a pot of rice with types of rice, or grains, with which it is familiar. Jasmine rice, OK; short grain rice, no problem. What about wild rice, which actually isn’t a rice at all but a grass, or a pork chop, which certainly isn’t rice? Depending upon how the logic is implemented, all the rice cooker can do is monitor temperature and humidity in the pot. Everything to the rice cooker is rice because its models don’t know about the edge case of wild rice or the true outlier, a pork chop. This is the problem with fixed model systems: they don’t deal well with new or unusual things. That’s where learning comes in.
Learning is the real power and downfall of artificial intelligence. In most cases, AI’s are trained on sets that have been assembled to replicate a truth, a calculable entrance requirement to a labeled set or category. The entrance requirement can be a set of metadata that is weighted to achieve a score which determines the entrance to a particular category. We see this process go humorously wrong with toddlers when they are learning to speak. For example, little Freddie is 10 months old and he calls the family dog, Rover, ‘doggie’. Rover has four legs, a tail, and fur. On a day out with the family, Freddie sees a horse for the first time, points to it and says, “doggie”. Freddie just had a false positive because he had never seen a horse before and defaulted to the label he knew for things with four legs, a tail, and fur. In short, much like toddlers, AI’s are only as accurate as the training (experiences) they have been given. Can an AI train itself? Yes, it’s an old field called cybernetics, and that will be a topic for another blog post; and no, because there needs to be some sort of seeded a priori knowledge.
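The weighted entrance requirement and the toddler's false positive can be sketched as a tiny scoring function. Everything here is hypothetical: the feature names, the integer weights, and the admission threshold are made up to mirror the story above, not taken from any real classifier.

```python
# Hypothetical feature weights for the single label the toddler knows.
DOGGIE_WEIGHTS = {"four_legs": 4, "tail": 3, "fur": 3}
ENTRANCE_SCORE = 8  # assumed minimum score for admission to the category

def score(features, weights):
    """Sum the weights of the features that are present."""
    return sum(w for name, w in weights.items() if features.get(name, False))

def classify(features):
    """Admit the input to 'doggie' if it clears the entrance requirement."""
    if score(features, DOGGIE_WEIGHTS) >= ENTRANCE_SCORE:
        return "doggie"
    return "unknown"

rover = {"four_legs": True, "tail": True, "fur": True}
# The model has no weight for hooves, so that evidence is simply ignored.
horse = {"four_legs": True, "tail": True, "fur": True, "hooves": True}
bird = {"tail": True}

print(classify(rover), classify(horse), classify(bird))
```

The horse clears the bar for exactly the same reason Rover does, which is the false positive: the classifier can only weigh features it was taught to look for.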
OK, rice cookers and Freddie the toddler, what does this have to do with supercomputing and artificial intelligence? These two examples have shown that just like people, artificial intelligences are bound by what they have been taught or trained upon and bound by the topology of their internal classification scheme. Supercomputers, or more specifically put, computers with accelerated hardware, are capable of increasing the speed of the system that governs the artificial intelligence. Due to its increased speed and capacity, it can train faster, on larger, more comprehensive sets, as well as on more focused and deeper training sets. This ability then allows a more fine-grained ability to discriminate input and a greater classification topology (more classes for classification and more complex relationships between classes).
Going back to our definition of artificial intelligence from Accenture, “sense, comprehend, act,” we see that artificial intelligence is just an automated classification…oh, yes, you in the back row, what’s that…inference and prediction? You’ve been reading ahead. Yes, both inference and prediction appear to be forecasting into the future; however, in both cases, they are using the models they have been trained upon, and forecasting methodologies that they have been trained to use, so in fact, they are rearward looking and very similar if not identical to our classification example. Inference simply trims and optimizes the classifiers in response to use, and prediction merely extends the models that are constructed in some logical way. We could even go so far as to say that any inference or prediction is a function of the training given to the AI. So again, we come back to our models and our classification topology.
If we accept the fact that AI’s are topology bound, this means that to get closer to the truth (whatever that is), every set of data can be categorized against numerous different classification topologies; we can call these different topologies “facets”. This is where accelerated computing shines. Instead of attempting to classify against a single entrance requirement, multiple AI’s can be trained against multiple entrance requirements that represent different potential semantic realities. For example, if the requirement is to classify types of ‘blues’, one logical set might be to name colors (navy blue, sky blue, baby blue, …), a second might be musical genres (Delta blues, Chicago blues, Texas electric blues, …), and a third might have to do with Major League Baseball teams and players (Toronto Blue Jays, Vida Blue, …). The result is that once the facet space has been identified and defined, a more whole or full AI solution can be generated. So, if something as simple as ‘blue’ requires at least three fully trained classifiers, more complex search spaces will require many more. This expansive requirement means one thing: more compute time for training. This is where accelerated computing is a natural fit; faster compute means more facets can be trained per unit time. More facets trained means more robust and complete AI coverage. With better semantic coverage, you are less likely to have your AI pointing to a “horse” and calling it “doggie.”
If you dive into the field of Supercomputing and Big Data you will begin to run across blog posts talking about the “V’s” of the field: the six, the eight, the ten, the twelve, and so forth. No, we’re not talking about engines, we’re talking about lists of nouns that name aspects or properties of Big Data or Supercomputing that need to be balanced or optimized. The list of eight balances being complete while remaining concise; the higher numbered lists tend to veer off into data governance issues that are generally not issues we need concern ourselves with at this point.
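As a rough illustration of the facet idea, the sketch below checks a term against several independent vocabularies and reports every facet that claims it. Plain lookup tables stand in for trained classifiers here, which is a deliberate simplification; the `FACETS` structure and the `classify` function are assumptions for the example, and the terms come from the ‘blues’ example above.

```python
# Each facet is one semantic reality with its own vocabulary.
FACETS = {
    "color": {"navy blue", "sky blue", "baby blue"},
    "music": {"delta blues", "chicago blues", "texas electric blues"},
    "baseball": {"toronto blue jays", "vida blue"},
}

def classify(term):
    """Return every facet whose vocabulary contains the term."""
    key = term.strip().lower()
    return sorted(name for name, vocabulary in FACETS.items() if key in vocabulary)

for term in ("Sky Blue", "Chicago blues", "Vida Blue", "horse"):
    print(term, "->", classify(term))
```

In a real system each facet would be a separately trained model, so the training cost grows with the number of facets, which is the argument for accelerated hardware made above.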
The eight V’s: Volume, Velocity, Variety, Veracity, Vocabulary, Vagueness, Viability and Value
Most of these are pretty self-explanatory, but let’s go through them just for drill.
Volume: The amount of data needing to be processed at a given time. This can manifest either as amount over time or amount that needs to be processed at one time. For example, doing a matrix operation on a 1 billion by 1 billion matrix or scanning the contents of every published newspaper in a day for key words are both examples of volume that can constrain computing.
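To get a feel for the scale of the matrix example, here is a back-of-the-envelope calculation. It assumes a dense matrix stored as 8-byte double-precision values, which the text above does not specify; a sparse matrix or a smaller data type would need far less.

```python
def dense_matrix_bytes(rows, cols, bytes_per_element=8):
    """Memory needed to hold a dense rows x cols matrix."""
    return rows * cols * bytes_per_element

n = 1_000_000_000  # one billion
total_bytes = dense_matrix_bytes(n, n)
exabytes = total_bytes / 10**18  # decimal exabytes

print(f"{total_bytes:,} bytes, or {exabytes:.0f} EB")
```

Under those assumptions, simply holding the matrix takes 8 exabytes before any operation is performed on it.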
Velocity: Similar to Volume, this has to do with the speed of the data coming in and the speed of the transformed data leaving the compute. An example of a high velocity requirement is telemetry that needs to be analyzed in real time for a self-driving car. The enemy of velocity is latency.
Variety: The spice of life, or the bane of computing? In the computing context we are discussing, this term refers to heterogeneous data sources that need to be identified and normalized before the compute can occur. In data science, this is often referred to as data cleaning. This operation is frequently the most labor intensive, as it involves all of the pre-work required to set up the high-performance compute. This is where the vast majority of errors and issues are found with data, and this is the fundamental bottleneck in high-performance computing.
Vocabulary: This term has two meanings. The first meaning is less a computing issue than it is a communication issue between provider and customer, and it has to do with the language used to describe the desired outcome of an analysis. For example, the term “accuracy” or “performance” may have a different meaning in the context of structural engineering than it does in rendering animation. The second meaning branches into semantic searching and operations within a semantic space. Here we are dealing with controlled vocabularies (ontologies) that represent a specific definition but also a relatedness to another term. For example, the term “child” implies that it has a “parent” and so forth. This term architecture is very important when operating with clients in the artificial intelligence space where search and retrieval is used to uncover unknown relationships. As it turns out, the strength of the ontology is what leads to the relative success or failure in projects that mine with semantic-based technologies.
Vagueness: This term describes an interpretation issue with results being returned. Douglas Adams articulated this beautifully in the “Hitchhiker’s Guide to the Galaxy”, where the answer to all questions in the galaxy was postulated to be the number 42. This is a bit tongue-in-cheek, but it is a very real problem with scientific and big data computes. These computes are able to marshal and transform huge oceans of data, but what does it mean? What do I do with the answer? We see the same issue in statistics when we do correlation studies. A famous example is the direct correlation between sales of chocolate ice cream and violent crime in Cleveland. So, what does this mean? Does this mean that there is something in chocolate ice cream that makes people violent? As a well-meaning city official, you might consider banning the sale of chocolate ice cream, but you’d look foolish, and here’s why. Correlation does not imply causation; as it turns out, both ice cream sales and violent crime spike in the summer due to heat and lack of central air conditioning. This is vagueness. Computes that produce correlations are often misinterpreted as causation, and more data doesn’t necessarily mean better or more accurate results. This is something that we all need to keep in the back of our minds when dealing with clients.
Viability: This refers to a model’s ability to represent reality. Models, by their very nature, are idealized approximations of reality. Some are very good, others are dangerously flawed. Frequently, model builders simplify their models in order for them to be computationally tractable. With hardware acceleration, we can remove these shackles from the model builder and let them simulate closer to reality.
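The ice cream example can be reproduced with a few lines of arithmetic. The numbers below are synthetic, constructed for illustration so that temperature drives both series; they are not real sales or crime figures. The sketch computes the ordinary Pearson correlation and then the partial correlation that controls for temperature.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Synthetic data: both series are temperature plus small unrelated wobbles.
temperature = [10, 15, 20, 25, 30, 35]
ice_cream_sales = [21, 29, 40, 50, 59, 71]
violent_crimes = [11, 13, 21, 24, 32, 34]

raw = pearson(ice_cream_sales, violent_crimes)
controlled = partial_correlation(ice_cream_sales, violent_crimes, temperature)
print(f"raw correlation: {raw:.3f}, controlling for temperature: {controlled:.3f}")
```

On this made-up data the raw correlation is very strong, yet it disappears once temperature is accounted for, which is the confounding effect described above.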
Value: This term is defined as whatever is important to the customer. Another way to define value is the removal of obstacles in their path to allow them to get to their stated destination. We often think of value in terms of cost, but, we can also think of Value in terms of enablement and what that is worth to the customer.
Here are some relationships between these terms that might be helpful…
As the first six V’s increase for any given problem, the problem outstrips the ability and capacity of commodity hardware and leads to a decrease in Viability and Value from that compute on commodity hardware.
Hardware deals primarily with Volume and Velocity as these are physical constraints of the data.
Software deals primarily with Variety, Veracity, Vocabulary, and Vagueness as these are logical or organizational constraints upon the data.
Artificial Intelligence/Machine Learning can be described as any technology that contains logic that discriminates between two or more classifications (member or non-member, odd or even, etc.). These systems deal primarily in the area of controlling or limiting Vocabulary and Vagueness, and add Value and Viability through this control.
From these eight V’s and their relationships to hardware, software and artificial intelligence/machine learning, we now have a lens through which we can examine our customer’s requirements and determine a measure of Value for the service that we provide.
Engineering simulation has become much more prevalent in engineering organizations than it was even 5 years ago. Commercial tools have gotten significantly easier to use, whether you are looking at tools embedded within CAD programs or the standalone flagship analysis tools. The driving force behind these changes is ultimately to let engineers and companies understand their designs quicker and with more fidelity than before.
Engineering simulation is one of those cliché items where everyone says “We want more!” Engineers want to analyze bigger problems, more complex problems and even do large scale design of experiments with hundreds of design variations, and they want these results instantaneously. They want to be able to quickly understand their designs and design trends and be able to make changes accordingly so they can get their products optimized and to the market quicker.
ANSYS, Inc. spends a significant amount of R&D in helping customers get their results quicker and a large component of that development is High Performance Computing, or HPC. This technology allows engineers to solve their structural, fluid and/or electromagnetic analyses across multiple processors and even across multiple computing machines. Engineers can leverage HPC on laptops, workstations, clusters and even full data centers.
PADT is fortunate to be working with Nimbix, a High Performance Computing Platform that easily allowed us to quickly iterate through different models with various cores specified. It was seamless, easy to use, and FAST!
Let’s take a look at four problems: Rubber Seal FEA, Large Tractor Axle Model, Quadrocopter CFD model and a Large Exhaust CFD model. These problems cover a nice spectrum of analysis size and complexity. The CAD files are included in the link below.
TRACTOR AXLE FEA
This model has several parts all with contact defined and has 51 bolts that have pretension defined. A very large but not overly complex FEA problem. As you can see from the results, even by utilizing 8 cores you can triple your analysis throughput for a work day. This leads to more designs being analyzed and validated which gives engineers the results they need quicker.
- 58 Parts
- 51 x Bolts with Pretension
- 928K Elements, 1.6M Nodes
| Cores | Elapsed Time [s] | Estimated Models per 8 Hours |
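The last column of a table like this can be derived from the elapsed time alone. The sketch below assumes the estimate is simply the number of whole runs that fit back to back in an eight-hour day; the elapsed times are placeholders chosen to show a threefold gain, not the measured values from this benchmark.

```python
WORKDAY_SECONDS = 8 * 3600  # eight hours expressed in seconds

def models_per_workday(elapsed_seconds):
    """Whole runs that fit back to back in an eight-hour day."""
    return WORKDAY_SECONDS // elapsed_seconds

# Placeholder timings (seconds per run); not measured data.
hypothetical_runs = {1: 3600, 8: 1200}

for cores, elapsed in hypothetical_runs.items():
    print(f"{cores} cores: {models_per_workday(elapsed)} models per 8 hours")
```

With these placeholder numbers, cutting the run time to a third triples the number of designs that can be checked in a day.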
RUBBER SEAL FEA
The rubber seal is actually a relatively small problem, but quite complex. Not only does it need full hyperelastic material properties defined with large strain effects included, it also includes a leakage test. This will pressurize any exposed areas of the seal. This will of course cause some deformation, which will lead to more leaked surfaces and so on. It basically becomes a pressure-advancing solution.
From the results, again you can see the number of models that can be analyzed in the same time frame is significantly more. This model was already under an hour, even with the large nonlinearity, and with HPC it was down to less than half an hour.
- 6 Parts
- Mooney Rivlin Hyperelastic Material
- Seal Leakage with Advancing Pressure Load
- Frictional Contact
- Large Deformation
- 42K Elements, 58K Nodes
| Cores | Elapsed Time [s] | Estimated Models per 8 Hours |
QUADROCOPTER DRONE CFD
The drone model is a half symmetry model that includes 2 rotating domains to account for the propellers. This was run as a steady state simulation using ANSYS Fluent. Simply utilizing 8 cores will let you solve 3 designs versus 1.
- Multiple Rotating Domains
- 2M Elements, 1.4M Nodes
| Cores | Elapsed Time [hours] | Speedup |
LARGE EXHAUST CFD
The exhaust model is a huge model with 33 million elements with several complicated flow passages and turbulence. This is a model that would take over a week to run using 1 core but with HPC on a decent workstation you can get that down to 1 day. Leveraging more HPC hardware resources such as a cluster or using a cloud computing platform like Nimbix will see that drop to 3 hours. Imagine getting results that used to take over 1 week that now will only take a few hours. You’ll notice that this model scaled linearly up to 128 cores. In many CFD simulations the more hardware resources and HPC technology you throw at it, the faster it will run.
- K-omega SST Turbulence
- 33M Elements, 7M Nodes
| Cores | Elapsed Time [hours] | Speedup |
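The speedup column follows from a simple ratio of run times. The sketch below uses the rough figures quoted in the text for this model (a week, a day, three hours); since the text says "over a week", the 168-hour baseline is a lower bound and the resulting speedups are approximate. The parallel efficiency helper and its 64-core example are assumptions added for illustration.

```python
def speedup(baseline_hours, accelerated_hours):
    """How many times faster the accelerated run is than the baseline."""
    return baseline_hours / accelerated_hours

def parallel_efficiency(speedup_factor, cores):
    """Fraction of ideal scaling achieved; 1.0 means perfectly linear."""
    return speedup_factor / cores

ONE_WEEK_HOURS = 7 * 24  # lower bound for the single-core run

workstation = speedup(ONE_WEEK_HOURS, 24)  # "down to 1 day"
cluster = speedup(ONE_WEEK_HOURS, 3)       # "drop to 3 hours"
print(f"workstation: {workstation:.0f}x, cluster or cloud: {cluster:.0f}x")

# Hypothetical check of linear scaling: 64x on 64 cores is 100% efficient.
print(parallel_efficiency(64.0, 64))
```

Linear scaling, as described for this model, means the efficiency stays close to 1.0 as cores are added.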
As seen from the results, leveraging HPC technology can be hugely advantageous. Many simulation tools out there do not fully leverage solving on multiple computing machines or even multiple cores. ANSYS does, and the value is clear. HPC makes large complex simulation more practical as a part of the design process timeline. It allows for greater throughput of design investigations, leading to better fidelity and more information for the engineer to develop an optimized part quicker.
If you’re interested in learning more about how ANSYS leverages HPC or if you’d like to know more about Nimbix, the cloud computing platform that PADT leverages, please reach out to me at email@example.com