By Tom McNeill
Artificial intelligence is really nothing new. According to an Accenture publication, it is the ability of a machine to “sense, comprehend, and act”. The real use for AI is in wading through the increasing volumes of data generated on a daily basis and automating responses to signals from that data. Let’s look at one of the first commercial uses of fuzzy logic, a form of AI: the fuzzy logic rice cooker by Zojirushi. You select your type of rice, put in water, set it and forget it. The rice cooker has sensors that monitor temperature and humidity, and it adjusts the temperature and cook time accordingly to ensure well-cooked rice. What it is really doing is automating and adjusting the cooking process for a food that has been cooked for thousands of years. In short, people know how to cook rice; the Zojirushi product has a model for cooking rice well and automates it.
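To make the idea of fuzzy control a little more concrete, here is a minimal sketch in Python. It is not Zojirushi's algorithm, which is not described here. The fuzzy sets, the temperature ranges, the power levels, and the names `triangular` and `heater_power` are all assumptions chosen for illustration, and the sketch looks at temperature only, ignoring humidity and cook time.

```python
def triangular(x, left, peak, right):
    """Degree (0.0 to 1.0) to which x belongs to a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def heater_power(temp_c):
    """Blend three rules into one heater setting between 0.0 and 1.0."""
    # Hypothetical fuzzy sets for pot temperature in degrees Celsius.
    too_cool = triangular(temp_c, 20.0, 60.0, 100.0)
    about_right = triangular(temp_c, 90.0, 100.0, 110.0)
    too_hot = triangular(temp_c, 100.0, 120.0, 140.0)

    # Each rule pairs a membership degree with the power it recommends
    # (assumed values: full power, gentle simmer, heater off).
    rules = [(too_cool, 1.0), (about_right, 0.4), (too_hot, 0.0)]
    total = sum(degree for degree, _ in rules)
    if total == 0.0:
        # Outside every fuzzy set: heat fully when cold, not at all when hot.
        return 1.0 if temp_c <= 20.0 else 0.0
    # Weighted average of the recommendations, weighted by membership.
    return sum(degree * power for degree, power in rules) / total
```

The point of the sketch is that a reading such as 95 degrees is partly "too cool" and partly "about right", so the heater setting lands between the two recommendations instead of snapping from one to the other.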
So, if we take our rice cooker analogy a step further, we find that the cooker is only capable of making a pot of rice with the types of rice or grains with which it is familiar. Jasmine rice, OK. Short grain rice, no problem. What about wild rice, which actually isn’t a rice at all but a grass, or a pork chop, which certainly isn’t rice? Depending upon how the logic is implemented, all the rice cooker can do is monitor temperature and humidity in the pot. Everything is rice to the rice cooker because its models don’t know about the edge case of wild rice or a true outlier like a pork chop. This is the problem with fixed-model systems: they don’t deal well with new or unusual things. That’s where learning comes in.
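One way to picture a fixed-model system is as a lookup table with a fallback. The sketch below is purely illustrative: the profile names, temperatures, and times are invented, and `choose_profile` is a hypothetical helper, not anything taken from a real rice cooker.

```python
# Hypothetical cooking profiles: (target temperature in Celsius, minutes).
PROFILES = {
    "jasmine": (100, 18),
    "short_grain": (100, 22),
}
# Assumed fallback: anything unrecognized is treated as short grain rice.
DEFAULT_PROFILE = PROFILES["short_grain"]


def choose_profile(contents):
    """A fixed model: whatever it does not recognize is cooked as rice."""
    return PROFILES.get(contents, DEFAULT_PROFILE)


print(choose_profile("jasmine"))
print(choose_profile("pork_chop"))  # silently gets the rice treatment
```

Nothing in the table can tell the cooker that a pork chop is not rice, so the outlier falls straight through to the default.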
Learning is the real power and downfall of artificial intelligence. In most cases, AIs are trained on sets that have been assembled to replicate a truth: a calculable entrance requirement to a labeled set or category. The entrance requirement can be a set of metadata that is weighted to produce a score, which determines entrance to a particular category. We see this process go humorously wrong with toddlers when they are learning to speak. For example, little Freddie is 10 months old and he calls the family dog, Rover, ‘doggie’. Rover has four legs, a tail, and fur. On a day out with the family, Freddie sees a horse for the first time, points to it and says, “doggie”. Freddie just had a false positive because he had never seen a horse before and defaulted to the label he knew for things with four legs, a tail, and fur. In short, much like toddlers, AIs are only as accurate as the training (experiences) they have been given. Can an AI train itself? Yes, it’s an old field called cybernetics, and that will be a topic for another blog post. And no, because there needs to be some sort of seeded a priori knowledge.
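The weighted entrance requirement and the toddler's false positive can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the feature names, the weights, the 0.8 threshold, and the functions `score` and `label` are hypothetical stand-ins for whatever a trained model would actually learn.

```python
# Hypothetical feature weights, as if learned from one example: Rover.
DOGGIE_WEIGHTS = {"four_legs": 0.4, "tail": 0.3, "fur": 0.3}
ENTRANCE_THRESHOLD = 0.8  # assumed cutoff for admission to the category


def score(features, weights):
    """Sum the weights of every weighted feature the animal has."""
    return sum(w for name, w in weights.items() if name in features)


def label(features):
    """Return 'doggie' when the weighted score clears the threshold."""
    if score(features, DOGGIE_WEIGHTS) >= ENTRANCE_THRESHOLD:
        return "doggie"
    return "unknown"


rover = {"four_legs", "tail", "fur", "barks"}
horse = {"four_legs", "tail", "fur", "hooves", "mane"}
goldfish = {"tail", "fins"}

print(label(rover))     # doggie
print(label(horse))     # doggie (the false positive)
print(label(goldfish))  # unknown
```

The horse clears the bar because hooves and a mane carry no weight: the model was never shown anything that made those features matter.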
OK, rice cookers and Freddie the toddler: what does this have to do with supercomputing and artificial intelligence? These two examples show that, just like people, artificial intelligences are bound by what they have been taught or trained upon, and by the topology of their internal classification scheme. Supercomputers, or more specifically, computers with accelerated hardware, can increase the speed of the system that governs the artificial intelligence. With that increased speed and capacity, the system can train faster, on larger and more comprehensive sets as well as on more focused and deeper training sets. This in turn allows finer-grained discrimination of input and a richer classification topology (more classes for classification and more complex relationships between classes).
Going back to our definition of artificial intelligence from Accenture, “sense, comprehend, act,” we see that artificial intelligence is just an automated classification…oh, yes, you in the back row, what’s that…inference and prediction? You’ve been reading ahead. Yes, both inference and prediction appear to be forecasting into the future; however, in both cases, they are using the models they have been trained upon, and forecasting methodologies that they have been trained to use, so in fact, they are rearward looking and very similar if not identical to our classification example. Inference simply trims and optimizes the classifiers in response to use, and prediction merely extends the models that are constructed in some logical way. We could even go so far as to say that any inference or prediction is a function of the training given to the AI. So again, we come back to our models and our classification topology.
If we accept that AIs are topology bound, then to get closer to the truth (whatever that is), every set of data can be categorized against numerous different classification topologies. We can call these different topologies “facets”. This is where accelerated computing shines. Instead of attempting to classify against a single entrance requirement, multiple AIs can be trained against multiple entrance requirements that represent different potential semantic realities. For example, if the requirement is to classify types of ‘blues’, one logical set might name colors (navy blue, sky blue, baby blue, …), a second might cover musical genres (Delta blues, Chicago blues, Texas electric blues, …), and a third might have to do with Major League Baseball teams and players (Toronto Blue Jays, Vida Blue, …). Once the facet space has been identified and defined, a more complete AI solution can be generated. So, if something as simple as ‘blue’ requires at least three fully trained classifiers, more complex search spaces will require many more. This expansive requirement means one thing: more compute time for training. Accelerated computing is a natural fit here, because faster compute means more facets can be trained per unit time. More facets trained means more robust and complete AI coverage. With better semantic coverage, you are less likely to have your AI pointing to a “horse” and calling it “doggie.”
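As a rough sketch of what classifying against several facets might look like, the Python below scores a phrase against three toy facets for ‘blue’. To keep it short, simple keyword overlap stands in for fully trained classifiers, and the cue words, `FACETS`, `classify_by_facet`, and `best_facet` are all hypothetical names invented for this example.

```python
# Hypothetical facets: each maps a facet name to cue words that signal it.
FACETS = {
    "color": {"navy", "sky", "baby", "paint"},
    "music": {"delta", "chicago", "texas", "guitar"},
    "baseball": {"toronto", "jays", "vida", "pitcher"},
}


def classify_by_facet(text):
    """Score the text against every facet by counting matching cue words."""
    words = set(text.lower().split())
    return {name: len(words & cues) for name, cues in FACETS.items()}


def best_facet(text):
    """Pick the facet with the most matching cues, or None if none match."""
    scores = classify_by_facet(text)
    name = max(scores, key=scores.get)
    return name if scores[name] > 0 else None


print(best_facet("Delta blues guitar"))         # music
print(best_facet("Toronto Blue Jays pitcher"))  # baseball
print(best_facet("navy blue paint"))            # color
```

In a real system each facet would be its own trained model, which is exactly why the training bill grows with the number of facets.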