Ideal Architecture for AI/ML and Analytics

May 4, 2018

By: Tom McNeill


More in this series:
Machine Learning and Analytics in the High Performance Cloud


Ring the bell! School is back in session, and I’ve had some time to think and cogitate on the topics of Machine Learning, Artificial Intelligence, and Analytics in the High Performance Cloud. Now, granted, much of this deep study and cogitation was done with my toes in the sand on South Padre Island, while trying to avoid the jellyfish that wash up on the beach and keep sand out of my socks, so I’d appreciate your indulgence here.

OK, so, in previous blog posts we’ve gone through the High Performance Basics, and those were written in a rather no-nonsense manner to act as a series of reference materials. Well, this isn’t reference material; this is more of my thinking on these topics.

To recap where we are with these topics: in the area of Artificial Intelligence we’ve had blog posts entitled “Supercomputing and Artificial Intelligence”, “The Most Powerful AI and Machine Learning Platform In the World”, “What is Deep Learning and Neural Nets”, and many others. Suffice it to say, Artificial Intelligence, Deep Learning, Neural Nets, and all applications analytical are important in the realm of high performance computing. Just look at the amount of digital ink spent upon the topic! So how is this post going to be any different? Am I not just going to re-hash the same old exhausted territory?

Glad you asked, Grasshopper.  In this post I’m going to do something I’ve never done before.  I’m going to propose what I’m calling the “Ideal Architecture” for AI/ML and Analytics.  Buckle up, this is going to be fun.

Now, just to set the ground rules and expectations: I will not be suggesting or endorsing individual frameworks, hardware brands, or methods, as these all vary with the application. What I will be doing is suggesting the flow of work and mapping it back to the Viable Systems Model. Additionally, I will be highlighting some upcoming technology use cases that may prove to be interesting.

So, here we go! Artificial Intelligence/Machine Learning/Deep Learning workflows, as executed in the cloud, generally all walk the same path. The user has a bolus of data and wants to generate a model/category from this data so they can find items that “correspond to” or are “like” the elements in their model/category. Once a new member that conforms to the model or category is detected, some sort of logic is executed in reaction. I’ve simplified this in several blogs with the following bit of pseudocode…

Category(x) = {x|x satisfies some model}

foreach item in list {
    if Category(item) {
        Do something
        ...
    }
    else {
        Do something else
        ...
    }
}
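
In runnable form, the same idea might look like the following minimal Python sketch (here category is just a stand-in for whatever trained model you end up with, and the example strings are invented):

def category(x):
    # Stand-in for the trained model: does x satisfy it?
    return "blue" in x

for item in ["baby blue", "crimson", "texas electric blues"]:
    if category(item):
        print(item, "-> matches the category; do something")
    else:
        print(item, "-> no match; do something else")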

and then I go on to make the pithy statement that the hard part is deciding how to build Category(x).

It’s here in the construction and optimization of Category(x) where we get into the long-haired math, fancy Greek letters, and loads of linear algebra that make my head hurt but can be accelerated with GPUs, with the end result generally being accuracies in the mid- to high-80% range. It is at this point where cloud-based high performance platforms like Nimbix really shine. I’ve even gone as far as to suggest that much of this ambiguity is actually a model granularity problem and can be resolved by building several highly specific categories instead of one single category that attempts to encompass all aspects of a term or topic.
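
To make that granularity suggestion concrete, here is a hedged sketch using scikit-learn: rather than one classifier for an umbrella category, train one small binary classifier per narrow sense. The toy documents and sense names below are invented for illustration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Invented toy corpus: each narrow sense gets its own binary model
# instead of one broad, ambiguous "blue" category.
senses = {
    "blue_color":  ["the sky was a pale baby blue", "paint the wall light blue"],
    "blues_music": ["texas electric blues guitar solo", "a twelve bar blues riff"],
    "blue_jays":   ["the blue jays won the pennant", "a blue jay at the feeder"],
}

all_docs = [doc for docs in senses.values() for doc in docs]
vectorizer = TfidfVectorizer().fit(all_docs)
X = vectorizer.transform(all_docs)

models = {}
for sense, docs in senses.items():
    y = [doc in docs for doc in all_docs]  # True for this sense, False for the rest
    models[sense] = LogisticRegression().fit(X, y)

# Score a new snippet against every narrow category independently.
snippet = vectorizer.transform(["electric blues from texas"])
for sense, model in models.items():
    print(sense, round(model.predict_proba(snippet)[0][1], 3))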

So, where do we go from here? Well, there are two places to go: the first is to talk about the construction of the data set that we train upon; the second is to talk about the new technologies and architectures that enable the actual deployment of these created models to devices on the edge.

The ideal architecture model

I like to think and explain things top-down, so I’ll start with the full diagram of the architecture and explain it as we go.

[Figure: The ideal architecture model]

Well, there we have it, there’s the model. Thing of beauty, isn’t it? In my mind it encompasses the circle of knowledge generation and copes with evolution and change quite effectively. You could almost call it the circle of life for artificial intelligence and machine learning. OK, let me see if I can explain it to you.

As we’ve tangentially discussed in the first section, a model/Category is only as good as the data that is used to build it; at this point, the method by which the model is constructed is utterly inconsequential. And if you are following along on the diagram, we’re going to start at “Semantic tagging, metadata creation,” an awkward place to start, but if I do my job it will all make sense.

So let’s start with a question: “How do we assemble a bolus of material to build our model/Category?” Some folks might say, “You have a pile of data, you select at random 70% of your data to train with, then test with the other 30%,” and they will provide many references supporting this. I agree with this methodology up to a point; where I differ is when it comes to real-world sets, especially material that is based on any sort of natural language. When dealing with natural languages you are confronted with a degree of ambiguity based upon time, place, context, dialect, and other factors. So, let’s trot out my favorite example, the term or concept “blue.” Let me ask you a question,

“How many blues are there?”  Which of these answers are wrong: baby blue, Texas electric blues, Blue Jays?  

Trick question: they’re all correct. The only difference is the metadata that can be attached to them to indicate semantic or descriptive function. For example, ‘baby blue’ might have metadata indicating that it is a color, or a type of gourmet sweet corn, or a whiskey distilled outside of Waco, Texas from an heirloom variety of blue corn grown here in Texas (anyone ever have blue corn tortilla chips?). The same is true for all of the other examples: the meaning of ‘blue’ is dependent upon context.

The result is that in order to create a meaningful bolus to generate a model, we first have to define what our semantic landscape looks like prior to training our AI. To do this we would need a variety of ontologies to generate the appropriate metadata and tag the material appropriately. Otherwise, we’ll be trying to generate categories using colors of paint and whiskey. Imagine what happens if we go to ‘Blue Jays’: we have birds and baseball teams! In some cases this might be fine; in others, it would introduce too much ambiguity into your model. This is especially true if you are trying to automate or continuously train a model.

In other words, a fundamental source of ambiguity in models is the result of semantic irregularities, and the concept of semantic irregularity can be extended to encompass visual as well as written and audio data. I make the same argument, albeit much more esoterically, with classical music and composers “quoting” one another in their pieces; jazz is even more replete with examples, and rap and EDM are just blatant in borrowing.
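
As a hedged sketch of what the tagging step might produce (the sense inventory and cue words below are invented, not drawn from any real ontology), each occurrence of an ambiguous term gets metadata for the sense its surrounding context best supports:

# Hypothetical sense inventory for "blue"; in practice this would be
# generated from the ontologies discussed below.
SENSES = {
    "blue": [
        {"sense": "color",         "cues": {"baby", "sky", "paint"}},
        {"sense": "music_genre",   "cues": {"texas", "electric", "guitar"}},
        {"sense": "baseball_team", "cues": {"jays", "toronto", "pennant"}},
    ],
}

def tag(tokens):
    """Attach sense metadata to each known term, scored by context cues."""
    context = set(tokens)
    tagged = []
    for tok in tokens:
        candidates = SENSES.get(tok, [])
        # Pick the sense whose cue words overlap the surrounding context most.
        best = max(candidates, key=lambda s: len(s["cues"] & context), default=None)
        tagged.append((tok, best["sense"] if best else None))
    return tagged

print(tag("texas electric blue guitar".split()))
# [('texas', None), ('electric', None), ('blue', 'music_genre'), ('guitar', None)]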

Once the data has been appropriately tagged, it is then possible to create clusters of related material. The formation of these clusters is driven by the ontologies chosen to represent the facets of the term space you are interested in modeling. I know what you are thinking: “What’s an ontology?” The simple answer is a taxonomy of concepts or synonyms with defined relationships between them: ‘is a’ defines a kind, ‘has a’ defines a property, and other controlled verbiage can define further relationships, though ‘is a’ and ‘has a’ alone can be used to construct just about any relationship.

For example…

Baby Blue is a color

Baby Blue has an RGB color code of 137, 207, 240
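
In code, entries like these might be kept as simple subject-relation-object triples (a minimal sketch; a production system would more likely use RDF/OWL tooling):

# Minimal triple representation: (subject, relation, object).
ontology = [
    ("baby_blue", "is_a",  "color"),
    ("baby_blue", "has_a", ("rgb_color_code", (137, 207, 240))),
    ("baby_blue", "is_a",  "sweet_corn_variety"),
    ("baby_blue", "is_a",  "whiskey_brand"),
]

def senses_of(term):
    """All 'is a' parents of a term, i.e. its possible semantic roles."""
    return [obj for subj, rel, obj in ontology
            if subj == term and rel == "is_a"]

print(senses_of("baby_blue"))
# ['color', 'sweet_corn_variety', 'whiskey_brand']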

OK, so now we’ve figured out how to automate the construction of our bolus. We’ve got our ontologies that describe our space. We can use NLP or other appropriate technology to tag our data. From here we can use our 70/30 rule and go on our happy way, right? Well, not so fast, Grasshopper. If your goal is to automate construction of your AI and have it learn over time, you’ll need it to do two more things. The first is talk to your data sources, which can be internal or external to your enterprise. The second is to generate some sort of automated mechanism for segregating your data into boluses (boli?) and describing those boluses. I suggest starting with K-means clustering, but there are many other methods that can be employed; kernel methods are also useful. Then you can select boluses based on some sort of criteria and either combine them to form a set to train and validate with, or train separately with each bolus and logically hook the newly created categories together. It all depends upon what you are trying to accomplish. But I would argue that most AI/ML/DL applications are going to be involved in some sort of continuous training and will need this type of mechanism to supply them with new and useful material to train upon.
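
Here is one way to sketch that segregation step with scikit-learn’s KMeans (the documents, cluster count, and split are all placeholders, assuming your tagged material can be treated as plain strings):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

# Toy stand-ins for the tagged material from the previous step.
documents = [
    "baby blue paint for the nursery wall",
    "mixing a pale blue color swatch",
    "texas electric blues guitar solo",
    "a slow twelve bar blues jam",
    "blue jays win the home opener",
    "blue jays trade for a new pitcher",
]

# Vectorize and cluster the corpus into candidate boluses.
X = TfidfVectorizer().fit_transform(documents)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Group documents by cluster label to form the boluses.
boluses = {}
for doc, label in zip(documents, kmeans.labels_):
    boluses.setdefault(label, []).append(doc)

# Then apply the 70/30 rule to a chosen bolus (or a combination of them).
train_docs, test_docs = train_test_split(documents, train_size=0.7)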

If we continue in a clockwise direction, you’ll see that we’ve moved from “Semantic tagging, metadata creation” through “Scored topic for learning” and “Semantic aggregation”; this is where we cluster our data. At this point we are where the typical AI/ML/DL training workflow in the high performance computing environment begins, at “Model training,” and according to the graphic below, we are now in the blue area, which is well supported by a variety of tools and platforms.

If we continue clockwise to “Inference testing,” we see two things: first, a red arrow back to our first step, “Semantic tagging and metadata creation,” and second, a dotted arrow to “Convert Model to RTL….” Let’s take the red arrow first; here is the first bit of data re-use and self-correction (actually cybernetics, but I’ll get into that later with the Viable Systems Model). The red arrow indicates a path for data that has failed classification after training to be tagged and pushed back into “Semantic tagging.” At this point, steps may have to be taken to either remove that piece of data from further consideration or explore changes to the NLP steps to enable correction. The dotted blue arrow, however, is the topic of our next section, “Moving to the edge.”
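
A hedged sketch of that red-arrow path follows (the stub model and the item/label pairs are hypothetical placeholders for whatever your pipeline actually provides):

# Hypothetical stand-in: any classifier exposing a predict() method.
class _StubModel:
    def predict(self, X):
        return ["blue" if "blue" in x else "other" for x in X]

def inference_test(model, labeled_items, retag_queue):
    """Route items that fail classification back to semantic tagging."""
    failures = [item for item, expected in labeled_items
                if model.predict([item])[0] != expected]
    # Red arrow: push each failure back into "Semantic tagging" so it can be
    # re-tagged, removed from consideration, or used to adjust the NLP steps.
    retag_queue.extend(failures)
    return failures

retag_queue = []
items = [("baby blue paint", "blue"), ("texas blues jam", "blue"), ("red wagon", "blue")]
print(inference_test(_StubModel(), items, retag_queue))  # -> ['red wagon']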

[Figure: Viable Systems Model]


