Understanding the Different Types of Machine Learning Algorithms
Neural networks. Support vector machines. Random forests. Naïve Bayes. Logistic regression. Deep learning. Gradient boosting. Decision trees… the list of machine learning algorithm types is long and filled with confusing buzzwords.
What machine learning algorithms have in common is that they analyse historical data to work out what’s going on (i.e. relevant relationships and patterns), and create a “model” based on that. The model can then be applied to make judgements about new or future cases.
The actual algorithms can be thought of as commodities, a rich set of tools that data scientists can pick from as they tackle demanding analytical and AI projects. What’s more important than the specific algorithms is to understand the capabilities they provide to tackle a range of data science tasks:
- Classification: Distinguishing between outcomes – for example, equipment that is about to fail versus equipment running in a normal, stable state.
- Scoring: Adding a measure of likelihood to a classification. A piece of equipment about to fail might have a score of 0.9 or higher, whereas a low score such as 0.25 indicates low risk of imminent failure.
- Estimation/Forecasting: Predicting a numerical value – for example, predicting the level of oil in water in 12 or 24 hours’ time.
- Clustering: Grouping together similar cases. Clustering algorithms can be used, for example, to automatically identify different operating modes of specific equipment.
- Association analysis: Finding sets of events which tend to occur together, such as patterns of closed and open valves.
- Sequence analysis: Finding sets of events which tend to happen, in order, over time – for example, the order in which wells are brought on- or off-line.
- Anomaly detection: Recognising abnormal incidents or cases. An anomaly detection algorithm might be used, for example, to automatically learn the normal operating modes of a system and hence be able to flag situations that deviate from these.
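To make the last capability concrete, here is a minimal sketch of anomaly detection in Python. It assumes a single sensor whose normal behaviour can be summarised by a mean and standard deviation; the readings, the function names `fit_normal_profile` and `is_anomaly`, and the three-standard-deviation threshold are all illustrative assumptions, not a method described above.

```python
import statistics

def fit_normal_profile(readings):
    """Learn a very simple 'normal' profile: the mean and standard deviation."""
    return statistics.mean(readings), statistics.stdev(readings)

def is_anomaly(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mean, std = profile
    return abs(value - mean) > threshold * std

# Hypothetical temperature readings taken while the equipment ran normally.
normal_readings = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2, 70.3, 69.7]
profile = fit_normal_profile(normal_readings)

# Check two new readings against the learned profile.
flags = [is_anomaly(value, profile) for value in [70.1, 75.0]]
print(flags)
```

Real systems usually have several operating modes and many sensors, so a practical detector would need a richer profile than one mean and one spread.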
Machine learning is often split into techniques which are “supervised”, where previous cases with known outcomes are the basis for learning, or “unsupervised”, where the algorithms are given no known outcomes and are tasked with learning which patterns exist. Supervised learning is used for classification, scoring and estimation/forecasting. Clustering, association analysis, sequence analysis and anomaly detection typically use unsupervised learning.
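The difference can be sketched in a few lines of Python. The supervised function below learns from cases with known outcomes (a nearest-neighbour lookup), while the unsupervised one is given only raw values and finds two groups by itself (a one-dimensional k-means with two centres). The vibration readings, labels and function names are hypothetical, chosen only for illustration.

```python
def nearest_neighbour_predict(labelled_cases, new_value):
    """Supervised: return the known outcome of the most similar past case."""
    closest = min(labelled_cases, key=lambda case: abs(case[0] - new_value))
    return closest[1]

def two_means(values, iterations=10):
    """Unsupervised: split unlabelled values into two groups (1-D k-means, k=2)."""
    centres = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for value in values:
            nearer_first = abs(value - centres[0]) <= abs(value - centres[1])
            groups[0 if nearer_first else 1].append(value)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups

# Supervised: past vibration readings with known outcomes.
history = [(0.2, "normal"), (0.3, "normal"), (0.9, "failing"), (1.1, "failing")]
prediction = nearest_neighbour_predict(history, 0.95)

# Unsupervised: the same kind of readings, but with no outcomes attached.
centres, groups = two_means([0.2, 0.3, 0.25, 0.9, 1.1, 1.0])
print(prediction, groups)
```

Note that the clustering step finds two groups but cannot name them; deciding that one group means "normal" and the other "failing" still takes domain knowledge.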
The models produced by machine learning are used in two ways: to give insights to human decision makers so they can better understand how systems operate, and to make accurate and robust predictions/judgements to feed directly into improved operational/business decision making.
Different algorithms have different strengths and limitations. Rule induction, for example, provides classification rules which are easy to read and understand, but gives “lumpy” scoring (many similar cases given identical scores). Neural networks, on the other hand, provide fine-grained scoring (so cases can be ranked and compared), but their operation is opaque and it is hard to understand exactly how they come to their conclusions.
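A small Python sketch can illustrate the "lumpy" versus fine-grained contrast. Neither scorer here is learned from data: the rules use hand-picked thresholds standing in for induced rules, and a single sigmoid unit with made-up weights stands in for a neural network. All names and numbers are assumptions for illustration.

```python
import math

def rule_score(temperature, vibration):
    """Readable rules, but only three possible scores (hypothetical thresholds)."""
    if temperature > 90 and vibration > 0.8:
        return 0.9
    if temperature > 90 or vibration > 0.8:
        return 0.5
    return 0.1

def neuron_score(temperature, vibration):
    """One sigmoid unit with made-up weights: a continuous score per case."""
    z = 0.1 * (temperature - 90) + 5.0 * (vibration - 0.8)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical (temperature, vibration) readings for five pieces of equipment.
cases = [(85, 0.5), (88, 0.6), (89, 0.7), (92, 0.9), (95, 1.0)]
rule_scores = [rule_score(t, v) for t, v in cases]
neuron_scores = [neuron_score(t, v) for t, v in cases]

print(len(set(rule_scores)), "distinct rule scores")
print(len(set(neuron_scores)), "distinct neuron scores")
```

The rules give several cases the same score, so those cases cannot be ranked against each other, whereas the continuous scorer gives every case its own score. The trade-off is that the weights say far less to a human reader than the rules do.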
The maxim “two heads are better than one” applies to models as well as human decision makers.
“Ensembles” are sets of two or more models used together (in the style of a panel of experts) to improve the accuracy and reliability of their conclusions; because the models are typically produced by different algorithms, they will tend to make different types of errors in their learning, and combining them helps cancel these out. In deploying an AI solution, data scientists will often test and evaluate not only individual models, but also multiple configurations of ensembles to find the one(s) which give(s) the best performance.
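As a rough illustration, one simple way to combine models is a majority vote. In the Python sketch below, three hypothetical "models" are plain threshold functions rather than trained models, and the field names and thresholds are invented; voting is only one of several ways an ensemble might be combined.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common prediction in the list."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical models, each looking at a different signal.
def model_a(case):
    return "fail" if case["temperature"] > 90 else "ok"

def model_b(case):
    return "fail" if case["vibration"] > 0.8 else "ok"

def model_c(case):
    return "fail" if case["pressure"] < 2.0 else "ok"

def ensemble_predict(models, case):
    """Ask every model for its prediction and combine them by majority vote."""
    return majority_vote([model(case) for model in models])

models = [model_a, model_b, model_c]
case = {"temperature": 95, "vibration": 0.5, "pressure": 1.5}
result = ensemble_predict(models, case)
print(result)
```

Using an odd number of models with two possible outcomes avoids tied votes. With scores rather than labels, averaging the scores would be a natural alternative.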