Machine learning (ML) has disrupted both technology and business. Machines are increasingly viewed not as mere machines but as intelligent partners that may soon outperform humans in informed decision-making. Today, ML, a subfield of artificial intelligence (AI), has grown into a billion-dollar industry that is projected to grow at a 42.08% CAGR between 2018 and 2024. Still, its full potential is yet to be explored as more ML tools and technologies emerge. Businesses large and small are adopting ML to leverage the massive volumes of data in their custody and gain insights for efficient operations and business growth.
Machine learning is a data analytics technique that teaches machines to learn from data and improve with experience. Machine learning algorithms are programs that learn patterns from data rather than following explicitly programmed rules. They analyze input data to make predictions, and their predictive performance improves as new data samples are fed to them. This explains why applying ML algorithms is also commonly known as predictive modeling or predictive analytics.
Why does machine learning matter?
Machine learning algorithms are key problem-solvers in most if not all industries. Consider a financial institution that employs machine learning algorithms for credit scoring, a healthcare research facility that uses machine learning in drug discovery and disease outbreak detection, treatment facilities that use ML for tumor detection and DNA sequencing, or manufacturing, automotive, and aerospace industries that have adopted machine learning for predictive maintenance. Other ML technologies like image processing, computer vision, and natural language processing are important for face recognition, motion detection, and voice recognition applications.
Three categories of machine learning algorithms
Machine learning algorithms fall into three broad categories:
- Supervised learning
Supervised learning uses data sets whose responses (labels) are already known and learns from them to build ML models that can make predictions for new data samples. The models are trained on labeled (training) datasets until they make predictions to the desired level of accuracy, and only then are they applied to newer datasets. This type of machine learning is useful in scenarios where known data is available for the values being predicted.
Supervised learning algorithms are classified further into two categories based on the type of problem they solve:
- Classification techniques
- Regression techniques
Classification techniques
Classification models are best applied to data that can be labeled or sorted into distinct categories or classes. They are used to predict discrete values, that is, which known class a new data sample belongs to. Classifiers can be:
- Binary classifiers, in which output is classified into two distinct classes, for instance genuine and spam mail, cancerous or benign tumors, or positive or negative sentiments.
- Multi-class classifiers, which classify data into more than two distinct classes, for instance the species of iris plants.
Common classification algorithms
- Naive Bayes. This classification model is based on Bayes’ theorem and assumes that all the attributes of a data point are independent of each other given the class. It is among the simplest models to build and a good model to use on very large datasets.
- Support vector machine (SVM). An SVM finds the boundary (hyperplane) that separates the classes with the widest possible margin. SVMs can be used for both classification and regression analyses.
- K-nearest neighbor (KNN). The k-nearest neighbor algorithm estimates the likelihood of a data point belonging to one class or another by looking at the majority class among the data points closest to it, in other words its k neighbors. Because the model stores the training data instead of fitting parameters, this comparison against the data set is repeated every time new input data is introduced. The algorithm can be used to solve both classification and regression problems.
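As a minimal sketch of the k-nearest neighbor idea described above, the snippet below classifies a new point by majority vote among its k closest labeled points. The function name `knn_predict`, the Euclidean distance, and the toy data are assumptions made for illustration, not part of any particular library.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labeled training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most frequent label among those neighbors is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up toy data: two numeric features per sample.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.5), (7.0, 6.0), (6.5, 7.0)]
labels = ["benign", "benign", "benign", "cancerous", "cancerous", "cancerous"]

prediction = knn_predict(points, labels, query=(6.2, 6.1), k=3)
print(prediction)
```

Note that every prediction scans the whole training set, which is why this approach becomes slow on large datasets unless an index structure is added.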
Regression techniques
Regression analysis is a supervised machine learning technique that predicts a continuous outcome (the dependent variable) from one or more independent variables. Its aim is to analyze the relationship between the dependent and independent variable(s). It is best applied to forecasting trends, patterns, and time series, and to examining cause-effect relations between variables, for instance in weather forecasting, electricity load forecasting, and sales forecasting.
Common regression analysis models include:
- Linear regression. This is the most common regression analysis model. It is used to determine the output (dependent) value based on the input (independent) values. It quantifies the relationship between dependent and independent variables by fitting the line that lies closest to the plotted data points.
- Logistic regression. Logistic regression differs from linear regression in that linear regression is applied when predicting continuous values, e.g. temperature. Logistic regression, on the other hand, estimates the probability of an event occurring from historical data. Its outcomes are therefore mapped to discrete values, e.g. yes/no or positive/negative, which makes it best suited for binary data classification.
- Decision trees. A decision tree is in essence a flowchart with a tree-like structure in which data is repeatedly branched (split) based on certain attributes (independent variables). Each internal node represents a test applied to a variable, while each leaf represents a possible outcome or decision. Decision trees are applicable to both classification and regression techniques.
- Neural networks. Artificial neural networks are loosely modeled on the human brain and comprise numerous interconnected neurons organized in layers. These neurons work together to process information and thereby learn through experience and by example. They are used for analyzing non-linear relationships in high-dimensional datasets.
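The line fitting described under linear regression above can be sketched with ordinary least squares for a single input variable. The function name `fit_line` and the sales figures are hypothetical; the data is made up so that it lies exactly on the line y = 2x + 1.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: month number (independent) and sales (dependent).
months = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6.0 + intercept  # predicted sales for month 6
print(slope, intercept, forecast)
```

Real data rarely falls on a perfect line, so in practice the fitted line minimizes the squared vertical distances to the points instead of passing through all of them.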
- Unsupervised learning
Unsupervised learning discovers hidden patterns from unlabeled datasets. Unsupervised learning is best applied to clustering populations.
The most common unsupervised technique is:
Clustering
Clustering is used in exploratory data analysis, often alongside dimensionality reduction, to discover hidden patterns or natural groupings in a dataset. Cluster analysis models are commonly applied to market research, threat detection, and object recognition.
Common clustering algorithms are:
- K-means. K-means models partition a dataset into a specific number of clusters represented by the variable k. Each data point is assigned to the cluster whose center (the mean of its members) is nearest, and the centers are recomputed until the assignments stop changing.
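- Apriori algorithm. The Apriori algorithm learns correlations and relations between variables in datasets to discover association rules. Strictly speaking, it is an association rule learning method rather than a clustering algorithm, but it is likewise applied to unlabeled data.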
- Hierarchical clustering. Hierarchical clustering groups objects with similar attributes into a hierarchy of nested clusters, built either bottom-up by merging the most similar clusters or top-down by splitting larger ones.
- Neural networks. Some neural networks, such as self-organizing maps, are also commonly used for clustering.
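To make the k-means idea above concrete, here is a minimal sketch of the standard assign-then-update loop (often called Lloyd's algorithm). The function name `k_means`, the random choice of starting centroids, the fixed iteration count, and the toy points are all assumptions for illustration.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Group `points` into k clusters; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

# Made-up, well-separated toy data with two natural groups.
data = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
centroids, assignments = k_means(data, k=2)
print(sorted(centroids), assignments)
```

The result depends on the starting centroids, so on less cleanly separated data it is common to run the procedure several times and keep the best grouping.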
- Reinforcement learning
Reinforcement learning involves training an agent within a complex, uncertain environment to make a sequence of decisions. Through trial and error, the agent takes actions that are either rewarded or ‘punished’. Over time, the agent adapts to taking the rewarded actions in order to maximize its total reward.
Common reinforcement learning algorithms
- Markov Decision Process (MDP). The Markov decision process involves an agent (decision-maker) and an environment with which it interacts. At discrete time steps, the agent takes an action, and the environment responds with a positive or negative reward and moves to a new state. Strictly speaking, the MDP is the mathematical framework that expresses the environment in reinforcement learning, and the learning algorithms are built on top of it.
- Q-learning. Q-learning finds the best action to take in each individual state without needing a model of the environment; it learns purely from the rewards it observes while interacting with the environment. It is thus considered a model-free, off-policy reinforcement learning algorithm, meaning it can learn the best policy even while following a different, more exploratory one.
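The following is a minimal sketch of tabular Q-learning, assuming a hypothetical five-cell corridor in which the agent starts in cell 0 and is rewarded only on reaching cell 4. The environment, the function names `step` and `train`, and the values chosen for `alpha`, `gamma`, and `epsilon` are illustrative assumptions.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal (toy environment)
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right

def step(state, action):
    """Environment dynamics: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: random action
            else:
                # Exploit: best known action, ties broken at random.
                a = max(range(2), key=lambda i: (q[state][i], rng.random()))
            next_state, reward, done = step(state, ACTIONS[a])
            # Off-policy update: bootstrap from the best next action,
            # whatever action the exploring agent actually takes next.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal cells: the action index with the higher value.
policy = [max(range(2), key=lambda i: q_table[s][i]) for s in range(N_STATES - 1)]
print(policy)
```

The agent never consults a model of the corridor when choosing or evaluating actions; it only observes the rewards and next states returned by `step`, which is what makes the method model-free.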
Which machine learning algorithm should you use?
Each of the machine learning algorithms explained above takes a unique approach in learning from data. For this reason, it is important to select the appropriate algorithm because there is no one-size-fits-all. Importantly, to select the right algorithm, one needs to consider:
- The type and size of the data being worked with
- Quality of data
- Available time to train the models
- The insights to be drawn from the data
- Application of data insights
Also important is experimentation with the models to evaluate how well they suit the situation they are being applied to. This may require some time, but it ultimately leads to more accurate predictions, something that every business needs.
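One simple form of the experimentation mentioned above is to hold back part of the labeled data and compare candidate models on it. The sketch below assumes two hypothetical candidates (a majority-class baseline and a 1-nearest-neighbor model), made-up one-feature data, and accuracy as the yardstick; all names are illustrative.

```python
import math
import random
from collections import Counter

def split_rows(rows, test_size, seed=0):
    """Shuffle the labeled rows and hold back `test_size` of them for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]  # (train, test)

def fit_majority(train_rows):
    """Baseline: always predict the most common training label."""
    majority = Counter(label for _, label in train_rows).most_common(1)[0][0]
    return lambda features: majority

def fit_nearest_neighbor(train_rows):
    """1-nearest-neighbor: copy the label of the closest training row."""
    def predict(features):
        _, label = min(train_rows, key=lambda row: math.dist(row[0], features))
        return label
    return predict

def accuracy(predict, test_rows):
    """Fraction of held-out rows whose label is predicted correctly."""
    hits = sum(1 for features, label in test_rows if predict(features) == label)
    return hits / len(test_rows)

# Made-up data: one feature per row, two well-separated classes.
rows = [((float(x),), "low") for x in range(5)]
rows += [((float(x),), "high") for x in range(20, 25)]
train_rows, test_rows = split_rows(rows, test_size=3)

candidates = {"majority baseline": fit_majority, "nearest neighbor": fit_nearest_neighbor}
scores = {name: accuracy(fit(train_rows), test_rows) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Scoring on rows the models never saw during training is what keeps the comparison honest; a model evaluated on its own training data would look better than it is.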