Machine Learning Algorithms form the core of machine learning processes, providing the computational methods necessary for systems to learn from data and make predictions or decisions. These algorithms can be categorized into different types, each suited for specific tasks such as regression, classification, and clustering. Understanding these algorithms is crucial for effectively implementing machine learning solutions.

Common Types of Machine Learning Algorithms

Machine Learning Algorithms can be broadly categorized into three main types:

  • Regression - Used to determine a numerical value, predicting “how much” or “how many.”
  • Classification - Assigns labels or categories to data points.
  • Clustering - Groups similar data points together based on patterns or characteristics.
graph TD
    A(ML Algorithms) -->|Used to determine a Numerical value| B(Regression)
    A -->|"Used to determine a Category (Categorization)"| C(Classification)
    A -->|Used for Grouping Similar data points| D(Clustering)


Regression Algorithms

Regression Algorithm calculated the co-relation between the input features and output(identifying the dependant and independent values of the input for the output ). Its prediction is always in continuous spectrum( its always a number). Regression algorithms are employed when the goal is to predict a numerical value. Some commonly used regression algorithms include:

  • Decision Tree Regression - Divides inputs into subsets to make decisions.
  • Neural Network Regression - Classification problems (hand written digits classification)
  • LASSO Regression - Its a form of Linear Regression that shrinks the input data values (good for true predictions rather than inferences)
  • Ridge Regression - Its a method for analyzing multiple regressions and very similar to LASSO (good for true predictions rather than inferences)
  • Elastic Net Regression - Hybrid of LASSO and Ridge
  • Linear Regression - Modelling the relation between the Input features and outputs
  • Polynomial Regression - Relationship between the dependent and independent variables are modelled to nth degree
  • Stepwise Regression - form of fitting regression model where the choice of predictive features is automated and determined on the fly
  • Logistic Regression - it is used to detect success or failure of an event (Note: Usually used for classification tasks)

Deciding on the Best Regression Algorithm

Choosing the most suitable regression algorithm involves considering factors such as data exploration, goodness of fit, the objective of the model, and cross-validation.

Classification Algorithms

Classification algorithms are designed to assign labels or categories to data points. They are commonly used in tasks such as image recognition, spam detection, and sentiment analysis. Based on the training data certain boundary conditions are determined by model and then these are applied to predict the target classification.

  • Classifier - algorithm that maps the data to the categories
  • Classification Model - which predicts the classification of the target data
  • Feature - individual measurable property

Types of classification algorithms

  • Binary Classification - Two class classification
  • Multi-class Classification - More than two classifications
  • Multi-label Classification - More than one label for the data

Different classification algorithms

  • Support Vector Machine Algorithm - Each item is plotted in n-dimension space (n: no of features), the vector lines are drawn between these and the best lines that separate the classifications into even distributions are picked to detect the classification.
  • Random Forest Algorithm - these use decision trees and are created on the fly
  • Stochastic Gradient Descent Algorithm -
  • Logistic Regression Algorithm - Used for binary classification usually, has a hypothesis based on sigmoid curve
  • Naive Bayes Algorithm - Detects the independence of the features , like if one feature is completely unrelated to others. Text based usages like spam
  • Decision Tree Algorithm - Used for prediction in general, outcomes are result of various decisions
  • K-nearest neighbors algorithm - Used for pattern recognition , data mining and intrusion detection, the algorithm based on the features of the input data predicts the classification we already know.

Clustering Algorithms

Clustering algorithms group similar data points together based on patterns or characteristics. This is useful in tasks such as customer segmentation or anomaly detection.

Its an unsupervised learning as it uses unlabelled data and then group them logically into clusters.

Types of clustering algorithms

  • Density-based Algorithm - Groups based on high concentration of data points
  • Distribution-based Algorithm - Groups based on probability that it belongs to certain cluster.
  • Centroid-based algorithms - Groups based on Geographic center of population of data points
  • Hierarchical-based algorithms - Based on branches

Different clustering algorithms

  • K-means clustering algorithm - its a centroid based algorithm , works better with smaller datasets , tries to minimize the variance between data points
  • DBSCAN clustering algorithm - Density based algorithm
  • Gaussian Mixture algorithm - Fix inconsistently distributed data and then predict the distribution
  • BIRCH algorithm - works well Large Dataset, its a Hierarchical-based algorithm.
  • Affinity propagation clustering algorithm -
  • Mean-Shift clustering algorithm - mode-seeking algorithm
  • OPTICS Algorithm
  • Agglomerative hierarchy clustering algorithm - based on similarities merges datapoints into clusters