Unit 2 - ML - SRM
• As observed in Fig 1, the red PDF curves fit the data poorly, hence their likelihood is also lower. The green PDF curve has the maximum likelihood, as it fits the data best. This is how the maximum likelihood estimation method works.
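A minimal sketch of this idea (assuming a 1-D Gaussian model and a small made-up sample) compares the log-likelihood of the same data under several candidate parameter settings; the best-fitting candidate scores highest:

import numpy as np
from scipy.stats import norm

# Made-up 1-D sample (hypothetical, for illustration only)
data = np.array([4.8, 5.1, 5.3, 4.9, 5.2, 5.0])

# Candidate (mean, std) settings for the Gaussian model
for mu, sigma in [(3.0, 1.0), (5.0, 1.0), (7.0, 1.0)]:
    # Log-likelihood = sum of log PDF values over the sample
    ll = norm.logpdf(data, loc=mu, scale=sigma).sum()
    print(f"mu={mu}, sigma={sigma}: log-likelihood={ll:.2f}")

# For a Gaussian, the MLE of the mean is simply the sample mean
print("MLE of mu:", data.mean())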
Least Squares Method
Problem 1:
• Find the line of best fit for the data points (1, 3), (2, 4), (4, 8), (6, 10), (8, 15) using the least squares method.
SOLUTION
• X̄ = (1 + 2 + 4 + 6 + 8)/5 = 4.2
• Ȳ = (3 + 4 + 8 + 10 + 15)/5 = 8
• The slope of the line of best fit is calculated from the formula as follows:
• m = Σ(x − X̄)(y − Ȳ) / Σ(x − X̄)² = 55/32.8 ≈ 1.68
• c = Ȳ − mX̄
• c = 8 − 1.68 × 4.2 = 0.94
• Thus, the equation of the line of best fit becomes y = 1.68x + 0.94.
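As a quick check of this arithmetic, here is a minimal sketch (assuming NumPy) that fits the same five points with the formulas above:

import numpy as np

# Data points from the worked example above
x = np.array([1, 2, 4, 6, 8])
y = np.array([3, 4, 8, 10, 15])

# m = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2), c = y_mean - m*x_mean
x_mean, y_mean = x.mean(), y.mean()
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
c = y_mean - m * x_mean
print(f"m = {m:.2f}, c = {c:.2f}")  # m ≈ 1.68, c ≈ 0.96 (the slide's 0.94 comes from rounding m to 1.68 first)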
Problem 2:
• Find the line of best fit for the following data of heights and weights
of students of a school using the least squares method:
Ridge Regression
Bayesian Linear Regression
Bayes Theorem
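In the Bayesian treatment of regression below, Bayes' theorem combines a prior over the parameters θ with the data likelihood to give a posterior; stated generally:

P(θ | D) = P(D | θ) · P(θ) / P(D)

where P(θ) is the prior, P(D | θ) the likelihood, P(D) the evidence, and P(θ | D) the posterior.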
What is the use of Bayesian Linear Regression in machine learning?
• Bayesian Linear Regression is used in machine learning for several
important reasons, particularly when dealing with uncertainty, small
datasets, or when prior knowledge about the parameters is available.
• Bayesian linear regression extends the traditional linear regression
framework by incorporating prior beliefs about the parameters,
resulting in a full posterior distribution for the parameters. This
approach offers several advantages, including better handling of
uncertainty and robustness to overfitting.
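A minimal sketch of this (assuming the standard conjugate setup: a Gaussian prior N(0, α⁻¹I) on the weights and a known noise precision β; the toy data below is made up) computes the full posterior over the weights in closed form rather than a single point estimate:

import numpy as np

# Toy data: y ≈ 2x + 1 plus noise (hypothetical, for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=20)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=20)

Phi = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]
alpha, beta = 1.0, 4.0  # prior precision and noise precision (assumed known)

# Posterior over weights: N(m_N, S_N) with
# S_N^{-1} = alpha*I + beta*Phi^T Phi and m_N = beta * S_N Phi^T y
S_N_inv = alpha * np.eye(2) + beta * Phi.T @ Phi
S_N = np.linalg.inv(S_N_inv)
m_N = beta * S_N @ Phi.T @ y

print("posterior mean of [intercept, slope]:", m_N)
print("posterior covariance:\n", S_N)

The posterior mean plays the role of the usual point estimate, while the posterior covariance quantifies the parameter uncertainty mentioned above.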
Key Concepts of Bayesian Linear Regression
Linear models for classification
• Discriminant function
• Probabilistic generative models
• Probabilistic discriminative models
• Laplace approximation
• Bayesian logistic regression
Linear Models for Classification: Discriminant Function
• Linear models are commonly used for classification tasks, where the
goal is to assign inputs (data points) to one of several possible classes.
A discriminant function is a type of linear model used to separate
different classes by defining a decision boundary.
• Example: Classifying Points Based on Two Features
Problem Setup:
• Suppose we have a dataset with two features (x1 and x2) and two classes
(Class 1 and Class 2). The goal is to classify a new data point into either
Class 1 or Class 2 using a linear discriminant function.
• Here’s a small dataset of points to classify (a sketch of classifying them with a linear discriminant follows the table):

x1   x2   Class
2    3    ?
3    4    ?
4    2    ?
5    3    ?
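The slides do not specify the discriminant's weights, so the sketch below assumes a hypothetical boundary y(x) = w·x + b with w = (1, 1) and b = −6 (i.e., the line x1 + x2 = 6), purely for illustration; points with y(x) > 0 are assigned to Class 1 and the rest to Class 2:

import numpy as np

# Dataset from the table above
X = np.array([[2, 3], [3, 4], [4, 2], [5, 3]])

# Hypothetical linear discriminant y(x) = w^T x + b (weights assumed, not from the slides)
w = np.array([1.0, 1.0])
b = -6.0

for x in X:
    y = w @ x + b
    label = "Class 1" if y > 0 else "Class 2"
    print(f"x1={x[0]}, x2={x[1]}: y(x)={y:+.1f} -> {label}")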
Probabilistic Generative Models
Laplace Approximation
One major weakness of the Laplace approximation is that, since it is based on a Gaussian distribution, it is only directly applicable to real-valued variables.
In other cases it may be possible to apply the Laplace approximation to a transformation of the variable. For instance, if 0 < τ < ∞, then we can consider a Laplace approximation of ln τ.
The most serious limitation of the Laplace framework, however, is that it is based purely on the aspects of the true distribution at a specific value of the variable, and so can fail to capture important global properties.
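As a minimal sketch of the method itself (assuming a simple 1-D unnormalized target, here p(z) ∝ z² e^(−z)), the Laplace approximation locates the mode and matches a Gaussian to the curvature of ln p at that point:

import numpy as np
from scipy.optimize import minimize_scalar

# Unnormalized 1-D target density: p(z) ∝ z^2 * exp(-z), z > 0
def neg_log_p(z):
    return -(2 * np.log(z) - z)

# Step 1: find the mode z0 of p(z) (minimum of -log p); analytically z0 = 2
res = minimize_scalar(neg_log_p, bounds=(1e-6, 20), method="bounded")
z0 = res.x

# Step 2: curvature A = -d^2/dz^2 log p(z) at the mode, via finite differences
h = 1e-4
A = (neg_log_p(z0 + h) - 2 * neg_log_p(z0) + neg_log_p(z0 - h)) / h**2

# Laplace approximation: q(z) = N(z | z0, 1/A)
print(f"mode z0 ≈ {z0:.3f}, variance 1/A ≈ {1/A:.3f}")  # ≈ 2 and ≈ 2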
Bayesian Logistic Regression
Logistic Regression
• Logistic regression is one of the most popular machine learning algorithms and comes under the supervised learning technique. It is used for predicting a categorical dependent variable from a given set of independent variables.
• Logistic regression predicts the output of a categorical dependent variable, so the outcome must be a categorical or discrete value: Yes or No, 0 or 1, True or False, etc. Instead of giving exact values of 0 and 1, however, it gives probabilistic values that lie between 0 and 1.
• Logistic regression is similar to linear regression except in how it is used: linear regression is used for solving regression problems, whereas logistic regression is used for solving classification problems.
• In logistic regression, instead of fitting a straight regression line, we fit an "S"-shaped logistic function, whose output is bounded between its two limiting values, 0 and 1.
• The value of the logistic function indicates the likelihood of an outcome, such as whether or not a cell sample is cancerous.
• Logistic regression is a significant machine learning algorithm because it can provide probabilities and classify new data using both continuous and discrete datasets.
• Logistic regression can be used to classify observations using different types of data and can easily determine the most effective variables for the classification. The image below shows the logistic function:
Logistic Function (Sigmoid Function)
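A minimal sketch of the sigmoid itself (assuming NumPy), which maps any real-valued score z = w·x + b to a probability strictly between 0 and 1:

import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

# The S-shaped curve: large negative scores -> near 0, large positive -> near 1
for z in [-6, -2, 0, 2, 6]:
    print(f"sigmoid({z:+d}) = {sigmoid(z):.3f}")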
Bayesian Logistic Regression
• Bayesian Logistic Regression is an extension of logistic regression where we
apply Bayesian inference to estimate the model parameters.
• It combines logistic regression with Bayesian inference.
• In contrast to traditional logistic regression, which provides point estimates
for the parameters, Bayesian logistic regression treats the parameters as
random variables and provides a probability distribution over them.
• This approach allows us to quantify the uncertainty in the parameter
estimates and make probabilistic predictions.
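A minimal sketch of one standard approach (a Gaussian prior on the weights combined with the Laplace approximation discussed earlier, on a tiny made-up dataset) is shown below; it computes the MAP weights and a Gaussian approximation to their posterior:

import numpy as np
from scipy.optimize import minimize

# Tiny made-up dataset: bias column plus one feature (hypothetical, for illustration)
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 0.5], [1.0, 1.5], [1.0, 2.5]])
y = np.array([0, 0, 1, 1, 1])
alpha = 1.0  # precision of the Gaussian prior N(0, alpha^-1 I) on the weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_posterior(w):
    p = sigmoid(X @ w)
    log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    log_prior = -0.5 * alpha * w @ w
    return -(log_lik + log_prior)

# MAP estimate = mode of the posterior
w_map = minimize(neg_log_posterior, np.zeros(2)).x

# Laplace approximation: posterior ≈ N(w_map, S), with S^-1 = alpha*I + X^T R X
p = sigmoid(X @ w_map)
R = np.diag(p * (1 - p))
S = np.linalg.inv(alpha * np.eye(2) + X.T @ R @ X)
print("MAP weights:", w_map)
print("posterior covariance:\n", S)

Note that the prior keeps the weights finite even though this toy dataset is linearly separable, illustrating the robustness to overfitting mentioned above.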
Example Scenario
• Imagine you're working on a medical diagnosis problem where you
need to predict whether a patient has a particular disease (binary
outcome: 0 for no, 1 for yes) based on some clinical measurements
(features). If you have prior knowledge about the likely effect of these
measurements on the disease (say from previous studies), Bayesian
Logistic Regression allows you to incorporate this knowledge and
update it as more patient data becomes available.
• For instance, if a certain measurement is known to be positively
correlated with the disease but you're uncertain about the strength of
this relationship, you might use a prior that reflects this belief. As you
collect more patient data, the posterior distribution will reflect both
the prior knowledge and the new data, giving you a more refined
estimate.