Faiml Unit 2

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that enables computers to learn from data and improve without explicit programming. It encompasses various learning types, including supervised, unsupervised, and reinforcement learning, and has applications across multiple industries, such as healthcare and finance. The evolution of ML has been driven by advancements in algorithms, big data, and computational power, leading to significant breakthroughs in areas like image recognition and natural language processing.


Introduction to Machine Learning

Machine Learning (ML) is broadly defined as a subfield of artificial intelligence that enables computers to learn from data and improve over time without being explicitly programmed. In essence, ML algorithms build models from data that can make predictions or decisions. For example, IBM defines ML as a branch of AI focused on “enabling computers and machines to imitate the way that humans learn” and improve performance through exposure to more data [1]. DataCamp similarly notes that ML involves “algorithms that improve automatically through experience and by the use of data,” allowing machines to learn patterns and make predictions on new data [2]. In practical terms, instead of writing rules by hand, we give the computer many examples (input–output pairs) and let it generalize a mapping from inputs to outputs.

ML’s power comes from learning from data. A machine-learning model is trained by fitting a mathematical function to a dataset: during training it adjusts internal parameters (e.g. weights in a neural network) to minimize prediction error [3, 4]. The training set comprises labeled examples (for supervised learning) or just features (for unsupervised learning), and an optimization algorithm (like gradient descent) tweaks the model so its predictions match the known answers [4, 5]. Once trained, the model can predict outputs for new, unseen inputs by applying the learned mapping. As Wikipedia explains, ML “build[s] a mathematical model from input data” so that the model can make data-driven predictions or decisions [6].
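
To make this loop concrete, here is a minimal sketch of gradient descent fitting a one-variable linear model (not drawn from the cited sources; the toy data, learning rate, and epoch count are invented for illustration, and NumPy is assumed):

```python
import numpy as np

# Toy dataset (invented): y = 2x + 1 plus a little noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=100)
y = 2.0 * X + 1.0 + rng.normal(0, 0.5, size=100)

# Model: y_hat = w*x + b. "Training" adjusts the internal parameters w and b
# to minimize the mean-squared prediction error on the training set.
w, b = 0.0, 0.0
lr = 0.02  # learning rate (an illustrative choice)

for epoch in range(2000):
    y_hat = w * X + b                 # predict on every training example
    error = y_hat - y                 # compare predictions with known answers
    w -= lr * 2 * np.mean(error * X)  # gradient step on d(MSE)/dw
    b -= lr * 2 * np.mean(error)      # gradient step on d(MSE)/db

print(f"learned w = {w:.2f}, b = {b:.2f}")  # should approach w = 2, b = 1
```

The same evaluate-and-optimize pattern scales from this two-parameter line all the way up to neural networks with millions of weights.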

Machine learning is just one approach within the broader field of Artificial Intelligence (AI). AI refers to any computer program that performs tasks “similar to how humans solve problems,” such as reasoning, perception, or language understanding [7]. ML is a subset of AI that focuses specifically on learning from data. For instance, DataCamp states that “Machine Learning is a subset of AI” that uses data-driven algorithms for predictions [2, 8]. In practice, ML powers many of today’s AI applications (like image recognition, recommendation systems, autonomous vehicles) by automatically finding patterns in data, whereas other AI systems might use hand-coded rules or logic without learning from data.

History of Machine Learning


Machine learning has evolved over decades through foundational research and breakthroughs. One landmark was the 1943 work of Warren McCulloch and Walter Pitts, who developed the first mathematical model of a neural network – effectively “the first steps in modeling brain-like computation” [9]. In 1950, Alan Turing introduced the “Turing test” for machine intelligence, opening the conceptual door to AI [10]. In the 1950s, pioneers like Arthur Samuel and Frank Rosenblatt made key advances: Samuel created a checkers-playing program (the first self-learning game player) and coined the term machine learning in 1959 [11], while Rosenblatt built the Perceptron (a simple neural network) in 1958 [12].

The 1960s and 70s saw symbolic AI and early neural models (e.g. the ELIZA chatbot in 1966) [13], but also a so-called “AI winter” when progress stalled. Research picked up again in the 1980s and 90s as new algorithms were invented (e.g. early formulations of backpropagation for training multi-layer networks in 1969 [14], convolutional networks for image recognition in 1989 [15], and Q-learning for reinforcement learning in 1989 [16]). Notable successes included IBM’s Deep Blue beating the world chess champion in 1997 and advances in probabilistic methods. In the 2000s, deep learning emerged: Geoffrey Hinton popularized the term “deep learning” in 2006 to describe multi-layer neural networks [17], and the creation of large datasets like ImageNet sparked a rapid AI boom. Today’s ML builds on these foundations: massive computational power, big data, and advanced algorithms combine to drive modern AI applications.

Big Data and Machine Learning


“Big Data” refers to extremely large and complex datasets – often characterized by high volume, velocity, and variety – that traditional data tools cannot handle easily [18]. ML thrives in the era of big data: more data generally means better learning. As WEKA’s Glossary notes, “machine learning, by and large, requires vast quantities of training data to function at the level of innovation it does today” [18]. In other words, the availability of big data is a key enabler for ML. Machine learning algorithms can leverage insights from big data by detecting subtle patterns and correlations that humans or simpler models would miss [19]. For example, companies may collect terabytes of user behavior, sensor, or transactional data, and ML models trained on this data can make accurate predictions or identify trends. Thus, big data and ML are closely linked: big data provides the raw material for ML to learn, and ML provides the methods to turn raw data into actionable insights [18, 19].

Applications of Machine Learning


Machine learning is widely leveraged across industries to automate tasks, improve decision-making, and create new capabilities. In business and science, “data is the new oil,” and ML is the engine driving data analytics [20]. A 2020 Deloitte survey found that 67% of companies were already using ML, with 97% using or planning to use it soon [21]. ML powers personalization (e.g. recommendation engines for shopping or media), fraud detection in banking, predictive maintenance in manufacturing, and more. In healthcare, ML is used to improve diagnostics and treatment: for instance, Google’s Med-PaLM 2 LLM helps interpret complex medical information, aiding clinicians in decision-making [22]. In finance, banks use ML for credit scoring and risk management; JPMorgan employs AI chatbots for asset management [23]. Satya Nadella, CEO of Microsoft, summarizes ML’s impact: “Machine learning is the most transformative technology of our time. It’s going to transform every single vertical” [24]. These examples show that ML leverages data to optimize processes, uncover hidden patterns, and make predictions that benefit sectors from retail to robotics.

Descriptive vs Predictive Analytics


In data analytics, descriptive and predictive approaches serve different goals. Descriptive analytics focuses on summarizing what has happened in the past. For example, a dashboard showing last quarter’s sales by region is descriptive. IBM defines it simply: “As the name implies, this type of analytics describes the data it contains” [25]. It often involves statistical summaries and visualizations (pie charts, histograms, etc.) to explain historical trends. In contrast, predictive analytics uses models to anticipate what might happen in the future. It “mines existing data, identifies patterns and helps companies predict what might happen in the future based on that data” [26]. Predictive analytics employs advanced statistical techniques, ML algorithms, and data mining to forecast outcomes (e.g. forecasting sales, weather, or equipment failures) [26, 27].

In summary: descriptive analytics answers “What happened?” by examining historical data [25], whereas predictive analytics answers “What will happen?” by using models to extrapolate from the past [26, 27]. Both play crucial roles: descriptive analytics provides context and basic insights, while predictive analytics offers foresight to inform proactive decisions.

Machine Learning and Statistics
Machine learning and statistics are closely related fields, both dealing with data and inference, but they emphasize different goals. A concise way to see the distinction is: statistics aims for inference; machine learning aims for prediction. As Nature Methods summarizes: “Statistics draws population inferences from a sample, and machine learning finds generalizable predictive patterns” [28]. In practice, many ML algorithms (e.g. linear regression, logistic regression, Bayesian models) originated in statistics. However, statistics traditionally focuses on drawing conclusions about data (e.g. confidence intervals, hypothesis testing) and understanding relationships within data. Machine learning, by contrast, emphasizes building models that perform well on new data: its goal is often “making repeatable predictions by finding patterns within data” [29].

DataRobot also explains that statistics is about making inferences about a population from a sample, whereas ML is about making predictive models that work on unseen data [29]. Importantly, modern ML practitioners often borrow statistical tools (e.g. probability theory, distributions) but apply them at large scale with iterative algorithms and automation. In summary, ML can be seen as a practical extension of statistical learning: it uses statistical foundations but with the objective of maximizing predictive accuracy and scalability on complex datasets [28, 29].

Artificial Intelligence vs. Machine Learning


Artificial Intelligence (AI) and Machine Learning (ML) are related but distinct terms. AI is the broader concept of machines exhibiting intelligence – that is, performing tasks that would require human intelligence (reasoning, perception, decision-making). ML is specifically the part of AI that involves learning from data. Put differently, ML is one way to achieve AI. DataCamp clarifies: “Machine learning is a subset of AI, which uses algorithms that learn from data to make predictions” [2, 30]. MIT Sloan likewise describes ML as “a subfield of artificial intelligence that gives computers the ability to learn without explicitly being programmed” [8].

Thus, while all machine learning is a form of AI, not all AI systems use machine learning. Traditional AI
might use rule-based expert systems or search algorithms, whereas ML systems learn patterns
automatically. In practice today, many AI applications (like vision, NLP, robotics) use ML as their core,
since ML has proven highly effective at tasks like recognizing images or translating language.

Types of Machine Learning


Machine learning tasks are typically categorized by how they learn from data:

• Supervised Learning: The algorithm is trained on a labeled dataset (each example has an input and a known output). The model learns a mapping from inputs to outputs and can then predict labels for new data. Common supervised tasks include classification (predicting categories) and regression (predicting continuous values). For example, training a model on thousands of labeled cat/dog images lets it classify new images as “cat” or “dog” [31].

• Unsupervised Learning: The algorithm is given unlabeled data and must find structure or patterns on its own. Common techniques include clustering (grouping similar data points) and dimensionality reduction. For instance, an unsupervised model can group customers into segments based on their behavior without any predefined labels [32].

• Semi-Supervised Learning: This lies between supervised and unsupervised. The model is trained on a large amount of unlabeled data plus a small amount of labeled data. Typically, the model is first trained on the labeled subset and then refined using the unlabeled data [33]. This is useful when labeling is expensive: a few labeled examples can guide the learning on much more unlabeled data.

• Reinforcement Learning: Here, an agent learns by interacting with an environment and receiving rewards or penalties. It must learn a policy of actions to maximize cumulative reward. Unlike the above, reinforcement learning deals with sequential decision-making (e.g. game playing, robotics control) where each action influences future states [34]. A classic example is training a computer to play chess or Go by playing many games and learning from wins/losses; a minimal code sketch of this idea follows the figure note below.

Figure: Comparing Supervised and Unsupervised Learning. In supervised learning, models train on labeled data (known input–output pairs), whereas unsupervised learning uses unlabeled data to discover inherent structures [31, 32].
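
To make the reinforcement-learning bullet concrete, here is a minimal tabular Q-learning sketch (not from the cited sources; the one-dimensional corridor environment and all constants are invented for illustration):

```python
import random

# Toy environment (invented): a corridor of 5 cells. The agent starts in cell 0
# and earns reward +1 only when it reaches cell 4 (the terminal state).
N_STATES, ACTIONS = 5, (-1, +1)          # actions: step left or step right
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: usually exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        # Q-learning update: nudge Q(s, a) toward r + gamma * max_a' Q(s', a').
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

print(max(ACTIONS, key=lambda act: Q[(0, act)]))  # learned action in cell 0: +1 (right)
```

Note the contrast with the other types: there are no labeled answers here, only delayed rewards that the agent discovers by acting.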

Supervised Learning (Classification vs. Regression)

In supervised learning, the model learns a function from inputs to outputs using labeled examples [31]. Two main types of supervised tasks are:

• Classification: The output is a discrete class or category (e.g. “spam” vs. “not spam”, or “cat” vs. “dog”). For classification problems, the model learns decision boundaries between classes. As GeeksforGeeks explains, classification handles “discrete outcomes” such as yes/no or multi-class labels [35]. For example, an email filter that flags spam uses classification.

• Regression: The output is a continuous value (e.g. price, temperature). Regression aims to find the best-fit curve or line through the data. It “focuses on finding the best-fitting line to predict numerical outcomes” [36]. For example, predicting tomorrow’s stock price or a house’s market value uses regression.

Both classification and regression require labeled training data. In practice, many algorithms (like
decision trees or neural networks) can be applied to either task depending on how the output is
encoded (categorical vs. numeric).
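
A side-by-side sketch (assuming scikit-learn, which the text above does not mention; all data points are invented toy numbers) shows the same fit/predict pattern applied to both task types:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification (discrete labels). Toy data: email length -> spam (1) or not (0).
X_cls = np.array([[120], [3000], [80], [2500], [60], [2800]])
y_cls = np.array([0, 1, 0, 1, 0, 1])
clf = LogisticRegression(max_iter=1000).fit(X_cls, y_cls)
print(clf.predict([[2600]]))   # -> [1]: classified as "spam"

# Regression (continuous values). Toy data: house size (m^2) -> price.
X_reg = np.array([[50], [80], [120], [200]])
y_reg = np.array([150_000, 240_000, 360_000, 600_000])
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[100]]))    # -> approximately [300000.]
```

The only structural difference is the output type: the classifier returns a category, the regressor a number.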

Bayesian Methods

Bayesian approaches apply probability theory and Bayes’ theorem to learning. In ML, naive Bayes classifiers are a simple yet effective example. These models compute the probability of each class given the input features and choose the class with the highest posterior probability. Naive Bayes assumes the features are independent given the class, so the posterior factorizes as:

$$P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C).$$

Despite its simplicity, naive Bayes often works well in practice. Wikipedia describes naive Bayes as a family of “probabilistic classifiers which assumes that the features are conditionally independent” given the class [37]. Because it relies on straightforward probability counts, naive Bayes scales to very large datasets (counting feature occurrences) and is commonly used in text classification and spam filtering. In general, Bayesian methods in ML combine prior beliefs and observed data to make inferences, though in practice many naive Bayes models are fit with simple frequentist (relative-frequency) estimates for convenience [37].
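
The “probability counts” point can be seen directly in code. Below is a from-scratch word-counting sketch (the two toy corpora are invented; only the Python standard library is assumed):

```python
import math
from collections import Counter

# Toy corpora (invented): word counts stand in for P(word | class).
spam = ["win money now", "free money offer", "win free prize"]
ham = ["meeting at noon", "project status update", "lunch at noon"]

counts = {"spam": Counter(w for d in spam for w in d.split()),
          "ham": Counter(w for d in ham for w in d.split())}
totals = {c: sum(cnt.values()) for c, cnt in counts.items()}
priors = {"spam": len(spam) / (len(spam) + len(ham)),
          "ham": len(ham) / (len(spam) + len(ham))}
vocab_size = len(set(counts["spam"]) | set(counts["ham"]))

def log_posterior(doc, c):
    # log P(class) + sum of log P(word | class), with add-one (Laplace)
    # smoothing so an unseen word does not zero out the whole product.
    score = math.log(priors[c])
    for w in doc.split():
        score += math.log((counts[c][w] + 1) / (totals[c] + vocab_size))
    return score

print(max(["spam", "ham"], key=lambda c: log_posterior("free money", c)))  # -> spam
```

Working in log space is the standard trick here: sums of logs avoid the numerical underflow that multiplying many small probabilities would cause.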

Clustering (Unsupervised Learning)

Clustering is a key unsupervised technique for grouping similar data. Given only features and no labels, clustering algorithms partition the data into clusters so that points in the same cluster are more similar to each other than to those in other clusters. A popular example is k-means clustering, which iteratively assigns points to the nearest of k centroids and adjusts centroids to minimize within-cluster variance. DataCamp notes that unsupervised learning is “often used for clustering,” with k-means as a common algorithm [38]. Other clustering methods include hierarchical clustering, DBSCAN, and Gaussian mixtures. Clustering helps uncover hidden structures, such as grouping customers by purchasing behavior or segmenting images by content, without requiring labeled examples.
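
The assign-and-update loop can be written in a few lines. This is a from-scratch sketch of plain k-means (assuming NumPy; the two-blob toy data is invented), not any particular library’s implementation:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means: alternate assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # converged: assignments no longer change the centroids
        centroids = new_centroids
    return labels, centroids

# Toy data: two obvious blobs, near (0, 0) and near (10, 10).
X = np.array([[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]], dtype=float)
labels, centroids = kmeans(X, k=2)
print(labels)  # e.g. [0 0 0 1 1 1] (cluster ids may be permuted)
```

Production implementations add safeguards this sketch omits, such as handling empty clusters and smarter initialization (e.g. k-means++).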

Decision Tree Learning

A decision tree is a versatile supervised learning model for both classification and regression [39]. It builds a tree of decisions: each internal node tests a feature, branches to different child nodes based on the feature value, and leaves represent predicted outputs [39]. For example, a decision tree for loan approval might ask, “Is income > \$50K?” then branch accordingly. Trees are intuitive and interpretable, as they mimic human decision rules. However, they can overfit if too deep. IBM notes that smaller trees are easier to generalize, while larger trees risk “data fragmentation” and overfitting [40]. To avoid this, trees are often pruned (removing low-importance branches) or ensembled (e.g. Random Forests or boosting) [41]. Overall, decision trees are powerful because they capture nonlinear feature interactions and require little data preprocessing.
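
A brief sketch (assuming scikit-learn and its bundled Iris dataset; the depth limit of 3 is an arbitrary illustrative choice) shows both the interpretability and the pruning idea:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Capping the depth is a simple pre-pruning guard against overfitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print(tree.score(X_test, y_test))  # accuracy on held-out data
print(export_text(tree))           # the learned if/then rules, printed as text
```

The printed rules read much like the loan-approval example above, which is exactly why trees are popular when decisions must be explained.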

Dimensionality Reduction

Dimensionality reduction techniques reduce the number of features while retaining important information. This helps simplify models and avoid the “curse of dimensionality” when data has many variables. Principal Component Analysis (PCA) is a common method: it finds a new set of orthogonal axes (principal components) that capture the most variance in the data. As DataCamp explains, dimensionality reduction “involves reducing the number of random variables under consideration by obtaining a set of principal variables” [42]. Other methods include Linear Discriminant Analysis (LDA), t-SNE, and autoencoders (neural-network-based). Reducing dimensions can improve visualization, speed up training, and remove noise.
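
The following sketch shows the core PCA idea of projecting centered data onto its top principal axes, computed via the singular value decomposition (a from-scratch illustration assuming NumPy; the toy data with a redundant third feature is invented):

```python
import numpy as np

def pca(X, n_components):
    """Project X onto the directions of maximum variance."""
    X_centered = X - X.mean(axis=0)  # PCA operates on centered data
    # Right singular vectors of the centered data are the principal axes.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_variance = S**2 / (len(X) - 1)
    return X_centered @ Vt[:n_components].T, explained_variance[:n_components]

# Toy data (invented): 3 features, but the third is nearly a copy of the first.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=100)])

X_2d, var = pca(X, n_components=2)
print(X_2d.shape)  # (100, 2): three features compressed to two
print(var)         # nearly all variance is captured by the first two components
```

Because the third feature is redundant, dropping to two components loses almost no information, which is the typical payoff of PCA on correlated data.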

Neural Networks and Deep Learning

A neural network is a computational model inspired by biological brains. It consists of layers of interconnected nodes (“neurons”) that transform input data into outputs. A simple neural network has an input layer, one or more hidden layers, and an output layer [43]. Each connection has a weight and threshold; during training, these weights are adjusted to improve accuracy [4, 43]. Neural networks can model highly complex functions by combining many nonlinear layers.

“Deep learning” refers to neural networks with many layers (deep architectures) [44]. DataCamp describes deep learning as a subfield of ML using multi-layered artificial neural networks that can “learn from enormous amounts of data” to achieve high accuracy [44]. Deep networks automatically extract features: for example, early layers in a convolutional neural network might learn edges, while later layers learn shapes and objects. Modern deep learning has led to breakthroughs in image and speech recognition, natural language processing, and more.

5
Figure: Illustration of a neural network concept. Neural networks consist of layers of interconnected neurons [43]. Deep learning uses many such layers (often hidden) to learn complex feature hierarchies from data [44].

Neural networks are typically trained by backpropagation and gradient descent, similar to other models [3]. They require large amounts of data and computation, but can achieve superior performance on tasks like image classification and machine translation.
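
As an illustration of layers, weights, and backpropagation in one place, here is a minimal from-scratch network trained on XOR (a sketch, not a production setup; the task choice, layer sizes, learning rate, and step count are all illustrative assumptions):

```python
import numpy as np

# XOR: a task no single linear model can solve, but a small network can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)   # hidden layer: 8 neurons
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # output layer: 1 neuron
lr = 1.0

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for step in range(10_000):
    # Forward pass: each layer applies weights, a bias, and a nonlinearity.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass (backpropagation): push the output error back, layer by layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # typically converges toward [0, 1, 1, 0]
```

Deep learning frameworks automate exactly this gradient bookkeeping, which is what makes networks with millions of weights practical to train.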

Training Machine Learning Systems


Training an ML model is an iterative process of learning from data. First, the dataset is split into training, validation, and test sets [3, 45]. The model is fit on the training set: its parameters (weights) are adjusted to minimize an error or loss function [5]. For example, if using supervised learning with gradient descent, the model makes predictions on each training example, measures error against the known labels, and updates weights to reduce that error [5]. This “evaluate and optimize” loop is repeated (often in epochs) until performance on the training data stops improving.

Next, the model is evaluated on a separate validation set [45]. The validation set simulates unseen data and is used to tune hyperparameters (e.g. number of layers, learning rate) and guard against overfitting [45]. For instance, if validation error starts rising while training error keeps falling, the model is likely overfitting (memorizing the training data). Techniques like early stopping (halting training when validation error increases) and regularization (penalizing large weights) are used to improve generalization [45].

Finally, the model’s performance is measured on the test set, which the model has never seen [46]. This provides an unbiased estimate of how the model will perform in the real world. Common evaluation metrics include accuracy, precision/recall for classification, and mean-squared error for regression. Cross-validation (resampling the data into different train/validation folds) is often used to get a robust performance estimate when data is limited. Throughout training, engineers also perform feature engineering, tuning, and error analysis.
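
A compact sketch of this split-train-validate-test workflow (assuming scikit-learn and one of its bundled datasets; the split ratios and model choice are illustrative, not prescribed by the sources above):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Hold out the test set first; it is touched exactly once, at the very end.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
# Carve a validation set out of the remainder for tuning decisions.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))  # guides tuning
print("test accuracy:", model.score(X_test, y_test))      # unbiased final estimate

# Cross-validation: a more robust estimate when data is limited.
scores = cross_val_score(LogisticRegression(max_iter=5000),
                         X_trainval, y_trainval, cv=5)
print("5-fold CV accuracy:", scores.mean())
```

The discipline that matters most here is ordering: tune against the validation set (or CV folds), and consult the test set only once.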

In summary, training involves optimizing a model on data and rigorously testing its predictive ability. As IBM describes, the ML process has a decision process (predicting/classifying), an error function (measuring accuracy), and a model optimization step (adjusting weights) [5, 47]. Good training yields a model that not only fits historical data but also generalizes well to new data.

Sources: Definitions and insights above are drawn from authoritative ML references and recent surveys, including IBM and MIT Sloan articles [1, 8], DataCamp articles [2, 48], and technical sources on ML history and methods [3, 11, 49].

[1, 4, 43, 47] What Is Machine Learning (ML)? | IBM
https://www.ibm.com/think/topics/machine-learning

[2, 20, 22, 23, 24, 30, 31, 32, 34, 38, 42, 44, 48] What is Machine Learning? Definition, Types, Tools & More | DataCamp
https://www.datacamp.com/blog/what-is-machine-learning

[3, 5, 6, 45, 46] Training, validation, and test data sets | Wikipedia
https://en.wikipedia.org/wiki/Training,_validation,_and_test_data_sets

[7, 8, 21] Machine learning, explained | MIT Sloan
https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained

[9, 10, 11, 12, 13, 14, 15, 16, 17, 49] History and Evolution of Machine Learning: A Timeline | TechTarget
https://www.techtarget.com/whatis/feature/History-and-evolution-of-machine-learning-A-timeline

[18, 19] Big Data & Machine Learning (How Do They Relate?) | WEKA
https://www.weka.io/learn/glossary/ai-ml/big-data-machine-learning/

[25, 26] What Is Business Analytics? | IBM
https://www.ibm.com/think/topics/business-analytics

[27] Descriptive Analytics vs Predictive Analytics: A Practical Guide | Panintelligence
https://panintelligence.com/blog/descriptive-analytics-vs-predictive-analytics-a-guide/

[28] Statistics versus machine learning | Nature Methods
https://www.nature.com/articles/nmeth.4642

[29] Statistics and machine learning: what’s the difference? | DataRobot Blog
https://www.datarobot.com/blog/statistics-and-machine-learning-whats-the-difference/

[33] Semi-Supervised Learning Explained | Oracle
https://www.oracle.com/artificial-intelligence/machine-learning/semi-supervised-learning/

[35, 36] Classification vs Regression in Machine Learning | GeeksforGeeks
https://www.geeksforgeeks.org/ml-classification-vs-regression/

[37] Naive Bayes classifier | Wikipedia
https://en.wikipedia.org/wiki/Naive_Bayes_classifier

[39, 40, 41] What is a Decision Tree? | IBM
https://www.ibm.com/think/topics/decision-trees
