Unit
Introduction: What is AI? - The Foundations of Artificial Intelligence - The History of Artificial
Intelligence - The State of the Art - Risks and Benefits of AI; Intelligent Agents: Agents and
Environments – Good Behavior – The Nature of Environments – The Structure of Agents;
Philosophy, Ethics, and Safety of AI: The Limits of AI - Can Machines Really Think? - The Ethics
of AI, The Future of AI: AI Components - AI Architectures.
INTRODUCTION
Intelligence vs. Artificial Intelligence
- Intelligence is a natural process; Artificial Intelligence is programmed by humans.
- Intelligence is hereditary; Artificial Intelligence is not hereditary.
- Knowledge is required for intelligence; a knowledge base (KB) and electricity are required for AI to generate output.
- No human is an expert, and we may get better solutions from other humans; expert systems are made which aggregate many persons' experience and ideas.
DEFINITION
The study of how to make computers do things at which, at the moment, people are better.
“Artificial Intelligence is the ability of a computer to act like a human being”.
In today's world, technology is growing very fast, and we are getting in touch with different new
technologies day by day.
Here, one of the booming technologies of computer science is Artificial Intelligence, which is ready to create
a new revolution in the world by making intelligent machines. Artificial Intelligence is now all around us.
It is currently applied in a variety of subfields, ranging from general to specific, such as self-driving cars,
playing chess, proving theorems, playing music, painting, etc.
AI is one of the most fascinating and universal fields of computer science, and it has great scope in the future.
AI aims to make a machine work like a human.
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means "man-
made" and Intelligence means "thinking power"; hence AI means "a man-made thinking power."
"It is a branch of computer science by which we can create intelligent machines which can behave like
humans, think like humans, and make decisions."
Artificial Intelligence exists when a machine has human-like skills such as learning, reasoning, and
solving problems.
With Artificial Intelligence, you do not need to preprogram a machine for every task; instead, you can
create a machine with programmed algorithms that can work with its own intelligence, and that is the
power of AI.
AI is not an entirely new idea: according to Greek myth, there were mechanical men in ancient times that
could work and behave like humans.
What is AI?
AI stands for Artificial Intelligence, which is a field of study that focuses on creating computer systems that
can perform tasks that normally require human intelligence.
Narrow AI (Weak AI) – Designed for specific tasks (e.g., Siri, Google Assistant, ChatGPT).
General AI (Strong AI) – Aims to perform any intellectual task a human can (still theoretical).
Super AI – Hypothetical AI surpassing human intelligence.
Artificial intelligence can be organized in several ways, depending on stages of development or
actions being performed.
For instance, four stages of AI development are commonly recognized.
1. Reactive machines: Limited AI that only reacts to different kinds of stimuli based on
preprogrammed rules. Does not use memory and thus cannot learn with new data. IBM’s Deep Blue
that beat chess champion Garry Kasparov in 1997 was an example of a reactive machine.
2. Limited memory: Most modern AI is considered to be limited memory. It can use memory to
improve over time by being trained with new data, typically through an artificial neural network or
other training model. Deep learning, a subset of machine learning, is considered limited memory
artificial intelligence.
3. Theory of mind: Theory of mind AI does not currently exist, but research is ongoing into its
possibilities. It describes AI that can emulate the human mind and has decision-making capabilities
equal to that of a human, including recognizing and remembering emotions and reacting in social
situations as a human would.
4. Self-aware: A step above theory of mind AI, self-aware AI describes a hypothetical machine that is
aware of its own existence and has the intellectual and emotional capabilities of a human. Like
theory of mind AI, self-aware AI does not currently exist.
A more useful way of broadly categorizing types of artificial intelligence is by what the machine can
do. All of what we currently call artificial intelligence is considered artificial “narrow” intelligence,
in that it can perform only narrow sets of actions based on its programming and training. For
instance, an AI algorithm that is used for object classification won’t be able to perform natural
language processing. Google Search is a form of narrow AI, as is predictive analytics, or virtual
assistants.
Artificial general intelligence (AGI) would be the ability for a machine to “sense, think, and act” just
like a human. AGI does not currently exist. The next level would be artificial superintelligence
(ASI), in which the machine would be able to function in all ways superior to a human.
AI Technologies:
Machine Learning (ML) – AI learns from data to improve over time.
Deep Learning – Uses neural networks to process complex patterns.
Natural Language Processing (NLP) – Enables AI to understand human language (e.g., chatbots).
Computer Vision – Helps AI interpret and analyze images/videos.
Robotics – AI-powered robots perform tasks like manufacturing and self-driving.
Applications and use cases for artificial intelligence
Speech recognition
Automatically convert spoken speech into written text.
Image recognition
Identify and categorize various aspects of an image.
Translation
Translate written or spoken words from one language into another.
Predictive modeling
Mine data to forecast specific outcomes with high degrees of granularity.
Data analytics
Find patterns and relationships in data for business intelligence.
Cybersecurity
Autonomously scan networks for cyber attacks and threats.
Before learning about Artificial Intelligence, we should know the importance of AI and
why we should learn it. Following are some main reasons to learn about AI:
o With the help of AI, you can create such software or devices which can solve real-world problems
very easily and with accuracy such as health issues, marketing, traffic issues, etc.
o With the help of AI, you can create your personal virtual Assistant, such as Cortana, Google
Assistant, Siri, etc.
o With the help of AI, you can build such Robots which can work in an environment where survival of
humans can be at risk.
o AI opens a path for other new technologies, new devices, and new Opportunities.
Artificial Intelligence is not just a part of computer science; it is vast and draws on many
other factors that contribute to it. To create AI, we should first know how
intelligence is composed: intelligence is an intangible property of our brain, a
combination of reasoning, learning, problem solving, perception, language understanding,
etc.
To achieve the above factors for a machine or software, Artificial Intelligence requires the
following disciplines:
o Mathematics
o Biology
o Psychology
o Sociology
o Computer Science
o Neuroscience
o Statistics
Advantages of Artificial Intelligence
o High accuracy with fewer errors: AI machines or systems are less prone to errors and offer high accuracy,
as they take decisions based on prior experience or information.
o High speed: AI systems can be very fast at decision-making; because of this, AI
systems can beat a chess champion at the game of chess.
o High reliability: AI machines are highly reliable and can perform the same action multiple times
with high accuracy.
o Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring
the ocean floor, where employing a human can be risky.
o Digital assistant: AI can be very useful as a digital assistant to users; for example, AI
technology is currently used by various e-commerce websites to show products as per customer
requirements.
o Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars which
can make our journey safer and hassle-free, facial recognition for security purposes, and natural language
processing to communicate with humans in human language.
Every technology has some disadvantages, and the same goes for Artificial Intelligence. Despite being such an
advantageous technology, it still has some disadvantages which we need to keep in mind while
creating an AI system. Following are the disadvantages of AI:
o High cost: The hardware and software requirements of AI are very costly, as AI requires a lot of
maintenance to meet current world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI, they still cannot
work outside what they were built for; a robot will only do the work for which it is trained or programmed.
o No feelings and emotions: An AI machine can be an outstanding performer, but it does not have
feelings, so it cannot form any kind of emotional attachment with humans, and it may sometimes be
harmful to users if proper care is not taken.
o Increased dependency on machines: With the advancement of technology, people are becoming more
dependent on devices and hence are losing their mental capabilities.
o No original creativity: Humans are creative and can imagine new ideas, but AI
machines cannot match this power of human intelligence and cannot be creative and imaginative.
Prerequisite
Before learning about Artificial Intelligence, you must have the fundamental knowledge of
following so that you can understand the concepts easily:
o Any computer language such as C, C++, Java, Python, etc. (knowledge of Python will be an advantage)
o Knowledge of essential Mathematics such as derivatives, probability theory, etc.
The idea of “artificial intelligence” goes back thousands of years, to ancient philosophers
considering questions of life and death. In ancient times, inventors made things called
“automatons” which were mechanical and moved independently of human intervention. The word
“automaton” comes from ancient Greek, and means “acting of one’s own will.” One of the earliest
records of an automaton comes from 400 BCE and refers to a mechanical pigeon created by a
friend of the philosopher Plato. Many years later, one of the most famous automatons was created
by Leonardo da Vinci around the year 1495.
So while the idea of a machine being able to function on its own is ancient, for the purposes of
this article, we’re going to focus on the 20th century, when engineers and scientists began to make
strides toward our modern-day AI.
Groundwork for AI:
1900-1950
In the early 1900s, there was a lot of media created that centered around the idea of
artificial humans. So much so that scientists of all sorts started asking the question: is it possible to
create an artificial brain? Some creators even made some versions of what we now call “robots”
(and the word was coined in a Czech play in 1921) though most of them were relatively simple.
These were steam-powered for the most part, and some could make facial expressions and even
walk.
Dates of note:
1921: Czech playwright Karel Čapek released a science fiction play “Rossum’s Universal
Robots” which introduced the idea of “artificial people” which he named robots. This was the first
known use of the word.
1929: Japanese professor Makoto Nishimura built the first Japanese robot,
named Gakutensoku.
1949: Computer scientist Edmund Callis Berkeley published the book “Giant Brains, or
Machines that Think” which compared the newer models of computers to human brains.
Birth of AI: 1950-1956
This range of time was when interest in AI really came to a head. Alan Turing published his
work “Computing Machinery and Intelligence,” which proposed what eventually became known as the
Turing Test, which experts used to measure computer intelligence. The term “artificial intelligence”
was coined and came into popular use.
Dates of note:
1950: Alan Turing published “Computing Machinery and Intelligence,” which proposed a test
of machine intelligence called The Imitation Game.
1952: A computer scientist named Arthur Samuel developed a program to play checkers,
which is the first to ever learn the game independently.
1955: John McCarthy coined the term “artificial intelligence” in his proposal for a workshop
held at Dartmouth in 1956; this is the first known use of the term and how it came into popular usage.
AI maturation: 1957-1979
The time between when the phrase “artificial intelligence” was created, and the 1980s was a
period of both rapid growth and struggle for AI research. The late 1950s through the 1960s was a
time of creation. From programming languages that are still in use to this day to books and films
that explored the idea of robots, AI became a mainstream idea quickly.
The 1970s showed similar improvements, such as the first anthropomorphic robot being built in
Japan, to the first example of an autonomous vehicle being built by an engineering grad student.
However, it was also a time of struggle for AI research, as the U.S. government showed little
interest in continuing to fund AI research.
1987: The market for specialized LISP-based hardware collapsed due to cheaper and more
accessible competitors that could run LISP software, including those offered by IBM and Apple.
This caused many specialized LISP companies to fail as the technology was now easily accessible.
1988: A computer programmer named Rollo Carpenter invented the chatbot Jabberwacky,
which he programmed to provide interesting and entertaining conversation to humans.
AI agents: 1993-2011
Despite the lack of funding during the AI Winter, the early 90s showed some impressive strides
forward in AI research, including the introduction of the first AI system that could beat a reigning
world champion chess player. This era also introduced AI into everyday life via innovations such
as the first Roomba and the first commercially-available speech recognition software on Windows
computers.
The surge in interest was followed by a surge in funding for research, which allowed even more
progress to be made.
Notable dates include:
1997: Deep Blue (developed by IBM) beat the world chess champion, Garry Kasparov, in a
highly-publicized match, becoming the first program to beat a human chess champion.
1997: Windows released a speech recognition software (developed by Dragon Systems).
2000: Professor Cynthia Breazeal developed the first robot that could simulate human
emotions with its face,which included eyes, eyebrows, ears, and a mouth. It was called Kismet.
2002: The first Roomba was released.
2003: NASA landed two rovers on Mars (Spirit and Opportunity), and they navigated the
surface of the planet without human intervention.
2006: Companies such as Twitter, Facebook, and Netflix started utilizing AI as a part of
their advertising and user experience (UX) algorithms.
2010: Microsoft launched the Xbox 360 Kinect, the first gaming hardware designed to track
body movement and translate it into gaming directions.
2011: An NLP computer programmed to answer questions named Watson (created by IBM)
won Jeopardy against two former champions in a televised game.
2011: Apple released Siri, the first popular virtual assistant.
Artificial General Intelligence: 2012-present
That brings us to the most recent developments in AI, up to the present day. We’ve seen a surge
in common-use AI tools, such as virtual assistants, search engines, etc. This time period also
popularized deep learning and big data.
Notable dates include:
2012: Two researchers from Google (Jeff Dean and Andrew Ng) trained a neural network to
recognize cats by showing it unlabeled images and no background information.
2015: Elon Musk, Stephen Hawking, and Steve Wozniak (and over 3,000 others) signed an
open letter to the world’s governments calling for a ban on the development (and later, use) of
autonomous weapons for purposes of war.
2016: Hanson Robotics created a humanoid robot named Sophia, who became known as the
first “robot citizen” and was the first robot created with a realistic human appearance and the
ability to see and replicate emotions, as well as to communicate.
2017: Facebook programmed two AI chatbots to converse and learn how to negotiate, but as
they went back and forth they ended up forgoing English and developing their own language,
completely autonomously.
2018: The language-processing AI of the Chinese tech group Alibaba beat human performance
on a Stanford reading and comprehension test.
2019: Google’s AlphaStar reached Grandmaster on the video game StarCraft 2,
outperforming all but 0.2% of human players.
2020: OpenAI started beta testing GPT-3, a model that uses Deep Learning to create code,
poetry, and other such language and writing tasks. While not the first of its kind, it is the first that
creates content almost indistinguishable from those created by humans.
2021: OpenAI developed DALL-E, which can process and understand images enough to
produce accurate captions, moving AI one step closer to understanding the visual world.
The Turing Test, proposed by Alan Turing (1950), was designed to provide a satisfactory
operational definition of intelligence. A computer passes the test if a human interrogator, after
posing some written questions, cannot tell whether the written responses come from a person or
from a computer.
The computer would need to possess the following capabilities:
natural language processing to enable it to communicate successfully in English;
knowledge representation to store what it knows or hears;
automated reasoning to use the stored information to answer questions and to
draw new conclusions;
machine learning to adapt to new circumstances and to detect and extrapolate patterns.
Total Turing Test includes a video signal so that the interrogator can test the subject’s
perceptual abilities, as well as the opportunity for the interrogator to pass physical objects
“through the hatch.” To pass the total Turing Test, the computer will need
computer vision to perceive objects, and robotics to manipulate objects and move about.
To analyse whether a given program thinks like a human, we must have some way of determining
how humans think. The interdisciplinary field of cognitive science brings together computer
models from AI and experimental techniques from psychology to try to construct precise and
testable theories of the workings of the human mind.
Although cognitive science is a fascinating field in itself, we are not going to be discussing it
all that much in this book. We will occasionally comment on similarities or differences
between AI techniques and human cognition. Real cognitive science, however, is necessarily
based on experimental investigation of actual humans or animals, and we assume that the
reader only has access to a computer for experimentation. We will simply note that AI and
cognitive science continue to fertilize each other, especially in the areas of vision, natural
language, and learning.
The Greek philosopher Aristotle was one of the first to attempt to codify “right thinking,” that
is, irrefutable reasoning processes. His famous syllogisms provided patterns for argument
structures that always gave correct conclusions given correct premises.
For example, “Socrates is a man; all men are mortal; therefore Socrates is mortal.”
These laws of thought were supposed to govern the operation of the mind, and initiated the
field of logic.
Acting rationally means acting so as to achieve one's goals, given one's beliefs. An agent is
just something that perceives and acts.
The right thing: that which is expected to maximize goal achievement, given the available
information.
Acting rationally does not necessarily involve deliberate thinking (for example, the blinking
reflex), but such reflexes should still be in the service of rational action.
1. Computer Science & AI
Key Developments:
Deep Learning & Neural Networks – AI models like GPT-4, DALL·E, and AlphaFold rely
on deep learning techniques.
Quantum Computing & AI – Emerging research on quantum-enhanced machine learning.
AI & Edge Computing – AI processing on edge devices for real-time decision-making.
Applications:
Autonomous systems (self-driving cars, drones).
AI-powered cybersecurity (threat detection & prevention).
AI-driven software development (Copilot, AlphaCode).
2. Economics & AI
AI is transforming economics through market predictions, automated trading, and economic
modeling.
Key Innovations:
Algorithmic Trading – AI-driven trading strategies optimize financial markets.
AI-driven Economic Forecasting – Predicting inflation, GDP growth, and market crashes.
Game Theory & AI – Used in reinforcement learning for AI decision-making in uncertain
environments.
Applications:
AI-powered fintech (fraud detection, credit scoring).
Supply chain optimization with predictive AI.
AI-driven economic policies and market regulation.
3. Psychology & AI
Understanding human cognition, emotions, and decision-making helps improve AI systems
that interact with people.
Key Innovations:
Affective Computing (Emotion AI) – AI models that detect and respond to human emotions.
Cognitive AI – Simulating human-like reasoning in AI (e.g., IBM Watson).
AI-based Behavioral Analysis – AI-driven analysis of consumer behavior and mental health
patterns.
Applications:
AI-powered mental health chatbots (Wysa, Woebot).
Human-AI collaboration in workplaces (AI as a cognitive assistant).
AI-driven personalized learning (adaptive AI tutors).
4. Neuroscience & AI
AI development is inspired by how the human brain processes information and learns.
Key Innovations:
Neuromorphic Computing – AI models that mimic brain neurons for energy-efficient
computing.
Brain-Computer Interfaces (BCI) – AI-powered interfaces that allow direct brain-machine
communication (e.g., Neuralink).
Memory-Augmented Neural Networks (MANNs) – AI that mimics human memory and
learning processes.
Applications:
AI-assisted prosthetics and neurorehabilitation.
Brain-inspired deep learning models for advanced reasoning.
AI-driven medical diagnosis for neurological disorders.
5. Mathematics & AI
Mathematics underpins AI through statistics, probability, algebra, and optimization
techniques.
Key Innovations:
Bayesian Inference & Probabilistic AI – AI models that handle uncertainty effectively.
Graph Neural Networks (GNNs) – Used in social network analysis and molecular research.
Mathematical Optimization in AI – Improving AI performance using advanced calculus and
linear algebra.
Applications:
AI in scientific research (protein folding, physics simulations).
AI-driven logistics and route optimization.
Improved AI explainability through mathematical models.
6. Linguistics & AI
AI in natural language processing (NLP) helps machines understand, generate, and interact in
human language.
Key Innovations:
Large Language Models (LLMs) – AI models like GPT-4, LLaMA, and Claude understand
and generate human-like text.
Speech Recognition & Synthesis – AI-powered text-to-speech (TTS) and automatic speech
recognition (ASR).
Multilingual AI – AI that understands and translates multiple languages with high accuracy.
Applications:
AI-driven customer support chatbots.
Real-time AI translators and transcription services.
AI-generated content for creative industries (music, writing, and storytelling).
7. Philosophy & AI
Philosophy helps define AI ethics, consciousness, and the nature of intelligence.
Key Innovations:
Ethical AI (Fairness, Bias Reduction) – Ensuring AI is unbiased and responsible.
Explainable AI (XAI) – Making AI decisions interpretable for humans.
AI & Consciousness – Research into whether AI can achieve self-awareness or reasoning
beyond human capability.
Applications:
AI-driven ethical decision-making in automation.
AI in law and governance (legal AI for case analysis).
AI for philosophical research in ethics and morality.
8. Control Theory & AI
Control theory ensures AI systems remain stable, efficient, and adaptive in dynamic
environments.
Key Innovations:
Reinforcement Learning (RL) – AI that learns optimal actions based on feedback (e.g.,
AlphaGo, OpenAI Five).
Adaptive AI Systems – AI that adjusts parameters dynamically in real-time.
AI-powered Control Systems – Used in industrial automation and robotics.
Applications:
Self-driving cars using AI-based control algorithms.
AI-driven robotic arms in manufacturing.
Autonomous drone control systems.
9. Cybernetics & AI
Cybernetics studies how AI systems interact with biological and mechanical systems.
Key Innovations:
Human-AI Augmentation – AI-powered prosthetics, wearables, and neural implants.
Biohybrid AI – AI integrated with biological elements for enhanced computing.
Autonomous Learning Systems – AI that adapts in real-time like living organisms.
Applications:
AI-powered exoskeletons for mobility assistance.
Neural AI interfaces for controlling machines with thoughts.
AI-driven feedback loops in smart environments (smart cities, IoT).
RISKS OF AI
The right education can enhance the power of individuals and nations; on the other hand, misuse of the
same could lead to devastating results.
3. AI in Finance
Quantification of growth for any country is directly related to its economic and financial
condition. As AI has enormous scope in almost every field, it has great potential to boost
the economic health of individuals and of a nation. Nowadays, AI algorithms are being used in
managing equity funds.
An AI system could take a large number of parameters into account while figuring out the best way to manage
funds, and it could perform better than a human manager. AI-driven strategies in the field of
finance are going to change the classical way of trading and investing. This could be devastating
for some fund-managing firms that cannot afford such facilities and could affect business on a
large scale, as decisions would be quick and abrupt. The competition would be tough and on
edge all the time.
AI-assisted strategies would enhance mission effectiveness and provide the safest way to
execute it. The concerning part of AI-assisted systems is that how their algorithms reach decisions is
not easily explainable. Deep neural networks learn fast and keep learning continuously, so the
main problem here is explainable AI. Such a system could produce devastating results if it
reaches the wrong hands or makes wrong decisions on its own.
An agent is anything that can be viewed as perceiving its environment through sensors
and acting upon that environment through actuators.
Human Sensors:
Eyes, ears, and other organs for sensors.
Human Actuators:
Hands, legs, mouth, and other body parts.
Robotic Sensors:
Mic, cameras and infrared range finders for sensors
Robotic Actuators:
Motors, display, speakers, etc.
An agent can be:
Human-Agent: A human agent has eyes, ears, and other organs which work as sensors, and
hands, legs, and the vocal tract which work as actuators.
Robotic Agent: A robotic agent can have cameras, infrared range finder, NLP for sensors and
various motors for actuators.
Software Agent: Software agent can have keystrokes, file contents as sensory input and
act on those inputs and display output on the screen.
Hence the world around us is full of agents, such as thermostats, cell phones, and cameras, and
even we ourselves are agents. Before moving forward, we should first know about sensors, effectors,
and actuators.
Sensor: Sensor is a device which detects the change in the environment and sends the
information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the components of machines that convert energy into
motion. The actuators are responsible for moving and controlling a system. An actuator can
be an electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs,
wheels, arms, fingers, wings, fins, and display screen.
PROPERTIES OF ENVIRONMENT
An environment is everything in the world which surrounds the agent, but it is not a part of an
agent itself. An environment can be described as a situation in which an agent is present.
The environment is where the agent lives and operates, and it provides the agent with something to sense and
act upon.
Fully observable vs Partially Observable:
If an agent sensor can sense or access the complete state of an environment at each point of time
then it is a fully observable environment, else it is partially observable.
A fully observable environment is easy as there is no need to maintain the internal state to keep
track history of the world.
If an agent has no sensors at all in an environment, then such an environment is called
unobservable.
Example: chess – the board is fully observable, as are the opponent’s moves. Driving – what is
around the next bend is not observable, hence the environment is partially observable.
1. Deterministic vs Stochastic
If an agent's current state and selected action can completely determine the next
state of the environment, then such environment is called a deterministic
environment.
A stochastic environment is random in nature and cannot be determined completely
by an agent.
In a deterministic, fully observable environment, agent does not need to worry
about uncertainty.
2. Episodic vs Sequential
In an episodic environment, there is a series of one-shot actions, and only the
current percept is required for the action.
However, in Sequential environment, an agent requires memory of past actions to
determine the next best actions.
3. Single-agent vs Multi-agent
If only one agent is involved in an environment, and operating by itself then such
an environment is called single agent environment.
However, if multiple agents are operating in an environment, then such an
environment is called a multi-agent environment.
The agent design problems in the multi-agent environment are different from
single agent environment.
4. Static vs Dynamic
If the environment can change itself while an agent is deliberating then such
environment is called a dynamic environment else it is called a static
environment.
Static environments are easy to deal with because an agent does not need to keep
looking at the world while deciding on an action.
However, for a dynamic environment, agents need to keep looking at the world before
each action.
Taxi driving is an example of a dynamic environment whereas Crossword puzzles
are an example of a static environment.
5. Discrete vs Continuous
If in an environment there are a finite number of percepts and actions that can be
performed within it, then such an environment is called a discrete environment;
else it is called a continuous environment.
A chess game comes under discrete environment as there is a finite number of
moves that can be performed.
A self-driving car is an example of a continuous environment.
6. Known vs Unknown
Known and unknown are not actually features of an environment; rather, they describe an
agent's state of knowledge about how to perform an action.
In a known environment, the results for all actions are known to the agent. While
in unknown environment, agent needs to learn how it works in order to perform
an action.
It is quite possible for a known environment to be partially observable and for an
unknown environment to be fully observable.
7. Accessible vs. Inaccessible
If an agent can obtain complete and accurate information about the state's
environment, then such an environment is called an Accessible environment else
it is called inaccessible.
An empty room whose state can be defined by its temperature is an example of an
accessible environment.
Information about an event on earth is an example of Inaccessible environment.
Task environments are essentially the "problems" to which rational agents are the
"solutions."
PEAS: Performance Measure, Environment, Actuators, Sensors
Performance
The output which we get from the agent. All the necessary results that an agent gives after
processing comes under its performance.
Environment
All the surrounding things and conditions of an agent fall in this section. It basically consists of
all the things under which the agents work.
Actuators
The devices, hardware or software through which the agent performs any actions or processes
any information to produce a result are the actuators of the agent.
Sensors
The devices through which the agent observes and perceives its environment are the sensors of
the agent.
Figure 1.5 Examples of agent types and their PEAS descriptions
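As an illustration, the PEAS description for an automated taxi driver (a standard textbook example) can be written down as a small data structure. The sketch below is only indicative: the concrete entries are assumptions, not an exhaustive specification.

```python
# Hypothetical PEAS description for an automated taxi driver,
# written as a plain Python dictionary for illustration.
taxi_peas = {
    "Performance": ["safe", "fast", "legal", "comfortable trip", "maximize profits"],
    "Environment": ["roads", "other traffic", "pedestrians", "customers"],
    "Actuators":   ["steering", "accelerator", "brake", "signal", "horn", "display"],
    "Sensors":     ["cameras", "sonar", "speedometer", "GPS", "odometer", "engine sensors"],
}

for component, entries in taxi_peas.items():
    print(f"{component}: {', '.join(entries)}")
```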
Rational Agent - A system is rational if it does the “right thing”, given what it knows.
Characteristic of Rational Agent
The agent's prior knowledge of the environment.
The performance measure that defines the criterion of success.
The actions that the agent can perform.
The agent's percept sequence to date.
For every possible percept sequence, a rational agent should select an action that is expected to
maximize its performance measure, given the evidence provided by the percept sequence and
whatever built-in knowledge the agent has.
An omniscient agent knows the actual outcome of its actions and can act accordingly; but
omniscience is impossible in reality.
An ideal rational agent perceives and does things; it has a greater performance measure.
E.g., crossing a road: here, perception of both sides occurs first, and only then action.
No perception occurs in a degenerate agent.
E.g., a clock: it does not view its surroundings; no matter what happens outside, the clock works
based on its inbuilt program.
An ideal agent is described by ideal mappings: “specifying which action an agent ought to take in
response to any given percept sequence provides a design for an ideal agent.”
E.g., the SQRT function calculation in a calculator.
Doing actions in order to modify future percepts (sometimes called information gathering) is an
important part of rationality.
A rational agent should be autonomous: it should learn what it can to compensate for partial or incorrect prior knowledge.
The Structure of Intelligent Agents
Agent = Architecture + Agent Program
Architecture = the machinery that an agent executes on. (Hardware)
Agent Program = an implementation of an agent function.
(Algorithm, Logic – Software)
The Simple reflex agents are the simplest agents. These agents take decisions on the
basis of the current percepts and ignore the rest of the percept history (past State).
The Simple reflex agent does not consider any part of percepts history during their
decision and action process.
The Simple reflex agent works on Condition-action rule, which means it maps the
current state to action. Such as a Room Cleaner agent, it works only if there is dirt in
the room.
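A minimal sketch of a simple reflex agent for the room-cleaner example above: the agent maps the current percept directly to an action through condition-action rules and keeps no percept history. The two-location percept format ("A"/"B", "Dirty"/"Clean") is an assumption made for illustration.

```python
def simple_reflex_vacuum_agent(percept):
    """Condition-action rules for a two-location vacuum world.

    percept is assumed to be a (location, status) pair, e.g. ("A", "Dirty").
    Only the current percept is used; no history is kept.
    """
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:
        return "Left"

print(simple_reflex_vacuum_agent(("A", "Dirty")))  # -> Suck
print(simple_reflex_vacuum_agent(("A", "Clean")))  # -> Right
```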
The Model-based agent can work in a partially observable environment, and track the
situation.
A model-based agent has two important factors:
o Model: It is knowledge about "how things happen in the world," so it is called a
Model-based agent.
o Internal State: It is a representation of the current state based on percept history.
These agents have the model, "which is knowledge of the world" and based on the
model they perform actions.
Updating the agent state requires information about:
o How the world evolves
o How the agent's action affects the world.
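A rough sketch of the model-based idea just described, under simplifying assumptions: the internal state is a dictionary updated from each new percept (a stand-in for a real model of how the world evolves), and condition-action rules are then matched against that remembered state rather than against the raw percept alone.

```python
class ModelBasedReflexAgent:
    """Keeps an internal state so it can act in a partially observable environment."""

    def __init__(self, rules):
        self.state = {}          # internal representation of the current world state
        self.rules = rules       # list of (condition_fn, action) pairs
        self.last_action = None

    def update_state(self, percept):
        # Stand-in for the agent's model of "how the world evolves" and how its
        # actions affect it: here we simply merge the new percept into the state.
        self.state.update(percept)

    def step(self, percept):
        self.update_state(percept)
        for condition, action in self.rules:
            if condition(self.state):
                self.last_action = action
                return action
        self.last_action = "NoOp"
        return "NoOp"

# Illustrative rules for the room-cleaner agent.
rules = [
    (lambda s: s.get("status") == "Dirty", "Suck"),
    (lambda s: s.get("location") == "A", "Right"),
    (lambda s: s.get("location") == "B", "Left"),
]
agent = ModelBasedReflexAgent(rules)
print(agent.step({"location": "A", "status": "Dirty"}))  # -> Suck
```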
o Knowledge of the current state of the environment is not always sufficient for an agent
to decide what to do.
o The agent needs to know its goal which describes desirable situations.
o Goal-based agents expand the capabilities of the model-based agent by having the
"goal" information.
o They choose an action, so that they can achieve the goal.
o These agents may have to consider a long sequence of possible actions before
deciding whether the goal is achieved or not. Such consideration of different scenarios
is called searching and planning, and it makes an agent proactive.
o Utility-based agents act based not only on goals but also on the best way to achieve the goal.
o The Utility-based agent is useful when there are multiple possible alternatives, and an
agent has to choose in order to perform the best action.
o The utility function maps each state to a real number to check how efficiently each
action achieves the goals.
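A minimal sketch of the utility-based idea: a utility function maps each candidate resulting state to a real number, and the agent picks the action whose predicted outcome has the highest utility. The actions, predicted outcomes, and weights below are invented purely for illustration.

```python
# Hypothetical predicted outcomes for each action, and a utility function
# mapping each outcome state to a real number.
outcomes = {
    "take_highway": {"time_min": 25, "toll": 2},
    "take_city_roads": {"time_min": 40, "toll": 0},
}

def utility(state):
    # Higher is better: penalize travel time and tolls (the weights are assumptions).
    return -(state["time_min"] + 3 * state["toll"])

best_action = max(outcomes, key=lambda a: utility(outcomes[a]))
print(best_action)  # -> take_highway with these illustrative numbers
```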
o A learning agent in AI is the type of agent which can learn from its past experiences,
or it has learning capabilities.
o It starts to act with basic knowledge and is then able to act and adapt automatically
through learning.
A learning agent has four conceptual components:
a. Learning element: It is responsible for making improvements by learning from the
environment.
b. Critic: The learning element takes feedback from the critic, which describes how
well the agent is doing with respect to a fixed performance standard.
c. Performance element: It is responsible for selecting the external action.
d. Problem generator: It is responsible for suggesting actions that will lead to new and
informative experiences.
o Hence, learning agents are able to learn, analyze performance, and look for new ways
to improve the performance.
Figure 1.10 Learning Agents
Multi-Agent Systems
These agents interact with other agents to achieve a common goal. They may have to coordinate
their actions and communicate with each other to achieve their objective.
A multi-agent system (MAS) is a system composed of multiple interacting agents that are designed
to work together to achieve a common goal. These agents may be autonomous or semi-autonomous
and are capable of perceiving their environment, making decisions, and taking action to achieve the
common objective.
MAS can be used in a variety of applications, including transportation systems, robotics, and social
networks. They can help improve efficiency, reduce costs, and increase flexibility in complex
systems. MAS can be classified into different types based on their characteristics, such as whether
the agents have the same or different goals, whether the agents are cooperative or competitive, and
whether the agents are homogeneous or heterogeneous.
In a homogeneous MAS, all the agents have the same capabilities, goals, and behaviors.
In contrast, in a heterogeneous MAS, the agents have different capabilities, goals, and behaviors.
This can make coordination more challenging but can also lead to more flexible and robust systems.
Cooperative MAS involves agents working together to achieve a common goal, while competitive
MAS involves agents working against each other to achieve their own goals. In some cases, MAS
can also involve both cooperative and competitive behavior, where agents must balance their own
interests with the interests of the group.
MAS can be implemented using different techniques, such as game theory, machine learning, and
agent-based modeling. Game theory is used to analyze strategic interactions between agents and
predict their behavior. Machine learning is used to train agents to improve their decision-making
capabilities over time. Agent-based modeling is used to simulate complex systems and study the
interactions between agents.
Overall, multi-agent systems are a powerful tool in artificial intelligence that can help solve complex
problems and improve efficiency in a variety of applications.
Hierarchical Agents
These agents are organized into a hierarchy, with high-level agents overseeing the behavior of
lower-level agents. The high-level agents provide goals and constraints, while the low-level agents
carry out specific tasks. Hierarchical agents are useful in complex environments with many tasks
and sub-tasks.
Hierarchical agents are agents that are organized into a hierarchy, with high-level agents overseeing
the behavior of lower-level agents. The high-level agents provide goals and constraints, while the
low-level agents carry out specific tasks. This structure allows for more efficient and organized
decision-making in complex environments.
Hierarchical agents can be implemented in a variety of applications, including robotics,
manufacturing, and transportation systems. They are particularly useful in environments where there
are many tasks and sub-tasks that need to be coordinated and prioritized.
In a hierarchical agent system, the high-level agents are responsible for setting goals and constraints
for the lower-level agents. These goals and constraints are typically based on the overall objective of
the system. For example, in a manufacturing system, the high-level agents might set production
targets for the lower-level agents based on customer demand.
The low-level agents are responsible for carrying out specific tasks to achieve the goals set by the
high-level agents. These tasks may be relatively simple or more complex, depending on the specific
application. For example, in a transportation system, low-level agents might be responsible for
managing traffic flow at specific intersections.
Hierarchical agents can be organized into different levels, depending on the complexity of the
system. In a simple system, there may be only two levels: high-level agents and low-level agents. In
a more complex system, there may be multiple levels, with intermediate-level agents responsible for
coordinating the activities of lower-level agents.
One advantage of hierarchical agents is that they allow for more efficient use of resources. By
organizing agents into a hierarchy, it is possible to allocate tasks to the agents that are best suited to
carry them out, while avoiding duplication of effort. This can lead to faster, more efficient decision-
making and better overall performance of the system.
Overall, hierarchical agents are a powerful tool in artificial intelligence that can help solve complex
problems and improve efficiency in a variety of applications.
Uses of Agents
Agents are used in a wide range of applications in artificial intelligence, including:
Robotics: Agents can be used to control robots and automate tasks in manufacturing, transportation,
and other industries.
Smart homes and buildings: Agents can be used to control heating, lighting, and other systems in
smart homes and buildings, optimizing energy use and improving comfort.
Transportation systems: Agents can be used to manage traffic flow, optimize routes for autonomous
vehicles, and improve logistics and supply chain management.
Healthcare: Agents can be used to monitor patients, provide personalized treatment plans, and
optimize healthcare resource allocation.
Finance: Agents can be used for automated trading, fraud detection, and risk management in the
financial industry.
Games: Agents can be used to create intelligent opponents in games and simulations, providing a
more challenging and realistic experience for players.
Natural language processing: Agents can be used for language translation, question answering, and
chatbots that can communicate with users in natural language.
Cybersecurity: Agents can be used for intrusion detection, malware analysis, and network security.
Environmental monitoring: Agents can be used to monitor and manage natural resources, track
climate change, and improve environmental sustainability.
Social media: Agents can be used to analyze social media data, identify trends and patterns, and
provide personalized recommendations to users.
Good Behavior:
An agent should act as a rational agent. A rational agent is one that does the right thing; that
is, its actions will cause the agent to be most successful in the environment.
Performance measures
A performance measure embodies the criterion for success of an agent's behavior. As a general
rule, it is better to design performance measures according to what one actually wants in the
environment, rather than according to how one thinks the agent should behave.
Rationality
What is rational at any given time depends on four things:
The performance measure that defines the criterion of success.
The agent‘s prior knowledge of the environment.
The actions that the agent can perform.
The agent‘s percept sequence to date.
This leads to a definition of a rational agent (ideal rational agent)
“For each possible percept sequence, a rational agent should select an action that is expected to
maximize its performance measure, given the evidence provided by the percept sequence and
whatever built-in knowledge the agent has.” That is, the task of a rational agent is to improve a
performance measure that depends on the percept sequence, as sketched below.
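The definition above can be read as: choose the action with the highest expected performance, given what the percept sequence and built-in knowledge tell the agent about likely outcomes. A toy sketch of that selection rule follows; all probabilities and scores are invented for illustration (think of the road-crossing example).

```python
# expected_performance(action) = sum over outcomes of P(outcome | percepts) * score(outcome)
def expected_performance(action, outcome_probs, scores):
    return sum(p * scores[outcome] for outcome, p in outcome_probs[action].items())

# Invented numbers: beliefs derived from the percept sequence plus built-in knowledge.
outcome_probs = {
    "cross_now": {"safe": 0.6, "accident": 0.4},
    "wait":      {"safe": 1.0, "accident": 0.0},
}
scores = {"safe": 10, "accident": -100}   # performance measure per outcome

rational_action = max(outcome_probs, key=lambda a: expected_performance(a, outcome_probs, scores))
print(rational_action)  # -> wait
```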
A fundamental question in AI and philosophy is whether machines can truly think or just
simulate thinking.
Strong AI vs. Weak AI
Strong AI (Artificial General Intelligence - AGI): Hypothetical AI with human-like reasoning.
Weak AI (Artificial Narrow Intelligence - ANI): AI specialized for specific tasks (e.g., Siri,
ChatGPT).
Turing Test
Proposed by Alan Turing, it states that an AI is intelligent if it can imitate human conversation so
well that a human cannot distinguish it from another human.
Chinese Room Argument
Proposed by John Searle, it suggests that an AI can simulate intelligence without truly
understanding anything.
2. The Ethics of AI
AI raises several ethical concerns, including:
Bias & Fairness: AI models may inherit biases from training data, leading to discrimination.
Privacy & Surveillance: AI can track and monitor individuals without consent.
Job Displacement: Automation may replace human workers.
AI in Warfare: AI-powered autonomous weapons pose security risks.
Ethical AI Solutions:
Developing transparent AI models.
Implementing regulations like the EU AI Act.
Using AI for social good, such as medical diagnosis.
3. The Limits of AI
Despite its advancements, AI has several limitations:
1. Lack of Common Sense – AI cannot reason like humans.
2. Bias in Data – AI decisions can be unfair.
3. Computational Costs – Training deep learning models requires significant resources.
4. Security Risks – AI can be manipulated or attacked (e.g., deepfakes).
Limits Of AI:
While AI has made remarkable advancements, it still has several fundamental limitations that
prevent it from achieving human-like intelligence or fully replacing human decision-making. These
limits are influenced by factors such as technology, data, ethics, and our understanding of
intelligence itself.
Artificial intelligence (AI) has several limitations, including:
Lack of creativity: AI can generate content and ideas, but it can't create original solutions or
innovate beyond its programming.
Lack of common sense: AI systems are good at specific tasks, but they don't have a deep
understanding of the world.
Lack of explainability: Some AI models are difficult to understand, making it hard to know how
they reach conclusions. This is known as the "black box" problem.
Data dependency: AI systems are dependent on the quality and quantity of training data.
Resource intensiveness: Training AI models requires a lot of computational power and energy.
Limited transfer learning: AI models are good at the specific tasks they're trained for, but it's hard
to transfer their knowledge to new tasks.
Vulnerability to adversarial attacks: AI systems can be misled if the input data is intentionally
manipulated.
Lack of emotional intelligence: AI systems struggle to understand and respond to human
emotions.
Bias: AI systems can perpetuate biases in decision-making, which can lead to discriminatory
results.
Contextual understanding: AI systems can struggle with understanding nuance or context, which
can lead to errors in decision-making.
The Ethics of AI
The ethics of artificial intelligence (AI) involves the principles and practices that ensure AI is
developed and used in a responsible and fair way.
Ethical considerations
Privacy: AI systems can collect a lot of personal information, so it's important to protect it from
unauthorized access and misuse.
Explainability: AI systems can make life-changing decisions, so it's important to be able to explain
how those decisions were made.
Fairness: AI systems should be fair and unbiased.
Transparency: AI systems should be transparent and understandable.
Beneficence: AI systems should promote well-being and have a positive impact on society.
Environmental sustainability: AI systems should be designed to be environmentally sustainable.
Regulatory Frameworks:
Governments and international bodies are likely to develop regulatory frameworks for AI,
ensuring that its deployment is ethical, transparent, and aligned with public interest.
AI Governance: Organizations and governments will increasingly develop guidelines and
frameworks for responsible AI development, focusing on ensuring that AI serves humanity without
violating individual rights.
7. AI and Creativity
AI is already being used in fields like art, music, film, and writing, pushing the boundaries of what
we consider "creative."
Key Developments:
Art Generation: AI systems like DALL·E and DeepArt are capable of generating art that can
sometimes rival human creativity.
Music Composition: AI has been used to compose original pieces of music by analyzing existing
works and generating new compositions.
Storytelling and Content Creation: AI tools, like GPT-based models, are capable of generating
texts that range from poetry to full-length novels.
Challenges:
Authorship and Ownership: As AI creates content, questions arise about who owns the intellectual
property rights of AI-generated works.
Human Creativity: How much of the creative process should be left to humans, and how much
should be outsourced to machines?
8. Long-Term Risks of AI
As AI continues to evolve, existential risks related to superintelligent AI or AGI have become a
topic of intense debate.
Key Concerns:
Loss of Control: The potential for superintelligent AI to operate beyond human control could lead
to unintended consequences.
Alignment Problem: Ensuring that AI’s objectives align with human values and priorities is a key
concern. If an AI's goals are misaligned with humanity's well-being, it could pose a significant
threat.
Weak AI:
Also known as narrow AI, it refers to machines designed to perform specific tasks with no true
understanding of the task. Most AI today is Weak AI.
Example: A chess-playing AI can perform incredibly well at chess but has no understanding of what
chess is in a broader sense. It doesn’t think about chess in the way a human would.
Key Point:
Strong AI posits that machines can think like humans, while Weak AI suggests that while machines
can perform tasks intelligently, they don't "think" in the human sense.
AI COMPONENTS:
The Five Branches of AI
Below are the five primary branches or subfields of Artificial Intelligence (AI), each contributing
uniquely to the development and capabilities of intelligent systems.
1. Machine Learning
Machine Learning (ML) stands as a vital subset within AI, focusing on machines’ capacity to learn
autonomously from data and algorithms. ML leverages the foundational elements of AI to make
decisions without explicit programming by humans, enhancing its adaptability and problem-solving
capabilities.
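A minimal illustration of "learning from data rather than explicit programming": fitting a straight line to data by gradient descent, so the parameters are learned from examples instead of being hand-coded. The data and hyperparameters below are arbitrary choices for the sketch.

```python
import numpy as np

# Toy data generated from y = 2x + 1 with a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2 * x + 1 + 0.05 * rng.standard_normal(50)

w, b = 0.0, 0.0            # parameters to be learned from the data
lr = 0.5                   # learning rate (an arbitrary choice)
for _ in range(2000):      # gradient descent on mean squared error
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)
    grad_b = 2 * np.mean(pred - y)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # close to 2 and 1
```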
2. Deep Learning
Deep Learning (DL) operates as a subset of machine learning, utilizing artificial neural networks
(ANNs) inspired by the human brain. DL excels at extracting intricate features from data, leading to
superior performance compared to traditional machine learning.
It minimizes human intervention further, although it requires substantial amounts of data. Common
applications include natural language processing improvements in technologies like Amazon Alexa
or Google Home.
AI ARCHITECTURES
AI architectures refer to the underlying structures and frameworks that define how an AI
system is designed, organized, and operates to perform tasks that typically require human
intelligence. These architectures provide the foundation for the development of AI models and
systems, enabling them to handle tasks like learning, decision-making, pattern recognition, and
problem-solving. Below are some of the key AI architectures:
1. Neural Network Architecture
The Neural Network (NN) Architecture is inspired by the human brain and consists of layers of
interconnected neurons (nodes). It is one of the most common architectures used in AI, especially in
deep learning models.
Key Components of Neural Network Architecture:
Input Layer: The layer where the raw data (features) is input into the neural network.
Hidden Layers: Intermediate layers that process information received from the input layer, applying
weights, biases, and activation functions to make decisions.
Output Layer: The final layer that produces the prediction or output based on the network's
computation.
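A rough sketch of the layered structure just described: one input layer, one hidden layer with a nonlinear activation, and an output layer, written in NumPy. The weights are random, so the output is meaningless; the point is only how data flows through the layers. Layer sizes and activations are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Layer sizes: 4 input features -> 8 hidden units -> 1 output (arbitrary choices).
W1, b1 = rng.standard_normal((4, 8)), np.zeros(8)   # input -> hidden weights and biases
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)   # hidden -> output weights and biases

x = rng.standard_normal(4)          # one example entering the input layer
hidden = relu(x @ W1 + b1)          # hidden layer: weights, biases, activation function
output = sigmoid(hidden @ W2 + b2)  # output layer: the network's prediction
print(output)
```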
Popular Neural Network Architectures:
Feedforward Neural Networks (FNN): The simplest form of a neural network where data flows in
one direction from input to output.
Convolutional Neural Networks (CNN): Primarily used for image recognition and processing tasks.
CNNs use convolutional layers to detect patterns like edges, shapes, and textures.
Recurrent Neural Networks (RNN): Used for sequential data, such as time-series data or natural
language. RNNs have feedback loops that allow information to persist over time.
Deep Neural Networks (DNN): Deep neural networks have multiple hidden layers, enabling them to
learn complex patterns from large datasets.
Applications:
Image recognition
Speech processing
Natural language processing
Autonomous systems
2. Rule-Based Expert Systems
Expert Systems are AI architectures that use a knowledge base of if-then rules to simulate the
decision-making abilities of human experts in a specific domain. These systems are designed to
solve complex problems by reasoning through the rules and facts stored in the knowledge base.
Components:
Knowledge Base: A collection of rules, facts, and domain-specific knowledge.
Inference Engine: The component that applies the rules to the facts in the knowledge base to draw
conclusions or make decisions.
User Interface: Allows interaction with the expert system, providing input and receiving output.
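A toy sketch of the knowledge base / inference engine split described above: facts and if-then rules form the knowledge base, and a simple forward-chaining engine applies rules until nothing new can be concluded. The medical-style rules are invented for illustration and are not a real diagnostic system.

```python
# Knowledge base: known facts plus if-then rules (set of conditions -> conclusion).
facts = {"fever", "cough"}
rules = [
    ({"fever", "cough"}, "suspect_flu"),
    ({"suspect_flu"}, "recommend_rest"),
]

def forward_chain(facts, rules):
    """Inference engine: keep applying rules until no new facts are derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain(facts, rules))  # includes 'suspect_flu' and 'recommend_rest'
```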
Applications:
Medical diagnosis
Customer service support
Troubleshooting systems
3. Probabilistic Graphical Models (PGMs)
Probabilistic Graphical Models (PGMs) represent a class of models that combine probability theory
and graph theory to model uncertain information and relationships between variables. PGMs are
particularly useful in scenarios where uncertainty is inherent, such as in decision-making and
reasoning.
Types of PGMs:
Bayesian Networks: Directed acyclic graphs (DAGs) where nodes represent random variables, and
edges represent conditional dependencies. Bayesian networks are used for reasoning and decision-
making under uncertainty.
Markov Networks: Undirected graphs used to model dependencies in systems with a large number
of variables, often used for image processing and computer vision tasks.
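A toy example of reasoning under uncertainty in the spirit of a two-node Bayesian network (Rain influencing WetGrass): given the conditional probabilities on the edge, Bayes' rule yields the probability of rain after observing wet grass. All numbers are invented for the sketch.

```python
# P(Rain), P(WetGrass | Rain), P(WetGrass | not Rain) -- invented numbers.
p_rain = 0.2
p_wet_given_rain = 0.9
p_wet_given_no_rain = 0.1

# Bayes' rule: P(Rain | WetGrass) = P(WetGrass | Rain) * P(Rain) / P(WetGrass)
p_wet = p_wet_given_rain * p_rain + p_wet_given_no_rain * (1 - p_rain)
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet
print(round(p_rain_given_wet, 3))  # ~0.692: wet grass makes rain much more likely
```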
Applications:
Risk analysis
Decision support systems
Natural language processing
Image segmentation
4. Reinforcement Learning Architecture
Reinforcement Learning (RL) is a type of machine learning where an agent interacts with an
environment and learns to make decisions through trial and error. It receives rewards or penalties
based on the actions it takes and aims to maximize the cumulative reward over time.
Key Components:
Agent: The entity that takes actions in the environment to achieve a goal.
Environment: The world in which the agent operates, which provides feedback based on the agent’s
actions.
State: A representation of the current situation in the environment.
Action: A choice made by the agent that affects the state of the environment.
Reward: Feedback received by the agent to evaluate the success of its actions.
Policy: A strategy or mapping from states to actions that the agent uses to make decisions.
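A compact sketch of how the components above fit together in tabular Q-learning: the agent acts in a tiny environment, receives rewards, and updates its estimate of each (state, action) value toward the reward plus the discounted best value of the next state. The corridor environment and hyperparameters are made up for illustration.

```python
import random

# Tiny corridor environment: states 0..3, goal at state 3, actions "left"/"right".
def step(state, action):
    next_state = max(0, min(3, state + (1 if action == "right" else -1)))
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward, next_state == 3

actions = ["left", "right"]
Q = {(s, a): 0.0 for s in range(4) for a in actions}   # tabular value estimates
alpha, gamma, epsilon = 0.5, 0.9, 0.2                  # assumed hyperparameters

for _ in range(300):                                   # training episodes
    state = 0
    for _ in range(50):                                # cap on episode length
        # Epsilon-greedy action selection (ties broken randomly).
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: (Q[(state, a)], random.random()))
        next_state, reward, done = step(state, action)
        # Q-learning update: nudge the estimate toward reward + discounted best next value.
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state
        if done:
            break

print(max(actions, key=lambda a: Q[(0, a)]))  # learned policy at the start state: "right"
```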
Applications:
Game playing (e.g., AlphaGo, Chess)
Robotics
Autonomous vehicles
Financial trading
5. Hybrid AI Architectures
Hybrid AI architectures combine multiple AI paradigms to leverage the strengths of different
approaches. By combining symbolic reasoning with subsymbolic methods (like machine learning or
neural networks), hybrid systems aim to handle complex tasks that require both structured
knowledge and learning from data.
Types of Hybrid AI:
Neuro-symbolic AI: Combines the pattern recognition abilities of neural networks with the
structured reasoning capabilities of symbolic systems. It integrates deep learning models with logic-
based reasoning to handle tasks like natural language understanding and visual reasoning.
Cognitive Architectures: These architectures are inspired by the human brain’s cognitive processes
and often combine elements of symbolic AI, machine learning, and cognitive psychology to create
systems that can reason, learn, and adapt. An example is ACT-R (Adaptive Control of Thought—
Rational).
Applications:
Complex decision-making systems
Autonomous agents in uncertain environments
Human-computer interaction
6. Transformer Architectures
Transformers are a type of deep learning architecture that has significantly advanced the field of
natural language processing (NLP). They use self-attention mechanisms to handle sequences of data
more efficiently than previous models, such as RNNs.
Key Components:
Self-Attention: The ability of the model to focus on different parts of the input sequence when
processing each element, improving the ability to capture long-range dependencies in the data.
Encoder-Decoder Structure: A common transformer structure, where the encoder processes the input
sequence and the decoder generates the output sequence.
Positional Encoding: Since transformers don’t have a built-in sense of sequence order (like RNNs),
positional encoding is added to input data to give the model information about the position of each
word in the sequence.
Applications:
Machine translation (e.g., Google Translate)
Text generation (e.g., GPT, BERT, T5)
Speech recognition
Language understanding tasks
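The self-attention mechanism described above can be sketched in a few lines of NumPy. The toy matrices and dimensions are arbitrary, and positional encoding and multi-head projections are deliberately omitted; this is only a rough sketch of scaled dot-product attention, not a full transformer layer.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X (seq_len x d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                # project inputs to queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of every position to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                              # each output mixes information from all positions

# Toy example: a sequence of 3 tokens with model dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # -> (3, 4)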
7. Cloud-Based AI Architectures
Cloud-based AI architectures enable the deployment, training, and inference of AI models on cloud
platforms. These architectures provide scalability, flexibility, and accessibility by leveraging cloud
computing resources like compute power, storage, and data management services.
Key Features:
Distributed Training: AI models, particularly deep learning models, require significant
computational resources. Cloud platforms allow for distributed training of models across multiple
machines, reducing training time and enabling the use of larger datasets.
Scalable Inference: Cloud services can scale AI inference based on demand, allowing for real-time
processing without the need for on-premise infrastructure.
Managed AI Services: Cloud providers like AWS, Azure, and Google Cloud offer managed AI
services, enabling users to quickly build and deploy AI models without managing infrastructure.
Applications:
Cloud-based AI platforms (e.g., AWS SageMaker, Azure AI)
Real-time data processing
Machine learning model deployment at scale
8. Edge AI Architectures
Edge AI refers to running AI models directly on edge devices (e.g., smartphones, sensors, IoT
devices) rather than relying on centralized cloud computing. This architecture is designed to process
data locally on the device, enabling faster, more efficient decision-making without the need for
constant connectivity to the cloud.
Key Features:
Low Latency: Edge AI provides real-time processing, as there is no need to transmit data to the
cloud for processing.
Data Privacy: Since data is processed locally, edge AI can enhance data privacy and reduce the risk
of sensitive information being exposed.
Reduced Bandwidth Usage: Edge AI reduces the amount of data that needs to be sent to the cloud,
which can save bandwidth and reduce network congestion.
Applications:
Smart devices (e.g., smartphones, smart cameras)
Autonomous vehicles
Industrial IoT systems
Healthcare monitoring systems
The architecture of AI typically refers to the structure or framework of how artificial intelligence systems
are built, with different components and layers working together to perform tasks such as learning,
reasoning, decision-making, and problem-solving. Depending on the specific field or domain of AI (like
machine learning, deep learning, reinforcement learning, etc.), the architecture can vary. However, here is
a general breakdown of key architectural components in AI:
2. Data Representation
Feature Extraction: The process of identifying and selecting relevant features from raw data. This
is essential for making the AI system more efficient and effective.
Embeddings: In the case of unstructured data (such as images or text), embedding techniques like
Word2Vec, GloVe (for text), or CNNs (for images) are used to represent complex data in a more
manageable form.
Common AI Architectures
2. Generative Models:
o Generative Adversarial Networks (GANs): Consist of two networks (a generator and a
discriminator) competing against each other to generate realistic synthetic data.
o Variational Autoencoders (VAEs): A type of generative model that uses probabilistic inference to
generate new data similar to the training set.
3.Hybrid Architectures:
o These involve combining different AI models or paradigms, such as combining symbolic AI with
machine learning, or using reinforcement learning with deep learning for complex decision-
making tasks (e.g., AlphaGo).
Some of the problems most commonly solved with the help of artificial intelligence are:
1. Chess.
2. Travelling Salesman Problem.
3. Tower of Hanoi Problem.
4. Water-Jug Problem.
5. N-Queen Problem.
Problem Searching
Problem: A problem is an issue that comes across any system; a solution is needed to
solve that particular problem.
Steps to solve a problem using Artificial Intelligence: The process of solving a problem consists of
five steps. These are:
Figure 1.11 Problem Solving in Artificial Intelligence
1. Defining the Problem: The definition of the problem must be stated precisely. It should
contain the possible initial as well as final situations which should result in an acceptable solution.
2. Analyzing the Problem: The problem and its requirements must be analyzed, as a
few features can have an immense impact on the resulting solution.
3. Identifying Solutions: All the possible solutions to the problem are identified.
4. Choosing a Solution: From all the identified solutions, the best solution is chosen based
on the results produced by the respective solutions.
5. Implementation: The chosen solution is implemented.
1. Search Space: Search space represents a set of possible solutions, which a system may have.
2. Start State: It is a state from where agent begins the search.
3. Goal test: It is a function which observes the current state and returns whether the goal state is
achieved or not.
Search tree: A tree representation of the search problem is called a search tree. The root of the
search tree is the node corresponding to the initial state.
Actions: It gives the description of all the available actions to the agent.
Solution: It is an action sequence which leads from the start node to the goal node.
Optimal Solution: A solution is optimal if it has the lowest cost among all solutions.
Example Problems
A Toy Problem is intended to illustrate or exercise various problem-solving methods.
A real-world problem is one whose solutions people actually care about.
Toy Problems
Vacuum World
States: The state is determined by both the agent location and the dirt locations. The agent is in one
of the 2 locations, each of which might or might not contain dirt. Thus there are 2*2^2=8 possible
world states.
Actions: In this simple environment, each state has just three actions: Left, Right, and
Suck. Larger environments might also include Up and Down.
Transition model: The actions have their expected effects, except that moving Left in the
leftmost square, moving Right in the rightmost square, and Sucking in a clean square have no effect.
The complete state space is shown in Figure.
Goal test: This checks whether all the squares are clean.
Path cost: Each step costs 1, so the path cost is the number of steps in the path
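A minimal sketch of this vacuum-world formulation follows; the particular state encoding (agent location plus one dirt flag per square) is just one of several reasonable choices and is used here only for illustration.

# Sketch of the two-square vacuum world. A state is (location, dirt_A, dirt_B),
# where location is "A" or "B" and the dirt flags are booleans.

def result(state, action):
    """Transition model: return the state that results from applying an action."""
    loc, dirt_a, dirt_b = state
    if action == "Left":
        return ("A", dirt_a, dirt_b)
    if action == "Right":
        return ("B", dirt_a, dirt_b)
    if action == "Suck":
        return (loc, False if loc == "A" else dirt_a,
                     False if loc == "B" else dirt_b)
    return state                                # unknown actions have no effect

def goal_test(state):
    """Goal: every square is clean."""
    return not state[1] and not state[2]

s = ("A", True, True)                           # agent in A, both squares dirty
for a in ["Suck", "Right", "Suck"]:
    s = result(s, a)
print(goal_test(s))                             # -> True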
For the 8-puzzle, the simplest formulation defines the actions as movements of the blank space: Left,
Right, Up, or Down. Different subsets of these are possible depending on where the blank is.
Transition model: Given a state and action, this returns the resulting state; for
example, if we apply Left to the start state in Figure 3.4, the resulting state has the 5 and the
blank switched.
Goal test: This checks whether the state matches the goal configuration shown in
Figure. Path cost: Each step costs 1, so the path cost is the number of steps in the path.
Water-Jug Problem
Consider the given problem and describe the operators involved in it. Consider the water jug
problem: You are given two jugs, a 4-gallon one and a 3-gallon one. Neither has any
measuring marker on it. There is a pump that can be used to fill the jugs with water. How can
you get exactly 2 gallons of water into the 4-gallon jug?
Explicit Assumptions: A jug can be filled from the pump, water can be poured out of a jug
on to the ground, water can be poured from one jug to another and that there are no other
measuring devices available.
Here the initial state is (0, 0). The goal state is (2, n) for any value of n.
State Space Representation: we will represent a state of the problem as a tuple (x, y)
where x represents the amount of water in the 4-gallon jug and y represents the amount of
water in the 3-gallon jug. Note that 0 ≤ x ≤ 4, and 0 ≤ y ≤ 3.
(These explicit assumptions are needed because they are not mentioned in the original problem statement.)
The task, then, is to find a sequence of operations that leaves exactly 2 gallons of water in the 4-gallon jug.
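A short breadth-first sketch of this (4, 3)-gallon state space is given below; the operator names are informal labels chosen only for this illustration.

from collections import deque

# Breadth-first search over the water-jug state space (x, y),
# with 0 <= x <= 4 (4-gallon jug) and 0 <= y <= 3 (3-gallon jug).

def successors(x, y):
    yield (4, y), "fill 4-gal"
    yield (x, 3), "fill 3-gal"
    yield (0, y), "empty 4-gal"
    yield (x, 0), "empty 3-gal"
    pour = min(x, 3 - y)                     # pour 4-gal into 3-gal
    yield (x - pour, y + pour), "pour 4->3"
    pour = min(y, 4 - x)                     # pour 3-gal into 4-gal
    yield (x + pour, y - pour), "pour 3->4"

def solve(start=(0, 0)):
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if x == 2:                           # goal: exactly 2 gallons in the 4-gallon jug
            return path
        for state, op in successors(x, y):
            if state not in seen:
                seen.add(state)
                frontier.append((state, path + [op]))

print(solve())   # prints one shortest sequence of fills, empties, and pours (six operations)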
The solution of many problems can be described by finding a sequence of actions that lead
to a desirable goal. Each action changes the state and the aim is to find the sequence of
actions and states that lead from the initial (start) state to a final (goal) state.
Initial state - the state in which the agent starts
Operator or successor function - for any state x returns s(x), the set of states reachable
from x with one action
State space - all states reachable from initial by any sequence of actions
Path cost - function that assigns a cost to a path. Cost of a path is the sum of costs of
individual actions along the path
What is Search?
Search is the systematic examination of states to find path from the start/root state to
the goal state.
The set of possible states, together with operators defining their connectivity constitute
the search space.
The output of a search algorithm is a solution, that is, a path from the initial state to a
state that satisfies the goal test.
Problem-solving agents
To illustrate the agent’s behavior, let us take an example where our agent is in the city of
Arad, which is in Romania. The agent has to adopt a goal of getting to Bucharest.
Goal formulation, based on the current situation and the agent’s performance measure, is the
first step in problem solving.
The agent’s task is to find out which sequence of actions will get to a goal state.
Problem formulation is the process of deciding what actions and states to consider given
a goal.
Example: Route-finding problem (refer to the figure)
On holiday in Romania : currently in Arad. Flight leaves tomorrow from Bucharest
Formulate goal: be in Bucharest
Formulate problem: states: various cities
actions: drive between cities
Find solution:
sequence of cities, e.g., Arad, Sibiu, Fagaras, Bucharest
Problem formulation
A problem is defined by four items:
initial state, e.g., “at Arad"
successor function S(x) = set of action-state pairs, e.g., S(Arad) = {<Arad -> Zerind, Zerind>, …}
goal test, which can be explicit, e.g., x = “at Bucharest", or implicit, e.g., NoDirt(x)
path cost (additive), e.g., sum of distances, number of actions executed, etc.; c(x, a, y) is the step cost,
assumed to be >= 0
A solution is a sequence of actions leading from the initial state to a goal state.
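As a rough illustration of these four items, here is a small Python sketch of the Romania route-finding formulation. The road distances follow the familiar example map, but only a handful of cities are included, so this is a partial model rather than a complete formulation.

# A (partial) formulation of the Romania route-finding problem as the four items
# above: initial state, successor function, goal test, and path cost.

roads = {                                    # step costs between neighbouring cities (km)
    "Arad": {"Zerind": 75, "Sibiu": 140, "Timisoara": 118},
    "Sibiu": {"Arad": 140, "Fagaras": 99, "Rimnicu Vilcea": 80, "Oradea": 151},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Rimnicu Vilcea": {"Sibiu": 80, "Pitesti": 97},
    "Pitesti": {"Rimnicu Vilcea": 97, "Bucharest": 101},
}

initial_state = "Arad"

def successor(state):
    """S(x): the set of (action, resulting state) pairs reachable in one step."""
    return [(f"Go({city})", city) for city in roads.get(state, {})]

def goal_test(state):
    return state == "Bucharest"

def path_cost(path):
    """Additive cost: sum of step costs along a path of cities."""
    return sum(roads[a][b] for a, b in zip(path, path[1:]))

print(successor("Arad"))
print(path_cost(["Arad", "Sibiu", "Fagaras", "Bucharest"]))   # -> 450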
Goal formulation and problem formulation together precede the search for a solution.
EXAMPLE PROBLEMS
The problem-solving approach has been applied to a vast array of task environments.
Some of the best-known problems are summarized below. They are distinguished as toy or real-
world problems.
A real-world problem is one whose solutions people actually care about.
TOY PROBLEMS
o States: The agent is in one of two locations, each of which might or might not
contain dirt. Thus there are 2 x 2^2 = 8 possible world states.
o Successor function: This generates the legal states that results from trying the three
actions (left, right, suck). The complete state space is shown in figure
o Goal Test: This tests whether all the squares are clean.
o Path cost: Each step costs one, so the path cost is the number of steps in the path.
Vacuum World State Space
The 8-puzzle
An 8-puzzle consists of a 3x3 board with eight numbered tiles and a blank space. A
tile adjacent to the blank space can slide into the space. The object is to reach the goal state,
as shown in Figure 2.4
o States : A state description specifies the location of each of the eight tiles and the
blank in one of the nine squares.
o Initial state : Any state can be designated as the initial state. It can be noted that any
given goal can be reached from exactly half of the possible initial states.
o Successor function : This generates the legal states that result from trying the four
actions(blank moves Left, Right, Up or down).
o Goal Test : This checks whether the state matches the goal configuration shown in
Figure(Other goal configurations are possible)
o Path cost : Each step costs 1,so the path cost is the number of steps in the path.
The 8-puzzle belongs to the family of sliding-block puzzles, which are often used as
test problems for new search algorithms in AI. This general class of problems is known to be NP-complete.
The 8-puzzle has 9!/2 = 181,440 reachable states and is easily solved.
The 15-puzzle (4 x 4 board) has around 1.3 trillion states, and random instances
can be solved optimally in a few milliseconds by the best search algorithms.
The 24-puzzle (on a 5 x 5 board) has around 10^25 states, and random instances are still
quite difficult to solve optimally with current machines and algorithms.
8-Queens problem
The goal of 8-queens problem is to place 8 queens on the chessboard such that no
queen attacks any other.(A queen attacks any piece in the same row, column or diagonal).
Figure 2.3 shows an attempted solution that fails: the queen in the right most column
is attacked by the queen at the top left.
A better formulation would prohibit placing a queen in any square that is already
attacked.
o States : Arrangements of n queens ( 0 <= n < = 8 ),one per column in the left
most columns, with no queen attacking another are states.
o Successor function : Add a queen to any square in the left most empty column
such that it is not attacked by any other queen.
This formulation reduces the 8-queens state space from 3 x 10^14 to just 2,057, and
solutions are easy to find.
For 100 queens the initial formulation has roughly 10^400 states, whereas the
improved formulation has about 10^52 states. This is a huge reduction, but the improved state
space is still too big for the algorithms to handle.
REAL-WORLD PROBLEMS
ROUTE-FINDING PROBLEM
o States: Each is represented by a location (e.g., an airport) and the current time.
o Initial state: This is specified by the problem.
o Successor function: This returns the states resulting from taking any scheduled flight
(further specified by seat class and location),leaving later than the current time plus
the within-airport transit time, from the current airport to another.
TOURING PROBLEMS
As with route-finding the actions correspond to trips between adjacent cities. The
state space, however, is quite different.
The goal test would check whether the agent is in Bucharest and all 20 cities have been
visited.
The travelling salesperson problem (TSP) is a touring problem in which each city must be visited exactly once. The aim is to
find the shortest tour. The problem is known to be NP-hard. Enormous efforts have been
expended to improve the capabilities of TSP algorithms. These algorithms are also used in
tasks such as planning movements of automatic circuit-board drills and of stocking
machines on shop floors.
AUTOMATIC ASSEMBLY SEQUENCING
Examples include the assembly of intricate objects such as electric motors. The aim in
assembly problems is to find an order in which to assemble the parts of some object. If the
wrong order is chosen, there will be no way to add some part later without undoing some
work already done. Another important assembly problem is protein design, in which the goal
is to find a sequence of amino acids that will fold into a three-dimensional protein with
the right properties to cure some disease.
INTERNET SEARCHING
In recent years there has been increased demand for software robots that perform
Internet searching, looking for answers to questions, for related information, or for shopping
deals. The searching techniques consider internet as a graph of nodes(pages) connected by
links.
Strategies that use only the information available in the problem definition are called uninformed
(blind) search strategies; strategies that know whether one non-goal state is “more promising” than
another are called informed search or heuristic search strategies. The main uninformed strategies are:
o Breadth-first search
o Uniform-cost search
o Depth-first search
o Depth-limited search
o Iterative deepening search
BREADTH-FIRST SEARCH
o Breadth-first search is a simple strategy in which the root node is expanded first, then
all successors of the root node are expanded next, then their successors, and so on. In
general, all the nodes are expanded at a given depth in the search tree before any
nodes at the next level are expanded.
Figure 2.5 Breadth-first search on a simple binary tree. At each stage, the node to
be expanded next is indicated by a marker.
Properties of breadth-first-search
Assume every state has b successors. The root of the search tree generates b nodes at
the first level, each of which generates b more nodes, for a total of b^2 at the second level.
Each of these generates b more nodes, yielding b^3 nodes at the third level, and so on. Now
suppose that the solution is at depth d. In the worst case, we would expand all but the last
node at level d, generating b^(d+1) - b nodes at level d+1.
Then the total number of nodes generated is
b + b^2 + b^3 + … + b^d + (b^(d+1) - b) = O(b^(d+1)).
Every node that is generated must remain in memory, because it is either part of the
fringe or is an ancestor of a fringe node. The space complexity is, therefore, the same as the
time complexity.
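A minimal breadth-first search sketch over an explicit successor function is given below; the small example graph and its node names are invented purely for illustration.

from collections import deque

def breadth_first_search(start, goal_test, successors):
    """Expand the shallowest unexpanded node first; return a path to a goal."""
    frontier = deque([[start]])
    explored = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if goal_test(node):
            return path
        for child in successors(node):
            if child not in explored:          # avoid repeated states
                explored.add(child)
                frontier.append(path + [child])
    return None

# Illustrative graph: node -> list of successors.
graph = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": ["G"], "F": ["G"], "G": []}
print(breadth_first_search("A", lambda n: n == "G", lambda n: graph[n]))   # -> ['A', 'B', 'E', 'G']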
UNIFORM-COST SEARCH
Instead of expanding the shallowest node, uniform-cost search expands the node n
with the lowest path cost. Uniform-cost search does not care about the number of steps a path
has, but only about their total cost.
DEPTH-FIRST-SEARCH
Depth-first-search always expands the deepest node in the current fringe of the search
tree. The progress of the search is illustrated in Figure 1.31. The search proceeds immediately
to the deepest level of the search tree, where the nodes have no successors. As those nodes
are expanded, they are dropped from the fringe, so then the search “backs up” to the next
shallowest node that still has unexplored successors.
Figure 2.7 Depth-first search on a binary tree. Nodes that have been expanded and have
no descendants in the fringe can be removed from memory; these are shown in
black. Nodes at depth 3 are assumed to have no successors and M is the only goal node.
For a state space with a branching factor b and maximum depth m, depth-first-search
requires storage of only bm + 1 nodes.
Using the same assumptions as Figure, and assuming that nodes at the same depth as
the goal node have no successors, we find the depth-first-search would require 118 kilobytes
instead of 10 petabytes, a factor of 10 billion times less space.
Drawback of Depth-first-search
The drawback of depth-first-search is that it can make a wrong choice and get stuck
going down very long(or even infinite) path when a different choice would lead to solution
near the root of the search tree. For example, depth-first-search will explore the entire left
subtree even if node C is a goal node.
BACKTRACKING SEARCH
A variant of depth-first search called backtracking search uses less memory: only
one successor is generated at a time rather than all successors, so only O(m) memory is needed
rather than O(bm).
DEPTH-LIMITED-SEARCH
Depth-limited search will be nonoptimal if we choose l > d. Its time complexity is
O(b^l) and its space complexity is O(bl). Depth-first search can be viewed as a special case of
depth-limited search with l = ∞. Sometimes, depth limits can be based on knowledge of the
problem. For example, on the map of Romania there are 20 cities. Therefore, we know that if
there is a solution, it must be of length 19 at the longest, so l = 19 is a possible choice.
However, it can be shown that any city can be reached from any other city in at most 9 steps.
This number, known as the diameter of the state space, gives us a better depth limit.
Depth-limited-search can be implemented as a simple modification to the general tree- search
algorithm or to the recursive depth-first-search algorithm. The pseudocode for recursive depth-
limited-search is shown in Figure.
It can be noted that the above algorithm can terminate with two kinds of failure: the
standard failure value indicates no solution; the cutoff value indicates no solution within the
depth limit. Depth-limited search = depth-first search with depth limit l; it returns cutoff if any
path is cut off by the depth limit.
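A recursive depth-limited search can be sketched as follows. The distinction between the failure value and the cutoff value mirrors the description above; the tiny example graph is invented for illustration.

CUTOFF, FAILURE = "cutoff", "failure"

def depth_limited_search(node, goal_test, successors, limit):
    """Recursive DFS that stops expanding below the given depth limit."""
    if goal_test(node):
        return [node]
    if limit == 0:
        return CUTOFF                         # no solution within the depth limit
    cutoff_occurred = False
    for child in successors(node):
        result = depth_limited_search(child, goal_test, successors, limit - 1)
        if result == CUTOFF:
            cutoff_occurred = True
        elif result != FAILURE:
            return [node] + result            # prepend the current node to the solution path
    return CUTOFF if cutoff_occurred else FAILURE

graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(depth_limited_search("A", lambda n: n == "D", lambda n: graph[n], limit=1))  # -> 'cutoff'
print(depth_limited_search("A", lambda n: n == "D", lambda n: graph[n], limit=2))  # -> ['A', 'B', 'D']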
The idea behind bidirectional search is to run two simultaneous searches – one
forward from the initial state and the other backward from the goal, stopping when the two
searches meet in the middle
The motivation is that b^(d/2) + b^(d/2) is much less than b^d; in the figure, the area of the two
small circles is less than the area of one big circle centered on the start and reaching to the
goal.
Figure 2.14 A schematic view of a bidirectional search that is about to succeed, when
a Branch from the Start node meets a Branch from the goal node.
• Before moving into bidirectional search let’s first understand a few terms.
• We must traverse the tree from the start node and the goal node and wherever they
meet the path from the start node to the goal through the intersection is the optimal
solution. The BS Algorithm is applicable when generating predecessors is easy in
both forward and backward directions and there exist only 1 or fewer goal states.
Figure 2.15 Comparing Uninformed Search Strategies
Figure 2.16 Evaluation of search strategies, b is the branching factor; d is the depth
of the shallowest solution; m is the maximum depth of the search tree; l is the depth
limit. Superscript caveats are as follows: a complete if b is finite; b complete if step
costs >= E for positive E; c optimal if step costs are all identical; d if both directions
use breadth-first search.
Best-first search
For example, in Romania, one might estimate the cost of the cheapest path from Arad
to Bucharest via a straight-line distance from Arad to Bucharest (Figure 2.19).
HEURISTIC function are the most common form in which additional knowledge is
imparted to the search algorithm.
GREEDY BEST-FIRST SEARCH
Greedy best-first search tries to expand the node that is closest to the goal, on the
grounds that this is likely to lead to a solution quickly.
It evaluates the nodes by using the heuristic function f(n) = h(n).
Taking the example of Route-finding problems in Romania, the goal is to reach
Bucharest starting from the city Arad. We need to know the straight-line distances to
Bucharest from various cities as shown in Figure. For example, the initial state is
In(Arad),and the straight line distance heuristic hSLD (In(Arad)) is found to be 366.
Using the straight-line distance heuristic hSLD, the goal state can be reached faster.
Figure shows the progress of greedy best-first search using hSLD to find a path from Arad to
Bucharest. The first node to be expanded from Arad will be Sibiu, because it is closer to Bucharest
than either Zerind or Timisoara. The next node to be expanded will be Fagaras, because it is closest.
Fagaras in turn generates Bucharest, which is the goal.
o Complete: No – can get stuck in loops, e.g., Iasi -> Neamt -> Iasi -> Neamt -> …;
complete in finite space with repeated-state checking
o Time: O(b^m), but a good heuristic can give dramatic improvement
o Space: O(b^m) – keeps all nodes in memory
o Optimal: No
A* Search is the most widely used form of best-first search. The evaluation function f(n) is
obtained by combining g(n), the cost to reach the node, and h(n), the estimated cost to get from
the node to the goal: f(n) = g(n) + h(n).
A* Search is both optimal and complete. A* is optimal if h(n) is an admissible heuristic. The
obvious example of an admissible heuristic is the straight-line distance hSLD, which cannot be an
overestimate.
A* Search is optimal if h(n) is an admissible heuristic – that is, provided that h(n) never
overestimates the cost to reach the goal.
An obvious example of an admissible heuristic is the straight-line distance hSLD that we used in
getting to Bucharest. The progress of an A* tree search for Bucharest is shown in Figure
The values of ‘g ‘ are computed from the step costs shown in the Romania map(figure).Also the values
of hSLD are given in Figure
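A compact A* sketch using f(n) = g(n) + h(n) on a fragment of the Romania map is given below, with the familiar straight-line distances to Bucharest as h; cities outside the fragment are simply omitted, so this is an illustrative model rather than the full map.

import heapq

roads = {
    "Arad": {"Zerind": 75, "Sibiu": 140, "Timisoara": 118},
    "Sibiu": {"Arad": 140, "Fagaras": 99, "Rimnicu Vilcea": 80},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Rimnicu Vilcea": {"Sibiu": 80, "Pitesti": 97},
    "Pitesti": {"Rimnicu Vilcea": 97, "Bucharest": 101},
    "Zerind": {"Arad": 75}, "Timisoara": {"Arad": 118}, "Bucharest": {},
}
h_sld = {"Arad": 366, "Sibiu": 253, "Fagaras": 176, "Rimnicu Vilcea": 193,
         "Pitesti": 100, "Zerind": 374, "Timisoara": 329, "Bucharest": 0}

def a_star(start, goal):
    frontier = [(h_sld[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)   # expand the node with the lowest f = g + h
        if node == goal:
            return path, g
        for nbr, cost in roads[node].items():
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):   # keep only the cheapest known path to nbr
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h_sld[nbr], g2, nbr, path + [nbr]))
    return None, float("inf")

print(a_star("Arad", "Bucharest"))
# -> (['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'], 418)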
o In many optimization problems, the path to the goal is irrelevant; the goal state itself
is the solution
o For example, in the 8-queens problem, what matters is the final configuration of
queens, not the order in which they are added.
o In such cases, we can use local search algorithms. They operate using a single
current state (rather than multiple paths) and generally move only to neighbors of
that state.
o The important applications of these class of problems are (a) integrated-circuit design,
(b) Factory-floor layout, (c) job-shop scheduling, (d) automatic programming, (e)
telecommunications network optimization, (f) Vehicle routing, and (g) portfolio
management.
Key advantages of Local Search Algorithms
(1) They use very little memory – usually a constant amount; and
(2) they can often find reasonable solutions in large or infinite(continuous) state spaces
for which systematic algorithms are unsuitable.
OPTIMIZATION PROBLEMS
In addition to finding goals, local search algorithms are useful for solving pure
optimization problems, in which the aim is to find the best state according to an objective
function.
State Space Landscape
function HILL-CLIMBING(problem) returns a state that is a local maximum
  current ← MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor ← a highest-valued successor of current
    if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
    current ← neighbor
Figure 2.24 The hill-climbing search algorithm (steepest ascent version), which is
the most basic local search technique. At each step the current node is replaced
by the best neighbor; the neighbor with the highest VALUE. If the heuristic cost
estimate h is used, we could find the neighbor with the lowest h.
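A minimal Python sketch of the steepest-ascent procedure above follows; the one-dimensional objective function and the integer "one step left or right" neighbourhood are invented only for illustration.

# Steepest-ascent hill climbing on a toy one-dimensional objective.

def value(x):
    return -(x - 7) ** 2                    # a single smooth peak at x = 7

def neighbors(x):
    return [x - 1, x + 1]                   # move one step left or right

def hill_climbing(start):
    current = start
    while True:
        best = max(neighbors(current), key=value)     # highest-valued successor
        if value(best) <= value(current):             # no uphill move remains
            return current                            # a local maximum (here also the global one)
        current = best

print(hill_climbing(0))                     # -> 7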
Hill-climbing is sometimes called greedy local search because it grabs a good
neighbor state without thinking ahead about where to go next. Greedy algorithms often
perform quite well.
Problems with hill-climbing:
Local maxima: a local maximum is a peak that is higher than each of its neighboring
states, but lower than the global maximum. Hill-climbing algorithms that reach the
vicinity of a local maximum will be drawn upwards towards the peak, but will then be
stuck with nowhere else to go
Plateaux: A plateau is an area of the state space landscape where the evaluation
function is flat. It can be a flat local maximum, from which no uphill exit exists, or a
shoulder, from which it is possible to make progress.
Figure 2.25 Illustration of why ridges cause difficulties for hill-climbing. The grid
of states(dark circles) is superimposed on a ridge rising from left to right,
creating a sequence of local maxima that are not directly connected to each other.
From each local maximum, all the available options point downhill.
Hill-climbing variations
Stochastic hill-climbing
o Random selection among the uphill moves.
o The selection probability can vary with the steepness of the uphill move.
First-choice hill-climbing
o Implements stochastic hill-climbing by generating successors randomly until one is
found that is better than the current state.
Random-restart hill-climbing
o Tries to avoid getting stuck in local maxima.
SIMULATED ANNEALING SEARCH
A hill-climbing algorithm that never makes “downhill” moves towards states with
lower value (or higher cost) is guaranteed to be incomplete, because it can get stuck on a local
maximum. In contrast, a purely random walk – that is, moving to a successor chosen
uniformly at random from the set of successors – is complete, but extremely inefficient.
Game Theory
In Artificial Intelligence (AI), Game Theory plays a critical role in modeling and analyzing the
interactions between rational agents, where each agent makes decisions to maximize its own utility
while considering the decisions of others. This becomes especially important in multi-agent systems,
where multiple AI agents interact, compete, or cooperate with each other to achieve their respective
goals.
In AI, multi-agent systems involve multiple intelligent agents interacting with one another. Game
theory provides a formal framework to understand and design the strategies these agents should
follow to achieve optimal outcomes in a competitive or cooperative environment.
Cooperative vs. Non-Cooperative Games: In cooperative games, agents can form coalitions and
make binding agreements to work together. In non-cooperative games, agents act independently and
cannot make enforceable agreements.
Nash Equilibrium: One of the key concepts from game theory, Nash Equilibrium (NE), is widely
used in AI. It helps predict the outcome of strategic interactions between agents where no player
benefits from unilaterally changing their strategy, provided the other players’ strategies remain
unchanged. AI systems can use NE to decide on the optimal strategy in competitive environments.
2. Adversarial Search
In many AI applications, agents are involved in adversarial settings, such as games, where one
agent’s gain is another agent’s loss. Adversarial search is a domain in AI where game theory is
applied to decision-making.
Minimax Algorithm: This is one of the core concepts in game theory used in AI for two-player zero-
sum games (e.g., chess, checkers, tic-tac-toe). The minimax algorithm assumes that both players are
rational and will make optimal moves. It calculates the best possible move for a player by assuming
the opponent will also play optimally.
3. Game Theory and Reinforcement Learning
Game theory and reinforcement learning (RL) intersect in environments where multiple agents are
learning and interacting over time. In RL, agents learn strategies by interacting with the environment
and receiving rewards or punishments based on their actions.
Multi-Agent Reinforcement Learning (MARL): In MARL, multiple agents interact within a shared
environment. Game theory is used to model these interactions to understand how agents should
balance cooperation and competition.
o In a zero-sum game, one agent’s gain corresponds to the other’s loss (e.g., in competitive games like
poker). Game theory models this interaction to help agents learn competitive strategies.
o In non-zero-sum games, the agents' payoffs are not strictly inversely related, meaning that
cooperation might lead to a win-win scenario (e.g., collaborative tasks like team-based problem-
solving). In these cases, game theory helps agents learn how to cooperate while still maximizing their
individual payoffs.
Evolutionary Game Theory (EGT): This is used in AI to model the evolution of strategies in
populations of agents. EGT is particularly useful for studying how cooperative behavior might evolve
in populations where agents repeatedly interact. For example, it can be applied in evolutionary
robotics, where robots (agents) evolve over time to become better at a given task through repeated
interactions and learning.
4. Auction Theory in AI
Game theory is heavily used in designing auction systems, which are key in AI for tasks like
resource allocation, advertising, and pricing. AI agents can participate in auctions, where they bid for
resources based on their strategies, goals, and the behavior of other agents.
Vickrey Auctions: A type of sealed-bid auction where bidders submit bids without knowing the bids
of others. The highest bidder wins, but they pay the price submitted by the second-highest bidder. AI
agents use game theory to determine their optimal bidding strategies.
Combinatorial Auctions: In these auctions, agents bid on combinations of items rather than just
individual items. Game theory helps AI agents compute optimal bidding strategies that maximize
their utility, especially when the auction items are interdependent.
5. Mechanism Design
Mechanism design is a subfield of game theory that deals with creating rules or mechanisms to
achieve a desired outcome in a game or system. In AI, mechanism design is used to design systems
where agents interact with each other, such as in online markets, recommendation systems, or
distributed networks.
Incentive Compatibility: In mechanism design, it is important that agents are motivated to act in a
way that leads to a socially optimal outcome. For example, in an AI-based auction, the mechanism
should ensure that agents have an incentive to bid truthfully.
Social Choice: This is about designing systems that aggregate individual preferences (like voting
mechanisms or collective decision-making systems) to reach an optimal collective decision. Game
theory helps AI design algorithms that handle these processes fairly and efficiently.
In AI, security can be modeled as a game between attackers and defenders. Game theory is used to
design security protocols, where agents (e.g., users, attackers, or defenders) make strategic decisions
regarding encryption, data sharing, and other security measures.
Defensive Game Theory: AI systems use game theory to design optimal strategies for defense
mechanisms, where attackers try to compromise the system, and defenders attempt to protect it.
Cryptographic Protocols: Game theory is used to design secure communication protocols, where
agents must consider the actions of adversaries and ensure that their strategies prevent unauthorized
access to sensitive information.
Bargaining Theory: This is used to analyze negotiation scenarios where agents must agree on terms
or resources. AI agents can use bargaining strategies derived from game theory to reach mutually
beneficial agreements.
Coalition Formation: Game theory is used to model the formation of coalitions where agents
combine their resources or capabilities to achieve a common goal. For example, AI agents might need
to form coalitions in distributed computing tasks to improve efficiency or solve a problem
collaboratively.
Challenges of Applying Game Theory in AI
1. Complexity: In real-world AI applications, the number of agents, strategies, and possible outcomes
can be extremely large, leading to computational challenges. Finding optimal strategies in large
games may require significant computational resources or approximation methods.
2. Incomplete Information: In many AI scenarios, agents don’t have complete information about the
environment or other agents, making it difficult to model strategies and predict outcomes accurately.
This is addressed in Bayesian Game Theory, where agents reason about the types and actions of
others using probabilities.
3. Dynamic and Evolving Environments: In dynamic environments, where the state of the game
changes over time or agents evolve, traditional game theory may need to be extended to account for
time-varying payoffs, strategies, and uncertainty.
In Artificial Intelligence (AI), the concept of optimal decisions in games refers to the
strategies that an agent should adopt to maximize its utility or payoff in a given game. The optimal
decision is typically one that takes into account not only the agent’s own actions but also the
possible actions of other agents or players involved in the game.
Optimal decision-making in games can be studied through various frameworks, such as game
theory, adversarial search, and reinforcement learning. These methods help AI agents determine
the best course of action in competitive, cooperative, or mixed environments.
1. Players (Agents): The entities involved in the game, each making decisions to maximize their own
payoff or utility. In AI, players are typically modeled as intelligent agents.
2. Strategies: A strategy defines the set of actions or decisions that an agent can take. A pure strategy
specifies a single action, while a mixed strategy defines a probability distribution over possible
actions.
3. Payoff/Utility Function: The payoff (or utility) is the reward or value an agent gets based on the
combination of strategies chosen by all players. An agent aims to maximize its payoff function
through optimal decision-making.
4. Information: The level of information available to agents about the game environment or the other
agents’ actions. Games with complete information provide all players with the same knowledge
about the game, while games with incomplete information involve some uncertainty.
5. Equilibrium: In many games, the concept of Nash Equilibrium is used to define the optimal
strategy. At Nash Equilibrium, no player can improve their payoff by unilaterally changing their
strategy, assuming all other players' strategies are fixed. The equilibrium can be in pure or mixed
strategies.
In AI, adversarial search is used to model competitive games where one player's gain is another
player's loss, such as chess, checkers, or tic-tac-toe. The goal is for the AI agent to make the
optimal move while assuming that the opponent is also playing optimally.
Minimax Algorithm:
The minimax algorithm is the foundation for decision-making in two-player, zero-sum games. It
assumes that both players are rational and will choose optimal moves. Here's how it works:
In the minimax algorithm, the search tree is explored, and each leaf node represents a possible
outcome of the game. The algorithm evaluates the value of each leaf node using a heuristic
function, which estimates how good a game state is for the maximizing player. The algorithm then
propagates these values back up the tree to select the best move at the root node.
Alpha-Beta Pruning: This is an optimization technique for the minimax algorithm. It eliminates
branches of the game tree that will not be explored, significantly reducing the number of nodes that
need to be evaluated.
In zero-sum games, the optimal strategy for an agent is often a Nash Equilibrium strategy, where
no player can improve their outcome by changing their strategy unilaterally. In AI, this is useful for
games like poker, where optimal strategies are computed using game-theoretic approaches (e.g.,
Nash equilibrium strategies in imperfect information games).
In cooperative games, players can form coalitions and make binding agreements to achieve a
common goal. The payoff is distributed among players based on the agreement. The challenge in
cooperative games is to find an optimal strategy that maximizes the overall utility of the coalition
while ensuring fairness in how the payoffs are divided.
Shapley Value: The Shapley value is a concept from cooperative game theory that provides a fair
distribution of payoffs among players in a coalition. In AI, it can be used to allocate resources or
rewards in multi-agent systems.
Pareto Efficiency: A strategy is Pareto optimal if no player can be made better off without making
another player worse off. AI agents may use this concept to find socially optimal decisions in
collaborative environments.
In Reinforcement Learning, agents learn optimal decision-making strategies through trial and
error. The agent interacts with an environment, receiving rewards or punishments based on the
actions it takes, and uses this feedback to adjust its strategy over time.
States (S): The set of possible states the environment can be in.
Actions (A): The set of possible actions the agent can take.
Transition Function (T): Defines the probability of transitioning from one state to another given an
action.
Reward Function (R): Provides feedback (rewards or penalties) based on the state and action taken.
The goal in RL is for the agent to learn an optimal policy (a mapping from states to actions) that
maximizes the expected cumulative reward over time. The agent may use techniques like Q-
learning or Deep Q-Networks (DQN) to learn optimal policies.
Q-learning: A model-free RL algorithm that estimates the value of an action in a given state. Over
time, the agent updates its Q-values based on the rewards received, converging to an optimal policy.
Policy Gradient Methods: These are another class of algorithms used in RL, where the agent
directly learns the optimal policy (as opposed to estimating Q-values). This is particularly useful for
continuous action spaces.
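A tabular Q-learning sketch on a small corridor-style toy problem is given below. The environment and the values of the learning rate, discount factor, and exploration rate are assumptions chosen only to show the update rule Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)].

import random
from collections import defaultdict

ACTIONS = [-1, +1]                                    # move left or right
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = defaultdict(float)                                # Q[(state, action)], default 0

def step(state, action):
    nxt = max(0, min(4, state + action))              # 5-state corridor, goal at state 4
    return nxt, (1.0 if nxt == 4 else 0.0), nxt == 4  # next state, reward, done

def greedy(state):
    # Pick the action with the highest Q-value, breaking ties at random.
    return max(ACTIONS, key=lambda a: (Q[(state, a)], random.random()))

for episode in range(200):
    s = 0
    for t in range(100):                              # cap the episode length
        a = random.choice(ACTIONS) if random.random() < epsilon else greedy(s)
        s2, r, done = step(s, a)
        best_next = 0.0 if done else max(Q[(s2, act)] for act in ACTIONS)
        # Q-learning update toward the bootstrapped target r + gamma * max_a' Q(s', a')
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2
        if done:
            break

print([greedy(st) for st in range(4)])                # learned greedy policy, typically [1, 1, 1, 1]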
In Evolutionary Game Theory, the strategies of agents evolve over time through processes like
natural selection. In AI, this can be applied to situations where agents adapt and learn from their
interactions.
Genetic Algorithms: AI systems may use genetic algorithms to evolve solutions to optimization
problems. Agents in these systems "mate" and produce offspring with mutations based on their
success in the environment.
Evolutionary Strategies: AI agents can use evolutionary strategies to select optimal decision-
making processes by simulating evolutionary mechanisms like selection, mutation, and
reproduction.
In some games, players might use mixed strategies, where they randomize over their possible
actions instead of choosing one action deterministically. For example, in games like rock-paper-
scissors, players randomize their choices to prevent the opponent from exploiting predictable
patterns.
In mixed-strategy Nash equilibrium, each player chooses their actions based on specific
probabilities. The optimal strategy in this case is to choose actions in such a way that no player can
improve their expected payoff by changing their strategy.
1. Autonomous Vehicles: In multi-agent environments, autonomous vehicles can use game theory and
optimal decision-making algorithms to navigate safely and efficiently while interacting with other
vehicles and pedestrians.
2. Robotics: Multi-robot systems can use cooperative and competitive game-theoretic strategies to
accomplish tasks like exploration, formation control, or task allocation.
4. Economics and Auctions: AI agents use optimal decision-making to participate in auctions, setting
bidding strategies based on the behavior of other participants and maximizing their payoff.
5. Healthcare: AI can use game theory for optimal decision-making in resource allocation, diagnosis,
and treatment planning, particularly when multiple healthcare providers are involved.
ADVERSARIAL SEARCH
Competitive environments, in which the agent’s goals are in conflict, give rise to
adversarial search problems – often known as games.
Games
We will consider games with two players, whom we will call MAX and MIN. MAX
moves first, and then they take turns moving until the game is over. At the end of the game,
points are awarded to the winning player and penalties are given to the loser. A game can be
formally defined as a search problem with the following components:
o The initial state, which includes the board position and identifies the player to move.
o A successor function, which returns a list of (move, state) pairs, each indicating a
legal move and the resulting state.
o A terminal test, which describes when the game is over. States where the game has
ended are called terminal states.
o A utility function (also called an objective function or payoff function), which gives a
numeric value for the terminal states. In chess, the outcome is a win, loss, or draw,
with values +1, -1, or 0. The payoffs in backgammon range from +192 to -192.
Game Tree
The initial state and legal moves for each side define the game tree for the game. Figure 2.18
shows part of the game tree for tic-tac-toe (noughts and crosses). From the initial state, MAX has
nine possible moves. Play alternates between MAX's placing an X and MIN's placing an O until we
reach leaf nodes corresponding to terminal states such that one player has three in a row or all the
squares are filled. The number on each leaf node indicates the utility value of the terminal state from
the point of view of MAX; high values are assumed to be good for MAX and bad for MIN. It is
MAX's job to use the search tree (particularly the utility of terminal states) to determine the best
move.
Figure 2.41 A partial search tree for tic-tac-toe. The top node is the initial state, and
MAX moves first, placing an X in an empty square.
In a normal search problem, the optimal solution would be a sequence of moves leading
to a goal state – a terminal state that is a win. In a game, on the other hand, MIN has
something to say about it. MAX therefore must find a contingent strategy, which specifies
MAX's move in the initial state, then MAX's moves in the states resulting from every
possible response by MIN, then MAX's moves in the states resulting from every possible
response by MIN to those moves, and so on. An optimal strategy leads to outcomes at least as
good as any other strategy when one is playing an infallible opponent.
In AI, Heuristic Alpha-Beta Pruning is an optimization technique used to improve the efficiency of the
minimax algorithm in adversarial search, which is commonly applied to two-player zero-sum games
like chess, checkers, or tic-tac-toe.
The minimax algorithm explores the entire game tree to make decisions by considering every possible
move that each player can make. The main idea behind minimax is that both players act rationally and
will choose the optimal move in each situation. The algorithm assigns values to the terminal nodes of the
tree based on a heuristic evaluation function and propagates these values up the tree to determine the
optimal move.
However, minimax can be computationally expensive, as it requires evaluating all possible states in the
game tree. This is where alpha-beta pruning comes in to optimize the search and cut down the number
of nodes evaluated. When combined with heuristics, this becomes a highly efficient search method for
decision-making in AI.
1. Alpha-Beta Pruning:
Alpha-beta pruning is an enhancement of the minimax algorithm that reduces the number of nodes
that need to be evaluated by eliminating branches that will not affect the final decision. This is done
by maintaining two values during the search:
Alpha (α): The best value that the maximizing player can guarantee at any point along the path. It
represents the best already explored option for the maximizing player.
Beta (β): The best value that the minimizing player can guarantee at any point along the path. It
represents the best already explored option for the minimizing player.
Pruning Decision:
While performing the search, if at any point we find that the current branch (subtree) cannot lead to
a better outcome than what is already known, we can prune (ignore) that branch. The key pruning
condition is:
Prune when α ≥ β: no further exploration of that branch is needed, because the value the maximizing
player can already guarantee (α) is at least as good as the best value the minimizing player would
allow along this path (β), so a rational opponent will never let play reach it.
2. Heuristic Evaluation Function:
Since it is often impractical to explore every possible game state (especially in large games like
chess), we use a heuristic evaluation function to estimate the desirability of a given game state. The
heuristic function provides an estimate of how good a particular state is for a player, helping guide
the search.
For example, in chess, a heuristic might assign values based on the material count (the number of
pieces each player has), the control of the board, or the positioning of the pieces.
The heuristic function doesn't guarantee an optimal solution, but it provides a way to evaluate
intermediate game states more quickly, enabling more efficient pruning in the search tree.
When combined, alpha-beta pruning and heuristics drastically reduce the number of nodes explored
compared to a brute-force minimax search, making it feasible to perform deeper searches within a
reasonable amount of time.
1. Maximizing Player's Move: The algorithm starts by assuming that the maximizing player is trying
to maximize their payoff (for example, trying to make the best move in a game like chess).
2. Minimizing Player's Move: After considering the maximizing player’s possible moves, the
algorithm then simulates the minimizing player’s move, which aims to minimize the maximizing
player's payoff (for example, trying to block the opponent’s move).
3. Alpha and Beta Values: The algorithm keeps track of the best possible values that both the
maximizing and minimizing players can achieve at each point in the search.
4. Heuristic Evaluation: At the leaf nodes of the game tree (where the game has ended or a cutoff
depth has been reached), the heuristic evaluation function is used to estimate the value of that node.
5. Pruning: If a node’s value is found to be worse than the previously explored paths (based on α and β
values), the branch is pruned.
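The steps above can be sketched as a depth-limited minimax with alpha-beta pruning. The game interface (the successor, terminal-test, and heuristic functions) is hypothetical and would be supplied by the particular game; here a tiny nested-list game tree stands in for it.

import math

def alphabeta(state, depth, alpha, beta, maximizing, successors, is_terminal, heuristic):
    """Depth-limited minimax with alpha-beta pruning and a heuristic at the cutoff."""
    if depth == 0 or is_terminal(state):
        return heuristic(state)                      # estimated value of the state for MAX
    if maximizing:
        value = -math.inf
        for child in successors(state):
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False,
                                         successors, is_terminal, heuristic))
            alpha = max(alpha, value)
            if alpha >= beta:                        # MIN will never allow this branch
                break
        return value
    value = math.inf
    for child in successors(state):
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True,
                                     successors, is_terminal, heuristic))
        beta = min(beta, value)
        if alpha >= beta:                            # MAX already has something better elsewhere
            break
    return value

# Toy game tree encoded as nested lists; leaves are heuristic values (3, 5, 2, 4 as in the example).
tree = [[3, 5], [2, 4]]
succ = lambda s: s if isinstance(s, list) else []
term = lambda s: not isinstance(s, list)
h = lambda s: s
print(alphabeta(tree, 3, -math.inf, math.inf, True, succ, term, h))   # -> 3 (the 4 is pruned)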
Example: Alpha-Beta Pruning with Heuristics
Let’s look at a simplified version of alpha-beta pruning with heuristics in a 3-level game tree (where
the search depth is 3):
1. Start at the root, the maximizing player's level, with α = -∞ and β = ∞.
2. Move down to the minimizing player's level (Level 1). At this level, the minimizing player will try
to minimize the value.
3. Move down to the maximizing player's level (Level 2) and evaluate the terminal nodes. Suppose the
heuristic evaluation for each terminal node is computed; for example, the heuristic values are 3, 5, 2,
and 4.
4. Propagate values upward. As the search proceeds, the algorithm keeps track of the best possible
moves for each player. If a move is found that is worse than the previously explored ones, the
algorithm prunes that branch. For instance, if a minimizing player has a node with a value of 2, but
earlier in the search the maximizing player already had a value of 3 (which is better), then that branch
can be pruned.
5. Apply pruning: if the heuristic evaluation of a node suggests that continuing the search will not lead
to a better outcome for the current player (based on the values of α and β), the algorithm prunes the
subtree, saving computation time.
The effectiveness of alpha-beta pruning depends on the order in which the game tree is explored.
The earlier you encounter good moves (high-value nodes for the maximizing player or low-value
nodes for the minimizing player), the more branches you can prune, resulting in significant
computational savings.
Best-case scenario: If the algorithm is able to prune most of the branches, alpha-beta pruning can
reduce the time complexity of the minimax algorithm from O(b^d) (where b is the branching factor
and d is the depth of the tree) to O(b^(d/2)), effectively allowing the search to look roughly twice as
deep in the same amount of time.
Heuristics improve this even further by guiding the search to the most promising parts of the tree,
further optimizing the pruning process.
MONTE CARLO TREE SEARCH (MCTS) ALGORITHM:
In MCTS, nodes are the building blocks of the search tree. These nodes are formed based on the
outcome of a number of simulations. The process of Monte Carlo Tree Search can be broken down
into four distinct steps, viz., selection, expansion, simulation and backpropagation. Each of these
steps is explained in details below:
Selection: In this process, the MCTS algorithm traverses the current tree from the root node using
a specific strategy. The strategy uses an evaluation function to optimally select nodes with the
highest estimated value. MCTS uses the Upper Confidence Bound (UCB) formula applied to trees
as the strategy in the selection process to traverse the tree. It balances the exploration-exploitation
trade-off. During tree traversal, a node is selected based on the value returned by the formula below,
typically the Upper Confidence Bound applied to trees (UCT):
S_i = x_i + C * sqrt( ln(t) / n_i )
where:
S_i = value of a node i
x_i = empirical mean (average reward) of node i
C = an exploration constant
t = total number of simulations (visits of the parent node)
n_i = number of simulations (visits) of node i
When traversing a tree during the selection process, the child node that returns the greatest value
from the above equation will be one that will get selected. During traversal, once a child node is
found which is also a leaf node, the MCTS jumps into the expansion step.
Expansion: In this process, a new child node is added to the tree to that node which was optimally
reached during the selection process.
Simulation: In this process, a simulation is performed by choosing moves or strategies until a
result or predefined state is achieved.
Backpropagation: After determining the value of the newly added node, the remaining tree must
be updated. So, the backpropagation process is performed, where it backpropagates from the new
node to the root node. During this process, the number of simulations stored in each node is
incremented. Also, if the new node’s simulation results in a win, then the number of wins is also
incremented.
The above steps can be visually understood by the diagram given below:
These types of algorithms are particularly useful in turn based games where there is no element of
chance in the game mechanics, such as Tic Tac Toe, Connect 4, Checkers, Chess, Go, etc. This
has recently been used by Artificial Intelligence Programs like AlphaGo, to play against the
world’s top Go players. But, its application is not limited to games only. It can be used in any
situation which is described by state-action pairs and simulations used to forecast outcomes.
Pseudo-code for Monte Carlo Tree Search: the main loop repeatedly selects a promising leaf,
expands it, runs a simulation from the new node, and backpropagates the result; after the iteration
budget is exhausted it returns best_child(root).
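Below is a compact, self-contained sketch of the four MCTS steps in Python. The single-player "count to 10" toy game, its move functions, and the UCT constant are illustrative assumptions; a real game would supply its own legal_moves, apply_move, is_terminal, and reward functions (and, for two-player games, alternate the perspective during backpropagation).

import math, random

def legal_moves(state):   return [1, 2]               # add 1 or 2 to the running total
def apply_move(state, m): return state + m
def is_terminal(state):   return state >= 10
def reward(state):        return 1.0 if state == 10 else 0.0   # win only by landing on 10 exactly

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.untried = [] if is_terminal(state) else list(legal_moves(state))
        self.visits, self.wins = 0, 0.0

    def ucb(self, c=math.sqrt(2)):
        # Upper Confidence Bound: exploitation term + exploration term.
        return self.wins / self.visits + c * math.sqrt(math.log(self.parent.visits) / self.visits)

def select(node):
    # Selection: descend while the node is fully expanded and has children.
    while not node.untried and node.children:
        node = max(node.children, key=Node.ucb)
    return node

def expand(node):
    # Expansion: add one child for an untried move, unless the node is terminal.
    if node.untried:
        move = node.untried.pop()
        child = Node(apply_move(node.state, move), parent=node, move=move)
        node.children.append(child)
        return child
    return node

def simulate(state):
    # Simulation (rollout): play random moves until a terminal state is reached.
    while not is_terminal(state):
        state = apply_move(state, random.choice(legal_moves(state)))
    return reward(state)

def backpropagate(node, outcome):
    # Backpropagation: update visit counts and accumulated reward up to the root.
    while node is not None:
        node.visits += 1
        node.wins += outcome
        node = node.parent

def mcts(root_state, iterations=500):
    root = Node(root_state)
    for _ in range(iterations):
        leaf = expand(select(root))
        backpropagate(leaf, simulate(leaf.state))
    return max(root.children, key=lambda ch: ch.visits).move   # i.e. best_child(root)

print(mcts(8))   # from state 8, adding 2 reaches 10 exactly, so this usually prints 2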
As we can see, the MCTS algorithm reduces to a small set of functions that can be reused for any
choice of game or for any optimization strategy.
Issues in Monte Carlo Tree Search:
1. As the tree growth becomes rapid after a few iterations, it requires a huge amount of memory.
2. There is a bit of a reliability issue with Monte Carlo Tree Search. In certain scenarios, there might
be a single branch or path, that might lead to loss against the opposition when implemented for
those turn-based games. This is mainly due to the vast amount of combinations and each of the
nodes might not be visited enough number of times to understand its result or outcome in the long
run.
3. MCTS algorithm needs a huge number of iterations to be able to effectively decide the most
efficient path. So, there is a bit of a speed issue there.
The four MCTS steps in more detail:
1. Selection
The goal during selection is to choose nodes that either have high win rates (exploitation) or are
still underexplored (exploration).
2. Expansion
Once a leaf node is reached, if the node is not terminal (i.e., the game isn't over), the algorithm
expands the tree by generating one or more child nodes representing possible future game states.
For each move, a new node is created, and it represents a potential future state of the game.
3. Simulation
A simulation (also called playout or rollout) is run from the newly expanded node to simulate the
outcome of the game. The simulation is typically done by choosing actions randomly (or using a
simple heuristic) until a terminal state (win, loss, or draw) is reached.
The outcome of the simulation provides a reward (e.g., +1 for a win, 0 for a draw, and -1 for a
loss), which will be used for the backpropagation step.
4. Backpropagation
After a simulation is completed, the result (reward) is backpropagated up the tree. This means
updating the values (e.g., win rates) of the nodes visited during the selection process.
For each node along the path from the leaf node to the root node, the reward of the simulation is
added to that node's total reward, and the visit count is incremented.
Iterative Process
These four steps are repeated for a large number of iterations, with each iteration selecting,
expanding, simulating, and backpropagating. After enough iterations, the root node will contain an
approximation of the best move based on the simulations. The move corresponding to the child
node with the highest visit count or win rate is then chosen as the optimal move.
Advantages of MCTS
Scalability: MCTS can handle large, complex state spaces because it doesn't require exploring
every possible move in advance. It focuses computational effort on promising areas of the search
space.
Adaptability: MCTS works well in both deterministic and non-deterministic environments and
can adapt to changes in the game state.
No Need for a Heuristic: Unlike traditional methods like minimax, MCTS doesn’t require an
evaluation function to estimate the value of non-terminal states. Instead, it uses random
simulations, making it suitable for games like Go, where a good heuristic is hard to define.
Weaknesses of MCTS
Computationally Expensive: While MCTS is scalable, it can still be computationally intensive,
especially for games with very large state spaces or where each simulation requires significant
time.
Longer Planning Horizons: The performance of MCTS improves with more simulations. In real-
time applications, there may be a trade-off between the number of simulations and time available
for decision-making.
Applications of MCTS in AI
1. Go: One of the most famous successes of MCTS was in the game of Go, where it was used by AI
systems such as AlphaGo developed by DeepMind. MCTS provided a way to effectively search
through the enormous state space of Go without requiring a handcrafted evaluation function.
2. Games with Large State Spaces: MCTS is widely used in strategy games, where the branching
factor of the game tree is enormous. Games like chess, Shogi, and Real-Time Strategy (RTS)
games benefit from the ability of MCTS to focus on high-reward regions of the state space.
3. Robotics and Planning: MCTS can be used for planning in robotics and motion planning
problems where the number of possible actions grows exponentially with time or complexity.
4. Real-Time Decision Making: In dynamic, real-time scenarios, MCTS is applied in systems that
need to make decisions on the fly, such as autonomous vehicles, robotic systems, and multi-
agent systems.
A partially observable game can be formally defined as an extension of a stochastic game but with a
partially observable state space. The key elements that distinguish it from fully observable games
include:
1. Partially Observable States: Instead of the agents having access to the full state of the game, they
only have access to an observation o_i that provides partial information about the true state. This
observation is typically noisy or incomplete.
2. Belief State: In a partially observable environment, agents maintain a belief state (a probability
distribution) over the possible true states. The belief state represents the agent’s internal estimate of
what the true state could be, based on the information it has gathered through observations and
actions.
3. Hidden Information: This is the information that is not observable by the agents, but could be critical
for making optimal decisions. In many games, opponents' actions or parts of the environment may be
hidden from the agent.
4. Partially Observable Markov Decision Processes (POMDPs): A common formalism for partially
observable games is the POMDP model. POMDPs generalize Markov Decision Processes (MDPs)
by incorporating the partial observability of the environment. In POMDPs, an agent’s observation
provides indirect information about the true state of the system, and it must decide what actions to
take based on this belief state.
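The belief-state update in a POMDP follows the standard rule b'(s') = α · O(o | s', a) · Σ_s T(s' | s, a) · b(s), where α normalizes the result to a probability distribution. The sketch below illustrates this update in Python; the dictionary layouts T[s][a][s'] and O[s'][a][o] and the function name are assumptions for illustration, not part of any particular library.

def update_belief(belief, action, observation, states, T, O):
    """Return the new belief b'(s') proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    new_belief = {}
    for s_next in states:
        # Probability of reaching s_next, marginalizing over the previous state.
        prob_reach = sum(T[s][action][s_next] * belief[s] for s in states)
        new_belief[s_next] = O[s_next][action][observation] * prob_reach
    norm = sum(new_belief.values())
    # Normalize so the belief remains a probability distribution.
    return {s: p / norm for s, p in new_belief.items()} if norm > 0 else belief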
CONSTRAINT SATISFACTION PROBLEMS
A constraint satisfaction problem (CSP) is defined by a set of variables X1, X2, ..., Xn and a set of
constraints C1, C2, ..., Cm. Each constraint Ci involves some subset of the variables and specifies the
allowable combinations of values for that subset.
A state of the problem is defined by an assignment of values to some or all of the variables, {Xi = vi,
Xj = vj, ...}. An assignment that does not violate any constraints is called a consistent or legal
assignment. A complete assignment is one in which every variable is mentioned, and a solution to a
CSP is a complete assignment that satisfies all the constraints.
Some CSPs also require a solution that maximizes an objective function.
Example for Constraint Satisfaction Problem
Figure 2.29 shows the map of Australia with each of its states and territories. We are given
the task of coloring each region either red, green, or blue in such a way that no two neighboring
regions have the same color. To formulate this as a CSP, we define the variables to be the
regions: WA, NT, Q, NSW, V, SA, and T.
The domain of each variable is the set {red, green, blue}. The constraints require neighboring
regions to have distinct colors; for example, the allowable combinations for WA and NT are the
pairs
{(red,green),(red,blue),(green,red),(green,blue),(blue,red),(blue,green)}.
The constraint can also be represented more succinctly as the inequality WA ≠ NT,
provided the constraint satisfaction algorithm has some way to evaluate such expressions. There
are many possible solutions, such as
{WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = red}.
It is helpful to visualize a CSP as a constraint graph, as shown in Figure 2.29. The nodes of
the graph correspond to the variables of the problem and the arcs correspond to the constraints.
Figure 2.29 Principal states and territories of Australia. Coloring this map can be
viewed as a constraint satisfaction problem. The goal is to assign colors to each
region so that no neighboring regions have the same color.
CSP can be viewed as a standard search problem as follows:
Initial state: the empty assignment { }, in which all variables are unassigned.
Successor function: a value can be assigned to any unassigned variable, provided
that it does not conflict with previously assigned variables.
Goal test: the current assignment is complete.
Path cost: a constant cost (e.g., 1) for every step.
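To make this formulation concrete, here is a small sketch of the Australia map-coloring CSP in Python. The variable names, domains, and neighbor relations come from the example above; the helper function consistent is an illustrative name, not from any library.

variables = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbors = {
    "WA": ["NT", "SA"],        "NT": ["WA", "SA", "Q"],
    "Q":  ["NT", "SA", "NSW"], "NSW": ["Q", "SA", "V"],
    "V":  ["SA", "NSW"],       "SA": ["WA", "NT", "Q", "NSW", "V"],
    "T":  [],                  # Tasmania has no neighboring regions
}

def consistent(var, value, assignment):
    """A value is legal if no already-assigned neighbor has the same color."""
    return all(assignment.get(n) != value for n in neighbors[var])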
VARIETIES OF CSPS
(i) Discrete variables
Finite domains
The simplest kind of CSP involves variables that are discrete and have finite
domains. Map-coloring problems are of this kind. The 8-queens problem can also be viewed
as a finite-domain CSP, where the variables Q1, Q2, ..., Q8 are the positions of each queen in
columns 1, ..., 8 and each variable has the domain {1,2,3,4,5,6,7,8}. If the maximum domain
size of any variable in a CSP is d, then the number of possible complete assignments is
O(d^n), that is, exponential in the number n of variables. Finite-domain CSPs include Boolean
CSPs, whose variables can be either true or false.
Infinite domains
Discrete variables can also have infinite domains, for example, the set of integers or
the set of strings. With infinite domains, it is no longer possible to describe constraints by
enumerating all allowed combinations of values. Instead, a constraint language of algebraic
inequalities must be used, such as Startjob1 + 5 <= Startjob3.
Varieties of constraints
Constraints may be unary (restricting the value of a single variable), binary (relating pairs of
variables, as in the map-coloring example), or higher-order (involving three or more variables).
BACKTRACKING SEARCH FOR CSPs
The term backtracking search is used for a depth-first search that chooses values for
one variable at a time and backtracks when a variable has no legal values left to assign. The
algorithm is shown in Figure 2.34, and a small code sketch is given after the figure captions below.
Figure 2.34 A simple backtracking algorithm for constraint satisfaction problems. The
algorithm is modeled on recursive depth-first search.
Figure 2.35 Part of the search tree generated by simple backtracking for the map-
coloring problem.
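As a complement to the figures, here is a minimal recursive backtracking sketch in Python. It reuses the variables, domains, and consistent definitions from the earlier map-coloring sketch and is an illustration of the idea rather than the exact algorithm of Figure 2.34.

def backtracking_search(assignment=None):
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):      # goal test: complete assignment
        return assignment
    var = next(v for v in variables if v not in assignment)   # pick an unassigned variable
    for value in domains[var]:
        if consistent(var, value, assignment):
            assignment[var] = value
            result = backtracking_search(assignment)
            if result is not None:
                return result
            del assignment[var]                # backtrack: undo and try the next value
    return None                                # no legal value left for this variable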
FORWARD CHECKING
One way to make better use of constraints during search is called forward checking.
Whenever a variable X is assigned, the forward checking process looks at each unassigned
variable Y that is connected to X by a constraint and deletes from Y ’s domain any value that
is inconsistent with the value chosen for X. Figure 2.36 shows the progress of a map-coloring
search with forward checking; a small code sketch is given after the figure caption below.
Figure 2.36 The progress of a map-coloring search with forward checking. WA = red
is assigned first; then forward checking deletes red from the domains of the
neighboring variables NT and SA. After Q = green, green is deleted from the domains
of NT, SA, and NSW. After V = blue, blue is deleted from the domains of NSW and
SA, leaving SA with no legal values.
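The pruning step just described can be sketched as follows, reusing the neighbors dictionary from the earlier map-coloring sketch. The function name forward_check and its interface are illustrative; domains are copied so that pruning can be undone when the search backtracks.

import copy

def forward_check(var, value, domains, assignment):
    pruned = copy.deepcopy(domains)
    pruned[var] = [value]
    for n in neighbors[var]:
        if n not in assignment and value in pruned[n]:
            pruned[n].remove(value)
            if not pruned[n]:          # a neighbor has no legal values left
                return None            # signal failure, as with SA in Figure 2.36
    return pruned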
CONSTRAINT PROPAGATION
Although forward checking detects many inconsistencies, it does not detect all of them.
Constraint propagation is the general term for propagating the implications of a constraint on one
variable onto other variables.
The main techniques for constraint propagation include:
Arc Consistency
k-Consistency
Independent Subproblems
A small sketch of arc consistency (an AC-3-style algorithm) is given below.
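Arc consistency is commonly enforced with an AC-3-style algorithm. The sketch below handles the binary not-equal constraints of the map-coloring example, reusing the variables and neighbors definitions from the earlier sketch; the function name and interface are illustrative.

from collections import deque

def ac3(domains):
    queue = deque((x, y) for x in variables for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        # Remove values of x that have no supporting value in y's domain.
        removed = [vx for vx in domains[x] if not any(vx != vy for vy in domains[y])]
        if removed:
            domains[x] = [vx for vx in domains[x] if vx not in removed]
            if not domains[x]:
                return False               # inconsistency detected
            for z in neighbors[x]:
                if z != y:
                    queue.append((z, x))   # re-check arcs pointing at x
    return True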
The original goal of artificial intelligence (AI) is to build machines capable of carrying out tasks that
usually call for human intelligence. Among the core functions of AI is real-world problem
solving. Understanding "problems," "problem spaces," and "search" is therefore fundamental to
understanding how AI systems handle and resolve challenging tasks.
Problems in AI
A problem is a particular task or challenge that calls for decision-making or solution-finding.
In artificial intelligence, a problem is simply a task that needs to be solved; such tasks can
be anything from straightforward math problems to intricate decision-making situations.
Artificial intelligence encompasses a wide variety of tasks and challenges, from basic arithmetic
operations to sophisticated ones like image recognition, natural language processing, game
playing, and optimization. Every problem has a defined set of initial states, a goal state that
must be attained, and a set of possible actions or moves.