Project Report
Chapter 1
INTRODUCTION
In the modern era, human-computer interaction (HCI) has evolved significantly, extending
beyond traditional input devices like keyboards and mice to more natural and intuitive methods.
Gesture recognition technology stands at the forefront of this transformation, offering a
seamless and immersive way to interact with electronic systems. This project report presents
the development of a "Gesture Controlled Audio System," an innovative approach that
leverages gesture recognition to manipulate audio functionalities without physical contact. The
traditional reliance on physical input devices such as keyboards, mice, and touchscreens is
giving way to more sophisticated and natural methods of control, such as voice commands and
gesture recognition. The proposed system leverages hand gestures to manage audio playback
and control, and aims to provide an intuitive,
hygienic, and accessible interface, enhancing the overall user experience and opening new
possibilities for human-computer interaction.
Gesture recognition technology has gained significant traction in recent years, driven
by advancements in computer vision, machine learning, and sensor technology. The ability to
control devices through simple hand movements is not only futuristic but also practical in
various applications, from gaming and virtual reality to automotive interfaces and home
automation. The motivation behind developing a gesture-controlled audio system stems from
the desire to create a more natural and engaging way to interact with audio devices. By utilizing
gestures, users can control playback, adjust volume, and navigate tracks without touching any
physical buttons or screens. This touchless interaction is particularly advantageous in
environments where hygiene is a concern, such as public spaces or during the COVID-19
pandemic, and for individuals with physical disabilities who may find traditional controls
challenging to use. Gesture-controlled audio systems revolutionize the way we interact with
audio devices by enabling hands-free control through intuitive gestures. Leveraging cutting-
edge gesture recognition technology, these systems interpret users' hand movements or body
gestures in real-time, translating them into commands for playback, volume adjustment, track
navigation, and more. By eliminating the need for physical buttons or touchscreens, gesture-
controlled audio systems offer enhanced convenience, accessibility, and safety in various
scenarios, including driving, exercising, or multitasking. Moreover, they represent a
convergence of human-computer interaction and artificial intelligence, providing more natural
and intuitive interfaces for users. This technology holds promise across diverse industries, from
automotive and consumer electronics to healthcare and entertainment, as it continues to evolve
and integrate with other emerging technologies. In this report, we explore the principles,
implementation, applications, and future prospects of gesture-controlled audio systems,
unveiling their potential to transform the audio experience and redefine human-machine
interaction.
Chapter 2
LITERATURE SURVEY
M. Alagu Sundaram et al. [1] proposed a model that provides an in-depth analysis of
various hand gesture recognition techniques, including vision-based methods using cameras,
sensor-based approaches using wearable devices, and hybrid methods combining multiple
sensors. It discusses their applications across diverse fields such as human-computer
interaction, virtual reality, and robotics.
J. Li et al. [2] proposed a model that presents a real-time hand gesture recognition
system based on convolutional neural networks (CNNs). The system achieves high accuracy in
recognizing dynamic hand gestures captured by depth sensors, demonstrating its potential for
applications in human-computer interaction.
K. Saroha et al. [3] proposed a model that explores the use of machine learning
techniques, including deep learning algorithms, for gesture recognition applications. It
discusses various datasets, algorithms, and evaluation metrics commonly used in gesture
recognition research, highlighting recent advancements and challenges in the field.
S. Velázquez et al. [4] proposed a model that focuses on gesture recognition systems
designed for ambient assisted living (AAL) environments to support elderly and disabled
individuals. It discusses the importance of non-intrusive interaction techniques and presents
state-of-the-art approaches for gesture recognition in AAL scenarios.
A. Gupta et al. [5] proposed a model that provides an overview of gesture-controlled
music playback systems, including both commercial products and research prototypes. It
discusses various technologies and methods used for gesture recognition and audio control,
highlighting their applications in entertainment and multimedia environments.
Georgi et al. [6] proposed a model in which the combination of IMU and EMG sensors
captures both motion data and muscle activity, providing a robust dataset for gesture
classification. The IMU sensors, comprising accelerometers and gyroscopes, track the
orientation and movement of the hand. Concurrently, the EMG sensors monitor muscle
contractions, offering insights into the underlying muscle activities driving these movements.
This dual-sensing approach addresses the limitations of using a single type of sensor, such as
IMU's susceptibility to drift and EMG's sensitivity to noise.
Ariyanto et al. [7] proposed a model that focuses on the use of EMG sensors to capture
the electrical activity produced by muscle contractions during finger movements. The collected
EMG data serve as input to an ANN, which is trained to recognize specific movement patterns.
This method addresses common challenges in EMG-based recognition systems, such as signal
variability and noise, by utilizing the ANN's ability to learn complex patterns and generalize
across different data sets.
The authors conducted extensive experiments to evaluate the performance of their
proposed system. They report high accuracy rates in recognizing various finger movement
patterns, demonstrating the effectiveness of using ANNs for EMG signal classification. This
approach shows promise for applications in prosthetics, where accurate interpretation of muscle
signals is critical for the control of artificial limbs.
Jorgensen et al. [8] proposed a model that employs neural network algorithms to process
and interpret the EMG and EPG signals. Their experiments demonstrate that it is possible to
achieve accurate speech recognition using these sub-auditory signals. The neural networks are
trained to identify specific speech patterns, allowing the system to recognize words and phrases
from the muscle and contact data alone.
This research has significant implications for the development of silent communication
systems and assistive technologies. For instance, it can be used in military or covert operations
where silent communication is crucial, or in assistive devices for individuals who cannot speak
audibly. The ability to recognize speech without sound opens new possibilities for human-
computer interaction and accessibility.
Zhang et al. [9] proposed a model that develops an algorithm which processes and fuses
data from the accelerometer and EMG sensors. Their system includes preprocessing steps to
filter noise and normalize the sensor signals, followed by feature extraction to capture the
essential characteristics of each gesture. The extracted features are then fed into a machine
learning classifier to identify and categorize the gestures.
Chapter 3
PROPOSED SYSTEM
The primary focus of this project is to design and implement a gesture-controlled audio system
capable of generating distinct sounds in response to user gestures. The system comprises ten
flex sensors, an Arduino microcontroller, an MP3-TF-16P MP3 SD Card Module, a PAM8403
audio amplifier, an electret microphone (MIC), and a 6W speaker. Each flex sensor is connected
to the Arduino, and when a user bends a sensor, a specific pre-recorded sound stored on the
MP3 module is played through the speaker. In addition to gesture-based control, the system
incorporates a microphone to detect loud sounds. When the microphone detects a sound above
a predefined threshold, it triggers a specific audio file to play through the speaker. The audio
amplifier enhances the audio output, ensuring a clear and audible sound experience for the user.
The MP3-TF-16P module serves as the central audio source for the system, storing a variety of
audio files corresponding to different gestures and loud sounds. The Arduino microcontroller
orchestrates the playback of these audio files based on the sensor inputs and microphone
readings, creating a dynamic and interactive audio environment.
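To make the control flow concrete, the decision logic described above can be sketched as follows (shown here in Python for readability; the actual firmware runs on the Arduino, and the threshold values and track numbers below are illustrative assumptions rather than the final implementation):

FLEX_BEND_THRESHOLD = 600   # assumed ADC value above which a flex sensor counts as "bent"
MIC_LOUD_THRESHOLD = 800    # assumed ADC value above which the microphone counts as "loud"

def select_track(flex_readings, mic_reading):
    """Return the MP3 track number to play (1-11), or None if nothing is triggered."""
    for index, value in enumerate(flex_readings):   # one pre-recorded track per flex sensor
        if value > FLEX_BEND_THRESHOLD:
            return index + 1                        # tracks 1-10 map to the ten sensors
    if mic_reading > MIC_LOUD_THRESHOLD:
        return 11                                   # dedicated response to a loud sound
    return None

# Example: the third sensor is bent, so track 3 would be requested from the MP3 module.
print(select_track([120, 150, 710, 130, 140, 125, 118, 122, 131, 127], 300))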
Objectives to be fulfilled:
• To design and integrate a robust circuitry system that combines flex sensors, an MP3
module, an audio amplifier, and a microphone with an Arduino microcontroller.
• To develop a comprehensive software program that governs sensor inputs, audio playback,
and volume control, ensuring seamless integration and functionality.
• To test, evaluate, and optimize the system to ensure reliability, accuracy, and user-
friendliness, validating its performance across diverse scenarios and applications.
Chapter 4
A flex sensor is a type of sensor that acts as a variable resistor, changing its resistance based on
the amount of bend or flex applied to it. Typically made from a flexible substrate coated with a
conductive material, the sensor exhibits an increase in resistance as it bends. This change in
resistance can be measured and translated into a quantifiable signal, allowing the detection and
measurement of bending angles or motion.
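In practice, this change in resistance is usually read through a voltage divider feeding an analog-to-digital converter (ADC). The short sketch below shows one way to convert a raw ADC reading back into the sensor's resistance; the supply voltage, fixed resistor value and 10-bit ADC range are assumptions for illustration only:

VCC = 5.0          # assumed supply voltage (volts)
R_FIXED = 47_000   # assumed fixed divider resistor (ohms)
ADC_MAX = 1023     # full-scale value of an assumed 10-bit ADC

def flex_resistance(adc_value):
    """Estimate the flex sensor's resistance (ohms) from a raw ADC reading, assuming
    the sensor sits between VCC and the ADC node, with R_FIXED from the node to ground."""
    v_out = VCC * adc_value / ADC_MAX        # voltage at the divider midpoint
    if v_out == 0:
        return float("inf")                  # treat a zero reading as an open circuit
    return R_FIXED * (VCC - v_out) / v_out   # rearranged voltage-divider equation

# Bending the sensor raises its resistance, which lowers the ADC reading.
print(round(flex_resistance(500)))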
Flex sensors are widely used in various applications due to their versatility and ease of
integration. In wearable technology, flex sensors are often embedded in gloves, sleeves, or other
garments to capture precise movements of the body. For instance, in smart gloves, they can
track the bending of fingers, enabling applications in virtual reality (VR) and augmented reality
(AR) where hand gestures control virtual objects or interfaces. This application is particularly
valuable in gaming, providing a more immersive and interactive experience by allowing users
to interact naturally with virtual environments.
In the field of robotics, flex sensors are crucial for enhancing the dexterity and responsiveness
of robotic limbs. By providing real-time feedback on the position and movement of joints, these
sensors enable robots to perform tasks with greater precision and adaptability. This capability
is essential in industries where robots handle delicate or complex operations, such as in medical
surgery or automated manufacturing.
Flex sensors also play a significant role in assistive technology for individuals with disabilities.
They can be used to develop gesture-controlled devices, such as prosthetics or communication
aids, that respond to the user's movements, thereby improving accessibility and quality of life.
Arduino microcontrollers are widely used in hobbyist and maker projects such as home
automation systems, wearable devices, and interactive art installations. In professional contexts,
Arduino is utilized for rapid prototyping, allowing engineers and designers to test and iterate
on their concepts quickly and cost-effectively.
Furthermore, the platform's open-source nature encourages innovation and customization.
Users can modify existing designs or create their own hardware extensions, fostering a culture
of sharing and collaboration. Overall, Arduino microcontrollers have revolutionized the way
people engage with electronics, democratizing technology and empowering individuals to turn
their ideas into reality.
This setup is particularly beneficial in applications where hygiene is paramount or where users
have physical disabilities that make traditional controls challenging. The combination of
gesture recognition technology with the MP3-TF-16P module results in an innovative audio
system that enhances accessibility, convenience, and user interaction.
The PAM8403 is a highly efficient Class-D amplifier, which helps extend battery life in
portable applications. Additionally, the amplifier's differential input architecture
enhances noise immunity, providing clear and high-quality audio output.
In summary, the PAM8403 is an ideal solution for compact and energy-efficient audio
systems, offering robust performance, minimal external components, and integrated protection
features.
An electret microphone (MIC) is a type of condenser microphone widely used in various audio
applications due to its small size, low cost, and good performance. The core of an electret
microphone is a diaphragm coated with an electret material that retains a permanent electric
charge. This diaphragm is placed close to a metal backplate, forming a capacitor.
When sound waves strike the diaphragm, it vibrates, causing variations in the capacitance
between the diaphragm and the backplate. These variations result in changes in the electric
field, generating a corresponding electrical signal that represents the sound.
Electret microphones are valued for their simplicity and reliability. They require minimal
external components, typically just a bias resistor and a power supply, making them easy to
integrate into electronic circuits. They are also highly sensitive and have a relatively flat
frequency response, making them suitable for capturing a wide range of audio frequencies with
clarity.
Common applications for electret microphones include smartphones, hearing aids, laptops,
voice recorders, and other consumer electronics. They are also used in professional audio
equipment, such as lavalier microphones and headsets, due to their ability to deliver clear sound
in a compact form factor.
Overall, electret microphones are an essential component in modern audio capture, offering a
balance of performance, size, and cost.
4.1.7 6W Speaker
• The speaker produces the audio output, allowing users to hear the played audio files
and responses generated by the system.
• Speakers come in various sizes and power ratings; a 6W speaker is chosen to provide
sufficient audio volume and clarity for the intended application.
4.1.8 Breadboard
A breadboard is a fundamental tool in electronics for prototyping and testing circuits without
soldering. It consists of a rectangular plastic board with a grid of interconnected holes where
electronic components and wires can be inserted. The internal connections of the breadboard
are typically organized in rows and columns, making it easy to create and modify circuits
quickly.
The breadboard is divided into two main sections: the terminal strips and the bus strips.
Terminal strips, located in the center, are used for placing and connecting electronic
components like resistors, capacitors, and integrated circuits. Each row in a terminal strip is
electrically connected, allowing components to share connections. The bus strips, usually
positioned along the sides, are used for power distribution. They consist of long columns for
the positive and negative power rails, providing a convenient way to distribute power to the
entire circuit.
Breadboards are invaluable for experimenting with and troubleshooting circuit designs,
allowing for easy adjustments and replacements of components. They are reusable, which
makes them cost-effective for iterative development. Additionally, they are widely used in
educational settings to teach electronics and circuit design principles, as they offer a hands-on
way to learn about electrical connections and circuitry without permanent commitment.
Arduino provides a standard form factor that breaks the functions of the microcontroller into
a more accessible package.
A program for Arduino may be written in any programming language for a compiler that
produces binary machine code for the target processor. Atmel provides a development
environment for their microcontrollers, AVR Studio and the newer Atmel Studio.
The Arduino project provides the Arduino integrated development environment (IDE), which
is a cross-platform application written in the programming language Java. It originated from the
IDE for the languages Processing and Wiring. It includes a code editor with features such as
text cutting and pasting, searching and replacing text, automatic indenting, brace matching, and
syntax highlighting, and provides simple one-click mechanisms to compile and upload
programs to an Arduino board. It also contains a message area, a text console, a toolbar with
buttons for common functions and a hierarchy of operation menus.
A program written with the IDE for Arduino is called a sketch. Sketches are saved on the
development computer as text files with the file extension .ino. Arduino Software (IDE) pre-1.0
saved sketches with the extension .pde.
The Arduino IDE supports the languages C and C++ using special rules of code structuring.
The Arduino IDE supplies a software library from the Wiring project, which provides many
common input and output procedures. User-written code only requires two basic functions, for
starting the sketch and the main program loop, that are compiled and linked with a program
stub main() into an executable cyclic executive program with the GNU toolchain, also included
with the IDE distribution.
A minimal Arduino C/C++ sketch, as seen by the Arduino IDE programmer, consists of only
two functions:
• setup(): This function is called once when a sketch starts after power-up or reset. It is
used to initialize variables, input and output pin modes, and other libraries needed in the
sketch.
• loop(): After setup() has been called, the loop() function is executed repeatedly in the main
program. It controls the board until the board is powered off or reset.
Arduino Installation: After learning about the main parts of the Arduino UNO board, we are
ready to learn how to set up the Arduino IDE. Once we learn this, we will be ready to upload
our program on the Arduino board.
In this section, we will learn in easy steps, how to set up the Arduino IDE on our computer and
prepare the board to receive the program via USB cable.
Step 1 − First you must have your Arduino board and a USB cable. In case you use an Arduino
UNO, Arduino Duemilanove, Arduino Mega 2560, or Diecimila, you will need a
standard USB cable (A plug to B plug), the kind you would connect to a USB printer.
In case you use an Arduino Nano, you will need an A to Mini-B cable instead.
You can get different versions of the Arduino IDE from the download page on the Arduino official
website. You must select the version which is compatible with your operating system
(Windows, macOS, or Linux). After your file download is complete, unzip the file as shown in
figure 4.8.
The Arduino Uno, Mega, Duemilanove and Arduino Nano automatically draw power from
either the USB connection to the computer or an external power supply. If you are using an
Arduino Diecimila, you have to make sure that the board is configured to draw power from the
USB connection. The power source is selected with a jumper, a small piece of plastic that fits
onto two of the three pins between the USB and power jacks. Check that it is on the two pins
closest to the USB port.
Connect the Arduino board to your computer using the USB cable. The green power LED
(labeled PWR) should glow.
After your Arduino IDE software is downloaded, you need to unzip the folder. Inside the folder,
you can find the application icon with an infinity label (application.exe). Double-click the icon
to start the IDE.
Here, we are selecting just one of the examples with the name Blink. It turns the LED on and
off with some time delay. You can select any other example from the list.
To avoid any error while uploading your program to the board, you must select the correct
Arduino board name, which matches with the board connected to your computer.
Here, we have selected Arduino Uno board according to our tutorial, but you must select the
name matching the board that you are using.
Select the serial device of the Arduino board. Go to the Tools → Serial Port menu. This is likely
to be COM3 or higher (COM1 and COM2 are usually reserved for hardware serial ports). To
find out, you can disconnect your Arduino board and re-open the menu; the entry that disappears
should be the Arduino board. Reconnect the board and select that serial port.
Before explaining how we can upload our program to the board, we must demonstrate the
function of each symbol appearing in the Arduino IDE toolbar shown in fig 4.9.
F − Serial monitor used to receive serial data from the board and send the serial data to the
board.
Now, simply click the "Upload" button in the environment. Wait a few seconds; you will see
the RX and TX LEDs on the board, flashing. If the upload is successful, the message "Done
uploading" will appear in the status bar.
Note − If you have an Arduino Mini, NG, or other board, you need to press the reset button
physically on the board, immediately before clicking the upload button on the Arduino
Software.
4.2.2 Embedded C
Embedded C is one of the most popular and most commonly used Programming Languages in
the development of Embedded Systems.
Embedded C is perhaps the most popular language among Embedded Programmers for
programming Embedded Systems. There are many popular programming languages like
Assembly, BASIC, C++, etc. that are often used for developing Embedded Systems, but
Embedded C remains popular due to its efficiency, shorter development time and portability.
An Embedded System can be best described as a system which has both hardware and
software and is designed to do a specific task. A good example of an Embedded System, which
many households have, is a washing machine.
All these devices have one thing in common: they are programmable i.e. we can write a program
(which is the software part of the Embedded System) to define how the device actually works.
Embedded Software or Program allows Hardware to monitor external events (Inputs) and control
external devices (Outputs) accordingly. During this process, the program for an Embedded
System may have to directly manipulate the internal architecture of the Embedded Hardware
(usually the processor), such as Timers, Serial Communications Interface, Interrupt Handling,
and I/O Ports.
From the above statement, it is clear that the Software part of an Embedded System is as
important as the Hardware part. There is no point in having advanced Hardware Components
with poorly written programs (Software). There are many programming languages that are used
for Embedded Systems like Assembly (low-level Programming Language), C, C++, JAVA
(high-level programming languages), Visual Basic, JAVA Script (Application level
Programming Languages), etc. In the process of making a better embedded system, the
programming of the system plays a vital role and hence, the selection of the Programming
Language is very important.
The following are few factors that are to be considered while selecting the Programming
Language for the development of Embedded Systems.
• Size: The memory that the program occupies is very important as Embedded Processors
like Microcontrollers have a very limited amount of ROM.
• Speed: The programs must be very fast, i.e., they must run as fast as possible. The hardware
should not be slowed down due to slow-running software.
• Portability: The same program can be compiled for different processors.
• Ease of Implementation
• Ease of Maintenance
• Readability
Earlier Embedded Systems were developed mainly using Assembly Language. Even though
Assembly Language is closest to the actual machine code instructions, the lack of portability
and the high amount of resources spent on developing the code made Assembly Language
difficult to work with.
There is actually not much difference between C and Embedded C apart from a few extensions
and the operating environment. Both C and Embedded C are ISO standards that have almost the
same syntax, datatypes, functions, etc.
The next thing to understand in the Basics of Embedded C Program is the basic structure or
Template of Embedded C Program. This will help us in understanding how an Embedded C
Program is written.
Python is an easy to learn, powerful programming language. It has efficient high-level data
structures and a simple but effective approach to object-oriented programming. Python’s
elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal
language for scripting and rapid application development in many areas on most platforms. The
Python interpreter is easily extended with new functions and data types implemented in C or
C++ (or other languages callable from C). Python is also suitable as an extension language for
customizable applications.
Python 3.7 is a suitable version for machine learning tasks. Many popular machine learning
libraries such as TensorFlow, PyTorch, scikit-learn, and Keras support Python 3.7. You can
utilize these libraries to build, train, and deploy machine learning models effectively.
Here's how you can set it up for machine learning:
Install Python 3.7: Make sure you have Python 3.7 installed on your system. You can download
it from the official Python website.
Open Python IDLE: Once Python 3.7 is installed, you can open Python IDLE by searching for
it in your operating system's application launcher or by running idle3 or idle command in the
terminal/command prompt.
Install machine learning libraries: You'll need to install the necessary machine learning libraries
to work with in Python IDLE. You can install libraries like TensorFlow, PyTorch, scikit-learn,
etc., using pip.
Write and run your code: You can now start writing Python code for machine learning tasks
directly in Python IDLE. You can create a new Python file by selecting "File" > "New File"
from the menu, or simply start typing in the interactive shell. You can then run your code by
selecting "Run" > "Run Module" from the menu or by pressing F5.
Debugging: Python IDLE provides basic debugging capabilities, such as setting breakpoints,
stepping through code, inspecting variables, etc. You can utilize these features to debug your
machine learning code as needed.
While Python IDLE is a simple and easy-to-use IDE, it may lack some of the advanced features
and integrations that are available in other IDEs specifically designed for machine learning
development.
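As a quick sanity check of such a setup, a few lines run from IDLE (or any Python interpreter) can confirm the interpreter version and which libraries are importable; the library list below is only an example and not a required set:

import sys

print("Python", sys.version.split()[0])   # expect something like 3.7.x

for name in ("tensorflow", "torch", "sklearn", "keras", "numpy", "pandas"):
    try:
        module = __import__(name)
        print(name, "version", getattr(module, "__version__", "unknown"))
    except ImportError:
        print(name, "is not installed; it can be added with pip (e.g. pip install " + name + ")")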
Chapter 5
IMPLEMENTATION
The proposed work involves four primary sub-sections, i.e., Sensor Interfacing, Data Collection
and Pre-processing, Feature Extraction and Selection, and Gesture Recognition.
The overall workflow is illustrated in Figure 5.1.
The flex sensor (SEN-08606) is the primary component used in this work. The hardware prototype
has flex sensors mounted on the fingers. A flex sensor's terminal resistance changes when it is
bent, and this helps in detecting motion in a specific part of the body. The flex sensor does
not have polarized terminals, so there are no positive and negative terminals. In general, pin
P1 is connected to the positive terminal of the power source and P2 is connected to ground, as
illustrated in Figure 5.2. As the bend in the flex sensor increases, its resistance increases.
Figure 5.2 shows the connection followed in order to interface with the Arduino. After properly
connecting the sensor to the Arduino, the latter is connected to a PC. With the help of the
Arduino IDE, a specifically designed sketch is uploaded to the Arduino, so that even a slight
change in the bend of the flex sensor produces a change in the readings obtained as output
from the Arduino.
• Long copper wires were used to connect the flex sensors to the Arduino and resistor as per the
circuit diagram.
Once the setup was ready, the connections were checked again to ensure there were no loose
connections, and the work then proceeded to the next stage, i.e., data pre-processing.
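Before pre-processing, the raw readings need to be captured on the PC side. A minimal host-side logging sketch is given below, assuming the Arduino prints the two sensor values as comma-separated integers over the serial port; the port name, baud rate, file name and output format are assumptions:

import csv
import serial  # pyserial

PORT = "COM3"      # assumed serial port of the Arduino
BAUD_RATE = 9600   # assumed baud rate used by the uploaded sketch

def log_readings(path, samples=200):
    """Read 'x,y' lines from the Arduino and append them to a CSV file."""
    with serial.Serial(PORT, BAUD_RATE, timeout=1) as port, open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for _ in range(samples):
            line = port.readline().decode(errors="ignore").strip()
            if line.count(",") == 1:                     # expect exactly two values per line
                x, y = (int(value) for value in line.split(","))
                writer.writerow([x, y])

log_readings("gesture_data_raw.csv")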
Data Pre-processing: In this phase, the collected data were categorized into four basic class
labels, i.e., Class label 0, Class label 1, Class label 2 and Class label 3. The gestures signified
by the class labels are described here-under:
• Class label 0: This class label indicated the hand gesture in which both the index finger
and the middle finger are curled inside.
• Class label 1: This class label indicated the hand gesture in which both the index finger
and the middle finger are fully stretched and kept in a manner that represents a ‘V’ symbol.
• Class label 2: For this class, the middle finger is curled inside while the index finger
remains stretched as in the previous case, and the readings are noted.
• Class label 3: This is the last class label considered. The hand gesture in this class
label is derived from the hand gesture considered in class label 1. The difference is that the
index finger is curled inside in this case, with all other finger positions remaining the same.
The data recorded from each sensor is an integer value. A new value was obtained every 0.25
seconds, reflecting the change in the bending of the sensor. The two values obtained are
labelled x and y in our dataset; x and y represent the values obtained from the flex sensors
tied to the index finger and middle finger, respectively. Before using the dataset, it was
carefully shuffled using the random library of Python. The shuffling ensured sufficient
availability of data from the various labels to the machine learning model for training, testing
and validation purposes. Outlier readings caused by physical noise were also removed from the dataset.
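A minimal sketch of this shuffling step is shown below, assuming the labelled samples are stored in a CSV file with columns x, y and label; the file name, column names and 80/20 split ratio are assumptions:

import csv
import random

with open("gesture_data_labelled.csv") as f:
    rows = [(int(r["x"]), int(r["y"]), int(r["label"])) for r in csv.DictReader(f)]

random.seed(42)        # fixed seed so the shuffle is reproducible
random.shuffle(rows)   # mix the class labels before splitting

split = int(0.8 * len(rows))                  # assumed 80/20 train-test split
train_rows, test_rows = rows[:split], rows[split:]
print(len(train_rows), "training samples,", len(test_rows), "test samples")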
• First Difference: This kind of feature extraction also takes into account the value of the
previous row and finds the deviation from it, i.e., Xfd and Yfd. The absolute value
is considered. Therefore, Xfd = abs(xi − xi−1) and Yfd = abs(yi − yi−1).
• Second Difference: This kind of feature extraction takes into account the values of the
previous row and the upcoming row and measures how far the current value deviates from its
neighbours, i.e., Xsd and Ysd. The absolute value is considered. Therefore,
Xsd = abs(xi+1 − 2∗xi + xi−1) and Ysd = abs(yi+1 − 2∗yi + yi−1).
A total of 6 features were extracted from the pre-existing data and were then used for training
the machine learning models. It was observed that the models performed better when the
extra features were present. The data included the values of x and y when the gestures were
prominent, while the data recorded during the transition from one gesture to another, or when
the gesture was ambiguous, was discarded.
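The feature computation described above can be expressed compactly with NumPy. The sketch below builds the six-column feature matrix [x, y, Xfd, Yfd, Xsd, Ysd]; padding the first and last rows of the difference features with zeros is an assumption made here to keep all columns the same length:

import numpy as np

def extract_features(x, y):
    """Return an (n, 6) matrix [x, y, Xfd, Yfd, Xsd, Ysd] for 1-D reading arrays x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # First difference: Xfd[i] = |x[i] - x[i-1]| (first row padded with 0)
    xfd = np.abs(np.diff(x, prepend=x[0]))
    yfd = np.abs(np.diff(y, prepend=y[0]))
    # Second difference: Xsd[i] = |x[i+1] - 2*x[i] + x[i-1]| (both ends padded with 0)
    xsd = np.concatenate(([0.0], np.abs(np.diff(x, n=2)), [0.0]))
    ysd = np.concatenate(([0.0], np.abs(np.diff(y, n=2)), [0.0]))
    return np.column_stack([x, y, xfd, yfd, xsd, ysd])

features = extract_features([512, 515, 530, 700, 705], [410, 412, 415, 418, 600])
print(features.shape)   # (5, 6)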
The main motivation behind using this model is to achieve the maximum possible accuracy even
if some alteration has been introduced into the training or testing data by external factors. The
external factors can be anything or anyone trying to change the data in order to force a wrong
output, intentionally; such people may be termed hackers. The attacks most likely to target our
deep learning models include the fast-gradient sign method (FGSM), the basic iterative method
(BIM) and the momentum iterative method (MIM). These attacks are among the gradient-based
evasion techniques that attackers use to evade the classification model. Adversarial attacks take
place when noise is added to the data, which in turn, while validating the already trained model,
may result in the classification of false labels. The proposed model is used to prevent such
attacks in order to develop an efficient and robust ANN model which can correctly classify
labels despite being fed noisy data. The architecture of the Adversarial Learning model is
illustrated in Figure 5.3.
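A minimal Keras sketch of FGSM-style adversarial training is given below, assuming a small dense ANN over the six extracted features and the four class labels; the layer sizes, perturbation strength (epsilon) and training schedule are illustrative assumptions rather than the exact architecture of Figure 5.3:

import numpy as np
import tensorflow as tf
from tensorflow import keras

def fgsm_examples(model, x, y, epsilon=0.05):
    """Craft FGSM adversarial examples: x_adv = x + epsilon * sign(grad_x loss)."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = keras.losses.sparse_categorical_crossentropy(y, model(x, training=False))
    gradient = tape.gradient(loss, x)
    return (x + epsilon * tf.sign(gradient)).numpy()

# Assumed ANN: six input features, four gesture classes
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(6,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

def adversarial_train(model, x_train, y_train, epochs=100):
    """Each epoch, augment the clean data with FGSM-perturbed copies and fit on both."""
    for _ in range(epochs):
        x_adv = fgsm_examples(model, x_train, y_train)
        model.fit(np.concatenate([x_train, x_adv]),
                  np.concatenate([y_train, y_train]),
                  epochs=1, batch_size=32, verbose=0)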
Chapter 6
RESULT
The entire code for the experiment was developed using Keras, a Python framework commonly
used for developing deep learning models. Additionally, the other libraries used included
scikit-learn, pandas and NumPy. The experiment required the use of a computer with a
dedicated Graphics Processing Unit (GPU), as a result of which a machine with an AMD
Ryzen 5 3550H processor and an NVIDIA GeForce GTX 1050 graphics card was used.
Adversarial Learning was found to have outperformed the standard classifiers. The
comparisons were done by running all the machine learning algorithms on the dataset obtained
from our initial work, keeping all the necessary conditions as similar as possible in order to
allow a fair comparison of all the models.
Table 6.1 shows the performance metrics of the ANN used before the application of Adversarial
Learning on our dataset:
Considering the Model Accuracy graph in Figure 6.1, it is evident from the figure that the model
is performing well. Initially, the accuracy values are low for both testing and training, but after
20 epochs a sharp increase in their values is observed. The values keep increasing and
decreasing for the next 8-10 epochs. After 30 epochs, consistently high accuracy values are
observed with little further change, and this continues for the remaining 70 epochs.
Considering the Model Loss graph in Figure 6.2, we can observe that the training loss and
testing loss are very similar in value throughout the execution. This indicates that the chances
of overfitting or underfitting are extremely low.
From the graph given in Figure 6.3, it is observed that the testing process records a
commendable accuracy of 88.32%.
However, it is important to compare the results of this approach with other standard
classifiers in order to draw definite conclusions.
From Table 6.2, it is quite evident that our proposed model has recorded the best
performance metrics compared to the other standard classifiers. We also observe that the
results for Logistic Regression and Linear Discriminant Analysis are the worst of the lot. This
is primarily because they are inconsistent when it comes to classifying multi-class datasets.
The other classifiers' performance was mediocre.
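For reference, a hedged sketch of how such a baseline comparison can be run with scikit-learn is given below; the particular classifiers and split ratio are assumptions, while the actual models and scores are those reported in Table 6.2:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def compare_baselines(features, labels):
    """Fit a few standard classifiers on the six-feature dataset and print test accuracy."""
    x_tr, x_te, y_tr, y_te = train_test_split(features, labels, test_size=0.2, random_state=0)
    baselines = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
        "K-Nearest Neighbours": KNeighborsClassifier(),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
    }
    for name, clf in baselines.items():
        clf.fit(x_tr, y_tr)
        print(f"{name}: test accuracy {clf.score(x_te, y_te):.4f}")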
6.1 Advantages
• Enhanced Accessibility: Hand gesture controls make audio systems more accessible to
individuals with mobility impairments or physical disabilities, allowing them to interact
with devices without needing to press buttons or turn knobs.
• Hygienic Operation: Since gesture controls eliminate the need for physical contact, they
are more hygienic, reducing the risk of spreading germs and making them ideal for use in
public or shared environments.
• Intuitive Use: Gestures are a natural form of communication, making the system easy to
use without a steep learning curve. Users can quickly learn and remember simple gestures
to control audio functions.
• Convenience: Users can control audio playback from a distance, without needing to
physically reach the device. This is especially useful in large rooms, while cooking, or
during exercise.
• Modern User Experience: Gesture control adds a futuristic and sophisticated element to
audio systems, enhancing user engagement and satisfaction through innovative interaction
methods.
• Reduced Wear and Tear: Since there are no physical buttons or dials being used, the
system experiences less mechanical wear and tear, potentially extending the lifespan of the
device and reducing maintenance costs.
6.2 Disadvantages
• Calibration Sensitivity: Flex sensors require precise calibration to accurately detect and
interpret gestures. Any deviation can lead to incorrect commands, reducing the system’s
reliability and user satisfaction.
• Limited Gesture Range: Flex sensors primarily detect bending movements, which can
limit the range of recognizable gestures. Complex gestures involving multiple hand or
finger positions might be challenging to implement and detect accurately.
• Durability and Wear: Continuous bending and flexing can wear out the sensors over time,
reducing their lifespan and necessitating frequent replacements or maintenance, especially
in high-usage scenarios.
• Comfort and Ergonomics: Wearing gloves or attachments with integrated flex sensors for
extended periods can be uncomfortable or restrictive, potentially causing strain or
discomfort to the user.
• Cost: Implementing flex sensor technology can be expensive due to the need for high-
quality sensors, calibration equipment, and integration with existing audio systems, making
it less accessible for budget-conscious users or applications.
• Complex Setup: Initial setup and integration of flex sensor-based gesture control systems
can be complex, requiring technical expertise to ensure proper functionality. Users without
technical knowledge may find it difficult to install and troubleshoot the system.
6.3 Applications
• Wearable Technology: Flex sensors embedded in gloves can be used to control audio
playback on wearable devices, such as smartwatches or fitness trackers, allowing users to
manage music and other audio without needing to touch the device.
• Rehabilitation and Physical Therapy: In physical therapy settings, patients can use flex
sensor-equipped gloves to control therapeutic audio or exercise instructions, providing an
engaging and interactive way to follow therapy regimens.
• Gaming and Virtual Reality: Flex sensors in VR gloves can be used to control in-game
audio, enhancing the immersive experience by allowing gamers to adjust sound settings or
switch audio tracks with hand gestures.
• Assistive Devices for Disabilities: For individuals with mobility impairments, flex sensors
can be integrated into assistive devices to enable gesture-based control of audio systems,
facilitating easier and more intuitive interactions with technology.
• Home Automation: Flex sensors can be part of a smart home system, where users wearing
gesture-detecting gloves can control home audio systems to play music, adjust volume, or
switch tracks seamlessly while performing other tasks.
• Educational Tools: In classrooms, teachers can use flex sensor-equipped gloves to control
audio-visual aids during lectures, making it easier to manage multimedia content while
engaging with students interactively.
• Automotive Controls: Drivers can wear gloves with flex sensors to control the car’s audio
system, enabling safe and convenient hands-free interaction to adjust volume, change
stations, or navigate playlists without taking their hands off the steering wheel.
Chapter 7
REFERENCES
[1] M. Alagu Sundaram, "A Review on Hand Gesture Recognition Techniques,
Methodologies, and Its Application," International Journal of Computer Applications,
Volume 117(23), Pages 6-11.
[2] J. Li, "Real-Time Hand Gesture Recognition System for Human-Computer Interaction,"
IEEE Transactions on Human-Machine Systems, Volume 49(1), Pages 123-136.
[3] K. Saroha, "Gesture Recognition Using Machine Learning: A Review," Pattern
Recognition Letters, Volume 142, Pages 21-35.
[4] S. Velázquez, "Gesture Recognition in Ambient Assisted Living Environments: A
Review," Sensors, Volume 19(14), Pages 3091-3112.
[5] A. Gupta, "Gesture-Controlled Music Playback Systems: A Survey," Multimedia Tools
and Applications, Volume 79(13), Pages 9231-9255.
[6] Georgi, Marcus, Christoph Amma, and Tanja Schultz, "Recognizing Hand and Finger
Gestures with IMU based Motion and EMG based Muscle Activity Sensing," In
Biosignals, pp. 99-108, 2015.
[7] Ariyanto, Mochammad, Wahyu Caesarendra, Khusnul A. Mustaqim, Mohamad Irfan,
Jonny A. Pakpahan, Joga D. Setiawan, and Andri R. Winoto, "Finger movement pattern
recognition method using artificial neural network based on electromyography (EMG)
sensor," In 2015 International Conference on Automation, Cognitive Science, Optics,
Micro Electro-Mechanical System, and Information Technology (ICACOMIT), pp. 12-17,
IEEE, 2015.
[8] Jorgensen, Chuck, D. Lee, and Shane Agabon, "Sub Auditory Speech Recognition
Based on EMG/EPG Signals," In Proceedings of the International Joint Conference on
Neural Networks, pp. 1098-7576, 2003.
[9] Zhang, Xu, Xiang Chen, Yun Li, Vuokko Lantz, Kongqiao Wang, and Jihai Yang, "A
framework for hand gesture recognition based on accelerometer and EMG sensors,"
IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 41,
no. 6 (2011): 1064-1076.
[10] Dixit, Shantanu K., and Nitin S. Shingi, "Implementation of flex sensor and
electronic compass for hand gesture based wireless automation of material handling
robot," International Journal of Scientific and Research Publications 2, no. 12 (2012):
1-3.
APPENDIX