A Mini Project Report on
Autoencoders
Submitted in partial fulfilment of the requirements for the award of the
Degree
in
Computer Science and Engineering
By
Trapti Chauhan
2200821530049
Under the Guidance of
Ms. Anu Sharma
Mr. Varun Agarwal
This project would not have been possible without the combined efforts of
all those who contributed directly or indirectly. Their belief in my potential
has been instrumental in helping me achieve this milestone.
Smita Singh
Section – D
Roll No - 2200821530049
Table of Contents
Abstract
Acknowledgement
List of Tables
List of Figures
Chapter 1: Introduction
3.1 Methodology
3.2 Architecture Design
3.3 Flowchart
Chapter 4: Implementation
6.1 Conclusion
6.2 Future Scope
References
List of Tables
This table provides an overview of the dataset used to train the Autoencoder
model. It includes important statistics such as the number of samples, the
dimensions of the images (e.g., 28x28 pixels for MNIST), the data split (e.g.,
training, validation, and test
sets), and the preprocessing steps applied (e.g., normalization or reshaping
of images). These statistics help the reader understand the scope and
nature of the data used in the training process.
List of Figures

This figure presents the training loss curve during the autoencoder's
learning process. The loss curve shows how the reconstruction error
decreases as the model trains over time. By analysing this curve, one can
determine if the model is learning effectively,
identify potential overfitting or underfitting, and decide whether the training
process needs adjustment.
This figure compares the original input images with the corresponding
reconstructed images produced by the autoencoder. It demonstrates the
model's ability to capture the essential features of the data. The closer the
reconstructed images are to the inputs, the better the autoencoder has
learned to encode and decode the information. This comparison is crucial
for evaluating the model’s performance.
Chapter 1: Introduction
1. Encoder: The encoder takes the input data and maps it to a lower-dimensional latent space, also known as the bottleneck. This step compresses the input by extracting the most critical features.
2. Decoder: The decoder takes the latent representation and reconstructs the original input from it, producing an output that resembles the input as closely as possible.
Types of Autoencoders
1.3.2 Denoising Autoencoders
Autoencoders are effective for anomaly detection because they learn the
patterns of normal data during training. When presented with anomalous
data, the autoencoder struggles to reconstruct it accurately, resulting in
a higher reconstruction error. This discrepancy can be used to identify
anomalies.
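A minimal sketch of this idea is shown below (not taken from the original report; the trained autoencoder model, the input array x of shape (n, 28, 28), and the threshold value are all assumptions):

import numpy as np

def detect_anomalies(autoencoder, x, threshold):
    # Reconstruct the inputs with the trained autoencoder
    x_hat = autoencoder.predict(x)
    # Mean squared reconstruction error for each sample
    errors = np.mean(np.square(x - x_hat), axis=(1, 2))
    # Samples whose error exceeds the threshold are flagged as anomalies
    return errors > threshold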
1. Encoder
z = f_θ(x)
where f_θ is the encoder function with parameters θ (for example, the weights and biases of the network).
2. Latent Space (Bottleneck)
z is the compressed, low-dimensional representation of the input produced by the encoder.
3. Decoder
x̂ = g_φ(z)
where g_φ is the decoder function with parameters φ. The goal of the decoder is to produce x̂ that closely resembles x.
Input -> Encoder -> Latent Space -> Decoder -> Reconstructed Output
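As a purely illustrative sketch of these two mappings (the single-layer weights, the ReLU and sigmoid activations, and the dimensions 784 and 32 are assumptions chosen to match the MNIST setup used later in this report):

import numpy as np

rng = np.random.default_rng(0)

# Encoder parameters theta: one weight matrix and bias (784 -> 32)
W_enc, b_enc = rng.normal(scale=0.01, size=(784, 32)), np.zeros(32)
# Decoder parameters phi: one weight matrix and bias (32 -> 784)
W_dec, b_dec = rng.normal(scale=0.01, size=(32, 784)), np.zeros(784)

def f_theta(x):
    # Encoder: maps the flattened input to the latent vector z
    return np.maximum(0.0, x @ W_enc + b_enc)          # ReLU

def g_phi(z):
    # Decoder: maps the latent vector back to the input space
    return 1.0 / (1.0 + np.exp(-(z @ W_dec + b_dec)))  # sigmoid

x = rng.random(784)      # a dummy flattened 28x28 input
z = f_theta(x)           # latent representation (32 values)
x_hat = g_phi(z)         # reconstruction of x (784 values)

In practice these mappings are implemented as multi-layer neural networks, as in the Keras model built in Chapter 4.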
Layers of an Autoencoder
An autoencoder is typically built from an input layer, one or more encoder (hidden) layers, a bottleneck (latent) layer, one or more decoder (hidden) layers, and an output layer of the same size as the input.
Applications:
Data compression
Feature extraction
Key Idea:
Input: x + noise; the autoencoder is trained to reconstruct the clean x (see the sketch below).
Applications:
Image denoising
Signal processing
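A small sketch of this denoising setup (the Gaussian noise and the noise level of 0.2 are assumptions, not values taken from this report; x_train denotes the normalized training images):

import numpy as np

def add_noise(x, noise_factor=0.2):
    # Add zero-mean Gaussian noise and keep pixel values inside [0, 1]
    noisy = x + noise_factor * np.random.normal(size=x.shape)
    return np.clip(noisy, 0.0, 1.0)

# A denoising autoencoder is trained on (noisy input, clean target) pairs:
# autoencoder.fit(add_noise(x_train), x_train, epochs=10, batch_size=128)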
Applications:
Image generation
Anomaly detection
Data synthesis
2.4 Loss Functions
The Mean Squared Error (MSE) is the most commonly used loss function for autoencoders. It calculates the average squared difference between the original input x and the reconstructed output x̂:

MSE = (1/n) * Σ (x_i - x̂_i)²

where:
x_i = original input
x̂_i = reconstructed output
n = number of data points
Advantages of MSE:
It is simple to compute and differentiable, which makes it convenient for gradient-based training.
Squaring the differences penalizes large reconstruction errors more heavily than small ones.
Interpretation:
A lower MSE indicates that the reconstructed output is closer to the original
input, meaning the autoencoder is learning effectively.
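The same quantity can be computed directly, for example with NumPy (a sketch; the sample values are arbitrary):

import numpy as np

def mse(x, x_hat):
    # Mean of the squared element-wise differences
    return np.mean(np.square(x - x_hat))

x = np.array([0.0, 0.5, 1.0])
print(mse(x, x))        # 0.0: a perfect reconstruction
print(mse(x, x + 0.1))  # approximately 0.01: a small, uniform error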
While MSE is the most common choice, other loss functions can be used depending on the task; for example, binary cross-entropy is often preferred when pixel values are normalized to the range [0, 1].
3.1 Methodology
Data Collection
For this project, the MNIST dataset is used as the primary source of data.
The MNIST dataset is a collection of 70,000 grayscale images of
handwritten digits, ranging from 0 to 9. Each image is 28x28 pixels in size,
making it suitable for autoencoder models due to its simplicity and
relatively low computational cost. The dataset is divided into a training set of 60,000 images and a test set of 10,000 images.
Preprocessing
1. Normalization:
The pixel values in the images are normalized to a range between 0 and 1. This helps the model converge faster during training. The normalization formula is:
x_normalized = x / 255
where x is the original pixel value in the range [0, 255].
2. Flattening:
Each 28x28 image is flattened into a 784-dimensional vector before
being fed into the autoencoder. This allows the input to be processed
by fully connected (dense) layers.
3. Batching:
The data is loaded in mini-batches during training to improve efficiency. A
typical batch size used is 128.
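These steps can be written out as follows (a sketch assuming TensorFlow and that x_train holds the raw MNIST training images; the original report does not show this code):

import tensorflow as tf

# 1. Normalization: scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255.0

# 2. Flattening: turn each 28x28 image into a 784-dimensional vector
x_train_flat = x_train.reshape(-1, 784)

# 3. Batching: feed the data to the model in mini-batches of 128
train_ds = tf.data.Dataset.from_tensor_slices((x_train_flat, x_train_flat))
train_ds = train_ds.shuffle(buffer_size=60000).batch(128)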
Summary of Methodology Steps:
Collect the MNIST data, normalize and flatten the images, batch them, train the autoencoder, and evaluate the quality of the reconstructions.
3.2 Architecture Design

Design Overview
The model follows the standard structure of encoder, latent space (bottleneck), and decoder described in Chapter 1: the encoder compresses each input image into a low-dimensional latent vector, and the decoder reconstructs the image from that vector.

Encoder Design
The encoder compresses each 28x28 input image (784 pixel values) through fully connected layers of 128, 64, and 32 units, so the latent space has 32 dimensions.

Decoder Design
The decoder reconstructs the input data from the latent space. It mirrors the encoder's structure: the 32-dimensional latent vector is expanded back through layers of 64 and 128 units to the original 784 dimensions and reshaped into a 28x28 image.
3.3 Flowchart

The following flowchart illustrates the overall data flow in the autoencoder
model, from input preprocessing to training and reconstruction.
Fig-3.3.1
Explanation of the Flowchart
1. MNIST Dataset: The dataset serves as the input for the autoencoder.
2. Preprocessing: The images are normalized and flattened before being fed to the model.
3. Encoder: The preprocessed input is compressed into the latent space.
4. Decoder: The latent representation is expanded back into a reconstructed image.
5. Training: The reconstruction is compared with the original input using the MSE loss, and the model parameters are updated accordingly.
4.2 Preprocessing
The pixel values are scaled to the range [0, 1] so that the inputs to the autoencoder lie in a range that is easier for the model to process.
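A sketch of this step, assuming the dataset is loaded through tf.keras.datasets (the loading code itself is not shown in the original report):

from tensorflow.keras.datasets import mnist

# Load the MNIST training and test splits (60,000 and 10,000 images)
(x_train, _), (x_test, _) = mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0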
4.3 Autoencoder Model
The autoencoder consists of two main parts: the encoder and the
decoder.
Encoder:
The encoder compresses the flattened 784-dimensional input into a 32-dimensional latent representation.
Decoder:
The decoder reconstructs the input data from the compressed latent space.
Both parts are built as Keras Sequential models with the following layers:
from tensorflow.keras import layers, models

# Encoder: compresses the 28x28 input into a 32-dimensional latent vector
encoder = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu')
])

# Decoder: reconstructs the 28x28 image from the latent vector
decoder = models.Sequential([
    layers.Dense(64, activation='relu'),
    layers.Dense(128, activation='relu'),
    # Restore the 784 pixel values before reshaping; sigmoid keeps the
    # outputs in [0, 1] (this layer is assumed, as the original line was lost)
    layers.Dense(784, activation='sigmoid'),
    layers.Reshape((28, 28))
])

# Autoencoder Model: the encoder followed by the decoder
autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer='adam', loss='mse')
Once the model is defined, it is trained on the training data (x_train), using each image as both the input and the target. We train the autoencoder for 10 epochs, using Mean Squared Error (MSE) as the loss function, which measures the difference between each input image and its reconstruction.
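The corresponding training call might look like this (a sketch; the batch size of 128 follows the methodology chapter, and using x_test for validation is an assumption):

# Train the autoencoder to reproduce its own input (input = target)
history = autoencoder.fit(
    x_train, x_train,
    epochs=10,
    batch_size=128,
    validation_data=(x_test, x_test)
)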
5.1 Training Loss Curve

The Training Loss Curve is a critical indicator of how well the model is
learning over time. In this project, the loss function used is Mean
Squared Error (MSE), which measures the difference between the
input images and their corresponding
reconstructed images.
Towards the end of the training, the curve starts to flatten, which
means the model has converged and further improvements in
reconstruction quality are minimal.
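If the history object returned by fit() is kept, the loss curve described here can be plotted with matplotlib (a sketch; the validation curve is only available if validation data was supplied during training):

import matplotlib.pyplot as plt

# Reconstruction loss recorded at the end of each epoch
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.show()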
5.2 Reconstructed Images
One of the primary goals of the autoencoder is to reconstruct the input images
after compressing them into a lower-dimensional latent space. Here, we
compare the
original input images with their corresponding reconstructed images
produced by the trained autoencoder.
Analysis:
The original images are displayed on the left, and the reconstructed
images are shown on the right.
These results show that the autoencoder is capable of capturing the essential
features of the MNIST digits and reconstructing them with minimal loss of
information.
Fig-5.2.1
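A comparison like the one in Fig-5.2.1 can be produced with a few lines of matplotlib (a sketch; the choice of five test digits is arbitrary):

import matplotlib.pyplot as plt

n = 5  # number of test digits to display
reconstructed = autoencoder.predict(x_test[:n])

plt.figure(figsize=(4, 2 * n))
for i in range(n):
    # Left column: original test images
    plt.subplot(n, 2, 2 * i + 1)
    plt.imshow(x_test[i], cmap='gray')
    plt.axis('off')
    # Right column: reconstructions produced by the autoencoder
    plt.subplot(n, 2, 2 * i + 2)
    plt.imshow(reconstructed[i], cmap='gray')
    plt.axis('off')
plt.show()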
5.3 Discussion
The reconstructed digits closely match the originals, which suggests that the 32-dimensional latent space retains enough information to describe the MNIST digits despite the strong compression from 784 values.
6.1 Conclusion
The model performed well on the MNIST dataset, and the training loss curve
confirmed that the autoencoder effectively minimized reconstruction error
over time. These results highlight the versatility and effectiveness of
autoencoders in learning meaningful representations of data, even with
limited training epochs.
6.2 Future Scope
Future work could extend this project to denoising autoencoders, variational autoencoders for image generation and data synthesis, and anomaly detection on more complex datasets, building on the variants introduced in Chapter 1.