
NB4-06: Biomedical Imaging and Pytorch I

NOTE: We will train a network model for 30 epochs to verify that the whole algorithm
works. Then we will load a better network model, trained for 100 epochs, and validate
with it. This 100-epoch model must be loaded from our local disk, so you must first
download the NB4-06 folder from GitHub to your local disk.

1. Introduction

PyTorch is an open-source machine learning library developed by Facebook's AI


Research lab (FAIR) in 2016. It is widely used for applications such as natural language
processing, computer vision, and other machine learning tasks. PyTorch provides a
flexible and dynamic environment for deep learning research and production, making
it a popular choice among researchers and practitioners.

PyTorch is built on the Torch library, which was initially developed in Lua. However,
PyTorch is implemented in Python, which has contributed to its rapid adoption due to
Python's widespread use in the data science community.

Key Features of PyTorch

 Dynamic Computation Graphs: Unlike some other deep learning frameworks


(like TensorFlow 1.x, which used static computation graphs), PyTorch allows the
creation of dynamic computation graphs. This means the graph is built on the
fly as operations are executed, providing greater flexibility and ease of
debugging.

 Pythonic Nature: PyTorch is deeply integrated with Python, making it easy to


use and learn. Its syntax and interface are consistent with Python’s standard
libraries.

 Autograd (Automatic Differentiation): PyTorch includes a powerful tool


called autograd, which automatically computes gradients of tensors. This
feature is critical for training neural networks using backpropagation.

 Tensor Computation: PyTorch's tensors are similar to NumPy arrays but with
added capabilities, such as GPU acceleration, which makes PyTorch efficient for
large-scale numerical computation.

 GPU Acceleration: PyTorch supports GPU acceleration through CUDA, allowing


for significant performance improvements in deep learning tasks.

 TorchScript: PyTorch includes TorchScript, which allows for seamless


transitioning between research and production by converting PyTorch models
into a statically typed intermediate representation that can be optimized and
run in various environments.

Applications of PyTorch

PyTorch is versatile and can be used for a variety of machine learning tasks:

 Computer Vision: PyTorch is widely used in the field of computer vision for
tasks such as image classification, object detection, and Generative Adversarial
Networks(GANs). Libraries like Torchvision, which is built on top of PyTorch,
provide tools for image transformations, datasets, and pre-trained models.

 Natural Language Processing (NLP): PyTorch has extensive support for NLP
tasks, such as sentiment analysis, machine translation, and text generation.
Libraries like Torchtext provide utilities to handle text data and build models.

 Reinforcement Learning: PyTorch is also used in reinforcement learning,


where agents learn to make decisions by interacting with an environment. The
flexibility of PyTorch makes it suitable for implementing complex algorithms.

 Generative Models: PyTorch is commonly used to implement generative models, including Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), which are popular in creative AI applications.

Comparison with Other Tools

PyTorch is one of several popular deep learning frameworks. Here’s how it compares
to others:

 TensorFlow: TensorFlow, developed by Google, is one of the most widely used


deep learning frameworks. TensorFlow 2.x introduced eager execution, making
it more similar to PyTorch’s dynamic graph approach. However, PyTorch remains
preferred for research due to its flexibility and simplicity.

 Keras: Keras is a high-level API that runs on top of TensorFlow. It’s designed to
be user-friendly and is ideal for beginners. PyTorch, while slightly more complex,
offers greater control and flexibility, making it preferred for more advanced
users.

 MXNet: Developed by Apache, MXNet is another deep learning


framework known for its efficiency and scalability, especially in distributed
computing environments. While it offers similar functionalities to PyTorch, it
hasn’t achieved the same level of popularity or community support.

 Chainer: Chainer is a framework that also uses dynamic computation


graphs, similar to PyTorch. It’s popular in Japan but hasn’t seen widespread
adoption globally.

Advantages of Using PyTorch

 Flexibility: The dynamic nature of PyTorch allows for easier debugging and
experimentation, which is crucial in research settings.

 Community and Ecosystem: PyTorch has a large and active community,


which means there are numerous tutorials, forums, and libraries available. The
ecosystem includes libraries like Torchvision (for computer vision), Torchtext (for
NLP), and more.

 Research to Production: PyTorch’s TorchScript and integration


with ONNX(Open Neural Network Exchange) facilitate the transition
from research models to production environments.
 Ease of Learning: The Pythonic design of PyTorch makes it easy to pick up,
especially for those who are already familiar with Python and NumPy.

2. Setting up Our Workspace

Checking the Computing Device GPU

First, we check whether a GPU is connected. In Google Colab this check is immediate
and usually unnecessary, but in other programming environments (local programming)
it may be necessary.

We can use:

import tensorflow as tf
tf.config.list_physical_devices('GPU')

but we don't want to use TensorFlow in this notebook. One alternative is to use a
system command.

The nvidia-smi command (NVIDIA System Management Interface) is used to monitor
and manage NVIDIA GPUs (Graphics Processing Units) in a system. It provides detailed
information about the status and performance of the GPUs, including GPU utilization,
temperature, memory usage, processes utilizing the GPU, and more.
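
In a Colab-style notebook, a minimal check might look like the following sketch; the torch-based lines are an alternative that avoids TensorFlow entirely.

import subprocess
import torch

# Show the nvidia-smi report (in a notebook cell you could simply run `!nvidia-smi`).
subprocess.run(["nvidia-smi"], check=False)

# Programmatic alternative: ask PyTorch directly whether CUDA is usable.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device name:", torch.cuda.get_device_name(0))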

Explanation of the output provided by the nvidia-smi command:

1. Driver and CUDA Information:

o Driver Version: The version of the installed NVIDIA driver.

o CUDA Version: The version of CUDA installed on the system.

2. GPU Information:

o GPU: Id number.

o Name: Name of the GPU.

o Persistence-M: Indicates whether GPU persistence is enabled or disabled.


Feature that allows NVIDIA GPUs to remain active and ready to process
tasks even when there isn't a specific computing task to perform. Instead
of powering off after completing a task, the GPU stays active in a low-
power state, reducing downtime and the wait time for the GPU to be
activated again when needed for a new task.

o Bus-Id: GPU bus identifier.

o Disp.A: Display state (on or off).

o Volatile Uncorr. ECC: Volatile uncorrectable memory error correction.

o Fan: GPU fan speed.

o Temp: GPU temperature.


o Perf: GPU performance.

o Pwr:Usage/Cap: GPU power usage / Maximum power capacity.

o Memory-Usage: GPU memory usage.

o GPU-Util: GPU utilization.

o Compute M.: GPU utilization for compute.

o MIG M.: Multi-Instance GPU.

3. Running Processes:

o Lists the running processes that are utilizing the GPU, if any.

o GPU: GPU number the process is running on.

o GI: GPU instance ID.

o CI: Compute instance ID.

o PID: Process ID.

o Type: Process type.

o Process name: Name of the running process.

o GPU Memory Usage: GPU memory usage by the process.

In this particular case, there are no processes currently running that are utilizing the
GPU, as indicated in the "Processes" section. The GPU is in an idle state, with a
temperature of 56°C and a power usage of 10W out of a maximum of 70W.

Enable GPU persistence

For example, to enable GPU persistence on NVIDIA GPUs, you can use the nvidia-smi
command along with the --persistence-mode option.

Persistence mode keeps the GPU driver loaded even when no applications are
running, which can reduce the time needed to start new applications.
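
For instance, a minimal sketch (enabling persistence mode normally requires administrator/root privileges, so it usually does not apply inside Colab):

import subprocess

# Equivalent to running `nvidia-smi -pm 1` (long form: --persistence-mode) in a shell.
subprocess.run(["nvidia-smi", "-pm", "1"], check=False)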

Setting up Our Workspace: /content and /content/datasets

Setting our Home

We save the root directory of our workspace '/content' as 'HOME' since we will be
navigating through the directory to have multiple projects under the same HOME.
Additionally, we will have the datasets in the 'datasets' directory, so all datasets are
easily accessible for any project.
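
A minimal sketch of this step, assuming the Colab default /content as the root directory:

import os

HOME = "/content"           # root of our workspace
os.chdir(HOME)              # navigate to HOME so every project hangs from it
print("HOME set to:", HOME)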

Mount Google Drive

Next, it imports the drive module from the google.colab library, which provides
functionalities for mounting Google Drive in Google Colab.
Additionally, Google Drive is mounted in Google Colab and made available at the
path /content/drive. The user will be prompted to authorize access to Google Drive.
Once authorized, the content of Google Drive will be accessible from that point
onwards in the Colab notebook.
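
The corresponding code uses the standard google.colab API:

from google.colab import drive

# Mount Google Drive; you will be prompted to authorize access.
drive.mount('/content/drive')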

3. Load a Dataset (DataLoader)

Create a directory where we can save our dataset

Create the dataset directory (only if it doesn't exist), where we are going to save the
dataset with which we are going to train our CNN.

Change to new directory datasets

Check whether the file specified by file exists in the current directory. If it doesn't
exist, the code block inside the conditional is executed, which in this case downloads
the file from the specified URL. Then, it extracts the contents of exp0.zip into the
current directory quietly, overwriting any existing files if necessary.
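
A sketch of this step; the URL below is a placeholder, since the actual download location is defined in the notebook:

import os
import urllib.request
import zipfile

file = "exp0.zip"
url = "https://example.com/exp0.zip"   # placeholder URL (assumption)

os.makedirs("/content/datasets", exist_ok=True)
os.chdir("/content/datasets")

# Download the archive only if it is not already present.
if not os.path.exists(file):
    urllib.request.urlretrieve(url, file)

# Extract quietly, overwriting existing files if necessary.
with zipfile.ZipFile(file, "r") as zf:
    zf.extractall(".")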

Display 8 images from a class from test

Now, we will use the matplotlib library to display multiple images in a 2x4 grid layout.
Next code imports necessary modules, including matplotlib.pyplot for plotting, glob for
file matching, and matplotlib.image for image handling.

It specifies the directory containing the images and retrieves the paths of the first
8 .jpg images in that directory using glob.glob().

Then, it creates a figure with subplots arranged in a 2x4 grid and iterates through the
image paths, displaying each image in a subplot using imshow(). The title of each
subplot is set to indicate the image index, and axis labels are turned off.

After displaying all images, it adjusts the layout to prevent overlapping and shows the
figure.

Finally, it prints the size of the last image processed.

A. Using glob
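
A sketch of option A with glob and mpimg; the image directory is an assumption and should be adjusted to the extracted dataset layout:

import glob
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

img_dir = "/content/datasets/exp0/test/class_0"     # hypothetical class folder
paths = sorted(glob.glob(img_dir + "/*.jpg"))[:8]   # first 8 .jpg images

fig, axes = plt.subplots(2, 4, figsize=(12, 6))     # 2x4 grid for 8 images
for i, (ax, path) in enumerate(zip(axes.flat, paths)):
    img = mpimg.imread(path)
    ax.imshow(img)
    ax.set_title(f"Image {i}")
    ax.axis("off")

plt.tight_layout()
plt.show()
print("Last image size:", img.shape)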

B. Or with os library instead of using glob

C. And if I don't want to use mpimg and use e.g. PIL

Setting a Dataloader

The purpose of a DataLoader is fundamental in the context of machine learning and


deep learning, especially when working with large or complex datasets. Its main
purpose is to facilitate the efficient loading and manipulation of data during model
training.

Here are some key purposes of a DataLoader:

1. Efficient data loading: DataLoaders enable loading data in batches, meaning


that instead of loading the entire dataset into memory at once, data is loaded in
small batches. This optimizes memory usage and allows working with datasets
that wouldn't fit entirely in RAM.

2. Data preprocessing: DataLoaders can apply transformations to the data in


real-time while loading, such as normalization, resizing, cropping, rotation,
among others. This simplifies the data preprocessing process and helps ensure
consistency in the application of transformations.

3. Data shuffling and randomization: When training a model, it's common to


shuffle the data to prevent the model from memorizing patterns sequentially.
DataLoaders can randomize the order of the data in each epoch (complete
iteration through the dataset), which helps improve the model's generalization.

4. Parallel data loading: In systems with multiple CPU or GPU cores,


DataLoaders can load data in parallel, making full use of available hardware
resources and reducing wait time during data loading.

5. User-friendly interface: DataLoaders provide a simple and uniform interface


for accessing data during model training. This facilitates implementation and
maintenance of code, as developers can focus on the model logic rather than
data manipulation.

Load Libraries for DataLoader

Next, we will set up a data loader using PyTorch and torchvision for handling
datasets. Here's a summary of what each library does:

 torch: The core PyTorch library.

 torchvision: Provides datasets, models, and transformations for computer vision


tasks.

 DataLoader: From torch.utils.data, loads datasets in batches.

 datasets: From torchvision, accesses standard datasets like MNIST, CIFAR-10,


etc.

 transforms: From torchvision, contains image transformations.

 tqdm: Displays progress bars during iterations.
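
The corresponding imports, as a minimal sketch:

import torch
import torchvision
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from tqdm import tqdm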

Train, Val and Test Sets

In machine learning, it is common to divide the dataset into three main parts: training
set, validation set and test set. Here I explain each of them:

 Train Set: This dataset is used to train the model. That is, the model learns
from this data by adjusting its parameters to minimise the loss function. The
model is iteratively fitted to this data set during training, using optimisation
techniques such as gradient descent. Generally, the training set is the largest,
as an adequate amount of data is required for the model to learn meaningful
patterns.

 Validation Set: After training the model with the training set, the validation set
is used to adjust the hyperparameters of the model and evaluate its
performance. The val set is used to select the best model among several
possible configurations, avoiding overfitting to the test set. This dataset is used
to adjust the model architecture, learning rate or other hyperparameters, in
order to obtain a generalisable model.

 Test Set: This data set is used to check the final performance of the model
after it has been trained and evaluated. The test set is essentially a stand-alone
data set that the model has not seen during training or evaluation. It provides
an objective estimate of the model's performance on unseen data and helps
assess its ability to generalise to new samples.

It is a good practice to normalize both the training set and the val set in the same
way. This ensures that the data are on the same scale and distribution, which can help
the model converge more quickly during training and make more consistent
predictions during evaluation.

It is important to remember that when normalizing the data, you need to calculate the
mean and standard deviation only on the training set and then apply those same
statistics to the val set. This is because the validation set should simulate "new" or
"unknown" data for the model, so it should not be used to calculate any normalization
statistics.

Therefore, after calculating the mean and standard deviation on the training set, you
can normalize both the training set and the val set as follows:

1. Calculate the mean and standard deviation on the training set.

2. Normalize the training set using these statistics.

3. Normalize the val set using the same statistics calculated on the training set.
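
A sketch of these three steps; the folder names and image size are assumptions about how the dataset was extracted:

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# 1. Compute the per-channel mean and std on the training set only.
plain = transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
train_plain = datasets.ImageFolder("/content/datasets/exp0/train", transform=plain)
loader = DataLoader(train_plain, batch_size=64, shuffle=False)

n, mean, sq = 0, 0.0, 0.0
for images, _ in loader:
    b = images.size(0)
    images = images.view(b, 3, -1)
    mean = mean + images.mean(dim=2).sum(dim=0)
    sq = sq + (images ** 2).mean(dim=2).sum(dim=0)
    n += b
mean = mean / n
std = (sq / n - mean ** 2).sqrt()

# 2. and 3. Normalize the train and val sets with the statistics from the train set.
normalize = transforms.Compose([transforms.Resize((32, 32)),
                                transforms.ToTensor(),
                                transforms.Normalize(mean.tolist(), std.tolist())])
train_set = datasets.ImageFolder("/content/datasets/exp0/train", transform=normalize)
val_set = datasets.ImageFolder("/content/datasets/exp0/val", transform=normalize)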

Normalize the dataloaders using Statistics (26s)

 Normalization: Normalization is crucial for ensuring that pixel values across


images are on a similar scale, which helps in stabilizing and speeding up the
training process of deep neural networks. This is achieved using the formula:

normalized_image = (original_image − μ) / σ

Where:

o original_image is the original pixel value (typically between 0 and 255).

o μ is the mean of the training dataset (calculated for each channel, R, G,


B).

o σ is the standard deviation of the training dataset (also per channel).

After normalization, the pixel values will be centered around zero, which facilitates the
training of many models.

 Dataset Preparation: Each dataset (train_data, val_set, test_set) is prepared


with consistent transformations and normalization, facilitating uniformity in data
processing across training, validation, and testing phases.
This setup ensures that the datasets are properly preprocessed and ready to be used
in training and evaluating machine learning models, particularly deep neural
networks, using PyTorch.

The train set is unmodified in size because the transform only transforms the data; it
does not augment the dataset.

Displaying all classes

Display one example from each class

Let us show one example for each class, for fun. As we've transformed the image by
normalizing it, we should undo the transformation before visualizing the image.

To revert the normalization, you need to set a "new mean" and "new standard
deviation" in such a way that the effect of the previous normalization is canceled.

We have that, according to the normalisation formula:

normalized_image = (original_image − μ) / σ

where the original image is therefore:

original_image = normalized_image × σ + μ

If we use transforms.Normalize(new_mean, new_std) on the normalized image we get:

output_image = (normalized_image − new_mean) / new_std

Both expressions must recover the same original_image, therefore (you can easily check it out):

new_std = σN = 1/σ

and

new_mean = μN = −μ/σ

Therefore, our new_mean or inverse mean and new_std or inverse standard deviation
are as follows:

 Inverse Mean: To counteract the effect of subtracting the previous mean, you
now add that original mean divided by the standard deviation, which is achieved
using [-m/s for m, s in zip(mean, std)]. This adjustment reverses the operation of
subtracting the previous mean.

 Inverse Standard Deviation: To counteract the effect of dividing by the


standard deviation, you now multiply by the standard deviation (which is
equivalent to dividing by 1/s), as done by [1/s for s in std].

Therefore, the inv_normalize function does the following:


 Adjusts the values of the image that were normalized using the previous mean
and standard deviation.

 Returns the image to its original state before it was normalized.
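
A sketch of the resulting inverse transform; mean and std are the training statistics computed earlier, and image_normalized is a placeholder name for a normalized tensor:

from torchvision import transforms

# Undo Normalize(mean, std): new_std = 1/std and new_mean = -mean/std.
inv_normalize = transforms.Normalize(
    mean=[-m / s for m, s in zip(mean, std)],
    std=[1 / s for s in std],
)

image_restored = inv_normalize(image_normalized)   # back to the original scale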

Settings Hyperparameters

Hyperparameters are parameters that are not directly learned from the training
process of the model but are set before the training process begins. They are
configurations that control the training process of the model and affect its
performance and behavior.

Here are some key characteristics of hyperparameters:

1. Not directly learned: Unlike model parameters, such as weights in a neural


network, which are adjusted during training to minimize a loss
function, hyperparameters are not directly adjusted during training. Instead,
they are set before training begins and remain constant throughout the training
process.

2. Control the model's behavior: Hyperparameters influence how the model


learns during training. They can affect aspects such as the convergence speed
of the model, its ability to generalize to unseen data (generalization), its ability
to avoid overfitting, and other aspects of model performance.

3. Examples of hyperparameters: Some common examples of hyperparameters


include the learning rate in optimization algorithms, the number of layers and
neurons in a neural network, the batch size in batch training, regularization
parameters like L1 or L2 penalty, and many other settings that may vary from
one model to another.

4. Hyperparameter tuning: In practice, hyperparameter tuning is an important


step in the development of machine learning and deep learning models. It
involves searching for optimal combinations of hyperparameter values to
maximize the model's performance on a given dataset. This can be done using
methods (outside the scope of this course) such as grid search, random search,
Bayesian optimisation and others.

In summary, hyperparameters are configurations that control the behavior of the


model during training and are set before the training process. They are critical for
optimizing the performance and generalization of the model across different datasets
and problems.

We are going to define some training parameters for the network, such as the batch
size, the number of epochs, and the number of classes in the dataset, because they
are needed by the dataloaders and to set up our training loop.
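
A sketch of these settings; the concrete values are assumptions rather than the notebook's exact ones:

batch_size = 32        # images per batch (assumed value)
epochs = 30            # number of training epochs for the quick run
num_classes = 4        # number of classes in the dataset (assumed value)
learning_rate = 0.001  # optimizer step size (assumed value)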
4. Define a Convolutional Neural Network

CNNs have revolutionized the field of computer vision and have been widely adopted
for tasks such as image classification, object detection, image segmentation, and
more. Their ability to automatically learn hierarchical representations from raw data
makes them highly effective for a wide range of visual recognition tasks.

The importance of the volumes generated by a Convolutional Neural Network (CNN)


increases as the network deepens due to several key factors:

1. Hierarchical Feature Representation: In a CNN, the initial layers are


specialized in detecting simple features such as edges and textures, while
deeper layers learn to combine these simple features to form more complex
and abstract representations of objects in the images. These hierarchical
representations are crucial for the network's ability to understand and recognize
patterns in the input data.

2. Dimensionality Reduction: As images pass through the convolutional and


pooling layers, the dimensions of the feature volumes (or feature maps) may
decrease. This is especially true if a large filter size or aggressive pooling layers
are used. Dimensionality reduction helps to concentrate relevant information
and simplifies processing in later layers of the network.

3. Extraction of Relevant Features: Deeper layers of the CNN typically have


smaller but denser feature volumes. These volumes contain more abstract and
information-rich representations that are crucial for the recognition task the
network is performing. A mistake in the network design that causes premature
loss of important features could significantly compromise the model's
performance.

It's important to be careful when designing the architecture of a CNN to avoid errors
that may lead to premature loss of information in the data volumes. Some common
errors that can result in data volume loss include:
1. Using Overly Aggressive Pooling Layers: Employing pooling layers with too
large a window size or too large a stride can reduce the size of the feature
volume too quickly, resulting in the loss of relevant information in the process.

2. Inadequate Dimensions of Convolutional Layers: The dimensions of the


convolutional filters and the padding used can influence the dimensions of the
resulting feature volume. Poorly adjusting these dimensions can result in
excessive information loss.

3. Inappropriate Depth of the Network: A network that is either too shallow or


too deep can compromise performance. A network that is too shallow may not
capture enough relevant features, while a network that is too deep may suffer
from overfitting and a higher likelihood of premature loss of information.

Import Libraries

Import the necessary modules from the PyTorch library for defining neural network
architectures. Here's what each part means:

import torch.nn as nn: This line imports the torch.nn module, which contains pre-
defined neural network layers, loss functions, and other utility functions for building
neural networks. By importing nn, we gain access to classes such
as Conv2d, Linear, MaxPool2d, etc., which are used to define the layers of a neural
network.

import torch.nn.functional as F: This line imports the torch.nn.functional module,


which contains functional versions of neural network operations. Functions
like F.relu, F.softmax, F.sigmoid, etc., are commonly used inside the forward method of
a neural network to apply non-linearities and other operations to the input data.

Define the model

The output volume of a convolutional layer is determined by several factors, including


the kernel size, padding, stride, and the dimensions of the input tensor.

For the nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1) layer, let's assume
the input has a size of (batch_size, 32, H, W), where batch_size is the batch size, 32 is
the number of input channels (feature maps), H is the height of the image, and W is
the width of the image.

The output size is calculated using the following formula:

output_size = (input_size − kernel_size + 2 × padding) / stride + 1

Substituting the values for this layer:

 Input: (32, H, W) (32 input channels)

 padding: 1 (padding=1)

 kernel_size: 3 (kernel_size=3)

 stride: 1 (stride=1)

Then, assuming an input of H = W = 32, the calculation would be:

output_size = (32 − 3 + 2 × 1) / 1 + 1 = 32

Therefore, the output will have a size of (batch_size, 64, 32, 32), meaning that for
each sample in the batch, there will be 64 feature maps of size 32x32.
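
A minimal sketch of a CNN in this style; only the nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1) layer is taken from the text, so the remaining layer sizes are assumptions and the real notebook model may differ:

import torch.nn as nn

class myCNN(nn.Module):
    def __init__(self, num_classes=4):               # num_classes is an assumption
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),   # 3 -> 32 channels
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),  # layer discussed above
            nn.ReLU(),
            nn.MaxPool2d(2),                          # 32x32 -> 16x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))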

Functional definition

Here's the equivalent functional definition of the provided CNN.

In this functional definition, we use PyTorch's functional API (torch.nn.functional) to


define each layer's operations.

The code structure follows the same sequence of operations as in the original CNN,
but without the need for the nn.Sequential container.

Each layer's output is passed directly to the next layer's operation, just like in the
object-oriented implementation. Finally, the output of the last layer is returned as the
output of the myCNN function.
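
A sketch of the same idea with the functional API; the layers mirror the class above, so they remain assumptions about the real architecture. Convolutions and the linear layer stay as modules (they hold weights), while activations, pooling and flattening are applied functionally:

import torch
import torch.nn as nn
import torch.nn.functional as F

class myCNNFunctional(nn.Module):
    def __init__(self, num_classes=4):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64 * 16 * 16, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)         # functional pooling instead of nn.MaxPool2d
        x = torch.flatten(x, 1)
        return self.fc(x)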

Improved Model

In this modified version:

 Added Batch Normalization layers (nn.BatchNorm2d for convolutional layers


and nn.BatchNorm1d for fully connected layers) after each activation function to
stabilize and accelerate training.

 Dropout layers (nn.Dropout2d for convolutional layers and nn.Dropout for fully
connected layers) were introduced after each activation function to prevent
overfitting.

 Adjusted the input size of the first fully connected layer (nn.Linear) based on the
output size of the previous layer's Flatten operation.

 The dropout rate for fully connected layers was set to 0.5, which is a common
value for dropout rates in practice.

Setting Up the Computing Device

Setting CUDA environment

This code sets up the device (CPU or GPU) for running the neural network and creates
an instance of the myCNN model, moves it to the selected device, and optionally
utilizes data parallelism if multiple GPUs are available.

NOTE: Outside of Google Colab, it is necessary to explicitly specify the device (GPU or
CPU) and manage model parallelization if using multiple GPUs. This is crucial to
ensure the model runs on the intended GPU and leverages the available hardware
effectively. While Google Colab manages GPU allocation automatically, specifying the
device ("cuda") and using nn.DataParallel can still be beneficial for explicit control and
utilization of available resources, especially if you have specific requirements or want
to ensure optimal performance. However, for many basic use cases, simply letting
Colab manage the resources (with automatic GPU selection) will suffice.

In summary:

 Outside of Google Colab: It is necessary to manually specify and manage the


device (GPU or CPU) and model parallelization if utilizing multiple GPUs.

 In Google Colab: While not strictly necessary due to automatic resource


management, specifying the device and managing parallelization can still be
beneficial to ensure the model runs where expected and to optimize
performance in the case of multiple GPUs.

Understanding how to handle and optimize GPU resource usage is essential for
efficient performance of deep learning models in both scenarios.

Here's a detailed explanation of the code:

1. Device Selection: The code checks if CUDA (NVIDIA's parallel computing


platform) is available on the system. If CUDA is available, the device is set to
"cuda"; otherwise, it's set to "cpu". This allows the code to run on the GPU if
available, which typically speeds up the training of neural network models.

2. Model Creation: An instance of the myCNN model is created (model =
myCNN().to(device)) and moved to the selected device using the .to(device)
method. This ensures that all operations of the model are performed on the
appropriate device (CPU or GPU).

3. Data Parallelism (Optional): If there's more than one GPU available


(torch.cuda.device_count() > 1), a message indicating the number of available
GPUs is printed, and nn.DataParallel is used to parallelize calculations across
multiple GPUs. This automatically divides the data and operations in the
network among the available GPUs, which can speed up model training in multi-
GPU environments.

4. Model Printing: Finally, the model is printed, providing a detailed description


of its architecture, including parameters and layers. This facilitates model
verification and debugging.
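
A sketch of this setup:

import torch
import torch.nn as nn

# 1. Device selection: use the GPU if CUDA is available, otherwise the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# 2. Model creation: instantiate the network and move it to the chosen device.
model = myCNN().to(device)

# 3. Optional data parallelism across several GPUs.
if torch.cuda.device_count() > 1:
    print(f"Using {torch.cuda.device_count()} GPUs")
    model = nn.DataParallel(model)

# 4. Print the model to inspect its architecture.
print(model)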

If you haven't done so already, let's change the execution environment to T4.

Display the summary of our model

torchsummary is a Python library that provides a summary of a PyTorch model's


architecture, including details such as the number of parameters, the output shape of
each layer, and the total memory consumption.

Here's what torchsummary does:

1. Model Summary: It generates a concise summary of the PyTorch


model, displaying information about each layer, such as the layer type, input
shape, output shape, number of parameters, and memory usage.
2. Layer Details: For each layer in the model, torchsummary provides detailed
information about its configuration, including kernel size, stride, padding, and
activation function.

3. Total Parameters: It calculates and displays the total number of trainable


parameters in the model, which is useful for understanding the model's
complexity and memory requirements.

4. Total Memory Consumption: torchsummary estimates the total memory


consumption of the model, which can be helpful for optimizing memory usage,
especially when working with large models or limited hardware resources.
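
For example; the input size (3, 32, 32) is an assumption about the image dimensions:

# !pip install torchsummary   (if it is not already installed)
from torchsummary import summary

summary(model, input_size=(3, 32, 32))   # channels, height, width (assumed)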

Obtain a graphical view of our model

torchviz is a Python library used for visualizing PyTorch computational graphs. It


provides a convenient way to visualize the flow of data through a neural network
model, making it easier to understand and debug complex architectures.

Here's what torchviz does:

1. Graph Visualization: torchviz generates visual representations of PyTorch


computational graphs, showing the connections between different layers and
operations in the model.

2. Layer Relationships: It visually depicts the relationships between different


layers in the model, including how the input data flows through the layers and
how the output is generated.

3. Debugging: Visualizing the computational graph with torchviz can help in


debugging neural network architectures by allowing developers to inspect the
structure of the model and identify any potential issues or errors.

4. Model Understanding: torchviz aids in understanding the inner workings of a


PyTorch model, providing insights into how data is transformed as it passes
through the layers of the network.

Overall, torchviz is a valuable tool for visualizing and understanding PyTorch models,
especially for developers and researchers working on deep learning projects.
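
For example, a sketch with an assumed input size:

# !pip install torchviz   (if it is not already installed)
import torch
from torchviz import make_dot

x = torch.randn(1, 3, 32, 32).to(device)             # dummy input (assumed size)
y = model(x)
make_dot(y, params=dict(model.named_parameters()))   # renders the computation graph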

Define a Loss function and optimizer

To train a model, we need a loss function and an optimizer. Let's use a Classification
Cross-Entropy loss and SGD with momentum.
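
A sketch of that choice; the learning rate and momentum values are assumptions:

import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()                                    # classification cross-entropy
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)    # SGD with momentum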

A loss function, also known as a cost function or objective function, measures the
discrepancy between the predicted output of a model and the actual target values in
the training dataset. It quantifies how well the model is performing and provides
feedback to the optimization algorithm during training.

An optimizer, on the other hand, is responsible for updating the parameters of the
model (e.g., weights and biases) based on the gradients of the loss function with
respect to those parameters. The goal of the optimizer is to minimize the loss
function, thereby improving the model's performance on the task at hand.
In PyTorch, selecting the appropriate loss function and optimizer are crucial aspects
for effectively training a Deep Learning model. These tools enable the evaluation of
model performance and adjusting its parameters to minimize error and enhance
accuracy.

Common Loss Functions in PyTorch:

1. nn.MSELoss (Mean Squared Error Loss): Primarily used in regression tasks,


where the goal is to predict continuous values. It measures the average squared
difference between the predicted and actual values.

criterion = nn.MSELoss()

2. nn.BCELoss (Binary Cross-Entropy Loss): Employed in binary classification


tasks, where the model classifies each input into two categories. It calculates
the entropy of the probability distribution predicted by the model compared to
the true distribution.

criterion = nn.BCELoss()

3. nn.CrossEntropyLoss (Categorical Cross-Entropy Loss): A generalization


of nn.BCELoss for multi-class classification problems. It measures the categorical
cross-entropy between the predicted and true probability distributions.

criterion = nn.CrossEntropyLoss()

4. nn.L1Loss (Mean Absolute Error Loss): Measures the average absolute


difference between the predicted and actual values. It is less sensitive to
outliers than nn.MSELoss.

criterion = nn.L1Loss()

Common Optimizers in PyTorch:

1. SGD (Stochastic Gradient Descent): A simple and efficient algorithm that


updates the model's parameters in the direction of the negative gradient of the
loss function.

optimizer = optim.SGD(model.parameters(), lr=0.01)

2. SGD with Momentum: Similar to SGD but incorporates a momentum term to


preserve the direction of the previous movement of parameter updates, which
can accelerate training.

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

3. Adam (Adaptive Moment Estimation): An adaptive optimizer that


automatically adjusts the learning rate for each parameter based on its
historical gradients. Known for its good convergence and generalization.

optimizer = optim.Adam(model.parameters(), lr=0.001)

4. RMSprop (Root Mean Square Propagation): Similar to Adam, adapts the


learning rate individually for each parameter, but uses the root mean square of
past gradients instead of momentum. Can be effective for problems with noisy
gradients.
optimizer = optim.RMSprop(model.parameters(), lr=0.001)

5. Train the network

Number of images and batches

In a training loop, a training dataset is used. This dataset consists of input examples
(for example, images in an image classification problem) and their corresponding
labels or desired outputs (for example, class labels associated with each image). The
purpose of the training loop is to iterate over this dataset to update the model's
weights during training.

The training dataset is used to adjust the model's parameters, i.e., the weights of the
connections between the neurons of the neural network. During each iteration of the
training loop, the model calculates predictions for a batch of input examples,
compares those predictions with the true labels using a loss function, and then adjusts
the model's weights to minimize this loss function.

The metrics used to determine good learning depend on the specific problem you are
addressing. Some common metrics include:

1. Accuracy: It is the proportion of examples correctly classified by the model


with respect to the total number of examples.

2. Loss: It is a measure of how well the model is doing in its predictions. Loss
functions typically assign a numerical value to the difference between the
model's predictions and the true labels. The goal of training is to minimize this
loss.

3. Class-wise Accuracy: In classification problems with multiple classes, it can be
useful to examine the model's accuracy for each class individually.

4. Training Time: The amount of time needed to train the model can be an
important metric, especially in real-time applications.

These metrics are used to evaluate the model's performance during training and
validation, and to make decisions about the model's architecture, hyperparameters,
and other aspects of the training process.

Defining Training Loop

The training loop is composed of two procedures, train and val, which are executed at
each epoch.

Defining train

The train function is responsible for training the neural network model. It takes the
training dataloader, the model, the loss function, and the optimizer as inputs. Within
the function, it iterates over the data batches, performs the forward pass to obtain the
model predictions, calculates the loss between the predictions and the ground truth
labels, performs backpropagation to compute the gradients of the loss with respect to
the model parameters, and finally updates the model parameters using the optimizer.
During training, it also tracks the loss and accuracy of the model at each iteration and
prints them for monitoring purposes.
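
A minimal sketch of such a train function; the exact logging in the notebook may differ, and device, from the setup above, is assumed to be in scope:

from tqdm import tqdm

def train(dataloader, model, loss_fn, optimizer):
    model.train()
    total_loss, correct, total = 0.0, 0, 0
    for images, labels in tqdm(dataloader):
        images, labels = images.to(device), labels.to(device)

        outputs = model(images)              # forward pass
        loss = loss_fn(outputs, labels)      # loss against ground-truth labels

        optimizer.zero_grad()                # backpropagation and update
        loss.backward()
        optimizer.step()

        total_loss += loss.item() * labels.size(0)
        correct += (outputs.argmax(1) == labels).sum().item()
        total += labels.size(0)

    avg_loss, accuracy = total_loss / total, correct / total
    print(f"train loss: {avg_loss:.4f}  accuracy: {accuracy:.4f}")
    return avg_loss, accuracy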
Defining val

The val function evaluates the model's performance on a test dataset. It computes the
loss and accuracy over the entire test dataset using batches and prints the accuracy
and val loss. Finally, it returns the val loss and accuracy.
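
A matching sketch of the val function:

import torch

def val(dataloader, model, loss_fn):
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():                    # no gradients during evaluation
        for images, labels in dataloader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            total_loss += loss_fn(outputs, labels).item() * labels.size(0)
            correct += (outputs.argmax(1) == labels).sum().item()
            total += labels.size(0)

    avg_loss, accuracy = total_loss / total, correct / total
    print(f"val loss: {avg_loss:.4f}  accuracy: {accuracy:.4f}")
    return avg_loss, accuracy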

Defining training loop (30 epochs (6 mins))

This code trains the neural network model for a specified number of epochs (epochs).
It iterates over each epoch, calling the train function to train the model on the training
data and the val function to evaluate the model on the val data. It then saves the
training and val loss, as well as the training and val accuracy, for each epoch. Finally,
it saves the trained model's state dictionary to a file named "myCNN.pth" and saves
the metrics (loss and accuracy) to a CSV file named "metrics_myCNN.csv".
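
A sketch of this outer loop; the dataloader names (train_loader, val_loader) are assumptions:

import csv
import torch

history = []
for epoch in range(epochs):
    print(f"Epoch {epoch + 1}/{epochs}")
    train_loss, train_acc = train(train_loader, model, criterion, optimizer)
    val_loss, val_acc = val(val_loader, model, criterion)
    history.append([epoch + 1, train_loss, train_acc, val_loss, val_acc])

# Save the trained weights and the per-epoch metrics.
torch.save(model.state_dict(), "myCNN.pth")
with open("metrics_myCNN.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["epoch", "train_loss", "train_acc", "val_loss", "val_acc"])
    writer.writerows(history)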

If your training process is oscillating and not improving accuracy, there could be
several reasons for this issue. Here are some common ones and possible solutions:

1. Learning rate too high or too low: If the learning rate is too high, the
optimization process may overshoot the optimal solution, causing oscillations.
Conversely, if the learning rate is too low, the optimization process may get
stuck in local minima. Try adjusting the learning rate. You can try reducing it
gradually during training (learning rate scheduling) or using adaptive learning
rate algorithms like Adam.

2. Poor initialization: The initial weights of the neural network could be poorly
chosen, leading to oscillations. Try initializing the weights using different
strategies, such as Xavier or He initialization. Here are some links that explain
the importance of proper initialization in neural networks and how different
initialization strategies, like Xavier (Glorot) and He initialization, can help
mitigate issues like oscillations:

3. Overfitting: If your model is too complex relative to the amount of training


data, it may overfit, causing oscillations in performance. Regularization
techniques like dropout or weight decay can help combat overfitting.

4. Insufficient data augmentation: If you're working with a small dataset,


augmenting the data (e.g., by rotating, flipping, or scaling the images) can help
the model generalize better and reduce oscillations.

5. Model architecture: The architecture of your neural network may not be


suitable for the problem at hand. Experiment with different architectures, layer
sizes, and activation functions to see if you can achieve better performance.

6. Hyperparameter tuning: Other hyperparameters such as batch size, number


of layers, and optimizer choice could also affect the training process. Try
experimenting with different values for these hyperparameters to find the best
combination.

7. Early stopping: Implement early stopping based on validation performance to


prevent overfitting and avoid wasting computational resources on training
epochs that do not improve validation performance.
8. Data imbalance: If the classes in your dataset are imbalanced, the model may
focus too much on the majority class and perform poorly on the minority class.
Consider using techniques such as class weights or
oversampling/undersampling to address this issue.

By systematically diagnosing and addressing these potential issues, you should be


able to stabilize the training process and improve the accuracy of your model.

Displaying Results

Downloading/uploading files to/from your local file system (interactive way)

The files.download method will prompt the browser to download the file to your local
computer.

Download the file to your local machine

When you run files.download("myCNN.pth"), a dialog box will appear in your browser,
allowing you to download the file to your local machine. If you need to save more files
or perform additional save operations, you can use the same method, adjusting the
file name as needed.
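
For example:

from google.colab import files

files.download("myCNN.pth")             # prompts the browser to download the file
files.download("metrics_myCNN.csv")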

From here onwards, this could be a different notebook (NB) in which we upload a
model previously trained and saved to local disk.

Upload files from local (time 8 min)

You can download these files to your local disk from here in raw format. You will need
to load them next.

 metrics_myCNN(100).csv

 myCNN(100).png

 myCNN(100).pth

We've loaded a pre-trained model at 100 epochs (aorus/project/bioimages/myCNN.*).
The *.pth file is the one containing the weights of the model, and it is fairly heavy, so
it may take some time to load (4 or 5 minutes).
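
A sketch of loading the uploaded weights; map_location makes the load work even on a CPU-only session:

import torch

state_dict = torch.load("myCNN(100).pth", map_location=device)
model.load_state_dict(state_dict)
model.eval()    # switch to evaluation mode before validating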

6. Validating our model

Once you've trained a model with a training dataset (train set) and evaluated it with a
validation dataset (val set), it can be useful to perform an additional check using a test
dataset (test set). The validation set is what is used to fine-tune the model's
hyperparameters and prevent overfitting, while the test set provides a final, unbiased
check on unseen data.

Here's a brief explanation of what validation with a validation dataset means and how
it's done:

1. Meaning of validation with a validation dataset:

o Validation with a validation dataset involves splitting your available data


into three sets: training (train set), validation (val set), and testing (test
set).

o After training the model with the training set, its performance is evaluated
using the validation set.
o The idea is to adjust the model's hyperparameters (such as learning rate,
batch size, neural network depth, etc.) based on its performance on the
validation set. This helps prevent overfitting to the training set and
improves the model's ability to generalize to unseen data.

2. How validation with a validation dataset is done:

o After splitting your data into training and testing sets, an additional
portion of the data is reserved for the validation set.

o The size of this validation set can vary depending on the total size of your
data and your specific needs, but typically around 10% to 20% of the data
is reserved.

o You then train the model using the training set and evaluate it using the
validation set.

o You can adjust the model's hyperparameters, as mentioned earlier, and


repeat the training and validation process until you're satisfied with the
model's performance on the validation set.

o Finally, after tuning and validating your model, you evaluate its final
performance using the test set to obtain an unbiased estimate of its
performance on unseen data.

Validation with a validation dataset is a common practice in machine learning model


development and is crucial for building models that generalize well to new, unseen
data.

Confusion matrix
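
A sketch of computing and displaying it with scikit-learn; val_labels and val_predictions stand for the collected ground-truth labels and model predictions:

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

cm = confusion_matrix(val_labels, val_predictions)
ConfusionMatrixDisplay(cm).plot()
plt.show()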

In addition to the confusion matrix, several statistics can be useful for evaluating the
performance of a classification model. Here are some options you might consider:

1. Accuracy: The proportion of correct predictions out of the total predictions.

from sklearn.metrics import accuracy_score


accuracy = accuracy_score(val_labels, val_predictions)
print("Accuracy:", accuracy)

2. Precision: The proportion of true positives out of the total positive predictions.

from sklearn.metrics import precision_score


precision = precision_score(val_labels, val_predictions)
print("Precision:", precision)

3. Recall (Sensitivity or True Positive Rate): The proportion of true positives


out of the total actual positives.

from sklearn.metrics import recall_score


recall = recall_score(val_labels, val_predictions)
print("Recall:", recall)

4. F1 Score: The harmonic mean of precision and recall, providing a balance


between the two metrics.
from sklearn.metrics import f1_score
f1 = f1_score(val_labels, val_predictions)
print("F1 Score:", f1)

These statistics provide different perspectives on the model's performance and can be
useful in various contexts. You can choose the ones that best fit your evaluation
needs.

Display the Classification Report

A classification report is a detailed metric report used to evaluate the performance of


a classification algorithm. It provides several key performance indicators, which help
in understanding how well the model is performing on a given set of data. The report
typically includes the following metrics for each class:

1. Precision: The ratio of true positive predictions to the total predicted positives
(i.e., the number of correctly predicted positive samples divided by the total
number of samples predicted as positive). It measures the accuracy of the
positive predictions.

2. Recall (Sensitivity or True Positive Rate): The ratio of true positive


predictions to the total actual positives (i.e., the number of correctly predicted
positive samples divided by the total number of actual positive samples). It
measures the ability of the model to identify all positive samples.

3. F1-Score: The harmonic mean of precision and recall. It provides a balance


between precision and recall, especially useful when you need to balance both
false positives and false negatives.

4. Support: The number of actual occurrences of the class in the dataset (i.e., the
number of true instances for each label).

Additionally, the classification report typically includes averages for these metrics:

 Accuracy: The ratio of the number of correct predictions to the total number of
predictions.

 Macro Average: The arithmetic mean of the precision, recall, and F1-score
calculated for each class, treating all classes equally.

 Weighted Average: The weighted mean of the precision, recall, and F1-score,
taking into account the support of each class, which gives more importance to
the classes with more samples.
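
For example:

from sklearn.metrics import classification_report

print(classification_report(val_labels, val_predictions))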
Display a ROC curve

To plot a ROC (Receiver Operating Characteristic) curve in Python using scikit-learn,


you can follow these steps:

 Classifier Training: We train a RandomForestClassifier on a synthetic dataset


generated using make_classification.

 Predictions: We obtain probability predictions (probs) for the positive class


(preds).

 ROC Curve Calculation: Using roc_curve from sklearn.metrics, we compute


the false positive rate (fpr) and true positive rate (tpr) at various thresholds.

 AUC Calculation: auc computes the Area Under the Curve (AUC) from the ROC
curve.

 Plotting: We use Matplotlib to plot the ROC curve. The diagonal dashed line
represents the ROC curve of a random classifier.

 Legend: The legend displays the AUC score on the plot.

This example illustrates how to generate and plot a ROC curve for binary classification
using scikit-learn. Adjustments can be made for multi-class classification by
computing ROC curves for each class separately or by using techniques like one-vs-
rest.
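
A sketch of these steps put together, on the synthetic binary dataset described above:

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split

# Synthetic binary-classification dataset and a RandomForest classifier.
X, y = make_classification(n_samples=1000, n_classes=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Probability predictions for the positive class.
probs = clf.predict_proba(X_test)
preds = probs[:, 1]

# ROC curve and AUC.
fpr, tpr, thresholds = roc_curve(y_test, preds)
roc_auc = auc(fpr, tpr)

plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}")
plt.plot([0, 1], [0, 1], linestyle="--")   # random classifier
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()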
