Feed Forward Neural Network Assignment
Contents
Exercise
1. Data Preparation
2. Network Design
3. Network Training
3.1 Training Function
3.2 Early Stopping
4. Network Testing
5. Conclusion
Appendix
Exercise:
This exercise deals with the approximation of functions by neural networks. So-called function approximation (regression) is the task of finding a mapping f̂ satisfying ‖f̂(x) − f(x)‖ < ε, where ε is the tolerance and ‖·‖ can be any error measure. In general, a single layer of nonlinear neurons is enough for a neural network to approximate a nonlinear function. The goal of this exercise is therefore to build a feed-forward neural network that approximates the following function (as implemented in generate_data.m in the appendix):

f(x, y) = cos(x + 6a·y) + 2a·x·y,

where a is a constant parameter.
1. Data Preparation
Three types of data sets are prepared, namely a training set, a validation set, and a test set. The training set is a set of input-target value pairs that carries the information the network is trained on. The validation set is associated with the early stopping technique: during the training phase, the validation error is monitored to prevent the network from overfitting the training data. Typically, a test set is used to assess the network's performance after training; in this problem, however, the root mean square error (RMSE) on the test set is used as the stopping criterion of network training.
In the current problem, the training and test data are taken from regular grids (10×10 pairs of training data values, 9×9 pairs of test data), as shown in Fig. 1. The output range of the target function is already within the interval [−1, 1], so it is not necessary to scale the target function. The validation data are sampled randomly from the function surface.
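A minimal sketch of how these sets could be generated (the value of a, the grid bounds, and the validation-set size n_val are assumptions; the original generate_data.m is reproduced only partially in the appendix):

a = 0.35;                                 % assumed value; not given in the source
f = @(X,Y) cos(X + 6*a*Y) + 2.0*a*X.*Y;   % target function, as in generate_data.m

[Xtr,Ytr] = meshgrid(linspace(-1,1,10));  % 10x10 training grid
train_input  = [Xtr(:)'; Ytr(:)'];
train_target = f(Xtr(:)', Ytr(:)');

[Xte,Yte] = meshgrid(linspace(-1,1,9));   % 9x9 test grid
test_input  = [Xte(:)'; Yte(:)'];
test_target = f(Xte(:)', Yte(:)');

n_val = 30;                               % assumed validation-set size
val_input  = 2*rand(2,n_val) - 1;         % random points on the same square
val_target = f(val_input(1,:), val_input(2,:));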
2. Network Design
In the current problem, a two-layer feed-forward neural network is used: one hidden layer of nonlinear neurons and one output layer of linear neurons. As defined above, the target function has two inputs (x, y) and one output. As shown in Fig. 2, the network accordingly has two inputs, one hidden layer of 8 neurons with the tansig (tan-sigmoid) transfer function, and one output layer with the purelin (linear) transfer function. The tansig function is used in the hidden layer because it is a smooth, bounded nonlinearity that squashes each neuron's net input into the interval (−1, 1). The purelin function simply passes the output layer's net input through unchanged, so the network output is not restricted to a bounded range.
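The behavior of the two transfer functions can be checked directly (tansig and purelin are the standard toolbox functions):

n = linspace(-3,3,7); % sample net inputs
tansig(n)             % squashed into (-1,1)
purelin(n)            % identical to the net input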
3. Network Training
In general, a network can be trained in one of two styles: batch training or incremental training. In batch training, the weights and biases of the network are updated only after all of the inputs have been presented to the network, while in incremental (on-line) training the network parameters are updated each time an input is presented to it.
In this problem batch training has been applied, because it is faster and more reliable.
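The difference can be sketched as follows (assuming a network created as in Section 2; this sketch is illustrative and not part of the original script):

% batch training: one weight/bias update per pass over the whole data set
net = train(net, train_input, train_target);

% incremental (on-line) training: one update per presented sample;
% the data are converted to sequences so adapt steps through them
[net, outputs] = adapt(net, con2seq(train_input), con2seq(train_target));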
3.1 Training Function
Several MATLAB training functions have been considered for this problem:
• Trainbfg: a quasi-Newton algorithm based on the BFGS update, which approximates Newton's method. Because it stores an approximation of the Hessian, its memory cost grows quickly with network size, so for a large network this algorithm is not a good choice; for a small network, however, trainbfg is still an efficient function.
• Traingd: the basic gradient descent (GD) algorithm, which updates weight and bias values along the negative gradient of the performance function. Its main benefit is that it moves directly down the steepest-descent path. Its major disadvantages are its slow convergence, its tendency to get stuck in local minima and at saddle points, and the fact that every update requires a pass through all of the observations.
• Traingdm: gradient descent with momentum. It can train any network whose weight, net-input, and transfer functions have derivative functions, and it is faster than traingd because the momentum term smooths and accelerates the updates.
Applying these training functions to the network gives the results in Table 1. The maximum number of epochs was 5000 (the default) and the learning rate was set to 0.02. As shown in Table 1, trainbfg, trainlm, and traingda achieve the performance goal, while traingd and traingdm fail. Traingda needs less time than trainbfg, but trainlm needs almost no time at all and reaches the optimal output value within 5 epochs. Trainlm also produces the highest correlation between outputs and targets. Thus, trainlm is the best option for this problem (see Fig. 3).
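Table 1 could have been produced with a loop of the following shape (a sketch; the timing and printing are illustrative, not from the source):

fcns = {'trainbfg','traingd','traingdm','traingda','trainlm'};
for k = 1:numel(fcns)
    net = create_network;                 % fresh network for each candidate
    net.trainFcn = fcns{k};
    net.trainParam.epochs = 5000;
    net.trainParam.goal   = 0.02;
    net.trainParam.show   = NaN;          % suppress progress output
    if any(strcmp(fcns{k}, {'traingd','traingdm','traingda'}))
        net.trainParam.lr = 0.02;         % learning rate (GD variants only)
    end
    tic;
    net = train(net, train_input, train_target);
    t = toc;
    out = sim(net, test_input);
    fprintf('%-9s time %6.2f s, test RMSE %.4f\n', fcns{k}, t, sqrt(mse(test_target - out)));
end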
3.2 Early Stopping
To examine early stopping for this training task, the randomly generated validation set is used during trainlm training (maximum validation failures = 10, Erms = 0.02 for the test set). The early stopping mechanism is never triggered during the training. The results suggest that 8 hidden neurons are the best choice for this problem.
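The early-stopping configuration mirrors train_network.m in the appendix (old-style train call, in which the validation and test sets are passed as structures):

net.trainFcn = 'trainlm';
net.trainParam.max_fail = 10;                % maximum validation failures
val.P  = val_input;   val.T  = val_target;   % validation set
test.P = test_input;  test.T = test_target;  % test set
net = train(net, train_input, train_target, [], [], val, test);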
4. Network Testing
After the training phase, the network is tested by comparing the network output with the test target data. Fig. 5 shows this comparison: the blue line shows the network output, while the red line indicates the test target. From the graph, the correlation between the network output and the target is estimated to be 0.99925.
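The test-phase quantities can be computed as follows (a sketch; postreg is the classic regression-analysis helper of the toolbox, and r corresponds to the reported correlation):

network_output = sim(net, test_input);            % network response on the test grid
erms = sqrt(mse(test_target - network_output));   % test-set RMSE
[m, b, r] = postreg(network_output, test_target); % linear regression of outputs on targets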
5. Conclusion
A two-layer network with two inputs, eight tansig hidden units, and one purelin output unit is built for the approximation problem described above. The network is trained with trainlm. The early stopping mechanism is not triggered during the training. The maximum number of epochs and the learning rate are set to 5000 and 0.02, respectively. Fig. 6 shows the parametric surfaces of the original and the approximated functions.
Appendix
MATLAB Code
main.m
fprintf ('\t-------------------------------------\n');
fprintf ('\t- Problem 1: Function Approximation -\n');
fprintf ('\t-------------------------------------\n\n');
% >>>>> STEP 1: Generate training and test data <<<<<
fprintf ('Step 1: Generate training and test data...\n');
fprintf ('===========================================\n');
[train_input,train_target,test_input,test_target,val_input,val_target] = generate_data;
fprintf ('Data generation is finished ! \n\n');
% >>>>> STEP 2: Create a two layer feedforward network <<<<<
fprintf ('Step 2: Create a two layer feedforward network...\n');
fprintf ('=================================================\n');
net = create_network;
fprintf ('Network creation is finished ! \n\n');
% >>>>> STEP 3: Train the network for Erms=0.02 for the test set <<<<<
fprintf ('Step 3: Train the network...\n');
fprintf ('============================\n');
[error,network_output]=train_network( net,train_input,train_target,test_input,test_target,val_input,val_target);
fprintf ('Network training is finished ! \n\n');
% >>>>> FINAL step: Plot the result... <<<<<
fprintf ('FINAL step: Plot the result...\n');
fprintf ('==============================\n');
plot_result(net,test_input,test_target,network_output,error);
fprintf ('Hope the training result is good : )');
generate_data.m
% target surface: z = cos(x + 6a*y) + 2a*x*y, with constant parameter a
Z = cos(X + 6*a*Y) + 2.0*a*X.*Y;
% plot the target surface
figure,
subplot(1,2,1);
surfc(X,Y,Z);
create_network.m
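The body of create_network is reconstructed below as a minimal sketch, assuming the classic newff interface and the architecture of Section 2:

% reconstructed sketch (not the original code): two inputs on [-1,1],
% 8 tansig hidden neurons, 1 purelin output neuron
function net = create_network
net = newff([-1 1; -1 1], [8 1], {'tansig','purelin'});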
train_network.m
function [error,network_output] = train_network(net,train_input,train_target,test_input,test_target,val_input,val_target)
val.P = val_input;
val.T = val_target;
test.P = test_input;
test.T = test_target;
% ask the user for the training parameters
epoch = round( input('Maximum number of epochs to train [5000]: ')); % maximum number of epochs to train
Lr = input('Learning rate [.02]: '); % learning rate
trainFcn= input('Training function [trainlm]-> ','s'); % training function (Automated Regularization (trainbr))
net.trainFcn = trainFcn;
net.trainParam.lr = Lr;
net.trainParam.epochs = epoch;
net.trainParam.show = 40;% Epochs between displays
net.trainParam.goal = 0.02;% Mean-squared error goal
stop_crit = input('Use early stopping? y/n [n]:','s');
erms = 1;
% Training...
if isempty(stop_crit) || stop_crit=='n' % no stop criterion (default)
tic, % start a stopwatch timer.
while erms > 0.02
net = train(net,train_input,train_target,[],[],[],test);
network_output = sim(net,test_input);
error = test_target - network_output;
erms = sqrt(mse(error)); % root mean-square error
net.trainParam.goal = net.trainParam.goal*0.5;
end
toc; % prints the elapsed time since tic was used
else % use early stopping
tic,
net.trainParam.max_fail = input('Maximum validation failures [10]:');
while erms > 0.02
net = train(net,train_input,train_target,[],[],val,test);
network_output = sim(net,test_input);
error = test_target - network_output;
erms = sqrt(mse(error)) % root mean-square error (no semicolon, so progress is displayed)
net.trainParam.goal = net.trainParam.goal*0.5;
end
toc;
end
plot_result.m
function plot_result(net,input,target,network_output,error)
X = reshape(input(1,:),9,9);
Y = reshape(input(2,:),9,9);
Z = reshape(target,9,9);
No = reshape(network_output,9,9);
E = reshape(error,9,9);
figure,
[C,h] = contour(X, Y, E);
clabel(C,h);
xlabel('x');
ylabel('y');
title('level curve of the error')
figure,
[C,h1] = contour(X, Y, Z,'k'); % create level curve of target
set(h1,'LineWidth',2);
% clabel(C,h);
hold on
[C,h2] = contour(X, Y, No,'m'); % create level curve of approximation
% clabel(C,h2);
set(h2,'LineWidth',2);
hold off
legend([h1(1);h2(1)],'target','approximation');
xlabel('x');
ylabel('y');
title('level curves of the target and approximation functions')