Strange output of TNNetFullConnectReLU #35

Closed
HuguesDug opened this issue Nov 15, 2020 · 5 comments

@HuguesDug

Hello,

I have made a basic network, for testing and debugging.

All inputs in the training set are always 0.
All expected outputs are always the same (3 outputs: 0.1 / 0.25 / 0.50).

So, I would expect the network to learn the bias rapidly. It does not.
The output is always 0.

If now I set the inputs to a random number, the network will learn.

Network is rather simple, although I used the "multi input" option.

// Create network
NN := TNNet.Create();

// Create input layer structure (one TNNetInput per branch)
for i := 0 to Length(InputLayers) - 1 do
begin
  InputLayers[i] := NN.AddLayer(TNNetInput.Create(NbQuotes, NbQuotesData, 1));
end;

// One fully connected ReLU layer per input branch
for i := 0 to Length(Branch) - 1 do
  Branch[i] := NN.AddLayerAfter(TNNetFullConnectReLU.Create(3), InputLayers[i]);

// Merge branches
NN.AddLayer(TNNetConcat.Create(Branch));

// Output layers
NN.AddLayer(TNNetFullConnectReLU.Create(3));
NN.AddLayer(TNNetFullConnectReLU.Create(3));

// Init weights
NN.InitWeights;

// Set learning rate and inertia
NN.SetLearningRate(0.01, 0.95);
NN.ErrorProc := MyErrorProc;
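For reference, the cycle that drives such a network usually follows the Compute/GetOutput/Backpropagate pattern from the library's examples. The sketch below is illustrative only: the volume sizes, the epoch count, and the direct FData writes are my assumptions, and the multi-input case would need one input volume per TNNetInput layer, which this sketch ignores.

var
  vInput, vTarget, vOutput: TNNetVolume;
  Epoch: integer;
begin
  // All-zero input, matching the failing test case.
  vInput  := TNNetVolume.Create(NbQuotes, NbQuotesData, 1, 0.0);
  vTarget := TNNetVolume.Create(3, 1, 1);
  vOutput := TNNetVolume.Create(3, 1, 1);
  vTarget.FData[0] := 0.1;
  vTarget.FData[1] := 0.25;
  vTarget.FData[2] := 0.5;

  for Epoch := 1 to 1000 do
  begin
    NN.Compute(vInput);        // forward pass
    NN.GetOutput(vOutput);     // read current predictions
    NN.Backpropagate(vTarget); // update weights toward the constant targets
  end;
end;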

@joaopauloschuler
Owner

Thank you so much for the detailed bug report.

I'll have a look and reply.

May I ask if you are working with the latest version of the source code?

@HuguesDug
Author

Hello,

You are always so quick to reply. I really appreciate what you do to bring the Pascal (Delphi/Lazarus) community a decent API for neural networks.

To answer your question, yes, I do use the latest version.
My environment is Delphi 10.3 Community Edition.

As you can see from the structure of the network, each branch has a fully connected ReLU layer with a bias. So, during the learning process, the bias should reach a value such that the two layers after the concat rapidly converge to proper weights that reproduce the expected "constant" output values.

With inputs always 0, the outputs are always 0.
With random inputs, the outputs are "nearly OK": typically 2 of them are OK while the last one stays at 0 (not always the same one).
With inputs of 1/Epoch, all 3 outputs fit the target.

Strange, isn't it?

It looks like the bias is not calculated.
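A plausible reading of the numbers above, assuming biases start at zero and a ReLU derivative that is 0 at 0 (my interpretation, not confirmed at this point in the thread): with input x = 0, the first layer computes

$$y = \mathrm{ReLU}(Wx + b) = \max(0,\, W \cdot 0 + 0) = 0$$

and the bias gradient vanishes as well,

$$\frac{\partial L}{\partial b} = \delta \cdot \mathrm{ReLU}'(0) = 0,$$

so no parameter ever receives a non-zero update, no matter how many epochs run.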

@joaopauloschuler
Owner

Please let me know if this fix works.

@joaopauloschuler
Owner

The error was: when the input was zero, there was no derivative available for gradient descent. This is why it worked with random inputs.
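A minimal stand-alone Pascal sketch of that failure mode (illustrative only; these are not CAI's actual internals, and the function names are hypothetical):

program ReluZeroGradient;

function ReLU(x: Single): Single;
begin
  if x > 0 then Result := x else Result := 0;
end;

// A strict derivative returns 0 for x <= 0. With all-zero inputs and
// zero-initialized biases the pre-activation is exactly 0, so every
// gradient (delta * derivative) is 0 and nothing ever learns.
function ReLUDerivative(x: Single): Single;
begin
  if x > 0 then Result := 1 else Result := 0;
end;

begin
  WriteLn('ReLU(0)  = ', ReLU(0.0):0:1);            // 0.0
  WriteLn('ReLU''(0) = ', ReLUDerivative(0.0):0:1); // 0.0 -> zero gradient
end.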

Thank you for reporting with plenty of details.

@HuguesDug
Author

Tested: it works fine now!

Thanks for the fix.
