Strange output of TNNetFullConnectReLU #35

Closed
HuguesDug opened this issue Nov 15, 2020 · 5 comments

@HuguesDug

Hello,

I have made a basic network, for testing and debugging.

All inputs in the training set are always 0.
All expected outputs are always the same (3 outputs: 0.1 / 0.25 / 0.50).

So, I would expect the network to learn the bias rapidly. It does not.
The output is always 0.

If now I set the inputs to a random number, the network will learn.

Network is rather simple, although I used the "multi input" option.

// Create network
NN := TNNet.Create();

// Create input layer structure (one TNNetInput per branch)
for i := 0 to Length(InputLayers) - 1 do
begin
  InputLayers[i] := NN.AddLayer(TNNetInput.Create(NbQuotes, NbQuotesData, 1));
end;

// One fully connected ReLU layer per input branch
for i := 0 to Length(Branch) - 1 do
  Branch[i] := NN.AddLayerAfter(TNNetFullConnectReLU.Create(3), InputLayers[i]);

// Merge branches
NN.AddLayer(TNNetConcat.Create(Branch));

// Output layers
NN.AddLayer(TNNetFullConnectReLU.Create(3));
NN.AddLayer(TNNetFullConnectReLU.Create(3));

// Init weights
NN.InitWeights;

// Set learning rate and inertia
NN.SetLearningRate(0.01, 0.95);
NN.ErrorProc := MyErrorProc;
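For reference, the cycle that drives such a network usually follows the Compute/GetOutput/Backpropagate pattern from the library's examples. The sketch below is illustrative only: the volume sizes, the epoch count, and the direct FData writes are my assumptions, and the multi-input case would need one input volume per TNNetInput layer, which this sketch ignores.

var
  vInput, vTarget, vOutput: TNNetVolume;
  Epoch: integer;
begin
  // All-zero input, matching the failing test case.
  vInput  := TNNetVolume.Create(NbQuotes, NbQuotesData, 1, 0.0);
  vTarget := TNNetVolume.Create(3, 1, 1);
  vOutput := TNNetVolume.Create(3, 1, 1);
  vTarget.FData[0] := 0.1;
  vTarget.FData[1] := 0.25;
  vTarget.FData[2] := 0.5;

  for Epoch := 1 to 1000 do
  begin
    NN.Compute(vInput);        // forward pass
    NN.GetOutput(vOutput);     // read current predictions
    NN.Backpropagate(vTarget); // update weights toward the constant targets
  end;
end;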

@joaopauloschuler
Owner

Thank you so much for the detailed bug report.

I'll have a look and reply.

May I ask if you are working with the latest version of the source code?

@HuguesDug
Author

Hello,

You are always so quick to reply. I really appreciate what you do to bring the Pascal (Delphi/Lazarus) community a decent API for neural networks.

To answer your question, yes, I do use the latest version.
My environment is Delphi 10.3 Community Edition.

As you can see from the structure of the network, each branch has a fully connected ReLU layer with a bias. So, during the learning process, the bias should reach a value such that the two layers after the concat rapidly converge to proper weights that reproduce the expected "constant" output values.

With inputs always 0, the outputs are always 0.
With random inputs, the outputs are "nearly OK": typically 2 of them are OK while the last one stays at 0 (not always the same one).
With inputs of 1/Epoch, all 3 outputs fit the target.

Strange, isn't it?

It looks like the bias is not calculated.
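A plausible reading of the numbers above, assuming biases start at zero and a ReLU derivative that is 0 at 0 (my interpretation, not confirmed at this point in the thread): with input x = 0, the first layer computes

$$y = \mathrm{ReLU}(Wx + b) = \max(0,\, W \cdot 0 + 0) = 0$$

and the bias gradient vanishes as well,

$$\frac{\partial L}{\partial b} = \delta \cdot \mathrm{ReLU}'(0) = 0,$$

so no parameter ever receives a non-zero update, no matter how many epochs run.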

@joaopauloschuler
Owner

Please let me know if this fix works.

@joaopauloschuler
Owner

The error was: when the input was zero, there was no derivative available for gradient descent. This is why it worked with random inputs.
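A minimal stand-alone Pascal sketch of that failure mode (illustrative only; these are not CAI's actual internals, and the function names are hypothetical):

program ReluZeroGradient;

function ReLU(x: Single): Single;
begin
  if x > 0 then Result := x else Result := 0;
end;

// A strict derivative returns 0 for x <= 0. With all-zero inputs and
// zero-initialized biases the pre-activation is exactly 0, so every
// gradient (delta * derivative) is 0 and nothing ever learns.
function ReLUDerivative(x: Single): Single;
begin
  if x > 0 then Result := 1 else Result := 0;
end;

begin
  WriteLn('ReLU(0)  = ', ReLU(0.0):0:1);            // 0.0
  WriteLn('ReLU''(0) = ', ReLUDerivative(0.0):0:1); // 0.0 -> zero gradient
end.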

Thank you for reporting with plenty of details.

@HuguesDug
Author

Tested: it works fine now!

Thanks for the fix.
