
Numerical influence of ReLU’(0) on backpropagation

Abstract: In theory, the choice of ReLU'(0) in [0, 1] for a neural network has a negligible influence on both backpropagation and training. Yet, in practice, the default 32-bit precision combined with the size of deep learning problems turns it into a hyperparameter of training methods. We investigate the importance of the value of ReLU'(0) for several precision levels (16, 32, 64 bits), on various networks (fully connected, VGG, ResNet) and datasets (MNIST, CIFAR10, SVHN). We observe considerable variations of backpropagation outputs, occurring around half of the time at 32-bit precision. The effect disappears in double precision, while it is systematic at 16 bits. For vanilla SGD training, the choice ReLU'(0) = 0 appears to be the most efficient. We also show that reconditioning approaches such as batch normalization or ADAM tend to buffer the influence of the value of ReLU'(0). Overall, the message we want to convey is that algorithmic differentiation of nonsmooth problems potentially hides parameters that could be tuned advantageously.
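
To make the mechanism concrete, here is a minimal, hypothetical sketch (not the authors' code), assuming a reasonably recent PyTorch with float16 CPU support: a custom ReLUAlpha autograd function takes the value of ReLU'(0) as a parameter alpha, and a toy pre-activation is built so that it rounds to exactly 0 at 16 and 32 bits but stays strictly positive at 64 bits, so the backpropagated gradient depends on alpha only in low precision. The names ReLUAlpha and grad_gap are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' code): probe how the choice of
# ReLU'(0) = alpha in [0, 1] changes backpropagation at different precisions.
import torch


class ReLUAlpha(torch.autograd.Function):
    """ReLU whose subgradient at the kink x == 0 is a tunable value alpha."""

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.save_for_backward(x)
        ctx.alpha = alpha
        return torch.where(x > 0, x, torch.zeros_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Derivative is 1 for x > 0, alpha at exactly x == 0, and 0 for x < 0.
        d = torch.where(x > 0, torch.ones_like(x),
                        torch.where(x == 0, torch.full_like(x, ctx.alpha),
                                    torch.zeros_like(x)))
        return grad_output * d, None  # no gradient for alpha


def grad_gap(dtype):
    """|grad with alpha=1 - grad with alpha=0| for a pre-activation that
    rounds to exactly 0 at 16/32 bits but equals about 1e-8 at 64 bits."""
    def grad(alpha):
        w = torch.tensor(1.0, dtype=dtype, requires_grad=True)
        a = torch.tensor(1.0, dtype=dtype) + torch.tensor(1e-8, dtype=dtype)
        pre = w * a - torch.tensor(1.0, dtype=dtype)  # hits the kink in low precision
        ReLUAlpha.apply(pre, alpha).backward()
        return w.grad.item()
    return abs(grad(1.0) - grad(0.0))


for dtype in (torch.float16, torch.float32, torch.float64):
    print(dtype, grad_gap(dtype))
# Expected pattern: a gap of about 1 at 16 and 32 bits, and 0 at 64 bits,
# mirroring the abstract's observation that the effect vanishes in double precision.
```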
Document type: Preprints, Working Papers, ...

https://hal.archives-ouvertes.fr/hal-03265059
Contributor: David Bertoin
Submitted on: Tuesday, June 29, 2021 - 11:01:31 AM
Last modification on: Wednesday, June 30, 2021 - 3:41:50 AM

Files

Impact_of_ReLU_prime.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-03265059, version 2

Citation

David Bertoin, Jérôme Bolte, Sébastien Gerchinovitz, Edouard Pauwels. Numerical influence of ReLU’(0) on backpropagation. 2021. ⟨hal-03265059v2⟩

Metrics

Record views: 68
File downloads: 31