Appropriate Number and Allocation of ReLUs in Convolutional Neural Networks

Authors

Vadim Romanuke, Khmelnitskiy National University

DOI:

https://doi.org/10.20535/1810-0546.2017.1.88156

Keywords:

Convolutional neural network, ReLU, EEACL26, CIFAR-10

Abstract

Background. Because there is no common conception of whether each convolutional layer must be followed by a ReLU, the question of an appropriate number of ReLUs and their allocation is considered.

Objective. The goal is to find a law for ascertaining an appropriate number of ReLUs. If this number is less than the number of convolutional layers, the law shall also prescribe an appropriate allocation of the ReLUs.

Methods. A method of evaluating performance on the EEACL26 and CIFAR-10 datasets over various versions of ReLU allocation is defined. For each allocation version, performance is evaluated over 4 epochs for EEACL26 and 8 epochs for CIFAR-10. The best performance scores are then extracted.
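
Below is a minimal sketch of this evaluation protocol under assumptions the abstract does not state: PyTorch and torchvision are used, only CIFAR-10 is covered (EEACL26 is not packaged here), the network is a small illustrative 4-convolutional-layer CNN, and the optimizer, batch size, and learning rate are arbitrary choices rather than the paper's settings. One network is trained per ReLU-allocation version for 8 epochs, and the best test accuracy per version is kept.

    # Sketch only: builder name, channel widths, and training settings are assumptions.
    import itertools, torch, torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    def build_cnn(relu_mask, channels=(3, 32, 64, 96, 128), num_classes=10):
        # relu_mask[i] == True means convolutional layer i is followed by a ReLU.
        layers = []
        for i, use_relu in enumerate(relu_mask):
            layers.append(nn.Conv2d(channels[i], channels[i + 1], 3, padding=1))
            if use_relu:
                layers.append(nn.ReLU(inplace=True))
            layers.append(nn.MaxPool2d(2))
        layers += [nn.Flatten(), nn.LazyLinear(num_classes)]
        return nn.Sequential(*layers)

    def accuracy(model, loader, device):
        # Fraction of correctly classified test images.
        model.eval(); correct = total = 0
        with torch.no_grad():
            for x, y in loader:
                pred = model(x.to(device)).argmax(1).cpu()
                correct += (pred == y).sum().item(); total += y.numel()
        return correct / total

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tfm = transforms.ToTensor()
    train = DataLoader(datasets.CIFAR10("data", train=True, download=True, transform=tfm),
                       batch_size=128, shuffle=True)
    test = DataLoader(datasets.CIFAR10("data", train=False, download=True, transform=tfm),
                      batch_size=256)

    best = {}
    for mask in itertools.product([False, True], repeat=4):   # every ReLU-allocation version
        model = build_cnn(mask).to(device)
        opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
        loss_fn = nn.CrossEntropyLoss()
        for epoch in range(8):                                 # 8 epochs for CIFAR-10
            model.train()
            for x, y in train:
                opt.zero_grad()
                loss_fn(model(x.to(device)), y.to(device)).backward()
                opt.step()
            best[mask] = max(best.get(mask, 0.0), accuracy(model, test, device))
    print(max(best.items(), key=lambda kv: kv[1]))             # best-scoring allocation version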

Results. In convolutional neural networks with 4 or 5 convolutional layers, the first three convolutional layers shall be followed by ReLUs, and the remaining convolutional layers shall not be ReLUed. It is plausible that an appropriate allocation of ReLUs is compact from the start, i.e. the ReLUs follow consecutive convolutional layers beginning with the very first one. An appropriate number of ReLUs is an integer between half the number of convolutional layers and that half increased by 1.
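
The stated rule can be made concrete with a small worked sketch (the helper names below are mine, not the paper's): admissible_relu_counts returns the integers between half the number of convolutional layers and that half plus 1, and compact_relu_mask allocates that many ReLUs one by one from the very first convolutional layer. Such a mask can be fed to a builder like build_cnn in the sketch above.

    import math

    def admissible_relu_counts(num_conv_layers: int) -> list:
        # Integers between num_conv_layers / 2 and num_conv_layers / 2 + 1.
        half = num_conv_layers / 2
        return list(range(math.ceil(half), math.floor(half + 1) + 1))

    def compact_relu_mask(num_conv_layers: int, num_relus: int) -> list:
        # True at position i means convolutional layer i is followed by a ReLU.
        return [i < num_relus for i in range(num_conv_layers)]

    print(admissible_relu_counts(4))   # [2, 3]
    print(admissible_relu_counts(5))   # [3]
    print(compact_relu_mask(5, 3))     # [True, True, True, False, False]

For 4 or 5 convolutional layers this yields 3 ReLUs (and, for 4 layers, also 2), which agrees with the finding that the first three convolutional layers are the ones to be ReLUed.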

Conclusions. In some cases, the gain grows to 100 % and more. The gain for CIFAR-10, if any, is roughly 10 to 20 %. Generally, as the training process goes on, the gain expectedly drops. Nevertheless, the stated appropriateness of the number and allocation of ReLUs rationalizes the convolutional neural network architecture. Convolutional neural networks under the appropriate ReLU allocation can be progressively optimized further on their other hyperparameters.

Author Biography

Vadim Romanuke, Khmelnitskiy National University

Doctor of sciences (engineering), professor at the Khmelnitskiy National University



Published

2017-03-01

Issue

Section

Art