Appropriate Number and Allocation of ReLUs in Convolutional Neural Networks
DOI: https://doi.org/10.20535/1810-0546.2017.1.88156

Keywords: convolutional neural network, ReLU, EEACL26, CIFAR-10

Abstract
Background. Since there is no common view on whether each convolutional layer must be followed by a ReLU, the question of an appropriate number of ReLUs and their allocation is considered.
Objective. The goal is to find a law for ascertaining an appropriate number of ReLUs. If this number is less than the number of convolutional layers, the law shall also determine an appropriate allocation of the ReLUs.
Methods. A method of evaluating performance on the EEACL26 and CIFAR-10 datasets over various versions of ReLU allocation is defined. For each allocation version, performance is evaluated through 4 and 8 epochs for EEACL26 and CIFAR-10, respectively, and the best performance scores are extracted.
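The abstract does not enumerate the exact allocation versions that were tested. As an assumption, a brute-force candidate set could be all on/off patterns of ReLUs over the convolutional layers; a minimal sketch (function name illustrative):

```python
from itertools import product

def allocation_versions(num_conv_layers):
    """All 2**L ways to follow (True) or not follow (False) each of
    the L convolutional layers with a ReLU."""
    return [list(bits) for bits in product((False, True), repeat=num_conv_layers)]

# A 4-layer network yields 16 candidate allocation versions.
```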
Results. In convolutional neural networks with 4 or 5 convolutional layers, the first three convolutional layers should be followed by ReLUs, and the remaining convolutional layers should not. It is plausible that an appropriate allocation of ReLUs is compact from the start, i. e. the ReLUs follow consecutive convolutional layers beginning with the very first one. An appropriate number of ReLUs is an integer between half the number of convolutional layers and that half increased by 1.
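The stated law can be written down directly. The sketch below (function names are illustrative, not from the paper) computes the admissible ReLU counts for L convolutional layers and builds the from-the-start compact allocation; for L = 4 it gives counts 2 and 3, and for L = 5 it gives 3, consistent with the "first three layers" result above.

```python
import math

def appropriate_relu_counts(num_conv_layers):
    """Integers lying between L/2 and L/2 + 1, inclusive."""
    half = num_conv_layers / 2
    return list(range(math.ceil(half), math.floor(half + 1) + 1))

def compact_allocation(num_conv_layers, num_relus):
    """From-the-start compact allocation: a ReLU follows each of the
    first `num_relus` convolutional layers and none of the rest."""
    return [i < num_relus for i in range(num_conv_layers)]
```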
Conclusions. In some cases, the gain can reach 100 % and more. The gain for CIFAR-10, if any, is roughly 10 to 20 %. Generally, as the training process goes on, the gain expectedly drops. Nevertheless, the stated appropriateness of the number and allocation of ReLUs rationalizes the convolutional neural network architecture. Under the appropriate ReLU allocation, convolutional neural networks can be progressively optimized further on their other hyperparameters.
Copyright (c) 2017 NTUU KPI. Published under CC BY 4.0.