Flexible Solution of a 2-layer Perceptron Optimization by its Size and Training Set Smooth Distortion Ratio for Classifying Simple-Structured Objects

Vadim V. Romanuke


Background. Two-layer perceptrons are preferred to complex neural network classifiers when the objects to be classified have a simple structure. However, even images consisting of a few geometrically primitive elements on a monotonous background are classified poorly by a two-layer perceptron if they are badly distorted (shifted, skewed, and scaled). The performance of a two-layer perceptron can be improved considerably by modifying its training. This is done by deliberately introducing distortions such as shifting, skewness, and scaling into the training set, while controlling the volume of the distortions with a special ratio. In addition, the performance is improved by optimally sizing the hidden layer.

Objective. The goal is to optimize the two-layer perceptron by its size and the ratio for classifying simple-structured objects.

Methods. The objects are monochrome images of enlarged English alphabet capital letters (the EEACL26 dataset) of a medium size 60-by-80. EEACL26 is an infinite artificial dataset, so mathematical models of distorted images are given. Then two-layer perceptrons having various sizes and training set smooth distortion ratios are trained and tested. The performance is evaluated via ultimate-distortion classification error percentage.
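The idea of a distorted training image can be illustrated with a minimal Python sketch. It is not the paper's mathematical model, which the abstract does not give. It assumes that the ratio simply multiplies the standard deviations of the shift, skew, and scale distortions. The function name `distort`, the parameters `max_shift`, `max_skew`, and `max_scale`, their default values, and the zero-valued background are all hypothetical.

```python
import random


def distort(image, ratio, max_shift=0.1, max_skew=0.2, max_scale=0.2, rng=None):
    """Return a shifted, skewed, and scaled copy of a monochrome image.

    `image` is a list of rows of 0/1 pixels. `ratio` scales every distortion
    magnitude (an assumption of this sketch, not the paper's definition).
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])

    # Draw normally distributed distortion magnitudes, damped by `ratio`.
    dx = ratio * max_shift * w * rng.gauss(0, 1)          # horizontal shift
    dy = ratio * max_shift * h * rng.gauss(0, 1)          # vertical shift
    skew = ratio * max_skew * rng.gauss(0, 1)             # horizontal shear
    scale = max(0.1, 1.0 + ratio * max_scale * rng.gauss(0, 1))

    cx, cy = (w - 1) / 2, (h - 1) / 2                     # image centre
    out = [[0] * w for _ in range(h)]                     # assumed background: 0

    for y in range(h):
        for x in range(w):
            # Inverse mapping: undo shift, then scale, then shear,
            # to find which source pixel lands on output pixel (x, y).
            u = (x - cx - dx) / scale
            v = (y - cy - dy) / scale
            u -= skew * v
            sx, sy = round(u + cx), round(v + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out
```

With `ratio=0` the function returns the image unchanged, and larger ratios give stronger distortions, so under this assumption the ratio acts as a single knob for how much distortion enters the training set.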

Results. Based on statistical evaluations of the classification error percentage at ultimate distortions, it is revealed that the best ratio should be between 0.01 and 0.02, and the optimal number of neurons in the hidden layer should be between 361 and 390. Sizes closer to 375 are counted as statistically more reliable, whereas the ratios are selected uniformly. Such a solution is flexible, allowing not only further training with a changed hidden layer size and ratio, but also a smart initial training for the first passes. Nevertheless, even after the first 100 passes, the two-layer perceptron further trained for an extra 1190 passes with distortion smoothness increased 10 times performs at 8.91 % of errors at ultimate distortions, which is about 45 % better than a previously known result. At the expected practicable distortions, which are far smaller, the error rate is 0.8 %, which is quite tolerable. Here, however, the main benefit of the optimized two-layer perceptron is its fast operation speed rather than its accuracy.
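One way to read the flexible solution is as a pair of intervals from which a configuration is drawn before training. The Python sketch below illustrates that reading under stated assumptions. The triangular weighting that favours sizes near 375 is a hypothetical choice of this sketch, and the function name `sample_configuration` is illustrative. Only the interval bounds and the value 375 come from the results above.

```python
import random

# Interval bounds reported in the results.
SIZE_LOW, SIZE_HIGH = 361, 390
RATIO_LOW, RATIO_HIGH = 0.01, 0.02
SIZE_MODE = 375  # sizes near this value are reported as more reliable


def sample_configuration(rng):
    """Draw one (hidden layer size, distortion ratio) pair.

    The size follows a triangular distribution peaking at 375 (an assumed
    weighting); the ratio is drawn uniformly from its interval.
    """
    size = round(rng.triangular(SIZE_LOW, SIZE_HIGH, SIZE_MODE))
    ratio = rng.uniform(RATIO_LOW, RATIO_HIGH)
    return size, ratio


rng = random.Random(2017)
configurations = [sample_configuration(rng) for _ in range(5)]
```

Each pair in `configurations` could seed one initial training run, after which further training may change the size or the ratio within the same intervals.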

Conclusions. The obtained flexible solution fits other datasets similar to EEACL26. The number of classes can vary between 20 and 30, and the number of features can vary between a few hundred and a few thousand. The stated example of achieving high-performance classification with two-layer perceptrons is a part of the common technique of statistical optimization relating to neural network classifiers. For a more sophisticated dataset of objects, this technique is built and executed in a similar way.


Classification; Shifted-skewed-scaled objects; 2-layer perceptron size; 2-layer perceptron configuration; Training set; MATLAB training function; 2-layer perceptron performance








DOI: https://doi.org/10.20535/1810-0546.2017.6.110724



Copyright (c) 2017 Igor Sikorsky Kyiv Polytechnic Institute

This work is licensed under a Creative Commons Attribution 4.0 International License.