Training Data Expansion and Boosting of Convolutional Neural Networks for Reducing the MNIST Dataset Error Rate
DOI: https://doi.org/10.20535/1810-0546.2016.6.84115

Keywords: MNIST, convolutional neural network, error rate, training data expansion, boosting

Abstract
Background. The preceding approaches to improving the MNIST image dataset error rate lack a clear structure that would allow them to be repeated and strengthened, so a formalization of the performance improvement is considered.
Objective. The goal is to strictly formalize a strategy of reducing the MNIST dataset error rate.
Methods. An algorithm for achieving better performance by expanding the training data and boosting with ensembles is suggested. The algorithm uses the designed concept of training data expansion. Coordinating the concept with the algorithm defines a strategy for reducing the error rate.
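The paper does not restate the expansion procedure here, but the idea of enlarging the training set with distorted copies of the originals can be sketched as follows. This is a minimal illustration, assuming random pixel shifts as the distortion; the function name `expand_training_data` and the shift-based distortion are illustrative choices, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)

def expand_training_data(images, labels, copies=4, max_shift=2):
    """Expand a dataset with randomly shifted copies of each image.

    images: array of shape (n, 28, 28); labels: array of shape (n,).
    Returns the originals plus `copies` distorted versions of every image.
    """
    expanded_images = [images]
    expanded_labels = [labels]
    for _ in range(copies):
        # Draw one small random shift and apply it to the whole batch.
        dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
        shifted = np.roll(np.roll(images, dx, axis=1), dy, axis=2)
        expanded_images.append(shifted)
        expanded_labels.append(labels)
    return np.concatenate(expanded_images), np.concatenate(expanded_labels)

# Toy demonstration on a fake "dataset" of 10 images
fake_images = rng.random((10, 28, 28))
fake_labels = np.arange(10) % 10
big_images, big_labels = expand_training_data(fake_images, fake_labels)
print(big_images.shape)  # (50, 28, 28): 10 originals + 4 x 10 shifted copies
```

In practice the distortions would also include rotations, scalings, or elastic deformations, and a fresh random shift could be drawn per image rather than per copy.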
Results. In relative terms, the performance of a single convolutional neural network on the MNIST dataset has been improved by almost 30 %. With boosting, the performance reaches a 0.21 % error rate, meaning that only 21 of the 10,000 handwritten test digits are misrecognized.
Conclusions. Training data expansion is crucial for reducing the MNIST dataset error rate; boosting is ineffective without it. Applying the stated approach reduces the MNIST error rate substantially while using only 5 or 6 convolutional neural networks, against the 35 used in the benchmark work.
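Combining 5 or 6 networks into one classifier can be sketched by averaging the class-probability outputs of the ensemble members, a common way to fuse CNN ensembles. This is only an assumed illustration of the fusion step, not the paper's exact boosting scheme; `ensemble_predict` and the random "network outputs" are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

def ensemble_predict(member_probs):
    """Fuse ensemble members by averaging their class probabilities.

    member_probs: array of shape (n_members, n_samples, n_classes).
    Returns the predicted class index for each sample.
    """
    return np.mean(member_probs, axis=0).argmax(axis=1)

# Five hypothetical networks scoring 100 samples over 10 digit classes
probs = rng.random((5, 100, 10))
probs /= probs.sum(axis=2, keepdims=True)  # normalize rows to probabilities
preds = ensemble_predict(probs)
print(preds.shape)  # (100,)
```

A boosted ensemble would additionally weight the members (and reweight training samples between rounds), but the final fusion of predictions has this general shape.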
References
Y. LeCun et al., “Gradient-based learning applied to document recognition”, Proc. IEEE, vol. 86, iss. 11, pp. 2278–2324, 1998. doi: 10.1109/5.726791
E. Kussul and T. Baidyk, “Improved method of handwritten digit recognition tested on MNIST database”, Image and Vision Computing, vol. 22, iss. 12, pp. 971–981, 2004. doi: 10.1016/j.imavis.2004.03.008
D. Ciresan et al., “Flexible, high performance convolutional neural networks for image classification”, in Proc. 22nd Int. Joint Conf. Artificial Intelligence, vol. 2, pp. 1237–1242, 2011.
M. Ranzato et al., “Efficient learning of sparse representations with an energy-based model”, Advances in Neural Information Processing Systems, vol. 19, pp. 1137–1144, 2006.
D. Ciresan et al., “Multi-column deep neural networks for image classification”, in Proc. 2012 IEEE Conf. Computer Vision and Pattern Recognition, pp. 3642–3649, 2012. doi: 10.1109/CVPR.2012.6248110
P. Date et al., “Design index for deep neural networks”, Procedia Computer Sci., vol. 88, pp. 131–138, 2016. doi: 10.1016/j.procs.2016.07.416
D. Ciresan et al., “Deep big simple neural nets excel on handwritten digit recognition”, Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010. doi: 10.1162/NECO_a_00052
B. Kégl and R. Busa-Fekete, “Boosting products of base classifiers”, in Proc. 26th Annual Int. Conf. Machine Learning, pp. 497–504, 2009. doi: 10.1145/1553374.1553439
V.V. Romanuke, “Boosting ensembles of heavy two-layer perceptrons for increasing classification accuracy in recognizing shifted-turned-scaled flat images with binary features”, J. Inform. Organiz. Sci., vol. 39, no. 1, pp. 75–84, 2015.
N. Srivastava et al., “Dropout: A simple way to prevent neural networks from overfitting”, J. Machine Learning Res., vol. 15, iss. 1, pp. 1929–1958, 2014.
V.V. Romanuke, “Two-layer perceptron for classifying flat scaled-turned-shifted objects by additional feature distortions in training”, J. Uncertain Systems, vol. 9, no. 4, pp. 286–305, 2015.
V.V. Romanuke, “A method of resume-training of discontinuous wear state trackers for composing boosting high-accurate ensembles needed to regard statistical data inaccuracies and shifts”, Problems of Tribology, no. 3, pp. 19–22, 2015.
V.V. Romanuke, “Accuracy improvement in wear state discontinuous tracking model regarding statistical data inaccuracies and shifts with boosting mini-ensemble of two-layer perceptrons”, Problems of Tribology, no. 4, pp. 55–58, 2014.
License
Copyright (c) 2017 NTUU KPI. The work is licensed under CC BY 4.0; authors retain copyright and grant the journal the right of first publication.