A new formula to determine the optimal dataset size for training neural networks

Journal
ARPN Journal of Engineering and Applied Sciences
Date Issued
2019-01-01
Author(s)
Aik L.E.
Hong T.W.
Junoh A.K.
Handle (URI)
https://hdl.handle.net/20.500.14170/10069
Abstract
In neural networks, training a network with a large dataset imposes a heavy computational load and does not guarantee network accuracy. A dataset may contain outliers or missing values that leave gaps, which can distort the overall shape of the dataset during training. A dataset with too few or too many data points is not an optimal size for training a neural network. Hence, a suitable size is required to ensure the network is trained on an optimal dataset size, one that reduces computation time without significantly affecting accuracy. This paper presents a dataset size reduction formula that provides a suitable training dataset size for neural networks without significantly affecting accuracy. The formula is derived from the Fibonacci retracement, whose use has been reported in many studies. Experiments were performed on four functions from the literature and four real-world datasets to validate its efficiency. The experiments tested groups of datasets with their data reduced from 0 percent to 95 percent in 5 percent steps. The results of the proposed method were compared in terms of root mean square error (RMSE) and time usage in a radial basis function network (RBFN). The proposed method yielded promising results, with an average reduction of over 50 percent in time usage and 20 percent in RMSE.
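The abstract does not reproduce the paper's actual formula, so the following is only a hypothetical sketch of the general idea: using the standard Fibonacci retracement levels (23.6%, 38.2%, 50%, 61.8%) to generate candidate reduced dataset sizes from a full dataset of N points. The function name and the choice to interpret each level as the fraction of data removed are assumptions for illustration, not the authors' method.

```python
# Hypothetical illustration only: the paper's exact dataset-size formula
# is not given in the abstract. This sketch applies the standard
# Fibonacci retracement ratios to a dataset size, treating each ratio
# as the fraction of data points to remove.

FIB_LEVELS = [0.236, 0.382, 0.5, 0.618]  # standard retracement ratios


def candidate_sizes(n_points: int) -> list[int]:
    """Return one candidate reduced dataset size per retracement level.

    A 23.6% retracement keeps 76.4% of the points, a 38.2% retracement
    keeps 61.8%, and so on.
    """
    return [round(n_points * (1 - level)) for level in FIB_LEVELS]


print(candidate_sizes(1000))  # e.g. [764, 618, 500, 382]
```

Each candidate size could then be evaluated by training the RBFN on the reduced dataset and comparing RMSE and training time against the full dataset, which matches the evaluation protocol the abstract describes.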
Subjects
  • Dataset size reductio...
