
Acoustic signal generation techniques for non-invasive coconut maturity classification system

June Anne Caladcad 1*, Eduardo Jr. Piedad 2

1 Department of Industrial Engineering, University of San Jose - Recoletos, Cebu City, 6000, Philippines; [email protected]
2 Advanced Science and Technology Institute, DOST, Quezon City 1101, Philippines; [email protected]
* Corresponding author: [email protected] (JA Caladcad)

Abstract
Technological advancements enable the use of intelligent systems in various fields such as operations, management, and agriculture. In agriculture, smart systems create a positive impact on fruit classification, specifically for export needs. Despite this wide application, studies on their use in coconut exportation remain limited. Another identified challenge is that gathering a large dataset is difficult, expensive, time-consuming, and prone to errors due to human inconsistencies. Because of this, the majority of the initial datasets in existing studies are unbalanced and sometimes inadequate to support significant analysis and results. In this study, an existing dataset of coconut acoustic signals is used to generate more data and balance the samples across the three maturity levels. Data synthesis and data augmentation techniques are utilized along with techniques for combining acoustic signals, feature extraction, and data clustering. The data points under the three maturity levels were clustered appropriately using the summing method and the Mel Frequency Cepstral Coefficient feature. All data synthesizers failed to generate quality synthetic data. However, both audiomentation and procedural audio generation produced quality augmented data after validation using linear layers, one-dimensional convolution, and a long short-term memory model as the audio classification technique. The new dataset was then fed to the same models, and the results yielded a significant increase in classification performance for all models. The study can be further improved by incorporating other coconut features to increase performance and widen its application.
Keywords: signal processing, artificial neural network, support vector machine, random forest model, feature extraction, coconut maturity classification

1. Introduction

The majority of the global coconut supply comes from tropical countries where coconut trees normally thrive. The three largest coconut-producing countries (Indonesia, the Philippines, and India) contributed 78% of global coconut production, and the remaining 22% came from the rest of the coconut-producing countries (Burns, et al., 2020). In the Philippines, coconut is one of the greatest contributors to general economic activity in terms of agricultural land and labor (Moreno, et al., 2020). It is also considered a predictor of general economic activity because coconut is a major source of income in the country (Moreno, et al., 2020). The country has average earnings of 91.4 billion PHP (1.7 billion USD) and an annual coconut production of 14.7 million metric tons in nut terms from 2014 to 2019 (Philippine Coconut Authority, 2018). About 25% to 33% of the country's population depends on the coconut industry for their livelihood (Pascua, 2017).

For commercial purposes, coconuts vary in the classification of the fruit's maturity level (Mahayothee, et al., 2016; Javel, et al., 2018; Terdwongworakul, et al., 2009). Maturity is the prime determinant of the fruit's maximum economic value and consumption (Chen, et al., 2021). Most commonly, coconuts are classified into three levels, i.e., premature, mature, and overmature (Burns, et al., 2020; Terdwongworakul, et al., 2009; Gatchalian & De Leon, 1994; Mahayothee, et al., 2016; Javel, et al., 2018). Traditionally, coconuts are sorted manually into their maturity levels, which poses many drawbacks from human-related constraints such as inconsistency, variability, and subjectivity, among others (Zhang, et al., 2014; ElMasry, et al., 2012; Hameed, et al., 2018; Elmasry, et al., 2012). When dealing with a large volume of coconuts to be exported or delivered for industrial processing as a fresh product, the traditional technique of manually sorting fruits is no longer feasible (Gatchalian & De Leon, 1994). Behera et al. (2020) reported that, due to the lack of skilled workers and human subjectivity in classifying coconuts, 30% to 35% of harvested coconuts are wasted. With the advancement of technology, manual fruit classification is gradually being replaced with mechanical methods, such as learning models (Chen, et al., 2021).

Classification learning models are data-driven and highly dependent on their training data to yield high classification accuracy (Salamon & Bello, 2017; Himanen, et al., 2019). These models fall under big data approaches known for dealing with extensively large databases for accurate prediction and classification (Wang, et al., 2021). The cascade of multiple hidden layers in a neural network architecture learns simple to complex features of the training set in its raw form (Shinde & Shah, 2018; Assen, et al., 2020; Janiesch, et al., 2021; Ning & You, 2019). This is how the training data directly affects the performance of a classification model architecture (Shaoa, et al., 2019). Commonly, these models require a large amount of labeled data for good results (Krizhevsky, et al., 2012). However, training samples are usually unbalanced because real-world high-quality data collection is challenging, costly, and time-consuming (Yasar & Laskowski, 2023; Shaked, 2023; Maguolo, et al., 2021; Shaoa, et al., 2019).

An imbalanced dataset is a recurring problem in training models; it restricts the accuracy and stability of a model, resulting in poor fruit classification performance (Zheng, et al., 2021). This can be resolved with artificially generated data produced by data generation methods (Zheng, et al., 2021; Kong, et al., 2021; Park, et al., 2018). In the case of sound and acoustic signals, the most common application domains for sound classification are speech recognition (Chowanda, et al., 2023; Yuan, et al., 2023; Chotirat & Meesad, 2021; Nallanthighal, et al., 2021), music classification (Yu, et al., 2020; Hizlisoy, et al., 2021; Pendyala, et al., 2022; Ghatas, et al., 2022), and environmental sound recognition (Nanni, et al., 2020).

Data generation is most frequently applied to image processing (Le, et al., 2019), speech patterns (Xia, et al., 2014; Siriwardena, et al., 2022), and musical notes (Engel, et al., 2017). There are two types of data generation: data synthesis and data augmentation. Data synthesis usually involves generative models and can produce synthetic data without the original dataset, while data augmentation applies slight manipulations to the original data to generate more (Awan, 2022). Several studies in the literature use data synthesis or data augmentation for sound classification and acoustic signal processing. For data synthesis, these include Kong et al. (2021) proposing a diffusion model for waveform generation through a Markov chain, Binkowski et al. (2020) introducing a generative adversarial network (GAN) for tasks that involve text and speech, and Yamamoto et al. (2020) designing a distillation-free and fast waveform generation method using GAN. For data augmentation, these include Bryan (2020) estimating reverberation time and direct-to-reverberant ratio from speech using deep convolutional neural networks (CNN), Siriwardena et al. (2022) using audio data augmentation for speech inversion, and Sun et al. (2022) classifying animal sounds using CNN with data augmentation. Although these studies provide significant insights and results, the majority deal with speech analysis and processing while some deal with environmental sounds such as animal sounds. Evidently, there is a limited variety of applications for data generation concerning data samples for classifying fruits.

The objective of this study is to improve existing methods of classifying coconuts into their maturity levels with the use of data generation methods, specifically for mass fruit exportation. In addition, data generation methods are investigated to find which methods and techniques apply to the nature of the dataset used in the study, i.e., acoustic signals. By creating an improved classification system, this work can aid the agricultural industry and coconut exportation companies in improving their sorting performance and in producing significantly fewer fruits that are misclassified and wasted due to manual methods. This study uses the dataset available from Caladcad et al. (2022) as the initial dataset for the data generation methods. The newly generated dataset is then deployed to the three machine learning algorithms (MLAs) used in the study of Caladcad et al. (2020) for comparison, to further verify the improvement of the classification models.
2. Materials and methods

2.1. Initial coconut dataset

The initial dataset in this study is from a recently published dataset of coconuts based on a tapping system in Caladcad et al. (2022). The raw data are acoustic signals from the three coconut maturity levels: premature, mature, and overmature. Each signal is 16-bit, with a sampling frequency of 44.1 kHz and 132,300 time-series data points. From the 129 coconut samples gathered, there are 8 premature, 36 mature, and 85 overmature samples. About 66% of the samples are overmature coconuts, more than half of the total samples gathered. The coconut samples were gathered during the postharvest season, resulting in the dataset's significant imbalance across maturity levels, which is beyond the scope of the study. There is also an insufficient amount of available data prior to data generation, specifically for premature and mature samples. This dataset serves as the primary dataset from which more data are reproduced using data generation methods, the proposed approach to solving the imbalance of the dataset.
2.2. Audio data generation framework

Presented in Fig. 1 is the proposed audio data generation framework. It follows a systematic approach for conducting data analytics in data generation for acoustic signals and serves as the basis of the study for understanding and analyzing acoustic signals to make informed decisions. The steps in the framework are summarized as follows: (1) pre-processing, (2) data generation, and (3) data validation. Note that the methods and steps in this section are implemented using Python libraries: Librosa, NumPy, SciPy, and scikit-learn.
Before proceeding to audio data generation, pre-processing is conducted first. This involves combining the acoustic signals, feature extraction, clustering, and cleaning the data per maturity level. Pre-processing techniques such as feature extraction, data clustering, and data cleaning before data generation are important to improve the dataset's quality (Maharana, et al., 2022). Artificially generated data primarily affect the accuracy, prediction, and classification abilities of the learning models, which is why ensuring that the dataset is of high quality is crucial (Maharana, et al., 2022). A minimal sketch of the feature extraction used throughout this stage is given below.

124 ensuring that the dataset is of quality is crucial (Maharana, et al., 2022).


Figure 1. Audio data generation framework

Firstly, the initial dataset underwent initial cleaning, which is summarized in Table 1. Both the missing data and the wrongly labeled data were deleted. The initial dataset is composed of 6.20%, 27.91%, and 65.89% premature, mature, and overmature samples, respectively, and all maturity levels have equal audio duration. There is a clear domination of samples under the overmature level and an imbalance of the overall initial dataset.
Table 1. Summary of initial cleaning of the dataset

| Samples | Error type | Correction |
|---|---|---|
| C-4, C-45 | Wrong labels | Deleted |
| C-128, C-132 | Missing data | Deleted |

In the previous study by Caladcad et al. (2020), each audio signal is treated as one sample; three knocks are made on the three ridges of one coconut, resulting in three audio signals per coconut sample. However, it is not possible to cluster the data per ridge knock and, consequently, not possible to proceed with audio data generation. To remedy this, combination methods like extending and summing are explored. Extending is defined as simply merging audio signals into one by mere extension, while summing is adding the values of all audio signals relative to the time signal (Pulakka & Alku, 2011). As a result of the extending method, the three audio signals under one coconut sample were merged. Shown in Fig. 2 are two samples under the premature level as a result of extending. Extending also lengthened the audio duration from three seconds to nine seconds, since one audio signal is three seconds long for each ridge knock. On the other hand, the audio duration when audio signals were summed remained the same, as shown in Fig. 3. A minimal sketch of both combining operations is given after the figures.

Figure 2. Two samples of acoustic signals by extending ridges under premature level
Figure 3. Two samples of acoustic signals by summing ridges under mature level
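The sketch below shows both combining methods applied to the three ridge-knock recordings of one coconut sample; the file names are hypothetical placeholders.

```python
# Illustrative sketch of the two combining methods (assumed file names).
import librosa
import numpy as np

paths = ["ridge1.wav", "ridge2.wav", "ridge3.wav"]  # hypothetical file names
signals = [librosa.load(p, sr=44100)[0] for p in paths]

# Extending: concatenate the three 3-second signals into one 9-second signal.
extended = np.concatenate(signals)

# Summing: add the signals element-wise; the duration stays at 3 seconds.
summed = np.sum(np.stack(signals), axis=0)

print(extended.shape, summed.shape)  # (396900,) (132300,)
```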
After combining, the next step is to cluster the data to form clear distinctions between the data points of the maturity levels. Data clustering was done through feature engineering using the following features: (1) spectrogram (Ngo, et al., 2020; Dennis, et al., 2011; Harmanny, et al., 2014), (2) Mel Filterbank Energy (MFE) (Tak, et al., 2017; Madikeri & Murthy, 2011), and (3) Mel Frequency Cepstral Coefficient (MFCC) (Paul S, et al., 2021; A, et al., 2018; KS, et al., 2021). Data clustering and feature extraction go together in this process since the features were used for clustering the data. The combining method paired with the extracted feature that created clusters is used in the succeeding steps of data generation; a sketch of this feature-plus-clustering step is given below.
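The following sketch pairs the MFCC feature of the summed signals with a three-cluster grouping, one cluster per maturity level. The use of k-means and of time-averaged MFCC vectors is an assumption for illustration; the study does not name its clustering algorithm.

```python
# Illustrative sketch: MFCC vectors of the summed signals grouped into three
# clusters (one per maturity level). k-means is an assumed choice here.
import librosa
import numpy as np
from sklearn.cluster import KMeans

def mfcc_vector(signal, sr=44100, n_mfcc=13):
    # Average MFCCs over time to get one fixed-length vector per sample.
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

# summed_signals: list of summed ridge signals, one per coconut sample.
X = np.array([mfcc_vector(s) for s in summed_signals])
labels = KMeans(n_clusters=3, random_state=0, n_init=10).fit_predict(X)
```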

Shown in Fig. 4 is the result of clustered data points using the spectrogram feature and the extending method. As presented, no clusters formed using these two methods. Data points overlapped against each other, forming no distinction between the maturity levels. Consequently, the spectrogram feature and extending method were not used to generate more data for this type of dataset.
Figure 4. Data clustering using the spectrogram feature of extended ridges

On the other hand, shown in Figs. 5 and 6 are the data clustering results using the MFE and MFCC features, respectively, with the summing method. Fig. 5 showed similar results to Fig. 4 in that the data points could not be clustered according to their maturity levels. The same results appeared when the time-series, MFE, and MFCC feature extraction methods were paired with the extending method, as well as when the time-series and spectrogram feature extraction methods were paired with the summing method. However, the use of the MFCC feature with summed ridges shows a clear clustering of the three maturity levels. In the figure, Cluster 0 indicates the premature samples, Cluster 1 the mature samples, and Cluster 2 the overmature samples. With this, the summed ridges and MFCC feature are used for the succeeding processes.

Figure 5. Data clustering using the MFE feature of summed ridges
Figure 6. Data clustering using the MFCC feature of summed ridges

The last process under the pre-processing step is data cleaning. This is the removal of outliers for each cluster to preserve the quality of the samples prior to the implementation of data generation methods. Outliers are those data points that deviate away from their cluster. This also prevents inconsistencies that could reduce the quality of the newly generated dataset and avoids erroneous conclusions and classifications (Maharana, et al., 2022). Summarized in Table 2 is the comparison of the number of data samples per maturity level between the initial dataset and the processed data after pre-processing. In the table, only 1 sample remained under the premature level, which hinders the process of generating more data. To remedy this, false data points are added under the premature level for the purpose of data generation only. These data points are samples that do not belong to the premature level but falsely cluster with it. Note that the false data points are removed after generating more data; they are not included during the validation of the dataset quality and the further processes. After the pre-processing step, there are 11 premature, 13 mature, and 21 overmature samples to be used for data generation. A sketch of the outlier filtering is given after Table 2.
Table 2. Summary of original vs. processed data after pre-processing

| Maturity level | Initial dataset | Cleaned data | False data points | Processed total |
|---|---|---|---|---|
| Premature | 8 | 1 | 10 | 11 |
| Mature | 36 | 13 | 0 | 13 |
| Overmature | 85 | 21 | 0 | 21 |
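As a sketch of the outlier filtering described above: samples whose MFCC vector lies far from its cluster centroid are dropped. The 2-standard-deviation threshold is an assumption; the study does not state its outlier criterion.

```python
# Illustrative outlier filter (assumed threshold): keep only samples within
# 2 standard deviations of the mean within-cluster centroid distance.
import numpy as np

def remove_outliers(X, labels, centroids, z_max=2.0):
    keep = np.ones(len(X), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        # Distance of each cluster member to its centroid.
        d = np.linalg.norm(X[idx] - centroids[c], axis=1)
        keep[idx] = d <= d.mean() + z_max * d.std()
    return keep

# X and labels come from the clustering step; centroids from the fitted KMeans.
```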

Two different audio data generation methods were explored: data synthesis and data augmentation. For data synthesis, the available speech synthesizer repository from the study of Kong et al. (2021) was adopted, along with an autoencoder (AE) and a variational autoencoder (VAE) (Anwar, 2021). Kong et al. (2021) proposed DiffWave, a high-quality neural vocoder and waveform synthesizer that uses a Markov chain to convert a white noise signal into a structured waveform. AE is a special type of neural network model that learns efficient codings of unlabeled data to ignore signal noise (Bandyopadhyay, 2021), while VAE is similar to AE but addresses the non-regularized latent space (Anwar, 2021). Both have been successfully applied to vast amounts of unlabeled data as prominent data synthesizers (Tschannen, et al., 2018). A minimal sketch of the AE idea is given below.
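A minimal PyTorch sketch of the autoencoder idea, loosely mirroring the sizes listed in Table 3 (hidden_size = 256, AdamW with lr = 0.001 and weight_decay = 0.01). This is an assumed reconstruction for illustration, not the repository code; the fixed_length of 300 frames is also an assumption.

```python
# Minimal autoencoder sketch (assumed reconstruction, not the repository code).
import torch
import torch.nn as nn

class AudioAE(nn.Module):
    def __init__(self, input_size, hidden_size=256):
        super().__init__()
        # Encoder compresses the flattened feature matrix to a small code.
        self.encoder = nn.Sequential(nn.Linear(input_size, hidden_size), nn.ReLU())
        # Decoder reconstructs the input from the code.
        self.decoder = nn.Linear(hidden_size, input_size)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AudioAE(input_size=300 * 128)  # fixed_length * 128, as in Table 3
optimizer = torch.optim.AdamW(model.parameters(), lr=0.001, weight_decay=0.01)
loss_fn = nn.MSELoss()  # reconstruction loss, driven toward zero in training
```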

Applying DiffWave to the processed dataset presented in Table 2, shown in Table 3 is the summary of the parameters used in training the model. The model only generated 11 premature and 13 mature synthetic data points, the same amounts as in the processed dataset in Table 2, thus only doubling the samples for the premature and mature levels. All of the generated audio was noise instead of the expected knocking sounds. Several parameter changes, specifically to the noise schedule, did not improve the generated audio at all. With these results, the speech synthesizer from the study of Kong et al. (2021) failed to generate additional quality data for this dataset and, therefore, cannot be used for the learning models.

Proceeding to other data synthesis methods, two synthesizers available in the public repository were investigated: AE and VAE. The parameters used for both methods are summarized in Table 3, and their loss graphs are presented in Figs. 7 and 8, respectively. From the table, both models failed to generate quality synthetic data. This is further shown by the loss values in the loss graphs. In Fig. 7, the loss values of the AE model did not approach zero at all. Similarly, the loss values of the VAE did not approach zero; the values behave erratically, with sporadic changes across the epochs.
215 Table 3. Summary of parameters and results used for all data synthesizers

Combining method Training parameters Evaluation


batch_size=4
ot

learning_rate=2e-4
sample_rate=44100
DiffWave (Kong, et al., 2021) opt = torch.optim.Adam(model.parameters() Failed
residual_channels=64
tn

dilation_cycle_length=10
batch_size = 64
input_size = fixed_length * 128
hidden_size = 256
output_size = fixed_length * 128
rin

Autoencoder Failed
optimizer = optim.AdamW(model.parameters()
lr=0.001
weight_decay=0.01)
num_epochs = 200
input_dim = 13 * 300
ep

latent_dim = 64
Variational autoencoder optimizer = optim.Adam(vae.parameters() Failed
lr=0.001
num_epochs = 1000
216
Figure 7. Loss graph of the AE model

Figure 8. Loss graph of the VAE model

From the three deep generative models from both speech and nonspeech repositories, the quality of the produced data does not qualify as additional data to increase the initial dataset. The use of DiffWave from Kong et al. (2021) generated synthetic data that are all static noise. Likewise, no usable data was synthesized with the AE and VAE models because their loss values never converged to zero. With this, data synthesis is not fit as a method to generate additional quality data from the knocking sounds of coconut fruits as the basis for classifying the fruits' maturity levels.
Under data augmentation, two methods were examined, namely audiomentation and procedural generation. Pitch, audio duration, and background noise are some of the audio signal features manipulated in the audiomentation method to generate more data (Salamon & Bello, 2017). On the other hand, the procedural generation method utilizes frequency filters; it creates more data by manipulating the frequencies of the audio signals using these filters (Lundberg, 2020). Both methods exploit the original data with minor changes to increase the volume and diversity of the training set, which is why they are referred to as data augmentation methods. All deformation techniques and filters used, with their corresponding parameters, are summarized in Table 4, and a minimal sketch of both augmentation styles follows the table.

Table 4. Deformation techniques and filters used for audiomentation and procedural audio generation

| Data augmentation method | Parameters |
|---|---|
| Audiomentation | stretch_factor = random.uniform(0.8, 1.2); shift_factor = random.randint(-1000, 1000); pitch_factor = random.randint(-3, 3); compression_factor = random.uniform(0.1, 0.5); noise_factor = random.uniform(0, 0.05); shift_factor = random.uniform(-0.1, 0.1); filter_factor = random.randint(10, 90); audio_sep = harmonic_percussive_separation((audio, sr)); audio_vibrato = vibrato((audio, sr)) |
| Procedural audio generation | apply_time_varying_lowpass_filter(audio_data, sr): filter_order = 6, window_size = 1024, overlap = 0.5; butter_lowpass(cutoff, fs, order=5): nyq = 0.5 * fs, normal_cutoff = cutoff / nyq; butter_lowpass_filter(data, cutoff, fs, order=5): b, a = butter_lowpass(cutoff, fs, order=order) |
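The sketch below illustrates both augmentation styles under the Table 4 parameter ranges. The exact pipeline composition and the 4 kHz cutoff are assumptions for illustration.

```python
# Illustrative sketch of the two augmentation styles (assumed composition).
import random
import librosa
import numpy as np
from scipy.signal import butter, filtfilt

def audiomentation(signal, sr=44100):
    # Random time stretch, pitch shift, and additive noise, as in Table 4.
    signal = librosa.effects.time_stretch(signal, rate=random.uniform(0.8, 1.2))
    signal = librosa.effects.pitch_shift(signal, sr=sr, n_steps=random.randint(-3, 3))
    return signal + random.uniform(0, 0.05) * np.random.randn(len(signal))

def procedural_lowpass(signal, sr=44100, cutoff=4000.0, order=6):
    # Butterworth low-pass filter on the normalized cutoff, as in Table 4.
    b, a = butter(order, cutoff / (0.5 * sr), btype="low")
    return filtfilt(b, a, signal)
```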

A summary of the generated data from both the audiomentation and procedural generation methods, compared with the initial dataset, is shown in Table 5. Each method produced 2,025 data samples under the premature level and 2,025 under the mature level, for a total of 4,050 per level, with both methods producing an equal number of samples. Each method also produced 2,925 data samples for the overmature level, for a total of 5,850. The total amount of data under each method is 6,975, with a grand total of 13,950 data samples. Note that the false data points have already been removed and are not included in these numbers. Compared to the initial dataset, the samples per maturity level are almost balanced, with only the overmature level having more samples than the other two.

Table 5. Augmented data using data augmentation methods in comparison with the initial dataset

| Maturity level | Audiomentation | Procedural audio generation | Total | Initial dataset (Caladcad, et al., 2020) |
|---|---|---|---|---|
| Premature | 2,025 | 2,025 | 4,050 | 24 |
| Mature | 2,025 | 2,025 | 4,050 | 108 |
| Overmature | 2,925 | 2,925 | 5,850 | 255 |
| Total | 6,975 | 6,975 | 13,950 | 387 |

Figure 9. Loss graph of the audio classification module

The augmented data were then processed to validate their quality and trainability. For validation, one-dimensional convolution, LSTM, and linear layers are used as the audio classification module (a sketch of such a module is given below). In Fig. 9, the loss graph during the training phase of the module is presented. With 20 epochs and a batch size of 32, the model slowly approaches a loss value of 0, achieving its lowest loss value of 0.0862 at epoch 16. Although the loss slowly increased at the end of the training phase, the module still achieved a training accuracy of 96.12%, testing accuracy of 93.98%, average precision of 93.32%, and average recall of 93.72%, which supports the quality of the dataset. The newly generated dataset is then used as data input to improve the classifying algorithms.
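A sketch of an audio classification module with the three named layer types follows; the channel widths, kernel size, and single-layer depth are assumptions for illustration, not the study's tuned architecture.

```python
# Illustrative audio classification module: 1-D convolution + LSTM + linear.
import torch
import torch.nn as nn

class AudioClassifier(nn.Module):
    def __init__(self, n_mfcc=13, n_classes=3):
        super().__init__()
        self.conv = nn.Conv1d(n_mfcc, 32, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):             # x: (batch, n_mfcc, n_frames)
        x = torch.relu(self.conv(x))  # (batch, 32, n_frames)
        x = x.transpose(1, 2)         # (batch, n_frames, 32) for the LSTM
        _, (h, _) = self.lstm(x)      # h: (1, batch, 64), last hidden state
        return self.fc(h[-1])         # (batch, n_classes) class logits
```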

2.3. Classification module implementation

One previously mentioned limitation of this study is the amount of data available prior to data generation. There are only 8 premature samples and 36 mature samples against the 85 overmature samples in the initial dataset. When divided, there is a small amount of data for the testing set, which may not be enough to represent the data distribution. With that, the newly generated dataset serves as the data input to the proposed pipeline, as shown in Fig. 10. The three algorithms from the study of Caladcad et al. (2020), namely artificial neural network (ANN), support vector machine (SVM), and random forest (RF), are still used. The pipeline comprises training and testing phases. Ten-fold cross-validation is still implemented with a 90/10 data partition, in which 90% of the newly generated dataset is used for the training phase and 10% for the testing phase. The machine learning algorithms are then evaluated in the testing phase using the following metrics:
\[ \text{Accuracy} = \frac{\text{correctly classified samples}}{\text{total number of samples}} \qquad (1) \]

\[ F_1\text{-score} = \frac{2}{\frac{1}{\text{precision}} + \frac{1}{\text{recall}}} \qquad (2) \]

The accuracy is the ratio of correctly classified samples to the total number of samples, as shown in Eq. 1, while the F1-score is the harmonic mean of precision and recall, as shown in Eq. 2. Additionally, a normalized confusion matrix shows the actual versus the predicted classification performance of each model. The performance of the machine learning algorithms with the augmented data is compared to their previous performance in the study of Caladcad et al. (2020) without the addition of augmented data. The same parameters from the study of Caladcad et al. (2020) are implemented in the machine learning algorithms for a parallel comparison. An illustrative sketch of this evaluation loop is given below.
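The sketch below runs the ten-fold evaluation with scikit-learn and reports the two metrics above; the RF hyperparameters are placeholders, not the tuned values of Caladcad et al. (2020).

```python
# Illustrative ten-fold evaluation loop with the reported metrics.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score, f1_score

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):  # X: MFCC features, y: labels
    model = RandomForestClassifier().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    print(accuracy_score(y[test_idx], pred),
          f1_score(y[test_idx], pred, average="weighted"))
```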


Figure 10. Proposed learning algorithm pipeline

3. Results and discussion

The addition of the augmented data increases the number of samples and balances out, if not completely, the samples across the three maturity levels. The ML models are evaluated with the newly generated dataset as their data input to assess the improvement of the classifying models in predicting the maturity levels of the coconut. Presented in Fig. 11 are the confusion matrices of ANN, RF, and SVM with the new dataset. The figure shows the ratio of each model's predicted maturity level to the actual maturity level. The ANN model correctly classified 97.10%, 88.64%, and 91.62% of the premature, mature, and overmature samples, respectively. The RF model correctly predicted the maturity level of 94.44% of premature, 93.68% of mature, and 98.63% of overmature samples. Lastly, the SVM correctly classified 91.06% of premature, 75.76% of mature, and 89.92% of overmature samples.


Figure 11. Confusion matrices using the newly generated dataset of (a) ANN, (b) RF, and (c) SVM

Comparing the above results to the study of Caladcad et al. (2020), a summary is presented in Table 6. The table shows the percentages of correctly classified samples per maturity level by the ML models with and without the addition of generated data. A significant difference in the performance of the ML models can be observed. For instance, the ANN model before data generation could only classify less than 45% of both premature and mature samples. The RF model prior to data generation could only classify 25% of premature samples and 59% of mature samples, while SVM could only predict 38% of samples from both maturity levels. The improvement in the models' performance for classifying premature and mature samples ranges from 34.68% to 69.44%. On the other hand, only in classifying overmature samples did the models before data generation outperform the models trained and tested with the newly generated dataset. This is because the majority of the samples in the initial dataset were overmature, so the models could distinctively classify overmature samples against premature and mature samples. Nonetheless, the results of the ML models with the newly generated dataset were not far from the previous results; the difference only ranges from 1.37% to 10.08% and cannot be deemed significant.

Table 6. Performance comparison of ML models with and without data generation for correctly classified samples

| ML Models | Premature (%) | Mature (%) | Overmature (%) |
|---|---|---|---|
| ANN* | 38.00 | 44.00 | 100.00 |
| ANN | 97.10 | 88.64 | 91.62 |
| RF* | 25.00 | 59.00 | 100.00 |
| RF | 94.44 | 93.68 | 98.63 |
| SVM* | 38.00 | 38.00 | 100.00 |
| SVM | 91.06 | 75.76 | 89.92 |

*The results are taken from the study of Caladcad et al. (2020) using the dataset without generated data.

The ML models are also evaluated on their accuracies and F1-scores. The results are summarized in Table 7, along with the performance of the ML models before data generation. Both the accuracy and F1-score of ANN with the generated dataset are 92%, and both are 96% for RF. The SVM model has the lowest accuracy and F1-score at 87% and 86%, respectively, compared to the other models. However, when compared against the ML models' performance before data generation, all models' accuracy and F1-score greatly improved. For accuracy, the percentage difference ranges from 7% to 12.52%, with the RF model having the greatest improvement and SVM the least among the three models. For the F1-score, the difference ranges from 9.33% to 14.65%, with the RF model still having the greatest improvement and SVM the least.
317 greatest improvement and SVM remaining to have the least difference.
ep

318 Table 7. Comparison of ML models with and without data generation using performance indicators

With newly generated


ML Models Performance Indicators Caladcad et al. (2020)*
dataset
Pr

Accuracy (%) 81.74 92.00


ANN
F1-score (%) 79.27 92.00

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=4864841
Accuracy (%) 83.48 96.00

ed
RF
F1-score (%) 81.35 96.00
Accuracy (%) 80.00 87.00
SVM
F1-score (%) 76.67 86.00
319 *The results are taken from the study of Caladcad et al. (2020) using the dataset without generated data.

iew
320 Generally, the performance of the ML models in classifying coconut samples to their maturity levels

321 significantly improved with the use of the newly generated dataset. The balanced distribution and the

322 increase in the number of samples across maturity levels greatly contributed to such improvement. Unlike

ev
323 in the previous study, all three ML models can distinctly classify not just the overmature samples, but all

324 samples in the three maturity levels this time. Still, even with the addition of generated data, the RF model

r
325 continues to outperform both ANN and SVM models in terms of their overall performance. It has the highest

326 accuracy at 96% and F1-score at 96% among the three ML models.

327 4. Conclusion
er
pe
The study investigated several data generation methods to increase the existing dataset of coconut acoustic signals and improve the performance of ML models. Data synthesizers failed to produce quality synthetic data, while data augmentation techniques successfully generated quality data that was added to the initial dataset. With the application of data generation, the dataset increased to about 35 times the size of the original dataset. The performance of all three ML models significantly increased over the prior study when data augmentation was implemented. This serves as the basis for integrating ML in developing a noninvasive classification system for coconut fruits for mass exportation.

References

A, S., Thomas, A., & Mathew, D. (2018). Study of MFCC and IHC Feature Extraction Methods With Probabilistic Acoustic Models for Speaker Biometric Applications. Procedia Computer Science, 143, 267-276. doi:10.1016/j.procs.2018.10.395

Anwar, A. (2021). Difference between AutoEncoder (AE) and Variational AutoEncoder (VAE). Retrieved May 2023, from https://towardsdatascience.com/difference-between-autoencoder-ae-and-variational-autoencoder-vae-ed7be1c038f2

Assen, M., Lee, S. J., & De Cecco, C. N. (2020). Artificial intelligence from A to Z: From neural network to legal framework. European Journal of Radiology, 129, 109083.

Awan, A. (2022). A Complete Guide to Data Augmentation. Retrieved May 2023, from https://www.datacamp.com/tutorial/complete-guide-data-augmentation

Bandyopadhyay, H. (2021). Autoencoders in Deep Learning: Tutorial & Use Cases. Retrieved May 2023, from https://www.v7labs.com/blog/autoencoders-guide#:~:text=An%20autoencoder%20is%20an%20unsupervised,even%20generation%20of%20image%20data.

Behera, S., Rath, A., Mahapatra, A., & Sethy, P. (2020). Identification, classification & grading of fruits using machine learning & computer intelligence: a review. Journal of Ambient Intelligence and Humanized Computing. doi:10.1007/s12652-020-01865-8

Binkowski, M., Donahue, J., Dieleman, S., Clark, A., Elsen, E., Casagrande, N., . . . Simonyan, K. (2020). High Fidelity Speech Synthesis with Adversarial Networks. In ICLR. doi:10.48550/arXiv.1909.11646

Bryan, N. J. (2020). Impulse Response Data Augmentation and Deep Neural Networks for Blind Room Acoustic Parameter Estimation. ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). doi:10.1109/ICASSP40776.2020.9052970

Burns, D., Johnston, E.-L., & Walker, M. J. (2020). Authenticity and the Potability of Coconut Water - a Critical Review. Journal of AOAC INTERNATIONAL. doi:10.1093/jaocint/qsz008

Caladcad, J., Cabahug, S., Catamco, M., Villaceran, P., Cosgafa, L., Cabizares, K., . . . Piedad, E. (2020). Determining Philippine coconut maturity level using machine learning algorithms based on acoustic signal. Computers and Electronics in Agriculture, 172, 105327. doi:10.1016/j.compag.2020.105327

Caladcad, J., Piedad, E., Cabahug, S., Catamco, M., Villaceran, P., Cosgafa, L., . . . Hermosilla, M. (2022). Acoustic Signal Dataset: Tall Coconut Fruit Species. Mendeley Data. doi:10.17632/hxh8kd3snj.1

Chen, X., Zhou, G., Chen, A., Pu, L., & Chen, W. (2021). The fruit classification algorithm based on the multi-optimization convolutional neural network. Multimedia Tools and Applications, 20, 11313-11330.

Chotirat, S., & Meesad, P. (2021). Part-of-Speech tagging enhancement to natural language processing for Thai wh-question classification with deep learning. Heliyon, 7, e08216. doi:10.1016/j.heliyon.2021.e08216

Chowanda, A., Iswanto, I., & Andangsari, E. (2023). Exploring deep learning algorithm to model emotions recognition from speech. Procedia Computer Science, 216, 706-713. doi:10.1016/j.procs.2022.12.187

Dennis, J., Tran, H., & Li, H. (2011). Spectrogram Image Feature for Sound Event Classification in Mismatched Conditions. IEEE Signal Processing Letters, 18(2), 130-133. doi:10.1109/LSP.2010.2100380

ElMasry, G., Cubero, S., Molto, E., & Blasco, J. (2012). In-line sorting of irregular potatoes by using automated computer-based machine vision system. Journal of Food Engineering, 112(1-2), 30-38.

Elmasry, G., Kamruzzaman, M., Sun, D.-W., & Allen, P. (2012). Principles and Applications of Hyperspectral Imaging in Quality Evaluation of Agro-Food Products: A Review. Critical Reviews in Food Science and Nutrition, 52(11), 999-1023.

Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., & Simonyan, K. (2017). Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders. In Proceedings of the 34th International Conference on Machine Learning, 70, 1068-1077.

Gatchalian, M. M., & De Leon, S. (1994). Measurement of Young Coconut (Cocos nucifera, L.) Maturity by Sound Waves. Journal of Food Engineering, 23, 253-276.

Ghatas, Y., Fayek, M., & Hadhoud, M. (2022). A hybrid deep learning approach for musical difficulty estimation of piano symbolic music. Alexandria Engineering Journal, 61(12), 10183-10196. doi:10.1016/j.aej.2022.03.060

Hameed, K., Chai, D., & Rassau, A. (2018). A comprehensive review of fruit and vegetable classification techniques. Image and Vision Computing. doi:10.1016/j.imavis.2018.09.016

Harmanny, R., de Wit, J., & Prémel Cabic, G. (2014). Radar Micro-Doppler Feature Extraction Using the Spectrogram and the Cepstrogram. In Proceedings of 2014 11th European Radar Conference. doi:10.1109/eurad.2014.6991233

Himanen, L., Geurts, A., Foster, A., & Rinke, P. (2019). Data-Driven Materials Science: Status, Challenges, and Perspectives. Advanced Science 2019. doi:10.1002/advs.201900808

Hizlisoy, S., Yildirim, S., & Tufekci, Z. (2021). Music emotion recognition using convolutional long short term memory deep neural networks. Engineering Science and Technology, an International Journal, 24(3), 760-767. doi:10.1016/j.jestch.2020.10.009

Janiesch, C., Zschech, P., & Heinrich, K. (2021). Machine learning and deep learning. Electronic Markets, 31, 685-695.

Javel, I. M., Bandala, A. A., Salvador, R. C., Bedruz, R. R., Dadios, E. P., & Vicerra, R. P. (2018). Coconut Fruit Maturity Classification using Fuzzy Logic. 2018 IEEE 10th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM), 1-6. doi:10.1109/HNICEM.2018.8666231

Kong, Z., Ping, W., Huang, J., Zhao, K., & Catanzaro, B. (2021). DiffWave: A Versatile Diffusion Model for Audio Synthesis. Audio and Speech Processing. doi:10.48550/arXiv.2009.09761

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in neural information processing systems (NIPS), 1097-1105.

KS, D., MD, R., & G, S. (2021). Comparative performance analysis for speech digit recognition based on MFCC and vector quantization. Global Transitions Proceedings, 2(2), 513-519. doi:10.1016/j.gltp.2021.08.013

Le, T.-T., Lin, C.-Y., & Piedad, E. (2019). Deep learning for noninvasive classification of clustered horticultural crops – A case for banana fruit tiers. Postharvest Biology and Technology, 156, 110922.

Lundberg, A. (2020). Data-Driven Procedural Audio: Procedural Engine Sounds Using Neural Audio Synthesis. Stockholm, Sweden: KTH Royal Institute of Technology.

Madikeri, S. R., & Murthy, H. A. (2011). Mel Filter Bank Energy-Based Slope Feature and Its Application to Speaker Recognition. In Proceedings of 2011 National Conference on Communications (NCC). doi:10.1109/ncc.2011.5734713

Maguolo, G., Paci, M., Nanni, L., & Bonan, L. (2021). Audiogmenter: a MATLAB toolbox for audio data augmentation. Applied Computing and Informatics. doi:10.1108/ACI-03-2021-0064

Maharana, K., Mondal, S., & Nemade, B. (2022). A review: Data pre-processing and data augmentation techniques. Global Transitions Proceedings, 3(1), 91-99. doi:10.1016/j.gltp.2022.04.020

Mahayothee, B., Koomyart, I., Khuwijitjaru, P., Siriwongwilaichat, P., Nagle, M., & Müller, J. (2016). Phenolic Compounds, Antioxidant Activity, and Medium Chain Fatty Acids Profiles of Coconut Water and Meat at Different Maturity Stages. International Journal of Food Properties, 19(9), 2041-2051.

Moreno, M. L., Kuwornu, J. K., & Szabo, S. (2020). Overview and Constraints of the Coconut Supply Chain in the Philippines. International Journal of Fruit Science, 20(sup2), 1-18.

Nallanthighal, V., Mostaani, Z., Härmä, A., Strik, H., & Magimai-Doss, M. (2021). Deep learning architectures for estimating breathing signal and respiratory parameters from speech recordings. Neural Networks, 141, 211-224. doi:10.1016/j.neunet.2021.03.029

Nanni, L., Maguolo, G., & Paci, M. (2020). Data augmentation approaches for improving animal audio classification. Ecological Informatics, 57, 101084. doi:10.1016/j.ecoinf.2020.101084

Ngo, D., Hoang, H., Nguyen, A., Ly, T., & Pham, L. (2020). Sound Context Classification Basing on Join Learning Model and Multi-Spectrogram Features. Sound. doi:10.48550/arXiv.2005.12779

Ning, C., & You, F. (2019). Optimization under Uncertainty in the Era of Big Data and Deep Learning: When Machine Learning Meets Mathematical Programming. Computers & Chemical Engineering, 125, 434-448.

Park, N., Mohammadi, M., Gorde, K., Jajodia, S., Park, H., & Kim, Y. (2018). Data Synthesis based on Generative Adversarial Networks. Databases.

Pascua, A. M. (2017). Impact Damage Threshold of Young Coconut (Cocos nucifera L.). International Journal of Advances in Agricultural Science and Technology, 4(11), 1-9.

Paul S, B., Glittas, A., & Gopalakrishnan, L. (2021). A low latency modular-level deeply integrated MFCC feature extraction architecture for speech recognition. Integration, 76, 69-75. doi:10.1016/j.vlsi.2020.09.002

Pendyala, V. S., Yadav, N., Kulkarni, C., & Vadlamudi, L. (2022). Towards building a Deep Learning based Automated Indian Classical Music Tutor for the Masses. Systems and Soft Computing, 4, 200042. doi:10.1016/j.sasc.2022.200042

Philippine Coconut Authority. (2018). Coconut Statistics. Retrieved May 2022, from https://pca.gov.ph/index.php/resources/coconut-statistics

Pulakka, H., & Alku, P. (2011). Bandwidth Extension of Telephone Speech Using a Neural Network and a Filter Bank Implementation for Highband Mel Spectrum. IEEE Transactions on Audio, Speech, and Language Processing, 19(7), 2170-2183. doi:10.1109/tasl.2011.2118206

Salamon, J., & Bello, J. (2017). Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification. IEEE Signal Processing Letters, 24(3), 279-283. doi:10.1109/LSP.2017.2657381

Shaked, S. (2023). Why Use Synthetic Data vs Real Data? Retrieved May 2023, from https://www.datomize.com/why-use-synthetic-data-versus-real-data/

Shaoa, S., Wang, P., & Yan, R. (2019). Generative adversarial networks for data augmentation in machine fault diagnosis. Computers in Industry, 106, 85-93. doi:10.1016/j.compind.2019.01.001

Shinde, P. P., & Shah, S. (2018). A Review of Machine Learning and Deep Learning Applications. In Proceedings of the 2018 Fourth International Conference on Computing Communication Control and Automation, 1-6. doi:10.1109/ICCUBEA.2018.8697857

Siriwardena, Y. M., Attia, A., Sivaraman, G., & Espy-Wilson, C. (2022). Audio Data Augmentation for Acoustic-to-Articulatory Speech Inversion using Bidirectional Gated RNNs. Audio and Speech Processing.

Sun, Y., Maeda, T., Solis-Lemus, C., Pimentel-Alarcon, D., & Burivalova, Z. (2022). Classification of animal sounds in a hyperdiverse rainforest using convolutional neural networks with data augmentation. Ecological Indicators, 145, 109621. doi:10.1016/j.ecolind.2022.109621

Tak, R. N., Agrawal, D. M., & Patil, H. A. (2017). Novel Phase Encoded Mel Filterbank Energies for Environmental Sound Classification. Pattern Recognition and Machine Intelligence, 317-325. doi:10.1007/978-3-319-69900-4_40

Terdwongworakul, A., Chaiyapong, S., Jarimopas, B., & Meeklangsaen, W. (2009). Physical properties of fresh young Thai coconut for maturity sorting. Biosystems Engineering, 103, 208-216.

Tschannen, M., Bachem, O., & Lucic, M. (2018). Recent Advances in Autoencoder-Based Representation Learning. Machine Learning. doi:10.48550/arXiv.1812.05069

Wang, Q., Velasco, L., Breitung, B., & Presser, V. (2021). High-Entropy Energy Materials in the Age of Big Data: A Critical Guide to Next-Generation Synthesis and Applications. Advanced Energy Materials, 11(47). doi:10.1002/aenm.202102355

Xia, X.-J., Ling, Z.-H., Jiang, Y., & Dai, L.-R. (2014). HMM-based unit selection speech synthesis using log likelihood ratios derived from perceptual data. Speech Communication, 63-64, 27-37. doi:10.1016/j.specom.2014.04.002

Yamamoto, R., Song, E., & Kim, J.-M. (2020). Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram. Audio and Speech Processing. doi:10.48550/arXiv.1910.11480

Yasar, K., & Laskowski, N. (2023). Synthetic data. Retrieved May 2023, from https://www.techtarget.com/searchcio/definition/synthetic-data#:~:text=Synthetic%20data%20is%20information%20that's,machine%20learning%20(ML)%20models.

Yu, Y., Luo, S., Liu, S., Qiao, H., Liu, Y., & Feng, L. (2020). Deep attention based music genre classification. Neurocomputing, 372, 84-91. doi:10.1016/j.neucom.2019.09.054

Yuan, B., Xie, H., Wang, Z., Xu, Y., Zhang, H., Liu, J., . . . Wu, J. (2023). The domain-separation language network dynamics in resting state support its flexible functional segregation and integration during language and speech processing. NeuroImage, 274, 120132. doi:10.1016/j.neuroimage.2023.120132

Zhang, B., Huang, W., Li, J., Zhao, C., Fan, S., Wu, J., & Liu, C. (2014). Principles, developments and applications of computer vision for external quality inspection of fruits and vegetables: A review. Food Research International, 62, 326-343.

Zheng, T., Song, L., Wang, J., Teng, W., Xu, X., & Ma, C. (2021). Data synthesis using dual discriminator conditional generative adversarial networks for imbalanced fault diagnosis of rolling bearings. Measurement, 158, 107741. doi:10.1016/j.measurement.2020.107741