Automatic Classification of Singing Voice Quality

This paper discusses the automatic classification of singing voice quality using a database of singer recordings and a feature vector derived from voice parameters. It employs artificial neural networks (ANNs) and rough sets for classification, analyzing both trained and untrained singers. The study emphasizes the importance of specific parameters, particularly related to formant frequencies and amplitudes, in distinguishing different voice qualities and types.



Conference Paper · October 2005

DOI: 10.1109/ISDA.2005.28 · Source: IEEE Xplore


Automatic Classification of Singing Voice Quality

Bozena Kostek and Pawel Zwan

Multimedia Systems Department
Gdańsk University of Technology
{bozenka, zwan}@sound.eti.pg.gda.pl

Abstract

This paper presents problems related to the classification of singing voice quality. For this purpose a database consisting of singers’ sample recordings is constructed and parameters are extracted from the recorded voices of trained and untrained singers. The parameterization process is based on both voice source and formant analyses of a singing voice. These parameters are explained in terms of their physical interpretation and analyzed statistically in order to reduce their number. The statistical analysis is based on the Fisher statistic. In this way a feature vector of a singing voice is formed. Decision systems based on neural networks and on rough sets are utilized in the context of voice type and voice quality classification. Results obtained in the automatic classification performed by both decision systems are compared. The possibility of classifying voice type/voice quality automatically is assessed and conclusions are derived. The methodology proposed provides a way of discerning trained and untrained singers.

1. Introduction

In order to classify singing voice sounds automatically, a parametric description is necessary. Among the many envisioned applications of automatic singing voice recognition, automatic media content indexing and automatic classification of singing voice quality seem to be the most interesting ones. Therefore it is important to develop a system which could classify voice type/quality on the basis of recorded voice samples. The complicated biomechanics of the singing voice [8], [10] is the reason why an intelligent classification system should be utilized. In addition, a singing voice, although its generation mechanism is the same as in speech, is a form of artistic expression, thus some new description parameters have to be developed and examined. The implementation of a feature vector containing both MPEG-7-based parameters and some new ones especially designed for the singing voice signal is the aim of the first part of the paper. In the second part of the paper, decision systems based on artificial neural networks (ANNs) and rough sets are utilized for classification purposes. The effectiveness of the decision systems is determined and the possibility of automatic singing voice quality/type classification is discussed.

2. Parameterization

2.1 Parameters Related to the Singing Voice Generation Analysis

An output vocal signal is a convolution of the glottal source and the vocal tract impulse response, therefore singing voice parameters can be divided into parameters related to the glottal source and parameters related to the vocal tract (formant amplitudes and frequencies). In this respect it is necessary to deconvolve the glottal source waveform from the output signal by using inverse filtration. That idea rests on the assumption of linearity of the transmission filter and may lead to some simplifications of the complicated voice biomechanics [8]. For that purpose the LPC method is commonly used. Determining the prediction coefficients enables the approximation of the vocal tract spectrum characteristics. The relation is given by Eq. (1):

$$P_A\left(e^{j\omega}\right) = \frac{1}{1 + \sum_{n=1}^{N-1} a_n \exp(-jn\omega)} \qquad (1)$$

where the coefficients $a_n$ are determined on the basis of the autoregressive model, and $N$ is the number of prediction coefficients. The resolution of the analysis is determined by the LPC order.
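The relation in Eq. (1) can be illustrated with a minimal Python sketch. The paper does not say how the coefficients are estimated, so the autocorrelation method used here is an assumption, as are the function names `lpc_coefficients` and `lpc_envelope` and the synthetic test signal. The sketch uses `order` coefficients a_1 ... a_order in the sum and returns the magnitude of the all-pole response.

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Estimate a_1..a_order with the autocorrelation method (an assumed choice).

    Minimizing the prediction error e[n] = x[n] + sum_k a_k x[n-k]
    leads to the normal equations R a = -r.
    """
    x = np.asarray(signal, dtype=float)
    # Autocorrelation lags r[0..order]
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Symmetric Toeplitz matrix built from lags 0..order-1
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(r[idx], -r[1 : order + 1])

def lpc_envelope(a, omega):
    """Magnitude of 1 / (1 + sum_n a_n exp(-j n omega)) at each omega [rad/sample]."""
    n = np.arange(1, len(a) + 1)
    denominator = 1.0 + np.exp(-1j * np.outer(omega, n)) @ a
    return 1.0 / np.abs(denominator)

# Synthetic check (hypothetical data): a second-order autoregressive signal
# with a single resonance at theta rad/sample.
rng = np.random.default_rng(0)
theta = 2 * np.pi * 0.1
a_true = np.array([-2 * 0.95 * np.cos(theta), 0.95 ** 2])
noise = rng.standard_normal(4000)
x = np.zeros(noise.size)
for i in range(2, x.size):
    x[i] = noise[i] - a_true[0] * x[i - 1] - a_true[1] * x[i - 2]

a_est = lpc_coefficients(x, order=2)
omega = np.linspace(0.0, np.pi, 512)
envelope = lpc_envelope(a_est, omega)
peak = omega[np.argmax(envelope)]
print("estimated coefficients:", a_est)
print("envelope peak [rad/sample]:", peak, "resonance:", theta)
```

For a real recording, the peaks of such an envelope would serve as formant candidates; a higher `order` resolves more peaks, which is the sense in which the LPC order sets the resolution of the analysis.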
The approximation of the transfer function enables formant frequency and amplitude calculation. It is particularly important in the case of the ‘singer’s formant’ analysis presented in the literature [1], [6], [7], [8]. The singer’s formant in the frequency band of 3-3.5 kHz is perceived as a strong energy band with an amplitude related to vocal quality. Therefore the frequencies and amplitudes of each of the four formants (F1-F4, f < 10 kHz) are calculated and included in the feature vector. The statistical analysis shows that the singer’s formant amplitude is crucial in the context of vocal quality/vocal type recognition.

A high level of high-frequency components (f > 2 kHz) is typical for trained singers (e.g. it allows a singer to be heard well over the accompaniment of an orchestra), thus an adequate parameter corresponding to the high-frequency content should be calculated. This can be done by using the so-called Perceptual Linear Prediction (PLP) method [2]. In this method a vocal signal is analyzed in Bark frequency bands, resulting in 25 frequency coefficients that represent the energy of the signal in critical bands. In this way lower and higher formants are separated and clearly seen. Since no formant is present in the frequency band of 1.5-2.5 kHz, the high/low frequency division point was set to 2 kHz. A ratio of high to low band energy, PLPHL, was calculated by summing up the energy of the corresponding coefficients (see Eq. 2), and a second parameter, PLPmax, was calculated as the ratio between the maximum coefficient value for the band f > 2 kHz and the corresponding one for the band f < 2 kHz (see Eq. 3):

$$PLP_{HL} = \frac{\sum_{i=N_0}^{25} PLP(i)}{\sum_{i=1}^{N_0-1} PLP(i)} \qquad (2)$$

$$PLP_{max} = \frac{\max_{N_0 < i \le 25} PLP(i)}{\max_{1 < i < N_0} PLP(i)} \qquad (3)$$

where $N_0$ is the index of the PLP coefficient corresponding to the 2 kHz division point.

Furthermore the FFT spectrum was parameterized in the same way, resulting in an FFTHL parameter expressed as the ratio between spectral energy for high and low frequencies (division frequency 2 kHz).

Parameterization of the glottal source seems to be a more difficult task. The inverse filtration method does not give satisfactory results for glottal source parameter estimation due to the lack of knowledge about the phase spectrum of the vocal tract from LPC prediction [10]. In order to calculate glottal signal parameters, or at least some parameters correlated with them, a spectral analysis with a large overlap should be used, focusing on micro-changes of the formant amplitudes. In the experiments a spectrogram of 8 cycles was used. The analyzing window size was set to 16 in order to achieve a fine time resolution and a satisfactory spectrum resolution of 2.7 kHz. By analyzing changes of energies in four bands (1-2.7 kHz, 2.7-5.5 kHz, 5.5-8.2 kHz, 8.2-11 kHz) the following parameters were calculated:
- Knmax (Knmin) is the value of the first maximum (minimum) of the autocorrelation function of the changes of energy within band n (n denotes the number of the frequency band). The higher the value, the more periodic the changes of spectral energy in the band;
- KnPr is calculated by using a threshold equal to the mean energy in the band and is expressed as the number of coefficients higher than the threshold. For higher frequencies the value of this parameter is related to the open glottis coefficient, because higher resonances are present in the signal when the glottis is open (sub-glottal resonances [8]).

2.2 MPEG-7 Based Parameters

Another way of determining adequate parameters is the use of a more general signal description such as the standardized MPEG-7 audio content parameters. Although they are not related to the singing voice biomechanics, they can be analyzed statistically in order to see whether they are useful in the singing voice recognition process. A detailed description of the MPEG-7 parameters [5], [9] is beyond the scope of the paper, thus only a short description is given. The MPEG-7 audio parameters can be divided into the following groups:
- ASE (Audio Spectrum Envelope) - describes the short-term power spectrum of the waveform with a logarithmic frequency axis in ¼-octave resolution bands between 62.5 Hz and 16 kHz (frequencies below 62.5 Hz and above 16 kHz are expressed as two additional coefficients). The mean values and variances of each coefficient over time are denoted as ASE1…ASE34 and ASEv1…ASEv34 respectively.
- ASC (Audio Spectrum Centroid) - describes the center of gravity of the log-frequency power spectrum. The mean value and the variance of the spectrum centroid over time are denoted as ASC and ASCv respectively.
- ASS (Audio Spectrum Spread) - defined as the RMS deviation of the log-frequency power spectrum with respect to
its center of gravity. The mean value and the variance over time are denoted as ASS and ASSv respectively.
- SFM (Spectral Flatness Measure) - calculated for each frequency band. The mean values and the variances of each SFM over time are denoted as SFM1…SFM24 and SFMv1…SFMv24 respectively.
- Parameters related to discrete harmonic values: HSC (Harmonic Spectral Centroid), HSD (Harmonic Spectral Deviation), HSS (Harmonic Spectral Spread), HSV (Harmonic Spectral Variation).

2.3 Additional Harmonic-Related Parameters

As presented in the literature [6], [8], the levels of the first harmonics change for different voice types/qualities, so some parameters related to harmonics and their changes over time may be useful. Parameters of that type are not a part of the MPEG-7 standard, thus other parameters were adopted. They are [3]:
- Ev - content of even harmonics in the spectrum,
- First, Second and Third Tristimulus parameters (Tri1, Tri2, Tri3),
- Pitch expressed as a KeyNum,
- Brightness (br),
- Descriptors related to the behavior of harmonic components in the time-frequency domain:
  a) Mean value of differences between amplitudes of a harmonic in adjacent time frames (sn, where n is the number of a harmonic)
  b) Mean value of amplitudes Ah of a harmonic over time (mn, where n is the number of a harmonic)
  c) Standard deviation of amplitudes Ah of a harmonic over time (mdn, where n is the number of a harmonic)

3. Experiments

3.1 Statistical Analysis

In the experiments 1060 singing voice samples were parameterized using all the parameters mentioned above. This resulted in 259 extracted parameters for each sample. The analyzed sounds consist of recordings of voices of 16 trained and untrained singers (8 sopranos, 5 altos, 3 tenors). Voice recordings were made in the recording studio of the Gdansk University of Technology. Subjects were divided into three groups: professional singers (graduates of the Vocal Faculty of the Music Academy in Gdansk), Music Academy students (other than Vocal Faculty students, e.g. choir members) and amateur members of the Choir of the Gdansk University of Technology. For all recorded sounds all 259 parameters were extracted and then analyzed statistically by using the Fisher statistic (FS) [4]. In order to reduce the feature vector size, parameters with the highest Fisher statistic values were chosen (25 best parameters for every class pair). This operation resulted in 46 parameters in the feature vector in the case of quality classification (Eq. 4) and 52 parameters for voice type recognition (Eq. 5). It should be remembered that the greater the FS value of a chosen parameter for two classes, the better the separability of objects within these classes.

feature_quality = {ASCv, ASE21, ASE22, ASE23, ASE24, ASE25, ASE26, ASE27, ASE30, ASE31, ASEv16, ASEv23, ASEv24, ASEv25, ASEv29, ASEv5, ASEv6, ASS, ASSv, F2, F2/F1, F2cz, HSCv, HSD, HSS, HSSv, K4max, PLPmax, SCv, SFM12, SFM13, SFM15, SFM16, SFM17, SFM19, SFM20, SFM5, SFMv12, SFMv13, SFMv8, br, m1, m2, md1, md2, s1, s2}; (4)

feature_type = {ASC, ASCv, ASE10, ASE17, ASE21, ASE22, ASE23, ASE26, ASE27, ASE31, ASE8, ASE9, ASEv13, ASEv24, ASEv25, ASSv, F1, F2/F1, F2cz, F34/F1, HSDv, K2Pr, K2min, K4Pr, K4min, KeyNum, SC, SFM10, SFM11, SFM12, SFM15, SFM16, SFM22, SFMv10, SFMv12, SFMv13, SFMv14, SFMv15, SFMv16, SFMv17, SFMv19, SFMv20, SFMv22, SFMv4, SFMv6, br, m1, md1, md10, md2, s1, s2}; (5)

In Table 2 values of the Fisher statistic for the 8 best parameters for each of the 3 pairs of recognized classes are presented in the context of vocal quality. For the professional-amateur pair the winning parameters are mostly spectrum- and LPC-based parameters related to the singer’s formant sub-band (F2, ASE22, ASE24, ASE23, SFM16, PLPVal) (see Table 1 for the frequency bands related to the spectral parameters having the highest Fisher statistic values), but it can be observed that K4max, related to the regularity of the formant modulation changes, and md2, related to the second harmonic, also have high values.

Table 1 Spectral parameters with the highest Fisher statistic value and corresponding frequency bands

Parameter   Freq. band [Hz]
ASE22       2000 - 2378
ASE23       2378 - 2800
ASEv24      2800 - 3360
SFMv15      2820 - 3140
SFMv16      3360 - 4000
SFMv19      5656 - 6720
SFMv20      6720 - 8000
F3          around 3 kHz

Table 2 Best parameters and the corresponding Fisher statistic values for all pairs of recognized singing voice quality classes

Prof. - amateurs      Prof. - MA students   MA students - amateurs
Param.    FS          Param.    FS          Param.    FS
F2        27.11       SFM16     16.89       ASE25     10.77
PLPHL     22.34       F2        16.46       ASEv24    9.46
ASE24     22.11       ASE23     16.24       F2        9.31
ASE23     20.61       ASE24     15.85       PLPmax    8.34
SFM16     19.81       K4max     15.57       SFM17     8.29
K4max     16.32       ASEv16    14.71       ASEv23    8.26
md2       15.91       PLPmax    14.46       br        7.91
ASE22     15.45       ASCv      13.41       ASEv5     7.85

The same parameters are winning for the professional - MA (Music Academy) student pair, but the corresponding Fisher statistic values are smaller. For the MA student - amateur pair the values of the Fisher statistic are the lowest, and some additional parameters (e.g. Brightness) seem to be important in the recognition process.

In order to check whether the parameters are correlated, a correlation analysis was performed and a mean correlation value for each parameter and each sound was calculated (Fig. 1) according to Eq. (6). Parameter K4max had the lowest correlation value, while the spectral parameters are strongly correlated with the others. So K4max, although it does not have the highest Fisher statistic value, seems to be very important because it carries additional non-spectral information.

$$K_l = \sum_{k=1}^{K} \; \sum_{j=1,\, j \neq l}^{N_{param}-1} \mathrm{corr}\left(p_{kl}, p_{kj}\right) \qquad (6)$$

where $K$ is the number of sounds, $N_{param}$ is the number of parameters, and corr(x,y) is the cross-correlation function.

Similarly, the same analysis procedure was performed for the voice type classes. The results obtained are presented in Table 3. As may be seen, the best parameters differ from those obtained for the quality classes. The importance of harmonic parameters (md1, md2, m1) and time parameters (K4Pr) can be seen. Fisher statistic values are high for the ‘tenor-soprano’ and ‘tenor-alto’ class pairs (tenors have the strongest singer’s formant, to which high values of the ASE23 parameter correspond), while the values of the Fisher statistic are the lowest for soprano-alto recognition.

Figure 1 Mean correlation values obtained for best parameters according to the Fisher statistic analysis (vertical axis: mean correlation value)

Table 3 Best parameters and the corresponding Fisher statistic values for all pairs of recognized singing voice type classes

Soprano - tenor       Alto - tenor          Soprano - alto
Param.    FS          Param.    FS          Param.    FS
ASE23     20.16       ASE23     27.48       ASC       7.83
K4Pr      14.97       ASE22     20.21       ASE21     7.72
SFMv20    14.47       ASEv24    18.17       SC        6.67
md1       14.16       SFMv16    18.03       SFM12     6.07
ASE32     14.15       SFMv15    15.89       ASE27     5.4
m1        13.77       ASSv      15.66       SFMv10    5.23
md2       13.69       br        15.81       SFM15     5.1
br        13.32       SFMv14    15.01       SFMv6     5.09

3.2 ANN Classification

Feature vectors were divided into two equal sets. The first one was used for ANN training. The remaining feature vectors were used to test the generalization performance of the network and to calculate recognition effectiveness. The ANN was a feed-forward network. In the first layer the number of neurons was equal to the number of parameters, and the hidden layer was experimentally set to 20 neurons. In the output layer the number of neurons was equal to the number of recognized classes (3 classes in both cases, quality and voice type recognition). During the learning process a validation test was used and the training was stopped if the validation error was rising
for 100 cycles. The training of the ANN went quite smoothly; around 200 to 300 cycles were sufficient for this phase. The activation function was sigmoid.

In Tables 4 - 7 results for singing voice quality and type classification are presented. In the case of vocal quality classification all 1060 sounds were used; in the case of vocal type recognition the number of derived feature vectors was lower, because sounds of the poorest singers were not used (in some cases of amateurs it was hard to judge whether a vocalist was a soprano or an alto singer). Output classes were denoted ‘professionals’, ‘MA students’, ‘amateurs’ for quality, and ‘soprano’, ‘alto’ and ‘tenor’ for voice type.

Table 4 Results obtained for the singing voice type recognition (number of recognized sounds) by the ANN

In \ out    Soprano   Alto   Tenor
Soprano     76        1      0
Alto        5         54     0
Tenor       0         0      67

Table 5 Singing voice type recognition (ANN)

            No. of sounds   Errors   Accuracy [%]
Soprano     77              1        91.81
Alto        59              5        79.17
Tenor       67              0        100
Total       203             6        97.04

Table 6 Results obtained for singing voice quality classification (number of recognized sounds) by the ANN

In \ out        Professionals   MA students   Amateurs
Professionals   157             11            3
MA students     19              152           21
Amateurs        3               28            137

Table 7 Singing voice quality classification (ANN)

               No. of sounds   Errors   Accuracy [%]
Professional   171             14       91.81
MA student     192             40       79.17
Amateur        168             31       81.55
Total          531             85       84.00

Recognition of the voice type seemed to be an easier task. For over 200 testing sounds only 6 were incorrectly recognized, resulting in a total effectiveness of 97% (100% accuracy was obtained in tenor recognition). Soprano and alto sounds were also well recognized (only 6 errors) although low Fisher statistic values were obtained for those classes. An explanation is that the parameters contained in the feature vector were not correlated and carried supplementary information, which in sum gave a good separation of the recognized classes.

In the case of vocal quality classification a lower effectiveness was obtained (84% on average), but most of the errors occurred between professionals and Music Academy students. Some of the sounds sung by them were of good quality, as also happened for an amateur singer. It is important to observe that only 3 errors occurred in recognition between professional and amateur singers (2% of the total number of sounds).

3.3 Rough-Set-Based Classification

In order to check the recognition effectiveness, RSES, the rough set decision system, was used [11]. Feature vectors were divided into training and testing sets. Parameters were quantized according to the RSES system principles. Local discretization was used. In Tables 8 and 9 the RSES-based decision system recognition results are presented. In the case of quality classification the number of rules extracted was 7550, the minimum length of a rule was 2, and the maximum length was 10. An example of the rules obtained is given below:

If ASSv=(-Inf,-0.701) and SFM16=(-Inf,-0.080) and F2=(0.153,Inf) and K4max=(-Inf,-0.747) then Quality=GOOD

On the other hand, in the case of singing voice type recognition, the number of rules derived was 6900, the minimum length was 2 and the maximum was 12.

Table 8 Results obtained for singing voice quality classification (number of recognized sounds) by the RSES system

In \ out        Professionals   MA students   Amateurs
Professionals   157             15            2
MA students     35              110           58
Amateurs        8               34            114

Table 9 Results obtained for the singing voice type classification (number of recognized sounds) by the RSES system

In \ out        Professionals   MA students   Amateurs
Professionals   73              6             0
MA students     7               51            3
Amateurs        2               6             57
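The per-class and total accuracies quoted for the decision systems follow directly from the confusion-matrix counts. As a minimal sketch (the function name `accuracy_from_confusion` is hypothetical and not part of the paper), the Python snippet below recomputes them from the counts printed in Table 6 for the ANN quality classification, taking rows as the true classes and columns as the recognized classes.

```python
import numpy as np

def accuracy_from_confusion(confusion):
    """Return (per-class accuracy [%], total accuracy [%]) for a square confusion matrix.

    Rows are assumed to be true classes, columns recognized classes.
    """
    confusion = np.asarray(confusion, dtype=float)
    correct = np.diag(confusion)
    per_class = 100.0 * correct / confusion.sum(axis=1)
    total = 100.0 * correct.sum() / confusion.sum()
    return per_class, total

# Counts copied from Table 6 (ANN, singing voice quality).
table6 = [
    [157, 11, 3],    # professionals
    [19, 152, 21],   # MA students
    [3, 28, 137],    # amateurs
]

per_class, total = accuracy_from_confusion(table6)
for name, acc in zip(["professionals", "MA students", "amateurs"], per_class):
    print(f"{name}: {acc:.2f}%")
print(f"total: {total:.2f}%")
```

The same function can be applied to the counts of Tables 4, 8 and 9 to compare the two decision systems on an equal footing.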
The recognition effectiveness obtained is lower than in the case of ANNs. The average recognition effectiveness for the rough set decision system is 71% for quality and 87% for voice type recognition. The ANN recognition results were 84% and 98% respectively.

The great advantage of the rough-set system is that decision rules can be reviewed by the experimenter conducting the tests and their content can be easily analyzed. At this stage of the experiments it is also worth mentioning that rules derived from the rough set-based analysis may be compared to subjective test results. Subjective evaluation is very important in quality judgment. Thus the further work proposed is an attempt to find a correlation between attributes named by experts and parameters extracted from singing voices. This can be done by analyzing the rough set-based knowledge induction obtained from the decision system. Moreover, the quantization procedure results can also be compared with the ones obtained by experts in a more subjective way. In this way the whole classification process would gain a subjective justification. Such a scheme of experiments was already tested on musical samples and seemed well adapted to music-related studies. This is one of the advantages of the rough set analysis over the neural network-based classification, which does not allow for carrying out such a quality comparison.

4. Conclusions

The results derived from the statistical analysis show that high values of the Fisher statistic may not be sufficient for a good separation between classes, and conversely low values may indicate better parameters if the correlation analysis performed on the same data supports this. The best example is the K4max parameter, which has lower Fisher statistic values in comparison to the spectral parameters, but is much less correlated with other parameters.

The experiments carried out show good effectiveness of the ANN-based singing voice recognition system (especially in the case of singing voice type recognition: 98%). The accuracy of 84% attained by the system in the case of singing voice quality recognition can be explained by the fact that sometimes it is difficult to classify a singer into one class. Amateur singers sometimes manage to sing a vowel well (rather by chance) and conversely some vowels from professionals are not perfect. In the experiments carried out there were only 3 errors in the recognition between professional and amateur singers (2% of the total number of sounds). On the other hand, the effectiveness of the system could be drastically improved if more than one sung vowel were analyzed, since occasional recognition errors between neighboring classes would then not influence the total quality judgment.

Results obtained with the RSES decision system are less optimistic. This can be explained by the complexity of the problem related to the choice of parameters and their quantization. A non-linear neural system managed to analyze the parametrized data properly. In the experiments performed, the methodology proposed utilized the Fisher statistic and the correlation analyses aimed at reducing the number of parameters used in the classification process. Although such procedures are easily justified with neural networks, they can be a drawback when using a rough set system. The rough set-based analysis has the ability to search for significant attributes, reducts and core, thus in future experiments all 259 parameters should be contained in decision tables rather than reducing their number before the analysis starts.

5. References

[1] BLOOTHOOFT G., “The sound level of the singer’s formant in professional singing”, J. Acoust. Soc. Am. 79 (6), pp. 2028-2032, 1986.
[2] HERMANSKY H., “Perceptual Linear Predictive (PLP) Analysis of Speech”, J. Acoust. Soc. Am., pp. 1738-1752, April 1990.
[3] KOSTEK B., CZYZEWSKI A., “Representing Musical Instrument Sounds for Their Automatic Classification”, J. Audio Eng. Soc., 49 (9), pp. 768-785, 2001.
[4] KOSTEK B., Soft Computing in Acoustics, Physica Verlag, New York, Heidelberg, 1999.
[5] KOSTEK B., SZCZUKO P., ŻWAN P., “Processing of Musical Data Employing Rough Sets and Artificial Neural Networks”, in Rough Sets and Current Trends in Computing, RSCTC, Uppsala, Sweden, Lecture Notes in Artificial Intelligence, LNAI 3066, Springer Verlag, Berlin, Heidelberg, New York, pp. 539-548, 2004.
[6] MENDES A., “Acoustic effect of vocal training”, 17th ICA Proceedings, vol. VIII, pp. 106-107, Rome 2001.
[7] ROTHMAN H.B., “Why we don’t like these singers”, 17th ICA Proceedings, vol. VIII, pp. 114-115, Rome 2001.
[8] SUNDBERG J., “The science of the singing voice”, Northern Illinois University Press, DeKalb, Illinois, 1987.
[9] SZCZUKO P., DALKA P., DABROWSKI P., KOSTEK B., “MPEG-7-based Low-Level Descriptor Effectiveness in the Automatic Musical Sound Classification”, 116th Audio Eng. Conv., Preprint No. 6105, Berlin, 2004.
[10] ZWAN P., “Glottal source parametrization”, Proc. of X Symposium on New Trends in Audio Video Technology, Wroclaw, pp. 64-72 (in Polish), 2004.
[11] Rough-set Exploration System ver. 2.2 user manual: logic.mimuw.edu.pl/~rses/RSES_doc_eng.pdf

4 pages
Aggregate Features and A B For Music Classification: DA Oost
No ratings yet
Aggregate Features and A B For Music Classification: DA Oost
12 pages
Acoustic Models For Analysis
No ratings yet
Acoustic Models For Analysis
128 pages
2005 Automatic Music Classification and Summarization
No ratings yet
2005 Automatic Music Classification and Summarization
11 pages
Andy Sun, Maisy Wieman, Analyzing Vocal Patterns To Determine Emotion
No ratings yet
Andy Sun, Maisy Wieman, Analyzing Vocal Patterns To Determine Emotion
5 pages
Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach
No ratings yet
Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach
8 pages
Musical_Genre_Classification_Using_Advanced_Audio_Analysis_and_Deep_Learning_Techniques
No ratings yet
Musical_Genre_Classification_Using_Advanced_Audio_Analysis_and_Deep_Learning_Techniques
11 pages
Perception and Verbalisation of Voice Quality in Western Lyrical Singing: Contribution of A Multidisciplinary Research Group
No ratings yet
Perception and Verbalisation of Voice Quality in Western Lyrical Singing: Contribution of A Multidisciplinary Research Group
10 pages
COMMUNICATION SYSTEMS
From Everand
COMMUNICATION SYSTEMS
B.P. Lathi
No ratings yet
Tarsosdsp, A Real-Time Audio Processing Framework in Java
No ratings yet
Tarsosdsp, A Real-Time Audio Processing Framework in Java
7 pages
Emergence of Scaling in Random Networks
No ratings yet
Emergence of Scaling in Random Networks
5 pages
RankQA - Neural Question Answering With Answer Re-Ranking PDF
No ratings yet
RankQA - Neural Question Answering With Answer Re-Ranking PDF
10 pages
Network Topologies
No ratings yet
Network Topologies
22 pages
Guacamole Recipe - Alton Brown - Food Network
No ratings yet
Guacamole Recipe - Alton Brown - Food Network
1 page
ML4T 2017fall Exam1 Version B
No ratings yet
ML4T 2017fall Exam1 Version B
8 pages
House Rules and Regulations A. General
No ratings yet
House Rules and Regulations A. General
3 pages
Cambridge IGCSE
No ratings yet
Cambridge IGCSE
12 pages
Material Finish Schedule
No ratings yet
Material Finish Schedule
2 pages
Basic Lesson 4
No ratings yet
Basic Lesson 4
33 pages
50NB Tubular Joint Stiffner Connection-03 (DBR) (27-03-2024)
No ratings yet
50NB Tubular Joint Stiffner Connection-03 (DBR) (27-03-2024)
8 pages
Buried Concrete Basement Wall Design
No ratings yet
Buried Concrete Basement Wall Design
10 pages
(A) General Information Details
No ratings yet
(A) General Information Details
3 pages
Advanced Structural PDF
No ratings yet
Advanced Structural PDF
2 pages
Lesson 5 - Neehar Kulkarni
No ratings yet
Lesson 5 - Neehar Kulkarni
8 pages
Paya Lebar Central - A Vibrant Commercial Hub: An Artist Impression of The Future Plaza Next To Paya Lebar MRT Station
No ratings yet
Paya Lebar Central - A Vibrant Commercial Hub: An Artist Impression of The Future Plaza Next To Paya Lebar MRT Station
32 pages
7402
No ratings yet
7402
3 pages
De_kT_cuoi_Hk1_Tieng_Anh_10_nam_hoc_2022-2023_526f5
No ratings yet
De_kT_cuoi_Hk1_Tieng_Anh_10_nam_hoc_2022-2023_526f5
10 pages
Laxmi Nagar Proposal PDF
100% (1)
Laxmi Nagar Proposal PDF
101 pages
The Steam and Condensate Loop - Spirax Sarco
No ratings yet
The Steam and Condensate Loop - Spirax Sarco
16 pages
3187 - N.Y.S.S.'s Datta Meghe College of Engineering, Airoli, Navi Mumbai
No ratings yet
3187 - N.Y.S.S.'s Datta Meghe College of Engineering, Airoli, Navi Mumbai
14 pages
Digital Path Historical Storage Quality Data PDF
No ratings yet
Digital Path Historical Storage Quality Data PDF
10 pages
Resource Allocation Problems
No ratings yet
Resource Allocation Problems
24 pages
TDS - Total - Statermic XHT - HNN - 201412 - en
No ratings yet
TDS - Total - Statermic XHT - HNN - 201412 - en
1 page
Risk Assess T-01 - Fire Evacuation Office Facilities
No ratings yet
Risk Assess T-01 - Fire Evacuation Office Facilities
2 pages
Guia Comunicacion j1939 y Rs 485
No ratings yet
Guia Comunicacion j1939 y Rs 485
14 pages
HOJA DE CALCULO EJEMPLO A. FRÍA (12 18) - Ejemplo
No ratings yet
HOJA DE CALCULO EJEMPLO A. FRÍA (12 18) - Ejemplo
34 pages
MCC List - FF Main Ring
No ratings yet
MCC List - FF Main Ring
3 pages
USS Kobe Steel Case Study
100% (1)
USS Kobe Steel Case Study
10 pages
Rules For The Classification and Construction of Sea-Going Ships
No ratings yet
Rules For The Classification and Construction of Sea-Going Ships
2 pages
D 0003320
0% (1)
D 0003320
8 pages
Two Stage Air Compressor
0% (1)
Two Stage Air Compressor
5 pages
WSN - Project Proposal PDF
No ratings yet
WSN - Project Proposal PDF
10 pages
Fremap Osram Sylvania Catalog
100% (1)
Fremap Osram Sylvania Catalog
22 pages



