Sampling, Removal of Silence and Noise in Audio Signal
A PROJECT REPORT
Submitted by
N.S.SWETHA (211420106266)
S.THANUSHA (211420106268)
R.SOWMYA (211420106244)
of
BACHELOR OF ENGINEERING
In
NOVEMBER 2022
ACKNOWLEDGEMENT
We would also like to express our gratitude to our internal guide, Mrs.
R. RAJALAKSHMI, Professor, Electronics and Communication Engineering, for
her valuable guidance, ideas and encouragement for the successful completion of the
project.
Audio sampling
Digital audio uses pulse-code modulation (PCM) and digital signals for
sound reproduction. This includes analog-to-digital conversion (ADC), digital-to-
analog conversion (DAC), storage, and transmission. In effect, the system
commonly referred to as digital is in fact a discrete-time, discrete-level analog of a
previous electrical analog. While modern systems can be quite subtle in their
methods, the primary usefulness of a digital system is the ability to store, retrieve
and transmit signals without any loss of quality.
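The "discrete-time, discrete-level" description can be illustrated with a minimal MATLAB sketch of uniform PCM quantization. The 1 kHz test tone, the 8 kHz sampling rate and the 8-bit word length are assumptions chosen for illustration, not values taken from this project.

```matlab
% Minimal PCM sketch: sample a tone, then round each sample to one of 2^nbits levels.
fs    = 8000;                     % assumed sampling rate in Hz
nbits = 8;                        % assumed word length in bits
t     = (0:fs-1)' / fs;           % one second of sample instants (discrete time)
x     = 0.9 * sin(2*pi*1000*t);   % assumed 1 kHz test tone, within [-1, 1)

step  = 2 / 2^nbits;              % quantization step for a full-scale range of [-1, 1)
codes = round(x / step);          % integer PCM codes (discrete level)
codes = min(max(codes, -2^(nbits-1)), 2^(nbits-1) - 1);  % clip to the signed code range
xq    = codes * step;             % reconstructed (quantized) amplitudes

max_err = max(abs(x - xq));       % rounding error is bounded by half a step
fprintf('step = %g, max quantization error = %g\n', step, max_err);
```

Once the samples are stored as integer codes, they can be copied or transmitted exactly; the only error is the rounding introduced at conversion time.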
When it is necessary to capture audio covering the entire 20–20,000 Hz
range of human hearing, such as when recording music or many types of acoustic
events, audio waveforms are typically sampled at 44.1 kHz (CD), 48 kHz,
88.2 kHz, or 96 kHz. The approximately double-rate requirement is a consequence
of the Nyquist theorem. Sampling rates higher than about 50 kHz to 60 kHz cannot
supply more usable information for human listeners. Early professional
audio equipment manufacturers chose sampling rates in the region of 40 to 50 kHz
for this reason.
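The double-rate requirement can be checked numerically. The sketch below, using assumed test-tone frequencies, computes the frequency at which a sampled tone appears: tones below half the sampling rate keep their frequency, while tones above it fold back into the lower band.

```matlab
% Apparent frequency of a sampled tone (fold-back about multiples of fs).
fs      = 44100;                               % CD sampling rate in Hz
f_in    = [1000 20000 30000];                  % assumed test tones in Hz
f_alias = abs(f_in - fs * round(f_in / fs));   % frequency seen after sampling

for k = 1:numel(f_in)
    fprintf('%5d Hz sampled at %d Hz appears at %5d Hz\n', f_in(k), fs, f_alias(k));
end
```

The 1 kHz and 20 kHz tones are unchanged, but the 30 kHz tone appears at 14.1 kHz, which is why frequencies above half the sampling rate must be filtered out before conversion.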
There has been an industry trend towards sampling rates well beyond the
basic requirements, such as 96 kHz and even 192 kHz. Even
though ultrasonic frequencies are inaudible to humans, recording and mixing at
higher sampling rates is effective in eliminating the distortion that can be caused
by foldback aliasing. Conversely, ultrasonic sounds may interact with and
modulate the audible part of the frequency spectrum (intermodulation
distortion), degrading the fidelity. One advantage of higher sampling rates is that
they can relax the low-pass filter design requirements for ADCs and DACs, but
with modern oversampling sigma-delta converters this advantage is less important.
The Audio Engineering Society recommends 48 kHz sampling rate for most
applications but gives recognition to 44.1 kHz for Compact Disc (CD) and other
consumer uses, 32 kHz for transmission-related applications, and 96 kHz for
higher bandwidth or relaxed anti-aliasing filtering. Both Lavry Engineering and J.
Robert Stuart state that the ideal sampling rate would be about 60 kHz, but since
this is not a standard frequency, recommend 88.2 or 96 kHz for recording
purposes.
Speech sampling
Speech signals, i.e., signals intended to carry only human speech, can usually be
sampled at a much lower rate. For most phonemes, almost all of the energy is
contained in the 100 Hz–4 kHz range, allowing a sampling rate of 8 kHz. This is
the sampling rate used by nearly all telephony systems, which use
the G.711 sampling and quantization specifications.
The silence removal block eliminates the unvoiced and silent portions of the
speech signal. For this purpose, the input signal is divided into small segments (frames),
and the root mean square (RMS) of each segment is calculated and
compared with a threshold value. The length of each segment in samples is
equal to the product of the segment duration and the sampling frequency:
$$\mathrm{Segment}_{\mathrm{length}} = \mathrm{Segment}_{\mathrm{duration}} \times F_s \tag{2}$$
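The framing and thresholding steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's exact implementation: the synthetic test signal, the 25 ms segment duration and the RMS threshold of 0.05 are illustrative values.

```matlab
% RMS-based silence removal on a synthetic signal (tone, silence, tone).
fs   = 8000;                                  % assumed sampling rate in Hz
t    = (0:fs/2-1)' / fs;                      % 0.5 s of sample instants
tone = 0.8 * sin(2*pi*440*t);                 % assumed "voiced" segment
data = [tone; zeros(fs/2, 1); tone];          % 0.5 s tone, 0.5 s silence, 0.5 s tone

seg_dur = 0.025;                              % assumed segment duration in seconds
seg_len = round(seg_dur * fs);                % segment length = duration x Fs, Eq. (2)
n_seg   = floor(length(data) / seg_len);      % number of whole segments
thresh  = 0.05;                               % assumed RMS threshold

% One segment per column, then the RMS of every segment.
frames  = reshape(data(1 : n_seg*seg_len), seg_len, n_seg);
seg_rms = sqrt(mean(frames.^2, 1));

% Keep only the segments whose RMS exceeds the threshold.
keep    = seg_rms > thresh;
output  = reshape(frames(:, keep), [], 1);

fprintf('kept %d of %d segments\n', sum(keep), n_seg);
```

For a real recording, the threshold usually has to be tuned to the noise floor: too low and background noise is kept, too high and quiet speech is cut.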
OPTIMIZE SYNCHRONIZATION
Different computers, different synchronization sources (internal or SMPTE
code), different tape machines, and—in theory—different samplers or hard disk
recording systems will exhibit slight variations in clock speed. Changing just
one component can lead to a loss of synchronization between recorded audio
material and MIDI. This is particularly applicable to long audio regions.
This is another situation where the Remove Silence function can help, by
creating several shorter audio regions, with more trigger points between the
audio and MIDI events.
For example, you can use this method to roughly split up a whole audio file,
and then divide the new regions, using different parameters. The new regions
can then be processed again with the Remove Silence function.
OPTIMIZE FILES AND REGIONS
You can use Remove Silence to automatically create regions from an audio file
that contains silent passages, such as a single vocal take that runs the length of a
project. The unused regions or portions of the audio file can be deleted, saving
hard disk space, and simplifying (file and) region management.
NOISE REDUCTION
Noise reduction is the process of removing noise from a signal. Noise
reduction techniques exist for audio and images. Noise reduction algorithms
may distort the signal to some degree. Noise rejection is the ability of a circuit to
isolate an undesired signal component from the desired signal component, as
with common-mode rejection ratio.
All signal processing devices, both analog and digital, have traits that make
them susceptible to noise. Noise can be random with an even frequency
distribution (white noise), or frequency-dependent noise introduced by a device's
mechanism or signal processing algorithms.
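As a simple illustration of reducing white noise, the sketch below smooths a noisy tone with a moving-average filter. The moving average is a stand-in chosen for brevity, not the method used in this project, and the tone frequency, noise level and filter length are assumed values. Consistent with the remark that noise reduction may distort the signal, the filter also attenuates the tone slightly.

```matlab
% Moving-average smoothing of a tone corrupted by white noise.
fs    = 8000;                          % assumed sampling rate in Hz
t     = (0:fs-1)' / fs;                % one second of sample instants
clean = sin(2*pi*200*t);               % assumed 200 Hz test tone
noisy = clean + 0.2 * randn(size(t));  % add white Gaussian noise (assumed level)

M = 5;                                 % assumed filter length in samples
h = ones(M, 1) / M;                    % moving-average impulse response
denoised = conv(noisy, h, 'same');     % centered output, same length as the input

mse_before = mean((noisy    - clean).^2);   % error power before filtering
mse_after  = mean((denoised - clean).^2);   % error power after filtering
fprintf('MSE before = %.4f, after = %.4f\n', mse_before, mse_after);
```

Averaging M samples lowers the white-noise power by roughly a factor of M, at the cost of attenuating higher signal frequencies, so the filter length is a trade-off between noise suppression and fidelity.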
In electronic systems, a major type of noise is hiss created by random
electron motion due to thermal agitation. These agitated electrons rapidly add and
subtract from the output signal and thus create detectable noise.
In the case of photographic film and magnetic tape, noise (both visible and
audible) is introduced due to the grain structure of the medium. In photographic
film, the size of the grains in the film determines the film's sensitivity, more
sensitive film having larger-sized grains. In magnetic tape, the larger the grains of
the magnetic particles (usually ferric oxide or magnetite), the more prone the
medium is to noise. To compensate for this, larger areas of film or magnetic tape
may be used to lower the noise to an acceptable level.
MATLAB PROGRAM :
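clear all; close all; clc
% REMOVAL OF SILENCE AND NOISE IN SPEECH SIGNAL
% read sound (keep the first channel if the file is stereo)
[data, fs] = audioread('count.wav');
data = data(:,1);
sound(data, fs);
% plot the input signal
figure(1);
plot(data); hold on;
% normalize data to a peak amplitude of 1
data = data / max(abs(data));
% split the signal into non-overlapping 25 ms frames
fd = 0.025;                           % frame duration in seconds
f_size = round(fd * fs);              % samples per frame
n = floor(length(data) / f_size);     % number of whole frames
frames = zeros(n, f_size);            % one frame per row
temp = 0;
for i = 1 : n
    frames(i,:) = data(temp + 1 : temp + f_size);
    temp = temp + f_size;
end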
GRAPH OUTPUT :
[Figures: sampling; removal of silence and noise, showing the input signal and the output signal]
RESULT :
Hence, the program has been successfully executed and the output is verified.