The Fast Continuous Wavelet Transformation (FCWT) For Real-Time, High-Quality, Noise-Resistant Time-Frequency Analysis
https://doi.org/10.1038/s43588-021-00183-z
The spectral analysis of signals is currently either dominated by the speed–accuracy trade-off or ignores a signal’s often non-
stationary character. Here we introduce an open-source algorithm to calculate the fast continuous wavelet transform (fCWT).
The parallel environment of fCWT separates scale-independent and scale-dependent operations, while utilizing optimized fast
Fourier transforms that exploit downsampled wavelets. fCWT is benchmarked for speed against eight competitive algorithms,
tested on noise resistance and validated on synthetic electroencephalography and in vivo extracellular local field potential
data. fCWT is shown to have the accuracy of CWT, to have 100 times higher spectral resolution than algorithms equal in speed,
to be 122 times and 34 times faster than the reference and fastest state-of-the-art implementations and we demonstrate its
real-time performance, as confirmed by the real-time analysis ratio. fCWT provides an improved balance between speed and
accuracy, which enables real-time, wide-band, high-quality, time–frequency analysis of non-stationary noisy signals.
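The split between scale-independent and scale-dependent operations mentioned above can be illustrated with a plain FFT-based CWT. The NumPy sketch below is not the fCWT implementation. It omits the downsampled wavelets and the parallelism, uses circular convolution, and assumes a simplified analytic Morlet wavelet. The names `morlet_hat` and `cwt_fft` are hypothetical.

```python
import numpy as np


def morlet_hat(omega, scale, sigma=6.0):
    """Frequency-domain analytic Morlet wavelet at one scale.

    Assumption: simplified form without the small admissibility correction.
    """
    # Gaussian centred where scale * omega == sigma; zero for omega <= 0.
    return np.where(omega > 0, np.exp(-0.5 * (scale * omega - sigma) ** 2), 0.0)


def cwt_fft(signal, fs, freqs, sigma=6.0):
    """Naive FFT-based CWT returning an array of shape (len(freqs), len(signal))."""
    n = signal.size
    # Scale-independent work: one forward FFT of the signal, reused for every scale.
    signal_hat = np.fft.fft(signal)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / fs)

    out = np.empty((len(freqs), n), dtype=complex)
    for i, f in enumerate(freqs):
        # Scale-dependent work: build the wavelet for this scale, multiply, invert.
        scale = sigma / (2.0 * np.pi * f)  # scale whose centre frequency is f
        out[i] = np.fft.ifft(signal_hat * morlet_hat(omega, scale, sigma))
    return out
```

Only the loop body depends on the scale, so it is the part that would be repeated (or parallelized) per frequency, whereas the forward FFT is paid once per signal.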
Signals are essential in both nature and (man-made) technology, because they enable communication1,2 (Fig. 1). Mathematically, a signal is a function of one (for example, speech) or more (for example, a two-dimensional (2D) image) dimensions that carries information about the properties (for example, state) of a physical system3. A source transmits a signal via a channel to a receiver, which delivers it to its destination. For example, a brain sends an oral message via vocal cords through the air, which is received by the listener’s ear, which brings it to the listener’s brain. When the same message is transmitted via a smartphone, the air is complemented by a chain of technology, leaving the rest of the chain untouched. Signals are omnipresent in society3,4 (Fig. 1).

Independent of its source, a signal needs to be processed to enable the generation, transformation, extraction and interpretation of the information it is carrying3. A widely used method to interpret (that is, extract and analyze) repeating patterns in signals is the Fourier transform (FT)3,4. A FT transforms a function of time into a complex-valued function of frequency, representing the magnitudes of the frequencies. The FT assumes the signal is stationary. In other words, it is a stochastic process in which the marginal and joint density functions do not depend on the choice of time origin2. However, in real-world practice, this assumption is often violated. Consequently, the FT is unable to process real-world non-stationary signals reliably5. To circumvent the problem of non-stationarity, advanced algorithms exist that analyze a signal based on their decomposition in elementary signals that are well localized (or boxed) in time and frequency4. These include the short-term Fourier transform (STFT), also known as the Gabor transform, and the wavelet transform (WT)6.

The STFT is very similar to the FT, but it uses a window function and short wavelets localized in both time and frequency, instead of pure waves, to extract temporal and spectral information. The drawback of the STFT is its use of a fixed-width window function, as a result of which frequency analysis is restricted to frequencies with a wavelength close to the window width7. Additionally, chopping up the signal in short, fixed-width windows scrambles the signal’s properties. Accordingly, the frequency analysis is affected8.

The WT overcomes the drawback of the STFT by not relying on a window function. Instead, it uses a family of base functions that dilate and contract with frequency to represent the signal, thereby ensuring high resolution across the entire frequency spectrum. Consequently, the WT suffers from a high computational load. This prohibits its use with low-end hardware and for real-time applications9, as real-time computation requires an algorithmic computation time that is smaller than the signal’s duration.

To reduce the computational burden of the WT, the discrete wavelet transform (DWT) has been proposed, which applies a coarse, logarithmic discretization. This makes DWT suitable for data compression, but simultaneously disqualifies it from use in detailed analysis, as it is not able to analyze intricate time–frequency details8 (as shown in Fig. 2). For this, a true WT, the computationally expensive continuous wavelet transform (CWT), also called an integral wavelet transform (IWT), is needed. CWT offers a high-resolution representation of the time–frequency domain by using near-continuous discretization. Its continuous time and frequency scales better support intricate time–frequency analysis. Consequently, CWT is often described as the mathematical microscope of data analysis10 (Fig. 2).

In this Resource paper we introduce the open-source fast continuous wavelet transform (fCWT), which brings real-time, high-resolution CWT to real-world practice (for example, biosignals11–13, cybersecurity14,15 and renewable energy management16,17; Fig. 1). Next, we assess the performance of fCWT in a benchmark study and then validate the use of fCWT on synthetic, electroencephalography (EEG) and in vivo electrophysiological data. We end with a concise discussion.

Results
The performance of fCWT was benchmarked against six widely used CWT implementations, then it was subjected to a threefold validation on accuracy, resolution and throughput using, respectively, synthetic data, human EEG data and high-density in vivo extracellular rodent electrophysiology.
Department of Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands.
[Fig. 1 panels. fCWT’s advantage: real-time high-resolution signal processing (e.g. classical music) and stress reduction via enhanced noise canceling; reducing risk by real-time machine monitoring and increasing prospecting efficiency 34–120×; real-time BCI and saving lives by continuous, remote, real-time monitoring of the cardiovascular system. Examples: CWT used in gravitational-wave detection by the Laser Interferometer Gravitational-Wave Observatory (LIGO), Hanford, Washington (H1), plotted against time (s); CWT used to measure ground deformation above a gas reservoir, period (yr) against time (year), colorbar shows power (dB); CWT used to measure overall body health remotely during nocturnal body movements, plotted against time (s).]
Fig. 1 | The impact of time–frequency analysis across society. In both nature and technology, signals enable communication, and processing techniques
such as the CWT (also called IWT) are applied throughout. CWT was the primary processing method used in the Laser Interferometer Gravitational-wave
Observatory (LIGO) experiment to detect gravitational waves in highly non-stationary gravitational-wave data. In industry, CWT has been applied to enhance
mineral detection and speech segmentation. CWT also allows the detailed analysis of biosignals such as an electrocardiogram in the medical domain.
BCI, brain–computer interface; BPM, beats per minute. Image credits: (left) adapted with permission from ref. 82, Caltech/MIT/LIGO Laboratory; (center)
adapted from ref. 83 under a CC BY license.
[Fig. 2 image: DWT and CWT spectra of the pulse signal; frequency (kHz) versus time (ms).]

Fig. 2 | Comparison of DWT and CWT. A time-varying pulse signal of a sonar device is analyzed in the range 0–60 kHz using the DWT and the CWT. The DWT uses a coarse time–frequency discretization to favor speed. By contrast, the CWT uses a time-consuming near-continuous discretization of the time and frequency scales to favor resolution.

Benchmark. To benchmark the performance of fCWT we compared fCWT to the six widely used CWT implementations shown in Fig. 3. Because of its widespread use across research, the complex Morlet wavelet (σ = 6) was used to calculate the CWT of three signals, all containing N = 100,000 samples. The Morlet wavelet is defined as a plane wave modulated by a Gaussian envelope. The parameter σ controls the time–frequency resolution trade-off18. The first signal was generated to be non-stationary using a sine wave whose frequency changed linearly from fstart = 1 Hz to fend = 7 Hz. The second and third signals contained uniformly random noise and a [...] lower frequency resolution in the high-frequency spectrum.

PyWavelet19 and SciPy20 execution times were measured in a Python 3.8.6 environment, using the Timeit library inside the code to exclude compile time. The overhead resulting from the translation between C and Python was removed by estimating the intersection factor of the linear relationship between signal size and execution time. MATLAB v2019b and Mathematica 12.0.0.0 execution times were measured using the program-specific timing functions that measure the exact kernel execution times.

Wavelib21 was used as the benchmark’s baseline algorithm as it is the reference CWT C/C++ library9, and most microcontrollers are programmed using C/C++. Wavelib21 thus serves as a baseline for the reported speed-ups (Fig. 3). The reported execution times were obtained from an eight-core 2.30-GHz central processing unit (CPU) via 100 successive runs, which removed the influence of caching behavior. A 10-s pause between runs was implemented to prevent the CPU from overheating. Outliers that deviated by more than 3 s.d. from the mean were removed. Wavelib and SciPy had three outliers, leaving N = 97 samples for all algorithms to ensure equal group sizes. A repeated-measures analysis of variance (ANOVA) revealed that the algorithms differed significantly, F(4, 93) = 2,474,778.911, P ≪ 0.001, η² = 1.000, where F denotes the ANOVA statistic based on the ratio of mean squares, which indicates the ratio between the explained and unexplained variance or, in other words, the between- and within-group variability. P is the probability that an observed difference occurred by chance, and η² ‘indicates the proportion of variance accounted for (that is, a generalization of r/r² and R/R² in correlation/regression analysis)’13. Also, all pairwise comparisons were highly significant (P ≪ 0.001,
[Fig. 3b plot: RAR on a logarithmic axis (10^-3 to 10^0) versus sampling frequency (0–200 kHz) for the fastest CWT available, fCWT, STFT and DWT, with regions marked ‘Not real time’ and ‘Real time’.]
Fig. 3 | Benchmarking with fCWT and six state-of-the-art time–frequency methods. a, The average speed-up of fCWT and six publicly available
implementations after 100 runs on a signal of length N = 100,000 with accompanying statistics (in seconds). The signal was analyzed using 3,000
frequencies ranging from f0 = 1 Hz to f1 = 32 Hz. b, The RAR (equation (1)) of fCWT (600 frequencies, σ = 6), the fastest CWT available (PyWavelet’s CWT,
600 frequencies, σ = 6), STFT (500-ms Blackman with 400-ms overlap) and DWT (fourth-order Daubechies, 20 levels) versus sampling frequency on a 10-s
synthetic signal. Parameters were chosen to reflect actual usage in real-world applications. Jumps in the performance of fCWT are explained in the Methods.
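The RAR of equation (1) is a ratio of two durations, so it can be estimated for any transform by timing it on a signal of known length. The sketch below is a minimal illustration under stated assumptions: the helper names `rar` and `measure_rar` are hypothetical, wall-clock timing with `time.perf_counter` stands in for the program-specific timers used in the benchmark, and no pauses or outlier removal are applied.

```python
import time

import numpy as np


def rar(computation_seconds, signal_seconds):
    """Real-time analysis ratio: computation time divided by signal duration."""
    return computation_seconds / signal_seconds


def measure_rar(transform, signal, fs, runs=5):
    """Estimate the RAR of `transform` on `signal` sampled at `fs` Hz.

    The computation time is the mean wall-clock time over `runs` calls.
    """
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        transform(signal)
        elapsed.append(time.perf_counter() - start)
    signal_seconds = signal.size / fs
    return rar(float(np.mean(elapsed)), signal_seconds)


# Usage: a 10-s signal at 1 kHz, with a plain FFT standing in for a transform.
example_signal = np.random.default_rng(0).standard_normal(10_000)
example_rar = measure_rar(np.fft.fft, example_signal, fs=1_000.0)
is_real_time = example_rar < 1.0
```

A value well below 1 leaves headroom for the other steps of a processing pipeline, whereas a value above 1 means the transform falls behind the incoming signal.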
Bonferroni-corrected), with fCWT being, respectively, 122 times and 34 times faster than the reference Wavelib21 and the fastest available algorithm, PyWavelet19. Figure 3 presents descriptive statistics for all distributions.

The fast running time of fCWT was also compared to two other fast time–frequency estimation algorithms: the STFT and DWT. In this benchmark, STFT uses a Blackman window of 500 ms with 400-ms overlap, and DWT uses 20 dyadic (that is, a_j = 2^j) scales of Daubechies decomposition. The parameters were chosen to reflect actual usage in real-world applications (Fig. 1). Both algorithms are implemented and benchmarked in MATLAB using the in-program timing functions. CWT implementations use 600 frequencies, evenly spaced in exponential space. Fewer frequencies are used to reduce memory usage.

To assess whether or not the algorithms perform in real time (that is, an algorithmic computation time less than the signal’s duration), we define the real-time analysis ratio (RAR):

$$\mathrm{RAR} = \frac{\Delta t_{\text{computation}}}{\Delta t_{\text{signal}}}, \tag{1}$$

with Δt_computation and Δt_signal being the duration of the computation and signal, respectively. In the case of RAR > 1, an algorithm does not operate in real time. In the case of RAR just shy of 1, the algorithm is unlikely to run in real time as the time–frequency calculation is merely one step in a processing pipeline. When RAR ≪ 1, real-time operation is likely to be achieved or within reach. For all six CWT implementations and two traditional time–frequency techniques (that is, STFT and DWT), Fig. 3b shows RAR versus sampling frequency. The RARs were obtained by averaging 100 successive runs on 10-s signals with varying sampling frequencies (range, fs0 = 1 kHz to fs1 = 200 kHz). fCWT and CWT used 5-s signals to fit memory constraints. Small fluctuations in RAR are caused by the stochastic nature of benchmarks performed under real-world conditions. It should be noted that the sampling frequency is directly related to the number of samples. Therefore, we test fCWT’s performance for different signal lengths.

STFT and DWT exhibit superior real-time behavior on signals with sampling frequencies up to 200 kHz and beyond. However, they achieve these very high speeds because of their considerable drop in precision, as shown in Fig. 2. Therefore, STFT and DWT are not suitable for wide-band high-resolution time–frequency estimation. In these cases, CWT is favored. However, even the fastest CWT implementation available tends to be extremely slow compared to STFT and DWT. fCWT merges the best of both worlds, yielding real-time behavior on signals with sampling frequencies up to 200 kHz. This has brought CWT’s execution time close to that of STFT and DWT, while having 25 times to 100 times the spectral resolution of DWT throughout the spectral domain. As such, fCWT is a truly competitive real-time, high-resolution alternative for STFT and DWT.

fCWT allows signals with 34 to 122 times the sampling frequency of existing CWT implementations. Figure 3 shows fCWT’s capability of analyzing signals up to 200 kHz in real time, whereas the fastest implementation of CWT fails at fs = 30 kHz. Consequently, fCWT enables real-time analysis of high-frequency signal dynamics, as exist in audio (for example, loudspeaker characterization22, full band speech coding23 and paralinguistic analysis24), biosignals (for example, brain–computer interfaces12 and peripheral signals such as ECG, electromyography, electrodermal activity and respiration11,13), image and video (for example, distance transforms25,26), sonar and radar27,28, network analysis (for example, renewable
energy management16,17 and cybersecurity14,15) and machine fault diagnosis29,30 (Fig. 1).

Synthetic data. fCWT’s spectral resolution is equal to that of CWT. In contrast to many other CWT optimization studies, we do not compromise precision. To demonstrate this, we compared fCWT to CWT on both clean and noisy synthetic datasets (see Data availability statement for details). Each dataset consists of three wavepackets that validate an algorithm on spectral and temporal resolution and bandwidth size. A noisy dataset was generated to mimic realistic conditions and assess noise resilience.

Quantitative assessment of each algorithm’s performance is carried out by calculating the per-wavepacket mean absolute percentage error (MAPE) scores of 100 runs on both datasets between the actual frequencies and the time–frequency ridges extracted from the spectra (see Methods for details). The MAPE scores of the clean data are based on one run, as they are completely deterministic. We used a relative error measure to weight errors at all frequencies evenly.

Next to fCWT and CWT, STFT and DWT were also included, allowing us to show the speed–accuracy trade-off that currently dominates the time–frequency landscape. STFT is based on calculating multiple traditional FTs with overlapping fixed-sized windows. The STFT is very fast and efficient as it relies on the fast Fourier transform (FFT). However, the use of fixed-sized windows requires the wavelengths to be close to the window size. Hence, frequency resolution changes drastically over the spectrum, and only a small frequency band can be analyzed at the same time. DWT does not have this drawback. It does not rely on a window function. Similar to CWT, it uses wavelets that dilate and contract with frequency to represent the signal. However, in contrast to CWT, it uses far fewer wavelets to represent the signal. This makes DWT a very fast time–frequency estimator. Finally, to complete the time–frequency landscape and allow a thorough comparison on accuracy, we added the high-resolution Wigner–Ville distribution (WVD)4, the advanced Hilbert–Huang transform (HHT)31 and the more recent empirical wavelet transform (EWT)32. WVD has the highest time–frequency resolution mathematically possible and HHT and EWT improve the resolution by using a slow but accurate adaptive iterative process to decompose a signal into fundamental functions that are not necessarily sine functions (for example, FFT). Manual tuning obtained the following parameters for optimal time–frequency sharpness. fCWT and CWT use the complex Morlet wavelet (σ = 6) and a frequency scale of 480 frequencies (range, f0 = 0.25 Hz to f1 = 250 Hz), evenly spaced in exponential space (cf. the Benchmark section). STFT uses a 500-ms Blackman window with 400-ms overlap, DWT uses 11 dyadic (that is, a_j = 2^j) scales of 15-order Daubechies wavelet decomposition, and WVD does not take parameters. HHT and EWT use a frequency resolution of 0.25 Hz. HHT uses seven intrinsic modes that were extracted using a maximum signal-to-residual ratio of 20 as a stopping criterion. EWT decomposes the signal using a peak threshold of 5%. Outliers that deviated more than 3 s.d. from the mean were removed. The HHT had four outliers, which resulted in N = 96 for all algorithms to ensure equal group sizes.

Overall, the per-wavepacket MAPE scores differed significantly on both the clean and noisy datasets between the algorithms (F(6, 90) = 112,243.890, P ≪ 0.001, η² = 1.000; Fig. 4). Within each algorithm, the per-wavepacket MAPE scores also differed significantly between each other (F(2, 94) = 399.044, P ≪ 0.001, η² = 0.895). However, fCWT and CWT generated similar, low MAPE scores on both the clean and noisy datasets for all three wavepackets. This was confirmed by a correlation analysis per wavepacket, respectively r(94) = 0.996, P < 0.001, r(94) = 1.000, P < 0.001 and r(94) = 0.997, P < 0.001. The low MAPE scores can be explained by CWT’s and fCWT’s wavelet convolution, which averages fluctuations of a signal at different scales33, and its redundancy (that is, wavelets are not orthogonal at different scales), which reduces noise by canceling out the random signal components34. Hence, both can separate frequency bands and their details across the full frequency range.

When compared to the slow CWT, fCWT’s accuracy and noise-handling capabilities are not compromised by its highly efficient implementation. Small differences in the time–frequency spectrum can be seen at the edges. However, these differences are caused by MATLAB’s mitigation of edge artifacts (Implementation of fCWT section in the Methods).

STFT cannot extract details of the lower frequency bands present in the first and third wavepackets. The wavelengths of these waves are too long for the 500-ms window we used, whereas a larger window cannot distinguish the complex non-stationary behavior of the first packet. Nevertheless, STFT shows strong noise-handling capabilities that result from the averaging effect of FFT’s inherited convolution. DWT is powerful in denoising, but not suitable for time–frequency analysis. WVD suffers from its well-known artifacts, which are only made worse by the additive noise4. HHT and EWT are very good at separating the frequency bands of the clean dataset. Unfortunately, HHT’s frequency estimations, and to a lesser extent those from EWT, fluctuate heavily, leading to high MAPE values. These distortions are caused by the interference between the multiple wavefunctions in each wavepacket. This effect increases dramatically for both algorithms in the noisy dataset4.

EEG. Owing to its ease of measurement and high temporal resolution, the vast majority of neuroscience studies are based on EEG measurements35. As EEG measures brain activity via electrodes on the skull, no medical procedures are needed. However, such external measurements do suffer from increased noise. Fluctuations in EEG caused by brain activity are orders of magnitude smaller than the disturbances caused by eye, face and body movements36. Therefore, studies average the recordings of many trials to cancel random fluctuations. Unfortunately, the use of repeated trials removes the temporal advantage of EEG and prevents its applicability in real-time implementations, which rely on single-trial estimation.

The often-used FFT cannot handle the highly non-stationary character of EEG signals. Additionally, EEG sampling frequencies are often 1 kHz, and the simultaneous recording of 64 electrodes is standard. Hence, high-speed, non-stationary, time–frequency analysis is essential to have any chance of success in single-trial estimation. This is a criterion that current time–frequency techniques are unable to meet. Techniques like STFT and DWT8 are fast but lack the desired resolution in representation, whereas methods like CWT6 are precise but lack speed. fCWT fuses the best of both worlds by accelerating the high-resolution CWT by 34 to 122 times. So, we can improve the resolution by ≥34 times or handle ≥34 times as much data as the fastest CWT implementation available in the same time frame. To demonstrate the impact of real-time super-resolution on neuroscience, fCWT was thus benchmarked against full-resolution CWT and fast STFT, and DWT on a single-trial EEG dataset of subjects performing mental arithmetic tasks37.

Because active concentration is known to be most visible in the frontal region of the brain36, the signals of three frontal electrodes (pre-frontal 1, pre-frontal 2 and mid-frontal in the 10–20 system36) were averaged to reduce local fluctuations. We analyzed the resulting signal in the δ (delta), θ (theta), α (alpha), β (beta) and γ (gamma) frequency bands, using a frequency range that spans five octaves (f0 = 2 Hz to f1 = 64 Hz). Simultaneous analysis of all these frequency bands is vital for cognitive task experiments, with pre-frontal δ frequencies (2–4 Hz) being associated with attention and motivation38, and the power of θ oscillations (4–7 Hz) reflecting memory encoding and retrieval39. Lower α-desynchronization (8–13 Hz) relates to task-unspecific attentional demands and β-band (13–30 Hz) power increases with demanding cognitive tasks36. The γ oscillations
[Fig. 4 panels. a, b: time–frequency spectra (frequency (Hz) versus time (s), 0–20 s; normalized power (dB), 0 to 1) for fCWT, CWT, DWT, STFT, WVD, HHT and EWT, with a close-up at 17–19 s. c: MAPE clean (%) and MAPE noisy (%) on a logarithmic axis (10^0 to 10^4) for wavepackets WP 1–3 of each algorithm.]
Fig. 4 | Benchmark results for synthetic data. a, Synthetic data composed of wavepackets WP1, WP2 and WP3 (see Methods for details). Seven time–
frequency estimation techniques that cover a frequency range from f0 = 0.25 Hz to the Nyquist frequency f1 = 250 Hz are shown. fCWT and CWT use the
Morlet wavelet (σ = 6) and 480 frequencies to divide the spectrum, DWT uses 11 levels of 15-order Daubechies wavelet decomposition, and STFT uses a
500-ms Blackman window with 400-ms overlap to obtain optimal time–frequency resolution. WVD takes no parameters. HHT and EWT have a frequency
resolution of 0.25 Hz and rely on an adaptive iterating process. HHT uses seven intrinsic modes that were extracted using a maximum signal-to-residual
ratio stopping criterion. A close-up of the time–frequency estimation of the third wavepacket is also shown for comparison. As relative intensity is of
primary interest, the spectra are normalized to a [0, 1] range. b, As in a, but 0-dB white Gaussian noise is added to the synthetic data. The parameters
remained the same. c, MAPE scores for the clean and noisy data. Boxes show the median and 25th to 75th percentile range; whiskers show minima and
maxima. In the top plot only medians are visible as results on the clean dataset are deterministic and, hence, contain no variance. See Supplementary Table
1 for the distribution statistics.
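The MAPE scores in panel c compare the actual frequencies with ridges extracted from each spectrum. The sketch below shows one simple way such a score could be computed. It assumes that the ridge is the frequency of maximum power at each time step, which may differ from the ridge extraction described in the Methods. The names `ridge_frequencies` and `mape` are hypothetical.

```python
import numpy as np


def ridge_frequencies(power, freqs):
    """Return the frequency of maximum power at each time step.

    `power` has shape (n_frequencies, n_times); `freqs` has length n_frequencies.
    Assumption: a plain argmax ridge, one component at a time.
    """
    return freqs[np.argmax(power, axis=0)]


def mape(actual, estimated):
    """Mean absolute percentage error between two frequency tracks, in percent."""
    actual = np.asarray(actual, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return 100.0 * np.mean(np.abs(actual - estimated) / np.abs(actual))
```

Because each error is divided by the actual frequency, a 1-Hz miss at 2 Hz weighs as much as a 50-Hz miss at 100 Hz, which is the even weighting across frequencies that a relative error measure provides.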
(~30–100 Hz) indicate complex cognitive thinking (for example, object recognition and sensory processing40). Consequently, full-range, high-resolution frequency analysis is vital.

The analysis of CWT, fCWT, STFT and DWT was complemented with 3.0%CWT (that is, CWT with fCWT’s RAR; Fig. 5). 3.0%CWT enables a fair comparison between the real-time
[Fig. 5 panels. a: electrode positions Fp1, Fp2 and Fz. b, c: EEG (µV) during rest and the arithmetic task, with time–frequency spectra (frequency (Hz) versus time (s), −30 to 30 s; normalized power (dB), 0 to 1) for fCWT, 3.0%CWT, STFT (3× power) and DWT. d: RAR on a logarithmic axis versus number of electrodes (1 to 512), with the real-time region marked.]
Fig. 5 | Benchmark results of human EEG data. a, The Fp1 and Fp2 pre-frontal and Fz mid-frontal EEG electrodes, which were averaged to assess mental
workload. Credit: Imagewriter/Alamy. b, Full fCWT and CWT, 3.0%CWT, STFT and DWT of EEG, recorded during 30 s of rest and 30 s of mental
arithmetic. Full fCWT and 3.0%CWT analyze the signal using the Morlet wavelet (σ = 20) at 650 and 20 scales, evenly spaced in exponential space,
respectively. STFT uses a 500-ms Blackman window with 400-ms overlap and DWT uses 11 levels of 15-order Daubechies wavelet decomposition.
Spectra are normalized to [0, 1], except for a few spectra that are amplified to enhance visibility. c, Zoomed view during the arithmetic task to show each
algorithm’s ability to extract the intricate time–frequency details of the β frequency band (13–30 Hz). d, The RAR (equation (1)) of full fCWT and CWT,
3.0%CWT, STFT and DWT versus the number of electrodes with a 1-kHz EEG signal.
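Panel d relates RAR to the number of electrodes. Under the simplifying assumption that channels are processed one after another and that the cost grows linearly with the channel count, the largest real-time channel count follows directly from the single-channel RAR. The sketch below makes that arithmetic explicit. The function name, the `headroom` parameter and the example value are hypothetical and are not taken from the benchmark.

```python
import math


def max_realtime_channels(single_channel_rar, headroom=1.0):
    """Largest channel count whose combined RAR stays at or below `headroom`.

    Assumption: total RAR = channels * single_channel_rar (linear scaling).
    """
    if single_channel_rar <= 0:
        raise ValueError("single_channel_rar must be positive")
    # The small epsilon guards against floating-point results such as 24.999999.
    return math.floor(headroom / single_channel_rar + 1e-9)


# Usage with a made-up single-channel RAR of 0.04.
channels_at_limit = max_realtime_channels(0.04)
channels_with_margin = max_realtime_channels(0.04, headroom=0.5)
```

Setting `headroom` below 1 reserves part of the time budget for the rest of a processing pipeline, in line with the caution about RAR values just shy of 1.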
resolution of CWT and full fCWT using 650 frequencies and 3.0%CWT using 20 frequencies. The three CWTs use the complex-valued Morlet wavelet (σ = 20), tuned for optimal time–frequency resolution. Based on manual tuning we set a 500-ms Blackman window with 400-ms overlap for STFT and 11 dyadic (that is, a_j = 2^j) scales of 15-order Daubechies wavelet decomposition for DWT, enabling maximal time–frequency sharpness. RAR versus the number of 1-kHz channels was calculated for full-resolution CWT and fCWT, STFT and DWT.

The resolution difference between the equally fast full fCWT and 3.0%CWT is most prominent during the mental arithmetic task. Real-time fCWT distinguishes different EEG frequency bands much better than real-time CWT. The sheer number of subdivisions in the frequency spectrum allows fCWT to show the small chaotic β-frequency variations often seen during active concentration36 and the slow oscillating δ-band power associated with motivation38, in real time. Having the same runtime, the fastest CWT implementation fails. Although STFT can separate frequencies in the β-frequency (13–30 Hz) and γ-frequency (~30–100 Hz) bands, it suffers from low spectral resolution in the δ-frequency (<4 Hz) and θ-frequency (4–7 Hz) bands. Hence, STFT makes wide-band EEG analysis impractical. Again, DWT was shown to be unsuitable for detailed time–frequency analysis.

fCWT’s power excels when an entire array of EEG electrodes is analyzed in real time. Although the use of EEG is gaining popularity, its low spatial resolution remains a huge drawback. Figure 5 shows that the fastest CWT implementation available can only handle ~20–24 electrodes (or streams of data) simultaneously at full resolution in real time. By contrast, fCWT is easily capable of calculating real-time, high-resolution time–frequency representations of state-of-the-art EEG set-ups with up to 512 electrodes.

In vivo electrophysiology. Using depth electrodes, local field potentials (LFPs) measure local voltage changes inside the brain caused by the activity of neuron clusters. LFPs are recorded in vivo and, consequently, they do not suffer from the skull’s high-frequency mask behavior. Consequently, the γ-frequency (~30–100 Hz) and high γ-frequency (>100 Hz) bands can be reliably recorded, these being bands that highly correlate with single neuron firing and reflect aspects of movement (in the motor cortex41) and vision (in the visual cortex42). Recording these frequencies requires sampling rates that are several times those used for EEGs (that is, 2–3 kHz). Furthermore, in vivo electrophysiology techniques43 use huge numbers of electrodes44. LFPs are often recorded simultaneously at 100–300 channels, or even more45. In the future, data bandwidth is expected to increase even more than its recent tremendous increases. Neuropixels43, Utah arrays44 and Michigan probes46 are currently able to measure hundreds of LFPs and thousands of neurons simultaneously. Real-time LFP time–frequency analysis could lead to next-generation prosthetics41. Unfortunately, current implementations are unable to handle these bandwidths without compromising resolution. fCWT shows that super-resolution can be maintained when analyzing hundreds of high-bandwidth LFP data streams simultaneously.

Rodent in vivo electrophysiology data from the Allen Brain Observatory data collection47 were analyzed. During randomly alternating full-field, high- and low-contrast flashes, six Neuropixel probes43 with 374 electrodes (Neuropixel 3a; 20 μm vertical
[Fig. 6 panels. a: Neuropixel in the anteromedial area of the visual cortex. b, c: LFP (V) and time–frequency spectra (frequency (Hz) versus time (s), 0–8 s; normalized power (dB), 0 to 1) for full fCWT and CWT, 3.0%CWT, STFT (2× power) and DWT. d: RAR on a logarithmic axis versus number of channels (1 to 500) for the fastest full CWT available, 3.0%CWT, fCWT, STFT and DWT, with real-time and not-real-time regions marked.]
Fig. 6 | Benchmark results of in vivo electrophysiology data. a, In vivo electrophysiology measurements were obtained by the insertion of a Neuropixel43
inside the anteromedial area of a rodent’s visual cortex. Mouse drawing adapted from ref. 84 under a CC BY license. b, Time–frequency estimations by
fCWT, CWT, STFT and DWT during 9 s of four 250-ms full-field, high- and low-contrast flashes. The LFP shows exclusive activation after the black stimuli.
Full fCWT and 3.0%CWT analyze the signal using the Morlet wavelet (σ = 16) at 520 and 16 scales evenly spaced in exponential space, respectively. STFT
uses a 500-ms Blackman window with 400-ms overlap and DWT uses 11 levels of 15-order Daubechies wavelet decomposition. Spectra are normalized
to [0, 1], except for a few spectra that are amplified to enhance visibility. c, Zoom-in of the β- (15–30 Hz), γ- (32–100 Hz) and high γ-frequency bands
(>100 Hz), immediately after a black stimulus. Three frequency components in the β-frequency band and two γ bursts are present. Plot scales are aligned
as well as possible, despite differences in exponential scale (fCWT and CWT) and linear scale (STFT). d, The RAR (equation (1)) of full-resolution fCWT
and CWT, 3.0%CWT, STFT and DWT versus the number of channels in a 2.5-kHz electrophysiology signal.
electrode separation) each recorded a mouse visual cortex’s responses. LFPs were obtained by downsampling the data to 1.25 kHz and filtering using a 1,000-Hz low-pass filter. Full fCWT and CWT, 3.0%CWT (EEG section), STFT and DWT time–frequency estimations were performed on 9 s of raw single-trial LFP data containing four stimuli.
We compared CWT and fCWT to STFT and DWT, as the latter two are used in situations where speed is key. Other time–frequency algorithms offer much higher resolution but are orders of magnitude slower, making them impractical for LFP analysis.
The analysis covers a frequency range from f0 = 8 Hz to f1 = 128 Hz, allowing simultaneous analysis of both low frequency (that is, α and β bands) and high frequency (that is, γ and high γ bands), which is very important as they reflect different aspects of task performance. Low-frequency LFPs unveil long-distance communication, whereas high-frequency activity reflects local neural processing48. As the interplay between these frequency ranges discloses the coordination at the inter- and intra-cortical level49, real-time, wide-band time–frequency estimation is key in the LFP analysis of complex brain mechanics.
The three CWTs use the complex-valued Morlet wavelet (σ = 16), tuned for optimal time–frequency resolution. Based on manual tuning we set a 500-ms Blackman window with 400-ms overlap for STFT and 11 dyadic (that is, a_j = 2^j) scales of 15th-order Daubechies wavelet decomposition for DWT, enabling maximal time–frequency sharpness. The RAR versus number of channels was also calculated for fCWT and CWT at full resolution and STFT and DWT for a 2.5-kHz input signal.
The subfigures of Fig. 6c show the ability of real-time, full fCWT to separate multiple β-frequency components (16, 20 and 25 Hz), locate four γ bursts and reveal the overall γ-frequency dynamics, all at the same time. By contrast, real-time 3.0%CWT misses two out of four γ bursts, cannot separate low-frequency β components, and loses higher γ-frequency dynamics. With STFT, the resolution is on par in the mid-frequency range, but the high- and low-frequency ranges suffer from low resolution. Despite their very high speeds, both STFT and DWT are unsuitable for broadband, high-resolution, time–frequency estimations.
Electrode density is set to increase dramatically; for example, 5,000-electrode Neuropixels have already been announced50. Figure 6d shows RAR (equation (1)) versus the number of channels per algorithm. Full CWT can hardly process 15 LFP channels (or data streams) in real time. By contrast, fCWT offers a real-time, full-resolution performance for up to 350–400 channels. Considering the Allen Brain Observatory dataset, fCWT supports real-time analysis and feature extraction of three to four entire Neuropixel probes, whereas the fastest CWT implementation available supports only one-tenth of a single probe.
Discussion
One of WT’s most powerful features is the possibility to use custom wavelets. However, not all wavelet types are suitable for existing fast
approximate CWT implementations, which rely on finite impulse response filters4. fCWT does not suffer from this setback, as it calculates wavelets starting directly from its definition. With custom wavelets, fCWT performance can be improved even further51. As such, fCWT enables the real-time analysis of high-frequency non-stationary signals, such as in audio22–24,52, biosignals (for example, brain–computer interfaces12 and ECG11,13), image and video25,26, sonar and radar27,28, renewable energy management16,17, cybersecurity14,15 and machine fault diagnosis29,30,53 (Fig. 1).
The implementation of fCWT could be extended to other time–frequency methods as well. The synchrosqueezed transform (SST)54 uses reassignment to sharpen the CWT spectrum, and the chirplet transform (CT)55, superlets (SL)6 and the noiselet transform (NT)56 use atoms to describe a signal, sharing a wavelet-like implementation. Future research could explore speed-ups of these algorithms and bring them to real-time applications. Hence, fCWT’s impact is broader than CWT-based applications alone. Consequently, we did not include the SST, CT, SL and NT in the benchmark study, as these rely on the CWT in their core. These second-order techniques as well as modifications of the included first-order techniques (for example, smoothed WVD6) are by definition slower than the already expensive CWT.
fCWT shares its mathematical definition with CWT and, hence, without compromise, inherits both all its benefits10 and all its limitations (for example, its degrading spectral resolution57 and increasing redundancy in higher frequency ranges5). Fortunately, these are well-known limitations that have solutions4,54. Moreover, the time–frequency landscape keeps growing, including new CWT implementations58. We therefore invite everyone to compare their implementations against fCWT’s open-source implementation59, and, to extend its validity, we invite all to apply fCWT on more extensive and different specimens that fall outside this paper’s scope.
fCWT allows an acceleration in the developments of science and engineering, industry and health (Fig. 1). Although maintaining CWT’s full resolution and supporting customization, fCWT enables real-time time–frequency analysis of non-stationary signals. As such, fCWT can bring offline research that is hindered by the low resolution of DWT, the limited range of STFT and/or the computational burden of CWT into real-time practice.
Methods
Datasets. In this Resource paper, three types of data were used: synthetic, EEG and in vivo electrophysiological data. Details on each dataset are described in the following subsections.
Synthetic data. Two synthetic datasets were generated for this paper, both composed of the same three time-varying wavepackets with a sampling frequency of 500 Hz:
1. Three 5-s sine waves, the frequencies of which gradually change between 100 and 110 Hz, 20 and 22 Hz and 5 and 6 Hz, respectively, with a periodicity of 1 Hz.
2. Two 5-s sine waves with linearly changing frequencies between [5, 50] and [100, 50] Hz.
3. Three 10-s low-frequency waves of 2, 1 and 0.5 Hz.
All wavepackets are separated by 0.5 s and are multiplied by a Gaussian window function to mitigate discontinuities at the boundaries. One set contained clean data and the other was contaminated with white Gaussian noise with a 1:1 signal-to-noise ratio (SNR) across the whole signal, with the SNR being determined by the average power. Both datasets have a total duration of 21.0 s and are available in the fCWT CodeOcean repository59.
EEG. The EEG mental arithmetic dataset by Zyma et al.37 was obtained from PhysioNet60 and loaded into MATLAB R2021a. EEG data were recorded monopolarly at 500 Hz, using Ag/Ag electrodes and the Neurocom EEG 23-channel system (Ukraine, XAI-MEDICA). The International 10/20 scheme was used for electrode placement. Electrodes were referenced to the interconnected ear reference electrodes. Data were preprocessed using a 30-Hz high-pass filter and a 50-Hz power line notch filter. Common EEG artifacts were removed using independent component analysis. All participants had normal or corrected-to-normal vision and had no mental or cognitive impairment.
In this paper we use the data of subject 13, a 24-year-old male who excelled in mental arithmetic by performing 34 subtractions between four-digit and two-digit numbers in 4 min. Subject 13 was chosen to ensure task compliance. We used the last 30 s of EEG during rest and the first 30 s of EEG during the arithmetic task.
In vivo electrophysiology. In vivo electrophysiology data were collected from The Visual Coding – Neuropixels project47. LFP data from female specimen 738651054 from stimuli IDs 3861−3864 were used. Six Neuropixel version 3a probes were inserted into the mouse visual cortex. In this study, LFP data from the fifth probe (Probe ‘e’), channel 63, were used. The 250-ms high-contrast stimuli, 2,000 ms apart, alternate in random order. Mice were shown a neutral gray screen between stimuli. Additional technical, experimental and medical details about the dataset can be found in ref. 47.
Mathematical preliminaries. The Fourier transform. With its core idea that a function, often a signal, can always be decomposed into pure sine and cosine functions, the FT is foundational in spectral pattern analysis3,4,8,61. However, not all functions f(t) can be decomposed; only those that live in the Lebesgue space L²(0, 2π) can. This space includes all functions that are (1) finite in energy, (2) 2π-periodic and (3) square-integrable, formally
$$\int_{0}^{2\pi} |f(t)|^{2}\,dt < \infty, \quad t \in (0, 2\pi) \tag{2}$$
$$f(t) = f(t - 2\pi), \quad t \in \mathbb{R}, \tag{3}$$
which allows f(t) to be represented as a weighted sum of complex wavefunctions:
$$f(t) = \sum_{n=-\infty}^{\infty} c_n\, e^{2\pi i n t}, \tag{4}$$
with the Fourier coefficients cn given by the amount of overlap between the conjugated complex wavefunction and the function f(t):
$$c_n = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t)\, e^{-2\pi i n t}\,dt \tag{5}$$
or in discrete form when used on actual digital samples in a sequence f having length N:
$$x_k = \sum_{n=0}^{N-1} f[n]\, e^{-i 2\pi k n / N}. \tag{6}$$
In other words, any 2π-periodic, square-integrable function f(t) can be represented by this superposition of complex-valued sinusoidal waves that are translated in the frequency domain. However, this is precisely Fourier’s pitfall; not all functions, or signals for that matter, are 2π-periodic. FTs cannot decompose the wide variety of non-stationary functions that are not 2π-periodic. Unfortunately, this constraint is often misunderstood, and FTs are still used to analyze signals with varying frequencies.
The mathematical reason behind FT’s constraint becomes apparent when we consider the Lebesgue space L²(ℝ) containing all square-integrable functions that have finite energy along the entire real axis:
$$\int_{-\infty}^{\infty} |f(t)|^{2}\,dt < \infty. \tag{7}$$
The reason why equation (4) cannot represent these functions is that pure sine waves extend to infinity and therefore do not have finite energy. Pure waves do not lie in L²(ℝ) and, as such, they cannot represent its functions.
Wavelets. We can define a set of functions other than equation (4) that do have finite energy. The result is the set of short periodic functions ψ(t) called wavelets that are well localized in both the time and frequency domains5,6,8,33,57,62. Consequently, wavelets need to be able to translate in both domains as well:
$$\psi_{jk}(t) = 2^{-j/2}\, \psi(2^{j} t - k), \tag{8}$$
where ψjk is a daughter wavelet function, defined as the mother wavelet ψ(t) scaled in the frequency domain by j and translated in the time domain by k. So, the WT outputs a 2D time–frequency matrix, where the FT gives a 1D frequency spectrum. Similar to equation (4), the superposition of these wavelets can represent any function
$$f(t) = \sum_{j,k=-\infty}^{\infty} c_{jk}\, \psi_{jk}(t), \tag{9}$$
where, like with the FT, the wavelet coefficients cjk are given by the amount of overlap between the wavelet and the function f(t). This definition also shows us
in which ψ̄jk corresponds to the conjugate of ψjk. However, as j and k can be any real number, we have to define both variables’ optimal discretization such that the resulting time–frequency matrix does not under- or overdetermine the function f(t). So, the variables should be discretized such that the wavelets form an orthogonal basis in Hilbert space63,64, in other words, such that the wavelet functions have zero overlap.
Wavelets are orthogonal in Hilbert space if
$$\langle \psi_{jk}, \psi_{lm} \rangle = \delta_{jl}\,\delta_{km}, \tag{11}$$
from which it follows that equation (8) is indeed logarithmic orthogonal. The WT that uses this type of discretization is called the DWT8,65,66. In this context, ‘discrete’ refers to the use of its wavelets, not to the type of data it processes. As all DWT’s wavelets are orthogonal, it describes a function by the minimal number of wavelet coefficients possible. However, as stated at the beginning of this paper, a redundant, overcomplete representation is often much more favorable for signal analysis. Therefore, it is also possible to define a WT with arbitrary wavelet discretization. Such a wavelet transformation is called the CWT67. Again, ‘continuous’ does not refer to the type of data it can handle. CWT features continuously scalable and translatable wavelets that allow a much more precise analysis of a signal’s spectrum:
$$W_\psi f(a, b) = |a|^{-1} \int_{-\infty}^{\infty} f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt, \tag{12}$$
which comes with considerable computational complexity. When implemented digitally, its discrete form is used:
$$W_\psi f[a, b] = |a|^{-1} \sum_{n=0}^{N-1} f[n]\, \psi\!\left[\frac{n - b}{a}\right], \tag{13}$$
which is mathematically equivalent to passing the input signal through a series of wavelet filters of different lengths. Care is required at the boundaries of the signal. As the discrete form assumes signals of finite length, wavelet coefficients near the boundaries become increasingly meaningless. Instantaneous frequency at the first or last sample is impossible to calculate as one should know how the signal continues. There are several strategies to solve this uncertainty. For more details about this topic, see the Boundary effects section.
Equation (10)’s computational complexity can be estimated using the trapezoidal rule for integral solving and assuming a signal of length N = 2^J. Furthermore, we assume J wavelets at a_j = 2^j discrete scales, and a wavelet length of L samples at unit scale. Starting at unit scale a0 = 1, we then have O(a0NL) complexity, with the cost of all scales resulting in
$$NL + 2NL + 4NL + \dots + 2^{J} NL = O(LN^{2}). \tag{14}$$
In other words, a naïve approach to DWT calculation would result in a polynomial complexity of O(N²). CWT would be even worse, as the discretization of the time and frequency domains is much finer. Fortunately, scientists quickly realized a considerable reduction in computational complexity could be achieved using Parseval’s theorem.
Fourier-based wavelet transform. Applying Parseval’s theorem to equation (12), a reduction in CWT’s complexity can be achieved:
$$W_\psi f(a, b) = \frac{1}{2\pi} \int \hat{f}(\xi)\, \overline{\hat{\psi}_{a,b}(\xi)}\, d\xi. \tag{15}$$
Subsequently, we define ψ̂a,b(ξ) in terms of the FT of the mother wavelet function ψ(t), using its basic time-shifting and time-scaling properties:
$$\hat{\psi}_{a,b}(\xi) = \frac{1}{a}\, \hat{\psi}(\xi)\, e^{-ib\xi} \quad \text{(time shifting)} \tag{16}$$
$$\phantom{\hat{\psi}_{a,b}(\xi)} = \hat{\psi}(a\xi)\, e^{-ib\xi} \quad \text{(time scaling)}. \tag{17}$$
Substitution gives
$$W_\psi f(a, b) = \frac{1}{2\pi} \int \hat{f}(\xi)\, \hat{\psi}(a\xi)\, e^{ib\xi}\, d\xi \tag{18}$$
or in its discrete form
$$W_\psi f[a, b] = \frac{1}{K} \sum_{k=0}^{K-1} \hat{f}[k]\, \hat{\psi}[ak]\, e^{i 2\pi b k / K}, \tag{19}$$
3. Evaluate the inverse FT and obtain Wψf[a, b],
with the first two steps evaluated in O(N) and the last one requiring at least O(N log₂ N) when using a fast FT implementation68,69. This results in O(N log₂ N) complexity, a considerable reduction compared to O(N²), which is needed for the naïve approach. Additionally, the constant factor of this complexity can be reduced even more, as we will see in the next section.
Implementation of fCWT. Fourier-based wavelet transformation’s computational complexity is mainly determined by the inverse FT. Consequently, equation (12) has been rewritten regularly to use spline interpolation of the wavelet and circumvent the FT entirely70,71. Spline interpolation, also known as polynomial interpolation, defines a wavelet by only a few evenly spaced sampling points across the domain. Because the number of points is independent of the wavelet’s scale, the theoretical complexity of equation (12) is reduced to linear time. However, while complexity is lowered, the constant factor that equals the number of sampling points has been increased tremendously. In turn, this yields a trade-off between speed and accuracy: more interpolation points lead to increases in both precision and computation time. Additionally, the spline interpolation only works for specific wavelet types. To avoid the trade-off, we optimize the Fourier-based wavelet transformation by reducing the constant factor of its computational complexity. In this way, we maintain WT’s ability to use custom wavelet types51 and can exploit optimized FFT libraries72–74.
fCWT separates scale-independent operations from scale-dependent operations, the latter of which have to be performed separately for each wavelet’s scale. A detailed schematic of fCWT’s algorithmic implementation is provided in Extended Data Fig. 1. With CWTs, the frequency scale is often divided into hundreds of scales. We thus focused the optimization on the fCWT’s scale-dependent part by exploiting its repeated nature and high parallelizability. The scale-independent operations are performed first as their result forms the input for the scale-dependent steps. We pre-calculate two functions: (1) the input signal’s FFT and (2) the FFT of the mother wavelet function at scale a0 = 2. Both functions are independent of the scale factor a, so they can be pre-calculated and used as look-up tables in the processing pipeline.
FFT. Using the float- and AVX2-enabled Fastest Fourier Transform in the West (FFTW) library73, the input signal’s FFT is calculated. FFTW has superior performance in various benchmarks75 and has the ability to dynamically optimize its algorithmic implementation. FFTW determines the most efficient way to calculate the signal’s FFT with length N on hardware set-up X. This requires considerable time, which makes it only useful in situations where many FFTs are calculated with the same N and X. This is the case with fCWT, as its scale-dependent part evaluates a fixed-length inverse FFT for every scale factor a. Other high-performance FFT libraries include the Fastest Fourier Transform in the South72 and Intel’s Math Kernel Library74. However, as Fastest Fourier Transform in the South lacks important optimization techniques and Intel’s Math Kernel Library is limited to Intel processors only, FFTW is currently the most flexible and versatile high-performance FFT library available.
Before a signal’s FFT is calculated, it is first zero-padded to the next power of two, which allows more time-efficient calculations than with other signal lengths. Zero padding lets all signals that map to the same power of two use the same FFTW optimization. Hence, the flexibility of fCWT as a tool is preserved while still enjoying the benefit of FFTW’s optimization plans. However, it will result in step-like performance behavior as seen in Fig. 3. After FFT calculation, we let FFTW write the complex-valued FT to memory in an interleaving format (Extended Data Fig. 2). Using this, we exploit the CPU’s predictive caching behavior and hence reduce memory access in the next steps. Because a CPU works with chunks of memory instead of single values, it always caches adjacent memory next to a requested value as well26,76. When we access the real part of a value, interleaving takes advantage of this behavior because the imaginary part is cached along with it. Consequently, accessing the imaginary part after the real part does not require an additional memory request, which reduces memory accesses by 50%.
Scale-independent mother wavelet generation. The FFT of the mother wavelet function Ψ̂[k] is generated once during the scale-independent step. Because wavelets in the frequency domain uniformly contract as their scale increases, daughter wavelet functions can be generated by downsampling a pre-generated mother wavelet function. Because scales must be at least amin = 2, we generate the mother wavelet function at a0 = 2 to save memory. It is important to note that the mother wavelet function is generated directly from its analytical Fourier-transformed definition. Consequently, we create Ψ̂[k] such that its length always matches that of f̂[k]. This ensures fCWT’s independence of wavelet length and achieves the highest wavelet resolution possible.
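As an illustration of equation (19) and of the split between scale-independent and scale-dependent work, the following is a minimal NumPy sketch, not fCWT's actual implementation. The function names (`next_pow2`, `wavelet_hat`, `fourier_cwt`), the Gaussian-shaped wavelet with centre `omega0 = 6` and the mapping from frequency to scale are assumptions made for this example only.

```python
import numpy as np

def next_pow2(n):
    """Smallest power of two that is >= n."""
    return 1 << (n - 1).bit_length()

def wavelet_hat(x, omega0=6.0):
    """Assumed Morlet-like wavelet in the frequency domain: a real Gaussian
    centred at omega0, set to zero for non-positive frequencies."""
    return np.exp(-0.5 * (x - omega0) ** 2) * (x > 0)

def fourier_cwt(signal, fs, freqs, omega0=6.0):
    """Evaluate equation (19) with one inverse FFT per scale."""
    n = len(signal)
    k_len = next_pow2(n)
    # Scale-independent part: FFT of the zero-padded signal, computed once.
    f_hat = np.fft.fft(signal, n=k_len)
    omega = 2.0 * np.pi * np.fft.fftfreq(k_len)  # rad/sample
    rows = np.empty((len(freqs), n), dtype=complex)
    # Scale-dependent part: one row of the time-frequency matrix per scale.
    for i, f in enumerate(freqs):
        a = omega0 * fs / (2.0 * np.pi * f)  # scale whose centre frequency is f
        rows[i] = np.fft.ifft(f_hat * wavelet_hat(a * omega, omega0))[:n]
    return rows

# Usage: a 20-Hz sine sampled at 500 Hz, analysed at five frequencies.
fs = 500.0
t = np.arange(1000) / fs
signal = np.sin(2.0 * np.pi * 20.0 * t)
freqs = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
coeffs = fourier_cwt(signal, fs, freqs)
power = np.mean(np.abs(coeffs[:, 250:750]) ** 2, axis=1)
```

In this sketch the row belonging to 20 Hz carries the largest average power, and the 1,000-sample input is padded to 1,024 samples before the FFT. The factor 1/K of equation (19) is applied by `np.fft.ifft` itself.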
Extended Data Fig. 1 | Algorithmic implementation of fCWT. The algorithmic implementation behind fCWT can be divided into (i) scale-independent and
(ii) scale-dependent operations. The scale-dependent operations each calculate the wavelet coefficients of a single scale factor in the final time–frequency
matrix. By repeating the scale-dependent part m = ∣a∣ times, the time–frequency matrix is built up one row at a time.
Extended Data Fig. 2 | FFTW’s interleaved storage format. Using an interleaved value format, the Fastest Fourier Transform in the West (FFTW) writes
a complex-valued Fourier transform to memory. As the CPU caches adjacent values when accessing memory, accessing the real and imaginary parts of a value
requires only a single memory access instead of two.
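The interleaved layout can be made visible with a small NumPy sketch. NumPy stores complex numbers in the same real-then-imaginary order that FFTW uses for its complex type, so viewing a complex array as floats exposes the memory order. The variable names and the four-sample input are chosen for this example only.

```python
import numpy as np

# Complex-valued spectrum of a short example signal, stored in single precision.
spectrum = np.fft.fft(np.array([1.0, 2.0, 0.0, -1.0])).astype(np.complex64)

# Reinterpret the same memory as floats: [re0, im0, re1, im1, ...].
interleaved = spectrum.view(np.float32)

real_parts = interleaved[0::2]  # every even position holds a real part
imag_parts = interleaved[1::2]  # the imaginary part sits directly next to it
```

Because each real part is immediately followed by its imaginary part, both values of one coefficient typically fall within the same cache line, which is the behavior the caption describes.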
Extended Data Fig. 3 | From mother to daughter wavelet. The generation of the daughter wavelet ψ̂a[k] is done efficiently by downsampling the mother
wavelet Ψ̂[k]. This eliminates the need for expensive Gaussian calculations in the scale-dependent step. The mother wavelet is only calculated once in the
scale-independent step.
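The downsampling idea can be sketched as follows, under stated assumptions: the Gaussian-shaped `wavelet_hat`, its centre and width, the table length and the nearest-index rule for scales that are not integer multiples of a0 are all hypothetical choices for this example, not taken from fCWT's source.

```python
import numpy as np

def wavelet_hat(x, centre=np.pi, width=0.4):
    """Assumed analytic definition of the wavelet in the frequency domain."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

N_BINS = 512   # non-negative frequency bins of a 1,024-point FFT
A0 = 2.0       # scale at which the mother wavelet table is generated
omega = np.pi * np.arange(N_BINS) / N_BINS   # rad/sample, in [0, pi)
mother = wavelet_hat(A0 * omega)             # computed once

def daughter_from_mother(mother, scale, a0=A0):
    """Daughter wavelet at `scale`, read from the mother table by index striding."""
    step = scale / a0
    idx = np.rint(np.arange(len(mother)) * step).astype(int)
    valid = idx < len(mother)       # beyond the table the wavelet has decayed
    out = np.zeros_like(mother)
    out[valid] = mother[idx[valid]]
    return out

daughter = daughter_from_mother(mother, scale=6.0)  # every third table entry
direct = wavelet_hat(6.0 * omega)                   # evaluated from the definition
```

For a scale that is an integer multiple of a0, the strided table lookup reproduces the directly evaluated wavelet, so no exponentials need to be computed in the scale-dependent loop.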
Extended Data Fig. 4 | SIMD multiplication. fCWT combines the generation of the daughter wavelet and its multiplication with the Fourier-transformed
input signal together in one Single Instruction, Multiple Data (SIMD) multiplication. As the Fourier-transformed input signal is complex-valued, the real
daughter wavelet values are copied twice such that SIMD can perform an element-wise multiplication between both buffers. In this example a scale factor
of a = 3 is used.
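The duplication trick can be mimicked in NumPy, whose element-wise array product stands in for the SIMD instruction here. The random spectrum and the placeholder daughter values are made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Complex-valued spectrum of the input signal (random placeholder values).
f_hat = (rng.standard_normal(8) + 1j * rng.standard_normal(8)).astype(np.complex64)
# Real-valued daughter wavelet in the frequency domain (placeholder values).
daughter = np.linspace(0.0, 1.0, 8, dtype=np.float32)

f_interleaved = f_hat.view(np.float32)   # [re0, im0, re1, im1, ...]
w_duplicated = np.repeat(daughter, 2)    # [w0,  w0,  w1,  w1,  ...]

# One element-wise product over two flat float buffers.
product = (f_interleaved * w_duplicated).view(np.complex64)

# Reference: ordinary complex-times-real multiplication.
reference = f_hat * daughter
```

Both results agree, because multiplying a complex number by a real one scales its real and imaginary parts by the same factor.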
Extended Data Fig. 5 | Boundary effects in fCWT and MATLAB. With fCWT we perform zero extension to mitigate boundary effects. In contrast, by
default MATLAB uses a content-dependent mirror extension. In some cases, such an extension strategy can increase boundary effect severity instead of
decreasing it, as can be seen here.
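The two extension strategies can be compared on a toy signal. In this sketch, `np.pad` with `mode="symmetric"` is only a stand-in for a mirror extension and is not claimed to match MATLAB's exact scheme; the signal values and the pad width are arbitrary.

```python
import numpy as np

signal = np.array([3.0, 2.0, 1.0, 4.0])
pad = 2  # samples added on each side

# Zero extension: the padding is independent of the signal content.
zero_ext = np.pad(signal, pad, mode="constant", constant_values=0.0)

# Mirror extension: the padding copies the signal values near each edge.
mirror_ext = np.pad(signal, pad, mode="symmetric")
```

Here `zero_ext` is `[0, 0, 3, 2, 1, 4, 0, 0]` and `mirror_ext` is `[2, 3, 3, 2, 1, 4, 4, 1]`. The mirrored samples depend on what the signal happens to do at its edges, which is why such an extension can, in some cases, add boundary artifacts.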