The Kaldi Speech Recognition Toolkit
I. INTRODUCTION
Kaldi1 is an open-source toolkit for speech recognition
written in C++ and licensed under the Apache License v2.0.
The goal of Kaldi is to have modern and flexible code that is
easy to understand, modify and extend. Kaldi is available on
SourceForge (see http://kaldi.sf.net/). The tools compile on the
commonly used Unix-like systems and on Microsoft Windows.
Researchers on automatic speech recognition (ASR) have
several potential choices of open-source toolkits for building a
recognition system. Notable among these are: HTK [1], Julius
[2] (both written in C), Sphinx-4 [3] (written in Java), and the
RWTH ASR toolkit [4] (written in C++). Yet, our specific
requirements (a finite-state transducer (FST) based framework, extensive linear algebra support, and a non-restrictive
license) led to the development of Kaldi. Important features
of Kaldi include:
Integration with Finite State Transducers: We compile
against the OpenFst toolkit [5] (using it as a library).
Extensive linear algebra support: We include a matrix
library that wraps standard BLAS and LAPACK routines (a brief illustration follows below).
Extensible design: We attempt to provide our algorithms
in the most generic form possible. For instance, our decoders
work with an interface that provides a score for a particular
frame and FST input symbol. Thus the decoder could work
from any suitable source of scores.
Open license: The code is licensed under Apache v2.0,
which is one of the least restrictive licenses available.
1 According to legend, Kaldi was the Ethiopian goatherd who discovered
the coffee plant.
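As a small illustration of this wrapping (SimpleMatrix and AddMatMat below are illustrative names, not Kaldi's actual Matrix API), a row-major matrix multiply can simply forward to the standard CBLAS sgemm routine:

#include <cblas.h>
#include <vector>

// Illustrative matrix type with row-major storage.
struct SimpleMatrix {
  int rows, cols;
  std::vector<float> data;
  SimpleMatrix(int r, int c) : rows(r), cols(c), data(r * c, 0.0f) {}
};

// C = alpha * A * B + beta * C, delegated to BLAS.
void AddMatMat(float alpha, const SimpleMatrix &A, const SimpleMatrix &B,
               float beta, SimpleMatrix *C) {
  cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
              A.rows, B.cols, A.cols,        // M, N, K
              alpha, &A.data[0], A.cols,     // A and its row stride
              &B.data[0], B.cols,            // B and its row stride
              beta, &C->data[0], C->cols);   // C and its row stride
}

Higher-level code thus sees a typed matrix object, while the numerical work is done by whichever optimized BLAS implementation the toolkit is linked against.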
[Figure: overview of the Kaldi libraries, showing the external libraries (BLAS/LAPACK) and the GMM, SGMM, Transforms, Utils, LM, Tree, FST ext, HMM, Decodable, and Decoder modules.]
The acoustic model class AmDiagGmm represents a collection of DiagGmm objects, indexed by pdf-ids that correspond
to context-dependent HMM states. This class does not represent any HMM structure, but just a collection of densities (i.e.
GMMs). There are separate classes that represent the HMM
structure, principally the topology and transition-modeling
code and the code responsible for compiling decoding graphs,
which provide a mapping between the HMM states and the
pdf index of the acoustic model class. Speaker adaptation
and other linear transforms like maximum likelihood linear
transform (MLLT) [6] or semi-tied covariance (STC) [7] are
implemented by separate classes.
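To make the "collection of densities" idea concrete, the sketch below shows the essential structure with simplified names and storage; it is illustrative only, not the actual Kaldi API. Each pdf is a diagonal-covariance GMM, and the acoustic model is nothing more than a vector of such GMMs addressed by pdf-id:

#include <cassert>
#include <cmath>
#include <vector>

// Sketch of a single diagonal-covariance GMM (stands in for DiagGmm).
struct DiagGmmSketch {
  std::vector<float> weights;               // mixture weights, summing to 1
  std::vector<std::vector<float> > means;   // per-component mean vectors
  std::vector<std::vector<float> > vars;    // per-component diagonal variances

  float LogLikelihood(const std::vector<float> &x) const {
    float lik = 0.0f;
    for (size_t m = 0; m < weights.size(); ++m) {
      float log_gauss = 0.0f;               // log N(x; mean_m, diag var_m)
      for (size_t d = 0; d < x.size(); ++d) {
        float diff = x[d] - means[m][d];
        log_gauss -= 0.5f * (std::log(2.0f * static_cast<float>(M_PI) * vars[m][d])
                             + diff * diff / vars[m][d]);
      }
      lik += weights[m] * std::exp(log_gauss);
    }
    return std::log(lik);
  }
};

// Sketch of the acoustic model: just pdf-id -> GMM, with no HMM structure.
class AmSketch {
 public:
  int NumPdfs() const { return static_cast<int>(pdfs_.size()); }
  float LogLikelihood(int pdf_id, const std::vector<float> &frame) const {
    assert(pdf_id >= 0 && pdf_id < NumPdfs());
    return pdfs_[pdf_id].LogLikelihood(frame);  // mapping from HMM states to
                                                // pdf-ids lives elsewhere
  }
 private:
  std::vector<DiagGmmSketch> pdfs_;  // one GMM per context-dependent HMM state
};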
C. HMM Topology
It is possible in Kaldi to separately specify the HMM
topology for each context-independent phone. The topology
format allows nonemitting states, and allows the user to prespecify tying of the p.d.f.s in different HMM states.
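For illustration, a three-state left-to-right topology entry written in the style of Kaldi's text topology format might look roughly as follows (the markup is shown in simplified form and should be checked against the toolkit's documentation; the phone numbers are arbitrary):

<Topology>
<TopologyEntry>
<ForPhones> 1 2 3 4 5 6 7 8 </ForPhones>
<State> 0 <PdfClass> 0 <Transition> 0 0.5 <Transition> 1 0.5 </State>
<State> 1 <PdfClass> 1 <Transition> 1 0.5 <Transition> 2 0.5 </State>
<State> 2 <PdfClass> 2 <Transition> 2 0.5 <Transition> 3 0.5 </State>
<State> 3 </State>
</TopologyEntry>
</Topology>

Here state 3 is a final, nonemitting state (it has no <PdfClass>), and tying could be pre-specified by giving two emitting states the same <PdfClass> value.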
D. Speaker adaptation
We support both model-space adaptation using maximum
likelihood linear regression (MLLR) [8] and feature-space
adaptation using feature-space MLLR (fMLLR), also known
as constrained MLLR [9]. For both MLLR and fMLLR,
multiple transforms can be estimated using a regression tree
[10]. When a single fMLLR transform is needed, it can be
used as an additional processing step in the feature pipeline.
The toolkit also supports speaker normalization using a linear
approximation to VTLN, similar to [11], or conventional
feature-level VTLN, or a more generic approach for gender
normalization which we call the exponential transform [12].
Both fMLLR and VTLN can be used for speaker adaptive
training (SAT) of the acoustic models.
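Concretely, a single estimated fMLLR (constrained MLLR) transform is just an affine map x' = A x + b applied to every feature vector, which is why it can be inserted as one more step in the feature pipeline. The function below is a minimal sketch of that application step (generic linear algebra, not the Kaldi implementation):

#include <vector>

// Apply an fMLLR / constrained-MLLR transform W = [A b] to one feature
// vector: returns x' = A x + b. Assumes b.size() == A.size() and
// A[i].size() == x.size().
std::vector<float> ApplyFmllr(const std::vector<std::vector<float> > &A,
                              const std::vector<float> &b,
                              const std::vector<float> &x) {
  std::vector<float> y(b);                 // start from the bias term b
  for (size_t i = 0; i < A.size(); ++i)
    for (size_t j = 0; j < x.size(); ++j)
      y[i] += A[i][j] * x[j];              // accumulate A x
  return y;
}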
E. Subspace Gaussian Mixture Models
For subspace Gaussian mixture models (SGMMs), the
toolkit provides an implementation of the approach described
in [13]. There is a single class AmSgmm that represents a whole
collection of pdfs; unlike the GMM case there is no class that
represents a single pdf of the SGMM. Similar to the GMM
case, however, separate classes handle model estimation and
speaker adaptation using fMLLR.
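To make the structural difference from the GMM case concrete: in the basic SGMM (ignoring sub-states and speaker vectors; see [13] for the full model), the parameters of pdf j are not stored directly but are derived from a low-dimensional state vector v_j and globally shared projections, roughly

  \mu_{ji} = M_i v_j, \qquad
  w_{ji} = \frac{\exp(w_i^\top v_j)}{\sum_{i'} \exp(w_{i'}^\top v_j)},

with full covariances \Sigma_i shared across states. Because the M_i, w_i and \Sigma_i are global, it is natural for a single AmSgmm object to hold the entire collection of pdfs rather than having one object per pdf.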
V. PHONETIC DECISION TREES
Our goals in building the phonetic decision tree code were
to make it efficient for arbitrary context sizes (i.e. we avoided
enumerating contexts), and also to make it general enough
to support a wide range of approaches. The conventional
approach is, in each HMM-state of each monophone, to have
a decision tree that asks questions about, say, the left and
right phones. In our framework, the decision-tree roots can
be shared among the phones and among the states of the
phones, and questions can be asked about any phone in the
context window, and about the HMM state. Phonetic questions
can be supplied based on linguistic knowledge, but in our recipes the questions are generated automatically by clustering the phones.
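To illustrate what such a lookup involves, the sketch below (simplified, and not the actual Kaldi tree classes) shows a decision-tree node that can ask about any position in the phonetic context window or about the HMM state, returning a pdf-id at its leaves:

#include <memory>
#include <set>
#include <vector>

struct TreeNode {
  // Which element of the context to test: positions 0..N-1 index the phones
  // in the context window; -1 (a convention used only in this sketch) means
  // the HMM-state index.
  int key = 0;
  std::set<int> yes_set;              // question: "is the tested value in this set?"
  std::unique_ptr<TreeNode> yes, no;  // children; both null at a leaf
  int pdf_id = -1;                    // answer stored at a leaf

  int Lookup(const std::vector<int> &phone_window, int hmm_state) const {
    if (!yes && !no) return pdf_id;   // reached a leaf
    int value = (key == -1) ? hmm_state : phone_window[key];
    const TreeNode *child = yes_set.count(value) ? yes.get() : no.get();
    return child->Lookup(phone_window, hmm_state);
  }
};

Sharing of decision-tree roots then simply means that several (phone, HMM-state) combinations begin their lookup at the same root node.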
TABLE I
BASIC TRIPHONE SYSTEM ON RESOURCE MANAGEMENT: %WERs

               Test set
           Feb89  Oct89  Feb91  Sep92   Avg
  HTK       2.77   4.02   3.30   6.29   4.10
  Kaldi     3.20   4.21   3.50   5.86   4.06
VIII. DECODERS
from: http://hlt.fbk.eu/en/irstlm
from: http://www.speech.sri.com/projects/srilm/
Our decoders work with an interface that provides a score for a particular frame and FST input symbol:

class DecodableInterface {
public:
  // Acoustic score (log-likelihood) for the given frame and input symbol.
  virtual float LogLikelihood(int frame, int index) = 0;
  // True if 'frame' is the last frame of the utterance.
  virtual bool IsLastFrame(int frame) = 0;
  // Number of distinct input-symbol indices that can be scored.
  virtual int NumIndices() = 0;
  // Virtual destructor: this is an abstract base class.
  virtual ~DecodableInterface() {}
};
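As a concrete example of "any suitable source of scores": a Decodable object could simply serve scores from a precomputed matrix of log-likelihoods, with one row per frame and one column per input symbol. The class below is an illustrative sketch, not part of Kaldi (it assumes zero-based index values for simplicity):

#include <vector>

class PrecomputedDecodable : public DecodableInterface {
public:
  explicit PrecomputedDecodable(const std::vector<std::vector<float> > &loglikes)
      : loglikes_(loglikes) {}

  // Look the score up in the precomputed table.
  float LogLikelihood(int frame, int index) { return loglikes_[frame][index]; }

  bool IsLastFrame(int frame) {
    return frame == static_cast<int>(loglikes_.size()) - 1;
  }

  int NumIndices() {
    return loglikes_.empty() ? 0 : static_cast<int>(loglikes_[0].size());
  }

private:
  std::vector<std::vector<float> > loglikes_;  // [frame][index] -> log-likelihood
};

Anything that can produce such a score on demand, whether a GMM, an SGMM or some other model, could implement the same interface, which is what keeps the decoders independent of the acoustic model.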
TABLE II
BASIC TRIPHONE SYSTEM, WSJ, 20K OPEN VOCABULARY, BIGRAM LM, SI-284 TRAIN: %WERs

                  Test set
                Nov92  Nov93
  Bell           11.9   15.4
  HTK (+GD)      11.1   14.5
  Kaldi          11.8   15.0
TABLE III
RESULTS ON RM AND ON WSJ, 20K OPEN VOCABULARY, BIGRAM LM, TRAINED ON HALF OF SI-84: %WERs

                            RM (Avg)  WSJ Nov92  WSJ Nov93
  Triphone                    3.97       12.5       18.3
  + fMLLR                     3.59       11.4       15.5
  + LVTLN                     3.30       11.1       16.4
  Splice-9 + LDA + MLLT       3.88       12.2       17.7
  + SAT (fMLLR)               2.70        9.6       13.7
  + SGMM + spk-vecs           2.45       10.0       13.4
  + fMLLR                     2.31        9.8       12.9
  + ET                        2.15        9.0       12.3
B. Other experiments
Here we report some more results on both the WSJ test sets
(Nov92 and Nov93) using systems trained on just the SI-84
part of the training data, which demonstrate different features
supported by Kaldi. We also report results on the RM task,
averaged over 6 test sets: the 4 mentioned in Table I together
with Mar87 and Oct87. The best result for a conventional
GMM system is achieved by a SAT system that splices 9
frames (4 on each side of the current frame) and uses LDA
to project down to 40 dimensions, together with MLLT. We
achieve better performance on average, with an SGMM system
trained on the same features, with speaker vectors and fMLLR
adaptation. The last line, with the best results, includes the
exponential transform [12] in the features.
X. CONCLUSIONS
We described the design of Kaldi, a free and open-source
speech recognition toolkit. The toolkit currently supports modeling of context-dependent phones of arbitrary context lengths,
and all commonly used techniques that can be estimated using
maximum likelihood. It also supports the recently proposed
SGMMs. Development of Kaldi is continuing, and we are working on using large language models in the FST framework, on lattice generation, and on discriminative training.
ACKNOWLEDGMENTS
We would like to acknowledge participants and collaborators in the 2009
Johns Hopkins University Workshop, including Mohit Agarwal, Pinar Akyazi,
Martin Karafiát, Feng Kai, Ariya Rastrow, Richard C. Rose and Samuel
Thomas; Patrick Nguyen, for introducing the participants in that workshop
and for help with WSJ recipes, and faculty and staff at JHU for their help
during that workshop, including Sanjeev Khudanpur, Desiree Cleves, and the
late Fred Jelinek.
We would like to acknowledge the support of Geoffrey Zweig and Alex Acero of Microsoft Research;
Henrique (Rico) Malvar of Microsoft Research for allowing the use of his
FFT code; and Patrick Nguyen for help with WSJ recipes. We would like
to acknowledge the help with coding and documentation from Sandeep Boda
and Sandeep Reddy (sponsored by Go-Vivace Inc.) and Haihua Xu. We thank
Pavel Matějka (and Phonexia s.r.o.) for allowing the use of feature processing
code.
During the development of Kaldi, Arnab Ghoshal was supported by
the European Community's Seventh Framework Programme under grant
agreement no. 213850 (SCALE); the BUT researchers were supported by the
Technology Agency of the Czech Republic under project No. TA01011328,
and partially by Czech MPO project No. FR-TI1/034.
The JHU 2009 workshop was supported by National Science Foundation
Grant Number IIS-0833652, with supplemental funding from Google Research, DARPA's GALE program, and the Johns Hopkins University Human
Language Technology Center of Excellence.
REFERENCES
[1] S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book (for version 3.4). Cambridge University Engineering Department, 2009.
[2] A. Lee, T. Kawahara, and K. Shikano, "Julius: an open source real-time large vocabulary recognition engine," in EUROSPEECH, 2001, pp. 1691–1694.
[3] W. Walker, P. Lamere, P. Kwok, B. Raj, R. Singh, E. Gouvea, P. Wolf, and J. Woelfel, "Sphinx-4: A flexible open source framework for speech recognition," Sun Microsystems Inc., Technical Report SMLI TR-2004-0811, 2004.
[4] D. Rybach, C. Gollan, G. Heigold, B. Hoffmeister, J. Lööf, R. Schlüter, and H. Ney, "The RWTH Aachen University Open Source Speech Recognition System," in INTERSPEECH, 2009, pp. 2111–2114.
[5] C. Allauzen, M. Riley, J. Schalkwyk, W. Skut, and M. Mohri, "OpenFst: a general and efficient weighted finite-state transducer library," in Proc. CIAA, 2007.
[6] R. Gopinath, "Maximum likelihood modeling with Gaussian distributions for classification," in Proc. IEEE ICASSP, vol. 2, 1998, pp. 661–664.
[7] M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech and Audio Proc., vol. 7, no. 3, pp. 272–281, May 1999.
[8] C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech and Language, vol. 9, no. 2, pp. 171–185, 1995.
[9] M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, vol. 12, no. 2, pp. 75–98, April 1998.
[10] M. J. F. Gales, "The generation and use of regression class trees for MLLR adaptation," Cambridge University Engineering Department, Technical Report CUED/F-INFENG/TR.263, August 1996.
[11] D. Y. Kim, S. Umesh, M. J. F. Gales, T. Hain, and P. C. Woodland, "Using VTLN for broadcast news transcription," in Proc. ICSLP, 2004, pp. 1953–1956.
[12] D. Povey, G. Zweig, and A. Acero, "The Exponential Transform as a generic substitute for VTLN," in IEEE ASRU, 2011.
[13] D. Povey, L. Burget et al., "The subspace Gaussian mixture model: A structured model for speech recognition," Computer Speech & Language, vol. 25, no. 2, pp. 404–439, April 2011.
[14] M. Mohri, F. Pereira, and M. Riley, "Weighted finite-state transducers in speech recognition," Computer Speech and Language, vol. 20, no. 1, pp. 69–88, 2002.
[15] D. Povey and P. C. Woodland, "Frame discrimination training for HMMs for large vocabulary speech recognition," in Proc. IEEE ICASSP, vol. 1, 1999, pp. 333–336.
[16] W. Reichl and W. Chou, "Robust decision tree state tying for continuous speech recognition," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 5, pp. 555–566, September 2000.
[17] P. C. Woodland, J. J. Odell, V. Valtchev, and S. J. Young, "Large vocabulary continuous speech recognition using HTK," in Proc. IEEE ICASSP, vol. 2, 1994, pp. II/125–II/128.