InTech-A Vlsi Architecture For Output Probability Computations of HMM Based Recognition Systems

This document describes a VLSI architecture for output probability computations of HMM-based recognition systems using a new block-wise parallel processing method called block-wise frame parallel processing (BFPP). BFPP processes multiple input frames in parallel within a block, requiring fewer registers and processing elements than the conventional block-wise state parallel processing (BSPP) architecture. The document introduces HMM-based recognition systems and output probability computations, describes BFPP and a suitable VLSI architecture, and shows in its evaluation that the BFPP architecture uses registers more efficiently than BSPP.

Copyright
© Attribution Non-Commercial (BY-NC)
A VLSI Architecture for Output Probability
Computations of HMM-based
Recognition Systems

Kazuhiro Nakamura, Masatoshi Yamamoto,
Kazuyoshi Takagi and Naofumi Takagi
Nagoya University
Japan

1. Introduction

Mobile embedded systems with natural human interfaces, such as speech recognition, lip
reading, and gesture recognition, are required for the realization of future ubiquitous
computing. Recognition tasks can be implemented either on processors (CPUs and DSPs) or
dedicated hardware (ASICs). Although processor-based approaches offer flexibility,
real-time recognition tasks using state-of-the-art recognition algorithms exceed the
performance level of current embedded processors, and require modern high-performance
processors that consume far more power than dedicated hardware. Dedicated hardware, which is
optimized for low-power, real-time recognition tasks, is more suitable for implementing
natural human interfaces in low power mobile embedded systems. VLSI architectures
optimized for recognition tasks with low power dissipation have been developed.
Yoshizawa et al. investigated a block-wise parallel processing method for output probability
computations of continuous hidden Markov models (HMMs), and proposed a low power,
high-speed VLSI architecture. Output probability computations are the most time-consuming
part of HMM-based recognition systems. Mathew et al. developed low-power
accelerators for the SPHINX 3 speech recognition system, and also developed perception
accelerators for embedded systems. In this chapter, we present a fast and memory efficient
VLSI architecture for output probability computations of continuous HMMs using a new
block-wise parallel processing method. We show block-wise frame parallel processing
(BFPP) for output probability computations and present an appropriate VLSI architecture
for its implementation. Compared with a conventional block-wise state parallel processing
(BSPP) architecture, when there are a sufficient number of HMM states for accurate
recognition, the BFPP architecture requires fewer registers and processing elements (PEs),
and less processing time. The PEs used in the BFPP architecture are identical to those used
in the BSPP architecture. From a VLSI architectural viewpoint, a comparison shows the
efficiency of the BFPP architecture through efficient use of registers for storing input feature
vectors and intermediate results during computation. The remainder of this chapter is
organized as follows: the structure of HMM-based recognition systems is described in
Section 2, BFPP and the BFPP-based VLSI architecture are introduced in Section 3, the
evaluation of the BFPP architecture is described in Section 4, and conclusions are presented
in Section 5.

Fig. 1. Basic structure of HMM-based recognition hardware

2. HMM-based Recognition Systems

2.1 HMM-based Recognition Hardware
Due to their effectiveness and efficiency for user-independent recognition, HMMs are
widely used in applications such as speech recognition, lip-reading, and gesture recognition.
Figure 1 shows the basic structure of HMM-based recognition hardware (Yoshizawa et al.,
2006, Yoshizawa et al., 2004, Yoshizawa et al., 2002, Mathew et al., 2003a). The output
probability computation circuit and Viterbi scorer work together as a recognition engine.
The inputs to the output probability computation circuit are feature vectors of several
dimensions and model parameters of HMMs. These values are stored in RAM and ROM,
respectively. The RAM, ROM and output probability computation circuit interconnect via a
single bus, and memory accesses are exclusive. The output probability computation circuit
outputs the results of the output probability computation of HMMs. The Viterbi scorer
outputs likelihood scores using the Viterbi algorithm. In HMM-based recognition systems,
the most time-consuming task is output probability computations, and the output
probability computation circuit accelerates these computations. The output probability
computation circuit has several register arrays and processing elements (PEs) for efficient
high-speed parallel processing.

2.2 Output Probability Computation of HMMs
Let $O_1, O_2, \ldots, O_T$ be a sequence of P-dimensional input feature vectors (input frames) to
HMMs, where $O_t = (o_{t1}, o_{t2}, \ldots, o_{tP})$, $1 \le t \le T$. T is the number of input feature vectors, and P is
the dimension of the input feature vector. For an input frame $O_t$, the output probability of an
N-state left-to-right continuous HMM at the j-th state is given by

$$\log b_j(O_t) = \omega_j + \sum_{p=1}^{P} \sigma_{jp}\,(o_{tp} - \mu_{jp})^2, \qquad 1 \le j \le N,\ 1 \le t \le T \tag{1}$$

where $\omega_j$, $\sigma_{jp}$ and $\mu_{jp}$ are the factors of the Gaussian probability density function (Yoshizawa
et al., 2006).
The output probability computation circuit computes $\log b_j(O_t)$ based on Eq. (1), where all
HMM parameters $\omega_j$, $\sigma_{jp}$ and $\mu_{jp}$ are stored in ROM, and the input frames are stored in RAM.
The values of T, N, P and the number of HMMs V differ for each recognition system. For a
recent isolated word recognition system (Yoshizawa et al., 2006, Yoshizawa et al., 2004), T, N,
P and V are 86, 32, 38, and 800, respectively, and for another word recognition system
(Yoshizawa et al., 2002), T, N, P and V are 89, 12, 16 and 100, respectively. For a continuous
speech recognition system (Mathew et al., 2003a), T, N, P and V are approximately 20, 10, 40,
and 50, respectively. Different applications require different output probability computation
circuit architectures. A flowchart of output probability computations for V HMMs is shown
in Fig. 2.

Fig. 2. Flowchart of output probability computation
[Flowchart: Loop A iterates over p, Loop B over j, Loop C over t, and Loop D over v; the
innermost step is the partial computation of $\log b_j(O_t)$.]

Output probabilities are obtained by $T \cdot N \cdot P \cdot V$ times the partial computation of
$\log b_j(O_t)$. The partial computation of $\log b_j(O_t)$ performs four arithmetic operations, a
subtraction ($a = o_{tp} - \mu_{jp}$), an addition ($acc = acc + b$, where the initial value of $acc$ is $\omega_j$), and
two multiplications ($b = a \cdot a \cdot \sigma_{jp}$) for Eq. (1), and computes $\log b_j(O_t)$.
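The loop nest of Fig. 2 and the partial computation described above can be sketched in software as follows. This is only an illustrative reference model, not the circuit itself: the function names and the dictionary layout for an HMM (`omega`, `sigma`, `mu`) are assumptions made for this example.

```python
def log_b(omega_j, sigma_j, mu_j, frame):
    """Eq. (1) for one state j and one input frame O_t (sequential model)."""
    acc = omega_j                          # the initial value of acc is omega_j
    for o_tp, sigma_jp, mu_jp in zip(frame, sigma_j, mu_j):   # Loop A over p
        a = o_tp - mu_jp                   # one subtraction
        b = a * a * sigma_jp               # two multiplications
        acc = acc + b                      # one addition
    return acc


def all_output_probabilities(hmms, frames):
    """Return out[v][t][j] = log b_j(O_t) of the v-th HMM.

    hmms is a list of V models; each model is a dict (hypothetical layout)
    with 'omega' (N values), 'sigma' (N x P) and 'mu' (N x P).
    """
    result = []
    for hmm in hmms:                       # Loop D over v
        per_hmm = []
        for frame in frames:               # Loop C over t
            per_frame = []
            for j in range(len(hmm["omega"])):   # Loop B over j
                per_frame.append(
                    log_b(hmm["omega"][j], hmm["sigma"][j], hmm["mu"][j], frame))
            per_hmm.append(per_frame)
        result.append(per_hmm)
    return result
```

The innermost body runs T·N·P·V times in total, which is the count given in the text; the parallel schemes discussed in Section 3 reorder and unroll these loops without changing the arithmetic.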

3. Fast and memory efficient VLSI architecture

3.1 Block-wise frame parallel processing
Block parallel processing (BPP) for output probability computations was proposed as an
efficient parallel processing method for word HMM-based speech recognition by Yoshizawa
et al. (Yoshizawa et al., 2006, Yoshizawa et al., 2004, Yoshizawa et al., 2002). In this method,
the set of input frames is called a block, and HMM parameters are effectively shared between
different input frames in the computation. N-parallel computation is performed by their BPP.
In this chapter, we classify two types of BPP according to the data flow of output probability
computations: block-wise frame parallel processing (BFPP) and block-wise state parallel
processing (BSPP). A block can be seen as a set of M ($\le T$) input frames, whose elements are
$O_t$, $1 \le t \le M$. M frames out of the T input frames are processed in a block. BFPP performs
arithmetic operations on locally stored input frames, which are $O_1, O_2, \ldots, O_M$, and output
probability computations for multiple frames are carried out simultaneously. On the other
hand, a block can also be seen as an $M \times P$ matrix whose elements are $o_{tp}$, $1 \le t \le M$, $1 \le p \le P$.
BSPP performs arithmetic operations on an input sequence, which is
$o_{11}, \ldots, o_{1P}, o_{21}, \ldots, o_{2P}, \ldots, o_{M1}, \ldots, o_{MP}$, and output probability computations for
multiple states are carried out simultaneously.
The BPP proposed by Yoshizawa et al. (Yoshizawa et al., 2006, Yoshizawa et al., 2004,
Yoshizawa et al., 2002) is classified as a BSPP. In this chapter, we present BFPP for output
probability computations. M/2-parallel computations are performed by our BFPP.
A flowchart of the output probability computations with the conventional BSPP (Yoshizawa
et al., 2006, Yoshizawa et al., 2004, Yoshizawa et al., 2002) is shown in Fig. 3.

Fig. 3. Flowchart of output probability computation using BSPP

$PE_i$ represents the i-th processing element, which computes $\log b_i(O_t)$ by a subtraction, an
addition, and two multiplications for Eq. (1). Loop B (Fig. 2) is expanded as shown in Fig. 3,
and $\log b_1(O_t), \log b_2(O_t), \ldots,$ and $\log b_N(O_t)$ are computed simultaneously with N PEs, where
$o_{tp}$ is fed to the N PEs in Loop A. In addition to the N-state parallel computation, the same
HMM parameters $\mu_{jp}$, $\sigma_{jp}$, and $\omega_j$, $1 \le j \le N$, $1 \le p \le P$, are used repeatedly during
Loop C in Fig. 3.
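The BSPP data flow just described can be mimicked in software as a rough sketch. The function name and argument layout below are assumptions for illustration; the list comprehension stands in for the N PEs that would update their accumulators in the same cycle.

```python
def bspp_one_hmm(omega, sigma, mu, frames):
    """Return out[t][j] = log b_j(O_t) for one HMM, in BSPP order.

    omega: N values, sigma and mu: N x P, frames: T x P (hypothetical layout).
    """
    n_states = len(omega)
    out = []
    for frame in frames:                    # Loop C over t; parameters stay in registers
        acc = list(omega)                   # one accumulator per state, preset to omega_j
        for p, o_tp in enumerate(frame):    # Loop A: o_tp is broadcast to all N PEs
            # The N updates are independent of each other, so hardware can
            # perform them simultaneously (state-parallel computation).
            acc = [acc[j] + sigma[j][p] * (o_tp - mu[j][p]) ** 2
                   for j in range(n_states)]
        out.append(acc)
    return out
```

Note how the input elements arrive as the stream $o_{11}, \ldots, o_{1P}, o_{21}, \ldots$ and each element is consumed once per HMM, while all N states' parameters must be held at the same time. That register demand is what the BFPP ordering is meant to avoid.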
A flowchart of the output probability computation with BFPP is shown in Fig. 4. The PEs in
Figs. 4 and 3 are identical, but differ in number.

Fig. 4. Flowchart of output probability computation using BFPP

Loop C in Fig. 2 is partially expanded in Fig. 4, and $\log b_j(O_{t+1}), \log b_j(O_{t+2}), \ldots,$ and
$\log b_j(O_{t+M/2})$ are computed simultaneously with M/2 PEs in Loop C1, where $\mu_{jp}$ and $\sigma_{jp}$ are
fed to the M/2 PEs in Loop A. In addition to the M/2-frame parallel computations,
$\log b_j(O_{t+M/2+1}), \log b_j(O_{t+M/2+2}), \ldots,$ and $\log b_j(O_{t+M})$ are also computed with the same
M/2 PEs. In this double M/2-parallel computation, the same HMM parameters $\mu_{jp}$ and $\sigma_{jp}$ are
used twice, because the parameters are independent of t. In addition to the M/2-parallel
computations, Loop D (Fig. 2) is divided into Loops D1 and D2 (Fig. 4). The same input
frames $O_{t+1}, O_{t+2}, \ldots,$ and $O_{t+M}$ are used repeatedly during Loop D1, because the input
frames are independent of v.
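As a software sketch of the BFPP loop order, the following function keeps a block of M frames local, reuses it for a group of L HMMs, and applies each parameter pair in two M/2-wide passes. The function name, the dictionary layout, the grouping parameter `L` as an argument, and the assumption that M is even are all choices made for this example rather than details of the circuit.

```python
def bfpp(hmms, frames, M, L):
    """Return out[v][t][j] = log b_j(O_t) of the v-th HMM, in BFPP loop order.

    hmms: list of dicts with 'omega' (N), 'sigma' (N x P), 'mu' (N x P).
    M: frames per block (assumed even). L: HMMs processed per stored block.
    """
    half = M // 2
    V, T = len(hmms), len(frames)
    out = [[[None] * len(h["omega"]) for _ in range(T)] for h in hmms]
    for v0 in range(0, V, L):                        # Loop D2: groups of L HMMs
        for t0 in range(0, T, M):                    # Loop C1: blocks of M frames
            block = frames[t0:t0 + M]                # frames kept locally
            for v in range(v0, min(v0 + L, V)):      # Loop D1: block is reused
                h = hmms[v]
                for j in range(len(h["omega"])):     # Loop B over states
                    acc = [h["omega"][j]] * len(block)
                    for p in range(len(block[0])):   # Loop A over dimensions
                        s, m = h["sigma"][j][p], h["mu"][j][p]
                        for start in (0, half):      # double M/2-parallel computation
                            stop = min(start + half, len(block))
                            for i in range(start, stop):   # M/2 PEs in parallel
                                acc[i] += s * (block[i][p] - m) ** 2
                    for i, value in enumerate(acc):
                        out[v][t0 + i][j] = value
    return out
```

Only one $\mu_{jp}$ and one $\sigma_{jp}$ are live at any time in this ordering, whereas the frames of the block stay resident; this mirrors the observation that the parameters are independent of t and the frames are independent of v.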

3.2 A VLSI architecture for output probability computation
Our BFPP VLSI architecture for output probability computations is shown in Fig. 5. The
architecture consists of five register arrays and M/2 PEs. $Reg_O$ stores M input frames
$O_{t+1}, O_{t+2}, \ldots, O_{t+M}$. $Reg_\mu$ and $Reg_\sigma$ store HMM parameters $\mu_{jp}$ and $\sigma_{jp}$, respectively.
$Reg_\omega$ stores HMM parameter $\omega_j$ and intermediate results. $Reg_o$ stores computed output
probabilities for a Viterbi scorer.

Fig. 5. BFPP VLSI architecture

Each $PE_i$ consists of two adders and two multipliers, which are used for computing
$\omega_j + \sum_{p=1}^{P} \sigma_{jp}\,(o_{tp} - \mu_{jp})^2$.
Figure 6 shows the flowchart of output probability computations using the BFPP
architecture. The computation starts by reading M input frames from RAM and storing
them to $Reg_O$ in Loop C1, which are $O_{t+1}, O_{t+2}, \ldots, O_{t+M/2}, O_{t+M/2+1}, O_{t+M/2+2}, \ldots, O_{t+M}$. The
HMM parameters of the v-th HMM are read from ROM and stored in $Reg_\mu$, $Reg_\sigma$ and $Reg_\omega$,
which are $\mu_{11}$, $\sigma_{11}$, and $\omega_1$. The value of all registers in $Reg_\omega$ is set to $\omega_1$. For the first half of
the stored input frames $O_{t+1}, O_{t+2}, \ldots,$ and $O_{t+M/2}$, M/2 intermediate results are
simultaneously computed with the stored $\mu_{11}$, $\sigma_{11}$, and $\omega_1$ by M/2 PEs, where the HMM
parameters are shared by all PEs. At the same time, an HMM parameter $\mu_{j\,p+1}$ of the v-th HMM
is read from ROM and stored in $Reg_\mu$. Then, for the other half of the stored input frames
$O_{t+M/2+1}, O_{t+M/2+2}, \ldots,$ and $O_{t+M}$, M/2 intermediate results are simultaneously computed with
the same $\mu_{11}$, $\sigma_{11}$, and $\omega_1$ by M/2 PEs. At the same time, an HMM parameter $\sigma_{j\,p+1}$ of the v-th
HMM is read from ROM and stored in $Reg_\sigma$. In this double M/2-parallel computation, the
same HMM parameters $\mu_{11}$, $\sigma_{11}$, and $\omega_1$ are used twice. In the next double M/2-parallel
computation, the stored HMM parameters $\mu_{j\,p+1}$ and $\sigma_{j\,p+1}$ are used twice. M output
probabilities $\log b_j(O_{t+1}), \log b_j(O_{t+2}), \ldots,$ and $\log b_j(O_{t+M})$ of the v-th HMM are obtained by
Loop A. The obtained results are transferred from $Reg_\omega$ to $Reg_o$ for starting the next output
probability computation, $\log b_{j+1}(O_{t+1}), \log b_{j+1}(O_{t+2}), \ldots, \log b_{j+1}(O_{t+M})$ of the v-th HMM. The
stored results are fed to the Viterbi scorer. The MN output probabilities of the v-th HMM are
obtained by Loop B. MNL output probabilities of HMMs v + 1, v + 2, ..., v + L are obtained
by Loop D1 with the same M input frames $O_{t+1}, O_{t+2}, \ldots,$ and $O_{t+M}$.
The MNL(T/M) output probabilities of HMMs v + 1, v + 2, ..., v + L are obtained by Loop C1,
and finally the MNL(T/M)(V/L) output probabilities of all HMMs are obtained by Loop D2.

Fig. 6. Flowchart of computations using the BFPP architecture
[Flowchart annotations: load $O_t$ to $Reg_O$ for the M frames of the block (MP cycles); load
$\mu_{11}$, $\sigma_{11}$ to $Reg_\mu$, $Reg_\sigma$, respectively (2 cycles); load $\omega_j$ to $Reg_\omega$ (1 cycle); load $\mu_{j\,p+1}$ to
$Reg_\mu$ and $\sigma_{j\,p+1}$ to $Reg_\sigma$ during the double M/2-parallel computation with M/2 PEs; copy
$Reg_\omega$ to $Reg_o$. Loops, from innermost to outermost: A, B, D1, C1, D2.]

Fig. 7. BSPP VLSI architecture

4. Evaluation

We compared the proposed BFPP with the BSPP (Fig. 7) VLSI architecture (Yoshizawa et al.,
2006, Yoshizawa et al., 2004, Yoshizawa et al., 2002). The BSPP architecture consists of three
register arrays and N PEs. $Reg_\mu$ and $Reg_\sigma$ store HMM parameters $\mu_{jp}$ and $\sigma_{jp}$, respectively,
and $Reg_\omega$ stores HMM parameter $\omega_j$ and intermediate results. The PEs in Figs. 7 and 5 are
identical.
Figure 8 shows the flowchart of the computations of the BSPP architecture. The computation
starts by reading all 2NP + N HMM parameters of the v-th HMM from ROM and storing them
to $Reg_\mu$, $Reg_\sigma$, and $Reg_\omega$ in Loop D. For input $o_{tp}$, the intermediate results are computed with
the stored HMM parameters by N PEs. N output probabilities $\log b_1(O_t), \log b_2(O_t), \ldots, \log b_N(O_t)$
of the HMM are obtained by Loop A. The obtained results are fed to a Viterbi scorer. NT
output probabilities of the v-th HMM are obtained by Loop C with the same HMM parameters.
The NTV output probabilities of all HMMs are obtained by Loop D.

Fig. 8. Flowchart of computations using the BSPP architecture
[Flowchart annotations: load $\mu_{jp}$ and $\sigma_{jp}$ of HMM v to $Reg_\mu$ and $Reg_\sigma$ (j = 1, 2, ..., N,
p = 1, 2, ..., P, 2NP cycles); load $\omega_j$ to $Reg_\omega$ (j = 1, 2, ..., N, N cycles); load $o_{tp}$;
N-parallel computation with N PEs. Loops, from innermost to outermost: A, C, D.]

               Register size (bit)
BFPP (ours)    $PM x_o + 2x_\mu + x_\sigma + 2M x_f$
BSPP           $NP x_\mu + NP x_\sigma + N x_f$
Table 1. Register size

               Processing time (cycles)
BFPP (ours)    $(V/L)\,(PM + (1 + 2P)LN)\,\lceil T/M \rceil$
BSPP           $V(2NP + N + PT)$
Table 2. Processing time

Table 1 shows the register size of the BSPP and BFPP architectures, where $x_\mu$, $x_\sigma$, $x_o$, and $x_f$
represent the bit length of $\mu_{jp}$, $\sigma_{jp}$, $o_{tp}$, and the output of a PE, respectively. N, P, and M are the
number of HMM states, the dimension of the input feature vector (frame), and the number of
input frames in a block, respectively.
Table 2 shows the processing time for computing output probabilities of V HMMs with the
BFPP and BSPP architectures, where T and L are the number of input frames and the
number of HMMs whose output probabilities are computed with the same input frames
during Loop D1 of Fig. 6, respectively.
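One way to see where the Table 2 expressions come from is to count cycles by walking the loops of the two flowcharts. The sketch below does that under stated assumptions: the attribution of cycles to individual steps (MP cycles to load a block, one cycle per $\omega_j$, two cycles per dimension p for the double M/2-wide pass, P cycles per frame for BSPP) is an interpretation chosen to agree with the closed forms, L is assumed to divide V, and all function names are hypothetical.

```python
from math import ceil


def bfpp_cycles_by_counting(V, T, N, P, M, L):
    """Count BFPP cycles by stepping through Loops D2, C1, D1 and B."""
    cycles = 0
    for _group in range(V // L):               # Loop D2 (assumes L divides V)
        for _block in range(ceil(T / M)):      # Loop C1
            cycles += P * M                    # load M frames of P elements
            for _hmm in range(L):              # Loop D1
                for _state in range(N):        # Loop B
                    cycles += 1                # load omega_j
                    cycles += 2 * P            # Loop A: two M/2-wide passes per p
    return cycles


def bfpp_cycles_closed_form(V, T, N, P, M, L):
    """Table 2, BFPP row."""
    return (V // L) * (P * M + (1 + 2 * P) * L * N) * ceil(T / M)


def bspp_cycles_by_counting(V, T, N, P):
    """Count BSPP cycles by stepping through Loops D and C."""
    cycles = 0
    for _hmm in range(V):                      # Loop D
        cycles += 2 * N * P + N                # load mu and sigma (2NP), omega (N)
        for _frame in range(T):                # Loop C
            cycles += P                        # Loop A: one o_tp per cycle
    return cycles


def bspp_cycles_closed_form(V, T, N, P):
    """Table 2, BSPP row."""
    return V * (2 * N * P + N + P * T)
```

For example, `bfpp_cycles_by_counting(800, 86, 32, 38, 44, 5)` and `bfpp_cycles_closed_form(800, 86, 32, 38, 44, 5)` both evaluate to 4,477,440, and the two BSPP functions with `(800, 86, 32, 38)` both evaluate to 4,585,600.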

               Register size (bit)    Processing time (cycles)    #PEs
BFPP (ours)    15,512                 4,477,440                   22
BSPP           20,224                 4,585,600                   32
Table 3. Evaluation of the BSPP and BFPP performance

Fig. 9. Evaluation of the BSPP and BFPP performance, and the value of M of the BFPP (N =
32, P = 38, T = 86, V = 800)

Table 3 shows the register size, the processing time, and the number of PEs for computing
output probabilities of 800 HMMs, where we assume that N = 32, P = 38, T = 86, $x_\mu$ = 8, $x_\sigma$ = 8,
$x_o$ = 8, $x_f$ = 24, and V = 800, the same values used in a recent circuit design for isolated word
recognition (Yoshizawa et al., 2006, Yoshizawa et al., 2004). We also assume that M = 44 and
L = 5 for the BFPP architecture. The PEs used in the BSPP and BFPP architectures are
identical. Compared with the BSPP architecture, the BFPP architecture has fewer registers,
requires less processing time, and has fewer PEs. From the VLSI architecture viewpoint, this
is because the register size of the BFPP architecture is independent of N, and its PEs can
repeatedly use the same input frames. The BFPP architecture has fewer wait cycles for
reading data from ROM before parallel computations, 586,240 ($(V/L)(PM + LN)\lceil T/M \rceil$),
than the BSPP architecture, which has 1,971,200 ($V(2NP + N)$).
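The register sizes, PE counts, and wait cycles quoted above follow directly from the stated expressions and parameter values. A small sketch that evaluates them is given below; the function names are hypothetical, and L is assumed to divide V and M to be even.

```python
from math import ceil


def bfpp_register_bits(P, M, x_mu, x_sigma, x_o, x_f):
    """Table 1, BFPP row: independent of the number of states N."""
    return P * M * x_o + 2 * x_mu + x_sigma + 2 * M * x_f


def bspp_register_bits(N, P, x_mu, x_sigma, x_f):
    """Table 1, BSPP row: grows with N * P."""
    return N * P * x_mu + N * P * x_sigma + N * x_f


def bfpp_wait_cycles(V, T, N, P, M, L):
    """Wait cycles before parallel computation, BFPP."""
    return (V // L) * (P * M + L * N) * ceil(T / M)


def bspp_wait_cycles(V, N, P):
    """Wait cycles before parallel computation, BSPP."""
    return V * (2 * N * P + N)


# Parameter values assumed in the evaluation.
N, P, T, V, M, L = 32, 38, 86, 800, 44, 5
x_mu, x_sigma, x_o, x_f = 8, 8, 8, 24

print(bfpp_register_bits(P, M, x_mu, x_sigma, x_o, x_f))   # 15512
print(bspp_register_bits(N, P, x_mu, x_sigma, x_f))        # 20224
print(M // 2, N)                                           # 22 32 (number of PEs)
print(bfpp_wait_cycles(V, T, N, P, M, L))                  # 586240
print(bspp_wait_cycles(V, N, P))                           # 1971200
```

Most of the BFPP register budget (13,376 of 15,512 bits) is the $PM x_o$ term for the stored block of frames, while the BSPP budget is dominated by the $2NP$ stored parameters, which is why only the latter scales with N.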
Fig. 9 shows the processing time and the number of PEs of the BFPP and BSPP architectures,
and the value of M of the BFPP architecture. The processing time and the number of PEs of
the BFPP architecture are less than those of the BSPP architecture when M = 44 (Fig. 9).
From a logic design viewpoint, the register arrays of the BSPP and BFPP architectures are
designed with flip-flops or on-chip multi-port memories of different sizes. Data paths are
designed with identical PEs, but in a different number. The control paths of these
architectures are designed as shown in the flowcharts of Figs. 8 and 6. The data path delay is
the same for both the BSPP and BFPP designs, equal to the delay time of one PE. The delay
times of the control paths differ between the two, but the control path delay is small compared
with the data path delay.

5. Conclusions

We presented BFPP for output probability computations and presented an appropriate VLSI
architecture for its implementation. BFPP performs arithmetic operations on locally stored
input frames, and output probability computations for multiple frames are carried out
simultaneously. Compared with the conventional BSPP architecture, when the number of
HMM states is large enough for accurate recognition, the BFPP architecture requires fewer
registers and PEs, and less processing time. In terms of the VLSI architecture, a fast and
memory efficient VLSI architecture for output probability computations of HMM-based
recognition systems has been presented. A logic design, a Viterbi scorer for the BFPP
architecture, and a reconfigurable architecture for both the BSPP and BFPP architectures are
our future works.

6. References

B. Mathew, A. Davis & Z. Fang (2003a). Perception Coprocessors for Embedded Systems,
Proc. of Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia), pp. 109-
116, 2003.
B. Mathew, A. Davis & Z. Fang (2003b). A Low-Power Accelerator for the SPHINX 3 Speech
Recognition System, Proc. of Int'l Conf. on Compilers, Architecture and Synthesis for
Embedded Systems, pp. 210-219, 2003.
S. Yoshizawa, Y. Miyanaga & N. Yoshida (2002). On a High-Speed HMM VLSI Module with
Block Parallel Processing, IEICE Trans. Fundamentals (Japanese Edition), Vol. J85-A,
No. 12, pp. 1440-1450, 2002.
S. Yoshizawa, N. Wada, N. Hayasaka & Y. Miyanaga (2004). Scalable Architecture for Word
HMM-Based Speech Recognition, Proc. of 2004 IEEE Int'l Symposium on Circuits and
Systems (ISCAS'04), pp. 417-420, 2004.
S. Yoshizawa, N. Wada, N. Hayasaka & Y. Miyanaga (2006). Scalable Architecture for Word
HMM-Based Speech Recognition and VLSI Implementation in Complete System,
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, Vol. 53, No. 1, pp. 70-77,
2006.
X. Huang, F. Alleva, H. W. Hon, M. Y. Hwang, K. F. Lee & R. Rosenfeld (1992). The SPHINX-
II speech recognition system: an overview, Computer Speech and Language, Vol. 7(2),
pp. 137-148, 1992.
