
The Workshop Programme

Sunday, May 30th 2004


Invited talks
9:00 – 9:30 Thomas Hanke
HamNoSys – Representing Sign Language Data in Language Resources and
Language Processing Contexts

9:30 – 10:00 Carol Neidle, Robert G. Lee


Corpus Annotation, SignStream

10:00 – 10:30 Richard Gleaves, Valerie Sutton


SignWriter
10:30 – 11:00 Break

11:00 – 12:30 Oral presentations (I session)


11:00 – 11:30 Galini Sapountzaki, Eleni Efthimiou, Costas Karpouzis, Vassilis Kourbetis
Open-ended Resources in Greek Sign Language: Development of an e-
Learning Platform

11:30 – 12:00 Onno Crasborn, Els van der Kooij, Daan Broeder, Hennie Brugman
Sharing sign language corpora online: proposals for transcription and
metadata categories

12:00 – 12:30 Matt Huenerfauth


Spatial Representation of Classifier Predicates for Machine Translation into
American Sign Language
12:30 – 14:00 Lunch

Oral presentations (II session)
14:00 – 14:30 Antônio Carlos da Rocha Costa, Graçaliz Pereira Dimuro, Juliano Baldez de
Freitas
A Sign Matching Technique to Support Searches in Sign Language Texts

14:30 – 15:00 Angel Herrero


A Practical Writing System for Sign Languages

15:00 – 15:30 Maria Papadogiorgaki, Nikos Grammalidis, Nikos Sarris, Michael G.


Strintzis
Synthesis of Virtual Reality Animations from SWML using MPEG-4 Body
Animation Parameters
15:30 – 16:00 Coffee Break

16:00 – 18:00 Poster session

Eleni Efthimiou, Anna Vacalopoulou, Stavroula Evita Fotinea, Gregory
Steinhauer
Multipurpose Design and Creation of GSL Dictionaries

Chiara Vettori, Oliver Streiter, Judith Knapp


From Computer Assisted Language Learning (CALL) to Sign Language
Processing: the Design of E-LIS, an Electronic Bilingual Dictionary of
Italian Sign Language and Italian

Rubén Nogueira, Jose M. Martínez


19th Century Signs in the Online Spanish Sign Language Library: the
Historical Dictionary Project

Elana Ochse
A language via two others: learning English through LIS

Ingvild Roald
Making Dictionaries of Technical Signs: from Paper and Glue through SW-
DOS to SignBank

Steven Aerts, Bart Braem, Katrien Van Mulders, Kristof De Weerdt


Searching SignWriting Signs

Inge Zwitserlood, Doeko Hekstra


Sign Printing System – SignPS

Boris Lenseigne, Frédérick Gianni, Patrice Dalle


A New Gesture Representation for Sign Language Analysis

Jose L. Hernandez – Rebollar


Phonetic Model for Automatic Recognition of Hand Gestures

Daniel Thomas Ulrich Noelpp


Development of a New „SignWriter“ Program

Ralph Elliott, John Glauert, Vince Jennings, Richard Kennaway


An Overview of the SiGML Notation and SiGMLSigning Software System

Jan Bungeroth, Hermann Ney


Statistical Sign Language Translation

Guylhem Aznar, Patrice Dalle


Computer Support for SignWriting Written Form of Sign Language

Yiqiang Chen, Wen Gao, Changshui Yang, Dalong Jiang, Cunbao Ge


Chinese Sign Language Synthesis and Its Applications

Paola Laterza, Claudio Baj


Progetto e-LIS@
Workshop Organisers

- Oliver Streiter, Research Area “Language and Law” - European Academy of Bolzano, Italy
- Antônio Carlos da Rocha Costa, Escola de Informática - Universidade Católica de Pelotas,
Brazil
- Christian Retoré, Laboratoire Bordelais de Recherche en Informatique, France
- Chiara Vettori, Research Area “Language and Law” - European Academy of Bolzano, Italy

Workshop Programme Committee

- Antônio Carlos da Rocha Costa, Escola de Informática - Universidade Católica de Pelotas,


Brazil
- Carol Neidle, Department of Modern Foreign Languages and Literatures - Boston University,
Boston MA
- Chiara Vettori, Research Area “Language and Law” - European Academy of Bolzano, Italy
- Christian Retoré, Laboratoire Bordelais de Recherche en Informatique, France
- Eva Safar, School of Computing Sciences - University of East Anglia, Norwich, England
- Ian Marshall, School of Computing Sciences, University of East Anglia, Norwich, England
- Marco Consolati, Cooperativa Alba, Torino, Italy
- Oliver Streiter, Research Area “Language and Law” - European Academy of Bolzano, Italy
- Patrice Dalle, Équipe "Traitement et Compréhension d'Images", IRIT - Université Paul
Sabatier, France

Table of Contents
Preface .................................................................................................................................................i
Thomas Hanke
HamNoSys – Representing Sign Language Data in Language Resources and Language Processing
Contexts............................................................................................................................................... 1

Richard Gleaves, Valerie Sutton


SignWriter ........................................................................................................................................... 7

Galini Sapountzaki, Eleni Efthimiou, Costas Karpouzis, Vassilis Kourbetis


Open-ended Resources in Greek Sign Language: Development of an e-Learning Platform ........... 13

Onno Crasborn, Els van der Kooij, Daan Broeder, Hennie Brugman
Sharing sign language corpora online: proposals for transcription and metadata categories ....... 20

Matt Huenerfauth
Spatial Representation of Classifier Predicates for Machine Translation into American Sign
Language........................................................................................................................................... 24

Antônio Carlos da Rocha Costa, Graçaliz Pereira Dimuro, Juliano Baldez de Freitas
A Sign Matching Technique to Support Searches in Sign Language Texts ..................................... 32

Angel Herrero
A Practical Writing System for Sign Languages............................................................................... 37

Maria Papadogiorgaki, Nikos Grammalidis, Nikos Sarris, Michael G. Strintzis


Synthesis of Virtual Reality Animations from SWML using MPEG-4 Body Animation
Parameters ........................................................................................................................................ 43

Eleni Efthimiou, Anna Vacalopoulou, Stavroula Evita Fotinea, Gregory Steinhauer


Multipurpose Design and Creation of GSL Dictionaries ................................................................. 51

Chiara Vettori, Oliver Streiter, Judith Knapp


From Computer Assisted Language Learning (CALL) to Sign Language Processing: the Design of
E-LIS, an Electronic Bilingual Dictionary of Italian Sign Language and Italian............................ 59

Rubén Nogueira, Jose M. Martínez
19th Century Signs in the Online Spanish Sign Language Library: the Historical Dictionary Project
........................................................................................................................................................... 63

Elana Ochse
A language via two others: learning English through LIS ............................................................... 68

Ingvild Roald
Making Dictionaries of Technical Signs: from Paper and Glue through SW-DOS to SignBank..... 75

Steven Aerts, Bart Braem, Katrien Van Mulders, Kristof De Weerdt


Searching SignWriting Signs ............................................................................................................ 79

Inge Zwitserlood, Doeko Hekstra


Sign Printing System – SignPS.......................................................................................................... 82

Boris Lenseigne, Frédérick Gianni, Patrice Dalle


A New Gesture Representation for Sign Language Analysis............................................................ 85

Jose L. Hernandez - Rebollar


Phonetic Model for Automatic Recognition of Hand Gestures......................................................... 91

Daniel Thomas Ulrich Noelpp


Development of a New „SignWriter“ Program................................................................................ 95

Ralph Elliott, John Glauert, Vince Jennings, Richard Kennaway


An Overview of the SiGML Notation and SiGMLSigning Software System ..................................... 98

Jan Bungeroth, Hermann Ney


Statistical Sign Language Translation............................................................................................ 105

Guylhem Aznar, Patrice Dalle


Computer Support for SignWriting Written Form of Sign Language............................................. 109

Yiqiang Chen, Wen Gao, Changshui Yang, Dalong Jiang, Cunbao Ge


Chinese Sign Language Synthesis and Its Applications.................................................................. 111

Paola Laterza, Claudio Baj


Progetto e-LIS@ ............................................................................................................................. 113

Author Index
Aerts, Steven ........................................................................................................................................79
Aznar, Guylhem .................................................................................................................................109
Baj, Claudio .......................................................................................................................................113
Baldez de Freitas, Juliano ....................................................................................................................32
Braem, Bart ..........................................................................................................................................79
Broeder, Daan ......................................................................................................................................20
Brugman, Hennie .................................................................................................................................20
Bungeroth, Jan ...................................................................................................................................105
Chen, Yiqiang ....................................................................................................................................111
Crasborn, Onno ....................................................................................................................................20
da Rocha Costa, Antônio Carlos ..........................................................................................................32
Dalle, Patrice................................................................................................................................85, 109
De Weerdt, Kristof...............................................................................................................................79
Efthimiou, Eleni .............................................................................................................................13, 51
Elliott, Ralph ........................................................................................................................................98
Fotinea, Stavroula Evita.......................................................................................................................51
Gao, Wen ...........................................................................................................................................111
Ge, Cunbao.........................................................................................................................................111
Gianni, Frédérick .................................................................................................................................85
Glauert, John ........................................................................................................................................98
Gleaves, Richard ....................................................................................................................................7
Grammalidis, Nikos .............................................................................................................................51
Hanke, Thomas ......................................................................................................................................1
Hekstra, Doeko.....................................................................................................................................82
Hernandez – Rebollar, Jose L. .............................................................................................................91
Herrero, Angel .....................................................................................................................................37
Huenerfauth, Matt ................................................................................................................................24
Jennings, Vince ....................................................................................................................................98
Jiang, Dalong .....................................................................................................................................111
Karpouzis, Costas.................................................................................................................................13
Kennaway, Richard..............................................................................................................................98
Knapp, Judith .......................................................................................................................................59

Kourbetis, Vassilis ...............................................................................................................................13
Laterza, Paola.....................................................................................................................................113
Lenseigne, Boris...................................................................................................................................85
Martínez, Jose M..................................................................................................................................63
Ney, Hermann ....................................................................................................................................105
Noelpp, Daniel Thomas Ulrich ............................................................................................................95
Nogueira, Rubén ..................................................................................................................................63
Ochse, Elana.........................................................................................................................................68
Papadogiorgaki, Maria .........................................................................................................................51
Pereira Dimuro, Graçaliz .....................................................................................................................32
Roald, Ingvild ......................................................................................................................................75
Sapountzaki, Galini..............................................................................................................................13
Sarris, Nikos.........................................................................................................................................51
Steinhauer, Gregory .............................................................................................................................51
Streiter, Oliver......................................................................................................................................59
Strintzis, Michael G. ............................................................................................................................51
Sutton, Valerie .......................................................................................................................................7
Vacalopoulou, Anna.............................................................................................................................51
van der Kooij, Els ................................................................................................................................20
Van Mulders, Katrien...........................................................................................................................79
Vettori, Chiara......................................................................................................................................59
Yang, Changshui................................................................................................................................111
Zwitserlood, Inge .................................................................................................................................82

Preface
On behalf of the program committee for the LREC 2004 "Workshop on the Processing of Sign Languages", we are pleased to
present you with the proceedings which contain the papers accepted for presentation at the Lisbon meeting on May 30th,
2004.

This volume, full of eye-catching signs, symbols, robots and screen-shots, may charm readers who, although having a sound knowledge of Natural Language Processing, might be confused by the great variety of topics and approaches. How do SignWriting, avatars, XML, videos and image recognition fit together? Are they competing approaches or different solutions to different problems? Where will future research lead us, which endeavours answer real social needs, and which scenarios are still illusionary - or congenially visionary?

As always, the answers to these questions lie between slow and quick, up and down, straight and curved. It is by drawing
analogies to the processing of spoken languages that we might better understand the contribution and benefits of the different
approaches, span the space of possible research and identify future tendencies in the research on the processing of sign
languages.

Trivially speaking, spoken languages are spoken and heard. Sign languages are signed and seen. Spoken languages have been written on stone, wood, paper and electronic media. The technical support has ranged from a chisel to a keyboard. The writing systems which developed have been under the influence of the particular language and the technical support. Having a hammer in your right hand and a chisel in your left makes it difficult to write from left to right. Having stable vowels motivates their representation in the written form. So how can sign languages be written for love letters, poems, verdicts and recipes?

One possible answer is SignWriting. SignWriting does not decompose a sign into phonemes, syllables or morphemes but into body parts, movements and facial expressions, and assigns a representation to each of them. Given such representations - e.g. an alphabet for potentially all sign languages - what might a keyboard, the input system, look like? How are the simple elements (body parts, movements and facial expressions) to be encoded in the computer, and how the composed signs? As pictures, in Unicode or in XML? How will this influence the input of signs, the layout and formatting of SignWriting documents, the possibilities to perform fuzzy matches on texts, in dictionaries, in the Internet? The papers written by Richard
Gleaves, Valerie Sutton (SignWriter), Antônio Carlos da Rocha Costa, Graçaliz Pereira Dimuro, Juliano Baldez de Freitas
(A Sign Matching Technique to Support Searches in Sign Language Texts), Angel Herrero (A Practical Writing System for
Sign Languages), Steven Aerts, Bart Braem, Katrien Van Mulders, Kristof De Weerdt (Searching SignWriting Signs), Daniel
Thomas Ulrich Noelpp (Development of a new 'SignWriter' Program) discuss these and related questions.

SignWriting, however, is by no means the only possible way of writing signs. Thomas Hanke in his invited talk “HamNoSys
– Representing Sign Language Data in Language Resources and Language Processing Contexts” introduces an alternative
approach, the Hamburg Notation System for Sign Languages. The purpose of HamNoSys has never been a usage in everyday
communication. It was designed to comply with research requirements, e.g. for corpus annotation, sign generation, machine
translation and dictionary construction. It thus differs from SignWriting in its scope and granularity. Unicode and XML
solutions are available for HamNoSys, cf. Ralph Elliott, John Glauert, Vince Jennings and Richard Kennaway in their
contribution “An Overview of the SiGML Notation and SiGMLSigning Software System”.

Once these fundamental questions regarding the writing of sign languages are settled, derived notions such as word n-grams and character n-grams, important for computational approaches, may be used for applications such as language recognition, document classification and information retrieval. Spelling checking, syntax checking and parsing are obvious further developments once these more fundamental questions about the writing of signs have been agreed upon.

It is a matter of fact, however, that most signers have not been trained in reading or writing in SignWriting. What is known as
“text-to-speech” in the processing of spoken languages would seem a possible solution: a front-end to web-pages, mail boxes
etc. would sign out the written text. As shown by Maria Papadogiorgaki, Nikos Grammalidis, Nikos Sarris, Michael G.
Strintzis in “Synthesis of Virtual Reality Animations from SWML using MPEG-4 Body Animation Parameters” and Yiqiang
Chen, Wen Gao, Changshui Yang, Dalong Jiang and Cunbao Ge in “Chinese Sign Language Synthesis and Its Applications”,
avatars, i.e. virtual signers, may be constructed which translate a written form of a sign language or spoken language into
signs, just like translating "d" into the corresponding sound wave.

A front-end on the input side of the system might translate signs into a written representation. Speech Recognition becomes
Sign Recognition. Two different techniques are introduced. Recognition with the help of a data glove proceeds from the
signer's perspective and his/her articulations, cf. Jose L. Hernandez-Rebollar’s contribution “Phonetic Model for Automatic

Recognition of Hand Gestures”. This approach may seem in line with the definition of phonemes in terms of their articulation
and not their acoustic properties. On the other hand, it does not match our every-day experience in which we use a
microphone and not electronic contact points at our vocal cords, tongue, velum, teeth and lips when using a telephone. The
recognition of signs with the help of cameras, the second alternative, leads to the description of signs from the observer's
point of view, in terms of formants and f0, so to say. However, the articulation can be reconstructed and might be a better
representation for the signs than the ‘phonetic’ description, as suggested by Boris Lenseigne, Frédérick Gianni, and Patrice
Dalle in “A New Gesture Representation for Sign Language Analysis”.

Both modules, sign recognition and sign generation, may serve MT systems with a sign language as source or target language
respectively. A sign language as target language is used in translation experiments described by Jan Bungeroth and Hermann
Ney in “Statistical Sign Language Translation”. This corpus-based approach to Machine Translation, by the way, raises the
question of sign language corpora. The only paper which really tackles the question of signed corpora in this collection is that
of Onno Crasborn, Els van der Kooij, Daan Broeder, Hennie Brugman, “Sharing sign language corpora online: proposals for
transcription and metadata categories”. Matt Huenerfauth, in his contribution “Spatial Representations for Generating
Classifier Predicates in an English to American Sign Language Machine Translation System”, focuses on a particularly difficult aspect
of sign language generation, the classifier predicates. Thus, when signing "leaves are falling", it is not enough to generate the
signs "leaf" and "falling", e.g. a downward movement. Instead the handshape of "falling" should indicate the kind of object
that is falling, e.g. with a flat hand.

The usage of classifiers leads us directly to the question of how to construct dictionaries for sign languages. Learners'
dictionaries, reference dictionaries, and dictionaries for NLP applications all need information about part of speech, lexical
functions, idioms, subcategorization and semantics, which by no means is the same as in the national spoken language.
How do we search in a sign language dictionary? Have you ever looked something up in a Chinese or Japanese dictionary? Paola Laterza
and Claudio Baj in their paper “Progetto e-LIS@” propose an at least partially equivalent approach to the ordering of signs in
a sign language dictionary.
How do you present the dictionary content to a learner? In the national spoken language or in SignWriting? The complexity
of the question can be gauged from Elana Ochse’s contribution “A Language via Two Others, Learning English through
LIS”. Should we use videos, photos, animations or drawings to represent the entries in dictionaries? A number of authors
discuss these and related topics in the context of specific dictionary projects: for static presentations, i.e. paper dictionaries,
Inge Zwitserlood and Doeko Hekstra propose the “Sign Printing System – SignPS” to compose pictures of signs. Eleni
Efthimiou, Anna Vacalopoulou, Stavroula-Evita Fotinea, Gregory Steinhauer focus in their paper “Multipurpose Design and
Creation of GSL Dictionaries” on the content, i.e. the types of information to be included in a sign language dictionary.
Chiara Vettori, Oliver Streiter and Judith Knapp focus on different user requirements and the possible role of SignWriting in
a sign language dictionary. Rubén Nogueira and Jose M. Martínez present a dictionary project of a particular kind: “19th
Century Signs in the Online Spanish Sign Language Library: the Historical Dictionary Project.” Ingvild Roald finally gives a
practical account of the history of techniques for the creation of sign language dictionaries, discussing advantages and
drawbacks of the respective approaches.

When writing these lines, the preparation of the workshop and the proceedings is almost finished. This workshop wouldn’t
have been possible without the energy many people have invested in their spare time. First of all we would like to thank the
authors who have done their best and provided superb papers. Our thanks go also to the reviewers for their detailed and
inspiring reviews. Last but not least we want to thank Sara Goggi who accompanied the workshop on behalf of the LREC
Programme Committee.

In closing we would like to thank you for attending the workshop, and we hope you will have a great time.

Oliver Streiter and Antônio Carlos da Rocha Costa


April 22, 2004

HamNoSys – Representing Sign Language Data in Language
Resources and Language Processing Contexts
Thomas Hanke
University of Hamburg
Institute of German Sign Language and Communication of the Deaf
Binderstraße 34, 20146 Hamburg, Germany
[email protected]

Abstract
This paper gives a short overview of the Hamburg Notation System for Sign Languages (HamNoSys) and describes its application
areas in language resources for sign languages and in sign language processing.

1. Introduction

The Hamburg Notation System for Sign Languages (HamNoSys) is an alphabetic system describing signs on a mostly phonetic level. Like many sign notation systems developed in the last 30 years, it has its roots in the Stokoe notation system, which introduced an alphabetic system to describe the sublexical parameters location, hand configuration (in most cases, the handshape only) and movement to give a phonological description of American Sign Language signs (Stokoe, 1960).

HamNoSys (first version defined in 1984, first published version Prillwitz et al., 1987), however, was designed to be usable in a variety of contexts with the following goals in mind:
• International use: HamNoSys transcriptions should be possible for virtually all sign languages in the world, and the notation should not rely on conventions differing from country to country, such as the national fingerspelling alphabets.
• Iconicity: As the large number of possible parameter variations did not allow for a standard alphabet (e.g. the Roman alphabet) familiar to the users, newly created glyphs should be designed in a way that helps to memorise or even deduce the meaning of the symbols wherever possible.
• Economy: While it should be possible to transcribe any signed utterance (even sign errors) with HamNoSys, notation of the majority of signs should make use of principles such as symmetry conditions, resulting in much shorter notation for the average sign.
• Integration with standard computer tools: The notation system should be usable within computer-supported transcription as well as in standard text processing and database applications.
• Formal syntax: The notation language should have a well-defined syntax, and its semantics should generally follow the compositionality principle.
• Extensibility: As it seemed obvious that, given the state of the art in sign language research, a notation system would not be capable of addressing all aspects of sign formation description for all sign languages right from the beginning, HamNoSys should allow both for a general evolution and for specialisations. New versions of the system should not render old transcriptions invalid.

More than fifteen years after the first published version, HamNoSys is now at version 4 (Schmaling/Hanke, 2001). This latest version filled some minor gaps and introduced some shortcuts, but more importantly addressed issues related to using HamNoSys in a sign language generation context. For the latter purpose, it was also complemented with a new set of systems to encode nonmanual behaviour at a level of detail not previously possible in HamNoSys.

2. Overview of the System

2.1. General Structure

A HamNoSys notation for a single sign consists of a description of the initial posture (describing nonmanual features, handshape, hand orientation and location) plus the actions changing this posture in sequence or in parallel. For two-handed signs, the initial posture notation is preceded by a symmetry operator that defines how the description of the dominant hand copies to the non-dominant hand unless otherwise specified. Specifications of nonmanual features and actions are optional. If the location specification is missing, a default location is assumed.
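This structure can be made concrete with a small data model. The sketch below is purely illustrative and not part of HamNoSys or any of the tools discussed in this paper; all class and field names are invented, and plain strings stand in for HamNoSys symbols:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Posture:
        handshape: str                    # basic form plus diacritics (see 2.2)
        orientation: str                  # finger direction + palm orientation (see 2.3)
        location: Optional[str] = None    # None = default location (see 2.4)
        nonmanual: Optional[str] = None   # nonmanual features are optional

    @dataclass
    class Sign:
        posture: Posture                  # the initial posture
        actions: list                     # actions in sequence; nested lists = parallel
        symmetry: Optional[str] = None    # symmetry operator, two-handed signs only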
2.2. Handshapes

The description of a handshape is composed of symbols for basic forms and diacritics for thumb position and bending. In addition, deviations from this general description with respect to the fingers involved or the form of individual fingers can be specified. Where necessary, intermediate forms can be described as well.

By this combinatorial approach, the set of describable handshapes is rather large and is supposed to include all handshapes actually used in sign languages documented so far.

Dynamic handshapes as defined for German Sign Language by Prillwitz et al. (2002) are not considered primitives in HamNoSys. Instead, the initial handshape of an opening or closing dynamic pair appears within the posture, whereas the second one appears as the target of a handshape change action. For wiggling etc., one representative handshape is described in the posture; the wiggling itself, however, is described as an action.

2.3. Hand Orientation

HamNoSys describes the orientation of the hand by combining two components: extended finger direction (i.e. for index hands the index direction), specifying two degrees of freedom, and palm orientation, determining the third degree. By providing symbols for both components at 45° intervals, a sufficiently fine-grained determination of the 3D orientation of the hand becomes possible.

The three perspectives used for the extended finger direction (signer's view, birds' view, and view from the right) are reflected in the glyphs by no reference line, a horizontal reference line, or a vertical reference line representing the signer's body. (The same model is used for movements.) Redundant symbols, such as |J, are not used. Insofar, there is a priority ordering between the three views determining which view is to be used for each symbol.

For the third degree of freedom, only eight symbols are needed. The meaning of a symbol is defined relative to the extended finger direction (Qd palm down, Hd palm away from the body etc.).

By adding a subscript, hand orientation can be made relative to the movement, e.g. the palm orientation changes as the movement direction changes:

13Qel … §•
HOUSE
2.4. Location

As with hand orientation, location specifications are split into two components: the first determines the location within the frontal plane (x and y coordinates), whereas the second determines the z coordinate. If the second part is missing, a "natural" distance of the hand from the body is assumed. If both parts are missing, the hand is assumed to be located in "neutral signing space", i.e. with "natural" distance in front of the upper torso.

For two-handed signs, the location may also describe the relation of the two hands to each other ("hand constellation"), as describing the positions of the two hands with respect to body parts might not be precise enough.

2.5. Actions

Actions are combinations of path movements (i.e. movements changing the position of the hand) and in-place movements of the hands, as well as nonmanual movements. The combinations can be performed either sequentially or cotemporally. In HamNoSys, path movement building blocks are straight lines, curved and zigzag lines, circles and similar forms. Here again, a quantization of 45° is applied.

Path movements can be specified either as targeted movements (target specified as a location) or as relative movements (target determined by the direction and the size of the movement). In-place movements are changes in handshape or hand orientation as well as wiggling, twisting etc.

For all movement components, diacritic symbols to specify size can be added. Furthermore, for each movement a mode (such as slow or sudden stop) can be specified. Repetitions of actions can be specified either by exact numbers or as multiple repetition. In each case, a repetition can be continuous or recurrent.

The mere concatenation of actions means their performance in sequence, whereas actions in square brackets are done in parallel. E.g. a circle movement in square brackets with a straight movement results in a spiral movement. For two-handed actions, it is possible to specify different actions for each hand to be performed simultaneously.
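These composition rules map naturally onto code: concatenation is sequencing, and square brackets are parallel execution. The following sketch is illustrative only; the constructor names are invented, and strings again stand in for HamNoSys movement symbols:

    def seq(*actions):
        """Plain concatenation: actions performed one after the other."""
        return ("seq", actions)

    def par(*actions):
        """HamNoSys square brackets: actions performed at the same time."""
        return ("par", actions)

    def repeat(action, times=None, continuous=False):
        """Repetition by exact count, or a continuous/recurrent repetition."""
        return ("repeat", action, times, continuous)

    # A circle movement bracketed with a straight movement yields a spiral.
    spiral = par("circle", "straight")
    greeting = seq("move-forward", repeat(spiral, times=2))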
2.6. Two-handed Signs

The notation of a two-handed sign begins with a symmetry marker. This symbol determines how to copy the specification for the dominant hand to the non-dominant hand. Exceptions can always be specified by separately describing configurations or actions for each hand. Example:

0æ7%78øQdƒ

(German Sign Language NINETEEN): Both hands have the same hand orientation and the same movement, but they differ in their handshapes.

2.7. Nonmanual Components

As most notation systems, HamNoSys focuses on the description of the manual activities within a sign. The descriptive power of the existing system with respect to nonmanuals is rather limited: for each action, HamNoSys allows the specification of an articulator to replace the hand. The actions available are those introduced for the hands. This allows appropriate descriptions for shoulder shrugging, head movements etc., but not necessarily facial expressions or mouth movements.

Originally, it was planned to add a facial circle to be complemented with diacritics for eyes, eyebrows, nose, cheeks, and mouth. At that time, however, practical limitations did not allow for the sheer number of diacritical symbols to be put into one font. Later suggestions added movement primitives to HamNoSys that targeted facial movements.

For the time being, we use a rather unsophisticated coding scheme to specify a number of nonmanual tiers in a multi-tier transcription scheme, with the HamNoSys manual tier being the master tier. Synchronization is generally done on a sign level only.

Coding schemes are defined for eye gaze, facial expression (eye brows, eye lids, nose), mouth gestures and mouth pictures. The separation from the manual parts allows codes to be defined for states as well as for movements, i.e. sequences of states (e.g. TB tightly shut eyelids vs. BB eye blink). For mouth gestures, the coding scheme simply enumerates all gestures identified so far, e.g.:

C01     cheeks puffed                                          (static)
C02     cheeks and upper and lower lip areas puffed            (static)
C03     cheeks puffed gradually                                (dynamic)
C04(C)  one cheek puffed                                       (static)
C05(C)  one cheek puffed; blow out air briefly at corner
        of one's mouth                                         (dynamic)
C06(C)  one cheek puffed; blow out air briefly at corner of
        one's mouth when touching cheek with index finger      (dynamic)
C07     cheeks sucked in, without sucking in air               (static)
C08     cheeks sucked in, sucking in air through slightly
        open lips                                              (dynamic)
C09(C)  tongue pushed into cheek (visible from outside)        (static)
C10(C)  tongue pushed into cheek several times (visible
        from outside)                                          (dynamic)
C11(C)  one cheek puffed; blow out air briefly at corner
        of one's mouth several times                           (dynamic)
C12     lips closed, tongue pushed behind bottom lip/chin
        (visible from outside)                                 (static)

A complete documentation of these nonmanual coding schemes can be found in Hanke et al. (2001).

2.8. Implementation

The HamNoSys symbols are available as a Unicode font, with the characters mapped into the Private Use Area of Unicode.

For MacOS X, a keyboard layout has been defined that can be automatically activated once text in the HamNoSys font is selected. This keyboard graphically arranges the characters on the keyboard, e.g. the arrows in circles with 45° sectors. This makes learning keyboard input rather easy for those using HamNoSys every day. For people who use the system less frequently, even this keyboard is too much to memorise. Here we offer (for both MacOS and Windows) a small input utility that allows the user to construct the HamNoSys string by clicking on the appropriate symbols on (user-configurable) palettes.

A syntax-oriented editor was available for HamNoSys (Schulmeister, 1990), but has not been updated since then. Within the ViSiCAST project (cf. Schulmeister, 2001), SiGML, an XML equivalent to HamNoSys, has been defined (Elliott et al., 2000).
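Because the symbols sit in the Private Use Area, a HamNoSys notation is an ordinary Unicode string once the font is installed, and can be stored and searched like any other text. In the sketch below, the code points and symbol names are invented for illustration; the actual assignments are those defined by the HamNoSys font and are not reproduced here:

    # Hypothetical Private Use Area assignments, for illustration only.
    SYMBOLS = {
        "fist":       "\uE000",   # a basic handshape form
        "thumb_out":  "\uE001",   # a diacritic
        "finger_up":  "\uE010",   # an extended finger direction
        "palm_down":  "\uE020",   # a palm orientation
    }

    def notation(*names: str) -> str:
        """Concatenate named symbols into a HamNoSys string."""
        return "".join(SYMBOLS[name] for name in names)

    s = notation("fist", "thumb_out", "finger_up", "palm_down")
    print(len(s))   # 4 characters, storable and searchable like any text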

3. Dictionaries

In many sign language dictionaries, you find notation as a description of how to perform an entry. Depending on the media used, the notation is part of a multitude of form descriptions, e.g. video, photos or drawings with or without arrows, performance instructions in written language, etc. Today's sign language dictionaries mostly present only the citation form of the sign; some possibly add unstructured information like "directional verb" to indicate the kind and richness of morphological derivation that can be applied to the sign.

Notation is also used to provide some means of access to the dictionary contents from the sign language side: for search access, you normally find partial matching strategies. In the case of HamNoSys, with its relatively high degree of detailedness, we add fuzzy search mechanisms to allow for variation. For browsing access (and this includes of course printed dictionaries), the lexemes (or an index thereof) are ordered according to only some parameters expressed in the notation. For HamNoSys, it is obvious why the order on HamNoSys strings induced by some order of the HamNoSys alphabet is not really useful: with about 200 symbols, no user will be able to memorise this ordering, and, for a given sign, you often find several equivalent HamNoSys notations, and HamNoSys still lacks a reduction method to identify one of these as the canonical notation. (For an example, cf. Konrad et al., 2003.)
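Since a sign often has several equivalent notations and no canonical reduction, exact string lookup is brittle, and a tolerant comparison such as edit distance is one way such fuzzy search mechanisms could be realised. The following sketch is an assumption about how this might look, not the implementation actually used with HamNoSys:

    def edit_distance(a: str, b: str) -> int:
        """Levenshtein distance between two symbol strings."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                  # deletion
                               cur[j - 1] + 1,               # insertion
                               prev[j - 1] + (ca != cb)))    # substitution
            prev = cur
        return prev[-1]

    def fuzzy_matches(query: str, lexicon: dict, tolerance: int = 2):
        """All entries whose notation is within `tolerance` edits of the query."""
        return [gloss for notation, gloss in lexicon.items()
                if edit_distance(query, notation) <= tolerance]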

4. Transcription of Signed Corpora

Notation is used to transcribe linguistic data (by viewing video) to provide an efficient form description and to make it accessible to analysis. In the first years, notation was that part of the transcription that came closest to raw data. But even after the integration of digital video, notation did not become superfluous, as it makes data searchable for phonetic aspects (cf. Hanke/Prillwitz, 1995 and Hanke, 2001).

In most cases, the notation was used to describe the observed event at a certain level of detailedness. No attempt was made to relate the observed token to its type. One exception is the work by Johnston (1991), who, after giving the citation form of the type, describes how the token deviates from the type. In the context in which he introduced this notational convention, he considered only those derivations that are morphologically relevant, but it is easy to see how this could be extended.

iLex, our recent approach to corpus transcription (Hanke, 2002a), ties a lexicon into the transcription system and requires the user to relate each token to a type, a function considered absolutely necessary to ensure data consistency in larger corpus transcriptions, which usually are team efforts and therefore cannot rely on the transcriber's intimate knowledge of the data already processed. What may be substituted in spoken language corpora by automatically searching the transcription data cannot be avoided for sign language corpora as long as HamNoSys or other notation systems do not establish a working orthography.

5. Generation

One of the first projects HamNoSys was used in is H.AN.D.S. (Hamburg Animated Dictionary of Signs, cf. Prillwitz/Schulmeister, 1987), which represented dictionary entries by the notation and a two-dimensional animation automatically created from the notation. Due to the immense number of high-precision drawings needed for that purpose, only a subset of HamNoSys could be correctly animated at the end of the project. The upcoming digital video technology then pushed animation to the background as far as sign language dictionaries were concerned. However, in the context of spoken-to-sign language translation systems, animation promises far better results than digital video: while Krapez/Solina (1999) describe a method to improve sign-to-sign video blending, they also outline the limitations. Animation technology can not only model transitional movements between signs, but, based on a suitable language model, provide uncountable morphological variations of sign lexicon entries as needed for locatable signs, directional verbs etc. Kennaway (2002, 2004) describes the ViSiCAST animation component based on SiGML.

The language model used in ViSiCAST is an HPSG feature structure. Depending on the morphological richness of a lexical entry, the structure may be fully instantiated with HamNoSys values, or might contain more complex structures only finally reducible into HamNoSys values. For a locatable sign like HOUSE in German Sign Language, this roughly looks as follows:

Handedness          1
Handshape           3
Orientation         Qel
Handconstellation   …
Location            1
Movement            §•

Using HamNoSys symbols as HPSG feature values is quite convenient as the user can immediately grasp the meaning of the values, and the approach has been successfully applied to a range of sign-language-specific phenomena such as classifier and number incorporation, directional verb signs and locatable signs. Problems remain where the independence of sign parameters is an over-simplification. This is easily illustrated with the example MOVE–classifier:car–source:right-side-in-front-of-the-signer–goal:left-side-in-front-of-the-signer. Once the feature structure for double-track vehicles

Handedness
Handshape           3
Orientation         Qld
Handconstellation

is unified with the lexical entry for MOVE and everything from source and goal except the height in signing space, the result is equivalent to the following lambda expression:

λℵ. 3Ndℵß

With heights above chest level, this results in highly unnatural signing: instead of

3Nd}ß

one would sign

3ì3AOd}ß

Apparently the assumption that a classifier feature structure should specify whole handshapes and hand orientations is too restrictive. Instead, one might want to specify a part of the hand and this part's orientation. While it is always possible to translate the fully instantiated structures into standard HamNoSys to feed into the animation component, this would distribute the need for anatomical knowledge over two system components, the language model and the animation component - a highly undesirable situation. Instead, it might be a better idea to allow parts of handshapes and orientations thereof, instead of complete handshapes with hand orientation, in HamNoSys itself. A suggestion in this direction, also discussing other classes of examples, has been made by Hanke (2002b).
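The unification step in the MOVE example can be pictured with plain dictionaries standing in for flat feature structures: features specified in both structures must agree, and unspecified ones are filled in. This toy sketch ignores typed features, re-entrancy and the rest of the HPSG machinery; the function and variable names are invented, while the values 3 and Qld are taken from the example above:

    def unify(fs1: dict, fs2: dict):
        """Unify two flat feature structures; None marks an unspecified value."""
        result = dict(fs1)
        for feature, value in fs2.items():
            if result.get(feature) is None:
                result[feature] = value        # fill in an unspecified feature
            elif value is not None and result[feature] != value:
                return None                    # conflicting values: unification fails
        return result

    # MOVE leaves its parameters unspecified; the classifier entry for
    # double-track vehicles contributes handshape and hand orientation.
    move = {"Handshape": None, "Orientation": None, "Location": None}
    double_track = {"Handshape": "3", "Orientation": "Qld"}
    print(unify(move, double_track))
    # {'Handshape': '3', 'Orientation': 'Qld', 'Location': None}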
6. Applications beyond Sign Language

While Miller (2001) reports that HamNoSys and the family of derivatives of the Stokoe notation are the most widely used systems in research, it seems that even more people use HamNoSys outside sign language research, namely in gesture research.

In the Berlin Dictionary of Everyday Gestures (Posner et al., in prep.), HamNoSys is used in toto to describe the form of the gestures in addition to photos and verbal descriptions.

A number of gesture description schemes inherit structure and/or feature values from HamNoSys, such as MURML (Kopp et al., 2004), FORM (Martell, 2002) and CoGesT (Gut et al., 2003). KINOTE (Hofmann/Hommel, 1997) was described by the authors as a kinematic transcription of a subset of HamNoSys. Some of these representation languages are also the target for gesture recognition, be they based on data gloves or video, so that HamNoSys is indirectly also used in recognition contexts for gesture.

7. Outlook

New application areas will always pose new requirements on a system such as HamNoSys, so we currently see no end to the development of the system.

Obviously, one application area for HamNoSys is still missing: sign language recognition. Only a few sign language recognition systems work on a sublexical level (e.g. Vogler/Metaxas, 2004), and all recognition systems today work with rather small sign inventories. In the future, language models in connection with lexicons might help recognition systems to cover larger subsets of a sign language, and it would be interesting to see how HamNoSys fits into such a system.

For transcription schemes of signed texts as well as lexicon building, data on intra- and inter-transcriber reliability could contribute another aspect to the question of how useful a phonetic transcription of signs is.

The use of HamNoSys in a number of gesture description systems might turn out to be a useful key to link sign language resources and processing models with the larger field of multimodality.

8. References

Elliott, R., J. Glauert, R. Kennaway and I. Marshall, 2000. The development of language processing support for the ViSiCAST project. In The fourth international ACM conference on Assistive technologies ASSETS. New York: ACM. 101–108.
Gut, U., K. Looks, A. Thies, T. Trippel and D. Gibbon, 2003. CoGesT - Conversational Gesture Transcription System - version 1.0. http://www.spectrum.uni-bielefeld.de/modelex/publication/techdoc/cogest/CoGesT_1.pdf.
Hanke, T., 2001. Sign language transcription with syncWRITER. Sign Language and Linguistics 4(1/2): 267–275.
Hanke, T., 2002a. iLex - A tool for sign language lexicography and corpus analysis. In M. González Rodriguez and C. Paz Suarez Araujo (eds.), Proceedings of the third International Conference on Language Resources and Evaluation. Las Palmas de Gran Canaria, Spain. Paris: ELRA. 923-926.
Hanke, T., 2002b. HamNoSys in a sign language generation context. In R. Schulmeister and H. Reinitzer (eds.), Progress in sign language research: in honor of Siegmund Prillwitz / Fortschritte in der Gebärdensprachforschung: Festschrift für Siegmund Prillwitz. Seedorf: Signum. 249–266.
Hanke, T. and S. Prillwitz, 1995. syncWRITER: Integrating video into the transcription and analysis of sign language. In H. Bos and T. Schermer (eds.), Sign Language Research 1994: Proceedings of the Fourth European Congress on Sign Language Research, Munich, September 1-3, 1994. Hamburg: Signum. 303–312.
Hanke, T., G. Langer and C. Metzger, 2001. Encoding non-manual aspects of sign language. In T. Hanke (ed.), Interface definitions. ViSiCAST Deliverable D5-1.
Hofmann, F. and G. Hommel, 1997. Analyzing human gestural motions using acceleration sensors. In P. Harling and A. Edwards (eds.), Progress in gestural interaction. Proceedings of Gesture Workshop '96. Berlin: Springer. 39-59.
Johnston, T., 1991. Transcription and glossing of sign language texts: Examples from AUSLAN (Australian Sign Language). International Journal of Sign Linguistics 2(1): 3–28.
Kennaway, R., 2002. Synthetic animation of deaf signing gestures. In I. Wachsmuth and T. Sowa (eds.), Gesture and sign language in human-computer interaction. International Gesture Workshop, GW 2001, London, UK, April 18-20, 2001 Proceedings. Berlin: Springer. 146-157.
Kennaway, R., 2004. Experience with and requirements for a gesture description language for synthetic animation. In A. Camurri and G. Volpe (eds.), Gesture-Based Communication in Human-Computer Interaction, 5th International Gesture Workshop, GW 2003, Genova, Italy, April 15-17, 2003, Selected Revised Papers. Berlin: Springer. 300-311.
Konrad, R. et al., 2003. Fachgebärdenlexikon Sozialarbeit/Sozialpädagogik. Hamburg: Signum 2003.
[WWW version at http://www.sign-lang.uni-hamburg.de/solex/]
Kopp, S., T. Sowa and I. Wachsmuth, 2004. Imitation
games with an artificial agent: from mimicking to
understanding shape-related iconic gestures. In A.
Camurri and G. Volpe (eds.), Gesture-Based
Communication in Human-Computer Interaction, 5th
International Gesture Workshop, GW 2003, Genova,
Italy, April 15-17, 2003, Selected Revised Papers.
Berlin: Springer. 436-447.
Krapez, S. and F. Solina, 1999. Synthesis of the sign
language of the deaf from the sign video clips.
Elektrotehniski vestnik. 66(4-5). 260-265.
Martell, C., 2002. Form: An extensible, kinematically-
based gesture annotation scheme. In M. González
Rodriguez and C. Paz Suarez Araujo (eds.),
Proceedings of the third International Conference on
Language Resources and Evaluation. Las Palmas de
Gran Canaria, Spain. Paris: ELRA. 183-187.
Miller, C., 2001. Some reflections on the need for a
common sign notation. Sign Language and Linguistics.
4(1/2):11-28.
Posner, R., R. Krüger, T. Noll and M. Serenari, in prep.
Berliner Lexikon der Alltagsgesten. Berlin: Berliner
Wissenschafts-Verl.
Prillwitz, S. and R. Schulmeister, 1987. Entwicklung eines
computergesteuerten Gebärdenlexikons mit bewegten
Bildern. Das Zeichen 1(1): 52-57.
Prillwitz, S. et al., 1987. HamNoSys. Hamburg Notation
System for Sign Languages. An introduction. Hamburg:
Zentrum für Deutsche Gebärdensprache.
Prillwitz, S. et al., 2002. Zur Phonetik und Phonologie der
Deutschen Gebärdensprache. Manuscript, University of
Hamburg.
Schmaling, C. and T. Hanke, 2001. HamNoSys 4.0. In: T. Hanke (ed.), Interface definitions. ViSiCAST Deliverable D5-1. [This chapter is available at http://www.sign-lang.uni-hamburg.de/projekte/HamNoSys/HNS4.0/englisch/HNS4.pdf]
Schulmeister, R., 1990. Zentrum erhält den Deutschen
Hochschul-Software Preis 1990. Das Zeichen 4(11): 73-
77.
Schulmeister, R., 2001. The ViSiCAST Project:
Translation into sign language generation of sign
language by virtual humans (avatars) in television,
WWW and face-to-face transactions. In C. Stephanidis
(ed.), Universal access in HCI : towards an information
society for all. Hillsdale, NJ: Erlbaum. 431-435.
Stokoe, W., 1960. Sign language structure: An outline of
the visual communication systems of the American deaf.
Buffalo, NY: Univ. of Buffalo.
Vogler, C. and D. Metaxas, 2004. Handshapes and
movements: multiple-channel American Sign Language
recognition. In A. Camurri and G. Volpe (eds.),
Gesture-Based Communication in Human-Computer
Interaction, 5th International Gesture Workshop, GW
2003, Genova, Italy, April 15-17, 2003, Selected
Revised Papers. Berlin: Springer. 247-258.

SIGNWRITER
Richard Gleaves
Valerie Sutton

Deaf Action Committee for SignWriting


Center For Sutton Movement Writing
La Jolla, California, USA
[email protected]
[email protected]

Abstract

This paper reviews the design history of SignWriter, a word processor for the SignWriting system. While the primary goal of SignWriter was simply to create a word processor for SignWriting, its development and subsequent use had several beneficial effects on the SignWriting system. Various design aspects of SignWriter are considered in the context of current computing technologies and sign processing development efforts.

Background

The SignWriting system [Sutton04] was conceived, developed, and used for many years as a hand-written notation. In particular, its use predated the introduction of low-cost personal computers.

In 1984 Emerson and Stern Associates, a small educational research and development firm, received a grant to develop a word processor for SignWriting. The resulting software, which operated on an Apple II computer, supported only a minor subset of the SignWriting system and was more of a demonstration than a useful tool: it was not subsequently used, and received no further development. The application was notable for displaying the symbols in a virtual "picture frame" around a central editing area, with symbols selected for entry by moving a cursor around the frame until the desired symbol was reached.

Emerson and Stern's software design implied that SignWriting was too complex for the personal computers of the time. Interestingly, their response was to devise an entirely different writing system named SignFont [Newkirk87], which traded computational simplicity - it was designed as a standard Macintosh font - for notational obscurity. SignFont's subsequent nonuse suggests that this design tradeoff was unsuccessful.

SignWriter Apple

It was in this context that SignWriter was conceived in 1986. The intended use for SignWriter was in education and the hardware platform was once again the Apple II, which at the time was an established standard for personal computing. The design goal was to implement the full SignWriting system in a simple but complete and usable word processor.

This more ambitious goal could be attempted on the same hardware because, as a former member of the UCSD Pascal project, Richard Gleaves had several years of experience developing system software for the Apple II, and knew how to program in assembly language and make full use of the Apple's 128KB memory. In addition, Gleaves' Pascal project colleague Mark Allen provided some high-performance graphics routines that he had developed for writing arcade-style games on the Apple II.

Much of the design effort in SignWriter was spent on two issues:
• Developing a memory-efficient encoding for SignWriting text
• Devising user interface mechanisms for efficiently typing symbols

SignWriting symbols were encoded using a variable-length byte-code system that was introduced in UCSD Pascal p-code [Bowles78] and later adopted for use in Java object code. The SignWriter graphics engine interpreted the byte codes as instructions for drawing symbols on the screen in specific locations and orientations.
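Such an encoding can be pictured as a small instruction stream: an opcode byte announces what follows, so frequent symbols stay compact while others carry extra parameter bytes. The opcodes and layout below are invented for illustration - the actual SignWriter byte-code format is not documented here:

    SYMBOL = 0x01   # followed by: symbol id, x, y
    ATTR = 0x02     # followed by: one attribute byte (e.g. rotation/mirror)
    END = 0x00      # end of sign

    def decode_sign(data: bytes):
        """Walk one sign's byte codes, yielding drawing instructions."""
        i = 0
        while data[i] != END:
            op = data[i]
            if op == SYMBOL:
                yield ("draw", data[i + 1], data[i + 2], data[i + 3])
                i += 4
            elif op == ATTR:
                yield ("attr", data[i + 1])
                i += 2
            else:
                raise ValueError(f"unknown opcode {op:#x}")

    example = bytes([SYMBOL, 7, 12, 30, ATTR, 0x03, END])
    print(list(decode_sign(example)))   # [('draw', 7, 12, 30), ('attr', 3)]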
entirely different writing system named SignFont
[Newkirk87], which traded computational simplicity - it Typing was chosen as the input mode for two reasons.
was designed as a standard Macintosh font - for First, while mice were available for the Apple II they
notational obscurity. SignFont's subsequent nonuse were an optional add-on and therefore most Apple IIs did
suggests that this design tradeoff was unsuccessful. not have them. Second, the SignWriting system was
receiving criticism at the time for allegedly being a form

7
of illustration rather than a true writing system. Therefore The Apple II version of SignWriter supported the full
an efficient typing mechanism would cause SignWriter to SignWriting system as it was defined at the time (palm
serve as implicit proof that SignWriting was indeed a orientation had not yet been introduced). The software
form of writing. was quite usable, but was never widely used because
experienced SignWriting users had to type in each
It was evident that SignWriting's complex symbol set occurrence of each sign, while for new users typing
would prevent it from being typed as efficiently as the symbols was relatively inefficient and – in the absence of
Roman alphabet on a standard keyboard. However, the a system for teaching typing – posed a significant
design that evolved - which involved the context- learning curve.
sensitive dynamic redefinition of the keyboard keys -
yielded a valuable tradeoff of efficiency for learnability.
The key boxes displayed on the screen highlighted the SignWriter DOS
natural categories of the SignWriting symbols in a
manner that allowed the typing mechanism to serve as an
By the late 1980s the IBM PC had replaced the Apple II
implicit learning tool: a crucial property given the symbol
as the personal computer of choice. SignWriter was
set complexity and the application's intended audience.
ported to the IBM PC with programming assistance from
See Figures 1, 2 and 3 from the SignWriter-At-A-Glance
Barry Demchak. We chose the CGA display mode
Instruction Manual.
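The sketch below illustrates the two-level, context-sensitive key redefinition described above. The symbol groups, key labels and symbol names are placeholders invented for exposition; SignWriter's actual layout is the one shown in Figures 1-3.

    # Two-level typing: a first keystroke selects a symbol group (a "key
    # box" on screen), after which the same keys are redefined to select
    # individual symbols within that group. All names are placeholders.
    KEYBOARD_LEVELS = {
        "groups": {"a": "hand-shapes", "s": "movement", "d": "faces"},
        "hand-shapes": {"a": "flat-hand", "s": "fist", "d": "index"},
        "movement":    {"a": "up-arrow", "s": "down-arrow", "d": "circle"},
        "faces":       {"a": "brows-up", "s": "brows-down", "d": "mouth-open"},
    }

    def type_symbol(group_key, symbol_key):
        group = KEYBOARD_LEVELS["groups"][group_key]
        return KEYBOARD_LEVELS[group][symbol_key]

    print(type_symbol("a", "s"))  # two keystrokes select "fist"

Because the on-screen key boxes mirror the natural symbol categories, every act of typing doubles as a lesson in how the symbol set is organized.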
The SignWriting symbol images were created by Valerie Sutton using the SignWriter symbol editor program. In addition she defined the mapping of SignWriting symbols to the keyboard keys. As with the key boxes, this mapping emphasized learnability by grouping symbols according to their natural categories. Conversely, the mapping of the key box keys and symbol attribute keys (Arrow, Cursor, Mirror, Size, and Rotate) was determined strictly by typing efficiency.

SignWriter's Find and Replace commands were implemented (at significant expense in memory) both to establish SignWriter as a complete word processor and again to demonstrate SignWriting's status as a true writing system. Unfortunately the search algorithm did not take into account the relative positioning of symbols within a sign, thus making the search feature itself more of a demonstration than a useful tool.

Because SignWriter was developed as a stand-alone application, it was free to possess an application-specific user interface. The interface design was influenced by Tufte's principle of graphical minimalism [Tufte83]: namely, every pixel that was not part of a SignWriting symbol existed onscreen only because it was functionally necessary. While this design approach may seem austere given today's large color displays, it made for a simple and easy-to-use interface on the Apple II, which had a screen resolution of only 560 by 192 pixels.

The major drawbacks to SignWriter's interface design were the inefficient cursor movement commands and the need for a keyboard card showing the assignment of SignWriting symbols and commands to the keys.

The Apple II version of SignWriter supported the full SignWriting system as it was defined at the time (palm orientation had not yet been introduced). The software was quite usable, but was never widely used because experienced SignWriting users had to type in each occurrence of each sign, while for new users typing symbols was relatively inefficient and – in the absence of a system for teaching typing – posed a significant learning curve.

SignWriter DOS

By the late 1980s the IBM PC had replaced the Apple II as the personal computer of choice. SignWriter was ported to the IBM PC with programming assistance from Barry Demchak. We chose the CGA display mode because at the time it was the graphics display mode supported by the most PC models, and because its screen resolution of 640 by 200 pixels was close enough to the Apple to simplify porting the existing symbol graphics to the PC (which is why the SignWriter symbols are so jagged).

The extra memory available on the IBM PC allowed SignWriter to be expanded with additional symbols, a sign dictionary, and support for multiple countries and languages. These features (along with software distribution on the Internet) had a significant impact on SignWriter use, as researchers began using SignWriter to create and publish dictionaries for various signed languages. This is the version of SignWriter that is in common use today.

Effects on SignWriting

The purpose of SignWriter was simply to provide a word processor for the SignWriting system. However, its development and subsequent use had several beneficial effects on SignWriting:

• SignWriter offered a concrete proof of SignWriting's status as a systematic notation rather than an ad hoc form of illustration. This notion influenced the subsequent design of the software.
• The typing mechanism served as an implicit interactive system for learning the SignWriter symbols (an important achievement given the complexity of the symbol set).
• The SignWriter symbol editor was withheld from distribution to ensure the controlled development of the SignWriting system as it evolved to support more and more signed languages.
• The constraints of computer implementation exerted a positive influence on the subsequent evolution of the SignWriting system.
• The SignWriter software itself served as an efficient means of distributing the SignWriting system, and established a de facto standard for data exchange (an effect greatly amplified by the introduction of the Internet).

Conclusion

Beyond its immediate value as a tool for practical sign processing, SignWriter offers a number of lessons for current and future developers of sign processing software. The most important is the need to standardize a user interface mechanism for symbol input; just as the symbol set is being standardized across all sign processing programs that use SignWriting, so must symbol entry. Such a standard should be centered on typing, with mouse input as an alternative rather than a replacement. Compelling pedagogical and linguistic reasons exist for providing efficient input mechanisms at the level of symbols rather than signs; while such mechanisms need not supplant text entry at the sign level, the reverse equally holds true.

The diagrams in this paper illustrate SignWriter's typing-based symbol input system as an example of how future typing-centered systems could be designed.

With regards to efficiency, Valerie Sutton has learned to type SignWriting almost as efficiently as English. This suggests that with the proper training (an accepted norm for typing) and appropriate hardware (e.g., a notebook computer with an integrated touchpad for cursor control and fine symbol positioning), typing-centered symbol input may well prove superior to any mouse-based systems.

Finally, SignWriter demonstrated that with the appropriate software architecture a true word processor could be implemented for SignWriting given limited resources for memory, processing power, and display resolution. This in turn suggests opportunities for developing useful sign processing software on the emerging handheld computing platforms such as PDAs and cell phones.

References

[Bowles78] Bowles, Kenneth L., "UCSD Pascal", Byte 46 (May).
[Newkirk87] Newkirk, Don, "SignFont Handbook", San Diego: Emerson and Stern Associates (1987).
[Tufte83] Tufte, Edward R., "The Visual Display of Quantitative Information", Graphics Press (1983).
[Sutton93] Sutton, Valerie, "SignWriter-At-A-Glance Instruction Manual", SignWriter Computer Program Notebook, Deaf Action Committee For SignWriting (1993).
[Sutton04] Sutton, Valerie, SignWriting Site, www.signwriting.org.
Figure 1: A page from the SignWriter-At-A-Glance Manual. Symbol groups are under each key.

Figure 2: A page from the SignWriter-At-A-Glance Manual. Symbol categories are placed in rows of keys.

Figure 3: A page from the SignWriter-At-A-Glance Manual. 17 countries with 17 fingerspelling keyboards.
Open-ended Resources in Greek Sign Language:
Development of an e-Learning Platform
Galini Sapountzaki¹, Eleni Efthimiou¹, Kostas Karpouzis², Vassilis Kourbetis³
¹ ILSP-Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, GR 151 25, Maroussi, Greece
² Image, Video and Multimedia Systems Laboratory, National Technical University of Athens, 9, Iroon Polytechniou Str., GR 157 73, Zographoy, Greece
³ Pedagogical Institute, Messogion 392, Aghia Paraskevi, GR 153 41, Athens, Greece
[email protected], [email protected], [email protected], [email protected]

Abstract
In this paper we present the creation of dynamic linguistic resources for Greek Sign Language (GSL). The resources will feed the development of an educational multitask platform within the SYNENNOESE project for teaching both GSL itself and other school subjects in GSL. The platform combines avatar and animation technologies for the production of sign sequences/streams, exploiting digital linguistic resources covering both the lexicon and the grammar of GSL. In SYNENNOESE, the input is written Greek text, which is then transformed into GSL and appears animated on screen. A syntactic parser decodes the structural patterns of written Greek and matches them to equivalent patterns in GSL, which are then signed by a virtual human. The notation system adopted for the lexical database is HamNoSys (Hamburg Notation System). For the implementation of the digital signer tool, the signer's synthetic movement follows the MPEG-4 standard and the H-Anim framework, implemented in the VRML language.

1. Introduction

The primary target user group is deaf pupils who need teaching tools and educational material for the GSL grammar class. Until very recently educational material was available to students with hearing impairments only in written Greek form. Formal teaching of GSL as a first language from the very early school years, and the relevant development of educational content, is becoming very urgent since law 2817/2000 was put into action by the Hellenic State. This law defines that «the official school language of deaf and hard hearing students is the Greek Sign Language» and that «knowledge of the Greek Sign Language is a prerequisite for the positioning of tutors and special education staff at the schools that host deaf and hard hearing students». In this context the new education programs of the Pedagogical Institute¹ (in print) require that all educational material produced from now on must be accessible to deaf students through the use of the Greek Sign Language.

In consultation with the Pedagogical Institute, SYNENNOESE helps pupils acquire the proper linguistic background so that they can take full advantage of the new accessible educational material. The platform offers students the possibility of systematic and structured learning of GSL, for either self-tutoring or participation in virtual classroom sessions of asynchronous teaching, and its design is compatible with the principles that generally define systems of open and distance learning. Besides teaching GSL as a first language, in its present form the platform can be used for the learning of written Greek through GSL, and it will also be open to future applications in areas of other subjects in the school curriculum.

2. Greek Sign Language – the background

Greek Sign Language (GSL) is a natural visual language used by the members of the Greek Deaf Community, with several thousands of native or non-native signers. Research on the grammar of GSL per se is limited; some work has been done on individual aspects of its syntax (negation (Antzakas & Woll, 2001), morphology (Lampropoulou, 1992)), as well as on applied and educational linguistics. It is assumed that GSL as we now know it is a combination of the older Greek sign language dialects with French sign language influence (Lampropoulou, 1997). Comparison of core vocabulary lists exhibits many similarities with the sign languages of neighboring countries, while in morphosyntax GSL shares the same cross-linguistic tendencies as many other well analysed sign languages (Bellugi & Fischer, 1972; Liddell, 1980).

GSL has developed in a social and linguistic context similar to most other sign languages (Kyle & Woll, 1985; Brennan, 1987). It is used widely in the Greek deaf community, and the number of GSL users is estimated at about 40,600 (a 1986 survey by Gallaudet University). There is also a large number of hearing non-native signers of GSL, mainly students of GSL and families of deaf people. Although the exact number of hearing students of GSL in Greece is unknown, records of the Greek Federation of the Deaf (GFD) show that in the year 2003 about 300 people were registered for classes of GSL as a second language. The recent increase of mainstreamed deaf students in education, as well as the population of deaf students scattered in other institutions, minor town units for the deaf and private tuition, may well double the total number of secondary and potential sign language users. Official settings where GSL is being used include 11 Deaf clubs in Greek urban centers and a total of 14 Deaf primary, secondary and tertiary educational settings.

¹ Pedagogical Institute (PI) is the official organisation that validates all educational programs of primary and secondary education in Greece.
3. Linguistic research background in the area of sign languages

In Greece there have been some serious attempts at lexicography in the recent past (PROKLESE, a Dictionary of Computing Signs; NOEMA, a Multimedia Dictionary of GSL Basic Vocabulary; and A Children's Dictionary of GSL), mainly for educational purposes (Kourbetis, 1999; Kourbetis & Efthimiou, 2003), but a complete decoding of the language structure is not yet publicly available.

The linguistic part of the project is based on overall assumptions about the adequacy of signed languages, following Stokoe (1960, 1978), Woll & Kyle (1985), Valli & Lucas (1995), Sutton-Spence & Woll (1999), Neidle et al. (2000) and Gee & Goodhart (1985), among many. Greek sign language is analyzed into its linear and non-linear (simultaneous) components (Padden, 1989; Engberg-Pedersen, 1993). The linear part of the language involves any sequences of lexical and functional tokens and their syntactic relations, while non-linear structures in GSL, as in all known sign languages, are present at all levels of the grammar. Each sign in GSL is described as to its handshape, location, movement, orientation, number of hands and use of any obligatory non-manually articulated elements (referred to as nmf, i.e. mouth patterns, head and shoulder movements and other non-manual features), based on the Stokoe model (ibid). A sketch of such a feature-based entry is given below.
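The following sketch shows how one sign might be stored under this description scheme. The field names and example values are hypothetical; they merely illustrate the Stokoe-style decomposition listed above, together with the first-stage restriction to single-handshape signs described in section 4.1.

    # Hypothetical record for one monomorphemic GSL sign, decomposed into
    # the parameters listed in the text. All codes are invented examples.
    sign_entry = {
        "gloss": "RADIO",                  # working label for the sign
        "handshapes": ["B-flat"],          # placeholder handshape code
        "location": "ear",                 # place of articulation
        "movement": "small-circular",      # movement type
        "orientation": "palm-in",          # palm orientation
        "hands": 1,                        # one- or two-handed sign
        "nmf": {"mouth": None, "head": None, "shoulders": None},
    }

    def usable_in_stage_one(entry):
        """First-stage filter: signs that use only one handshape."""
        return len(entry["handshapes"]) == 1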
In the project it was considered essential that the output be as close as possible to native GSL as used in the Greek deaf community. In this respect, forms of 'signed Greek' or other manual codes for the teaching of Greek were excluded, and the two languages (GSL and Greek) were treated as the first and second language respectively for the users of the platform, much as other bilingual platforms may function outside the domain of special education.

4. The project's language resources

Implementation of both the tutoring and the summarization tools of the platform requires the collection of extensive electronic language resources for GSL as regards the lexicon and the structural rules of the language (Efthimiou et al., 2004). The actual data of the study are based on basic research on GSL analysis undertaken since 1999, as well as on experience gained in the projects NOEMA and PROKLISI (Efthimiou & Katsoyannou, 2001; Efthimiou & Katsoyannou, 2002). The data consist of digitized language productions of deaf native GSL signers and of the existing databases of bilingual GSL dictionaries, triangulated with the participation of deaf GSL signers in focus group discussions. The project follows methodological principles on data collection and analysis suitable to the minority status of GSL. Wherever the status of individual GSL signs is under consideration, the Greek Federation of the Deaf is consulted, too.

Many of the grammar rules of GSL are derived from the analysis of a digital corpus that has been created by videotaping native signers in a discussion situation or when performing a narration. This procedure is required because there exists little previous analysis of GSL as a natural language. The basic design of the system, apart from the educational content it currently supports, focuses on the ability to generate sign phrases which respect the GSL grammar rules to a degree of accuracy that allows them to be recognised by native signers as correct utterances of the language.

In this respect SYNENNOESE offers a great challenge for in-depth work in both directions, lexicography and linguistic analysis of GSL; for the first time research will go beyond a mere collection of glosses (Logiadis & Logiadis, 1985) and move further than many previous bilingual dictionaries of sign languages (Brien & Brennan, 1992), into the domain of the productive lexicon (Wilcox et al., 1994), i.e. the possibility of building new GSL glosses following known structural rules, and also the challenge of automatic translation in predictable environments, using an effective module/interface for the matching of structural patterns between the written input and the signed output of the platform. It is a design prerequisite that the system of GSL description should have an open design, so that it may be easily extendible, allowing additions of lemmas and more complicated rules, with the long-term objective of creating an environment for the storage and maintenance of a complete computational grammar of GSL. From a linguistic point of view the resulting database of glosses, rules and tendencies of GSL will be a significant by-product of the project, of great value to future applications.

4.1 Grammar content definition

In the early implementation phase, the subsystem for the teaching of GSL grammar covers a restricted vocabulary and a core grammar capable of analysing a restricted number of main GSL grammatical phenomena, which can be argued to belong to signing universals.

The objective of the 18-month project is to transcribe the digitized avi files with GSL individual signs and store them in a retrievable database. This requires the analysis of the GSL signs into their phonological parts and their semantics. It was agreed that only monomorphemic signs that use only one handshape are analyzed in this early stage, so that feedback from the technical team will determine further steps (Johnston & Schembri, 1999). Non-manual grammatical features (Boyes Braem & Sutton-Spence, 2001) and polymorphemic signs are acknowledged but not included in this stage. In the second stage longer sequential structures of signs will be considered (e.g. compound word-signs), and once individual signs are transcribed and stored in a database, additional tiers such as non-manual features can be added without technical difficulties.

At the stage of grammatical analysis, international findings on sign language grammars, as well as the views of our deaf native user consultants, are taken into account in order to verify findings. It is admitted that there is even more work to be done on the pragmatics of GSL and its relation with real-world situations (e.g. for the use of indexes or classifiers), and these are noted as future aims of the platform.

An interesting parameter of a virtual signer is the ability to sign letters of the written alphabet (fingerspelling). This technique is useful in cases of proper nouns, acronyms, terminology or general terms for which no specific sign exists. Fingerspelling is used extensively in some other sign languages such as ASL or BSL (Sutton-Spence, 1994), while our evidence in GSL suggests that it is only used occasionally, rarely incorporating fingerspelled loans into the core of the language.
From a technical point of view, however, it is generally quite simple for an avatar to fingerspell, as fingerspelling includes no syntax, movement in signing space or non-manual grammatical elements. Many previous attempts at sign animation would go only up to the level of fingerspelling, or of signing only sequential structures of a representation of the written or spoken language. Since then technology has developed, and so has the linguistic description of sign language structures. On the other hand, few deaf people in Greece use fingerspelling or a code such as 'Signed Exact Greek' extensively. For these reasons the present project aims to represent a form of GSL as close to natural fluent signing as possible, and only uses fingerspelling occasionally, for example in language games, where the teaching of written Greek is the focus.

4.2 Notation and glossing

In order to decide on the notation to be followed for sign recording in the lexical resources DB, the existing international systems of sign language recording were evaluated with respect to their effectiveness for determining the intermediate language of the system (see also Pizzuto & Pietrandrea (2000) for a more theoretical discussion). The latter constitutes an important part of the whole engine, as it serves for the communication between the linguistic subsystem, which determines the meaningful movements in the context of GSL, and the technological subsystem, which performs these movements with a synthetic 3D model signer.

Tools for the transcription and notation of GSL include HamNoSys, a pictographic notation system developed by the University of Hamburg for the description of the phonology of signs (Prillwitz et al., 1989). HML files in HamNoSys will form the corpus of GSL lemmas, while for the representation of sequential structures (i.e. at the phrase level) the ELAN language annotator, developed by the Max Planck Institute for Psycholinguistics in Nijmegen, the Netherlands, will be used. We considered these two systems the most suitable for text-to-sign animation according to reviews of recent relevant projects. The classic Stokoe model is used for the morpho-phonological description, with one additional tier with written Greek words as rough semantic equivalents of utterances. It is an aim of the project to add more tiers as the project continues, such as those mentioned above on the use of non-manual features and on pragmatics, using the existing symbols in HamNoSys and ELAN. SignWriting was another transcription tool under consideration, but was not chosen, given the expected compatibility of HamNoSys within the ELAN tiers in the near future.

5. Tutoring system description - corpus of educational material

The user interface under development is based on technologies (drawing on experience gained in the previous SPERO and Faethon projects) which enable tracing the personal characteristics of specific users, on the basis of a combination of personal data and the user's responses, previously acquired knowledge and user classification, so that the teaching process may be best customised. The test bed learning procedure concerns the teaching of GSL grammar to early primary school pupils, whereas the platform also incorporates a subsystem that allows access by the deaf learner to material available only in written Greek form by means of a signed summary. The learning process in practice will involve an initiator of the session, the student(s) in groups or alone, and a teacher-facilitator of the process, physically present with the students. The process can take place in real time or can be relayed. There is provision of a whiteboard, icon banks and a chat board visible on the screen along with the virtual signer, for common use in the classroom. The participants will also be able to see each other in real time through a web camera, in order to verify the results of GSL learning.

Specifications for the formation of the GSL resources of the application are crucially based on exhaustive research in the official, recently reformed guidelines for the teaching of the Greek language and of GSL in primary schools for the deaf (Kourbetis & Efthimiou, 2003). The educational content of the platform follows the same guidelines as the hearing children's curriculum, so that the same grammatical and semantic units can be taught in the two languages, GSL and spoken/written Greek. Concepts such as subject-object relations, types of verbs, and discourse functions of the language form the units of the curriculum in SYNENNOESE, so that the same principles are taught under the same umbrella, but without projecting onto GSL a mirror image of the Greek grammar. For the selection and arrangement of the educational material the project is in close cooperation with the Pedagogical Institute in Athens, which is the main official agency in charge of the development of educational material.

According to EU principles for accessibility to information in special education (see also WP COM (2000) 284 final), all Greek schools have been provided with suitable equipment for unrestricted Internet access, so the deliverables of the project can be readily applicable to real-life school routine. Unfortunately, though, there have been no official educational resources for the primary education of the deaf in the area of languages until the time of writing of the current work. SYNENNOESE is the first applicable project for open and distance learning for the deaf, either individually or in group sessions. After month 12 of the project there will be a trial period in sample student and tutor groups, with the aid of the Pedagogical Institute, for feedback and corrections.

6. Technical considerations

The implementation team has reviewed currently available avatar and animation technologies for the representation of sign language in order to adopt one of the most prominent technological solutions. The movements of a synthetic 3D signing model have to be recorded at a higher, user-friendly level of description before they are transformed into parameters of body movement (Body Animation Parameters – BAPs) according to the MPEG-4 model. In the area of text-to-sign animation there have been some similar projects (VISICAST, Thetos, SignSynth and eSIGN among them) that SYNENNOESE uses as background.

Technologies considered for the viewing and interaction of 3D models were VRML (Virtual Reality Modeling Language), X3D (eXtensible 3D) and H-ANIM. VRML is a high-level formal language with the ability to describe 3D interactive objects and worlds.
It is a hierarchical scene description language that defines the geometry and behaviour of a 3D scene or "world" and the way in which this is navigated by the user. VRML is the only standardised (ISO/IEC 14772) 3D format suitable for Web delivery.

X3D is the next-generation open standard for 3D on the web. It is an extensible standard that can easily be supported by content creation tools, proprietary browsers, and other 3D applications, both for importing and exporting. It replaces VRML, but also provides compatibility with existing VRML content and browsers.

H-ANIM is a set of specifications for the description of human animation, based on body segments and connections. According to the H-ANIM standard, the human body consists of a number of segments (such as the forearm, hand and foot), which are connected to each other by joints (such as the elbow, wrist and ankle). H-ANIM can be used to describe the gestures. Motion tracking and haptic devices (such as CyberGrasp or the Acceleration Sensing Glove with a virtual keyboard) were initially considered, but it was agreed that, if the quality of the results of the first signs transcribed with HamNoSys notation commands is acceptable, motion capture sequences will not need to be applied. In either case, both are much more flexible solutions than using 'frozen' mpeg or avi video files. Avatars are much more accessible to flexible information exchange and take advantage of the dynamic nature of phonological and syntactic rules.
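To make the joint/segment idea concrete, here is a minimal sketch of an articulated skeleton fragment in the spirit of H-ANIM. The joint names (l_shoulder, l_elbow, l_wrist) follow H-ANIM naming conventions, but the data structure itself is invented for illustration and is not the H-ANIM node set.

    # Joints connect segments; rotating a joint moves every segment below it.
    from dataclasses import dataclass, field

    @dataclass
    class Joint:
        name: str                          # e.g. "l_elbow"
        segment: str                       # segment below this joint
        rotation: tuple = (0.0, 0.0, 0.0)  # Euler angles in radians
        children: list = field(default_factory=list)

    l_wrist = Joint("l_wrist", "l_hand")
    l_elbow = Joint("l_elbow", "l_forearm", children=[l_wrist])
    l_shoulder = Joint("l_shoulder", "l_upperarm", children=[l_elbow])

    def apply_rotation(joint, name, angles):
        """Set one joint's rotation; a renderer propagates it to the
        child segments, which is what makes the hierarchy useful."""
        if joint.name == name:
            joint.rotation = angles
            return True
        return any(apply_rotation(c, name, angles) for c in joint.children)

    apply_rotation(l_shoulder, "l_elbow", (1.2, 0.0, 0.0))  # bend the elbow

Separating the skeleton from the animation commands is exactly what allows gesture definitions to be reused across avatars, as noted for STEP in the next section.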
7. Adopted 3D technologies

For the content designer to interact with an avatar, a scripting language is required. In our implementation we chose the STEP language (Scripting Technology for Embodied Persona) (Huang, Eliens & Visser, 2002) as the intermediate level between the end user and the virtual actor. A major advantage of languages such as STEP is that one can separate the description of the individual gestures and signs from the definition of the geometry and hierarchy of the avatar; as a result, one may alter the definition of any action without the need to re-model the virtual actor. The avatars that are utilized here are compliant with the H-ANIM standard, so one can use any of the readily available ones or model a new one.

Figure 1: The virtual signer signing "radio" in GSL

An integrated system based on STEP is usually deployed in a standard HTML page, in order to maximize interoperability and be accessible to as many users as possible. This page includes an embedded VRML object, which represents the avatar and includes references to the STEP engine and the related JavaScript interface. From this setup, one may choose to create one's own scripts for sign representation and execute them independently, or embed them as JavaScript code, for maximized extensibility. The common VRML viewing plug-ins offer the possibility to select the required viewpoint at run-time, so it is possible for the user to experience the signing from any desired point of view (Kennaway, 2001; Kennaway, 2003; Huang, Eliens & Visser, 2002). As an example, a frame of the signing sequence for "radio" is presented in Figure 1.

In SYNENNOESE, a syntactic parser decodes the structural patterns of written Greek and matches them to their equivalents in GSL (Boutsis et al., 2000), and these resulting patterns are signed by a virtual human (avatar). Using the technologies above, an internet platform will make access easy and fast, while the use of animated models instead of video files saves valuable storage space and bandwidth. Other advantages are the possibility of previewing predefined movements of the humanoid and the possibility of adding new movements and handshapes to the system at any moment (script authoring). The advantages of an H-ANIM model (the version used is 1.1) are its compatibility with VRML 97, flexibility on all segments and a more straightforward use.
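The sketch below illustrates the kind of rule-based structural transfer described in the previous paragraph. The rule, the constituent labels and the example GSL ordering are purely illustrative assumptions; the actual parser and transfer rules of SYNENNOESE (Boutsis et al., 2000) are not reproduced here.

    # Toy structural-transfer step: a written-Greek clause, already reduced
    # to labelled constituents, is reordered by a pattern rule and its
    # lemmas are looked up in the sign lexicon. Everything here is invented
    # for illustration, including the verb-final target order.
    TRANSFER_RULES = {
        ("SUBJ", "VERB", "OBJ"): ("SUBJ", "OBJ", "VERB"),
    }

    SIGN_LEXICON = {"mama": "MOTHER", "akouei": "LISTEN", "radiofono": "RADIO"}

    def transfer(constituents):
        """Map (role, lemma) pairs to an ordered list of sign glosses."""
        pattern = tuple(role for role, _ in constituents)
        order = TRANSFER_RULES.get(pattern, pattern)  # fall back to source order
        by_role = dict(constituents)
        return [SIGN_LEXICON[by_role[role]] for role in order]

    print(transfer([("SUBJ", "mama"), ("VERB", "akouei"), ("OBJ", "radiofono")]))
    # ['MOTHER', 'RADIO', 'LISTEN']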
The chart in Figure 2 shows how the system functions and how data is transferred between machine and users. The testbed includes a page with an embedded VRML97 object, a JavaScript form for communication with the user and a Java Applet for communication with the back-end system. As can be seen in the chart, the system does not involve recognition of speech or signs. Machine translation mechanisms are in the background, while at present the output is a medium for human to non-human communication, rather than a machine for automatic translation.

Figure 2: Data flow chart. The front-end (an HTML page with JavaScript, an embedded VRML object and a Java Applet) communicates with the back-end (DLP and STEP).

8. Implications and extensibility of the educational platform

• As an educational tool above all, SYNENNOESE offers a user-friendly environment for young deaf pupils aged 6 to 9, so that they can have visual translation of words and phrases. The signed feedback acts as a motivating tool for spelling Greek words and structuring sentences correctly, as well as for evaluating one's performance. For deaf young students, as a group with special needs, the platform removes some of the accessibility barriers, and the possibility of home use even makes it accessible to family, thus encouraging communication in GSL, but also access to the majority (Greek) language.
• New written texts can be launched, so SYNENNOESE may receive unlimited educational content besides primary school grammar units. On the other hand, unlimited school units, such as the increasing number of special units with individual deaf students in rural areas and islands, can link with one another via SYNENNOESE.

• Text-to-sign translation can be extended and applied to different environments, such as Greek language teaching to deaf students of higher grades, GSL teaching for hearing students, or Greek for specific purposes such as adult literacy classes for the Deaf.

• More domains of GSL grammar can be described and decoded, making the output closer to natural signed utterances as our analysis proceeds. This is a challenge not only for theoretical research, but also for computer science and applied linguistic research.

• Furthermore, a database with the bulk of GSL utterances, described as to their features from the phonological up to the pragmatic level, will be the major outcome of the whole project. In this way the representation of GSL structures can be matched to equivalent ones of written Greek, and it will be a challenge to be able to compare directly the grammars of the two languages. In much the same way, structures of GSL can easily be compared with counterparts from ASL or BSL for research across signed languages.

• From a socio-economic point of view, creating this platform will greatly contribute towards the inclusion of deaf people in Greek society in an environment of equal opportunities.

9. Problems and limitations

The main limitations of the study are described below. These are divided into linguistic, educational and technical ones. Most of the limitations are typical of sign animation projects, and they were expected before the beginning of the project.

From a linguistic and educational point of view, the major issues that need to be addressed are the following:

• In some areas of the language there are no standardized signs, so there may be some theoretical objections as to the use of particular entries. However, a platform such as the one described allows for multiple translations and does not have any limitations as to the size of files, which was the case, for example, in previous GSL dictionaries in DVD form with avi video entries. Moreover, the platform will be open to updates through the script authoring process.

• A second problem is the choice of entries to be included in each stage of the platform development, depending on the complexity of their phonological characteristics. As mentioned already in the section above on grammar content definition, monomorphemic entries were agreed to be included in the first stage. In the next stages there will be gradual provision for polymorphemic signs, compound signs, functional morphemes, syntactic use of non-manual elements, and sequential and lastly simultaneous constructions of separate lexical signs, each stage corresponding to the level of linguistic research in GSL.

• The data available in GSL, when compared with data from Greek, for example, are dauntingly scarce. Error correction mechanisms were sought in order to assure the reliability of results. Such back-up mechanisms are the use of approved dictionaries, the consultancy of the Pedagogical Institute and the feedback from the Deaf Community, along with the continuing data from GSL linguistic research.

• Lastly, all schools in Greece have recently become accessible to the Internet, Deaf settings included. In practice, however, there are many more accessibility barriers for a considerable number of deaf students who have additional special needs. Relevant provisions have been made according to general accessibility principles for these students (as to text size, keyboard settings etc.), but the pilot application of the platform in December 2004, after 12 months of the project, will certainly indicate more points for development.

Technical problems include:

• A solution for smooth transition between signs and fusion between handshapes, so that neighboring signs in a sentence appear as naturally articulated as possible.

• Automated commands for the grammatical use of eye gaze, particularly when eye gaze has to follow the track of hand movements. Similar problems are anticipated for mouth movements marking prosodic features of sign phonology. Mouthing the visible part of spoken Greek words will not be an issue for the project yet, but this too is anticipated as a problem to deal with in the future, as all of the above non-manually signed features are considered internalized parts of GSL grammar.

• It would be ideal to have a readily available system for retrieving and automatically extending phonological rules via HamNoSys notation. To the best of our knowledge such provisions are being made and the problem will meet a solution soon.

• The ultimate challenge, as in all similar projects, remains the automatic translation of the language. It is still too difficult to produce acceptable sentences in the automatic translation of any language at the moment, even more so for a minor, less researched language with no written tradition such as GSL. Realistically, the teams involved in the SYNENNOESE project can expect as an optimum result the successful use of automatic translation mechanisms in GSL only in a restricted, sub-language oriented environment with predetermined semantic and syntactic characteristics.
10. Conclusion

Given that the platform under discussion constitutes an original research object, successful completion of its development will open the way to a complete support system for the education of the Deaf Community members in Greece.

Acknowledgments

The authors wish to acknowledge the assistance of all groups involved in the implementation of the discussed platform: computational linguists, computer scientists, GSL specialists, the panel of GSL consultants and the representatives of the Greek Deaf Community, acting as our informants. This work was partially supported by the national project grant SYNENNOESE (GSRT: e-Learning 44).

References

Antzakas, K. & Woll, B. (2001). Head Movements and Negation in Greek Sign Language. Gesture Workshop, City University of London 2001, 193--196.
Bellugi, U. & Fischer, S. (1972). A comparison of sign language and spoken language. Cognition, 1, 173--200.
Boutsis, S., Prokopidis, P., Giouli, V. & Piperidis, S. (2000). A Robust Parser for Unrestricted Greek Text. In Proceedings of the 2nd Language Resources and Evaluation Conference, 467--474.
Boyes Braem, P. & Sutton-Spence, R. (eds.) (2001). The Hands are the Head of the Mouth: the Mouth as Articulator in Sign Languages. Hamburg: Signum. International Studies on Sign Language and Communication of the Deaf, v. 39.
Brennan, M. (1987). British Sign Language, the Language of the Deaf Community. In T. Booth & W. Swann (eds.), Including People with Disabilities: Curricula for All. Milton Keynes: Open University Press.
Brien, D. & Brennan, M. (1992). Dictionary of British Sign Language / English. London, Boston: Faber and Faber.
Bybee, J. (1985). Morphology: a Study of the Relation between Meaning and Form. Amsterdam / Philadelphia: John Benjamins.
Comrie, B. (1981). Language Universals and Linguistic Typology: Syntax and Morphology. Oxford: Blackwell.
Efthimiou, E. & Katsoyannou, M. (2001). Research issues on GSL: a study of vocabulary and lexicon creation. Studies in Greek Linguistics, Vol. 2: Computational Linguistics, 42--50 (in Greek).
Efthimiou, E. & Katsoyannou, M. (2002). NOEMA: a Greek Sign Language – Modern Greek bidirectional dictionary. Modern Education, Vol. 126/127, 115--118 (in Greek).
Efthimiou, E., Sapountzaki, G., Carpouzis, C. & Fotinea, S-E. (2004). Developing an e-Learning platform for the Greek Sign Language. Lecture Notes in Computer Science (LNCS), Springer-Verlag Berlin Heidelberg (in print).
Engberg-Pedersen, E. (1993). Space in Danish Sign Language. Hamburg: Signum Press.
Gee, J. & Goodhart, W. (1985). Nativization, Linguistic Theory, and Deaf Language Acquisition. Sign Language Studies, 49, 291--342.
Greenberg, J. H. (ed.) (1968). Universals of Human Language. MIT Press.
Huang, Z., Eliens, A. & Visser, C. (2002). STEP: A Scripting Language for Embodied Agents. Proceedings of the Workshop on Lifelike Animated Agents.
Johnston, T. & Schembri, A. (1999). On defining Lexeme in a Signed Language. Sign Language and Linguistics, 2:2, 115--185.
Kennaway, R. (2001). Synthetic Animation of Deaf Signing Gestures. International Gesture Workshop, City University, London.
Kennaway, R. (2003). Experience with, and Requirements for, a Gesture Description Language for Synthetic Animation. 5th International Workshop on Gesture and Sign Language based Human-Computer Interaction, Genova.
Kourbetis, V. (1999). Noima stin Ekpaideusi. Athens, Greece: Hellenic Pedagogical Institute.
Kourbetis, V. & Efthimiou, E. (2003). Multimedia Dictionaries of GSL as language and educational tools. Second Hellenic Conference on Education, Syros (in Greek) (in print).
Kyle, J. G. & Woll, B. (1985). Sign Language: the study of deaf people and their language. Cambridge University Press.
Lampropoulou, V. (1992). Meeting the needs of deaf children in Greece: a systematic approach. Journal of the British Association of Teachers of the Deaf, 16, 33--34.
Lampropoulou, V. (1997). I Ereuna tis Ellinikis Noimatikis Glossas: Paratiriseis Phonologikis Analisis. Athens, Greece: Glossa.
Liddell, S. (1980). American Sign Language Syntax. The Hague: Mouton.
Logiadis, N. & Logiadis, M. (1985). Dictionary of Sign Language. Potamitis Press (in Greek).
Neidle, C., Kegl, J. et al. (2000). The Syntax of ASL: Functional Categories and Hierarchical Structure. Cambridge, Massachusetts / London: MIT Press.
Padden, C. (1989). The Relationship between Space and Grammar in American Sign Language Verb Morphology. In Theoretical Issues in Sign Language Research. Gallaudet University Press.
Pizzuto, E. & Pietrandrea, P. (2000). Notating signed texts: open questions and indications for further research (unpublished manuscript).
Prillwitz, S. et al. (1989). HamNoSys. Version 2.0. Hamburg Notation System for Sign Language: An Introductory Guide (ISBN 3-927731-01-3).
PROKLISI Project, WP9: Development of a GSL based educational tool for the education of people with hearing impairments. Deliverable II: GSL linguistic material: data collection methodology. ILSP, May 2003.
Stokoe, W. & Kuschel, R. (1978). For Sign Language Research. Linstock Press.
Sutton-Spence, R. (1994). The Role of the Manual Alphabet and Fingerspelling in British Sign Language. PhD Dissertation, University of Bristol.
Sutton-Spence, R. & Woll, B. (1999). The Linguistics of British Sign Language: an Introduction. Cambridge University Press.
Valli, C. & Lucas, C. (1995). Linguistics of American Sign Language, 2nd ed. Washington D.C.: Gallaudet University Press.
Wilcox, S., Scheibmann, J., Wood, D., Cokely, D. & Stokoe, W. (1994). Multimedia dictionary of American Sign Language. In Proceedings of ASSETS Conference, Association for Computing Machinery, 9--16.

http://www.leidenuniv.nl/hil/sign-lang/slsites.html#technical
http://www.sign-lang.uni-hamburg.de/Quellen/default.html
http://www.fhs-hagenberg.ac.at/mtd/projekte/FFF/3dSign/bookmarks.html
Sharing sign language corpora online: proposals for transcription and metadata
categories
Onno Crasborn*, Els van der Kooij*, Daan Broeder†, Hennie Brugman†
* Department of Linguistics, University of Nijmegen, PO Box 9103, NL-6500 HD Nijmegen, The Netherlands, {o.crasborn, e.van.der.kooij}@let.kun.nl
† Technical group, Max Planck Institute for Psycholinguistics, PO Box 310, 6500 AH Nijmegen, The Netherlands, {daan.broeder, hennie.brugman}@mpi.nl

Abstract
This paper presents the results of a European project called ECHO, which included an effort to publish sign language corpora online. The aim of the ECHO project was to explore the intricacies of sharing data using the internet in all areas of the humanities. For sign language, this involved adding a specific profile to the IMDI metadata set for characterizing spoken language corpora, and developing a set of transcription conventions that are useful for a broad audience of linguists. In addition to presenting these results, we outline some options for future technological developments, and bring forward some ethical problems relating to publishing video data on the internet.

1. The ECHO project

Within the EU project 'European Cultural Heritage Online' (ECHO)¹, one of the five case studies is devoted to the field of language studies. The case study is titled 'Language as cultural heritage: a pilot project with sign languages'². New data have been recorded from three sign languages of different Deaf communities in Europe: Sign Language of the Netherlands (abbreviated SLN), British Sign Language (BSL) and Swedish Sign Language (SSL). By having people retell written fable stories, comparable data resulted that can be used for cross-linguistic research. In addition to these semi-spontaneous data, we have elicited basic word lists and included some sign language poetry (some newly recorded, some already published).

The first aim of this paper is to characterize the conventions that were used and to explain why these can be considered as useful to a larger audience of linguists. The ELAN and IMDI software tools that were used to enter the transcriptions and metadata store their data in XML files whose format is described by open schemata and which can be accessed by other software tools as well. Using these open-standard tools, we developed a set of transcription conventions that are likely to be useable by a large group of researchers with diverse interests.

The second aim of this paper is to outline some desired functionalities of these tools that will make it more attractive to actually use existing corpora. Finally, we will outline some ethical challenges that have not yet received much discussion in the sign language field.

2. The need for standardization

For actual cross-linguistic studies to take place, it is necessary not only that the same stimulus material, or otherwise comparable data, is used, but also that the same conventions for annotating these data are used, both in terms of linguistic transcription and in terms of metadata description. The availability of a small corpus of video recordings from different languages, as published for the ECHO project, hopefully promotes standardization.

2.1 Metadata standards

In our case, metadata descriptions of language corpora characterize the documents and data files that make up the corpus in terms of descriptors that pertain to the whole unit of media and transcription files, rather than to individual sections within the files. For example, information about the subjects, the identity of the researchers involved in the collection and the register used by the speakers or signers typically belongs to the metadata domain. Users can then search within and across large corpora for all transcribed video material with male signers older than 45 years, for example. However, for such searches to be possible, it is essential that users obey the same conventions for labeling corpora. A proposal for such a standard is presented in section 3³. This is a specialization of the IMDI set of metadata descriptors for language resources⁴.

2.2 Transcription standards

Several tools are currently available for annotating video data. Both SyncWriter (Hanke & Prillwitz 1995) and SignStream⁵ were developed especially for sign language data, whereas ELAN started its life in the domain of gesture research (former versions were called MediaTagger)⁶.

These new technologies for presenting sign language data and transcriptions pose the following question: to what extent should we use standard transcription conventions?

¹ http://echo.mpiwg-berlin.mpg.de/
² http://www.let.kun.nl/sign-lang/echo/; project partners were the University of Nijmegen, City University London, and Stockholm University.
³ Further information on the proposed standard can be found at http://www.let.kun.nl/sign-lang/IMDI/.
⁴ http://www.mpi.nl/IMDI
⁵ http://www.bu.edu/asllrp/SignStream/
⁶ http://www.mpi.nl/tools/elan.html
If all the raw material (the video sources) is available, do we need full transcriptions? In principle, one can look at the video source for all kinds of information that are traditionally included in various transcription systems, such as eye gaze, head nods, etc. On the other hand, the great strength of computer tools such as ELAN is that they allow for complex searches in large data domains and for the immediate inspection of the video fragments relating to the search results; this is typically very time-consuming when using paper transcription forms, or even digitized transcription forms that are not directly linked to the original data.

Within the ECHO project, we therefore wanted to establish an annotation system that could be useful for any researcher, with a focus on the syntactic and discourse domains. We tried to be careful not to impose too much analysis on any tier by saying, for example, that a specific phonetic form is an instance of 'person agreement'. On the other hand, analytical decisions are constantly being made in any transcription process. For example, even adding multiple tiers with translations in various written languages (in the case of the ECHO project: Dutch, English and Swedish) implies taking (implicit or explicit) decisions about where sentence boundaries are located.

While every research project will have its own research questions and require special transcription categories, it should be possible to define a standard set of transcription tiers and values that are useful to large groups of researchers, regardless of their specific interests. For example, a translation at sentence level into a written language is always useful, if only for exploring a video recording. Working with three teams of linguists from different countries, each with their own research interests, the ECHO project formed a good start for developing such a standard set of transcription conventions. This ECHO set is described in section 4. The relatively small set of transcription tiers allows for the coding of a relatively large data set, which can be further expanded by researchers according to their specific needs. ELAN will see several updates in the near future; one of the future functions will be the possibility to expand a publicly available transcription file with one's own additions, including extra tiers, storing these additions in a local file while maintaining the link to the original transcription stored on a remote server.

3. Metadata description of sign language corpora: expanding the IMDI standard

3.1 The IMDI standard and profiles

The set of IMDI metadata descriptors that was developed for spoken language corpora distinguishes seven categories for each session unit:

1. Session. The session concept bundles all information about the circumstances and conditions of the linguistic event, groups the resources (for example, video files and annotation files) belonging to this event, and records any administrative information for the event.
2. Project. Information about the project for which the sessions were originally created.
3. Collector. Name and contact information for the person who collected the session.
4. Content. A set of categories describing the intellectual content of the session.
5. Actors. Names, roles and further information about the people involved in the session, including the signers and addressees, but also, for example, the researchers who collected and transcribed the material.
6. Resources. Information about the media files, such as URL, size, etc.
7. References. Citations and URLs to relevant publications and other archive resources.

Each of these seven categories allows for extension by users, in the form of 'key–value pairs'. A key specifies an extra category, an extra field, for which a value can be specified. For example, one might specify a key called Backup Copy to quickly specify whether a back-up copy of the original tape has already been made (yes vs. no). A sketch of such a session description is given below.
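The sketch models an IMDI-style session description with one user-defined key–value extension. The seven category names and the Backup Copy example come from the description above; the fields inside each category and all values are invented for illustration, and the plain-mapping representation is ours rather than the IMDI XML format.

    # Hypothetical session record following the seven IMDI categories.
    session = {
        "Session": {"name": "fable-retelling-01", "date": "2003-05-12"},
        "Project": {"title": "ECHO sign language case study"},
        "Collector": {"name": "N.N.", "contact": "[email protected]"},  # placeholders
        "Content": {"genre": "narrative", "task": "fable retelling"},
        "Actors": [{"role": "signer", "age": 47, "sex": "male"}],
        "Resources": [{"url": "http://example.org/session01.mpg", "size_mb": 120}],
        "References": [],
        "Keys": {"Backup Copy": "yes"},   # user-defined key-value extension
    }

    def matches(session, min_age=45, sex="male"):
        """Metadata query of the kind described above: male signers over 45."""
        return any(a["age"] > min_age and a["sex"] == sex
                   for a in session["Actors"] if a["role"] == "signer")

    print(matches(session))  # True

Because every session is labelled with the same descriptors, a query like this one can run unchanged across corpora from different institutions.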
In a workshop for the ECHO project, held at the University of Nijmegen in May 2003, a group of sign linguists from various countries and with varying research interests sat together to see how these categories could be applied to sign language data. The outcome of that workshop was a set of key fields to describe sign language corpora. These extra categories have now been bundled in an extension to the standard IMDI metadata specification, called the 'sign language profile'. Profiles in the IMDI Editor tool offer sets of extra fields that apply to specific types of data, in this case communication in a specific modality.

3.2 The sign language profile

The sign language profile adds key fields in two areas of the IMDI set: content and actors. All of the fields can be specified or left empty.

In content, Language Variety describes the specific form of communication used in the session, and Elicitation Method specifies the specific prompt used to elicit the data at hand. A set of four keys describes the communication situation with respect to interpreting: who was the interpreter (if any) interpreting for (Interpreting.Audience), what were the source and target modalities (Interpreting.Source and Interpreting.Target), and is the interpreter visible in the video recording (Interpreting.Visibility)?

Secondly, four sets of keys are defined that can be used to describe various properties of each actor who is related to the session: properties pertaining to deafness, the amount of sign language experience, the family members, and the (deaf) education of the actor. Deafness.Status describes the hearing status of the actor (deaf, hard-of-hearing, hearing), and Deafness.AidType describes the kind of hearing aid the actor is using (if any).

The amount of Sign Language Experience is expressed by specifying the Exposure Age, Acquisition Location and experience with Sign Teaching.

The family of the actor can be described by specifying Deafness and Primary Communication Form for Mother, Father and Partner.
Finally, the Education history of the actor can be specified in a series of keys: Age (the start and end age of the actor during his education), the School Type (primary school, university, etc.), the Class Kind (deaf, hearing, etc.), the Education Model, the Location, and whether the school was a Boarding School or not.

A more complete definition of the whole sign language profile is given in Crasborn & Hanke (2003). The sketch below gathers the keys named in this section into one example record.
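Key names below follow the text; the values are invented examples, and the flat dictionary layout is an illustrative simplification of the actual IMDI profile.

    # Actor-level keys of the sign language profile, as enumerated above.
    actor_profile = {
        "Deafness.Status": "deaf",          # deaf / hard-of-hearing / hearing
        "Deafness.AidType": "",             # left empty: no hearing aid
        "SignLanguageExperience.ExposureAge": 3,
        "SignLanguageExperience.AcquisitionLocation": "school for the deaf",
        "SignLanguageExperience.SignTeaching": "no",
        "Family.Mother.Deafness": "hearing",
        "Family.Mother.PrimaryCommunicationForm": "sign",
        "Education.Age": "4-18",            # start and end age
        "Education.SchoolType": "primary school",
        "Education.ClassKind": "deaf",
        "Education.BoardingSchool": "no",
    }

    # Content-level keys, including the four interpreting descriptors:
    content_profile = {
        "LanguageVariety": "SLN",
        "ElicitationMethod": "fable retelling",
        "Interpreting.Audience": "",        # no interpreter in this session
        "Interpreting.Source": "",
        "Interpreting.Target": "",
        "Interpreting.Visibility": "",
    }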

4. A standard set of linguistic transcription conventions for sign language data

4.1 An introduction to ELAN and the 'tier' concept

Below we describe the different tiers used for the ECHO project⁷. A tier is a set of annotations that share the same characteristics, e.g. one tier containing all the glosses for the right hand and another tier containing the Dutch translations. ELAN distinguishes between two types of tiers: "parent tiers" and "child tiers". Parent tiers are independent tiers, which contain annotations that are linked directly to a time interval of the media file. Child tiers, or referring tiers, contain annotations that are linked to annotations on another tier (the parent tier)⁸. ELAN provides the opportunity to select one or more video frames and assign a specific value to this selected time span. For example, when the eye brows are first up and then down (neutral) in the same sign, one would select only the time interval in the video in which the eyebrows are up for the brows tier, and mark that time domain with a specific code (for instance 'up'). This is possible for all tiers that one creates.
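A minimal sketch of the parent/child tier idea follows. The tier names mirror those proposed for the ECHO set, but the data structures are invented for illustration and simplify the time-alignment model; they are not ELAN's actual file format.

    # Parent tiers carry time-aligned annotations; child (referring) tiers
    # attach their annotations to annotations on the parent tier.
    gloss_rh = {          # parent tier: glosses for the right hand
        "name": "Gloss RH",
        "annotations": [{"id": "a1", "start": 1.20, "end": 1.85, "value": "RADIO"}],
    }
    repetition_rh = {     # child tier: refers to annotations on "Gloss RH"
        "name": "Repetition RH",
        "parent": "Gloss RH",
        "annotations": [{"ref": "a1", "value": "repeated"}],
    }
    brows = {             # independent tier: eyebrow position over an interval
        "name": "Eye Brows",
        "annotations": [{"id": "b1", "start": 1.20, "end": 1.50, "value": "up"}],
    }
    # The eyebrow annotation covers only the interval where the brows are
    # up, even though the sign on "Gloss RH" lasts longer, as in the
    # example described above.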
It is important to emphasize that, unlike in the IMDI software, there is no standard set of tiers for any document. Tiers have to be set up by the user for every annotation file that is created to annotate a media file. The set that we propose is just that: a proposal for a set of tiers that cover elementary transcription categories that can be useful for many different kinds of research. The use of this set of tiers is exemplified by the data transcribed for the ECHO project⁹. Any user can add both additional tiers and additional annotations on existing tiers to the documents that have been published in the context of the ECHO project.

4.2 Tiers with general information

General information that can be supplied for every fragment of a video file includes Translation tiers for English, Swedish and Dutch. Each of these tiers targets a translation at sentence level. An annotation on the Role tier indicates that the signer takes on the role of a specific discourse participant, as commonly happens in sign language discourse. Finally, the Comments/notes tier can be used to add any kind of comment by the user.

4.3 Tiers with manual information

Manual behavior is systematically described separately for the two hands. For both the left and the right hand, there is a Gloss tier. Child tiers for each of these two articulators specify whether there is Repetition in the movement of the glossed unit, and what the Direction & Location of each of the hands is.

4.4 Tiers with non-manual information

A set of non-manual tiers allows for the specification of some of the relevant properties of the face, head, and body of the signer. Movement of the Head and Eye Brows can be specified, as well as the amount of Eye Aperture (including the notation of eye blinks) and the direction of Eye Gaze.

A new system was devised to specify the behavior of the Mouth, including the tongue, which in previous systems was often treated in a rather fragmentary manner (Nonhebel, Crasborn & van der Kooij 2004b).

4.5 Properties of the transcription conventions

The transcription system outlined in the sections above had two central goals. First of all, it should be easy and relatively quick to use for encoders, so that users can transcribe considerable amounts of data within a reasonable time frame. This inevitably goes at the expense of detail. For example, for facial expression, the FACS system (Ekman, Friesen & Hager 2002) is the most detailed and accurate transcription method that is known, but it is extremely time-intensive to learn to master and use, and it offers far more detail than is necessary for the large majority of research projects. The tiers for non-manual activity that we propose aim to form an optimal compromise between the amount of detail available to the user and the time investment made by the transcriber.

Secondly, we tried to systematically separate form from function for all tiers. Since the function of a given linguistic form can vary from language to language, it is crucial to emphasize the coding of the form of linguistic behavior.

5. Specifications for future tools

Most importantly in the context of this paper, searching across both data and metadata domains will need to be an important target of further development. In the present state of the tools, one needs first to search within the set of metadata categories, and then, in the resulting set of transcription files, search for data categories one by one. Finding all cases of weak hand spreading by people younger than 20 thus becomes a very time-consuming task, whereas corpora are particularly useful for exactly those kinds of complex queries. A sketch of such a combined query is given below.
4.3 Tiers with manual information

Manual behavior is systematically described separately for the two hands. For both the left and the right hand, there is a Gloss tier. Child tiers for each of these two articulators specify whether there is Repetition in the movement of the glossed unit, and what the Direction & Location of each of the hands is.

4.4 Tiers with non-manual information

A set of non-manual tiers allows for the specification of some of the relevant properties of the face, head, and body of the signer. Movement of the Head and Eye Brows can be specified, as well as the amount of Eye Aperture (including the notation of eye blinks) and the direction of Eye Gaze. A new system was devised to specify the behavior of the Mouth, including the tongue, which in previous systems was often treated in a rather fragmentary manner (Nonhebel, Crasborn & van der Kooij 2004b).

4.5 Properties of the transcription conventions

The transcription system outlined in the sections above had two central goals. First of all, it should be easy and relatively quick to use for encoders, so that users can transcribe considerable amounts of data within a reasonable time frame. This inevitably goes at the expense of detail. For example, for facial expression, the FACS system (Ekman, Friesen & Hager 2002) is the most detailed and accurate transcription method that is known, but it is extremely time-intensive to learn to master and use, and it offers far more detail than is necessary for the large majority of research projects. The tiers for non-manual activity that we propose aim to form an optimal compromise between the amount of detail available to the user and the time investment made by the transcriber.

Secondly, we tried to systematically separate form from function for all tiers. Since the function of a given linguistic form can vary from language to language, it is crucial to emphasize the coding of the form of linguistic behavior.

5. Specifications for future tools

Most importantly in the context of this paper, searching across both data and metadata domains will need to be an important target of further development. In the present state of the tools, one needs to first search within the set of metadata categories and then, in the resulting set of transcription files, search for data categories one by one. Finding all cases of weak hand spreading by people younger than 20 thus becomes a very time-consuming task, whereas corpora are particularly useful for precisely those kinds of complex queries.
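A cross-domain query of this kind, together with the basic statistical functions discussed below, could look roughly as follows (a hypothetical Python sketch; the record layout, tier names, and annotation values are invented and are not the IMDI or ELAN formats):

```python
from collections import Counter

# Hypothetical in-memory corpus: one record per transcription file, pairing
# metadata with the annotations of its tiers as (tier, value, start_ms, end_ms).
corpus = [
    {"metadata": {"age": 18}, "annotations": [("WeakHand", "spreading", 400, 650)]},
    {"metadata": {"age": 34}, "annotations": [("WeakHand", "spreading", 900, 1100)]},
]

def query(corpus, tier, value, max_age):
    """Filter on a metadata category, then search annotation values."""
    hits = []
    for doc in corpus:
        if doc["metadata"]["age"] < max_age:
            hits += [a for a in doc["annotations"] if a[0] == tier and a[1] == value]
    return hits

hits = query(corpus, "WeakHand", "spreading", max_age=20)

# Basic statistics of the kind called for below: value frequencies and durations.
frequencies = Counter(value for (_, value, _, _) in hits)
durations_ms = [end - start for (_, _, start, end) in hits]
```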
In the sign language research community, working with corpus data is still very uncommon, presumably in part because until now there have been no commonly used written forms of sign languages that would have allowed the creation of text corpora. Now that the computer technology is available to build up corpora of digitized video recordings and to annotate them, software is needed, in addition to the search facilities, to provide basic statistical functions in ELAN, including frequencies of annotation values on different tiers and the distribution of the durations of these annotation values. Currently, the most obvious way to perform quantitative analyses of transcription files is to export data to a spreadsheet program for further analysis.

A function that is currently being implemented is to add a visualization of kinematic recordings to the transcription of video material, similar to the display of the oscillogram of sound files in ELAN. These numerical data can then be more easily integrated with qualitative analyses based on transcription. Additionally, the software will need to provide numerical analyses appropriate to the phonetic analysis of sign languages, similar to the 'Praat' software for speech analysis (Boersma & Weenink 2004). As the field of sign language phonetics is still in its infancy, the specifications of such functionality will have to develop over the years to come. Finally, a similar integration of quantitative data from eye-tracking equipment would enhance the usability of the software for some research groups.

Working together with colleagues anywhere in the world on the same annotation document at the same time is another function currently under development. Using peer-to-peer technology, it will become possible to look at the same annotation document on different computers connected to the internet, and to instantly see modifications that are being made by the other party. In combination with a chat function, one can jointly look at existing annotations and create new ones (see Brugman, Crasborn & Russel 2004 for further details on this 'collaborative annotation' concept).

6. Ethical aspects of publishing sign language data online

Needless to say, the privacy of subjects in scientific studies has to be respected. For the sign language study in the ECHO project, this gives rise to extra problems not previously encountered in the creation of spoken language corpora that just make use of sound recordings. The visual information in the video recordings contains a lot more personal information than audio recordings of voices, including not only the identity of the signer (i.e., the visual appearance of the face), but also more clues to the emotional state and age of the person, for example.

While it is common practice to ask subjects in linguistic recordings for their explicit written permission to use the recordings for various purposes, including making images for publications, discussion among sign language specialists revealed that this permission is a rather sensitive issue in the case of internet publication. Publication of data online implies that the information is available to the whole world, and not just to a limited group of people with access to specific university libraries, for example, as in the case of the video tape recordings used until recently. Signers who have no problem with the inclusion of the video data at the time of recording may well regret this choice 15 years later. Can this be considered the problem of the person involved, or should researchers make more of an effort to outline the implications of sharing data to subjects?

Alternatively, data access can be restricted to linguists registered as users of the corpus by the host institution, but this comes down to restricting access to data that were intended to be public – at least within the open access concept that is central to the ECHO project. Future projects aimed at making data accessible online should explore these issues in more depth, with assistance from both legal and ethics specialists.

7. References

Boersma, P. & D. Weenink (2004) Praat. Doing phonetics by computer. http://www.praat.org.
Brugman, H., O. Crasborn & A. Russel (2004) Collaborative annotation of sign language data with peer-to-peer technology. Paper presented at LREC 2004, Lissabon.
Crasborn, O. & T. Hanke (2003) Additions to the IMDI metadata set for sign language corpora. http://www.let.kun.nl/sign-lang/echo/docs/SignMetadata_Oct2003.doc.
Ekman, P., W. Friesen & J. Hager (2002) Facial Action Coding System. Salt Lake City: Research Nexus.
Hanke, T. & S. Prillwitz (1995) syncWRITER: Integrating video into the transcription and analysis of sign language. In H. Bos & T. Schermer (eds.) Sign language research 1994. Hamburg: Signum. Pp. 303-312.
Nonhebel, A., O. Crasborn & E. van der Kooij (2004a) Sign language transcription conventions for the ECHO project. http://www.let.kun.nl/sign-lang/echo/docs/transcr_conv.pdf.
Nonhebel, A., O. Crasborn & E. van der Kooij (2004b) Sign language transcription conventions for the ECHO project. BSL and NGT mouth annotations. http://www.let.kun.nl/sign-lang/echo/docs/transcr_mouth.pdf.
Spatial Representation of Classifier Predicates for
Machine Translation into American Sign Language
Matt Huenerfauth
Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104
[email protected]

Abstract
The translation of English text into American Sign Language (ASL) animation tests the limits of traditional machine translation (MT)
approaches. The generation of spatially complex ASL phenomena called “classifier predicates” motivates a new representation for
ASL based on virtual reality modeling software, and previous linguistic research provides constraints on the design of an English-to-
Classifier-Predicate translation process operating on this representation. This translation design can be incorporated into a multi-
pathway architecture to build English-to-ASL MT systems capable of producing classifier predicates.

Introduction and Motivations

Although Deaf students in the U.S. and Canada are taught written English, the challenge of acquiring a spoken language for students with hearing impairments results in the majority of Deaf U.S. high school graduates reading at a fourth-grade1 level (Holt, 1991). Unfortunately, many strategies for making elements of the hearing world accessible to the Deaf (e.g. television closed captioning or teletype telephone services) assume that the user has strong English literacy skills. Since many Deaf people who have difficulty reading English possess stronger fluency in American Sign Language (ASL), an automated English-to-ASL machine translation (MT) system can make more information and services accessible in situations where English captioning text is at too high a reading level or a live interpreter is unavailable.

1 Students who are age eighteen and older are reading English text at a level more typical of a ten-year-old student.

Previous English-to-ASL MT systems have used 3D graphics software to animate a virtual human character to perform ASL output. Generally, a script written in a basic animation instruction set controls the character's movement; so, MT systems must translate English text into a script directing the character to perform ASL. Previous projects have either used word-to-sign dictionaries to produce English-like manual signing output, or they have incorporated analysis grammars and transfer rules to produce ASL output (Huenerfauth, 2003; Sáfár and Marshall, 2001; Speers, 2001; Zhao et al., 2000). While most of this ASL MT work is still preliminary, there is promise that an MT system will one day be able to translate many kinds of English-to-ASL sentences, although some particular ASL phenomena – those involving complex use of the signing space – have proven difficult for traditional MT approaches. This paper will present a design for generating these expressions.

ASL Spatial Phenomena

ASL signers use the space around them for several grammatical, discourse, and descriptive purposes. During a conversation, an entity under discussion (whether concrete or abstract) can be "positioned" at a point in the signing space. Subsequent pronominal reference to this entity can be made by pointing to this location (Neidle et al., 2000). Some verb signs will move toward or away from these points to indicate (or show agreement with) their arguments (Liddell, 2003a; Neidle et al., 2000). Generally, the locations chosen for this use of the signing space are not topologically meaningful; that is, one imaginary entity being positioned to the left of another in the signing space doesn't necessarily indicate that the entity is to the left of the other in the real world.

Other ASL expressions are more complex in their use of space and position invisible objects around the signer to topologically indicate the arrangement of entities in a 3D scene being discussed. Constructions called "classifier predicates" allow signers to use their hands to position, move, trace, or re-orient an imaginary object in the space in front of them to indicate the location, movement, shape, contour, physical dimension, or some other property of a corresponding real world entity under discussion. Classifier predicates consist of a semantically meaningful handshape and a 3D hand movement path. A handshape is chosen from a closed set based on characteristics of the entity described (whether it is a vehicle, human, animal, etc.) and what aspect of the entity the signer is describing (surface, position, motion, etc.).

For example, the sentence "the car drove down the bumpy road past the cat" could be expressed in ASL using two classifier predicates. First, a signer would move a hand in a "bent V" handshape (index and middle fingers extended and bent slightly) forward and slightly downward to a point in space in front of his or her torso where an imaginary miniature cat could be envisioned. Next, a hand in a "3" handshape (thumb, index, and middle fingers extended, with the thumb pointing upwards) could trace a path in space past the "cat" in an up-and-down fashion as if it were a car bouncing along a bumpy road. Generally, "bent V" handshapes are used for animals, and "3" handshapes, for vehicles.

Generating Classifier Predicates

As the "bumpy road" example suggests, translation involving classifier predicates is more complex than most English-to-ASL MT because of the highly productive and spatially representational nature of these signs. Previous ASL MT systems have dealt with this problem by omitting these expressions from their
linguistic coverage; however, many English concepts lack a fluent ASL translation without them. Further, these predicates are common in ASL; in many genres, signers produce a classifier predicate on average once per 100 signs (this is approximately once per minute at typical signing rates) (Morford and MacFarlane, 2003). So, systems that cannot produce classifier predicates can only produce ASL of limited fluency and are not a viable long-term solution to the English-to-ASL MT problem.

Classifier predicates challenge traditional definitions of what constitutes linguistic expression, and they oftentimes incorporate spatial metaphor and scene-visualization to such a degree that there is debate as to whether they are paralinguistic spatial gestures, non-spatial polymorphemic constructions, or compositional yet spatially-parameterized expressions (Liddell, 2003b). No matter their true nature, an ASL MT system must somehow generate classifier predicates. While MT designs are not required to follow linguistic models of human language production in order to be successful, it is worthwhile to consider linguistic models that account well for the ASL classifier predicate data but minimize the computational or representational overhead required to implement them.

Design Focus and Assumptions

This paper will focus on the generation of classifier predicates of movement and location (Supalla, 1982; Liddell, 2003a). Most of the discussion will be about generating individual classifier predicates; an approach for generating multiple interrelated predicates will be proposed toward the end of the paper.

This paper will assume that English input sentences that should be translated into ASL classifier predicates can be identified. Some of the MT designs proposed below will be specialized for the task of generating these phenomena. Since a complete MT system for English-to-ASL would need to generate more than just classifier predicates, the designs discussed below would need to be embedded within an MT system that had other processing pathways for handling non-spatial English input sentences. The design of such multi-pathway MT architectures is another focus of this research project (Huenerfauth, 2004).

These other pathways could handle most inputs by employing traditional MT technologies (like the ASL MT systems mentioned above). A sentence could be "identified" (or intercepted) for special processing in the classifier predicate pathway if it fell within the pathway's implemented lexical (and – for some designs – spatial) resources.2 In this way, a classifier predicate generation component could actually be built on top of an existing ASL MT system that didn't currently support classifier predicate expressions.

2 A later section of this paper describes how the decision of whether an input English sentence can be processed by the special classifier predicate translation pathway depends on whether a motif (introduced in that section) has been implemented for the semantic domain of that sentence.

We will first consider a classifier predicate MT approach requiring little linguistic processing or novel ASL representations, namely a fully lexicalized approach. As engineering limitations are identified or additional linguistic analyses are considered, the design will be modified, and progressively more sophisticated representations and processing architectures will emerge.

Design 1: Lexicalize the Movement Paths

The task of selecting the appropriate handshape for a classifier predicate, while non-trivial, seems approachable with a lexicalized design. For example, by storing semantic features (e.g. +human, +vehicle, +animal, +flat-surface) in the English lexicon, possible handshapes can be identified for entities referred to by particular English nouns. Associating other features (e.g. +motion-path, +stationary-location, +relative-locations, +shape-contour) with particular verbs or prepositions in the English lexicon could help identify what kind of information the predicate must express – further narrowing the set of possible classifier handshapes. To produce the 3D movement portion of the predicate using this lexicalized approach, we could store a set of 3D coordinates in the English lexicon for each word or phrase (piece of lexicalized syntactic structure) that may be translated as a classifier predicate.
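A minimal sketch of this lexicalized design (the feature names, handshape labels, and coordinates below are invented for illustration, not taken from an actual implementation):

```python
# Hypothetical English lexicon entries carrying semantic features (design 1).
LEXICON = {
    "car": {"features": {"+vehicle"}},
    "cat": {"features": {"+animal"}},
}

# Closed set of classifier handshapes, keyed by entity features.
HANDSHAPES = {"+vehicle": "3", "+animal": "bent-V", "+human": "1"}

# A naively lexicalized 3D path stored with a whole phrase -- the very
# approach the next subsection argues is not scalable.
PHRASE_PATHS = {
    "drive up a hill": [(0.0, 0.0, 0.0), (0.2, 0.1, 0.1), (0.4, 0.3, 0.2)],
}

def handshape_for(noun: str) -> str:
    feats = LEXICON[noun]["features"]
    return next(HANDSHAPES[f] for f in HANDSHAPES if f in feats)

assert handshape_for("car") == "3"
```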
Problems with This Design

Unfortunately, the highly productive and scene-specific nature of these signs makes them potentially infinite in number. For example, while it may seem possible to simply store a 3D path with the English phrase "driving up a hill," factors like the curve of the road, the steepness of the hill, how far up to drive, etc. would affect the final output. So, a naïve lexicalized 3D-semantics treatment of classifier movement would not be scalable.

Design 2: Compose the Movement Paths

Since the system may need to produce innumerable possible classifier predicates, we can't merely treat the movement path as an unanalyzable whole. A more practical design would compose a 3D path based on some finite set of features or semantic elements from the English source text. This approach would need a library of basic animation components that could be combined to produce a single classifier predicate movement. Such an "animation lexicon" would contain common positions in ASL space, relative orientations of objects in space (for concepts like above, below, across from), common motion paths, or common contours for such paths. Finally, these components would be associated with corresponding features or semantic elements of English so that the appropriate animation components can be selected and combined at translation time to produce a 3D path.
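A compositional movement path of this kind might be assembled as in the following sketch (the primitive inventory and the sampling scheme are invented assumptions, not a described implementation):

```python
# Sketch of design 2: an invented "animation lexicon" of primitive components,
# each mapping a normalized time t in [0, 1] to a 3D displacement.
ANIMATION_LEXICON = {
    "forward": lambda t: (0.0, 0.0, 0.4 * t),
    "upward":  lambda t: (0.0, 0.3 * t, 0.0),
    # A contour primitive: small alternating vertical offsets ("bumpy").
    "bumpy":   lambda t: (0.0, 0.02 * ((int(t * 8) % 2) * 2 - 1), 0.0),
}

def compose_path(components, steps=10):
    """Sum the selected primitives into one sampled 3D movement path."""
    path = []
    for i in range(steps + 1):
        t = i / steps
        x = sum(ANIMATION_LEXICON[c](t)[0] for c in components)
        y = sum(ANIMATION_LEXICON[c](t)[1] for c in components)
        z = sum(ANIMATION_LEXICON[c](t)[2] for c in components)
        path.append((x, y, z))
    return path

drive_up_hill = compose_path(["forward", "upward", "bumpy"])
```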
Problems with This Design

This design is analogous to the polymorphemic model of classifier predicate generation (Supalla 1978, 1982, 1986). This model describes ASL classifier predicates as categorical, and it characterizes their generation as a process of combining sets of spatially semantic morphemes. The difficulty is that every piece of spatial information we might express with a classifier predicate must be encoded as a morpheme. These phenomena can convey such a wide variety of spatial information – especially when used in combination to describe spatial relationships or comparisons between objects in a scene – that many morphemes are required.
Liddell's analysis (2003b) of the polymorphemic model indicates that, in order to generate the variety of classifier predicates seen in ASL data, the model would need a tremendously large (and possibly infinite) number of morphemes. Using a polymorphemic analysis, Liddell (2003b) decomposes a classifier predicate of one person walking up to another, and he finds over 28 morphemes, including some for: two entities facing each other, being on the same horizontal plane, being vertically oriented, being freely moving, being a particular distance apart, moving on a straight path, etc.

Liddell considers classifier predicates as being continuous and somewhat gestural in nature (2003a), and this partially explains his rejection of the model. (If there is not a finite number of possible sizes, locations, and relative orientations for objects in the scene, then the number of morphemes needed becomes infinite.) Whether classifier predicates are continuous or categorical, and whether this number of morphemes is infinite or finite, the number would likely be intractably large for an MT system to process. We will see that the final classifier predicate generation design proposed in this paper will use a non-categorical approach for selecting its 3D hand locations and movements. This should not be taken as a linguistic claim about human ASL signers (who may indeed use the large numbers of morphemes required by the polymorphemic model) but rather as a tractable engineering solution to the highly productive nature of classifier predicates.

Another reason why a polymorphemic approach to classifier predicate generation would be difficult to implement in a computational system is that the complex spatial interactions and constraints of a 3D scene would be difficult to encode in a set of compositional rules. For example, consider the two classifier predicates in the "the car drove down the bumpy road past the cat" example. To produce these predicates, the signer must know how the scene is arranged, including the locations of the cat, the road, and the car. A path for the car must be chosen with beginning/ending positions, and the hand must be articulated to indicate the contour of the path (e.g. bumpy, hilly, twisty). The proximity of the road to the cat, the plane of the ground, and the curve of the road must be selected. Other properties of the objects must be known: (1) cats generally sit on the ground and (2) cars generally travel along the ground on roads. The successful translation of the English sentence into these two classifier predicates involves a great deal of semantic understanding, spatial knowledge, and reasoning.

A 3D Spatial Representation for ASL MT

ASL signers using classifier predicates handle these complexities using their own spatial knowledge and reasoning and by visualizing the elements of the scene. An MT system may also benefit from a 3D representation of the scene from which it could calculate the movement paths of classifier predicates. While design 2 needed compositional rules (and associated morphemes) to cover every possible combination of object positions and spatial implications as suggested by English texts, the third and final MT design (discussed in a later section) will use virtual reality 3D scene modeling software to simulate the movement and location of entities described by an English text (and to automatically manage their interactions).

The AnimNL System

A system for producing a changing 3D virtual reality representation of a scene from an English text has already been implemented: the Natural Language Instructions for Dynamically Altering Agent Behaviors system (Schuler, 2003; Bindiganavale et al., 2000; Badler et al., 2000) (herein, "AnimNL"). The system displays a 3D animation and accepts English input text containing instructions for the characters and objects in the scene to follow. It updates the virtual reality so that objects obey the English commands. AnimNL has been used in military training and equipment repair domains and can be extended, by augmenting its library of Parameterized Action Representations (PARs), to cover additional domains of English input texts.

The system's ability to interact with language and plan future actions arises from the use of PARs, which can be thought of as animation/linguistic primitives for structuring the movements in a 3D scene. PARs are feature-value structures that have slots specifying: what agent is moving, the path/manner of this motion, whether it is translational/rotational motion, the terminating conditions on the motion, any speed or timing data, etc. A single locomotion event may contain several sub-movements or sub-events, and for this reason, PARs may be defined in a hierarchical manner. A single "high-level" PAR may specify the details for the entire motion, but it may be defined in terms of several "low-level" PARs which specify the more primitive sub-movements/events.
implement in a computational system is that the complex The system stores a database of PAR templates
spatial interactions and constraints of a 3D scene would be that represent prototypical actions the agent can perform.
difficult to encode in a set of compositional rules. For These templates are missing particular details (some of
example, consider the two classifier predicates in the “the their slots aren’t filled in) about the position of the agent
car drove down the bumpy road past the cat” example. To or other entities in the environment that would affect how
produce these predicates, the signer must know how the the animation action should really be performed in
scene is arranged including the locations of the cat, the particular situations. By parameterizing PARs on the 3D
road, and the car. A path for the car must be chosen with coordinates of the objects participating in the movement,
beginning/ending positions, and the hand must be the system can produce animations specific to particular
articulated to indicate the contour of the path (e.g. bumpy, scene configurations and reuse common animation code.
hilly, twisty). The proximity of the road to the cat, the English lexicalized syntactic structures are
plane of the ground, and the curve of the road must be associated with PARs so that the analysis of a text is used
selected. Other properties of the objects must be known: to select a PAR template and fill some of its slots. For
(1) cats generally sit on the ground and (2) cars generally example, there may be a PAR associated with the concept
travel along the ground on roads. The successful of "falling" vs. another for "jumping." While these
translation of the English sentence into these two classifier templates must remain parameterized on the 3D location
predicates involved a great deal of semantic of the agent of the movement until it is known at run time,
understanding, spatial knowledge, and reasoning. there are some properties (in this case, the direction of
motion) that can be specified for each from the English
A 3D Spatial Representation for ASL MT semantics. During analysis of the English input text,
ASL signers using classifier predicates handle semantic features of motion verbs are obtained from the
these complexities using their own spatial knowledge and VerbNet hierarchy (Kipper et al., 2004), and these features
reasoning and by visualizing the elements of the scene. are also used to select and fill a particular motion
An MT system may also benefit from a 3D representation template. Since VerbNet groups verbs that share common
of the scene from which it could calculate the movement semantic/syntactic properties, AnimNL is able to link an
paths of classifier predicates. While design 2 needed entire set of semantically similar motion verbs to a single
compositional rules (and associated morphemes) to cover PAR template. Each of the verbs in the set may fill some
every possible combination of object positions and spatial of the slots of the motion template somewhat differently.
implications as suggested by English texts, the third and

26
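The linkage between a verb class and a shared motion template might be pictured like this (a hedged sketch: the class, verbs, and slot names are invented and do not reproduce VerbNet's actual inventory):

```python
# Hypothetical linkage of a verb class to one shared PAR template: all verbs
# in the class select the same template but pre-fill its slots differently.
RUN_CLASS = {
    "verbs": {"run": {"manner": "run"},
              "jog": {"manner": "jog", "speed": "slow"}},
    "template": "move-along-path",   # one motion template for the whole class
}

def select_par(verb):
    slots = {"template": RUN_CLASS["template"]}
    slots.update(RUN_CLASS["verbs"][verb])  # verb-specific slot fillers
    slots["path"] = None                    # 3D path stays parameterized until run time
    return slots
```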
When a PAR template has been partially filled with information from the English text and 3D object locations, it is passed off to AnimNL's animation planner. In fact, PARs contain slots allowing them to be hierarchical planning operators: pre-conditions, effects, subplans, etc. The movements of all objects in the AnimNL system are governed by a planning process, which allows the objects in the scene to move realistically. Many spatial motions have conditions on the location, orientation, or motion state of an object and its environment before, during, and after the event. The PAR operators help the system work out the details of an animation from the limited specification of this motion provided by an English text. For example, it may determine starting and stopping locations for movement paths or select relative locations for objects in the 3D scene based on prepositions and adverbials in the English input text. The interaction and conditions of these planning operators simulate physical constraints, collision avoidance, human anatomical limitations, and other factors to produce an animation.

Using AnimNL for ASL

The MT system's classifier predicate generator can use the AnimNL software to analyze English sentences to be translated into classifier predicates. AnimNL can process this text as if it were commands for the entities mentioned in the text to follow. Based on this analysis, AnimNL can create and maintain a 3D representation of the location and motion of these entities. Next, a miniature virtual reality animation of the objects in this representation can be overlaid on a volume of the space in front of the torso of the animated ASL-signing character. In this way, a miniature 3D virtual reality would be embedded within the original 3D space containing the standing animated virtual human. In the "bumpy road" example, a small invisible object would be positioned in space in front of the chest of the signing character to represent the cat. Next, a 3D animation path and location for the car (relative to the cat) would be chosen in front of the character's chest.

The AnimNL software can thus produce a miniature "invisible world" representing the scene described by the input text. Unlike in other applications of AnimNL – where entities described by the English text would need to be rendered to the screen – in this situation, the 3D objects would be transparent. Therefore, the MT system does not care about the exact appearance of the objects being modeled. Only the location, orientation, and motion paths of these objects in some generic 3D space are important, since this information will be used to produce classifier predicates for the animated ASL-signing character.

An Overly Simplistic Generation Strategy

The next section of this paper (design 3) will discuss how the "invisible world" representation can be used to generate classifier predicates. To motivate that third and final design, we will first consider an overly simplistic (and incorrect) strategy for using the virtual reality to attempt classifier predicate generation.

This simplistic "Directly Pictorial" strategy for building a classifier predicate is as follows: When a new object is introduced into the invisible world, the signing character moves its hand to a location "inside of" the transparent object. By also choosing an appropriate handshape for the character (possibly using the +animal or +vehicle features discussed above), a classifier predicate is apparently produced that conveys the spatial information from the English text. As objects in the invisible world are moved or reoriented while AnimNL analyzes more text, the signer can express this information using additional classifier predicates by again placing its hand inside the (possibly moving) 3D object. (See Figure 1.)
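Under these assumptions (an invisible-world model exposing an entity's 3D path; all names are hypothetical), the naive strategy amounts to little more than the following sketch — which is precisely why the next subsection argues against it:

```python
# Sketch of the naive "Directly Pictorial" strategy (argued against below):
# the signing hand simply shadows the transparent object's successive locations.
def directly_pictorial(entity, invisible_world, handshape):
    return [{"handshape": handshape, "hand_at": position}
            for position in invisible_world[entity]["path"]]

car_predicate = directly_pictorial(
    "car", {"car": {"path": [(0.1, 0.2, 0.3), (0.3, 0.2, 0.35)]}}, handshape="3")
```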
Limitations of the "Directly Pictorial" Strategy

Whereas design 2 mirrored the polymorphemic model, this design is similar to that of DeMatteo (1977), who sees classifier predicates as being direct "spatial analogues" of 3D movement paths in a scene imagined by the signer (Liddell, 2003b). In this model, signers maintain a 3D mental image of a scene to be described, select appropriate handshapes to refer to entities in their model, and trace out topologically analogous location and movement paths for these entities using their hands.

Unfortunately, the model is over-generative (Liddell, 2003b). By assuming that the selection of handshapes and movements is orthogonal and that movement paths are directly representative3 of the paths of entities in space, this analysis predicts many ASL classifier constructions that never appear in the data (containing imaginable but ungrammatical combinations of handshape, orientation, and movement) (Liddell, 2003b). Finally, the model cannot consider discourse and non-spatial semantic features that can influence classifier predicate production in ASL.

3 To illustrate how classifier predicate movements can be conventional and not visually representative, Liddell (2003b) uses the example of an upright figure walking leisurely being expressed as a classifier predicate with a D handshape slightly bouncing as it moves along a path. While the hand bounces, the meaning is not that a human is bouncing but that he or she is walking leisurely.

Design 3: Lexicon of Classifier Predicates

The "Directly Pictorial" strategy was just one way to use the 3D information in the invisible world representation to generate classifier predicates. This section will introduce the MT approach advocated by this paper: design 3. This design uses the invisible world but avoids the limitations of the previous strategy by considering additional sources of information during translation. Whereas previous sections of this paper have used comparisons to linguistic models to critique an MT design, this section will use a linguistic model for inspiration.

Lexicon of Classifier Predicate Templates

Liddell (2003a, 2003b) proposed that ASL classifier predicates are stored as large numbers of abstract templates in a lexicon. They are "abstract" in the sense that each is a template parameterized on the 3D coordinates of whatever object is being described, and each can therefore be instantiated into many possible
classifier predicate outputs. For example, there may be one template for classifier predicates expressing that a car is parked at a point in space; when this template is turned into an actual classifier predicate, the 3D coordinate of the car would be filled in.

Figure 1: "Directly Pictorial" Generation Strategy (argued against in this paper). Solid lines depict transformation processes between representations, and dotted lines, information flow into a process.

Figure 2: The Design 3 Architecture. Notice the new selection/filling process for a Classifier Predicate PAR based on: a PAR template, the 3D scene data, and English text features.

Each lexical entry stores the semantic content of a particular classifier predicate and most of the handshape and movement specification for its performance. A signer selects a template based on how well its spatial and non-spatial semantics convey the desired content. When a signer generates a classifier predicate from this template, the locations, orientations, and specific movement paths of objects in a 3D mental spatial representation are used to fill the remaining parameters of the template and produce a full specification of how to perform the classifier predicate.

Although the previous paragraph refers to this approach as "lexical," it differs from design 1 (which augmented the English lexicon with 3D movement data) because it creates a distinct ASL lexicon of classifier predicates, and the movement information in these entries is parameterized on the data in the 3D scene. While these templates may also resemble the compositional morphemes of the polymorphemic model (the "animation lexicon" of design 2), since they both link semantics to 3D movement, these templates have more pre-compiled structure. While the morphemes required complex processing by compositional rules, the templates just need to be selected and to have their 3D parameters set.

Liddell (2003b) explains that this model avoids the under-generation of Supalla (1978, 1982, 1986) by incorporating a 3D spatial representation to select locations and movement paths, but it also avoids the over-generation of DeMatteo (1977) by restricting the possible combinations of handshapes and movement paths. Impossible combinations are explained as lexical gaps; ungrammatical classifier predicate feature combinations are simply not entries in the lexicon (Liddell, 2003b).

Classifier Predicate Templates for MT

To implement this linguistic model as an MT design, we will need: (1) a 3D scene representation, (2) a list of templates for producing the signing character's arm movements, (3) a way to link the semantics of English sentences to specific templates, and (4) a method for turning a filled template into an animation of the signer's arm. Requirement 1 is satisfied by the invisible world representation produced by the AnimNL software.

While the AnimNL software used one database of PAR templates to produce the 3D animation of objects in the invisible world, this design can fulfill requirement 2 by adding a second database, whose PAR templates will describe the animated movement of the signing character's arm as it performs a classifier predicate. (The first set will be called "invisible world" PARs, and the second, "classifier predicate" PARs.) Compared to the invisible world PARs, the classifier predicate PARs will be very simple: they will store instructions for the signing character's hand to be in a particular shape and for it to move between two or more 3D coordinates in the signing space – possibly along a programmed contour.

The re-use of PAR templates suggests a method for linking the semantics of the English text to arm movement templates (requirement 3). Just as the AnimNL software used features of lexical syntactic structures to trigger invisible world PARs, design 3 can use these features to link the semantics of English sentences to classifier predicate PARs. These features can help select a template and fill some of its non-spatial information slots. Finally, data from the invisible world representation can fill the spatial parameters of the classifier predicate PAR.

Since arm movements are represented as PARs, this design can use a planning process (like that of the AnimNL software) to transform these PARs into a 3D animation script (requirement 4). While the AnimNL's planning process turned invisible world PARs into animations of invisible objects, this planning process will turn classifier predicate PARs into an animation script controlling the movement of the signing character's arm as it produces a classifier predicate. (See Figure 2.)
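The selection/filling process of Figure 2 might be sketched as follows (hypothetical Python; the template keys, feature bundles, and coordinates are invented for illustration):

```python
# Sketch of design 3's selection/filling step: an English feature bundle
# selects a classifier predicate PAR template (non-spatial slots), and the
# invisible-world model fills its spatial parameters.
CP_TEMPLATES = {
    ("vehicle", "move-along-path"): {"handshape": "3", "contour": None},
    ("animal", "be-located"):       {"handshape": "bent-V", "contour": None},
}

invisible_world = {  # maintained by the AnimNL-based scene analysis
    "car": {"path": [(0.10, 0.20, 0.30), (0.40, 0.20, 0.35)]},
    "cat": {"location": (0.45, 0.15, 0.30)},
}

def instantiate(entity, semantic_key, manner=None):
    cp = dict(CP_TEMPLATES[semantic_key])   # select template (non-spatial slots)
    cp["contour"] = manner                  # e.g. "bumpy", from the English text
    cp.update(invisible_world[entity])      # fill spatial slots from the 3D scene
    return cp

cp_car = instantiate("car", ("vehicle", "move-along-path"), manner="bumpy")
```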

Generating Multiple Classifier Predicates

Up until now, this paper has focused on generating a single classifier predicate from a single English sentence, but in fact, the actual English-to-ASL translation problem is more complex. New challenges arise when generating several interrelated classifier predicates to describe a single scene. While specifying a system to generate a single predicate has been a natural starting point (and a first priority), it is important to consider how this architecture would need to be enhanced to handle the production of multiple classifier predicates. If these issues are not considered early in the development process, then software design decisions may be made that would make the MT system difficult to extend.

While the earlier sections of this paper may have suggested that there is always a correspondence between a single English input sentence and a single ASL classifier predicate output, in fact, several classifier predicates may be needed to convey the semantics of one English sentence (or vice versa). Even when the mapping is one-to-one, the classifier predicates may need to be rearranged during translation to reflect the scene organization or ASL conventions on how these predicates are sequenced or combined. For instance, when describing the arrangement of furniture in a room, signers generally sequence their description starting with items to one side of the doorway and then circling across the room back to the doorway again. An English description of a room may be significantly less spatially systematic in its ordering.

Multiple classifier predicates used to describe a single scene may also interact with and constrain one another. The selection of scale, perspective, and orientation of a scene chosen for the first classifier predicate will affect those that follow it. If decisions about the representation of the virtual reality scene are made without considering the requirements of the later classifier predicates, then output may be produced which arranges the elements of the scene in a non-fluent manner. Often the first English sentence describing a 3D scene may not contain enough detail to make all of the choices about the scene layout or perspective. A generation approach that considers the spatial information in adjacent (later) English input sentences prior to making such decisions could produce higher quality ASL output.

Another motivation for making generation decisions for groups of related classifier predicates is that the semantics of multiple classifier predicates may interact to produce emergent meaning. For example, one way to convey that an object is between two others in a scene is to use three classifier predicates: two to locate the elements on each side and then one for the entity in the middle. In isolation, these classifier predicates do not convey any idea of a spatial relationship, but in coordinated combination, this semantic effect is achieved.

Classifier Predicate Motifs

An MT system could handle the translation complexities discussed above by using sets of multi-classifier templates called motifs. Instead of immediately triggering one ASL classifier as each sentence of an English text is encountered, the system will now represent collections of multiple interrelated classifier predicate templates that can be used together to describe a scene. These collective structures would allow generation decisions to be made at the scene level, thus decoupling individual English sentences from individual classifier predicates. The motif structure could decide how many classifiers must be used to communicate some block of spatial information and how to coordinate and arrange them.

A motif would serve as a set of deep generation rules or patterns for constructing a series of ASL classifier predicates in a specific semantic genre – e.g. movement of vehicles, giving directions, furniture arrangement, movements of walking people, etc. While this paper focuses on movement and location predicates, motifs can be imagined for size and shape specifiers (e.g. stripes or spots on clothing), instrument classifiers (e.g. using handtools), and others. Each motif would contain conditional rules for determining when it should be employed, that is, whether a particular English input text is within its genre. Just like the classifier predicate PAR templates in design 3, motifs could be triggered by features of the analyzed English text.4

4 A stochastic motif genre-identifier could also be induced from statistical analyses of English texts known to produce a certain type of classifier predicate translation.

Motifs would use planning rules to select and sequence their component predicates and to choose the best viewpoint, orientation, and scale for the entire scene. Having a separate motif for each genre would allow these planning rules to be specialized for how interrelated classifier predicates communicate spatial semantic information in a particular domain – possibly using genre-specific conventions, as in the "furniture arrangement" example. Each motif could translate an English sentence according to its own guidelines; so, the system could translate the same input sentence differently based on the motif genre in which it occurred.

Implementation Issues

We can extend design 3 to generate multiple classifier predicates by adding a database of motif representations to be used in the PAR-planning process. In fact, these multi-predicate motifs could be represented as additional higher-level PAR templates. In the same way that a classifier predicate PAR can be hierarchically decomposed into sub-movements of the signer's arm (each represented by a lower-level PAR), analogously, a PAR representing a multi-predicate motif can be decomposed into PARs for individual classifier predicates. In design 3, English text features immediately triggered a single classifier predicate PAR; now, English features will trigger a PAR representing a motif. During planning, the motif PAR can use English text features and 3D invisible world data to decide how to expand its sub-actions – how to select and arrange the classifier predicates to express it.
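A motif's expansion into individual classifier predicate PARs could be sketched like this (hypothetical Python using the "furniture arrangement" convention described above; the names and the ordering heuristic are invented):

```python
import math

# Sketch of a motif as a higher-level PAR: it applies the genre convention of
# starting at one side of the doorway and circling across the room, then emits
# one lower-level classifier predicate PAR (here just a dict) per entity.
def furniture_arrangement_motif(scene, doorway):
    """Expand one motif into an ordered list of classifier-predicate sub-PARs."""
    def sweep_angle(name):
        x, _, z = scene[name]["location"]
        return math.atan2(z - doorway[2], x - doorway[0])
    ordered = sorted(scene, key=sweep_angle)  # a scene-level ordering decision
    return [{"predicate": "be-located", "entity": n,
             "location": scene[n]["location"]} for n in ordered]

scene = {"sofa":  {"location": (-1.0, 0.0, 2.0)},
         "table": {"location": (1.0, 0.0, 3.0)}}
plan = furniture_arrangement_motif(scene, doorway=(0.0, 0.0, 0.0))
```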
Motifs are quite domain-specific in their implementation; so, questions can be raised as to what degree of linguistic coverage this design could achieve. This MT approach is certainly not meant to cover all English input sentences – only those that should be translated as classifier predicates. While domain-specificity can sometimes make an MT approach impractical to use, this design is meant to be embedded within a complete (possibly existing) MT system for English-to-ASL that uses traditional MT technologies to handle the majority of English inputs. Because these other MT processing pathways would be available, this design can focus on linguistic depth, rather than breadth.

With the linguistic coverage of the initial system as a baseline, the addition of this design would improve the coverage incrementally by bringing additional genres (domains) of classifier predicate expressions into the system's ASL repertoire as new motifs are implemented. The non-classifier translation pathways of the MT system would handle those spatial sentences still outside of the motif coverage. The other pathways would likely produce an overly English-like form of signing for these spatial sentences: a less desirable but somewhat useful result.

Relating Motifs to ASL Linguistic Models

The previously discussed linguistic models did not include a level of representation analogous to a motif because these models were focusing on a different part of the classifier predicate generation problem. Only after a signer has decided what spatial information to communicate (content selection) and how to sequence its presentation (propositional ordering) do these models describe how to build an individual classifier predicate (surface generation). They account for how humans produce single classifier predicate expressions – not how they plan the elements of an entire scene.

Linguistic models that do explain how human signers conceptualize 3D scenes also do not use a motif-analogous representation. Here, the reason may be that the generation task for a human is significantly different from the translation task for a computer. For example, Liddell (2003a) discusses how signers could plan a 3D scene and use multiple interrelated classifier predicates to describe it, but his model relies on the human ASL signers' rich mental visualization of objects in a 3D space and their ability to map (or "blend") these locations to the physical signing space. In a translation setting, the mental 3D visualization of the English speaker is not available; the English text is the only source of information about the scene. Because English generally includes less spatial detail than ASL when describing 3D space, both MT systems and human ASL interpreters are faced with the problem of understanding the English description and reconstructing the scene when producing classifier predicates.5 Although not as robust as a human ASL interpreter, the AnimNL software can help this MT system create a 3D representation from the English text. But we are still left with the task of interpreting the English text for semantic and discourse cues to help guide our selection of classifier predicates to express this 3D scene. Therefore, motifs are triggered and informed by features from the analysis of the English text.

5 And neither is perfect at this task.

As a final linguistic concern, it is useful to consider whether the addition of motifs (that use 3D data) to design 3 has placed this system in further conflict with the polymorphemic model (Supalla, 1978, 1982, 1986). While this may initially appear to be the case, the addition of motifs is actually neutral with respect to this model. The model claims that an individual classifier predicate is composed from discrete morphemes, but it does not preclude the human signer from using mental 3D visualization of the scene during the deeper generation processes (those which overlap with the work of motifs). So, the point where the model diverges from this approach is the same as where it diverged from the original design 3 – when 3D data is used to fill the parameters of the classifier predicate PAR. This surface generation stage produces the non-categorical movements and locations of the classifier predicate output.

Discussion

Advantages of Virtual Reality

The 3D representation in this design allows it to consider spatial information when making generation decisions. Not only does this help make the generation of individual classifier predicates possible, but it also allows the system to potentially consider factors like spatial layout or visual salience when making deeper generation choices inside motifs – something a system without a 3D representation could never do.

This virtual reality representation for the space used by ASL classifier predicates may also be a basis for transcribing or recording these ASL phenomena electronically. A listing of the 3D objects currently in the invisible world with their properties/coordinates and a fully specified/planned arm movement PAR could be used to record a classifier predicate performance of a human signer. This approach would record more movement detail than the classifier predicate glosses used in the linguistic literature, which merely describe the motion in English words and the handshape used. It would also be more informative than a simple movement annotation, since it could store its non-spatial semantics (the semantic features that triggered the movement template), its spatial semantics (the locations of the 3D objects in the scene which it is describing), and the identities of those objects (what discourse entities they are representing). This additional information would likely be of interest to researchers studying these phenomena or building MT systems to handle them.

The 3D representation also allows this system to address ASL phenomena aside from classifier predicates in novel and richer ways. One example is the non-topological use of the ASL signing space to store locations for pronominal reference or agreement (Neidle et al., 2000). These locations could be modeled as special objects in the invisible world. The layout, management, and manipulation of these pronominal reference locations (or "tokens") is a non-trivial problem (Liddell, 2003a), which would benefit from the rich space provided by the virtual reality representation. If an ASL discourse model were managing a list of entities under discussion, then it could rely on the virtual reality representation to handle the graphical and spatial details of where these "tokens" are located and how to produce the "pointing" arm movements to refer to them.

The virtual reality representation could also facilitate the production of pronominal reference to entities that are "present" around the signing character. For instance, the character may be embedded in an application where it needs to refer to "visible" objects around it in the 3D virtual reality space or to computer screen elements on a surrounding user interface. To make pronominal reference to an object in the visible 3D virtual
reality space, a copy of this object could be made inside of the signing character's invisible world model. Then this invisible world copy could be treated like a "token" by the generation system, and pronominal references to this location could be made in the same way as for the "non-present" objects above. If the 3D object changed location during the signing performance, then its invisible world "token" counterpart can be repositioned correspondingly.

The AnimNL software makes use of sophisticated human characters that can be part of the scenes being controlled by the English text. These virtual humans possess many skills that would make them excellent ASL signers for this project: they can gaze in specific directions, make facial expressions useful for ASL grammatical features, point at objects in their surroundings, and move their hand to locations in space in a fluid and anatomically natural manner (Badler et al., 2000; Bindiganavale et al., 2000). When passed a minimal number of parameters, they can plan the animation and movement details needed to perform these linguistically useful actions. If one of these virtual humans served as the signing character, as one did for (Zhao et al., 2000), then the same graphics software would control both the invisible world representation and the ASL-signing character, thus simplifying the implementation of the MT system.

Current Work

Currently, this project is finishing the specification of both the classifier predicate generation design and a multi-pathway machine translation architecture in which it could be situated (Huenerfauth, 2004). Other research topics include: defining evaluation metrics for an MT system that produces ASL animation containing classifier predicates, developing PAR-compatible ASL syntactic representations that can record non-manual signals, and specifying ASL morphological or phonological representations that can be integrated with the PAR-based animation framework.

Acknowledgements

I would like to thank my advisors Mitch Marcus and Martha Palmer for their guidance, discussion, and revisions during the preparation of this work.

References

Bindiganavale, R., Schuler, W., Allbeck, J., Badler, N., Joshi, A., and Palmer, M. (2000). Dynamically Altering Agent Behaviors Using Natural Language Instructions. 4th International Conference on Autonomous Agents.

Badler, N., Bindiganavale, R., Allbeck, J., Schuler, W., Zhao, L., Lee, S., Shin, H., and Palmer, M. (2000). Parameterized Action Representation and Natural Language Instructions for Dynamic Behavior Modification of Embodied Agents. AAAI Spring Symposium.

DeMatteo, A. (1977). Visual Analogy and the Visual Analogues in American Sign Language. In Lynn Friedman (ed.), On the Other Hand: New Perspectives on American Sign Language (pp. 109-136). New York: Academic Press.

Holt, J. (1991). Demographic, Stanford Achievement Test - 8th Edition for Deaf and Hard of Hearing Students: Reading Comprehension Subgroup Results.

Huenerfauth, M. (2003). A Survey and Critique of American Sign Language Natural Language Generation and Machine Translation Systems. Technical Report MS-CIS-03-32, Computer and Information Science, University of Pennsylvania.

Huenerfauth, M. (2004). A Multi-Path Architecture for Machine Translation of English Text into American Sign Language Animation. In the Proceedings of the Student Workshop of the Human Language Technologies conference / North American chapter of the Association for Computational Linguistics annual meeting (HLT/NAACL 2004), Boston, MA.

Kipper, K., Snyder, B., and Palmer, M. (2004). Extending a Verb-lexicon Using a Semantically Annotated Corpus. In the Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC-04).

Liddell, S. (2003a). Grammar, Gesture, and Meaning in American Sign Language. UK: Cambridge University Press.

Liddell, S. (2003b). Sources of Meaning in ASL Classifier Predicates. In Karen Emmorey (ed.), Perspectives on Classifier Constructions in Sign Languages. Workshop on Classifier Constructions, La Jolla, San Diego, California.

Morford, J., and MacFarlane, J. (2003). Frequency Characteristics of American Sign Language. Sign Language Studies, 3:2.

Neidle, C., Kegl, D., MacLaughlin, D., Bahan, B., and Lee, R.G. (2000). The Syntax of American Sign Language: Functional Categories and Hierarchical Structure. Cambridge, MA: The MIT Press.

Sáfár, É., and Marshall, I. (2001). The Architecture of an English-Text-to-Sign-Languages Translation System. In G. Angelova (ed.), Recent Advances in Natural Language Processing (RANLP) (pp. 223-228). Tzigov Chark, Bulgaria.

Schuler, W. (2003). Using model-theoretic semantic interpretation to guide statistical parsing and word recognition in a spoken language interface. Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL'03), Sapporo, Japan.

Speers, d'A. (2001). Representation of American Sign Language for Machine Translation. PhD Dissertation, Department of Linguistics, Georgetown University.

Supalla, T. (1978). Morphology of Verbs of Motion and Location. In F. Caccamise and D. Hicks (eds.), Proceedings of the Second National Symposium on Sign Language Research and Teaching (pp. 27-45). Silver Spring, MD: National Association for the Deaf.

Supalla, T. (1982). Structure and Acquisition of Verbs of Motion and Location in American Sign Language. Ph.D. Dissertation, University of California, San Diego.

Supalla, T. (1986). The Classifier System in American Sign Language. In C. Craig (ed.), Noun Phrases and Categorization, Typological Studies in Language, 7 (pp. 181-214). Philadelphia: John Benjamins.

Zhao, L., Kipper, K., Schuler, W., Vogler, C., Badler, N., and Palmer, M. (2000). A Machine Translation System from English to American Sign Language. Association for Machine Translation in the Americas.
A SIGN MATCHING TECHNIQUE TO SUPPORT SEARCHES IN SIGN
LANGUAGE TEXTS
Antônio Carlos da Rocha Costa, Graçaliz Pereira Dimuro,
Juliano Baldez de Freitas
ESIN/UCPel – Escola de Informática
Universidade Católica de Pelotas
{rocha,liz,jubafreitas}@atlas.ucpel.tche.br
Abstract
This paper presents a technique for matching two signs written in the SignWriting system. We have defined such a technique to support procedures for searching in sign language texts that were written in that writing system. Given the graphical nature of SignWriting, a graphical pattern matching method is needed, which can deal in controlled ways with the small graphical variations writers can introduce in the graphical forms of the signs when they write them. The technique we present builds on a so-called degree of graphical similarity between signs, allowing for a sort of "fuzzy" graphical pattern matching procedure for written signs.

1. Introduction

For the most part, software for processing sign language texts and databases has started to be developed only recently, simultaneously with the spreading of interest in SWML (Costa, 2003) among software developers concerned with the SignWriting (Sutton, a; Sutton, c) system. Obviously, an important and critical operation needed for such sign language processors is that of searching signs in sign language texts.

This paper presents a technique for matching two signs written in the SignWriting system. We have defined such a technique to support procedures for searching sign language texts that were written in that writing system. Given the graphical nature of SignWriting, a graphical pattern matching method is needed, which can deal in controlled ways with the small graphical variations writers can introduce in the graphical forms of signs when they write them. The technique we present builds on a so-called degree of graphical similarity between signs, allowing for a sort of "fuzzy" graphical pattern matching procedure for written signs.

The paper is organized as follows. In section 2, we review aspects of sign languages related to the problem of having them written in some notation, and summarize the main features of the SignWriting system. Section 3 summarizes the work done on SWML and its importance for the development of software for processing SignWriting texts and databases. Section 4 presents the main contribution of the paper, namely, the sign matching technique designed to support procedures for searching in sign language texts. Section 5 brings the conclusion. The sample signs presented in the paper are from the Brazilian sign language LIBRAS (Linguagem Brasileira de Sinais).

2. Sign languages and the SignWriting system

Throughout history, no writing system has been widely established for sign languages, so that such languages have

Since Stokoe, in the 1960's, first recognized that sign languages are full natural languages, in the same sense that oral languages are, some notation systems for sign languages have been proposed. Stokoe himself introduced one such notation system (W. C. Stokoe and Croneberg, 1976). HamNoSys (Hanke, ) was another proposal. Both were conceived as technical tools for registering linguistic features of sign languages (handshapes, movements, articulation points, etc.).

SignWriting is also a proposed system for writing sign languages (Sutton, a). Contrary to the other systems, however, which were proposed mainly as tools for technical linguistic work, SignWriting was proposed as a tool for daily use, by common (Deaf) people (Sutton, b).

3. SignWriting and SWML

Both the Stokoe system and HamNoSys are based on a linear representation of signs, using special characters for that purpose. SignWriting is based on graphical, bi-dimensional representations, using graphical symbols.

This way, the former systems can easily be encoded in computers in a linear way, by simply assigning numeric codes to each special character, and the technique for searching signs in texts written with such systems should be straightforward to develop.

SignWriting, on the other hand, requires that, besides the numeric encoding of each symbol, the computer representation of a sign keep the information concerning the relative position of each symbol in the bi-dimensional area occupied by the representation of the sign (this complicates the searching procedure, as is shown below).
sion. The sample signs presented in the paper are from the The SignWriter program (Sutton et al., 1995), the
Brazilian sign language LIBRAS (Linguagem Brasileira de first computer editor for sign languages, defined such an
Sinais). encoding for SignWriting. That encoding was a binary
encoding, created specifically for the needs of that program.
2. Sign languages and the SignWriting SWML (Costa, 2003) is a proposal for a general encod-
ing format for SignWriting documents, using XML (?).
system It builds on the encoding used by the SignWriter pro-
Along history, no writing system has been widely es- gram, presenting it in a fashion the makes such encoding
tablished for sign languages, so that such languages have available for use in all kinds of computer applications of
always been used only for face-to-face communication. SignWriting (document storage and retrieval, on-line

32
3. SignWriting and SWML
Both the Stokoe system and HamNoSys are based on a linear representation of signs, using special characters for such purpose. SignWriting is based on graphical, bi-dimensional representations, using graphical symbols.

This way, the former systems can easily be encoded in computers in a linear way, by simply assigning numeric codes to each special character, and the technique for searching signs in texts written with such systems should be straightforward to develop. SignWriting, on the other hand, requires that, besides the numeric encoding of each symbol, the computer representation of a sign keeps the information concerning the relative position of each symbol in the bi-dimensional area occupied by the representation of the sign (this complicates the searching procedure, as is shown below).

The SignWriter program (Sutton et al., 1995), the first computer editor for sign languages, defined such an encoding for SignWriting. That encoding was a binary encoding, created specifically for the needs of that program. SWML (Costa, 2003) is a proposal for a general encoding format for SignWriting documents, using XML (?). It builds on the encoding used by the SignWriter program, presenting it in a fashion that makes such encoding available for use in all kinds of computer applications of SignWriting (document storage and retrieval, on-line dictionaries, computer interpretation and generation of sign languages, etc.). The SW-Edit program (Torchelsen et al., 2002) fully relies on SWML to store SignWriting-based sign language texts. SignWriting and SWML were proposed (Costa and Dimuro, 2002; Costa and Dimuro, 2003) as foundations for Sign Language Processing, the transposition of the methods and techniques of Natural Language Processing and Computational Linguistics, which have long been developed for oral language texts, to sign language texts.

The rest of this paper tackles one of the simplest operations one can do on a sign language document, namely, searching for a specific sign.

4. Matching Written Signs
There is a particular problem that has to be solved to allow sound searching procedures for sign language files written in SignWriting, namely, to define a way of dealing with the small graphical variations that writers can introduce in the forms of the signs when they write them.

The SignWriting system distinguishes explicitly some graphical properties of the symbols of a sign, like rotation and flop, for example, but does not distinguish tiny variations due to vertical and/or horizontal displacements of symbols within the sign, because such values are allowed to vary along the whole range of available positions within a sign box (as opposed to, e.g., rotation, which can only assume a small set of possible discrete values). The consequence of having such a "continuous" set of possible positions of symbols within a sign box is that one lacks a clear geometric definition of the similarity between two signs, if they differ only with respect to the positions of their corresponding symbols.

The solution we have found for that problem is to allow the user to control the criteria used for judging the degree of similarity of two signs, by giving him a means to define a "fuzzy" correspondence between the component symbols of the two signs. The resulting matching procedure guarantees that two corresponding symbols have the same symbol type, rotation and flop, but allows them to have the (user-specified) degree of variation in their relative positions within the respective sign instances. This kind of similarity between two signs is formalized in this section as a parameterized, reflexive and symmetric relation that we call the sign similarity relation.

4.1. Basic Geometric Features of Symbols and Signs
Initially, we formalize the basic geometric information concerning SignWriting symbols and signs.

Definition 1 A symbol s is defined as a tuple s = (c, n, f, v), where the values of c, n, f and v vary according to the symbol set being used, and:
(i) c is the category number (not available in symbol sets previous to the SSS-2002 symbol set (Sutton, c); use c = 0 in such cases),
(ii) n is the shape number (within the symbol's category),
(iii) v is the symbol variation (a complementary information distinguishing symbols by features like, e.g., whether the index finger is curved or not, in the symbol for the index handshape),
(iv) f is the filling information (encoding, e.g., palm orientation, in a symbol for a hand).

A set of symbols having the same symbol category and shape (c, n), and differing only in their filling or variation information, is called a symbol group, denoted by Gc/n. For each symbol group Gc/n there is a so-called basic symbol, denoted by sc/n, for which f = 0 and v = 0, so that sc/n = (c, n, 0, 0).

Definition 2 An oriented symbol S is defined as a tuple S = (s, r, fp), where:
(i) s is a symbol of any symbol group Gc/n,
(ii) r indicates the (counterclockwise) rotation operation applied to s, relative to the basic symbol sc/n of the symbol group Gc/n (the rotation is given in intervals of 45 degrees, for all symbol sets available up to now), and
(iii) fp, called flop, is a Boolean value indicating whether the symbol s is vertically mirrored or not, relative to the basic symbol sc/n of the symbol group Gc/n.

Figure 1: The group G0/0 of symbols called index, and some of its rotated and flopped elements.

Example 1 The symbol group called index, denoted by G0/0, whose symbols, with category c = 0 and shape n = 0, represent hands with index finger straight up and closed fist, is shown in Figure 1. Each symbol s in the group G0/0 is a tuple s = (0, 0, 0, f), with variation v = 0 and fill f = 0, 1, ..., 5 (from left to right in the figure). The oriented symbols in the first row have the basic orientation (no rotations, no flop) and are given by tuples of the form S = (s, 0, 0). Each different fill information is represented by a different color fill in the symbol, indicating a different palm orientation, starting with the palm oriented towards the signer's face. In the second row, a rotation of 45 degrees was applied to each symbol, and the oriented symbols in that row are thus given by S = (s, 1, 0). In the third and fourth rows, representing the left hand, there are flopped symbols, given by S = (s, 0, 1) (with no rotations) and S = (s, 7, 1) (with rotations of 315 degrees).

Definition 3 (i) A symbol box is the least box that contains a symbol, defined as the 4-tuple sb = (x, y, wsb, hsb), where x and y are, respectively, the horizontal and vertical coordinates of the upper left corner of the symbol box (relative to the upper left corner of the sign box containing the symbol box — see item (iv)), wsb is its width and hsb is its height;
(ii) A symbol instance, that is, an occurrence of an oriented symbol within a sign, is defined as a pair Si = (S; sb), where S = (s, r, fp) is an oriented symbol and sb is its symbol box;
(iii) A sign, denoted by Sg, is a finite set of symbol instances;
(iv) A sign box is a box that contains a sign, defined as a pair Sgb = (wSgb, hSgb), where wSgb is the box width and hSgb is the box height;
(v) A sign instance is defined as a tuple Sgi = (Sg; Sgb; p), representing a sign Sg together with a sign box Sgb that contains it, and an index p indicating the position of the sign instance within the sign sequence (sign phrase) to which it belongs.

All the definitions presented above are reflected in the SWML format. Note, in particular, that, as defined above, sign boxes (and consequently, sign instances) have no coordinate information. This is so because sign language texts should be conceived essentially as strings of signs, with no particular formatting information included in them. SWML, however, defines the notions of document, page, line and cell, so that sign instances can be put into cells, sequences of cells organized into lines, sequences of lines into pages, and sequences of pages into documents, in order to support document rendering procedures (e.g., horizontal or vertical renderings). Note also that symbols don't have predefined sizes (width and height). Sizes are defined only for symbol instances, through their symbol boxes. This allows for scalable symbol sets (e.g., in the SVG format (?)).

Figure 2: A way to write the LIBRAS sign for IDEA.

Example 2 The SWML representation of the LIBRAS sign for IDEA (written as in Figure 2) is:

    <signbox>
      <symb x="46" y="37" x-flop="0" y-flop="0" color="0,0,0">
        <category>04</category>
        <group>02</group>
        <symbnum>001</symbnum>
        <variation>01</variation>
        <fill>01</fill>
        <rotation>04</rotation>
      </symb>
      <symb x="81" y="48" x-flop="0" y-flop="0" color="0,0,0">
        <category>01</category>
        <group>01</group>
        <symbnum>001</symbnum>
        <variation>01</variation>
        <fill>02</fill>
        <rotation>02</rotation>
      </symb>
      <symb x="62" y="18" x-flop="0" y-flop="0" color="0,0,0">
        <category>02</category>
        <group>01</group>
        <symbnum>001</symbnum>
        <variation>01</variation>
        <fill>01</fill>
        <rotation>01</rotation>
      </symb>
      <symb x="99" y="31" x-flop="0" y-flop="1" color="0,0,0">
        <category>02</category>
        <group>05</group>
        <symbnum>001</symbnum>
        <variation>01</variation>
        <fill>01</fill>
        <rotation>02</rotation>
      </symb>
    </signbox>
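The following minimal sketch (ours, not part of the SWML proposal) shows how such a sign box can be loaded with Python's standard XML parser; only the elements and attributes visible in the example above are assumed, and symbol box sizes, which come from the symbol set, are not modelled:

    import xml.etree.ElementTree as ET

    def parse_signbox(swml: str):
        """Parse one SWML <signbox> into a list of symbol instances.

        Each instance is a dict holding the symbol identification
        (category, group, symbnum, variation, fill, rotation) and the
        geometric information (x, y, flops) from the <symb> attributes.
        """
        signbox = ET.fromstring(swml)
        instances = []
        for symb in signbox.findall("symb"):
            inst = {
                "x": int(symb.get("x")),
                "y": int(symb.get("y")),
                "x_flop": int(symb.get("x-flop")),
                "y_flop": int(symb.get("y-flop")),
            }
            for field in ("category", "group", "symbnum",
                          "variation", "fill", "rotation"):
                inst[field] = int(symb.findtext(field))
            instances.append(inst)
        return instances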
4.2. The Sign Similarity Relation
The sign similarity relation is a parameterized, reflexive, symmetric and non-transitive relation, introduced here to formalize the approximate similarity between two sign instances, and to provide for the construction of matching procedures for signs and sign language expressions.

The sign similarity relation has to embody an admissible difference in the positions of corresponding symbol instances within the two sign instances that it relates, taking into account a measure of significance for this difference, as determined by the user. The admissible differences in the positions of corresponding symbol instances are expressed in terms of percentages of some reference sizes, by a so-called minimum degree of correspondence, denoted by ε. The reference sizes may be given either explicitly (e.g., 10 pixels) or implicitly (e.g., as the height and width of some symbol instance, chosen for that purpose among the symbols of the symbol set).

Moreover, the admissible difference in the corresponding positions of the corresponding symbols may be calculated in two ways:
• with respect to their absolute positions within the sign boxes to which they belong;
• with respect to their positions relative to some reference symbol, known to be instantiated in each of the signs being compared.

The absolute way of calculating the admissible differences is simpler, but the relative way allows the establishment of the similarity between a sign and another deriving from it just by a joint displacement of the symbols within the sign box: e.g., in Figure 3, the first sign instance would usually be judged similar only to the second instance, according to an absolute position based similarity relation, while it could also be judged similar to the third instance, according to the relative position based similarity relation.

Figure 3: Similarity based on absolute and relative positions of the symbols (LIBRAS sign for YEAR).
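As an illustration (ours, not the authors') of what the relative variant adds: if symbol positions are first re-expressed relative to a chosen reference symbol, an absolute-position comparison of the shifted signs behaves like the relative-position similarity, since a joint displacement of all symbols cancels out. A sketch, assuming the dictionary representation of the parsing example above:

    def relative_positions(instances, ref_index=0):
        """Shift all symbol positions so that a chosen reference symbol
        sits at the origin. Two signs differing only by a joint
        displacement of their symbols then get identical coordinates."""
        ref = instances[ref_index]
        return [dict(inst, x=inst["x"] - ref["x"], y=inst["y"] - ref["y"])
                for inst in instances]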
We now define the sign similarity relation based on the absolute positions of the symbols.

Definition 4 Let Si1 = (S1; sb1) and Si2 = (S2; sb2) be two symbol instances belonging to two different signs. Let their symbol boxes be given by sb1 = (x1, y1, wsb1, hsb1) and sb2 = (x2, y2, wsb2, hsb2), respectively. Then Si1 and Si2 are said to correspond to each other with at least degree ε, and reference sizes h0 and w0 (for height and width), denoted by Si1 ≈ε,h0,w0 Si2, if and only if the following conditions hold:
(i) Equality between the basic symbols: S1 = S2 (which implies wsb1 = wsb2 and hsb1 = hsb2);
(ii) Admissible horizontal difference: |x1 − x2| / w0 ≤ k;
(iii) Admissible vertical difference: |y1 − y2| / h0 ≤ k;
where k = (100 − ε) / 100 ≥ 0.
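Definition 4 transcribes almost literally into code. The sketch below is ours; symbol instances are assumed to be the dictionaries produced by the parsing example above, so the oriented-symbol equality of condition (i) is reduced to the identification fields available there:

    def symbols_correspond(si1, si2, epsilon, w0, h0):
        """Definition 4: correspondence of two symbol instances with at
        least degree epsilon (in percent), for reference sizes w0, h0."""
        k = (100.0 - epsilon) / 100.0
        # (i) Equality between the basic symbols (same oriented symbol).
        ident = ("category", "group", "symbnum",
                 "variation", "fill", "rotation", "x_flop", "y_flop")
        if any(si1[f] != si2[f] for f in ident):
            return False
        # (ii) Admissible horizontal difference: |x1 - x2| / w0 <= k.
        if abs(si1["x"] - si2["x"]) / w0 > k:
            return False
        # (iii) Admissible vertical difference: |y1 - y2| / h0 <= k.
        return abs(si1["y"] - si2["y"]) / h0 <= k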
Definition 5 Let Sgi1 = (Sg1; Sgb1; j1) and Sgi2 = (Sg2; Sgb2; j2) be two sign instances. Sgi1 and Sgi2 are said to be similar with at least degree ε, relative to the absolute positions of their symbols, and reference sizes h0 and w0, if and only if each symbol in a sign has one and only one corresponding symbol in the other sign, that is, if there exists a bijection f : Sg1 → Sg2 such that, for each Si ∈ Sg1, Si ≈ε,h0,w0 f(Si).
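Definition 5 asks for a bijection between the two symbol sets. Since sign boxes contain only a handful of symbols, even the brute-force search over pairings sketched below (ours, building on symbols_correspond above) is adequate:

    from itertools import permutations

    def signs_similar(sign1, sign2, epsilon, w0, h0):
        """Definition 5: the signs are similar with at least degree
        epsilon if some bijection pairs every symbol of sign1 with a
        corresponding symbol of sign2 (absolute-position variant)."""
        if len(sign1) != len(sign2):
            return False
        return any(
            all(symbols_correspond(a, b, epsilon, w0, h0)
                for a, b in zip(sign1, perm))
            for perm in permutations(sign2))

With epsilon = 100 the predicate reduces to exact geometric equality, the total degree of similarity discussed in Example 3 below.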
Figure 4: Three (possible) instances of the LIBRAS sign IDEA.

Example 3 Consider the three instances of the LIBRAS sign IDEA which are in Figure 4. Observe that each such sign instance contains an instance of the symbol index which differs in its coordinates from the corresponding index symbol instance of the other sign instances (all other symbol instances match their correspondents exactly). Consider a situation where a user is searching for that sign IDEA in a text. Suppose he writes the first sign instance as the sign to be searched and that only the two other instances are present in the text. The latter two instances have some degree of similarity with the first sign instance. In spite of this fact, they are graphically different from the first instance, in a strict sense. They may all be considered to represent the same sign, or not, depending on the minimum degree of similarity required by the user for the results of the matching procedure. If the user specifies an intermediate degree of similarity, the second instance would match the first, while the third instance would not (the hand is too low in comparison with its position in the first sign instance). If the user specifies a low degree of similarity, all instances would match. If the user required 100% similarity, no instance would match. The total degree of similarity (ε = 100%) requires that no difference be admitted between the two sign instances being compared.

The basic similarity relation defined above does not take into account some important (and frequent) exceptions. Such exceptions are mainly related to symbols like the arrow symbol (encountered, e.g., in the LIBRAS sign IDEA), whose position within the sign is, in general, not critical (see Figure 5). Such symbols have most of their meaning completely encoded in their shapes and transformations, and the place where they are put in the sign boxes is essentially irrelevant. For instance, the arrow symbol in the sign for IDEA means that the right hand moves in the horizontal plane, in the indicated direction, and this information is the same wherever the arrow is placed in the sign box. In such cases, the relative position of the symbol within the sign box is not important. In the examples of Figure 5, even if a rigorous or a total degree of similarity is required, the matching process should find that those three sign instances are similar. On the other hand, for symbols like the asterisk, almost no variation of position should be allowed, since it indicates a position where two components of the sign (e.g., head, hands, etc.) touch each other when the sign is performed, and even small degrees of variation may imply linguistically relevant differences between the signs.

Figure 5: Three (guaranteed) instances of the LIBRAS sign IDEA.

Other reasonable definitions of the sign similarity relation could be given, such as, for instance, the one already mentioned, of taking the positions of the symbols relative to a reference symbol known to occur in both the sign instances being compared. Even coarser relations could be defined, and possibly considered useful, e.g., one defining the admissible differences on the basis of the absolute coordinates of the very symbols being compared.
4.3. Search Procedures for Sign Texts
SWML, as currently defined, already has all the information needed to allow for a sign matching procedure based on the sign similarity relation defined here. The special treatment of symbols whose meanings are not sensitive to the symbols' placements in the signs is to be embedded in the matching process, requiring from SWML only that it identifies symbol instances completely, which it perfectly does. On the basis of such a sign matching procedure, a procedure to search for signs in sign language texts can be easily defined, in a straightforward way.
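Under the assumptions of the previous sketches, such a search procedure is little more than a filter over the sign instances of a text; the default degree and reference sizes below are arbitrary illustrations, not values prescribed by the paper:

    def search_sign(text_signs, query_sign, epsilon=80, w0=10, h0=10):
        """Return the positions of all signs in the text that match the
        query with at least the requested degree of similarity."""
        return [pos for pos, sign in enumerate(text_signs)
                if signs_similar(sign, query_sign, epsilon, w0, h0)]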

5. Conclusion
In this paper, we have shown that searching for signs in
sign language texts written in SignWriting is a straightforward
matter. The only slightly tricky part of the search-
ing procedure is in the operation of matching two signs,
which should allow for small differences in the positions
of their corresponding symbol instances. Ideally, the size
of the differences that are to be admitted in such corre-
spondence tests should be specifiable by the user when he
calls the search procedure, so that he can have full control
over the desired degree of similarity of the signs being com-
pared.

Acknowledgements
This work was partially supported by CNPq and
FAPERGS.

6. References
A. C. R. Costa and G. P. Dimuro. 2002. SignWriting-based sign language processing. In I. Wachsmuth and T. Sowa, editors, Gesture and Sign Language in Human-Computer Interaction, pages 202–05, Berlin. Springer-Verlag.
A. C. R. Costa and G. P. Dimuro. 2003. SignWriting and SWML: Paving the way to sign language processing. In O. Streiter, editor, Traitement Automatique des Langues de Signes, Workshop on Minority Languages, Batz-sur-Mer, June 11-14, 2003.
A. C. R. Costa. 2003. The SWML site. Located at: http://swml.ucpel.tche.br.
T. Hanke. HamNoSys – the Hamburg Notation System. Located at:.
W. C. Stokoe, D. C. Casterline, and C. G. Croneberg. 1976. A Dictionary of American Sign Language on Linguistic Principles. Linstok Press, Silver Spring.
V. Sutton. Lessons in SignWriting – Textbook and Workbook. Available on-line at (Sutton, c).
V. Sutton. The SignWriting History. Available on-line at (Sutton, c).
V. Sutton. The SignWriting site. Located at: http://www.signwriting.org.
V. Sutton and R. Gleaves. 1995. SignWriter – The world's first sign language processor. Center for Sutton Movement Writing, La Jolla.
R. P. Torchelsen, A. C. R. Costa, and G. P. Dimuro. 2002. Editor para textos escritos em SignWriting. In 5th Symposium on Human Factors in Computer Systems, IHC 2003, pages 363–66, Fortaleza. SBC.

A Practical Writing System for Sign Languages
Angel Herrero
University of Alicante
Ap.Correos 99.-E-03080 Alicante (Spain)
[email protected]

Abstract
This paper discusses the problems involved in writing sign languages and explains the solutions offered by the Alphabetic Writing
System (Sistema de Escritura Alfabética, S.E.A.) developed at the University of Alicante in Spain. We will ponder the syllabic nature
of glottographic or phonetically-based writing systems, and will compare practical phonological knowledge of writing with notions of
syllables and sequence. Taking advantage of the ideas of sequentiality contributed by the phonology of sign languages, we will
propose a sequential writing model that can represent signers’ practical phonological knowledge.

1. Sign Languages and Writing
Except for semasiographic systems, such as the winter counts of the Dakota people, and visual instructions for the use of certain machines, which "state ideas directly", all writing systems are glottographic (Sampson, 1997: 42). In other words, "they […] use visible marks to represent forms of a spoken language". Writing systems that had initially been considered pictographic, such as Egyptian hieroglyphics, Chinese writing, Mayan glyphs, or the Easter Island tablets, were later shown to be glottographic, or "true writing", as underlined by the greatest scholar of writing systems, Thomas Barthel.

Ever since it was discovered by Sumerian culture, alphabetic writing has been based on syllables, involving a phonological analysis of the chain that bases representation on the different components of each syllable: consonants and vowels. Other glottographic writing systems, known as logographical writing systems, are based on significant parts of words, or morphemes. This is the case of Chinese, for example, although in this case the significant parts of the words, the morphemes, generally coincide with syllables, meaning that logographic writing may also be considered syllabic. There are also cases of 'motivated' logographic writing systems, such as the phonological-featural alphabet of Korean Hangul. However, in this phonological-featural alphabet also, based on infra-phonemic elements, "the essential graphic distinction is between vowels and consonants" (Sampson, 1997: 179). In practice, different writing systems can be combined, as we do when we use morphological symbols such as numbers or percentage symbols, present on all keyboards, in alphabetical texts.

The distinction between Consonant and Vowel has proven to be an excellent criterion for phonological representation: it is immensely practical, as it represents the syllable at the same time. In other words, and this is the essential idea of our proposed writing system, Consonants and Vowels are represented as stages of articulation. Non-segmental phonology, specifically feature-geometrical phonology (Clements, 1985) or Prosodic Phonology, on which the most complete model of ASL phonology, devised by Brentari (1998), is based, has resolved the CV difference into other, minor differences, so that V or C is a relative question, arising from the assignation of features; the notions of V or C can be replaced by the notion of autosegment, or even by a phonological rule, thereby giving a more explicative model for certain phenomena such as tone, vocalic harmony or the vocalic morphology of certain oral languages. However, it has to be said that tonal languages and others that have been put forward to justify a non-segmental conceit in prosodic phonology (Venda, Turkish, Hebrew) are currently written in alphabetical, segmental writing.

From a scientific point of view, the practical phonology that gave rise to writing is full of imperfections, creating an unreal image of languages (Olson, 1991: 285). However, this image has historically been identified with knowledge and culture, and writing, with all its imperfections, has become an irreplaceable practical skill for consigning knowledge. The reason for this is, doubtless, the way it represents the speech process. Therefore, if sign languages, from the point of view of linguistic typology, are comparable to oral languages in many morphological and syntactical aspects, it would appear logical to extend this comparison to the syllable as the basic phonotactic unit of writing, although the concept of syllable is also currently questioned in non-linear phonology (Wilbur, 1990). If letters (characters) represent the kind (and stage) of articulation of the sounds in a syllable, so that the speaker can not only make the sounds but also distinguish the order in which they are produced, in sign languages (LSs) letters may also represent the kind (and stage) of manual articulation, and the order of the letters can represent the order of production of signs by the signer.

In this paper we will present a proposed writing system based on this possibility. Annotation systems currently used to transcribe signs, such as the HamNoSys system devised by Siegmund Prillwitz and his group at Hamburg University, or SignWriting devised by Valerie Sutton at San Diego University, may not be processed as writing. SignWriting showed the very possibility of writing and is a historic contribution to the culture of the signing community, but the alphabetical writing system we present is based on a principle of phonological economy, while SignWriting, because of its openly visual nature, is based on simultaneity and the supposed analogical or iconic nature of the signs. The problems with alphabetical writing are precisely the advantages of SignWriting: the supposed simultaneity of the signs and their analogical nature, particularly obvious in non-manual expression. We will now see that the notion of simultaneity goes hand-in-hand with the notion of syllable and that they have compatible sequential processes.
2. Syllable, Sequence and Simultaneity
Although the current phonology of sign languages still suffers from many problems, as can be seen from the different phonological models that have been devised one after the other in recent years (Liddell, 1984, 1989, 1990; Sandler, 1989; Perlmutter, 1988; Brentari, 1998, 2002), there is still sufficient consensus, in our opinion, to justify a proposed writing system that could be used as a skill, rather than a phonological model.

As we have pointed out, the basic unit of glottographic writing is the syllable, as this is the minimum unit in which sounds can be distinguished and combined. Accordingly, in spite of certain pending questions (such as the phonological interpretation of repetition and lengthening), the phonology of sign languages already gives a good idea of the phonological components of the syllable and its limits. It is also generally agreed that two successive movements, even when they are local, correspond to two syllables, and rules have been made for elision, epenthesis and gemination (Liddell, 1989). However, the main problem with these methods is that they continue to consider that, except for the movements, which, by definition, are sequential, the syllable is simultaneous.

In 1933 the vocal apparatus was filmed in operation for the first time, and the great linguist Roman Jakobson was very impressed by the result. In the first of his Six leçons sur le son et le sens, given in New York in 1942, he remembers the film and states (1988: 396) that when he saw it he understood that "the act of speaking is a continuous, uninterrupted movement… there are no position vs. transition sounds; they are all transition(...) Strictly from the point of view of articulation, the sequence of sounds does not exist. Instead of following each other, sounds link up with each other; and one sound, which our acoustic impression tells us comes after another, can be articulated simultaneously with it or even partially before it(...) It is not possible to classify, or even, I would say, to describe the different articulations accurately, without continuously asking what is the acoustic function of such and such motor action".

Syllables are acoustic units determined by the level of merging and influence of vowels and consonants (Malmberg, 1955), which are, therefore, relative segments. Syllables are recognised by the transitions of a vowel or nucleus due to the effect of the consonant(s) of the syllable. Thus, as its etymology indicates, the syllable is a paradigm of simultaneity. In written representation, we would point out that literate speakers recognise segments of this transition; a segmental sound is an articulation with stable parameters, insofar as there are changes between the sounds that allow us to identify them. Accordingly, the real effect of the operation is simultaneity, while segmentality is an operation of the mind, which I have described above as practical phonological knowledge, distinguishing between C, V and types of both.

So what segments should be represented in writing an SL, in our case Spanish Sign Language (LSE)? The linguistics of sign languages was born with the discovery of its phonemes (Stokoe, 1960), initially called phonological 'aspects' or 'cheremes' and later, 'parameters', a term which has spread to most current phonological models (e.g. Brentari, 2002). Until the 80's, these constituents, which we believe should simply be considered phonemes, were seen as simultaneous with monosyllabic signs, i.e., syllables. A fourth parameter, Orientation, was added to the three proposed by Stokoe (1960): Location, Hand Shape and Movement, sometimes called the major parameters, and the difference between path movements and local movements was specified (Liddell, 1989). Additionally, the passive hand should be specified as the location L of the sign when it acts as such, with its own Q and O, or as an element of symmetry with the active hand. Lastly, our writing system represents possible contact with the body, C, as a specification of location. These are the constituents that we represent.

We are not going to deal here with the phonological or featural nature of these components, but briefly to justify their sequential representation and the use of the Hand Shape as the nucleus of the syllable, as the basis for an economical writing system.

2.1. Sequentiality
Several sequential models have been proposed since the 80's: Liddell (1982, 1989), Sandler (1986, 1989, 1990), Perlmutter (1988), Brentari's prosodic model (1998), etc. In this last one, Hand Shape, Location, Orientation and Movement are treated as types of (geometric) features, rather than segments. It considers that "It is sufficient to make reference to distinctive features, in syllable initial and syllable final positions, and there is no support for any further internal segmental divisions... no intermediate segments are recognized by the signers". Moreover, Brentari (2002: 45) considered that simultaneity is a characteristic of sign languages: "Cs and Vs are realized at the same time in sign languages, rather than as temporally discrete units"; and (2002: 47): "If sign language Cs are properties of the IF tree and sign language Vs are properties of the PF tree, the major difference between sign and spoken languages in this regard is that in sign languages IFs and PFs occur at the same time".

Liddell's model conceived of Hold and Movement as segments, so that its syllabic model consisted of a hold-movement-hold sequence; the Hand Shape and Orientation features, along with contact and Location L, formed part of specific tiers, represented as simultaneous. Sandler's model is also partially sequential, based on Location and Movement segments; this model also recognises the segmental nature of Q (Sandler, 1990: 20: "hand shape is a distinct and temporally autonomous phonological element in ASL"). In our proposal, sequentiality will be extended to all the other parameters, although we insist that our aim is not to present a phonological model, but rather a model of written representation. This model, which we call the Alphabetical Writing System for Spanish Sign Language (Sistema de Escritura Alfabética de la Lengua de Signos Española, S.E.A.), is available in book form (Herrero and Alfaro, 1999; Herrero, 2003) and on the internet (cervantesvirtual.com/portal/signos); all we can do here is describe its essential elements in relation to the problems that practical phonology based on writing may raise when approaching theoretic phonology. The system has been successfully taught to several signers in a few weeks.

For our writing system, we start off by taking the basic sequence proposed by Sandler (in its turn a specification of the one proposed by Liddell): the Location-Movement sequence. There are several pairs of signs that show the sequential incidence of Movement:
AMOR (love) ............. LASTIMA (pity)
DIFÍCIL (difficult) ............. ANUNCIAR (to announce)
JUNTOS (together) ............. MESA (table)
LISTO (clever) ............. SABER (to know)
TELEFONO (telephone) ............. LLAMAR POR TELEFONO (to phone)
ARBOL (tree) ............. BOSQUE (forest)
SILLA (chair) ............. SILLAS (chairs)
MIRAR (to look) ............. VER (to see)
ARRIBA (up) ............. MANDAR (to command)
LLAVE (key) ............. ESPADA (sword)
CASA (house) ............. CASA GRANDE (big house)
PROBAR (to try) ............. ARADO (plough)

Using this elemental sequence, which refers only to two phonemes or parameters, Location and Movement, the remaining parameters are written in the following order:

S L(.)QODF

where:
• S represents the left hand (as in ESCRIBIR, to write) or active two-handed signs (as in VIAJE, journey).
• The point (.) that may follow Location indicates that there is no contact with the part of the body taken as reference for signing (the temple, in TEORIA, theory).
• Hand Shape Q and Hand Shape Orientation O follow after Location and before Movement.
• Movement M is differentiated, as is normal in all phonological models, into Path Movement (D) and Local Movement (F), which are not obligatory, may be simultaneous and, when simultaneous, give rise to two syllables. The simultaneity of D and F will be represented by adding the direction feature to the F symbol, i.e., making a kind of D out of DF.
• Non-manual elements that accompany the signs will only be represented if they have morphological value (e.g., adverbial intensification, although most signers know lexical forms of representing this intensification; or simultaneous affirmation and negation).

Before going deeper into the writing system and giving examples, we would first like to make a few comments on the decisions that we have taken and that we have just summarised.
a) The initial writing of the passive hand when it acts as Location (but not in two-handed signs or as the moving hand) is justified by articulatory and perceptive reasons: while making the sign, the dominant hand addresses the previously moving passive hand (ESCRIBIR, to write; POR QUÉ, why; OBJETIVO, aim). As far as I know, this sequentiality has so far gone unnoticed.
b) We also consider it proper to represent active two-handed signing (symmetric, asymmetric and displaced symmetric signs) at the beginning, for reasons of processing, as two-handedness affects the articulation of all the other components from the beginning.
c) We have already said that there is a general consensus as regards the Location-Movement sequence. The Hand Shape and Orientation components are represented between the two. On the one hand, it would appear obvious that what Movement does is to modify Location, in the case of Path Movement D, or Hand Shape Q and/or Orientation O in Local Movements; these components should be specified before M, as they are a part of the Hold (in Liddell's model).
d) The LQO order is an interpretation of the articulation as bringing the hand from a part of the body or from the signing space with an articulation Q. The hand then remains in that Location with a certain Orientation and, in dynamic signs (most of them), carries out a movement.
e) The precedence of L over Q is clear when L is the passive hand. Another indication is given by the fact that when the sign is made in the mouth (SILENCIO, silence; ODIO, hate; ROJO, red) the position of the lips goes before Q, and when the sign is made with a non-manual component (DELGADO, thin), this component goes before Q. In general, this place is guessed "before" Q, as a root which Q will specify. As a matter of fact, the initial process of articulation in many signs is similar to an oral CV syllable, insofar as the articulation takes the Hand Shape of Q as that of the Vowel, while the occlusion occurs. We use the term 'occlusion' here in the sense of visual perception studies, as occlusion (interposition) of one object by another, in this case, of the body by the hand (Kanisza, 1986: 283). What is not seen is not so much a mental representation as a 'found detail' in an amodal complementation, with clear functional effects on the perception of fragmented objects.
f) One last clarification regarding sequentiality: Movement, whether path or local, does not generally have a specified ending place. The sign does not necessarily stop in one place (IDEA, idea; ENFADADO, angry) and, if it does, it does not do so in a lexical Location L (but rather in a precisely moved place), or with a Hand Shape Q or an Orientation O other than those foreseen by M, these Locations, Hand Shapes and Orientations being moreover subject to strict constrictions. M consists precisely of leading to that end. Another matter is two successive movements (ESPADA, sword), or two phonological places (PADRE, father), which we consider disyllabic, but in monosyllabic signs the economy of the writing system makes it possible to end the sign in its movement. The incidence of certain Ms, specifically in local Fs, which modify Q and/or O, seems comparable to glides in oral languages. D movements, on the other hand, do not change Q and can be compared to consonants. The incidence of M is phonetically very varied.

We now give some arguments for considering Q the syllabic nucleus, and thus justify its being written in the centre of the syllable.

2.2. The Nuclear Character of Q
We agree with Brentari (1998: 313) that "the formal role of distinctive features, syllables, and segments as building blocks of a grammar with constraints is the same for
signed and spoken languages, but the substantive definitions in both types of languages – those that are more phonetic and less grammatical – depend on conditions of naturalness in each modality", although we believe that the identity of the formal role should be translated as the difference between nucleus, onset and coda (or between onset and rhyme), which is immensely important as far as writing is concerned. This is the difference on which the writing system is based, and, although the model is not the most scientifically suited for the phonological description of sign languages, as neither is it for oral languages (according to non-linear phonology), it may be applied to sign languages with criteria similar to those for spoken languages. This opinion is defended by Wilbur (1990).

The following are the main reasons why we consider Q the nucleus:
a) The nucleus is a necessary constituent of every syllable. Some phonologists have stated that the necessary, nuclear, constituent is Movement. Brentari (2002: 44), for example: "regarding minimal word constraints, no sign is well formed unless it has a movement of some type", but, in Spanish Sign Language at least, there are fairly evident counter-examples of signs without M: one-handed signs such as OJO (eye), ALTO (tall), ANCHO (wide); and two-handed signs such as PELOTA (ball), GAFAS (glasses), CRUCIFIJO (crucifix), which neither have movement nor undergo an epenthesis of movement, as Brentari states. On the other hand, the only signs without Q are the non-manual signs (Dively, 2002). These signs are generally gestures (emblems, etc.), and have no lexical entity. When they act with related morphological value, they are represented at the end of the sign.
b) While Location or Movement can be reduced in rapid signing (IDEA, idea, can be signed in a slightly higher place, although not at the temple; or the movement of EMPEZAR, to begin, can be reduced to a slight, local waving movement), Hand Shape cannot usually be reduced.
c) We agree with Coulter (1990: 125) that stress is "the notion that greater articulatory effort is involved", i.e. as muscular tension, so that, according to Wilbur (1990: 99), "stressed signs were temporally shorter than unstressed". In prosodic phonological models, the nuclear nature of Movement means that it carries prosodic marks such as duration, but I believe that this is not the same as stress. In this regard, it is very significant that the emphasis on some signs normally made with binary repetition eliminates this repetition while tensing the articulation. We believe that our point of view is compatible with the well-known Straka Rule: "under the effect of reinforcing articulatory energy, consonants close and vowels open; on the contrary, under the effect of articulatory weakening, consonants open and vowels close" (Straka, 1963: 35).
d) Lastly, it should be noted that when Sign Languages are interpreted for deaf-blind people, they are reduced to Q, insofar as fingerspelling is a part of Sign Languages.

Considering Q the nucleus also resolves the problem of Hand Shape double behaviour in prosodic models. As regards this double behaviour, Corina (2002: 91-92) has said "that is, that hand shapes may be a constituent of the syllable nucleus or not" or, in other words (Corina, 2002: 94), "in instances when the hand shape changes, hand shape is functioning more like a vowel. In those signs with no change in hand shape, hand shape serves a more consonantal function". Brentari (2002: 30) has also referred to this double status: "Depending on whether the posture of the hands remains constant throughout a sign – in which case the dynamic portion of the signs comes from path movement – or whether the posture of the hands changes while the other parameters are held constant, hand configurations can be thought as non nuclear (C) or nuclear (V) in a sign syllable". We could also ask about simultaneous changes in hand shape and path movement (as in COMPRENDER, to understand), which would involve a new treatment of hand shape. However, its unified treatment as a nucleus avoids these dysfunctions.

In our model, the components or phonemes of Location, Hand Shape, Orientation and Movement can be considered structurally or syntactically as the [Onset] [Rhyme (nucleus, coda)] elements of the syllable. This model has the asymmetrical conditions that characterise linguistic constructs as regards syllabic structure (Carstairs-McCarthy, 2001).

3. Economy of the Writing System: Projection Model, Featural Elements and Rules for Simplification
When the Greeks imported Semitic writing, they gave the characters the Greek names closest in sound to their Semitic names (aleph / alpha), and adapted them to represent their own sounds (many of which, particularly in learned words, were borrowed from Semitic languages). In sign languages, the alphabet may not be imported based on reasons of perceptive analogy, but on general semiotic values associated with different types of sounds. Moreover, although the exact number of phonemes of each type (places on the body or in the signing space, hand shapes, types of orientation, types of movement) is not closed, at least in Spanish Sign Language, we know enough to propose a representation open to new symbols. What we do know is that the number of phonemes, understood like this, is clearly greater in Sign Languages than in spoken languages: 32 parts of the body, 10 parts of the signing space, 31 hand shapes, four orientations for each hand shape; as regards M, the number depends on the consideration of features. This complexity will be resolved by means of what we call the projection model. In any case, this property of sign languages leads to a phoneme:morpheme ratio of almost 1:1.

The symbols (represented by consonants) for the parts of the signing space, orientation and direction of movement will be further specified by means of vowels, using a hand projection model which associates "up", "upwards" or "towards the signer's face" with the vowel "a" (which also symbolises the thumb); "down", "downwards" or "towards the listener's face" with the vowel "u" (which also symbolises the little finger); "left" or "towards the left" with the vowel "i" (which also symbolises the middle finger); "in front" or "forwards" with the vowel "e" (which also represents the index finger); "in the centre" or "backwards" with the vowel "o" (which also represents the hand shape that uses the five fingers); and "right" or "towards the right" with the symbol "y". This geometric model has been partially inspired by Friedman (1977).
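The association just described can be summarised as a small table; the sketch below is ours, and only the mapping itself is taken from the text:

    # Projection model: spatial value -> vowel (the same vowel also
    # names a finger or hand shape, as indicated in the comments).
    PROJECTION_VOWELS = {
        "up / towards the signer's face": "a",      # also the thumb
        "down / towards the listener's face": "u",  # also the little finger
        "left": "i",                                # also the middle finger
        "front / forwards": "e",                    # also the index finger
        "centre / backwards": "o",                  # also the five-finger hand shape
        "right": "y",
    }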
These specifications are features that allow more analytical representation and easier reading. In the case of Location, the sub-specification appears before the symbol for the place in space (the central longitudinal plane, symbolised by l, and the right longitudinal lateral plane, represented by the consonant b), so that al is the high part of the central plane (as in CIELO, sky); el, the frontal part of the same plane (as in TU, you); ub, the "low" part of the lateral plane (as in BAJO, low); ab, the high part of the lateral plane (CONFERENCIA, conference), etc.

In the case of Orientation, after the consonant m, the sub-specifications use a first vowel to indicate the direction of the fingers of the hand (on the open palm), and a second one the orientation of the palm: natural orientation, following on from the arm, does not need to be represented, and for it the first vowel is sufficient (as in CONFERENCIA, conference, ma; or in TU, you, me); for orientation towards the signer or upwards an a is added (as in PASADO, past, maa; or in QUE, what, mea); for orientation towards the listener or downwards a u is added (as in COMPRENDER, to understand, mau; or in COGER, to catch, meu); and for orientation towards the right, or inverse to natural continuity with the arm, a y is added (as in SEPARARSE, to separate, mey). The same occurs with the other orientations for the direction of the fingers (mi, mia, miu, miy; mu, mua, muu, muy, etc.).

In the case of Direction (D), the vowel added to the straight movement symbol (w) states the direction: wa is upwards, as in FUEGO (fire); we, forwards, as in CONFERENCIA (conference); wo, backwards, as in COMPRENDER (to understand). Curved directional movements are represented by a c followed by two vowels, one for direction and the other for curvature: cea would be a curve forwards curving upwards, as in DAR (to give); cya, a curve towards the right curving upwards, as in ARCO (arch, bow), etc. These direction vowels are added directly to the local movement symbols when they are carried out with directional movement. Thus, the extension/flexion symbol l is followed by o to indicate extension/flexion moving backwards, as in COMPRENDER (to understand), which is why this word ends in lo; or a trembling movement, symbolised by t, is followed by e to indicate that it occurs in a forwards direction, as in BOSQUE (forest), which is why this word ends in te. Local movements such as waving, beckoning and twisting indicate the direction of their local movement with the respective vowels.
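The regularity of these direction vowels makes the composition of movement symbols mechanical, as the following sketch (ours; the paper's symbol inventory beyond w, c, l and t is not shown here) illustrates:

    def straight(direction_vowel: str) -> str:
        """Straight path movement: w + direction vowel.
        straight('a') -> 'wa' (upwards, as in FUEGO, fire);
        straight('e') -> 'we' (forwards, as in CONFERENCIA)."""
        return "w" + direction_vowel

    def curved(direction_vowel: str, curvature_vowel: str) -> str:
        """Curved path movement: c + direction vowel + curvature vowel.
        curved('e', 'a') -> 'cea' (forwards curving upwards, as in DAR,
        to give); curved('y', 'a') -> 'cya' (as in ARCO, arch)."""
        return "c" + direction_vowel + curvature_vowel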
Some local movements are involved in symmetry (tapping or hitting between the two hands, linking, etc.) and, in this case, may be represented using the two-handed s symbol. For example, a symmetric tapping movement between the two hands, such as CONTACT (contact), will be symbolised by sp, where p is the symbol of the tapping F; a symmetric hitting movement, as in HIERRO (iron), is symbolised by sx, where x is the symbol of the hitting F, etc. The signs will thereby have a sequence as follows (a worked assembly of this order is sketched after the list):

1. S (if it is two-handed) + indicators of the type of symmetry / QO of the passive hand
2. spacing
3. body consonant / vowel + l/b (Location)
4. optional point (Contact)
5. Q (Configuration)
6. m (Orientation) + orientation vowels
7. D consonant + direction vowel/s
8. F consonant/s + direction vowel/s
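Read as a data format, this sequence amounts to concatenating the components in a fixed order. The sketch below is ours and deliberately leaves every component an opaque string; which components a given sign takes, and which ones the simplification rules omit, is decided by the writer:

    def write_sign(s="", location="", point=False, q="", orientation="",
                   d="", f=""):
        """Assemble the written form of a sign in the order given above:
        S, Location, optional point, Q, Orientation (m + vowels),
        D (+ direction vowel/s), F (+ direction vowel/s)."""
        return "".join([s, location, "." if point else "",
                        q, orientation, d, f])

For a disyllabic sign, two such strings are joined with a hyphen, as in the table given below.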
We have left the representation of Q for the end. To a certain extent, it is the easiest, insofar as every finger, except the ring finger, has a symbol, and it is easy to use diacritical symbols to indicate the features of flexion (´), union (`), contact (^) and link (¨), and to distinguish, from the order of the fingers, whether the shape is open-handed (as in POLVO, dust) or close-fisted (as in MINUTO, minute).

The method presented here is completed with certain rules for the simplification of location and orientation, based on considering certain locations or orientations 'natural' and not symbolising them. Thereby, writing Spanish Sign Language becomes very easy.

We use the following two rules for the simplification of locations (not written):
a) Simplification of the ol central location of most two-handed signs.
b) Simplification of the lateral location (ab, eb, ib, ob, yb) when the hand is in its natural position following on from the arm (òma instead of abòma; òmi instead of ilòmi, etc.).

We use the following two rules for simplifying orientation:
a) Simplification of the orientation when the location is a part of the body and the palm is oriented towards that location (e.g. ynò rather than yòmi).
b) Simplification of the me orientation when the sign is made in eb, as occurs in many signs such as PISTOLA (pistol), BASTON (walking stick), REGULAR (regular), etc.

Lastly, we simplify L and O by using only diacritical and numerical signs.

The possibility of alphabetical writing has been tested by writing all the signs contained in Spanish Sign Language dictionaries, particularly Pinedo's dictionary (1989), and also in the translation of several texts, including poetry, and in teaching the method to groups of signers. However, as we have stated already, writing is not a reproduction of spoken language: it is a representation, a record, with its advantages and limitations, of the spontaneous act of signing. The lack of a prosodic representation in the writing of many oral languages is a limitation, particularly from the point of view of non-literate persons, although this limitation, related to the lack of context and the non-presence of the interlocutors, makes the written message very suitable for reflection, and very open to interpretation.

Writing signed spontaneous conversation generally involves adopting certain other symbols, particularly for Location. According to Liddell (1990), in addition to the phonological places where lexical signs are located (10 in the signing space and 32 on the body), there also exist anaphoric grammatical spaces and descriptive, analogical or topographical spaces, which copy the real situation of objects in real space and are used in blended spaces in descriptions. There are no problems in applying the projection model to represent grammatical locations; descriptive locations may be represented by means of directional repetitions, but if this is not possible, they will have to be paraphrased by writing "to the left," "crossed," etc. This is also the case with many non-manual expressions describing modality, i.e. doubt, certainty, etc.

We now give the writing of certain Spanish Sign Language signs of different phonological composition. Disyllabic signs are written with a hyphen; L and/or O simplified using the rules mentioned above are written in brackets:
(Columns of the original table: 2-handed S, L, C, Q, O, D, F; disyllabic signs hyphenated.)
amor (love): yn i (mi)
cauce (course): sm (ol) ò (me) se
rubio (blond): c i (miu) zo
teoría (theory): t . T (ma) wruhob
espada (sword): (eb) aë meu cre - we
libro (book): sc (ol) ò (me) creb
sordo (deaf): r e (mau) -v
Portugal: sm pn a miu zuy
ayer (yesterday): hm. oa (maa) daheb
casi (almost): (eb) aë mea grel
dar (to give): y . aë mo cea
China: yn e (mo) zy - zu
bilinguismo (bilingualism): so'ami ei mau wu gre

Acknowledgements
This paper has been made possible thanks to the financial support of the Ministerio de Ciencia y Tecnología (Research Project nº BFF2002-010016).

References
Brentari, D. (1998) A Prosodic Model of Sign Language Phonology. New York: The MIT Press.
Brentari, D. (2002) Modality differences in sign language phonology and morphophonemics. In Meier, R.P., Cormier, K. & Quinto-Pozos, D. (eds.), pp. 35-64.
Carstairs-McCarthy, A. (2001) ASL 'syllables' and language evolution: a response to Uriagereka. Language 77, 343-349.
Clements, G.N. (1985) The geometry of phonological features. Phonology Yearbook 2: 225-252.
Corina, D. & Sandler, W. (1993) On the nature of phonological structure in sign language. Phonology 10: 165-207.
Corina, D.P. & Hildebrandt, U.C. (2002) Psycholinguistic investigations of phonological structure in ASL. In Meier, R.P., Cormier, K. & Quinto-Pozos, D. (eds.), pp. 88-111.
Coulter, G.R. (1990) Emphatic Stress in ASL. In Fisher, S. & Siple, P. (eds.), 37-66.
Dively, V., Metzger, M., Tabú, S. & Baer, A. (eds.) (2000) Signed Languages: Discoveries from International Research. Washington DC: Gallaudet University Press.
Fisher, S. & Siple, P. (eds.) (1990) Theoretical Issues in Sign Language Research, vol. I. Chicago: The University of Chicago Press.
Friedman, L.A. (1977) Formational Properties of American Sign Language. In On the Other Hand: New Perspectives on American Sign Language. New York: Academic Press, 13-56.
Goldsmith, J. (1992) A Handbook of Phonological Theory. Oxford: Basil Blackwell.
Herrero, A. & Alfaro, J.J. (1999) Fonología y escritura de la lengua de signos española. Estudios de Lingüística 13: 89-116.
Herrero, A. (2003) Escritura alfabética de la lengua de signos española. Alicante: Publicaciones de la Universidad de Alicante.
Jakobson, R. (1988) Obras selectas, tomo I. Madrid: Gredos.
Kanisza, G. (1980) Grammatica del vedere. Saggi su percezione e gestalt. Bologna: Il Mulino. Spanish edition: Barcelona, Paidós, 1986.
Liddell, S.K. (1984) THINK and BELIEVE: Sequentiality in American Sign Language signs. Language 60: 372-390.
Liddell, S.K. (1989) American Sign Language: The phonological base. Sign Language Studies 64: 195-278.
Liddell, S.K. (1990a) Four functions of a locus: Re-examining the structure of space in ASL. In Lucas, C. (ed.), Sign Language Research: Theoretical Issues. Washington: Gallaudet University Press, 176-198.
Liddell, S.K. (1990b) Structures for representing handshape and local movement at the phonemic level. In Fisher, S. & Siple, P. (eds.), 37-66.
Malmberg, B. (1955) The phonetic basis for syllable division. Studia Linguistica 9: 80-87.
Meier, R.P., Cormier, K. & Quinto-Pozos, D. (eds.) (2002) Modality and Structure in Signed and Spoken Languages. Cambridge: Cambridge University Press.
Olson, D. & Torrance, N. (eds.) (1991) Literacy and Orality. Cambridge: Cambridge University Press. Spanish edition: Barcelona, Gedisa, 1998.
Perlmutter, D.M. (1992) Sonority and syllable structure in American Sign Language. Linguistic Inquiry 23: 407-442.
Pinedo, F.J. (1989) Nuevo diccionario gestual español. Madrid: C.N.S.E.
Sampson, G. (1985) Writing Systems. London: Century Hutchinson Limited. Spanish edition: Barcelona, Gedisa, 1997.
Sandler, W. (1986) The spreading hand autosegment of American Sign Language. Sign Language Studies 50: 1-28.
Sandler, W. (1989) The Phonological Representation of the Sign. Dordrecht: Foris.
Sandler, W. (1990) Temporal aspects in ASL phonology. In Fisher, S. & Siple, P. (eds.), 7-36.
Stokoe, W.C. (1960) Sign language structure: An outline of the visual communication systems of the American Deaf. Burtonsville: Linstok Press, 1993.
Straka, G. (1963) La division des sons du langage en voyelles et consonnes peut-elle être justifiée? Travaux de Linguistique et Littérature, I, 17-74.
Wilbur, R.B. (1990) Why syllables? What the notion means for ASL research. In Fisher, S. & Siple, P. (eds.), 81-108.
Wilbur, R.B. (1999) Stress in ASL: Empirical evidence and linguistic issues. Language and Speech 42: 229-250.

Synthesis of Virtual Reality Animations from SWML using MPEG-4 Body
Animation Parameters

Maria Papadogiorgaki*, Nikos Grammalidis*, Nikos Sarris† and Michael G. Strintzis*

*
Informatics and Telematics Institute
1st Km Thermi-Panorama Road
57001 Thermi-Thessaloniki, Greece
E-mail: {mpapad, ngramm, strintzi}@iti.gr


†
Olympic Games Organizing Committee Athens 2004
Iolkou 8 & Filikis Etairias
14234 Nea Ionia, Greece
E-mail: [email protected]
Abstract
This paper presents a novel approach for generating VRML animation sequences from Sign Language notation, based on MPEG-4
Body Animation. Sign Language notation, in the well-known SignWriting system, is provided as input and is initially converted to
SWML (SignWriting Markup Language), an XML-based format which has recently been developed for the storage, indexing and
processing of SignWriting notation. Each sign box (basic sign) is then converted to a sequence of Body Animation Parameters (BAPs)
of the MPEG-4 standard, corresponding to the represented gesture. These sequences, which can also be coded and/or reproduced by
MPEG-4 BAP players, are then used to animate H-anim compliant VRML avatars, reproducing the exact gestures represented in sign
language notation. Envisaged applications include producing signing avatars for interactive information systems (Web, E-mail, info–
kiosks) and TV newscasts for persons with hearing disabilities.

1. Introduction
The SignWriting system is a writing system for deaf sign
languages developed by Valerie Sutton for the Center of
Sutton Movement Writing, in 1974 [1]. A basic design
concept for this system was to represent movements as
they are visually perceived, and not for the eventual meaning that these
movements convey. In contrast, most of the other systems that have been
proposed for writing deaf sign languages, such as HamNoSys (the Hamburg
Notation System) or the Stokoe system, employ alphanumeric characters, which
represent the linguistic aspects of signs. Almost all international sign
languages, including American Sign Language (ASL) and Brazilian Sign Language
(LIBRAS), can be represented in the SignWriting system. Each sign box (basic
sign) consists of a set of graphical and schematic symbols that are highly
intuitive (e.g. denoting specific head, hand or body postures, movements or
even facial expressions). The rules for combining symbols are also simple;
thus, the system provides a simple and effective way for people with hearing
disabilities who have no special training in sign language linguistics to
write in sign languages. Examples of SignWriting symbols are illustrated in
Figure 1.

Figure 1: Three examples of representations of American Sign Language in the
SignWriting system.

An efficient representation of these graphical symbols in a computer system
should facilitate tasks such as storage, processing and even indexing of sign
language notation. For this purpose, the SignWriting Markup Language (SWML),
an XML-based format, has recently been proposed [7]. An online converter is
currently available, allowing the conversion of sign boxes in SignWriting
format (produced by SignWriter, a popular SignWriting editor) to SWML format.
Another important problem, which is the main focus of this paper, is the
visualization of the actual gestures and body movements that correspond to
the sign language notation. A thorough review of state-of-the-art techniques
for performing synthetic animation of deaf signing gestures has been
presented in [5]. Traditionally, dictionaries of sign language notation
contain videos (or images) describing each sign box; however, the production
of these videos is a tedious procedure and has significant storage
requirements. On the other hand, recent developments in computer graphics and
virtual reality, such as the new Humanoid Animation (H-Anim) [9] and MPEG-4
SNHC [3] standards, allow the fast conversion of sign language notation to
Virtual Reality animation sequences, which can be easily visualized using any
VRML-enabled Web browser.
In this paper, we present the design, implementation details and preliminary
results of a system for performing such a visualization of sign boxes
available in SWML. The proposed technique first converts all individual
symbols found in each sign box to a sequence of MPEG-4 Body Animation
Parameters. The resulting sequences can be used to animate any
H-anim-compliant avatar using an MPEG-4 SNHC BAP player provided by EPFL [4].
The system is able to convert all hand symbols as well as the associated
movement, contact and movement dynamics symbols contained in any ASL sign
box. Although only manual (hand) gestures are currently supported, we plan to
implement other body movements (e.g. torso) as well as facial animation in
the near future. The proposed technique has significant advantages:
• Web- (and Internet-) friendly visualization of signs: no special software has to be installed;
• Almost real-time visualization of sign language notation, thus enabling interactive applications;
• Avatars can easily be included in any virtual environment created using VRML, which is useful for a number of envisaged applications, such as TV newscasts, automatic translation systems for the deaf, etc.;
• Efficient storage and communication of animation sequences, using MPEG-4 coding techniques for BAP sequences.
Significant similar work on producing VRML animations from signs represented
in the HamNoSys transcription system has been carried out by the EC IST
ViSiCAST project [6] and its follow-up project "E-Sign" [10]. Current
extensions of HamNoSys are able to transcribe all possible body postures,
movements and facial expressions [11], and significant work towards
supporting MPEG-4 BAPs has been made. The main contribution of the approach
proposed in this paper is the attempt to work in the same direction for the
most common and popular representation of sign languages, which is the
SignWriting notation system.
The paper is organized as follows: Section 2 provides an introduction to SWML
and describes how our application extracts information from SWML files. In
Section 3, the proposed technique for converting sign boxes to MPEG-4 Body
Animation Parameters is described. The synthesis of animations for H-anim
avatars is outlined in Section 4, while discussion and future work are
presented in Section 5.

2. Introduction to SWML and parsing of SWML files

SWML [2] is an XML-based format described by the SWML DTD (currently version
1.0 draft 2) [7]. The DTD specifies two types of SWML documents: sw_text
(sign language text generated e.g. by an SWML editor or converter) and
sw_table (sign language database or dictionary generated by an SWML-aware
application).
• An sw_text document consists of sign_boxes and text_boxes, where each sign box consists of a set of symbols and each text box contains an alphanumeric string.
• An sw_table document consists of a table of entries, where each entry consists of a sign_box and a corresponding gloss (a sequence of fields containing descriptions of this sign box in an oral language).
Each symbol is specified in SWML using the following fields:
a) A shape number (integer) specifying the shape of the symbol,
b) A variation parameter (0 or 1 for hand symbols / 1, 2 or 3 for movement and punctuation symbols) specifying possible variations (complementary transformations) of the symbol,
c) A fill parameter (0, 1, 2 or 3 for hand and punctuation symbols / 0, 1 or 2 for movement symbols) specifying the way the shape is filled, generally indicating its facing relative to the signer,
d) A rotation parameter (0-7) specifying a counter-clockwise rotation applied to the symbol, in steps of 45 degrees,
e) A transformation flip parameter (0 or 1) indicating whether the symbol is vertically mirrored or not, relative to the basic symbol, and, finally,
f) The x and y coordinates of the symbol within the sign box.
For sign synthesis, the input consists of the SWML entries of the sign boxes
to be visualized. For each sign box, the associated information corresponding
to its symbols is parsed. Information related to symbols that are supported
by the sign synthesis application, i.e. hand symbols as well as corresponding
movement, contact and movement dynamics symbols, is then used to calculate
the MPEG-4 Body Animation Parameters.
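To make the parsing step concrete, the following minimal Python sketch reads
the six symbol fields of each sign box with the standard
xml.etree.ElementTree module. The element and attribute names (sign_box,
symbol, shape, variation, fill, rotation, flip, x, y) are illustrative
assumptions derived from the field list above, not the exact names of the
SWML 1.0 DTD.

    import xml.etree.ElementTree as ET

    def parse_sign_boxes(swml_path):
        """Return one list of symbol-field dicts per sign box in an SWML file.

        NOTE: element/attribute names are assumed from the field list above;
        the actual SWML 1.0 DTD may use different names.
        """
        boxes = []
        for box in ET.parse(swml_path).getroot().iter("sign_box"):
            symbols = []
            for sym in box.iter("symbol"):
                symbols.append({
                    "shape": int(sym.get("shape")),          # shape number
                    "variation": int(sym.get("variation")),  # complementary transformation
                    "fill": int(sym.get("fill")),            # facing of the shape
                    "rotation": int(sym.get("rotation")),    # 0-7, in 45-degree steps
                    "flip": int(sym.get("flip")),            # vertical mirroring
                    "x": int(sym.get("x")),                  # position in the sign box
                    "y": int(sym.get("y")),
                })
            boxes.append(symbols)
        return boxes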
3. Conversion of Sign Boxes to MPEG-4 Body Animation Parameters

The issue of body modeling and animation has been addressed by the
Synthetic/Natural Hybrid Coding (SNHC) subgroup of the MPEG-4 standardization
group [3]. More specifically, 168 Body Animation Parameters (BAPs) are
defined by MPEG-4 SNHC to describe almost any possible body posture. Most
BAPs denote angles of rotation around body joints. In this section, the
proposed system to convert symbols contained in a SWML sign box to BAP
sequences will be presented.
Currently, symbols from the 1995 version of the Sign Symbol Sequence
(SSS-1995) are supported. This sequence comprises an "alphabet" of the
SignWriting notation system, while true images (in gif format) of each symbol
contained in this sequence are available in [2]. The proposed system is able
to convert:
• All 106 hand symbols,
• All 95 (hand) movement symbols and
• Two punctuation symbols (180, 181), which contain synchronization information.
Other punctuation symbols, as well as symbols that represent facial
expressions and face, torso and shoulder movements (43 symbols), are
currently ignored (not decoded) by the system.
The conversion starts by first examining the symbols contained within the
input sign box. If no symbols describing dynamic information such as hand
movements, contact or synchronization exist, the resulting BAP sequence
corresponds to just one frame (i.e. a static gesture is reproduced).
Information provided by the fields of the (one or two) hand symbols contained
in the sign box is used to specify the BAPs of the shoulder, arm, wrist and
finger joints. On the other hand, if symbols describing dynamic information
exist, the resulting BAP sequence contains multiple frames, describing
animation key-frames (i.e. a dynamic gesture is reproduced). The first
key-frame is generated by decoding the existing hand symbols, as in the case
of static gestures. Since the frame rate is constant and explicitly specified
within a BAP file, the number of resulting frames may vary, depending on the
complexity of the described movement and its dynamics. Synchronization
symbols and contact also affect the represented movement and in some cases
require special treatment.
Smooth and natural-looking transitions between the neutral body position and
the body position corresponding to a static gesture (or the start and end
frames of a dynamic gesture) are achieved by generating additional
intermediate frames using a "hierarchical" BAP interpolation procedure:
intermediate BAP sets (frames) are generated to consecutively move first the
arms, then the wrist and finally the fingers from their previous positions to
their new positions.
A block diagram of the proposed system is illustrated in Figure 2, while
additional details about the generation of BAPs for static and dynamic
gestures are provided in the following subsections.
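The staged transition described above can be sketched as follows; the joint
groupings and the number of steps per stage are illustrative assumptions, and
BAP sets are represented as plain name-to-angle dictionaries.

    def hierarchical_transition(start, end, steps_per_stage=5):
        """Move arms, then wrists, then fingers from pose `start` to pose `end`.

        `start` and `end` map BAP names to rotation values; the grouping of
        joints into stages below is an illustrative assumption.
        """
        stages = [
            [k for k in start if "shoulder" in k or "elbow" in k],   # arms first
            [k for k in start if "wrist" in k],                      # then wrists
            [k for k in start if "finger" in k or "thumb" in k],     # fingers last
        ]
        frames, current = [], dict(start)
        for joints in stages:
            for i in range(1, steps_per_stage + 1):
                t = i / steps_per_stage
                for j in joints:
                    current[j] = start[j] + t * (end[j] - start[j])
                frames.append(dict(current))
        return frames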

Figure 2: A block diagram of the proposed system.

3.1. Static gestures
The SignWriting system allows various transformations to be applied to a
basic symbol. A hand symbol, for example, can exist in many different
postures with bent fingers etc., represented with different shape numbers.
Also, the signer may see either his palm, the back of his palm or the side of
his palm (Figure 3).

Figure 3: The signer sees a) his palm, b) the back of his palm, c) the side
of his palm.

As seen in Figure 4, the hand may be parallel either with the wall (wall
plane) or with the floor (floor plane).

Figure 4: a) Hand is parallel with the wall plane, b) hand is parallel with
the floor plane.

The position of the palm may also change due to a rotation around the wrist
joint. Furthermore, a "flipped" symbol represents a symbol that is "mirrored"
around the vertical axis. This means that it actually describes a posture of
the other hand. A hand symbol and its flipped version are illustrated in
Figure 5.

Figure 5: A basic hand symbol and its flipped version.

In the following, the procedure to extract useful information from the SWML
representation of a hand symbol is summarized.
Initially, the binary "transformation flip" parameter is used to identify
whether the symbol corresponds to the left or right hand. Then the fill and
variation parameters of each symbol are used to determine the animation
parameters of the shoulder and elbow joints:
• If (variation, fill) = (0,0), (0,1) or (1,3), the axis of the arm is parallel to the floor (floor plane).
• If (variation, fill) = (1,0), (1,1) or (1,2), the axis of the arm is parallel to the human body (wall plane).
• If (variation, fill) = (1,0) or (1,3), the signer sees his palm.
• If (variation, fill) = (1,1) or (0,0), the signer sees the side of his palm.
• If (variation, fill) = (1,2) or (0,1), the signer sees the back of his palm.
In addition, the rotation parameter is used to determine the animation
parameters of the wrist joint:
• If the signer sees the side of his palm, the rotation value (multiplied by 45 degrees) is used to define the R_WRIST_FLEXION BAP (for the right hand) or the L_WRIST_FLEXION BAP (for the left hand).
• In the other two cases (signer sees his palm or the back of his palm), the rotation value (multiplied by 45 degrees) is used to define the R_WRIST_PIVOT BAP (for the right hand) or the L_WRIST_PIVOT BAP (for the left hand).
Finally, the symbol shape number is used to specify the animation parameters
corresponding to the finger joints, using look-up tables of BAP values
corresponding to each symbol.
If the sign box contains a second hand symbol, similar procedures are used to
extract the body animation parameters of the other hand. After the processing
of all existing hand symbols, all body animation parameters corresponding to
shoulder, elbow, wrist and finger joints are determined and stored.
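Read as a decision table, the rules above can be restated in a few lines of
Python. The BAP names follow the MPEG-4 convention quoted in the text;
everything else (function name, return format) is an illustrative sketch
rather than the system's actual code.

    ARM_FLOOR = {(0, 0), (0, 1), (1, 3)}   # arm axis parallel to the floor plane
    ARM_WALL = {(1, 0), (1, 1), (1, 2)}    # arm axis parallel to the wall plane

    def decode_hand_symbol(variation, fill, rotation, right_hand=True):
        """Map one hand symbol's fields to an arm plane and a wrist BAP."""
        plane = "floor" if (variation, fill) in ARM_FLOOR else "wall"
        side = "R_" if right_hand else "L_"
        angle = rotation * 45                      # rotation counts 45-degree steps
        if (variation, fill) in {(1, 1), (0, 0)}:  # signer sees the side of his palm
            wrist_bap = side + "WRIST_FLEXION"
        else:                                      # signer sees palm or back of palm
            wrist_bap = side + "WRIST_PIVOT"
        return plane, (wrist_bap, angle)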
3.2. Dynamic gestures
A movement symbol may exist in many forms, describing either simple or
complex movements. Movement can be parallel either to the wall plane or to
the floor plane. Furthermore, as can be seen in Figure 6a, movement symbols
for the left and right hand have different representations. When the movement
is associated with the right (left) hand, the arrow representing its
direction has a dark (light) arrowhead. When both hands simultaneously move
in the same direction as a group, the movement is represented using a neutral
arrowhead, which is neither dark nor light. In some cases, the size of a
movement symbol is used to specify the duration (i.e. the speed) of the hand
movement.
For example, the arrow symbol in Figure 6b is illustrated in three different
sizes: the first represents a fast movement forward, the second a movement
forward with normal speed and the last a slow movement forward.

Figure 6: Three versions of a symbol specifying: a) movements of different
hands, b) movements with different time durations.

The MPEG-4 standard allows the description of human body movement using a
specific set of body animation parameters corresponding to each time instant.
Systems like SignWriting that use a high-level animation description define
movement by specifying a starting and an ending position, in the case of
simple motion with constant velocity, or the full trajectory, in the case of
more complex motion. However, the description of complex motion is also
possible by specifying a number of intermediate key-frames. In the following,
the procedures for generating these BAP key-frames are briefly described.

3.2.1. Generation of BAP key-frames
When all movement description symbols have been identified, the shape number
field identifies their shapes (i.e. the type of movement). First, the total
number of key-frames to be produced is specified, based on the number and
nature of the available movement, movement dynamics, contact, and
synchronization symbols. More specifically, a look-up table is used to define
an initial number k of key-frames for each movement symbol. Furthermore, the
fill parameter specifies whether the motion is slow, normal or fast. In
addition, some symbols explicitly specify the movement duration. For this
reason, a classification of such symbols into three categories has been
defined and a different duration value D is defined for each category:
• Slow motion (D=3)
• Normal motion (D=2)
• Fast motion (D=1)
The total number of frames to be generated when only one motion symbol exists
is N=kDP, where P is a fixed multiplier (e.g. P=10). If the number of such
symbols is more than one, the total number of key-frames is the maximum among
the numbers of key-frames corresponding to each symbol. Finally, if the sign
box contains a contact symbol, the total number of frames is increased by two
(in the case of simple contact) or four (in the case of double contact).
The initial key-frame is generated by decoding the available hand symbols,
exactly as in the case of static gestures. The rotation and transformation
flip fields specify the exact direction of movement. Also, the variation
field specifies whether the right or the left hand performs the movement.
Using information from all available movement, contact and synchronization
symbols, the other BAP key-frames of the specific dynamic gesture are then
generated from a specific set of functions.
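As a worked example of this rule: a single slow movement symbol with an
initial key-frame count k = 4 yields N = kDP = 4 x 3 x 10 = 120 frames, and a
simple contact would add two more. The sketch below reproduces this
arithmetic; the per-symbol k values are assumed to come from the look-up
table mentioned above.

    DURATION = {"slow": 3, "normal": 2, "fast": 1}   # duration value D per category

    def total_frames(k_values, speed="normal", contact=None, P=10):
        """N = kDP for one motion symbol, maximum over symbols otherwise."""
        D = DURATION[speed]
        N = max(k * D * P for k in k_values)
        if contact == "simple":
            N += 2            # simple contact adds two frames
        elif contact == "double":
            N += 4            # double contact adds four frames
        return N

    # One slow movement symbol with k = 4 -> 4 * 3 * 10 = 120 frames.
    assert total_frames([4], speed="slow") == 120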
3.2.2. BAP Interpolation
Finally, when the BAPs for all key-frames have been computed, BAP
interpolation is used to increase the frame rate of the resulting BAP
sequence. This interpolation procedure results in smoother transitions
between key-frames.
Interpolation is generally achieved by approximating the motion equation
using a mathematical function and then re-sampling this function to obtain
the desired intermediate positions at intermediate time instants. Various
interpolation functions can be selected in order to improve results. Since
body animation parameters represent rotations around specific joints,
quaternion interpolation was seen to provide good results [8], but the
complexity of the method is increased. For this reason, a linear
interpolation technique was applied, which was seen to be very efficient for
most signs, since the key-frames have been selected so as to simplify the
movement description between consecutive key-frames.
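A minimal sketch of this linear interpolation step is given below;
substituting quaternion interpolation for the inner expression would give the
higher-accuracy variant mentioned above.

    def upsample(key_frames, factor):
        """Linearly interpolate BAP vectors between consecutive key-frames.

        key_frames: list of equal-length lists of BAP rotation values.
        factor: number of output frames per key-frame interval.
        """
        frames = []
        for a, b in zip(key_frames, key_frames[1:]):
            for i in range(factor):
                t = i / factor
                frames.append([x + t * (y - x) for x, y in zip(a, b)])
        frames.append(list(key_frames[-1]))   # keep the final key-frame
        return frames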
3.2.3. Synchronization (Movement Dynamics) Symbols: a special case
The sign box may also contain one of the three supported synchronization
(movement dynamics) symbols (180, 181 and 182). These symbols, as well as
their fields and interpretation, are described below:
Shape number = 180
• Variation=1, fill=0: simultaneous line (both hands move at the same time)
• Variation=1, fill=1: alternating lines (the right hand moves in one direction while the left moves simultaneously in the opposite direction)
• Variation=1, fill=2: un-even alternating (one hand moves while the other is still, then the second hand moves while the first remains still)
• Variation=1, fill=3, rotation=0: slow movement
• Variation=1, fill=3, rotation=4: smooth movement
Shape number = 181
• Variation=1, fill=0: tense movement
• Variation=1, fill=1: tense movement with emphasis
• Variation=1, fill=2: relaxed movement
• Variation=1, fill=3: relaxed movement with emphasis
Shape number = 182
• Variation=1, fill=0: fast movement
• Variation=1, fill=1: fast movement with emphasis
These synchronization symbols are handled in a similar way as movement
symbols, but an exception exists for the "un-even alternating" symbol, where
first one hand moves while the other is still, and then the opposite. To
handle this case, the total number of key-frames is doubled (N=2kDP). To
produce the first kDP frames, BAPs are generated only for the first hand, so
the second hand remains still. In the following, BAPs are generated
only for the second hand, to produce the next kDP frames, so the first hand
remains still.
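The doubling rule for the un-even alternating case can be sketched as
follows. The two motion callables, which map a normalized time t in [0, 1] to
one hand's BAP set, are illustrative assumptions.

    def uneven_alternating(first_hand, second_hand, k, D, P=10):
        """Generate 2*k*D*P frames: the first hand moves, then the second."""
        half = k * D * P
        frames = []
        for i in range(half):      # first hand moves, second hand stays still
            frames.append((first_hand(i / half), second_hand(0.0)))
        for i in range(half):      # second hand moves, first hand stays still
            frames.append((first_hand(1.0), second_hand(i / half)))
        return frames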
4. Synthesis of animations using H-anim avatars

The "EPFLBody" BAP player [4], developed by the École Polytechnique Fédérale
de Lausanne (EPFL) for the Synthetic/Natural Hybrid Coding (SNHC) subgroup of
MPEG-4, was used to animate H-anim-compliant avatars using the generated BAP
sequences. Since most BAPs represent rotations of body parts around specific
body joints, this software calculates and outputs these rotation parameters
as animation key-frames to produce a VRML ("animation description") file that
can be used for animating any H-anim-compliant VRML avatar. Two frames from
the resulting animations are illustrated in Figure 7.

Figure 7: Animation of the "You" sign in ASL using an H-anim avatar.

By including a VRML TouchSensor node within the VRML file describing the
H-anim avatar, the viewer can interactively start and/or replay the animation
sequence by clicking on the avatar. The viewer can also interact by zooming
in and out on any specific body region and/or by rotating and translating the
model within the 3-D space, in order to fully understand the represented
sign.
Furthermore, further evaluation of the proposed sign synthesis system was
made possible by developing an online system [12] for converting text to Sign
Language notation and corresponding VRML animation sequences for
H-anim-compliant avatars. The application, whose interface is illustrated in
Figure 8, is currently based on a 3,200-word SWML dictionary file, obtained
from the SWML site [2], which has been parsed and inserted into a relational
database. The user is allowed to enter one or more words, which are looked up
in this dictionary. If more than one entry is found, all possible
interpretations are presented to the user, so that he can choose the desired
one. On the other hand, if no entries are found for a specific word, the word
is decomposed into its letters (finger-spelling). In any case, the user may
choose whether or not to include a particular term in the selected terms to
be used for sign synthesis. The user then selects an H-anim-compliant avatar,
which is used for sign synthesis of the selected term or terms. Furthermore,
the user may produce and display the corresponding sign(s) in SignWriting
format (in PNG format) and SWML for a specific term or the selected terms.

Figure 8: Example query: "Welcome to my world".

The user may then select the desired terms and produce and display sign
synthesis results using the selected words or the entire phrase, using any of
the available H-anim avatars.
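The look-up behaviour just described amounts to a dictionary query with a
finger-spelling fallback. A minimal sketch over a relational database
follows; the table and column names are invented for illustration, as the
actual schema is not given.

    import sqlite3

    def look_up(word, db_path="swml_dictionary.db"):
        """Return sign-box entries for a word, falling back to finger-spelling."""
        con = sqlite3.connect(db_path)
        query = "SELECT sign_box FROM entries WHERE gloss = ?"
        rows = con.execute(query, (word,)).fetchall()
        if rows:
            return [r[0] for r in rows]     # possibly several interpretations
        # No entry found: decompose the word into letters (finger-spelling).
        return [r[0]
                for letter in word
                for r in con.execute(query, (letter,)).fetchall()]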
This experimental Web application has already allowed us to identify
significant problems with the synthesis of static and dynamic gestures, which
will have to be solved in the future, e.g. when contacts and complex
movements are involved. A major problem occurs when the sign box contains
contact symbols: in that case, the touch between the hands, or between the
hand and the face, is difficult to achieve. Problems may also occur for
complex movements, when the inclinations of the hand joints, which have been
estimated in each key-frame, are not accurate enough for the exact
description of the movement. Both problems can be solved in the future by
using inverse kinematics methods.
Further evaluation is planned for the future, using Greek and international
SignWriting users, and attempts will be made to solve the problems that have
been observed or will be observed in the future. Although these problems
indicate that much more work is needed for correct synthesis of all signs, we
believe that with this Web tool a very important step towards automatic
text-to-sign synthesis has been made.

5. Discussion and Future work

A novel approach for generating VRML animation sequences from Sign Language
notation, based on MPEG-4 Body Animation, has been presented. The system is
able to convert almost all hand symbols as well as the associated movement,
contact and movement dynamics symbols contained in any ASL sign box.
As stated in the introduction, we plan to support non-manual body movements
as well as facial animation in the near future. Facial animation will be
represented by MPEG-4 Facial Animation Parameters, while animation of
H-anim-compliant avatars using simultaneous face and body animation has
already been successfully implemented. A problem with using Facial Animation
Parameters is that most of them, in contrast to BAPs, describe complex
non-rigid motions, and therefore most existing FAP player implementations are
model-dependent. Furthermore, the resulting VRML animations are more
complicated, since they contain numerous CoordinateInterpolator nodes (one
per face model vertex). Therefore, the computational demands on the hardware
reproducing these animations are increased.
Finally, a short-term goal is to design practical applications of the
proposed system, either as a "plug-in" to existing applications (e.g. sign
language dictionaries) or as a stand-alone tool for creating animations for
TV newscasts (e.g. weather reports). Particular emphasis will be given to
applications that can be used and evaluated by the Greek Sign Language
community; thus, a dictionary of Greek Sign Language, in SignWriter notation,
is planned to be supported in the near future.

6. Acknowledgement

This work was supported by the FP6 IST Network of Excellence "SIMILAR" ("The
European taskforce creating human-machine interfaces SIMILAR to human-human
communication"). The authors would also like to thank Lambros Makris for
designing and developing the "Vsigns" Web page.

References
[1] Official SignWriting site, http://www.signwriting.org/
[2] Official Site of SWML, http://swml.ucpel.tche.br/
[3] Moving Pictures Experts Group (1999). Generic coding of audio-visual objects - Part 2: Visual. MPEG Document ISO/IEC JTC1/SC29/WG11 N3056, Maui.
[4] Fabrice Vergnenegre, Tolga K. Capin, and D. Thalmann (1999). Collaborative virtual environments - contributions to MPEG-4 SNHC. ISO/IEC JTC1/SC29/WG11 N2802, http://coven.lancs.ac.uk/mpeg4/
[5] A.B. Grieve-Smith (2001). SignSynth: A Sign Language Synthesis Application Using Web3D and Perl. Gesture Workshop, London, pp. 134-145.
[6] R. Kennaway (2001). Synthetic Animation of Deaf Signing Gestures. Gesture Workshop, London, pp. 146-157.
[7] Antonio Carlos da Rocha Costa, Graçaliz Pereira Dimuro (2001). Supporting Deaf Sign Languages in Written Form on the Web. The SignWriting Journal, Number 0, Article 1, July. http://gmc.ucpel.tche.br:8081/sw-journal/number0/article1/index.htm
[8] M. Preda, F. Preteux (2001). Advanced virtual humanoid animation framework based on the MPEG-4 SNHC standard. Proceedings EUROIMAGE International Conference on Augmented, Virtual Environments and Three-Dimensional Imaging (ICAV3D'01), Mykonos, Greece, pp. 311-314.
[9] Humanoid Animation Standard Group. Specification for a Standard Humanoid: H-Anim 1.1. http://h-anim.org/Specifications/H-Anim1.1/
[10] Official site of E-sign (Essential Sign Language Information on Government Networks) project. http://www.visicast.sys.uea.ac.uk/eSIGN/
[11] Th. Hanke (2002). iLex - A tool for sign language lexicography and corpus analysis. In: Proceedings of the Third International Conference on Language Resources and Evaluation, Las Palmas de Gran Canaria, Spain, pp. 923-926.
[12] Vsigns page, http://vsigns.iti.gr
Multipurpose Design and Creation of GSL Dictionaries
Eleni Efthimiou, Anna Vacalopoulou, Stavroula-Evita Fotinea, Gregory Steinhauer
ILSP-Institute for Language and Speech Processing
Artemidos 6 & Epidavrou, GR 151 25, Maroussi, Greece
{eleni_e, avacalop, evita, stein}@ilsp.gr

Abstract
In this paper we present the methodology of data collection and implementation of databases with the purpose of creating extensive
lexical and terminological resources for the Greek Sign Language (GSL). The focus is on issues of linguistic content validation,
multipurpose design and reusability of resources, exemplified by the multimedia dictionary products of the projects NOEMA (1999-
2001) and PROKLISI (2002-2004). As far as data collection methodology, DB design and resources development are concerned, a
clear distinction is made between general language lexical items and terms, since the creation of resources for the two types of data
follows different methodological principles, lexeme formation and usage conditions. There is also reference to content and interface
evaluation mechanisms, as well as to basic linguistic research carried out for the support of lexicographical work.

1. Introduction
A basic requirement for the treatment of signs or sign streams as linguistic
input for NLP, and for the development of applications that make use of
linguistic data, is the existence of adequate linguistic resources in the
form of electronic lexical databases and computational grammars.
The Greek Sign Language (GSL) has only recently started to be subject to
systematic linguistic analysis. This is, on the one hand, due to the fact
that it was not until 2000 (Act 2817) that GSL was recognized by the Greek
Parliament as an official language of the Greek State. On the other hand,
this interest is directly connected to the development of technologies which
enabled the creation of electronic linguistic resources (including lexicons,
grammars and sign language corpora) for languages that are uttered in
three-dimensional space (see also Efthimiou et al., 2004). Such resources can
nowadays be adequately stored, retrieved and represented, exploiting the
ability of current systems to incorporate various multimedia functionalities
for the generation of signs into a single platform.

2. GSL lexicography: the background
In contrast to other sign language systems, i.e. ASL (Tennant & Gluszak
Brown, 1998; Wilcox et al., 1998), systematic lexicographical work with
respect to GSL has started only recently, within the framework of the NOEMA
project (1999-2001).
This was the first attempt to create multipurpose reusable linguistic
resources for GSL. Part of the project description was the creation of a
digital sign stream narration corpus and an electronic dictionary of basic
GSL vocabulary. The spin-off products of that project, among which are a
3,000-entry multimedia bilingual dictionary (GSL-Greek) of basic vocabulary
and a multimedia children's dictionary of GSL (Kourbetis & Efthimiou, 2003),
reflect the methodology followed for creating linguistic resources, the
content and interface evaluation mechanism adopted, as well as the basic
linguistic research carried out to support the lexicographical work (NOEMA
Project, 2001).
The knowledge acquired with respect to the morpho-phonological operations
underlying the formation of simple and complex signs allowed for: a) the
construction of rules for creating new valid signs, b) the denomination of
relevant terms and c) the classification of GSL linguistic resources into
terminological lists. All these have significant impact on the development of
both communication and educational tools using technologies which allow the
3D representation of linguistic content.

3. Methodological principles of vocabulary formation
The initial steps of our work on GSL vocabulary included a survey of the
existing lexicography (Logiadis & Logiadis, 1985) and syntax literature. It
turned out that the available knowledge of GSL was only based on individual
fragmentary attempts. These usually lacked scientific criteria, did not
derive from systematic scientific analysis and generally involved the
creation of some kind of lexicon. This fact is directly connected with the
prevailing assumption that GSL is not an autonomous linguistic system but,
rather, some kind of representation of aural Greek.
Consequently, the creation of lexical resources had to take into serious
consideration the linguistic material that would serve as the basis for the
lexicographical work (Johnston & Schembri, 1999) and which should reflect
linguistic synchrony, also allowing for an adequate grammatical description
of GSL (Bellugi & Fischer, 1972).
Next, we will present the methodologies adopted for compiling two vocabulary
lists: a general purpose basic vocabulary of 3,000 initial signs and a
vocabulary of basic computer-skills terminology.
In both cases, extensibility and reusability were the main design principles,
whereas the lack of previous linguistic resources dictated specific
methodological approaches to data collection (for the general purpose
vocabulary), as well as to new sign formation (for the computer-skills
terminology list).

3.1. Methodology of creation of a general purpose basic vocabulary: data collection
The first step of this task mainly involves the compilation of the basic sign
vocabulary¹ of GSL. In the process of compiling a list of 3,000 basic signs
of GSL without an appropriate corpus available, a decision had to be made as
to whether statistical frequencies, everyday use or vocabulary lists taught
to young children would constitute our data.
In order to overcome the lack of GSL resources, we comparatively studied the
proposed basic vocabularies or 'defining vocabularies' of three well-analyzed
aural languages: English, French and German (Mueller et al., 1995;
Gougenheim, 1958; Longman Dictionary of Contemporary English). Based on this
study, we gathered a core 3,650-lemma list, which was then compared to two
other lists:
• the first one, containing 1,850 words, was provided by the Hellenic Federation of the Deaf (HFD) and derived from a previously videotaped and lemmatized corpus intended to serve as basic study material for GSL;
• the second one contained the 2,100 most frequent words in the Hellenic National Corpus (HNC), an electronic corpus of general Greek developed by ILSP, which contained 13,000,000 words at the period of study.
The HNC (1999) word list is of significant importance, given that it contains
words corresponding to actual appearances in text corpora. On the other hand,
the words that constitute the basic vocabularies of different languages carry
an even heavier weight, because they allow reference to a set of concepts
rather than isolated words. Such concepts may be viewed as basic with respect
to everyday communication. Since we proposed a concept-based approach to
vocabulary building, we had to take into account the issue of the
representation of these concepts through different grammatical categories. We
noticed that in the vocabulary lists included in our study, concepts were
represented either by a single or by more than one grammatical category,
without following a systematic way of listing (i.e. in one case, the proposed
representation involves basic/base(v) vs. base(n)/base(v) and in another
difference/differ vs. difference/different).
In the case of the GSL vocabulary, we either adopted the words suggested by
the HFD or followed suggestions made by individual native GSL informants.
Specific grammatical categories were further excluded from the GSL list on
the basis of the numerical restriction of 3,000 signs. Subject to this
exclusion were adverbs (unless no equivalent adjective was available) and
passive verb forms and participles (unless the latter had an adjectival
function in the language).
As a result, a 2,800-concept list was formed, which was then presented to the
HFD for comments, enrichment with concepts specific to deaf communication,
and video recording (Efthimiou & Katsoyannou, 2001). For every concept on the
proposed list, three parameters hold:
• they have a high frequency rate in the vocabulary of Greek according to HNC data;
• they are included in at least two of the proposed basic vocabularies we took into account (Figure 1);
• they can be expressed by words of more than one grammatical category (i.e. love(n)/love(v)) or by a concatenation of synonyms (i.e. angry-furious).
The aim of this procedure was to form the basic sign list of GSL as used by
native signers, without it being biased by external parameters. For this
reason, our informants were asked to propose synonym or antonym signs for
concepts, wherever possible, so that semantic relations are stated by means
of GSL mechanisms rather than via influence from spoken Greek or other
language systems.
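Restated procedurally, the three selection criteria act as a filter over the
candidate concept list, as in the toy sketch below (the data structures are
invented for illustration, and the third criterion is reduced to a
precomputed flag).

    def select_concepts(candidates, hnc_frequent, basic_vocabularies, multi_category):
        """Keep candidates satisfying the three criteria listed above."""
        selected = []
        for concept in candidates:
            appearances = sum(concept in v for v in basic_vocabularies)
            if (concept in hnc_frequent            # frequent in the HNC word list
                    and appearances >= 2           # in at least two basic vocabularies
                    and concept in multi_category):  # several grammatical categories
                selected.append(concept)
        return selected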
3.2. Methodology of development of terminological resources
As far as the design of GSL terminological resources is concerned, we had to
take into account that the introduction of specific concept systems into the
language means creating new term systems for representing these concepts
(Sager, 1994; Otman, 1996). In the initial stage of defining the methodology
for term formation, we focused on the principle that new term denominations,
term signs in our case, should incorporate and demonstrate the following
properties innate to the language (Gee & Goodhart, 1985):
• GSL mechanisms of vocabulary organization;
• GSL mechanisms of word formation;
• GSL lexical unit function in sign stream production.
The task of term list formation (Rey, 1995) incorporates, to a considerable
extent, the characteristics and conditions of lexicographical work. However,
there is a crucial point of differentiation, as the items included in a
terminology list carry a given semantic value only within a specific context
of use, outside which they may carry a different meaning or no meaning at
all.
Furthermore, terms are one-to-one representations of concepts, which are
organized into systems (Rey, 1996) and, in contrast to other lexical items,
may consist of complex syntactic and/or semantic units which are formed not
merely by linguistic but also by other (i.e. mathematical) symbols or a
combination of them (Wright & Strehlow, 1995).
The primary task in terms of the initial linguistic data collection was
defining the field of coverage (Sager, 1990). This was followed by a study of
the content of term-intensive corpora on the selected fields of knowledge.
The result was the extraction of a set of concepts for each field. Our
example case is the field of computer-skills terminology. In this specific
case, the language of initial knowledge creation is English. As a result, a
considerable proportion of the terms denominating the relevant concepts are
transferred either directly or indirectly from English into receiver
languages, such as Greek. Consequently, the concept list of computer-skills
terminology had, in our case, two existing representation equivalents in the
context of spoken languages: a set of English terms (source language) and a
set of their Greek translations (receiver language).

¹ One should notice that the notion of basic vocabulary is not uniformly
defined in the relevant literature, which raises the issue of selecting the
appropriate methodological approach to deal with the data.

The task was to create terms in GSL for the same concepts, allowing for the
least possible influence from previously existing representations, while
creating terminological items according to sign language word formation
principles. This was a crucial prerequisite for the proposed denominations to
be recognized by native signers as components of GSL with acceptable internal
structure and specific cognitive content.
This task of concept denomination for the formation of a terminology list in
GSL was undertaken by a working group of terminologists, computational
linguists, computer scientists, GSL specialists and computer-skills teachers,
which included members of the Greek Deaf Community.
The output of this group work was a list of video recorded terms, which were
entered into a DB along with their Greek and English equivalents.

4. Organization of vocabulary databases
The internal organization of the lexical resources database differs from the
one designed for storing terminological items with respect to lemma-related
information, as far as the expected functionality of the resources is
concerned. Thus, synonyms and antonyms (Figure 2) are included only in the
case of the general vocabulary, whereas standard GSL phonological features
such as handshapes are included as lemma-related information in both DBs. For
the same reasons, lemmas in the terminology DB are related not only to a
field but also to a sub-area of use, in order to allow for greater precision
and clear lemma interrelations.

4.1. Design and development of the general purpose vocabulary DB
Given the specific goal of creating exhaustive reusable vocabulary resources
for GSL, the design of the general purpose vocabulary DB incorporated a
number of properties, which include fields for:
• video recorded signs,
• grammatical category of lemmas,
• synonyms,
• antonyms (Figure 3),
• interpretations,
• lemma classification by thematic category,
• lemma translation into Greek and
• HamNoSys annotation features of lemma phonology (Prillwitz et al., 1989).
The DB was then enriched with lexical content following the methodology for
data collection described above. Experience gained by lemma analysis of the
selected video signs enabled a number of assumptions regarding the
morphological structure and sign formation mechanisms of GSL (Efthimiou &
Katsoyannou, 2002). This knowledge provided the grounds for introducing new
signs, as in the case of GSL terminology items.
The implementation of the DB has already proven that the above structure
allows for a multi-dimensional use of the resources created. The reusability
of the general GSL vocabulary resources has already been tested by the fact
that these resources provided the lexicographical content for a number of
dictionary products. The same DB content also draws on on-going research with
respect to efficient sign representation.

4.2. Design and development of the terminological DB
The design of the terminological resources DB is based on a term list, the
formation of which was described in section 3.2 above. Each entry corresponds
to a term and includes fields for:
• the video recorded term-sign,
• a video capture file serving as a visualized definition (Rousseau, 1983),
• the equivalent Greek term,
• the equivalent English term,
• a lemma identification code number,
• a code indicator corresponding to the basic handshape for the term-sign formation in GSL,
• a link to HamNoSys features other than the handshape, and
• sub-area fields in which each term is used.
In the case of computer-skills terminology, the sub-area fields include the
following categories:
• General Notions,
• Word,
• Excel,
• Access,
• Internet Explorer,
• PowerPoint and
• Windows.
By adopting this architecture, the extensibility of the DB is guaranteed
through the possibility of adding new terms, entry fields or terminology
domains. Moreover, DB maintenance through addition, deletion or modification
of term entries is possible without crucial or risky changes in terms of
programming (Sowa, 2000).
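Restated as a data structure, a terminological DB entry along the lines just
described might look as follows; the field names and types are assumptions
based on the enumeration above, not the project's actual schema.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class TermEntry:
        """One entry of the terminological DB, as enumerated above."""
        lemma_id: int              # lemma identification code number
        sign_video: str            # video recorded term-sign
        visual_definition: str     # video capture serving as visualized definition
        greek_term: str
        english_term: str
        base_handshape: str        # code for the basic handshape of the term-sign
        hamnosys_link: str         # link to HamNoSys features other than handshape
        sub_areas: List[str] = field(default_factory=list)   # e.g. "Word", "Excel"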
5. Dictionary implementation
To exemplify the (re-)usability of the lexical resources discussed here, we
make a short reference to two relevant products: a bi-directional (aural
Greek-GSL and GSL-aural Greek) dictionary, compiled after a systematic survey
of linguistic structure, and a trilingual computer-skills dictionary
(GSL-Greek-English).
As far as the dictionary-making process is concerned, the organisation of
entries was based upon the principle of usability in terms of the two user
groups. Thus, each sign-lemma is followed by different defining /
exemplification elements in the two cases. In the general purpose dictionary
(Efthimiou & Katsoyannou, 2001; 2002), the entry structure provides the
following set of information with respect to each GSL lemma:
• translation equivalent(s),
• an explanation in Greek,
• synonyms in GSL,
• antonyms in GSL,
• an illustrative image (whenever possible),
• thematic category for lemma classification.
The inclusion of a Greek definition and translation helps non-native GSL
signers enrich their vocabulary of modern Greek. At the same time, thematic
categorization enables the learning of groups of signs which relate to each
other semantically or pragmatically.
Lemma search is possible in the following manners:
• by order of handshapes within lemmas (Figure 4),
• by thematic category (e.g. «plant names»),
• by alphabetical order of the modern Greek translations.

Dictionary users perceive the special features of GSL in direct reference to
Greek, while thematic fields function as a bridge between each sign and its
Greek equivalent.
Concerning the terminology dictionary, as soon as the application starts, the
items in the DB are processed so as to filter the lemmas corresponding to the
user selection criteria (PROKLISI Project, 2003).
The lemma screen includes the following elements:
• the thematic category,
• a list of every lemma in this category, from which users can select,
• the selected lemma in Greek,
• the selected lemma in English,
• a video-lemma in GSL,
• a list of all sub-area fields in which the selected lemma appears,
• a screen capture example of the term,
• a videotaped text in GSL with a concise presentation of the selected thematic category.
Users can access the content in the following ways:
• by the main handshape which forms the sign corresponding to each term. In this case, each sign is also accompanied by equivalents in both Greek and English, a list of thematic categories relevant to the term, a video presentation of the term, and a videotaped text with an introduction to the selected sub-area;
• by the Greek or English term equivalents in alphabetically ordered lists (Figure 5). The sign which corresponds to the selected term can appear either by clicking on the list or by typing the term in, in one of the suggested languages. Items of information available for this search option include: a list of every sub-area in which the selected lemma appears, a video exemplifying the lemma and the videotaped text with an introduction to the selected thematic sub-area;
• by thematic sub-area. In this case, users can select among seven thematic categories (Figure 6) corresponding to the sub-areas in which computer-skills terminology is categorized. This option retrieves the corresponding terms in three lists of equivalents: GSL-Greek-English. Items of information available for this search option also include the other sub-areas in which the term appears, a video capture explanation of the term or an image, and an informative sign stream presentation of the selected sub-area.

6. Evaluation criteria and procedure
Evaluation procedures for both dictionary products were carried out by user
groups of native GSL signers in real use environments. The basic vocabulary
dictionary was tested in two rounds, in the context of various communicative
situations. The evaluation body was composed of GSL native signers of various
age groups, who were asked to use the dictionary in school, work and home
environments and to complete an evaluation criteria list. The related
questionnaire contained 26 multiple-choice questions and 5 free input slots.
The main evaluation criteria comprised educational and communication needs,
type of profession, the source that disseminated the NOEMA product, interface
design (screen organization, menus, help provided), efficiency of the
information accompanying the entry for each sign, adequacy of the information
introducing general aspects of GSL grammar incorporated in the product, the
period needed for getting used to navigating through the product and possible
recommendations for future versions. The output of that first round of
evaluation served as feedback for making improvements to the final dictionary
product. The second evaluation step followed the same methodology, with the
purpose of verifying the acceptance of the product by the Greek Deaf
Community. More information on the evaluation of the basic vocabulary
dictionary can be found in the related project deliverable (NOEMA, 2001).
A first version of the computer-skills terminology dictionary was
experimentally introduced as an education support tool in a continuous
education class. Comments on both system functionality and content efficiency
were incorporated in the final product version to be released on 30th March
2004.

7. Future research & development goals
Future development efforts in respect to both platforms (basic vocabulary
dictionary and computer terminology dictionary) include investigation of the
possibility of implementing smarter search options, in relation to the
ongoing extension of the basic vocabulary DB content. Efficient sign-based
user look-up features will also be incorporated, along with fuzzy search
capabilities (as proposed, for instance, by Wilcox et al. (1994)).
Based on the proposed methodology for the creation of the computer-skills
terminology dictionary, other specialized dictionaries, intended to serve
knowledge transfer in several areas of interest, are foreseen to be created,
in order to meet a wider range of educational and communication needs
(Dowdall et al., 2002) of the Greek Deaf Community.
Closing, we may note that a children's dictionary (Kourbetis & Efthimiou,
2003) has already been developed, following the release of the NOEMA
dictionary, which will provide further linguistic material for educational
applications addressing early primary school needs.

Acknowledgments
The authors wish to acknowledge the assistance of all groups involved in the
implementation of the proposed methodologies: terminologists, computational
linguists, computer scientists, as well as the panel of GSL consultants and
Deaf individuals acting as GSL specialists and computer-skills teachers.
More specifically, the authors wish to acknowledge the assistance of Dr. V.
Kourbetis, S. Antonakaki, D. Kouremenos and V. Sollis.
This work was partially supported by the national project grants NOEMA (98
AMEA 11) and PROKLISI-EQUAL.

References
Bellugi, U. & Fischer, S. (1972). A comparison of Sign language and spoken language. Cognition, 1, 173--200.
Dowdall, J., Hess, M., Kahusk, N., Kaljurand, K., Koit, M., Rinaldi, F. & Vider, K. (2002). Technical Terminology as a Critical Resource. In Proceedings of the Third International Conference on Language
Resources and Evaluation (LREC 2002) (pp. 1897--1903), Las Palmas de Gran Canaria, ELRA.
Efthimiou, E., Sapountzaki, G., Carpouzis, C. & Fotinea, S.-E. (2004). Developing an e-Learning platform for the Greek Sign Language. Lecture Notes in Computer Science (LNCS), Springer-Verlag Berlin Heidelberg (in print).
Efthimiou, E. & Katsoyannou, M. (2002). NOEMA: a Greek Sign Language - Modern Greek bidirectional dictionary. Modern Education Vol. 126/127, 2002, 115--118 (in Greek).
Efthimiou, E. & Katsoyannou, M. (2001). Research issues on GSL: a study of vocabulary and lexicon creation. Studies in Greek Linguistics, Vol. 2: Computational Linguistics, 42--50 (in Greek).
Gee, J. & Goodhart, W. (1985). Nativization, Linguistic Theory, and Deaf Language Acquisition. Sign Language Studies, 49, 291--342.
Gougenheim, G. (1958). Dictionnaire fondamental de la langue française. Didier.
Hellenic National Corpus, Statistics. ILSP (Institute for Language and Speech Processing) (1999). Available at: http://www.xanthi.ilsp.gr/corpus/
Johnston, T. & Schembri, A. (1999). On defining Lexeme in a Signed Language. Sign Language and Linguistics 2:2, 115--185.
Kourbetis, V. & Efthimiou, E. (2003). Multimedia Dictionaries of GSL as language and educational tools. Second Hellenic Conference on Education, Syros (in Greek) (in print).
Logiadis, N. & Logiadis, M. (1985). Dictionary of Sign Language. Potamitis Press (in Greek).
Longman Dictionary of Contemporary English.
Mueller, J., Bock, H. & Georgiakaki, M. (1995). Grundwortschatz Deutsch. Langenscheidt.
NOEMA Project (2001). Deliverable titled "Final implementation report". ILSP, 6/2001 (in Greek).
Otman, G. (1996). Les représentations sémantiques en terminologie. Paris, Masson.
Prillwitz et al. (1989). HamNoSys. Version 2.0. Hamburg Notation System for Sign Language. An Introductory Guide. Broschur / Paperback (ISBN 3-927731-01-3).
PROKLISI Project: WP9: Development of a GSL based educational tool for the education of people with hearing impairments. Deliverable II: GSL linguistic material: data collection methodology. ILSP, May 2003.
Rey, A. (1995). Essays on terminology. Amsterdam/Philadelphia, John Benjamins.
Rey, A. (1996). Beyond terminology. In Somers, H. (ed.), Terminology, LSP and Translation. Studies in Language Engineering in Honour of Juan C. Sager (99--106). Amsterdam, John Benjamins.
Rousseau, L.-J. (1983). La définition terminologique. In Duquet-Picard, D. (ed.), Problèmes de la définition et de la synonymie en terminologie: Actes du Colloque international de terminologie (35--46). Québec, GIRSTERM.
Sager, J.C. (1990). A Practical Course in Terminology Processing. Amsterdam, John Benjamins.
Sager, J.C. (1994). Terminology: Custodian of knowledge and means of knowledge transfer. Terminology 1/1, 7--16.
Sowa, J.F. (2000). Knowledge Representation: Logical, Philosophical, and Computational Foundations. Pacific Grove, Brooks Cole Publishing Co.
Tennant, R.A. & Gluszak Brown, M. (1998). The American Sign Language Handshape Dictionary. Clerc Books, an imprint of Gallaudet University Press.
Wilcox, S., Scheibmann, J., Wood, D., Cokely, D. & Stokoe, W. (1994). Multimedia dictionary of American Sign Language. In Proceedings of ASSETS Conference, Association for Computing Machinery (pp. 9--16).
Wright, S.-E. & Strehlow, R.A. (1995). Standardizing and harmonizing terminology: theory and practice. Philadelphia, American Society for Testing Materials.

Figure 1: Part of the GSL basic vocabulary DB; the 3rd column from left provides information as regards original (co-)
appearance of lemmas in source lists.

Figure 2: Part of the GSL basic vocabulary DB; synonym and antonym association to video-lemmas.

Figure 3: Synonym/antonym screen incorporated in alphabetical search capability.

Figure 4: Lemma search by handshape in the GSL – Modern Greek basic vocabulary dictionary.

Figure 5: Computer-skills term dictionary: alphabetical search screen.

Figure 6: Association of lemma to sub-area of field in computer-skills terminology DB.

From Computer Assisted Language Learning (CALL) to Sign Language
Processing: the design of e-LIS, an Electronic Bilingual Dictionary of Italian Sign
Language and Italian
Chiara Vettori, Oliver Streiter, Judith Knapp
Language and Law
EURAC; European Academy of Bolzano
Viale Druso/Drususallee 1, 39100 Bolzano/Bozen, Italy
{cvettori;ostreiter;jknapp}@eurac.edu

Abstract

This paper presents the design of e-LIS (Electronic Bilingual Dictionary of Italian Sign Language (LIS) and Italian), an ongoing
research project at the European Academy of Bolzano. We will argue that an electronic sign language dictionary has to fulfil the
function of a reference dictionary as well as the function of a learner’s dictionary. We therefore provide an analysis of CALL
approaches and technologies, taking as examples the CALL systems ELDIT and GYMN@ZILLA, which were also developed at the European
Academy of Bolzano. We will show to what extent these approaches and techniques can be ported to create an electronic dictionary of
sign languages, for which system components new solutions have to be found, and whether specific modules for the processing of sign
languages have to be integrated.

1. Introduction: Dictionaries of LIS
Around 50,000 people in Italy are deaf. The first language of the majority of
them is LIS, Lingua Italiana dei Segni (Italian Sign Language), but there is
also an undetermined percentage of oralist deaf people. LIS is also acquired
as a second or third language by hearing family members, teachers,
interpreters and speech therapists, amounting to about 170,000 people using
LIS, with various degrees of language competence. Unfortunately, the quality
and accessibility of LIS courses and supporting material (dictionaries, text
books, and videos) lag behind the actual need. Moreover, the official support
does not meet the high standards of other countries and does not comply with
international recommendations, e.g. Recommendation 1598 (Council of Europe
2003), which advises, among others, to broadcast television programs in sign
language, to utilize new technologies for teaching sign languages and to
include sign languages as a valid academic qualification. It is most likely
that this status quo also depends on the position of the Italian government,
which has not yet officially recognized LIS.
As for LIS dictionaries, the vast majority of them are paper-based ones, e.g.
Radutzky 1992 (752 signs, 2500 sign meanings); Angelini et al. 1991 (400
signs). The paper format, however, obviously cannot account for the
possibility of describing the three-dimensional complexity of each sign. A
first significant attempt in Italy to exploit new technologies to approach
sign languages in an innovative and more proficient way was made by the team
of Cooperativa Alba. Its members have opened an Internet portal for LIS
(DIZLIS) that now features more than 1000 video-filmed signs, which
represents a respectable size for a sign language dictionary, cf. Sternberg
1987 (3300 signs), Stewart et al. (2500 signs). Italian serves as vehicular
language and dictionary index.¹ The advantage of this presentation of signs
over the schematic and static drawings in paper dictionaries is evident and
has motivated similar projects in other countries.

2. Towards e-LIS
Most sign language dictionaries form a hybrid between a reference dictionary
and a learner's dictionary. This often occurs because sign language is
implicitly considered as the second language of a "learner's dictionary" de
facto created for the needs of hearing people. At the same time, these
lexicographic works claim to fulfil the function of a reference dictionary of
the involved sign language only in virtue of the presence of drawings and
photos representing different signs. "A major feature of such dictionaries is
the absence of definitions, it being assumed that each sign would have
exactly the same meaning(s) as the written word with which it is linked"
(Brien 1997). This sort of production treats signs as equivalents of the
words of a spoken language and neglects the complexity and the dignity of
sign language and its peculiarities in semantics and syntax.
Lexical units in a sign language differ in a number of important features
from their translational equivalents in the spoken language. These are:
• the referential extension, i.e. which objects, states and events are referred to by a word,
• the conceptualization, e.g. as Abendstern, Morgenstern or Venus (Frege 1892),
• the micro-syntax, e.g. the derivational history of a word from its bases via compounding or derivation to its final form,
• the stability with which they belong to a word class (nouns vs. verbs),
• the lexical relations they maintain, e.g. expressed as lexical functions (Melc'uk 1974) and
• the affiliation of a word to a word class which does not exist in the other language, e.g. classifiers in the sign language or functional prepositions in the spoken language.

¹ http://www.dizlis.it

59
As LIS is an autonomous language and not a mere visual representation of Italian, we designed a dictionary which describes two systems at the same time, the Italian one and the LIS one, and which can also build a bridge between them through a sort of "translating interface". In this perspective, accepting Stokoe's description of what he calls "serious dictionaries" (Brien 1998), we are greatly motivated to focus on the definition of sign meanings, which could reveal much of the deaf culture.

This accommodates two distinct user groups: (a) hearing Italian people who study LIS and who will start with an Italian query term in an Italian environment (Italian definitions, explanations, etc.); (b) LIS signers looking for a sign, who should have the possibility to formulate query terms in LIS and to work in a LIS environment.

In order to ensure the description of sign language in the sign language itself, thereby accounting for the specificity of this linguistic code2, appropriate modes of rendering it in a Web interface are required. One unexplored way of providing signs' definitions could be realized through the adoption of SignWriting (Rocha Costa & Pereira Dimuro 2003). In contrast to filmed definitions, SignWriting renders the definitions, explanations and menu buttons searchable (Aerts et al. 2004, Rocha Costa et al. 2004) and hyperlinkable. Words contained in a definition may thus be linked to lexical entries, which feature the filmed sign as their main component.

2 Cf. Les Signes de Mano, http://www.ivtcscs.org/media/mano.htm

3. ELDIT

One of the tools we already count on, and from which we intend to develop the e-LIS dictionary, is ELDIT, an electronic learners' dictionary for German and Italian. Inspired by the lexicographic research started in the '50s and by recent psycholinguistic and didactic theories (Aitchison 94, Kielhöfer 96), it covers a limited vocabulary of approximately 3,000 words for each language. It also stores a large set of information for each word entry and highly interlinked information pieces.

Figure 1 shows a screenshot of the dictionary entry for the Italian word "casa" (Engl. "house"). The left-hand frame shows the different word meanings. Each meaning is described by a definition, a translation and an example. The right-hand frame shows additional information, which depends on the selected word meaning. The collocation tab lists the most frequent collocations along with their translation and an illustrative example. In the semantic field tab, word relations (such as synonymy, antonymy, etc.) are illustrated in graphs for the learner. Verb valency is explained using colours and movable elements. Adopting a comparative approach, ELDIT also stresses specific differences between the Italian and the German language. Such differences are indicated by footnote numbers. Last but not least, each word used in the system (e.g. in the definitions or in the example sentences) is annotated with lemma and part of speech and is linked to the corresponding dictionary entry, which facilitates quick dictionary access for unknown words.

Figure 1: Dictionary entry for the Italian word "casa" (house) in ELDIT.

4. GYMN@ZILLA

A further interesting way of facing LIS and Italian is represented by Gymn@zilla, a browser-like application which integrates existing educational and non-educational modules in a new didactic environment. Gymn@zilla allows the user to access documents from the Internet and to convert their text into an easy-reader text, a glossary and a completion exercise.

Gymn@zilla is used like any browser. The program accesses a web page, identifies its language and encoding and performs a simple word-form stemming of the text. The stemmed words and expressions are then linked to their respective translations in external dictionaries. The linked lemma is marked up as an HTML tool-tip to provide an immediate translation aid even without following the external link.

Clicking on a word triggers two actions. First, the complete explanations of the external lexicon are opened. Second, the word, its context and its translation are added to a personal glossary. The learner can edit the vocabulary in this personal dictionary and use it for intentional vocabulary acquisition, as opposed to the incidental vocabulary acquisition offered by annotated reading of the web page. Last, the learner can create interactive quizzes from the personal glossary, for which Gymn@zilla automatically offers inflected, uninflected and misspelled forms to fill the gaps. Gymn@zilla handles a number of language pairs, some going from a spoken language to a sign language (e.g. English => ASL, cf. Figure 2). Through a triangulation of the translation dictionaries (e.g. Italian => English => ASL) we will give Gymn@zilla new dimensions of usage.
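To make the two mechanisms just sketched concrete, the fragment below shows in TypeScript how a single word-form lookup table could back both the tool-tip annotation of a text and the triangulation of two bilingual dictionaries into a new language pair. Gymn@zilla's actual code is not published in this paper, so the data structures, the naive stemmer and all names here are illustrative assumptions only.

    // Minimal sketch of dictionary-backed annotation and triangulation;
    // everything here is hypothetical, not Gymn@zilla's implementation.
    type Dictionary = Map<string, string[]>;   // lemma -> translations

    // Naive word-form "stemming": lower-case and strip a few endings.
    // A real system would use a proper morphological analyser.
    function stem(form: string): string {
      return form.toLowerCase().replace(/(o|a|i|e|ando|endo)$/u, "");
    }

    // Wrap every known word of a text in an HTML tool-tip carrying its
    // translation, leaving unknown words untouched.
    function annotate(text: string, dict: Dictionary): string {
      return text.replace(/\p{L}+/gu, (word) => {
        const translations = dict.get(stem(word));
        return translations
          ? `<span title="${translations.join(", ")}">${word}</span>`
          : word;
      });
    }

    // Triangulate two dictionaries, e.g. Italian=>English and
    // English=>ASL (video identifiers), into Italian=>ASL.
    function triangulate(first: Dictionary, second: Dictionary): Dictionary {
      const result: Dictionary = new Map();
      for (const [lemma, pivots] of first) {
        const targets = pivots.flatMap((p) => second.get(p) ?? []);
        if (targets.length > 0) result.set(lemma, [...new Set(targets)]);
      }
      return result;
    }

Triangulating through an ambiguous pivot word will inevitably over-generate, so a dictionary derived this way would still need editorial filtering before publication.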
Figure 2: Annotated reading with Gymn@zilla.

5. e-LIS Architecture

Hence it becomes obvious, even after this schematic analysis, that an electronic dictionary of sign language can be much more than a list of search indices, each hyperlinked to a video file. The search will start with an Italian key word, or a LIS key word entered in SignWriting, yielding a parallel list of all matching lemmas and collocations in Italian-LIS (SignWriting), similar to LEO3, developed by the Munich University of Technology, and bistro4, developed at the European Academy of Bolzano. Clicking on a word or an expression makes it a search term, possibly inverting the direction of the search. As in bistro, additional links will lead to the monolingual lexical entries.

3 https://dict.leo.org/
4 http://www.eurac.edu/bistro

The Italian entry will be close to its current form in ELDIT, which might be profitably reused for developing e-LIS (cf. Figure 1). Link texts to related entries in LIS will be rendered in SignWriting. The LIS entry will feature the filmed representation of the LIS sign. All definitions and explanations in the LIS entry will be in LIS, rendered in SignWriting. As in the Italian entry, each sign will be hyperlinked to the corresponding LIS entry. Lexical functions, e.g. classifiers and collective nouns (antelope => herd, ant => army), will be realized as hyperlinks to entries as well, as will the backward relation. Example sentences, collocations and idioms in LIS which do not have a proper lexical entry will be directly linked to the filmed sign presentation. As for the video approach, we will draw on the materials already developed for the DIZLIS site by the Cooperativa Alba.

Figure 3: SignWriting in combination with Sign Language, a vision for the e-LIS system.

Beside this kind of inner metalinguistic description, we will not forget the peculiar needs of Italian-speaking learners of LIS, who will presumably not be able to read SignWriting and will prefer videos of signs. For these users, as well as for signers studying Italian, Gymn@zilla can easily be invoked with its habitual functions:

• Italian words will be rendered as easy-reader text through video films or SignWriting;
• SignWriting will be rendered as easy-reader text through video films or Italian;
• personal word lists can be constructed;
• completion tests can be started at any time (in Italian and SignWriting);
• texts in Italian and SignWriting located on the WWW can be smoothly integrated into e-LIS, with proposed or freely selected texts, in order to allow first steps outside the e-LIS environment. In case of any doubt, Gymn@zilla will always take the user back to e-LIS to provide the necessary explanations.

In addition, the analysis of the difficulties LIS signers encounter in studying Italian (Taeschner et al., 1988; Fabbretti 2000; etc.) suggests another use a sign-language dictionary could be put to. We intend to supply the dictionary with an apparatus of contrastive grammatical macros, in analogy to the footnote numbers in ELDIT. These macros are triggered whenever a lexical entry contains critical features, e.g. semantically weak prepositions such as "di" ("of"), which cause translation difficulties for signers writing in Italian, differences in word order, etc. The lexical material of the entry and its parallel counterpart (in LIS or Italian) will be inserted into the macro and rendered from the point of view of the actual entry, yielding a comparative and synoptic description of challenging grammatical aspects of the two languages, illustrated with the lexemes of the current entry. Also in this perspective, the use of SignWriting could be particularly useful, because it permits parcelling two equivalent strings in sign language and Italian and interrelating the single syntagms/parts, thus immediately showing the similarities and differences of the two systems with the aid of colours (for the corresponding elements) and explanations in sign language in case of differences.
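As a minimal sketch of the entry model this section implies, and assuming invented field names (the project's actual data model is not given in the paper), a LIS entry could carry its filmed form, its SignWriting definition, typed links for the lexical functions, and a feature list that triggers the contrastive macros just discussed:

    // Hypothetical model of an e-LIS lexical entry; all names invented.
    interface LisEntry {
      id: string;                      // lemma key, e.g. a SignWriting form
      videoUrl: string;                // filmed citation form of the sign
      definitionSW: string;            // definition rendered in SignWriting
      italianEquivalents: string[];    // links to the Italian side
      // Lexical functions as typed links to other entries, e.g.
      // { relation: "collective", target: "herd" } for "antelope".
      lexicalFunctions: { relation: string; target: string }[];
      contrastiveFeatures: string[];   // e.g. "weak-preposition-di"
    }

    // Select the contrastive grammar macros to render with an entry,
    // in analogy to ELDIT's footnote numbers.
    function macrosFor(entry: LisEntry,
                       macroIndex: Map<string, string>): string[] {
      return entry.contrastiveFeatures
        .map((feature) => macroIndex.get(feature))
        .filter((macro): macro is string => macro !== undefined);
    }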
6. Conclusions

We have presented the rationale of e-LIS, the Electronic Bilingual Dictionary of Italian Sign Language (LIS) and Italian. A short analysis of existing sign language projects and of several CALL projects carried out in the past years at the European Academy of Bolzano has revealed that an electronic dictionary of sign language can be much more than a simple list of search indices, each hyperlinked to a video file.

While reusing some tools and options of the Italian-German dictionary ELDIT and enriching them with the many didactic functions provided by Gymn@zilla, a browser that converts Internet texts into easy-reader ones, we will develop a new type of sign language dictionary.

We hope that our system may contribute to a research area that up to now has been quite neglected in Italy, and that it could help accelerate the process which will lead the Italian government to the official recognition of LIS.
7. References

Aerts, S., Braem, B., Van Mulders, K. and De Weerdt, K., 2004. Searching SignWriting Signs, this volume.
Angelini, N., Borgioli, R., Folchi, A. and Mastromatteo, M., 1991. I primi 400 segni. Piccolo dizionario della Lingua Italiana dei Segni per comunicare con i sordi. Firenze: La Nuova Italia.
Brien, D. and Collins, J., 1997. The CD-ROM Dictionary of Deaf Community and Culture: Some Issues in the Creation of Sign Language Dictionaries. Paper presented at Empower'97: International Conference on Deaf Education. http://www.ssc.mhie.ac.uk/archive/proceedings/briencollins.html
Fabbretti, D., 2000. L'italiano scritto dai sordi: un'indagine sulle abilità di scrittura di sordi adulti segnanti nativi. Rassegna di psicologia, 1 (XVII):73-93.
Frege, G., 1892. Über Sinn und Bedeutung. Ztschr. f. Philosophie und philosophische Kritik, NF 100:25-50.
Melc'uk, I., 1974. Opyt teorii lingvističeskix modelej 'Smysl-Tekst'. Moscow: Nauka.
Radutzky, E. (ed.), 1992. Dizionario bilingue elementare della Lingua Italiana dei Segni. Rome: Edizioni Kappa.
Rocha Costa, C. A. and Pereira Dimuro, G., 2003. SignWriting and SWML: Paving the Way to Sign Language Processing. Workshop Traitement automatique des langues minoritaires et des petites langues, TALN 10e conférence, tome 2, 193-202.
Rocha Costa, C. A., Pereira Dimuro, G. and Balestra de Freitas, J., 2004. A sign matching technique to support searches in sign language texts, this volume.
Sternberg, M.L.A. (ed.), 1987. American Sign Language Dictionary. New York: Harper & Row.
Taeschner, T., Devescovi, A. and Volterra, V., 1988. Affixes and function words in the written language of deaf children. Applied Psycholinguistics, 9:385-401.
19th Century Signs in the Online Spanish Sign Language Library: the Historical Dictionary Project

Rubén Nogueira & Jose M. Martínez
University of Alicante
Ap. Correos 99, E-03080 Alicante (Spain)
[email protected], [email protected]

Abstract

This paper illustrates the work carried out in the Sign Language Virtual Library (http://www.cervantesvirtual.com/portal/signos), a project aimed at people interested in Spanish Sign Language and especially at its main users, Deaf people. It is organised into six different sections: literature, linguistics, researchers' forum, Deaf culture and community, bilingual-bicultural education, and didactic materials. Each section contains different publications related to the above-mentioned areas. Moreover, this website also offers an innovation, since every publication includes a summary in Spanish Sign Language. Two sections will be described: the Historical Dictionary published by Francisco Fernández Villabrille and the Alphabetical Writing Lessons. Our intention is to show a fully functional version of the applications described in the paper.
1. Introduction

All languages are natural phenomena for the people who constantly use them to communicate. This is also the case of deaf people and Sign Languages. For different reasons, these languages have lacked the consideration enjoyed by oral languages. In fact, certain misconceptions – that sign languages were artificially created, that they are universal expressions, that they are merely mimetic or that they were created to replace oral languages – still exist, although not as much in the field of linguistics as in the rest of society.

Several psycholinguistic studies (Bellugi, Klima & Siple, 1975) have indicated the natural acquisition of these languages, as well as the various stages or phases of development that deaf children must go through when learning to sign; these stages are similar to those established for hearing children who learn an oral language and may even occur earlier in the process (Folven & Bonvillian, 1991; Juncos et al., 1997).

Using gestures to communicate is inherent to human beings; in this sense, we could say that it is a universal tendency. However, the codes applied by users of sign languages relate to different cultural and linguistic patterns, in their phonetics, morphology, syntax… It must be emphasized as far as possible that Sign is a language like any other, because – although it may seem otherwise – most hearing people have never had contact with deaf people and are totally unaware of their reality.

We will dwell here on a very important issue concerning the study of any language: how has it been modified through time? Observing these changes, not only throughout time but in accordance with the universal tendency of the evolution of languages, it is now possible to speak of the linguistic evolution of Spanish Sign Language as the normal historical evolution of a language.

Publications available in Spain (Hervás, 1795; Bonet, 1620; Ballesteros, 1845), among others, show that, thanks to the efforts of Brother Ponce de León, the education of Deaf people started a long time ago in Spain. At the time, Deaf people were taught to acquire an oral language, Spanish or Latin, and education was a privilege available only to a few, mainly children of the nobility, who, in exchange, favoured the clergy financially. However, the following facts must be taken into account:

- until then it was commonly believed that Deaf people had no understanding or language; this way of thinking changed thanks mainly to Ponce de León;
- Ponce de León invented the manual alphabet, and this is one of the first proofs of the visual-gestural characteristics of the language of Deaf people and their means of communication.

Thanks to these publications, now included in our website, Biblioteca de Signos (The Online Spanish Sign Language Library), we know of the existence of a tradition of deaf education in Spain, begun by Pedro Ponce de León and Juan Pablo Martínez Carrión. This tendency was known in Europe as 'The Spanish School' or 'The School of the Art of Teaching The Mute To Talk'. The word 'mute' appears in almost every one of these books; the first to mention the difference between 'deaf-mute' and 'deaf' is Lorenzo Hervás, in 1795. The Spanish School was very important until the end of the 18th century.

Two of these very important books deserve particular mention: La Escuela Española de Sordomudos o Arte para enseñarles a escribir y hablar el idioma español (The Spanish School of Deaf-Mutes or The Art of Teaching Them To Write and Speak Spanish), written by the linguist Lorenzo Hervás, an important expert in linguistic typology, and the Diccionario usual de mímica y dactilología (Dictionary of Usage of Miming and Dactylology), by Professor Francisco Fernández Villabrille, which is the starting point for the Diccionario Histórico de la LSE project (The Historical Dictionary of Spanish Sign Language).

2. The Historical Dictionary Project

More than a century has passed since the first Spanish Sign Language dictionary was published by Francisco Fernández Villabrille in 1851. His work gives us a date for the formation and consolidation of this language.
The project introduced here began when we were given the opportunity of offering this text, with its translation into Sign Language, on the Internet, through the Biblioteca Virtual Miguel de Cervantes (Miguel de Cervantes Virtual Library), an ambitious project for the digital publication of Hispanic culture, with over twelve thousand digitalised books in their respective sections. In one of these sections, the Sign Language Library, we translated all these texts into Sign Language, using video combined with other elements, as we will explain in detail below. Our intention goes beyond leaving a testimony of the signs used by the Deaf in the 19th century; accordingly, when we finished presenting the book and the signs in LSE, we wanted to round off our work with later dictionaries and current signs.

In order to prepare the Historical Dictionary project, the team, composed of deaf people, Sign Language interpreters and linguists with expertise in Sign Language, first thoroughly revised the roughly 1500 signs contained in Villabrille's dictionary. In doing this, we studied the most descriptive phonologic components in the dictionary and also looked for similarities and differences with the current phonological system of SSL. There are cases of disappearance, modification or addition of phonemes, and other interesting phenomena such as variation of locations, hand shapes, movements and fluency of components (assimilation) (Woodward, 1975; Frishberg, 1975). Morphologically, we could mention resources used for gender inflexion, number, person, tense, etc., through repetition or a certain use of direction, among others. Syntactically speaking, we have less information, but we can still discuss structure and the rules for combining signs, which have on occasions undergone changes and on others stayed the same (incorporating syntactic interferences from related languages, such as oral Spanish). Semantically, we can see a transformation in the natural evolution of the linguistic situations in which the language is used, which implies that Sign Language has the capacity to develop and evolve over time, broadening or restricting meanings, borrowing from related languages, etc.

During this revision we pondered, for instance, whether the signs given in the document are still in use or whether they have disappeared; we also analysed whether or not the signs are mimetic representations. Some signs that used to be mimetic ('agile' - AGIL, 'to fall' - CAER...) no longer are; others were originally arbitrary ('cat' - GATO, 'to make' - HACER). No language is systematically mimetic. For example, let us take the old sign for 'to fall' - CAER, described in the dictionary as an imitation with the body representing the movement of the action. This representation cannot be considered linguistic. Deaf people are conscious of whether they are miming or using a linguistic sign, but this matter is somewhat more complex for researchers, as the difference is not always sufficiently clear. We have tried to distinguish the conventional from the non-conventional, although we must take into account the fact that we are dealing with a language with no register or standardization rules.

Once we had finished this analysis and agreed on the signed production of each sign (taking into account that our information was occasionally insufficient and, accordingly, some signs could not be recorded), the next step was to record the almost 1400 signs in a professional video workshop, to later capture them, treat them digitally and upload them onto the web page, where users may watch the video simultaneously with a description of the sign given in Villabrille's work and the new Alphabetic Writing System for SSL, created by the University of Alicante. We will now explain the technical matters relating to this project.

3. Technical Description of the Historical Dictionary

Some time ago, the people working on the Online Spanish Sign Language Library project expressed a wish to create an online version of the Historical Dictionary published by Francisco Fernández Villabrille in 1851. Applying some of the concepts previously used in the Online Spanish Sign Language Library, we arrived at the following design:

Figure 1: Historical Dictionary design.

First of all, the user chooses a letter of the alphabet by simply clicking on it. Next, a list appears with all the words in the dictionary beginning with that letter, accompanied by an image with the corresponding historical sign. We now choose the word that we wish to consult from the list, and the definition of the word appears. Depending on the word selected, several informative icons may be activated. Additionally, the choice of a certain word can initiate two more actions: a video reproduction of the signed representation of the selected word, and its written representation (SEA). This is, in broad strokes, the challenge issued to the computer science department of the Online Spanish Sign Language Library project.

Until then, we had a clear series of concepts regarding the integration of different technologies, such as the exchange of "messages" between a flash file embedded in a Web document, the document itself and a video file. However, although these concepts were clear, we still needed to solve the biggest problem: the local management of simple databases; after all, we were talking about a dictionary. The solution was provided by Macromedia and the capacity of Flash to manage XML documents. We had now solved the database problem in local mode, thus avoiding unnecessary host requests. All the words, definitions and associated extra information would be generated using XML rules.
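The paper does not reproduce the XML format itself; the sketch below (TypeScript, with all element names invented) only illustrates the kind of per-letter document described here and how it can be parsed locally in the browser, avoiding host requests:

    // Hypothetical per-letter XML document; the actual schema is
    // not published, so these element names are invented.
    const letterA = `
      <letter value="A">
        <word form="AGIL">
          <definition>(definition text from Villabrille, 1851)</definition>
          <video src="agil.wmv"/>
          <sea>(Alphabetic Writing System form)</sea>
          <icons stillInUse="yes" mimetic="no"/>
        </word>
      </letter>`;

    // Parse in the browser and list the word forms for the letter,
    // mirroring the local, request-free lookup described above.
    const doc = new DOMParser().parseFromString(letterA, "application/xml");
    const forms = Array.from(doc.getElementsByTagName("word"))
      .map((w) => w.getAttribute("form"));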
The application consists of four frames, one static and housing an image, the rest dynamic, at least as regards their content. In other words, the contents of the frames would change without having to refresh. Let us take a look:

Figure 2: Arrows representing the exchange of messages.

The core of the application would be located in the flash frame and would consist of a Flash application embedded in the corresponding HTML document, plus all the JavaScript code included in the frame. This code manages all the messages sent by and to the Flash file. We could say that it is the heart of the application, as it supports much of the graphic interface, manages and processes the data, and creates the messages arising from the preceding interpretation. These messages modify the contents of the video frame and the SEA frame. The changes that affect the Flash application are, obviously, self-managed and require no messages.

Thanks to the design capacities of Macromedia Flash MX, it was relatively simple to transfer what we had devised on paper onto the screen (the interface).

The internal process can be summarized as follows:
1. A letter is selected.
2. The XML document associated with that letter is loaded in the application and an image is shown with the corresponding sign.
3. A word is selected.
4. The data accompanying the selected word is processed and, depending on the content, the following processes are triggered:
   a. The video with the signed representation of the word may start.
   b. The written representation of the sign is shown in the Alphabetic Writing System (Sistema de Escritura Alfabética – SEA).
   c. The definition of the word is shown.
   d. The informative icons corresponding to the selected word are activated.

Processes a and b create a message that is interpreted by a JavaScript function (included in the HTML document where the Flash application is embedded), which modifies the video frame contents and the SEA frame contents. Processes c and d are internal to the Flash application.
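The JavaScript itself is not listed in the paper; the following sketch (TypeScript, with an invented message shape and frame names) merely illustrates the kind of dispatch function that processes a and b rely on:

    // Illustrative inter-frame dispatch; the message format and the
    // frame names "video" and "sea" are assumptions, not project code.
    interface WordMessage {
      word: string;
      videoUrl?: string;   // process (a): signed representation
      seaForm?: string;    // process (b): SEA written representation
    }

    function handleFlashMessage(msg: WordMessage): void {
      const frames = window.parent.frames as any;
      if (msg.videoUrl) {
        frames["video"].location.href = msg.videoUrl;   // start the clip
      }
      if (msg.seaForm) {
        frames["sea"].document.body.textContent = msg.seaForm;
      }
    }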
Looking at the current panorama, there are a number of web pages with online SL dictionaries out there: Silent Thunder Animations, the ASL Browser, ASL University, A Basic Guide to ASL, Handspeak and the Signhear ASL Dictionary. However, in our opinion, these attempts do not give image the importance and the presence that it must have within sign languages. Our intention was to take full advantage of the capacity of image, using the latest video reproduction technologies used on the Internet (video streaming). By reinforcing the power of image with such a powerful tool as Flash, not only did we not turn our back on the possibility of using image on the Internet but, rather, we increased its effects.

However, we would not wish to give the impression that this is the entire scope of the project: we hope to make the dictionary a point of reference and, therefore, we will continue to develop new versions of the application capable of supporting advanced term searching, a choice of different dictionaries, etc.

An essential part of our work is processing signed videos. The videos arrive at the laboratory in miniDV format, and they are captured and produced using Adobe Premiere (video editing software). After these stages, we have a high-quality version of the video, which must then be coded in order to adapt it to the size restrictions imposed by the Internet. The technology used at this stage is Windows Media Encoder, for the reason that commands can be included in the videos themselves (hypervideo). We are aware that we are working with proprietary software that may not be available to all users, but we considered that it best fits the needs of our portal. We do not reject the possibility of standardizing access to the portal, as far as possible, and it is our intention to approach these questions as soon as possible.

We will now complete the description of our work on the Online Spanish Sign Language Library with a practical example: the writing lessons.

4. Writing Lessons

In this case, the people working on the Online Spanish Sign Language Library project decided to include an online version of the writing lessons in the portal. Applying concepts previously used in the Online Spanish Sign Language Library, we arrived at the following design:

Figure 3: Writing Lessons design.
In this case, the user of the portal, having chosen a writing lesson, would be looking at a page similar in design to Figure 3. The flash frame would show the title of the lesson and, immediately afterwards, the video would start up in the video frame. Based on the contents of the explanation, the Flash application embedded in the flash frame shows different multimedia contents to reinforce the video explanation, thereby improving understanding of the lesson. It was also necessary to include a menu indicating what part of the lesson the user is in and making it possible to select different sections of the chosen lesson.

We simply had to apply the technology integration concepts mentioned in the Historical Dictionary project to give life to the proposed application.

As in the previous case, the application consists of four frames, one static and housing an image, the rest dynamic, at least as regards their content. In other words, the contents of the frames would change without having to refresh. Let us take a look:

Figure 4: Arrows representing the exchange of messages.

In this case, we used one of the special characteristics of clips encoded with Windows Media Encoder: the capacity to insert commands within the video itself. It works as follows: we first take note of the exact moments at which the video clip must give way to certain animations in the flash frame. Next, the appropriate commands are inserted at those moments. Using the JavaScript functions included in the different frames, the desired effect is obtained. The main body of the application is therefore situated in the video (in Windows Media Video format) embedded in the video frame. However, it must not be forgotten that the dynamic frames can communicate with each other, and this is what provides the necessary flexibility to be able to design and adapt another flash application: the menu, from which the user can enter the lesson and select a section.
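Windows Media's embedded player raises an event when playback reaches a command inserted with Windows Media Encoder. The handler sketched below (TypeScript; the command name and the showAnimation() function exposed by the flash frame are our own illustrative assumptions) shows how such a command could be routed:

    // Illustrative handler for a script command embedded in the clip.
    function onScriptCommand(scType: string, param: string): void {
      if (scType === "SHOWANIMATION") {
        // Ask the flash frame to display the animation that reinforces
        // this moment of the lesson.
        (window.parent.frames as any)["flash"].showAnimation(param);
      }
    }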
Looking at these cases, we can see the flexibility provided by the integration of several technologies in the process of developing multimedia applications. Each of the technologies can work alone, but they also provide the capacity to modify the state of the other technologies coexisting in the same surroundings. If we add the fact that these technologies are all specifically designed to work on the Internet, we can gauge the diffusion possibilities of a web page of these characteristics.

5. Conclusions

Our main challenge, comparing this analysis with the analysis of most oral languages, is that research on this language is more recent and, furthermore, is presented here in a new form: image. The acoustic form of oral languages, as opposed to the image form of SL, establishes not only linguistic differences but also different ways of digital treatment.

Over time we have advanced in the linguistic knowledge of SSL and its digital treatment. It is therefore possible to achieve the main target of this Sign Library: to create a broad documentary supply, essential for a language which has no accepted written register. This is why it is so important to offer a formal and standardised register, as the Sign Library does in the videos in its other sections. This work, developed over the past three years, includes literary registers in its literature section and academic texts in the linguistics section on sign language, among others.

We know that this project, unique in Spain, is particularly followed by the deaf community, eager to discover how their ancestors signed or the origin of many of the signs they use today. This is all possible through the Internet, the ideal framework for persons who do not need to hear or to be heard in order to communicate.

Acknowledgements

This paper could not have been produced without the support of the Biblioteca Virtual Miguel de Cervantes (www.cervantesvirtual.com) and the financial support of the Ministerio de Ciencia y Tecnología (Research Project no. BFF2002-010016).

References

A Basic Guide to ASL, [online], U.S.A. URL: http://www.masterstech-home.com/ASLDict.html [Cited: 30/01/2004]
ASL Browser, [online], U.S.A. URL: http://commtechlab.msu.edu/sites/aslweb/browser.htm [Cited: 30/01/2004]
ASL University, [online], U.S.A. URL: http://www.lifeprint.com/asl101/index.htm [Cited: 30/01/2004]
Bellugi, U., Klima, E. & Siple, P. (1975). Remembering in signs. Cognition 3:2, pp. 93-125.
Biblioteca Virtual Miguel de Cervantes, [online], Alicante, Biblioteca Virtual Miguel de Cervantes. URL: http://www.cervantesvirtual.com/ [Cited: 30/01/2004]
Biblioteca Virtual Miguel de Cervantes, Biblioteca de Signos [online], Alicante, Biblioteca Virtual Miguel de Cervantes. URL: http://www.cervantesvirtual.com/portal/signos [Cited: 30/01/2004]
Biblioteca Virtual Miguel de Cervantes, Biblioteca de Signos [online], Alicante, Biblioteca Virtual Miguel de Cervantes. URL: http://www.cervantesvirtual.com/portal/signos/autores.shtml [Cited: 30/01/2004]
Biblioteca Virtual Miguel de Cervantes, Biblioteca de Signos [online], Alicante, Biblioteca Virtual Miguel de Cervantes. URL: http://www.cervantesvirtual.com/portal/signos/cat_lingüística.shtml [Cited: 30/01/2004]
Biblioteca Virtual Miguel de Cervantes, Biblioteca de Signos [online], Alicante, Biblioteca Virtual Miguel de Cervantes. URL: http://www.cervantesvirtual.com/portal/signos/cat_literatura.shtml [Cited: 30/01/2004]
Bonvillian, J.D. & Folven, R.J. (1993). Sign language acquisition: Developmental aspects. In Marschark, M. & Clark, M. D. (eds.), Psychological perspectives on deafness. Hillsdale, NJ: Erlbaum, pp. 229-265.
Fernández Villabrille, F. (1851). Diccionario usual de mímica y dactilología: útil a los maestros de sordo-mudos, a sus padres y a todas las personas que tengan que entrar en comunicación con ellos. Madrid: Imprenta del Colegio de Sordo-mudos y Ciegos.
Frishberg, N. (1975). Arbitrariness and iconicity: Historical change in ASL. Language 51:3, pp. 696-719.
Handspeak, [online], U.S.A. URL: http://www.handspeak.com [Cited: 30/01/2004]
Hervás y Panduro, L. (1795). Escuela española de sordomudos, o Arte para enseñarles a escribir y hablar el idioma español, dividida en dos tomos. [Tomo I]. Madrid: Imprenta Real.
Hervás y Panduro, L. (1795). Escuela española de sordomudos, o Arte para enseñarles a escribir y hablar el idioma español, dividida en dos tomos. [Tomo II]. Madrid: Imprenta Real.
Juncos, O. et al. (1997). "Las primeras palabras en la Lengua de Signos Española. Estructura formal, semántica y contextual". Revista de Logopedia, Foniatría y Audiología, 17, 3, pp. 170-181.
Pablo Bonet, J. (1620). Reducción de las letras y arte para enseñar a hablar a los mudos. Madrid: Francisco Abarca de Angulo.
Signhear ASL Dictionary, [online], U.S.A. URL: http://library.thinkquest.org/10202/asl_dictionary_text.html [Cited: 30/01/2004]
Silent Thunder Animations Page, [online], U.S.A. URL: http://www.velocity.net/~lrose/aniasl/anisign.htm [Cited: 30/01/2004]
Woodward, J. & Erting, C. J. (1975). Synchronic variation and historical change in ASL. Language Sciences. A world journal of the sciences of languages 37, pp. 9-12.
A Language via Two Others: Learning English through LIS
Elana Ochse
Dipartimento di Ricerca Sociale, Università del Piemonte Orientale
84 Via Cavour, 15100 Alessandria, Italy.
[email protected]
Abstract
The complex intercultural activity of teaching/learning to read and write in a foreign language clearly involves a reciprocal cultural
exchange. While trying to get students to efficiently learn the language in question, namely English, the teacher adapts to her pupils’
culture and communication mode: in this case LIS or Italian Sign Language.
This paper attempts to demonstrate the complex process of developing a corpus for analysis of selected foreign language classroom
exchanges. Here our emphasis is on face-to-face communication: what is imparted to the students by the teacher in Italian, how this
information is transmitted or filtered by the LIS interpreter, what information the students eventually receive and how they react to it.
A particular example of classroom activity has been filmed, transcribed and analysed from the points of view of successful
communication, on the one hand, and failure or breakdown of exchange, on the other.
1. Introduction

A natural Sign Language, the dominant code in which face-to-face communication between Deaf people and other signers takes place, can be put on a par with an oral mode of communication (Yule, 1985; Ochse, 2004); however, in order to achieve literacy, the Deaf are obliged to learn another language with both a spoken and a written variant (usually the majority language of their area or country). Hence the "bilingual-bicultural" label which is often attached to Deaf signers (Swanwick, 1998; Prinz, 2002). Clearly the Deaf, who need a written language "to take part in the culture of the society in which they live" (Andersson, 1994), have a harder task than their hearing counterparts in learning a written language whose spoken equivalent they cannot hear. This may result in varying levels of second-language literacy.

The subjects of the present study are Deaf Italian adults who have chosen to study English as a foreign language out of personal interest and, if they are regular university students, to satisfy a credit requirement for their degree courses. A special project has been started for Deaf adults at the local university, allowing them to follow experimental all-Deaf English classes with an emphasis on written English only (i.e. reading and writing), assisted by a LIS interpreter.

From a certain point of view Italian and English are very similar, since both have a spoken and a written component. In the present situation Italian, the LIS signer's second language, is likely to be the stronger written language because of greater familiarity with it. On the other hand, English, like all foreign languages, is probably used only in classroom interactions and on some occasions in the external "linguistic landscape"1.

2. Method: Data Collection and Presentation

In accordance with linguistic anthropological research methods (Duranti, 1997), a corpus of communicative events involving classroom discourse has been filmed. Meaningful excerpts from these ethnographic records (more than 25 hours of videotaped activity) have been selected and transcribed with the help of a native LIS (Lingua dei Segni Italiana) signer and linguistic expert.

In the present paper one of these excerpts has been analysed, comprising the teacher's communication in Italian (Column A), a translation of the latter into English (Column B), the responses or reactions of the class (Column C), and the interpreter's rendering into LIS or Italian of the teacher's or students' contributions (Column D) (see Table 1 below). A comparison between Columns A/B and D, i.e. the teacher's original or translated verbal communication followed by the LIS interpreter's rendering of the latter, can give evidence of success or failure in comprehension, of language contact/interference and of leakage.

As far as the transcription of the verbal and visual texts is concerned, for clarity we have opted for the simultaneous representation of "utterances" or "speech events" in four parallel columns instead of the "musical-score" format2.

The lesson deals with the possessive form and, as is recommendable in Deaf didactics, has been enriched visually by projecting different slides on the screen. The first slide portrays a secretary in an office. Names, like the secretary, Miss Smith and Mary, have previously been written on the board, in addition to various things that could be associated with her in the photograph (e.g. PC, laptop, portable computer, office, desk). The second slide represents a woman holding a baby in her arms. Once again, different names, such as the baby, the mother and Joan, have been written on the board.

1 Cf. Elana Shohamy's paper "Linguistic Landscapes, Multilingualism and Multiculturalism: A Jewish-Arab Comparative Study", presented at an international conference on trilingualism, Tralee (Eire), 4 September 2003.
2 Lucas (2002) quotes Ehlich (1993): "the musical-score allows the sequence of events to unfold from left to right on a horizontal line …" (44).
The process is repeated with two more slides. Then the class are shown a few written examples of meaningful possessive phrases of the form proper-noun or common-noun possessor + thing/person possessed (e.g. Miss Smith's computer, Joan's baby, the baby's mother).

3. Analysis

We have opted for an utterance-by-utterance analysis in the printed column format (Column A vs. D) to see whether single communicative acts have been successful or not. The teacher explains that she has chosen a particular position so that she can point out things to the class on the screen. Then, to introduce the first possessor, she indicates the secretary, but feels the need to call on the class because she realises that their concentration is slipping. Before this interruption the interpreter has transmitted very little verbal information (GIRL), but probably sees the visual aid as an adequate alternative to a lengthy description. An image – the laptop computer – has attracted the students' attention and an animated signed conversation ensues. Since only one video camera has been used, we have to follow the signing through the interpreter's words. Initially she tells the teacher that the lesson has been interrupted by the students' conversation, but then goes into the student mode, interpreting directly what different students are signing. One student is particularly enthusiastic about the laptop and reminisces about one with two other girls. But then she apologizes bimodally to the teacher (sign + lip-pattern). She identifies the object as "everybody's dream".

When the students' conversation subsides, the teacher resumes her presentation and repeats whom she was describing before the interruption ("the secretary"). To render the idea "This girl is a secretary", the interpreter concisely transmits the information in question-answer form: GIRLpl – WORK – WHATq, followed by the brief answer SECRETARY and the fingerspelling of the Italian equivalent. Showing adequate interest in the students' previous conversation, the teacher makes reference to it and asks a question about the meaning of the acronym PC. In her rendering the interpreter fingerspells C-O-M-P-I-U-T-E-R, with an additional I, and then confuses the order of the letters PC, but quickly corrects herself.

Reference is then made to a number of phrases written as examples on the board (the secretary's computer, Mary's desk, Miss Smith's office) and containing different names for the same person (i.e. first name, common noun, title and surname). The interpreter "rewords" the message as follows: MESSAGE-MESSAGE-SAME-FIRST; EXAMPLE-SAME-BEFORE; WOMAN-NAMEpl; NAME-SURNAME-NAME-SURNAME.

At this point a student goes back to the previous discussion about the laptop and asks if it also has a CD compartment.

In the second slide, where a child and a mother are introduced, a student reads the word child on the board phonetically. The interpreter fingerspells the word bambino but then signs and mouths BOY-GIRL to show the ambivalence of the English word child.

After child and mother, the interpreter feels the need to list the following person nouns: Joan (3rd) and baby (4th). An interesting example of hybridisation occurs with GIOVANNA (the common sign name for Giovanni followed by the fingerspelling A-N-N-A).

The choice of the word bambino by the teacher for both child and baby clearly confuses the students, who ask for elucidation. The interpreter does not repeat this to the teacher but immediately starts explaining that child (bambino) can be either male or female. No mention is made of the word baby.

After this presentation of the two slides (containing examples of possessor and possessed), the teacher asks the class to write some sentences in their exercise books, one with a proper noun and one with a common noun. The interpreter renders "proper noun" as NAME-NAME-PERSON-MY and "common noun" as NAME-NAME-SAME. A stretch of interpreting follows which corresponds to silence on the teacher's part.

4. Results

The following phenomena were found in the classroom interaction represented in Table 1:
a) Bimodal communication (sign + mouthing) of everyday utterances such as YES or SORRY.
b) The interpreter's own initiative on two occasions, probably because she feared her previous interpretation had not been clear.
c) The use of facial expression, especially in questions like qREADY, qWHAT, qUNDERSTAND, qALSO CD-COMPUTER.
d) Indication of persons or things by gestures (pl) or gaze.
e) Particular LIS syntax in some questions or statements, like READYq; MEANING-WHATq; GIRL-WORK-WHATq. SECRETARY.
f) Use of fingerspelling in which Italian words are spelt with the LIS alphabet, e.g. L-A, I-L, B-A-M-B-I-N-O.
g) Expression of the plural in LIS by repeating the sign with additional body posture, e.g. SENTENCE-SENTENCE; NAME-SURNAME-NAME-SURNAME; NAME-NAME.
h) Body posture and sign: portable computer (the action of carrying accompanies the laptop bag); abbreviation (short) for Personal Computer.

5. Conclusion

If the teaching had taken place directly in LIS, i.e. without the presence of the interpreter, we could have spoken of a single linguistic filter, but in this case the presence of Italian as everybody's common language created a double linguistic and cultural filter. This increased the risk of misinterpreting information and sometimes led to the understanding of different meanings from the ones that were intended.
Column A (teacher, Italian) | Column B (English translation) | Column C (class response) | Column D (interpreter, LIS)
mi metto davanti, così I’LL STAND IN FRONT SO (pl) MUST - STAND
posso indicare le cose I CAN POINT OUT THINGS
abbiamo una ragazza WE HAVE A GIRL WHO +GAZE
che possiamo chiamare WE CAN CALL THE READYq - HAVE -
la segretaria SECRETARY GIRL
guardate Anna LOOK AT ANNA (students signing to one (interrupts) GIRL
another) (waves hands for attention)
(………………….) (……………………) si, stanno parlando. Allora
(invisible to camera) stanno … Si, in effetti, è
molto bella questa foto col
computer con la ragazza,
dice Va ad Ar e An. Ti
ricordi? E’ bello. A …
parlo del computer
portatile molto carino.
Scusa scusa (s+s). Stavo
osservando. Giusto. Sogni
di tutti. Vero. Sogno. Si, si
(s+s)
è tuo sogno. IT’S YOUR DREAM. THIS WELL ( ) – YOUR
Questa ragazza è la GIRL IS DREAM (pl) LIKE A
segretaria THE SECRETARY LOT. GIRLpl WORK
WHATq. SECRETARY.
L-A
S-E-G-R-E-T-A-R-I-A
e lei ha questa cosa AND SHE HAS THIS THERE’S – LIKES –
che piace a Va: un THING WHICH Va LIKES. Va’S SIGN NAME –
computer oppure A COMPUTER OR SIMPLY LIKES A LOT –
semplicemente con TWO LETTERS, PC. PORTABLE COMPUTER
due lettere PC _____
OR C-O-M-P-I-U-T-E-R
(sic) OR PRONOUNCE
SHORT C-P. NO P-C.
SHORT P-C
qualcuno di voi sa cosa DOES ANYONE KNOW (students signing to one MEANING WHATq
vuol dire questo PC? WHAT PC MEANS? YES. another)
Si Si, personal computer
e qua come vedete ho AND HERE, AS YOU CAN pl SEE SENTENCE
fatto una cosa simile. SEE, I HAVE DONE SENTENCE. SAME
Ho chiamato la SOMETHING SIMILAR. I FIRST SAME WOMAN
ragazza con questo HAVE CALLED THE GIRL pl NAME pl NAME
nome e le ho dato WITH THIS NAME AND I SURNAME NAME
anche un cognome e HAVE ALSO GIVEN HER A SURNAME.
ho fatto vari esempi SURNAME AND I HAVE I GIVE SHOW
simile a quelli che MADE EXAMPLES LIKE EXAMPLE ALWAYS
abbiamo già fatto THE ONES WE HAVE SAME BEFORE SAME
prima ALREADY MADE
e ho fatto possedere AND I MADE HER THEN - I - PUT (pl)
anche altre cose come POSSESS OTHER THINGS OFFICE-DESK ______
un ufficio, una SUCH AS AN OFFICE, A THEN
scrivania DESK
ALSO CD - Anche cd del computer?
COMPUTER

si YES YES - YES (emphatically)
child che vuol dire CHILD MEANING BOY OR [kilεd] ph (pl) MEAN – CHILD –
bambino o bambina GIRL (fs) B-A-M-B-I-N-O -
(pointing at the word BOY-GIRL (s+s) – BOTH
“child” on the board) - SAME
poi mother si THEN MOTHER YES (student speaking) THEN (pl) MOTHER (pl)
mamma
poi Joan, è come THEN JOAN, LIKE 3rd – J-O (fs) - SAME –
Giovanna GIOVANNA GIOVANNI A-N-N-A
(s+fs)
e poi baby che è un AND THEN BABY THAT IS 4th – B-A-M-B-I-N-O (fs)
bambino (pointing at A CHILD – MEAN - SMALL
“the mother’s baby” CHILD(ph) - BABY
on the board)
UNDERSTAND – NOT
– REPEAT
MOM(pl) OWN –
CHILD – MEAN – OWN
– B-A-M-B-I-N-O (fs)
OR-GIRL-BOY – BOTH
– SAME (nods)
volete scrivermi WILL YOU WRITE AT ( ) NOW – PLEASE –
almeno due frasi LEAST TWO SENTENCES YOU – YOU – MUST
FOR ME WRITE – TWO –
SENTENCE - (pl)
SENTENCE - WITH
con un nome proprio ONE WITH A PROPER NAME - PERSON - MY
NOUN
e con un nome comune AND ONE WITH A (pl) SECOND -
COMMON NOUN SENTENCE – PUT –
NAME – NAME – SAME
scrivetele sui vostri WRITE IN YOUR YOU. OK ( )
quaderni EXERCISE BOOKS
si? avanti OK? GO AHEAD. TO YOU
adesso voi dovete NOW YOU MUST WRITE A SO – YOU – MUST
scrivermi delle frasi FEW SENTENCES IN THE WRITE – SENTENCES –
nello stesso modo SAME WAY USING THIS TWO – DIFFERENT –
usando queste INFORMATION SENTENCE PROPER
informazioni NOUN (s+s) (mouthing
“comune”)
PERSON (pl) – HAVE –
POSSESS (pl)
GIVE – WORDS – YOU –
SEE – MEMORIZE –
ELABORATE – BUILD –
SENTENCE -
SENTENCE
BUT –NAMES - TWO – -
DIFFERENT
ONE – SENTENCE –
NOUN - PERSONAL
potete per esempio FOR EXAMPLE YOU CAN SECOND (pl) –

dire “la mamma del SAY: THE BABY’S SENTENCE – L-A (fs)
bambino” MOTHER MOTHER - OF (pl) -
BABY
oppure il bambino OR THE MOTHER’S OR I-L (fs) BABY –
della mamma o il CHILD, OR JOAN’S CHILD HAVE – OF –
bambino di Giovanna GIOVANNI A-N-N-A
MEANING USE –
PROPER NOUN –
COMMON NOUN
UNDERSTOOD
Table 1: Transcription of a filmed extract of classroom interaction (University of Turin, 16th March 2002).
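Purely as an illustration (not part of the study itself), the four-column alignment of Table 1 can be made explicit as a data structure, one record per utterance row; the field names are our own:

    // One row of Table 1 as a typed record.
    interface TranscriptRow {
      teacherItalian: string;    // column A
      englishGloss: string;      // column B
      classResponse?: string;    // column C, often empty
      interpreterLis: string;    // column D, in gloss notation
    }

    const firstRow: TranscriptRow = {
      teacherItalian: "mi metto davanti, così posso indicare le cose",
      englishGloss: "I'll stand in front so I can point out things",
      interpreterLis: "(pl) MUST - STAND",
    };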
see – conventional orthography representing spoken Italian
SEE – translation into English of spoken Italian
SEE…………. – English gloss of a LIS sign
I-SEE-YOU – single LIS sign glossed by more than one English word
S-A-W – when a word is fingerspelled, individual letters are separated by a hyphen
q – question
() – pause (shorter than two seconds)
( ) – off-topic, overlapping signing amongst learners
s – signing
(s+s) – lip-patterning and signing simultaneously
(s+fs) – signing followed by fingerspelling
ph – phonetic pronunciation
________ – use of body posture
overlap
+GAZE/-GAZE – looking at or averting gaze from an addressee or object; sometimes used as a form of placement3

Table 2: Transcription conventions (adapted and developed from Napier 2002).

3 Cf. Van Herreweghe (pp. 79-80).
6. Acknowledgements

This research is part of the Turin-based unit's contribution to the 2002 National Project (MIUR COFIN, project no. 2002104353) entitled "Strategies of Textual Recasting for Intercultural Purposes". My special thanks to Claudio Baj (who helped me with the transcriptions from LIS to Italian), my Deaf students, and Marco Gastini (for filming the classroom activities).
7. Bibliographical References

Ann, J. (2001). Bilingualism and Language Contact. In C. Lucas (Ed.), The Sociolinguistics of Sign Languages (pp. 33-60). Cambridge: Cambridge University Press.
Duranti, A. (1997). Linguistic Anthropology. Cambridge: Cambridge University Press.
Hamers, J.F. (1998). Cognitive and Language Development of Bilingual Children. In I. Parasnis (Ed.), Cultural and Language Diversity and the Deaf Experience (pp. 51-75). Cambridge: Cambridge University Press.
List, G. (1990). Immediate Communication and Script: Reflections on Learning to Read and Write by the Deaf. In S. Prillwitz, T. Vollhaber (Eds.), Sign Language Research and Application (pp. 65-75). Hamburg: Signum.
Livingston, S. (1997). Rethinking the Education of Deaf Students. Portsmouth: Heinemann.
Lucas, C. (Ed.) (2002). Turn-Taking, Fingerspelling and Contact in Signed Languages. Washington D.C.: Gallaudet University Press.
Marschark, M., Lang, H.G., Albertini, J.A. (2002). Educating Deaf Students: from research to practice. Oxford: Oxford University Press.
Napier, J. (2002). Sign Language Interpreting: Linguistic Coping Strategies. Coleford: Douglas McLean.
Ochse, E. (2004). Language – English – Difficult// Question – You – Think – What? In M. Gotti, C. Candlin (Eds.), TEXTUS XVII (pp. 105-120). Genova: Tilgher.
Prinz, P.M. (2002). Cross-Linguistic Perspectives on Sign Language and Literacy Development. In R. Schulmeister, H. Reinitzer (Eds.), Progress in Sign Language Research. In Honor of Siegmund Prillwitz (pp. 221-233). Hamburg: Signum.
Swanwick, R. (1998). The teaching and learning of literacy within a sign bilingual approach. In S. Gregory et al. (Eds.), Issues in Deaf Education. London: David Fulton Publishers Ltd.
Van Herreweghe, M. (2002). Turn-taking mechanisms and active participation in meetings with Deaf and hearing participants in Flanders. In Turn-Taking, Fingerspelling and Contact in Signed Languages. Washington D.C.: Gallaudet University Press.
Yule, G. (1985). The study of language. An introduction. Cambridge: Cambridge University Press.
Making Dictionaries of Technical Signs: from Paper and Glue through SW-DOS to SignBank

Dr. philos. Ingvild Roald
Vestlandet Resource Centre
POB 6039 Postterminalen
NO-5897 Bergen, Norway
E-mail: [email protected]

Abstract

Teaching mathematics and physics in upper secondary school for the deaf since 1975, this author has felt the need to collect signs for the various concepts. In the beginning, illustrations of signs were pasted into a booklet. Then SignWriting appeared, and signs were hand-written and later typed into the booklet. With the 3.1 version of SignWriter, the dictionary program appeared, and several thematic dictionaries were made. With the new SignBank program there are new opportunities, and I can fill in what I before just had to code. Last year a Fulbright research fellow and I were collecting signs for mathematics, and these have been transferred into a SignBank file. From that file, various ways of sorting and analysing are possible. Here these various stages are presented, with a focus especially on SignBank and the opportunities and limitations present in this program.

Paper and Glue

In Norway deaf students were educated according to the oralist method from 1880 until about 1960. Then signs were introduced in the schools through the system called 'Correct Norwegian Sign Language' (C-NSL), a system supported by the Norwegian Deaf Association. Generally, deaf students were regarded as unable to grasp abstract ideas, and they were educated accordingly. Thus, when we started to question these old 'truths' about the abilities of the deaf students, and to offer education in more subjects and at more advanced levels, there were no signs in the language for the new concepts. Discussing this problem with the deaf people on the committee who 'proposed' or 'ran' the C-NSL, we agreed that the use of available signs from C-NSL, from other Scandinavian sign languages, from Gestuno or from American Sign Language could be a basis for such technical signs. These were the sign languages for which I could get hold of dictionaries (the dictionaries are listed in the references). All signs were discussed with my deaf students, and to preserve those signs I photocopied them and glued them into a booklet, according to theme. Examples are shown in Figure 1. The process of choosing or creating the signs is described in an article published on the Internet (Roald 2000). Although the creation of signs should ideally be done by consensus in a population of native signers well versed in the topic for which the signs are to be used (Caccamise, Smith et al. 1981), this would not be possible in a population that is just entering a field of knowledge. The process that was used in our case has been reviewed by Deaf teachers later on (Roald 2002) and was deemed appropriate. It is also a fact that new technical terms are coined in all languages when new needs arise (Picht and Draskau 1985), and that often a whole new vocabulary is made by one or a few persons who are working in a field and have authority. An example is Lavoisier (1743-94), who created the vocabulary of chemistry and laid down its rules of 'grammar' for times to come.

Figure 1: 'Energy' in the paper-and-glue version. The booklet page reads: "ENERGY (symbol: E). Energy is the ability to perform work. Energy can have different forms, but the total amount of energy in the world is constant. In Norwegian, the term 'energy' may casually be used to mean 'force' (kraft); the term 'kraftkrevende industri' really means 'energy-consuming industry'; electrical power (kraft) is really electrical energy. Energy is measured in units of J = Joule; 1 J = 1 N·m = 1 W·s."

SignWriter-DOS

With SignWriting came the possibility to write the signs rather than relying on photos or drawings of signing persons. When I first met SignWriting, while visiting a Danish school for deaf pupils in 1982, I was fascinated and quickly adopted this way of preserving signs for my own benefit. With the arrival of the computer program, the SignWriter, came the opportunity to type and preserve the signs in a neat way. In its first versions the SignWriter did not have a dictionary program, but by utilising the word-processing possibilities and the possibility to write in the Roman alphabet as well as in sign symbols, it was nevertheless possible to make short lists of signs for specific themes. An example is shown in Figure 2.
[Figure 2: 'Energy' and related signs in an early SignWriter version]

SignWriter®-DOS Dictionary Program
In the early to middle 1990s I was given the task by my resource centre of developing materials for the teaching of deaf students. Signing was by now well established as the language of deaf education in Norway, even if Norwegian was still necessary for dealing with the outside world. With the 3.1 version of the SignWriter program came the attached Dictionary program. This program made it possible to create real dictionaries of signs for concepts and words. Each sign was written separately, either in the dictionary itself or uploaded from a written sign text. Each sign had to have a name, and the dictionary was sorted alphabetically by these names. Sometimes more than one sign would correspond to the same Norwegian word. They might be variant signs for the same concept, or they might be signs for different concepts covered by the same name-word. These were coded by (1), (2), (3), etc. Often a short explanation would also go into the name field. The source of the sign would also be in the field, as a coding or a short note. As the writings, or spellings, of signs are not yet established, at least not in Norwegian Sign Language, I often gave multiple written versions of the same sign. These were coded by using (i), (ii), (iii), etc. Examples are given in figure 3.

[Figure 3: 'Absolutely' in the SW 4.3 Dictionary program]

Problems with SW's Dictionary Program
The Dictionary program has a feature called 'merging'. My hope was to use this feature to build a large dictionary from several smaller dictionaries. That way it would be possible to make a dictionary of 'Spring Flowers', merge this with a similar dictionary of 'Wild Flowers', one of 'Summer Flowers' and so on, and so make a dictionary of 'Flowers', which again could be merged with other small ones to make 'Botany' and 'Biology' and 'Science' and finally 'Norwegian Sign Language'. Several attempts along this road were made, but the program would often fail in the merging process. Figure 4 shows the result of a failed attempt to merge two smaller dictionaries. This was a setback, as it is considerably harder to first make a large dictionary and then weed out everything that is not inside your chosen theme: each time a new sign is added, it has to be added separately to each of the appropriate theme dictionaries, rather than doing the merger process over at regular intervals. The building of dictionaries for signs other than my main subject, physics, therefore halted. In my computer are several small dictionaries, for marine life, for instance. They will be used to fill the Norwegian SignBank.

Another time-consuming problem, not related to the Dictionary program, has been the changes in the symbols and in which key they are allotted to. The newer versions have not been compatible with the older ones, and strange pictures have resulted from this. In addition, SignWriter and the Dictionary program cannot be run on newer computer platforms, such as Windows 2000 or Windows XP.

[Figure 4: Result of a merger failure with the SW Dictionary program. The sign called 'absorb' is really 'language'.]

The SignBank® Program
The SignBank is a very different program. It is built as relational databases in a FileMaker® environment. The program is not suited to writing signed texts, but is a toolbox for creating good sign dictionaries in a variety of styles. It can be searched alphabetically by sign names, but it can also be searched from the sign itself. In addition, it can be searched by themes or other criteria that the editor may choose to use. Chosen signs, making up a large or small dictionary, can be printed in a variety of ways. Video explanations of the signs can be added, as can photographs or drawings like those from the paper-and-glue era. Other illustrations and explanations, in spoken-language text or in signed-language text, can be added.

The recording of signs into the SignBank is rather time-consuming, but this is necessary to make the program able to sort by sign. To make this sorting by sign possible, a standard sequence of the different symbols in SignWriter,
along with a few extras from the Sutton MovementWriting® system, is established. The full version, called SSS-04 (Sign Symbol Sequence), contains a large number of symbols; a shorter and more compact version created for American Sign Language contains a smaller number of symbols, in 10 categories, divided into groups, again divided into symbols and their variations. Now, the large numbers may seem overwhelming, but the system itself is largely transparent, and the huge number stems from the fact that a hand shape can have 16 rotations and 4 'flippings' (shadings) for the right hand alone, making a total of 128 symbols for that handshape (both hands included), all neatly derived from one basic symbol. With all possible hand shapes from all the signed languages, this makes the inventory of hand symbols huge. In addition, there are the symbols for movement (again having variations in direction and size, as well as which hand is moving) and dynamics, and the symbols for the face and other parts of the body, both as articulators and as places of articulation. The symbol sequence is used both in the SignWriter programs and in the SignBank, and will constitute the International Phonetic Alphabet for the signed languages. In the SignBank, this Symbol Sequence is used to order and look up the signs by the signs themselves. Thus, it becomes possible to have a dictionary for one signed language only, with definitions and explanations in the same signed language, without having to resort to any spoken language. It will also be possible to have dictionaries between two or more signed languages.
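As an aside, the rotation/shading combinatorics mentioned above can be made concrete with a small sketch (the encoding below is invented for illustration and is not the actual SSS-04 numbering):

from itertools import product

# Hypothetical encoding of the variations of ONE base handshape symbol:
# 16 rotations x 4 'flippings' (shadings) per hand, for both hands.
rotations = range(16)
fills = range(4)
hands = ("right", "left")

variations = list(product(hands, fills, rotations))
print(len(variations))  # 128 symbols derived from a single base symbol

Enumerating two hands, four shadings and sixteen rotations yields exactly the 128 variants derived from one base symbol.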
For a sign to be recorded, it first has to be written in SW-DOS or SW-JAVA and then made into a .gif or .png file. The writing rules of signs in SignWriting are still somewhat ambiguous, as the writing rules for the different signed languages have not had time to settle into an orthography. Thus, a writer may have several ways of writing the same sign, as a way for the signing community to settle on one or the other. This I have done by using the codings (i), (ii), (iii), etc., for different ways of writing the same sign.

The term 'spelling', in SignBank parlance, means the symbols chosen for ordering the sign into the sequence, and the ordering of these symbols. For that purpose, most signs are seen as consisting of three 'syllables': starting configuration, movement, and ending configuration. The rules now are:

1. Initial dominant hand, in shape, rotation and shading
2. Non-dominant hand similarly, if that hand is taking part in the sign
3. Initial symbol for other articulators
4. Place of articulation
5. Movement of dominant hand (fingers first, then hand, etc.)
6. Movement of non-dominant hand
7. Movement of other articulators (brows, eyes, mouth, ...)
8. End dominant hand
9. End non-dominant hand
10. Dynamics

All these steps are optional, except for step 1. A few signs will have only this one symbol: most of the letters and numbers are given that way. Also, the few non-manual signs will have step 3 only. For step 4, the place of articulation, extra symbols may be required that are not written in the sign. For the written sign, the placement in relation to the body is given by the structure of the written sign and its relation to the guiding (imagined) lines in the text. These are not part of the spelling for entering the sign into the SignBank. For use whenever necessary, symbols depicting the body and the place of articulation are part of the SSS.

Once all the relevant symbols are entered into the spelling of the sign, the sign should be saved into the bank. A copy of the sign and the spelling can be made, for use with other signs that share the same features. Sometimes the exact same sign will cover more than one word in the spoken language (as in any two-language dictionary). Editing can also be done on this copy, so that a sign varying from the first in one or a few of the symbols can be entered with less entering work.

[Figure 5: Sign spellings of 'absolutely']
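To make the role of the spelling concrete, here is a minimal sketch of how an ordered list of symbol codes, built following steps 1-10 above, can serve as a sort key; the numeric codes are invented for the example and are not actual SSS-04 sequence numbers:

# Each spelling is an ordered list of (invented) SSS sequence numbers,
# built following steps 1-10; only step 1 is obligatory.
signs = {
    "absolutely": [105, 233, 410, 520],
    "mathematics": [105, 117, 415],
    "energy": [98, 301],
}

# Sorting by the spelling tuple orders the bank by sign, not by name-word.
for name, spelling in sorted(signs.items(), key=lambda kv: kv[1]):
    print(spelling, name)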
In addition to entering the spelling of the sign and a word-name for it, the editor has the possibility of entering linguistic data, the sign in a sign context, or a sign-text explanation of the sign, as in any one-language dictionary. It is also possible to make dictionaries covering more than one signed or spoken language. Video, animations and still-picture illustrations may also be added. In all, the possibilities are available for
whatever one may want, as the program itself may be augmented with new features in new related database files. Below are shown a few of the features of the program.

[Figure 6: Linguistics page for 'mathematics']

[Figure 7: Page with snapshot from video]

[Figure 8: Parts of a dictionary, sorted by sign or by word]

In a short article like this one, it is not possible to present all the features of a program like the SignBank. Suffice it to say that the program opens a new era in dictionary creation for the signed languages, combining the written sign, sign illustrations, video of the sign and of the sign in context, and translation between signed languages and between signed and spoken languages. We have only begun to scratch these possibilities.

References
Author unknown (1975). GESTUNO: International Sign Language of the Deaf / Langage Gestuel International des Sourds. Carlisle, UK: The British Deaf Association.
Caccamise, F. and H. G. Lang (1996). Signs for Science and Mathematics: A Resource Book for Teachers and Students. Rochester, NY: Rochester Institute of Technology.
Caccamise, F., N. Smith, et al. (1981). "Sign Language Instructional Materials for Speech, Language and Hearing Professionals." Journal of the Academy of Rehabilitative Audiology 1981(14): 33-61.
Norsk/nordisk tegnspråkutvalg (1976). Nordisk Tegnspråk, 155 tegn vedtatt på Nordisk tegnspråkseminar i Malmø 1975. Bergen, Norway: Norske Døves Landsforbund.
Picht, H. and J. Draskau (1985). Terminology: An Introduction. Guildford: University of Surrey, Department of Linguistic and International Studies.
Roald, I. (2000). Terminology in the Making - Physics Terminology in Norwegian Sign Language. The Deaf Action Committee for SignWriting. URL: http://www.SignWriting.org/forums/linguistic/ling032.htm
Roald, I. (2002). "Reflection by Norwegian Deaf Teachers on Their Science Education: Implications for Instruction." Journal of Deaf Studies and Deaf Education 7(1): 57-73.
Tegnspråksutvalget (1988). Norsk Tegnordbok (= Norwegian Sign Dictionary). Bergen, Norway: Døves Forlag.
Searching SignWriting Signs
Steven Aerts∗ , Bart Braem∗ ,
Katrien Van Mulders† , Kristof De Weerdt†

University of Antwerp
Campus Middelheim, Building G
Middelheimlaan 1
B 2020, Antwerp, Belgium
{bart.braem, steven.aerts}@ua.ac.be


University of Ghent
Rozier 44
B 9000 Gent, Belgium
{katrien.vanmulders, kristof.deweerdt}@ugent.be
Abstract
At the moment the publication of the first written Flemish Sign Language (VGT) dictionary is in progress. It consists of VGT glossaries and allows its users to look up the signs for over 2000 Dutch words. The signs are written in SignWriting.
We have established an electronic representation of this sign language dictionary. Searching for signs starting from a Dutch word is straightforward. The opposite, searching from a sign and receiving results ordered by relevance, has never been developed before. In this paper we explain how we have worked out such a system.

1. Introduction
We have developed an online database-driven dictionary system currently containing about 5000 signs (Aerts et al., 2003). These signs were (and are) collected by researchers of the University of Ghent and written down in SignWriting using SignWriter DOS (Gleaves). Our system can convert these binary files to SWML, an XML-based SignWriting representation language (da Rocha Costa and Dimuro, 2003). The major advantage of SignWriting in general and SWML in particular is that it is very lightweight and thus ideal for the web. Our database is modelled on the SWML structure to contain exactly the same information.

The SignWriting system itself is a practical visual writing system for deaf sign languages, composed of a set of intuitive graphical-schematic symbols and simple rules for combining them to represent signs (da Rocha Costa and Dimuro, 2003). It was invented by Valerie Sutton, who was inspired by the choreographic writing system she had created earlier, called DanceWriting (Sutton, a; Sutton, b). SignWriting symbols represent the body parts involved and the movements and facial expressions made when producing signs. This way, in an electronic representation, each symbol is stored along with its transformations: translation, mirroring and rotation.

In this paper we provide an outline for constructing an intuitive, user-friendly, yet powerful search-by-sign system for SignWriting. All information can be extracted from the signs only, without external information, e.g. the position of the hand or the dominant hand.

2. The manual approach
Searching using only the SignWriting signs has never been done before. Searching for the meaning of a sign in a database containing filmed signs manually enriched with extra semantic information, however, is common practice. It usually consists of selecting the type and direction of the movement, the location on the body where the sign is made and finally the hand form. This information needs to be added to each individual sign, causing a big slowdown in dictionary development due to tedious manual work. When creating a huge, constantly evolving dictionary this is highly undesirable.

3. Semantic view on SignWriting
The first question we should ask ourselves is whether the following common search information can be extracted from SignWriting signs:

• Type of movement: the movements are described pretty well, as they are represented by different symbols; but whereas it is rather easy for human beings to find the only physiologically feasible possibility, it is difficult to find the matching moving body part with a computer.

• Direction of the movement: this is almost impossible to extract. Because of the two-dimensional representation in SignWriting, the difference between horizontal and vertical movements is - again - only easily detectable by human beings.

• Location: a coarse-grained distinction between zones of positions should be possible, but only when a body part is touched. When no body part is touched, the location can only be extracted from the SignWriting symbols by considering the most likely physiological positions.

• Hand form: this is very accurately defined in SignWriting through the use of different symbols. Hand forms will obviously be the key feature to search by.
4. Elaboration
4.1. User input
The user first specifies the hand form. Then he specifies which body zones are touched (head, torso or legs) and the way in which they are touched (touch, grasp, in-between, strike, brush contact or rub contact). It is also possible to specify the orientation of the hands (palm up, back of the hand up or side view). This does not change the essence of the search.

4.2. Processing
Selecting all signs that include the specified hand form(s) is obvious. The different types of touching are also very well depicted in SignWriting, but one type of touching has multiple variations. The body zones involved and the types of contact are accurately specified. If multiple zones or contacts are involved, matching them is difficult. It is, however, possible to parametrise the goodness of a match. The goodness-measure we would like to use is the product of the squared distances between each touch and the middle of the corresponding body zone. When measuring the goodness of a match of $n$ contacts $c_i$ ($i = 1..n$) and the corresponding body zones $z_i$, this results in the following function:

$$\mu(c_1, \ldots, c_n, z_1, \ldots, z_n) = \prod_{i=1}^{n} \left( (c_{ix} - z_{ix})^2 + (c_{iy} - z_{iy})^2 \right)$$
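In code, the measure and the resulting ranking could look as follows (a sketch with our own naming, not the actual implementation):

def goodness(contacts, zone_centres):
    """Product of squared distances between each contact and the
    middle of its corresponding body zone; lower is better."""
    mu = 1.0
    for (cx, cy), (zx, zy) in zip(contacts, zone_centres):
        mu *= (cx - zx) ** 2 + (cy - zy) ** 2
    return mu

# Rank candidate signs: the closer the match, the lower the measure.
candidates = {"sign_a": ([(10, 12)], [(11, 12)]),
              "sign_b": ([(10, 12)], [(30, 40)])}
ranked = sorted(candidates, key=lambda s: goodness(*candidates[s]))
print(ranked)  # ['sign_a', 'sign_b']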
4.3. Results
We are able to order matches using this measure: the closer the match, the lower the goodness-measure. Dropping very bad matches, which have a very high goodness-measure, is also possible. The ordering does not happen in a natural way (Butler, 2001; Sutton, 2004), because for that to be possible, manually added information about e.g. the dominant hand would be necessary.
5. Issues
5.1. Precision & Recall
Precision is defined as the ratio between the relevant retrieved signs and all retrieved signs, whereas recall is the ratio between the relevant retrieved signs and all relevant signs.
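In code form (a generic sketch, not part of the system described here):

def precision_recall(retrieved: set, relevant: set):
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

print(precision_recall({"a", "b", "c"}, {"a", "c", "d"}))  # (0.66..., 0.66...)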
In this case: is the user able to clearly specify what he wants? If this is problematic, usability can be improved by broadening the search to closely matching symbols.

5.2. Performance
Selecting signs through the right symbols and contacts is a database issue where performance is not at stake: a well-designed database will process those queries fast enough. Determining the body zones is rather straightforward and can also be done in the database. If we see one thick black horizontal line, for example, we know the chest is meant, whereas two lines depict the legs and hips. Figure 1 illustrates this principle.

[Figure 1: Determining the body zones]

Calculating the goodness-measure will be done over a very limited number of matching body symbols and contacts: a sign containing four contacts is extremely rare. The number of comparisons will be low and will not affect the global performance by an order of magnitude.

6. Future work
We do not have an implementation of the search algorithm right now, which is the only missing part in our dictionary system. We expect to have a working system ready soon, because the method we have described is pretty straightforward to implement. The structure of our system is built with the algorithm in mind. The exact mapping of SWML onto the relational database prevents loss of information about the signs. Every single symbol can be traced back to its containing sign, allowing fast lookups of relevant signs.

7. Conclusion
The great advantage over existing systems is the fact that all information originates from the signs only. This system is intuitive for a user with basic SignWriting knowledge. Its friendliness will largely depend on the interface used, but it can be improved with the goodness measure and a broadened search. The real strength of our system lies in the use of the very well specified SignWriting hand forms, which compensates for the vague movements. Because of the use of databases, SWML and relatively simple calculations, this method is also straightforward to implement. Most important, the Deaf Community and its researchers will benefit from this new search method, since it allows for easier dictionary searching. Moreover, the system can be used as an online reference for the meaning of SignWriting signs.
8. References
Aerts, Steven, Bart Braem, Kristof De Weerdt, and Katrien Van Mulders, 2003. http://plantijn.ruca.ua.ac.be/~aertsbraem/interface/ - Vlaams Gebarenwoordenboek.
Butler, Charles, 2001. An ordering system for SignWriting. The SignWriting Journal, 0(1).
da Rocha Costa, Antonio Carlos and Gracaliz Pereira Dimuro, 2003. SignWriting and SWML: Paving the way to sign language processing.
Gleaves, Richard. http://signwriting.org/forums/software/sw44/ - SignWriter DOS computer program.
Sutton, Valerie, a. http://www.dancwriting.org/ - DanceWriting: Read and write dance.
Sutton, Valerie, b. http://www.signwriting.org/ - SignWriting: Read, write, type sign languages.
Sutton, Valerie, 2004. Sutton's SignSpelling Guidelines 2004. SignWriting Library Database, [email protected].
Sign Printing System - SignPS
Inge Zwitserlood
Viataal
Theerestraat 42, NL-5271 GD Sint Michielsgestel, The Netherlands
[email protected]

Doeko Hekstra
Handicom
Oranjelaan 29, NL-3843 AA Harderwijk, The Netherlands
[email protected]

Abstract
The development of the Sign Printing System (SignPS) is based on the need of sign language users and teachers for a way to compose pictures of signs without having considerable drawing skills, to store these pictures in a database, and to retrieve them at wish for several purposes. The sign pictures are abstract but nevertheless recognizable without specific training. The program is not developed for scientific purposes, but for use by the general (signing) public.

1. 'Drawing' sign pictures
Like spoken languages, sign languages often require a static representation that can be used in print and processed at one's own pace. Currently, several types of static representations are used to represent single signs, and sometimes also sign sequences: photographs, drawings, glosses and several notation systems. The reason for yet another system is that the existing systems have several disadvantages, which are overcome by SignPS.

Glosses, first, have the disadvantage of not giving information on the shape of a sign. Second, since glosses are labels taken from spoken languages, whose grammatical structure is often considerably different from that of the sign language, much of the information that is present in a sign cannot be expressed by words or affixes of these spoken languages. Various subscripts and superscripts are then needed to represent this information.

Disadvantages of photographs and many drawings are the unnecessary details they show (clothes, hairstyles), which can distract the onlooker from the message. Photographs show particular persons, drawings have particular styles. As a result it is seldom possible to use separate photographs and drawings to construct coherent representations of sign strings. Furthermore, most people's drawing skills are not sufficient to make drawings of signs, and photographs require special equipment and additional adaptations in order to represent the dynamic part of signs (such as the movement of the hands).

Thirdly, although most notation systems (e.g. SignWriting, HamNoSys, KOMVA) do not entail these problems, they are not user-friendly for common language users in general, because special training is needed to learn to use them and, more importantly, in some groups of sign language users there is a general resistance against the use of such systems for common use.

The Sign Printing System overcomes these problems by offering everyone with basic sign language skills a tool for the quick and easy construction of sign pictures. The program opens with the contours of the head and shoulders of a signer. Handshapes can be chosen from a limited set of handshapes and added to the picture. These can be moved, copied, rotated or mirrored into the desired orientation.

[Figure 1: Components from restricted sets can be added to a sign picture.]
In the same way, arrows and other movement symbols can be chosen from limited sets, added to the picture and edited. Furthermore, particular facial expressions are composed by choosing and/or editing face components: eyes, eyebrows, mouth and nose. Subsets of these sign components are shown in Figure 1.

3-Dimensionality is suggested by 3-dimensional movement block arrows for movements towards and away from the signer, and by varying the size of the hands. A large-sized hand gives the impression that it is closer to the onlooker than a small-sized one, as illustrated in Figure 2.

[Figure 2: 3-dimensionality in a 2-dimensional picture (NGT sign for 'tube on shoulder')]

These sign pictures are rather abstract in that they only contain the minimal number of components necessary for understanding the sign. Because of this abstraction, they can also be easily combined to form sign strings. On the other hand, the abstraction in the sign pictures is not so extreme that special training is required to learn to recognize the signs.

At present, the Sign Printing System contains the sign components needed for signs from Sign Language of the Netherlands (henceforth: NGT). It will be fairly easy for the developers to make adaptations for other sign languages (such as different sets of handshapes).

2. Storage and retrieval
The Sign Printing System has in common with photographs and drawings that the sign pictures are stored as whole units. Once stored, a user can retrieve the sign pictures as whole units and does not need to compose a particular sign picture anew every time it is needed.

An innovative part of the Sign Printing System is the database. A sign picture that is stored in the database must be connected to a concept and to a gloss. In case of synonymic signs, it is possible to connect more than one sign picture to one concept (and one gloss). For instance, NGT has several synonymic signs meaning 'good' that can all be stored with the same gloss. This facilitates retrieval, since synonyms will not be overlooked. A sign picture can also be connected to more than one gloss (depending on the language from which the gloss stems). For instance, NGT has only one sign meaning 'cat', whereas Dutch has two words with that meaning, kat and poes. The sign picture can thus be labelled with both glosses, but still be stored as one picture. A number of concepts are present in the database from the start. They are ordered in a semantic hierarchy. Although it is not possible to change the database structure, a user can add concepts and glosses to the database and even add categories to the semantic structure.
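As an illustration, the concept-gloss-picture relations can be sketched as follows (a simplified model with invented names, not the actual Symbol for Windows database schema):

# One concept can have several sign pictures (synonyms), and one picture
# can carry several glosses (e.g. Dutch 'kat' and 'poes' for the NGT sign CAT).
concepts = {
    "GOOD": {"pictures": ["good_1.spn", "good_2.spn"], "glosses": ["goed"]},
    "CAT":  {"pictures": ["cat.spn"], "glosses": ["kat", "poes"]},
}

def find_by_gloss(gloss):
    return [c for c, entry in concepts.items() if gloss in entry["glosses"]]

print(find_by_gloss("poes"))  # ['CAT'] - synonyms are not overlooked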
Retrieval of sign pictures is possible in three ways. First, the database can be searched by gloss name, which is a common way of retrieval in many sign databases. Second, a user can search for sign pictures within the hierarchically structured semantic fields in the database. By choosing a particular semantic field, the user is shown the subset of the gloss names of the signs that are in this field in the database. This is illustrated in Figure 3.

[Figure 3: Searching within the semantic field of 'celebration activities']

Third, a user can search sign pictures by selecting components of the signs, viz. handshape(s) and/or place of articulation. For instance, in Figure 4 the results are shown of a search operation for a sign with a particular handshape that is made in front of the chest. (The particular orientations of handshapes in signs are not taken into account in the handshape search facility.)

[Figure 4: Searching by sign components]

3. Use of the sign pictures
The Sign Printing System is part of a range of software applications using communication symbols and
databases (sharing the same format) holding these symbols (such as Bliss, PCS and Picture This), called Symbol for Windows. Among the applications are a plain word processor, an email program and several educational tools. The sign pictures stored in the database of the Sign Printing System can be used directly in these applications. Retrieval of these pictures is fast; additionally, other elements (pictures, photographs or symbols) can be retrieved from the connected databases and used in the same application. For instance, one can combine the picture of the sign for 'cat' with a picture of a cat, and/or with the Dutch word kat. The combined use of sign pictures and a photograph is illustrated in an item of a multiple-choice test in Figure 5.

[Figure 5: Multiple-choice test for NGT idiom with SignPS pictures]

Common Windows programs do not have direct access to the databases. The sign pictures can be retrieved from the database and stored as graphic files with an export tool that is included in Symbol for Windows. These pictures can easily be inserted in applications such as Word or PowerPoint. The sign pictures can also be cut and pasted into these applications. An example of a lecture using an NGT sentence is shown in Figure 6.

[Figure 6: PowerPoint presentation with SignPS pictures]

4. Further developments
The Sign Printing System is still under development. It has not yet been used and tested by large user groups. A first working version has been distributed, and a pilot course in the use of the program has recently been taught to a small user group (NGT teachers and speech therapists at Viataal). This group is currently evaluating the program, and their first reactions are very positive. A preliminary inventory of desirable adaptations shows that the set of viewing angles of handshapes should be extended and that the user-friendliness needs to be slightly improved.
A New Gesture Representation for Sign Language Analysis
Boris Lenseigne∗ , Frédérick Gianni∗ , Patrice Dalle∗

IRIT
Université Paul Sabatier 118 route de Narbonne 31062 Toulouse cedex 4
{lenseign,gianni,dalle}@irit.fr

Abstract
Computer-aided human gesture analysis requires a model of gestures and an acquisition system which builds the representation of the gesture according to this model. In the specific case of computer vision, those representations are mostly based on primitives described from a perceptual point of view, but some recent issues in Sign language studies propose to use a proprioceptive description of gestures for sign analysis. As this also helps to deal with ambiguities in monocular posture reconstruction, we propose a new representation of gestures, based on the angular values of the arm joints, computed by a single-camera computer vision algorithm.

1. Gesture representation
1.1. Previous work
Most descriptions of Sign language vocabulary rely on linguistic studies and are those used in notation or transcription systems such as SignWriting¹ and HamNoSys² (Prillwitz and al., 1989). In the case of computer-aided Sign language analysis, we distinguish systems using specific hardware such as data gloves (Braffort, 1996; Starner T., 1998; Vogler C., 1999) and those using cameras. Data-glove-based applications process directly the values provided by the sensors. In the case of a computer vision system, the gesture model can be bidimensional or tridimensional. When several cameras are used (Wren C., 1999; Vogler C., 1998; Somers M.G., 2003), 3D reconstruction is possible and gestures can be analyzed directly in 3D space. In the case of a single camera, gesture analysis can be performed directly in 2D images (Starner T., 1998; Tanibata N., 2002), or some additional image processing has to be performed for a partial 3D estimation. In this case, the visual aspect of gestures is deduced from 3D models (Horain P., 2002; Athitsos V., 2004), or a 3D model is used to constrain the reconstruction (Lenseigne B., 2004). Both solutions lead to ambiguities.

Thus notation systems and vision-based gesture analysis systems use a representation of signs derived from a tridimensional perceptual description. Gestures are located in a speaker-centered frame (ScF) (fig. 5) but described from an external point of view. Those descriptions are based on the definition of a set of elementary motions and, for each elementary motion, a set of parameters. Widely used primitives are straight, curved and complex motions. Such a classification is only suitable for standard vocabulary description; it leads to a classification of gestures in terms of geometrical primitives and to a description of gestures from the observer's point of view.

1.2. A new gesture representation
A different way to represent gesture is to use a proprioceptive point of view. In such a case, motion analysis and classification rely on the way the gesture is performed. This approach is presented in recent linguistic research (Boutet, 2001), which suggests that an articulation-based representation may have appropriate properties to allow the representation of the function of the gesture. Therefore, using joint values to represent gesture is an interesting choice. This assumption leads us to propose a method, based on a single camera, to compute a gesture representation based on the evolution of joint angles in time.

2. Computing articulation values from a single image
The calculation of articulation values is performed in two stages: a geometrical reconstruction of the 3D posture of the arm, and the computation of the corresponding articulation values. As we use a single camera, a direct 3D reconstruction of the arm is not possible, and the geometrical method provides us with a set of four possible configurations of the arm in a given image. A configuration is represented by the 3D Cartesian coordinates of each joint (shoulder, elbow and wrist). Those coordinates are grouped together to form a set of four possible motions for the arm, and joint values can be computed for each trajectory to build an articulation-based motion representation.

2.1. Geometric resolution
In this section we describe how to reconstruct a set of possible 3D poses of a human arm from a single picture, using a single calibrated camera. A pose is defined by the position of each segment's limits (shoulder, elbow and wrist) in Cartesian coordinates. Given an image, we are able to reduce the space of possible poses for the arm to four configurations, using only a simple model of the scene and the camera, and some assumptions about them.

2.1.1. Requirements
Our technique is based on several assumptions, which may be crippling in an uncontrolled environment. However, they could be relaxed if the reconstruction can be performed up to a scale factor, which does not affect the joint value computation.

¹ http://www.signwriting.org/
² http://www.sign-lang.uni-hamburg.de/Projects/HamNoSys.html
Acquisition device: The acquisition device is made up of a single camera, which has been calibrated in order to be able to calculate the equation of the projective ray across a given pixel; this supposes that the perspective transformation matrix C is known. Many calibration techniques have been proposed previously, for instance in (Gurdjos P., 2002) or (Heikkilä, 2000).

Tracking the articulations: We also make the assumption that we are able to identify the 2D positions of the three articulations of the arm in the image. Tracking techniques abound and depend on the problem to solve (degree of homogeneity of the background, the use of markers, motion models, etc.). The precision needed for tracking depends on the precision needed for reconstruction. A study of the influence of tracking errors on reconstruction can be found in (Lenseigne B., 2004).

Arm pose: We only speak here about rebuilding the posture of an arm, without considering the hand. Within the framework of a geometric resolution, we define the posture by the position in space of the articulations (shoulder, elbow, wrist), i.e., if coordinates are expressed in the camera frame:

• for the shoulder: $P_1 = (X_1, Y_1, Z_1)^T$
• for the elbow: $P_2 = (X_2, Y_2, Z_2)^T$
• for the wrist: $P_3 = (X_3, Y_3, Z_3)^T$

Using this representation, estimating the posture of the arm is reduced to the calculation of three points in space.

2.1.2. Geometrical model of the arm
The arm is modeled by the articular system connecting the shoulder to the wrist. This system consists of articulations (a ball-and-socket joint for the shoulder and a revolving joint for the elbow) connecting rigid segments (arm and forearm) noted $l_i$, the segment $l_i$ connecting the articulations $P_i$ and $P_{i+1}$. The position of the final body corresponds to the wrist position, i.e., to $P_3$, the end of segment $l_2$. Since those articulations allow only pure rotations of the segment $l_i$ around the articulation $P_i$, we can define the set of positions reachable by the articulation $P_j$ ($j = 2, 3$) as a sphere centered on the preceding articulation $P_{j-1}$ and whose radius is $\|l_i\|$³ (cf. figure 1).

[Figure 1: Model of the articular system of the arm. The sphere represents the set of the possible positions for the elbow.]

Using this model, the reachable space for each articulation position becomes a sphere whose parameters are known if we determine the position of the preceding articulation and the length of each segment of the arm, which means that we have to know, from the beginning, the 3D position of the shoulder. This can be problematic in an uncontrolled environment. However, when the problem is to obtain qualitative or relational values, or for the calculation of angular values, a reconstruction up to a scale factor can be sufficient. The position of the shoulder can then be fixed beforehand. Identically, the dimensions of each segment can be fixed arbitrarily as long as the ratio of their respective lengths is respected.

2.1.3. Algorithm
The method we present exploits a simple geometrical model of the scene, and especially of the structure of the arm. We suppose that the coordinates of the points corresponding to the articulations are known in the image. They can be written in homogeneous coordinates as:

• for the shoulder: $\tilde{p}_1 = (u_1, v_1, 1)^T$
• for the elbow: $\tilde{p}_2 = (u_2, v_2, 1)^T$
• for the wrist: $\tilde{p}_3 = (u_3, v_3, 1)^T$

After the calibration of the camera, we can compute for each point in the image the associated projection ray, which is the line (passing through the optical center and the considered image point) containing the 3D counterpart of this point.

Set of possible configurations for the elbow: Knowing the (possibly arbitrary) position $P_1$ of the shoulder in space, the set of possible positions for the elbow can be defined as the sphere $S_1$ centered on the shoulder and whose radius is the length $\|l_1\|$ of the arm. The Cartesian equation of such a sphere is:

$$(X_1 - x)^2 + (Y_1 - y)^2 + (Z_1 - z)^2 - \|l_1\|^2 = 0 \quad (1)$$

Equation of the projection ray: $\tilde{p}_1$ is the position of the shoulder in the image, expressed in homogeneous coordinates. The calibration of the camera gives us the perspective transformation matrix $C$ defining the transformation from a 3D frame associated with the camera⁴ to the 2D image frame⁵. The matrix defining the perspective transformation which forms the image is traditionally written as follows:

$$C = \begin{pmatrix} f k_u & 0 & u_0 \\ 0 & f k_v & v_0 \\ 0 & 0 & 1 \end{pmatrix} \quad (2)$$

where:
• $f$ is the focal length;
• $k_u$, $k_v$ are the horizontal and vertical scale factors, in pixels/mm;
• $(u_0, v_0)$ is the position of the principal point in the image frame (the projection of the optical center of the camera).

³ $\|l_i\|$ is the norm of the segment $l_i$.
⁴ The origin of this frame is at the optical center.
⁵ The origin of the image frame is in the upper left corner of the image.
This matrix lets us deduce the position in the image frame of a point $\tilde{p}_i = (u_i, v_i, 1)^T$, the projection of a point whose coordinates are expressed in the camera frame, $P_i = (X_i, Y_i, Z_i)^T$:

$$\begin{pmatrix} u_i \\ v_i \\ 1 \end{pmatrix} = \begin{pmatrix} f k_u & 0 & u_0 \\ 0 & f k_v & v_0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} X_i \\ Y_i \\ Z_i \end{pmatrix} \quad (3)$$

The inverse of this matrix is used to calculate, for each point $p_i$ in the image, the equation of the associated projection ray in space. The projection ray is the line passing through the focal point of the camera and the considered point in the image plane. The original 3D point is necessarily located on this line. Here is a parametric equation of the projection ray, where $\lambda$ is a simple multiplying coefficient:

$$R_i(\lambda) = \lambda \tilde{p}_i \quad (4)$$

$\tilde{p}_i$ represents the coordinates of the image point in the camera frame:

$$\tilde{p}_i = C^{-1} p_i, \quad \text{with } C^{-1} = \begin{pmatrix} \frac{1}{f k_u} & 0 & -\frac{u_0}{f k_u} \\ 0 & \frac{1}{f k_v} & -\frac{v_0}{f k_v} \\ 0 & 0 & 1 \end{pmatrix} \quad (5)$$

so that:

$$\tilde{p}_i = \begin{pmatrix} \frac{u_i - u_0}{f k_u} \\ \frac{v_i - v_0}{f k_v} \\ 1 \end{pmatrix} \quad (6)$$

Therefore the 3D position we search for is the intersection of the surface of the sphere $S_1$, defining the set of possible configurations for the elbow, with the projection ray $R_i(\lambda)$. Calculating those intersections in the camera frame consists in determining the values of $\lambda$ such that:

$$\left(X_1 - \lambda \frac{u_i - u_0}{f k_u}\right)^2 + \left(Y_1 - \lambda \frac{v_i - v_0}{f k_v}\right)^2 + (Z_1 - \lambda)^2 - \|l_1\|^2 = 0 \quad (7)$$

This is a second-degree polynomial $a\lambda^2 + b\lambda + c = 0$ whose coefficients are:

$$a = \left(\frac{u_i - u_0}{f k_u}\right)^2 + \left(\frac{v_i - v_0}{f k_v}\right)^2 + 1; \quad b = 2\left[\frac{u_i - u_0}{f k_u}(-X_1) + \frac{v_i - v_0}{f k_v}(-Y_1) - Z_1\right]; \quad c = X_1^2 + Y_1^2 + Z_1^2 - l_1^2 \quad (8)$$

Solving this polynomial gives two possible values for $\lambda$ (possibly a single double one); the possible positions $\hat{p}_{2,j}$ ($j = 1, 2$) for the elbow then follow directly, since $R(\lambda) = \lambda \tilde{p}_i$.

Using the same technique, we are able to calculate the possible positions $\hat{p}_{3,j}$ ($j = 1..4$) of the wrist, considering the two spheres whose centers are given by the estimated positions of the elbow and whose radii are given by the length of the forearm. We can calculate, for each possible position of the elbow, two possible positions for the wrist, and thus four possible configurations for the arm.

This algorithm allows us to reduce the set of possible configurations for an arm to four possibilities for a single image. The elbow positions are symmetric with respect to a plane parallel to the image plane and containing the shoulder. The calculation of the wrist position is performed from each possible elbow position, so that we obtain four possible positions for the wrist. In the same way as for the elbow, each couple of solutions is symmetric with respect to a plane parallel to the image plane and containing the corresponding elbow position.
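As an illustration, this reconstruction step can be coded directly from equations (6) and (8); the sketch below uses our own function names and assumes the calibration parameters f·ku, f·kv, u0 and v0 are known:

import math

def elbow_candidates(shoulder, pixel, fku, fkv, u0, v0, l1):
    """Intersect the projection ray of the elbow pixel with the sphere
    of radius l1 centred on the shoulder; returns 0, 1 or 2 points."""
    X1, Y1, Z1 = shoulder
    u, v = pixel
    # Back-projected ray direction (equation 6): R(lam) = lam * p_tilde
    px, py, pz = (u - u0) / fku, (v - v0) / fkv, 1.0
    # Coefficients of a*lam^2 + b*lam + c = 0 (equation 8)
    a = px * px + py * py + 1.0
    b = 2.0 * (-X1 * px - Y1 * py - Z1)
    c = X1**2 + Y1**2 + Z1**2 - l1**2
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []  # no intersection (a particular configuration)
    roots = {(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)}
    return [(lam * px, lam * py, lam * pz) for lam in roots]

print(elbow_candidates((0.0, 0.0, 2.0), (320, 240), 800, 800, 320, 240, 0.3))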
2.2. Extension to image sequence analysis
In the case of image sequences, we calculate a set of candidate 3D points for each image. During the sequence, those points have to be merged to build trajectories. For each branch of the solution tree (except in particular configurations) there are two points to assign to a pair of trajectories. Since it is not possible to know directly which point must be attached to a given trajectory, we introduce a linearity criterion: we calculate the angle $\alpha$ between the vectors $\vec{V}_{i,j,k}$ and $\vec{V}_{i,j,k+1}$, where $\vec{V}_{i,j,k}$ is defined by the points $\hat{P}_{i,j,k-1}$ and $\hat{P}_{i,j,k}$, and $\vec{V}_{i,j,k+1}$ by the points $\hat{P}_{i,j,k}$ and $\hat{P}_{i,j,k+1}$. Here $\hat{P}_{i,j,k}$ is the $j$th ($j = 1, 2$) estimate of the space coordinates of articulation $i$ in the $k$th image of the sequence. We must therefore calculate the norm of the cross product $\|\vec{V}_{i,j,k} \wedge \vec{V}_{i,j,k+1}\| = \|\vec{V}_{i,j,k}\| \, \|\vec{V}_{i,j,k+1}\| \sin(\alpha)$ (figure 2).

[Figure 2: Building trajectories: we first compute the cross product between the last guiding vector of the current trajectory and the new one built by using the candidate point. The linearity criterion consists in merging into the current trajectory the point which minimises the norm of this cross product.]

The candidate for which the norm is weakest is assigned to the corresponding trajectory. The second point of the branch is then assigned to the other one.
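A sketch of this criterion (our own naming): the candidate that minimises the cross-product norm, i.e. that keeps the trajectory closest to straight, is appended to the current trajectory, and the other candidate goes to the second trajectory:

import numpy as np

def assign_candidates(traj_a, traj_b, candidates):
    """Append the two candidate 3D points to the two trajectories,
    minimising ||V_k x V_k+1|| for the first trajectory."""
    def bend(traj, cand):
        v_prev = np.subtract(traj[-1], traj[-2])  # last guiding vector
        v_new = np.subtract(cand, traj[-1])       # vector to the candidate
        return np.linalg.norm(np.cross(v_prev, v_new))
    c0, c1 = candidates
    if bend(traj_a, c0) <= bend(traj_a, c1):
        traj_a.append(c0); traj_b.append(c1)
    else:
        traj_a.append(c1); traj_b.append(c0)

ta, tb = [(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (0, 1, 0)]
assign_candidates(ta, tb, [(0, 2, 0), (2, 0, 0)])
print(ta[-1], tb[-1])  # (2, 0, 0) continues ta; (0, 2, 0) goes to tb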
Particular configurations: The construction of the trajectories described above can be done correctly in the general case, where the algorithm gives two intersections between the projection ray and the sphere. However, there are configurations where this assumption is false. Those configurations must be taken into account in the algorithm; they can also be used as detectors for particular movements. There are two categories of particular configurations:

1. The polynomial (8) has only a single solution. This happens when the considered segment (arm or forearm) is included in a plane parallel to the image plane. In this case, the projection ray is tangent to the sphere and there will be only a single "intersection" with the sphere. This point is then added to both trajectories: it indeed corresponds to a case where the two possible trajectories cross.

2. The polynomial (8) does not have any solution. In the absence of noise, this case can occur only for the wrist: after having calculated the two possible positions for the elbow, we define the pair of spheres which forms the set of possible positions of the wrist. There are cases where, based on the "wrong" position of the elbow, the sphere does not have any intersection with the projection ray. Those configurations directly allow us to cut a complete branch from the solution tree.
2.2.1. Angular values calculation
The parametric model of the human arm is based on the modified Denavit-Hartenberg (DH) parameter description (Denavit J., 1955). This representation provides a systematic method for describing the relationships between adjacent links, as long as the frames attached to each articulation are positioned using the DH rules below. The model consists in a 4x4 homogeneous transformation matrix corresponding to the transformation from link 1 to link 3, which describes, in fact, the arm system. This matrix is parametrized with the angular values of each joint and the link lengths; it constitutes the direct geometrical model, whereas the inverse geometrical model provides the joint angular values as a function of the joint Cartesian coordinates.

Modified Denavit-Hartenberg parameters: The DH method is systematic as long as the frame $R_i$ attached to each joint (figure 3) is defined using the following rules:

1. $O_{i-1}$ is on the common perpendicular to the axes of links $L_{i-1}$ and $L_i$, located on link $L_{i-1}$;
2. axis $x_{i-1}$ is the unit vector of this common perpendicular, oriented from link $L_{i-1}$ to link $L_i$;
3. $z_i$ is the unit vector of link $L_i$;
4. axis $y_i$ is set so that $y_i = z_i \wedge x_i$;
5. the relationships between the frames $R_i$ and $R_{i-1}$ are defined by the following parameters:
   • $\alpha_i$: the offset angle from axis $z_{i-1}$ to $z_i$, around $x_{i-1}$;
   • $d_i$: the distance from the origin of the $(i-1)$th coordinate frame to the intersection of the $z_{i-1}$ axis with the $x_i$ axis;
   • $\theta_i$: the joint angle from $x_{i-1}$ to $x_i$, turning around $z_i$;
   • $a_i$: the offset distance from the intersection of the $z_{i-1}$ axis with the $x_i$ axis.

With the joint frames $O_0$ and $O_1$ joined, the arm model given by the DH parameters is shown in table 1.

Table 1: DH parameters describing the human arm system

  i   θi   di   αi     ai
  1   θ1   0    −π/2   0
  2   θ2   0    π/2    0
  3   θ3   l1   −π/2   0
  4   θ4   0    π/2    0
  5   0    l2   0      0

The DH parameters are used to write a homogeneous transformation matrix for each joint. The generic form of the matrix for a revolving joint is:

$${}^{i-1}T_i = \begin{pmatrix} \cos\theta_i & -\sin\theta_i & 0 & a_i \\ \cos\alpha_i \sin\theta_i & \cos\alpha_i \cos\theta_i & -\sin\alpha_i & -d_i \sin\alpha_i \\ \sin\alpha_i \sin\theta_i & \sin\alpha_i \cos\theta_i & \cos\alpha_i & d_i \cos\alpha_i \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad (9)$$

where $\theta_i$, $\alpha_i$, $d_i$ and $a_i$ are the DH parameters.

Direct geometrical model: The direct geometrical model gives the transformation from the angular values of each joint to Cartesian coordinate space. The 4x4 matrix ${}^0T_5$ specifies the homogeneous transformation from frame 0 (the shoulder) to frame 5 (the wrist) (figure 3). This matrix is built by multiplying the successive homogeneous transformation matrices ${}^{i-1}T_i$, $i = 1, \ldots, 5$.

[Figure 3: Arm model showing the frames used in the direct geometric model calculation]

This model is parametrized by the joint angular values $\theta_i$ and allows the calculation of Cartesian coordinates. For the calculation of angular coordinates, we need to invert this model.

Inverse geometrical model: The inverse geometrical model is parametrized by the Cartesian coordinates of the wrist and returns the angular value $\theta_i$ of each joint. The first way to calculate this model would be to compute the inverse of ${}^0T_5$, but given the complexity of that calculation, splitting up the kinematic chain is a far better solution. We calculate the angular values for each joint separately by defining the inverse transformation ${}^3T_0$, which gives us the shoulder joint angular values from the elbow's Cartesian coordinates (expressed in frame $R_0$), and ${}^5T_3$, which gives us the elbow's angular values from the wrist's Cartesian coordinates expressed in the frame $R_2'$. $R_2'$ is a virtual frame oriented as $R_2$ and centered on the elbow.
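Equation (9) and the chain product ${}^0T_5$ translate directly into code. The sketch below (our own function names, using the parameters of table 1) is one way to build the direct model:

import numpy as np

def dh_matrix(theta, d, alpha, a):
    """Homogeneous transform (i-1)T(i) for a revolving joint, equation (9)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct,      -st,      0.0,  a],
        [ca * st,  ca * ct, -sa, -d * sa],
        [sa * st,  sa * ct,  ca,  d * ca],
        [0.0,      0.0,      0.0, 1.0],
    ])

def arm_transform(t1, t2, t3, t4, l1, l2):
    """Chain 0T5 = 0T1 1T2 2T3 3T4 4T5 for the whole arm (table 1)."""
    params = [(t1, 0, -np.pi / 2, 0), (t2, 0, np.pi / 2, 0),
              (t3, l1, -np.pi / 2, 0), (t4, 0, np.pi / 2, 0),
              (0, l2, 0, 0)]
    T = np.eye(4)
    for theta, d, alpha, a in params:
        T = T @ dh_matrix(theta, d, alpha, a)
    return T  # wrist frame expressed in the shoulder frame

print(arm_transform(0.1, 0.2, 0.3, 0.4, 0.30, 0.25)[:3, 3])  # wrist position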
Considering only the shoulder, we can write the DH parameters for the shoulder-elbow system (table 2) and define the homogeneous transformation matrix ${}^0T_3$ by multiplying the elementary transformation matrices (9) which specify the transformations from frame $R_{i-1}$ to frame $R_i$ ($i = 1, 2, 3$):

$${}^0T_3 = \begin{pmatrix} \cos\theta_1\cos\theta_2 & -\sin\theta_1 & \cos\theta_1\sin\theta_2 & -l_1\cos\theta_1\sin\theta_2 \\ \sin\theta_1\cos\theta_2 & \cos\theta_1 & \sin\theta_1\sin\theta_2 & -l_1\sin\theta_1\sin\theta_2 \\ -\sin\theta_2 & 0 & \cos\theta_2 & l_1\cos\theta_2 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Table 2: DH parameters for the shoulder-elbow system

  i   θi   di    αi     ai
  1   θ1   0     −π/2   0
  2   θ2   0     π/2    0
  3   0    −l1   0      0

The fourth column of ${}^0T_3$ represents the direct geometric model, so the inverse geometric model is:

$$\theta_1 = \arctan(y_2/x_2), \qquad \theta_2 = \arccos(z_2/l_1) \quad (10)$$

Doing the same calculation for the wrist gives (the wrist's Cartesian coordinates have to be expressed in the elbow-centered frame $R_2'$):

$$\theta_3 = \arccos(z_3/l_2), \qquad \theta_4 = \arctan(y_3/x_3) \quad (11)$$

[Figure 4: The inverse geometric model of the arm gives the joint angular values θ1, θ2 (shoulder joint) and θ3, θ4 (elbow joint), knowing the shoulder, elbow and wrist Cartesian coordinates P1, P2, P3.]
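Equations (10) and (11) translate directly into code (a sketch with our own function names; atan2 is used instead of arctan to keep the correct quadrant):

import math

def shoulder_angles(elbow, l1):
    """Equation (10): elbow position (in shoulder frame R0) -> theta1, theta2."""
    x2, y2, z2 = elbow
    return math.atan2(y2, x2), math.acos(z2 / l1)

def elbow_angles(wrist_in_R2p, l2):
    """Equation (11): wrist position (in elbow-centred frame R2') -> theta3, theta4."""
    x3, y3, z3 = wrist_in_R2p
    return math.acos(z3 / l2), math.atan2(y3, x3)

print(shoulder_angles((-0.1, -0.1, 0.25), 0.3))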
As the arm model is redundant, direct inversion using the analytical solution will lead to unexpected reversals in angular values. To avoid this, we use a numerical resolution method to compute the first two joint values ($\theta_1$, $\theta_2$). This method can be initialized with previously computed values, so that the new ones stay as close as possible to them, which leads to smooth trajectories. The solution is computed iteratively using the pseudo-inverse of the arm system Jacobian (Klein C.A., 1983). Only the last two values are calculated analytically. This approach allows us to obtain a set of angular values corresponding to the given 3D joint positions, even when the arm is in a singular configuration.

[Figure 5: Representation of the speaker-centered frame (ScF), showing the planes where most elementary gestures are realized: Oxy is the horizontal plane, Oxz the sagittal plane and Oyz the frontal plane.]

3. Articulation-based motion representation
An articulation-based motion representation could be used to distinguish the good one among the geometrical solutions, so the first point to study is the variation of the joint values for each solution. The second point concerns the possibility of using those representations to differentiate gestures based on the way they are made. Preliminary experiments have been made using a video corpus of elementary gestures. The results (fig. 6) concern two circular motions of the left hand done in a plane parallel to the Oyz plane of the ScF, the first one with the arm in extension (gesture A) and the other one with the elbow bent (gesture B). The third gesture presented is a circular one made with the elbow bent in a plane parallel to the Oxz plane of the ScF (fig. 5) (gesture C). Thus gestures A and B have a quite similar aspect from the viewer's point of view, while gestures B and C are performed by moving the articulations in a similar manner. Joint values are computed for each solution provided by the geometric reconstruction algorithm.

Figure 6 presents the evolution of the angular values for each joint of the arm model and for the three different gestures. Those values are presented in polar coordinates, where the ρ parameter stands for time (which means that gesture duration has been normalized). The different curves correspond to the angular values computed for each geometrical solution. If we except the noise on angular values implied by the geometrical reconstruction, the different angular trajectories for a same angle can be either confused (fig. 6, θ1 and θ4 variations for gesture A) or symmetric (fig. 6, θ1 and θ2 variations for gestures B and C). Thus, for each solution, changes in angular value variation occur at the same time.

One can also remark that gestures B and C have closer signatures than gestures A and B, in the sense that the θ1, θ2 and θ3 variations have the same kind of symmetry for those gestures: θ1 and θ2 are symmetric with respect to a horizontal axis, and the θ3 values present symmetries with respect to a vertical one. Also, the angular values for each articulation take values in the same part of the angular space.

4. Conclusion
Articulation-based motion representations are used to improve the results computed by a single-camera geometrical algorithm which estimates the possible poses of a human arm, given a single image. This algorithm provides us with a set of four possible motions for the arm in space. We made the assumption that using such a representation of gesture could allow us to use any of those solutions for gesture analysis. Primary experiments on simple gestures brought out relationships such as symmetries or confusion between the angular values for the different solutions, which is due to symmetries between the different solutions. On the other hand, recent linguistic issues made the assumption that a proprioceptive representation of gesture is more suitable for Sign language analysis than a description based on elementary gestures described from an
observer's point of view. Our algorithm makes it possible to build such an articulation-based motion representation from single-camera data. Considering gestures performed in a similar manner with different orientations, and comparing the results to gestures performed in a different manner but with a similar form from the observer's point of view, we could observe that our method leads to a different gesture classification than the ones based on the visual aspect in the image or on tridimensional representations. Further research has to be performed to bring out useful criteria for analyzing real Sign language gestures from this point of view, but the primary results are encouraging.

[Figure 6: On the left: representation of gestures A, B and C in the ScF. Gestures A and B have a similar visual aspect from the viewer's point of view, while gestures B and C are performed with similar articulation motions. On the right: the joint values computed for each gesture and for each solution provided by the geometrical reconstruction. Each solution is displayed as a different curve. Each graph presents the evolution of the angular value of a given angle (from left to right, top to bottom: θ1, θ2, θ3, θ4). Angle values are displayed in polar coordinates, with the ρ parameter standing for time, so that a constant angle value would be displayed as a straight line starting at the center of the polar frame.]

5. References
Athitsos, V. and S. Sclaroff, 2004. Database indexing methods for 3D hand pose estimation. In Gesture Workshop, volume 2915 of Lecture Notes in Computer Science. Genova, Italy: Springer.
Boutet, D., 2001. Une approche morphogénétique du sens dans la gestuelle conversationnelle. In C. Calvé et al. (eds.), Oralité et Gestualité. Paris: l'Harmattan.
Braffort, A., 1996. Argo: An architecture for sign language recognition and interpretation. In P. Harling et al. (eds.), Progress in Gestural Interaction. Springer.
Denavit, J. and R.S. Hartenberg, 1955. A kinematic notation for lower-pair mechanisms based on matrices. Journal of Applied Mechanics: 215-221.
Gurdjos, P. and R. Payrissat, 2002. Calibrage plan d'une caméra en mouvement à focale variable. In RFIA 2002. AFRIF-AFIA.
Heikkilä, J., 2000. Geometric camera calibration using circular control points. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(10):1066-1077.
Horain, P. and M. Bomb, 2002. 3D model based gesture acquisition using a single camera. In Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV 2002). Orlando.
Klein, C.A. and C.H. Huang, 1983. Review of pseudoinverse control for use with kinematically redundant manipulators. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13(3):245-250.
Lenseigne, B., F. Gianni and P. Dalle, 2004. Estimation monovue de la posture du bras, méthode et évaluation. In RFIA 2004. Toulouse: AFRIF-AFIA.
Prillwitz, S. et al., 1989. HamNoSys Version 2.0: Hamburg Notation System for Sign Languages. An Introductory Guide. Hamburg: Signum.
Somers, M.G. and R.N. Whyte, 2003. Hand posture matching for Irish Sign Language interpretation. In ACM Proceedings of the 1st International Symposium on Information and Communication Technologies. Dublin, Ireland: Trinity College Dublin.
Starner, T., A. Pentland and J. Weaver, 1998. Real-time American Sign Language recognition using desk and wearable computer based video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12):1371-1375.
Tanibata, N., Y. Shirai and N. Shimada, 2002. Extraction of hand features for recognition of sign language words. In 15th International Conference on Vision Interface. Calgary, Canada.
Vogler, C. and D. Metaxas, 1998. ASL recognition based on a coupling between HMMs and 3D motion analysis. In Proceedings of the International Conference on Computer Vision. Mumbai, India.
Vogler, C. and D. Metaxas, 1999. Parallel hidden Markov models for American Sign Language recognition. In Proceedings of the International Conference on Computer Vision.
Wren, C. and A. Pentland, 1999. Understanding purposeful human motion. In Proc. IEEE International Workshop on Modelling People (MPEOPLE).

5. References
Athitsos V., Sclaroff S., 2004. Database indexing methods
for 3d hand pose estimation. In Gesture Workshop, vol-

90
Phonetic Model for Automatic Recognition of Hand Gestures

Jose L. Hernandez-Rebollar
The George Washington University, Electrical and Computer Engineering
725 23rd St. NW. Lab 302, Washington DC 20052
[email protected]

Abstract

This paper discusses a phonetic model of hand gestures that leads to automatic recognition of isolated gestures of the American Sign Language by means of an electronic instrument. The instrumented part of the system combines an AcceleGlove and a two-link arm skeleton. The model breaks down hand gestures into unique sequences of phonemes called Poses and Movements. The recognition system was trained and tested on volunteers with different hand sizes and signing skills. The overall recognition rate reached 95% on a lexicon of 176 one-handed signs. The phonetic model combined with the recognition algorithm allows recognition of new signs without retraining.

1. Introduction

The development of automatic means to study sign languages is destined to have an enormous impact on economy, society and science. Costello [1999] estimates that American Sign Language (ASL) is the fourth most used language in the United States with 13 million people, including members of both the hearing and deaf community. Some 300,000 to 500,000 of them are ASL native speakers, which means that their full integration into society depends on their ability to overcome the language barrier by using all means at their disposal. William Stokoe [1995] was probably the first linguist to involve engineers, not only educators, in solving the challenge of better communication; he wrote: "Looking back, it appears that linguistics was made possible by the invention of writing. Looking ahead, it appears that a science of language and communication, both optic (gestures) and acoustic (speech), will be enabled, in all probability, not by refinements in notational systems, but by increasing sophistication in techniques of recording, analyzing, and manipulating visible and auditory events electronically."

It is ironic that even though humans learned how to communicate through gestures before learning how to speak, methodologies for analyzing speech and spoken languages are far better understood than the methodologies for analyzing and, in consequence, recognizing gestures and sign languages.

Engineers found a way to capture speech in 1915 with the invention of the carbon microphone. This transducer produces an electrical signal corresponding to changes in air pressure produced by sound waves, which contains all the information required to record and reproduce speech through a speaker. Sign language, in turn, combines hand movements, hand shapes, body posture, eye gaze, and facial expression, which are not easy to capture by using only one type of sensor. Approaches that use arrays of video cameras to capture signing struggle to find an adequate way of reproducing tri-dimensional images. The high resolution needed to capture hand shape and eye gaze results in a reduced field of view unable to fit hand movement or body posture, and a high-bandwidth connection (processor) is required to transmit (analyze) the data stream and reproduce the video at acceptable speed. An alternative is the combination of angular sensors of different types mounted directly on the signer's joints of interest. Although bulkier, more cumbersome and more obtrusive, these instrumented approaches have been more successful in capturing hand postures [Grimes1983, Kramer1998] than the approaches based on video alone [Uras1994].

In this work, the combination of a phonetic model of hand gestures and a novel instrumentation to capture and recognize hand gestures in American Sign Language is discussed. Non-manual components such as facial expression, eye gaze and body posture are not considered here.

2. Review of previous approaches

The first and most important step in the recognition process is to extract, from a given gesture, all the necessary features that allow the recognition system to classify it as a member of one and only one class. Two things are needed to achieve that step: a model that describes gestures in terms of necessary and sufficient features, and a capturing system suitable to detect such features. It is imperative for the resulting set of features (pattern) to be different for each gesture, and it is desirable for the resulting pattern to have a constant number of features (fixed dimensionality) and as few as possible (reduced dimensionality).

The model proposed in this work is based on the assumption that any hand gesture can be analyzed as a sequence of simultaneous events, and each sequence is unique per gesture. Those events are referred to in this work as phonemes. As straightforward as this scheme may sound, it could be a cause of debate among many signers and teachers who conceive signs as indivisible entities. The following is a review of different phonemes and structures that have been proposed to model hand gestures.

2.1. Phonetic structure

By using traditional methods of linguistics to isolate segments of ASL, Stokoe found that signs could be broken down into three fundamental constituent parts: the hand shape (dez), hand location with respect to the body (tab), and the movement of the hand with respect to the body (sig); these phonemes happen simultaneously. Liddell [1989] proposed a model of movements and holds, Sandler [1986] proposed movements and locations, and Perlmutter
[1988] proposed movements and positions, all of them happening sequentially.

Some automatic systems have followed models similar to Stokoe's [Bauer, 2000; Vamplew, 1996] and Liddell's [Vogler, 1999]. By using Stokoe's model, patterns are of reduced and fixed dimensionality, but similar for gestures that are only different in their final posture (such as GOOD and BAD). Patterns that result from Liddell's model eliminate this problem by considering the initial, final, and intermediate states of the hand and the movements that happen in between. Still, the model produces ambiguous patterns with variable dimensionality. As an example, when signing FATHER, tapping the thumb of a 'five' hand shape against the forehead, the sequence can be described as a Movement followed by a Hold followed by a Movement and finished by a Hold (MHMH), or as HMHMH if the hand is considered to start from a static position, or as a simple Hold, as many signers do not make long movements when tapping. Closely linked to these models are the recognition methods suitable to recognize the resulting patterns. Hidden Markov Models (HMM) and Neural Networks (NN) have been used to recognize complete sentences [Starner, 1998], isolated words [Waldron, 1995], or phonemes [Vamplew, 1996], but none of those approaches has been able to integrate hand gestures and finger spelling in one recognition system.

2.2. The Pose-Movement model

Under the sequential models previously explained, ASL resembles the linear structure of spoken languages: phonemes make up words, and words in turn make up sentences. Phonemes in these models are, to some degree, the three simultaneous components of Stokoe, so the execution of ASL gestures can be seen as a sequential combination of simultaneous phonemes. Specifically, there are two types of phonemes: one static and one dynamic.

Definition 1: A pose is a static phoneme composed of three simultaneous and inseparable components represented by the vector P = [hand shape, palm orientation, hand location]. The static phoneme occurs at the beginning and at the end of a gesture.

Definition 2: A posture is a vector of features Ps = [hand shape, palm orientation]. Twenty-four out of the 26 letters of the ASL alphabet are postures that keep their meaning regardless of location. Letters J and Z are not considered postures because they have movement.

Definition 3: A movement is a dynamic phoneme composed of the shape and direction of the trajectory described by the hands when traveling between successive poses: M = [direction, trajectory].

Definition 4: A manual gesture is a sequence of poses and movements, P-M-P.

Definition 5: L, the set of purely manual gestures that convey meaning in ASL, is called the lexicon.

Definition 6: A manual gesture s is called a sign if s belongs to L.

Definition 7: The signing space refers to the physical location where signs take place. This space is located in front of the signer and is limited by a cube bounding the head, back, shoulders and waist.

By following definitions 1 to 7, icons, letters, initialized, and non-initialized signs are modeled by PMP patterns of fixed dimensionality, while compound, pantomimic, classifier, and lexicalized finger-spelled words are modeled as sequences of variable length. These patterns are listed in Table 1.

Sign                                          Model
Two-handed icons                              PMP, PMP (one sequence per hand)
Finger-spelled words                          PMP per letter
Lexicalized finger-spelled                    sequence of 2n-1 phonemes
  (n = number of letters),
  compound signs (n = number of signs),
  pantomimic (n = number of pauses)

Table 1. Signs and their respective sequences of phonemes

As a proof of concept, a lexicon of one-handed signs from two dictionaries [Costello, 1999; IDRT, 2001] with patterns of the form PMP was targeted for recognition. Since any sign is merely a new combination of the same phonemes, the recognition system is composed of small subsystems that capture a finite number of phonemes, complemented by a search engine which compares captured sequences against stored sequences.
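Definitions 1 to 4 map naturally onto a simple data representation. The following illustrative Java sketch encodes them; all type and field names are ours, not the author's:

import java.util.List;

/* Illustrative encoding of Definitions 1-4; all names are ours. */
public class PMPModel {
    interface Phoneme {}                       // a pose or a movement

    /* Definition 2: posture = [hand shape, palm orientation]. */
    static class Posture {
        final String handShape, palmOrientation;
        Posture(String s, String o) { handShape = s; palmOrientation = o; }
    }

    /* Definition 1: pose = posture plus hand location (static phoneme). */
    static class Pose implements Phoneme {
        final Posture posture; final String location;
        Pose(Posture p, String loc) { posture = p; location = loc; }
    }

    /* Definition 3: movement = [direction, trajectory] (dynamic phoneme). */
    static class Movement implements Phoneme {
        final String direction;                // up, down, left, right, towards, away
        final String trajectory;               // straight or circular
        Movement(String d, String t) { direction = d; trajectory = t; }
    }

    /* Definition 4: a manual gesture is a P-M-P sequence; variable-length
     * sequences (2n-1 phonemes) cover finger-spelled or compound signs. */
    static class Gesture {
        final String gloss; final List<Phoneme> phonemes;
        Gesture(String g, List<Phoneme> seq) { gloss = g; phonemes = seq; }
    }
}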

3. Instrumentation

The instrument designed to capture all the phonemes found in the resulting sequences (53 postures, including six orientations; twelve movements; and eleven locations) comprises an AcceleGlove [Hernandez, 2002] to capture hand postures, and a two-link skeleton attached to the arm to capture hand location (with respect to the shoulder) and hand movement. Data is sent serially to a laptop ThinkPad running Windows 98 on a Pentium III. The sign recognizer is based on a search algorithm.

3.1. Training and testing

Posture, location and movement were recognized independently, trained and tested with the help of 17 volunteers of different skill levels, from novice to native signer. That selection allowed covering a range of accents and deviations with respect to the citation form. The search algorithm was tested with 30 one-handed gestures first, and 176 later to test scalability. The complete list of signs is found in [Website].

3.2. Postures

The posture module starts by recognizing any of six palm orientations: vertical, horizontal, vertical upside down, horizontal tilted, horizontal palm up, and horizontal tilted counter-clockwise. Afterwards, the posture recognizer progressively discriminates postures by the position of the fingers. Decision trees are generated as follows [Hernandez, 2002b] (a sketch is given after the list):
- For all trees, start decision nodes evaluating the position of the pinky finger and base the subsequent node's decision on the next finger (ring, middle, index, thumb).
- If postures are not discriminated by finger flexion, then continue with finger abduction.
- If postures are not different by individual finger flexions or abductions, then base classification on the overall finger flexion and overall finger roll.
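As a rough illustration of this cascade, the classifier can be pictured as ordered tests over per-finger measurements; the thresholds, field names and the 0-to-1 sensor scale below are invented for the sketch and do not reflect the actual trained trees:

/* Toy posture discriminator following the cascade above; thresholds,
 * names and the 0..1 sensor scale are invented for this sketch. */
public class PostureTree {
    static class Fingers {
        double[] flexion;      // pinky..thumb: 0 = extended, 1 = fully flexed
        double[] abduction;    // spread between adjacent fingers
        double overallFlexion, overallRoll;
    }

    static String classify(Fingers f) {
        // 1. Individual finger flexion, starting with the pinky.
        for (int i = 0; i < f.flexion.length; i++)
            if (f.flexion[i] > 0.8) return "flexed-at-" + i;
        // 2. Not separable by flexion: try finger abduction.
        for (int i = 0; i < f.abduction.length; i++)
            if (f.abduction[i] > 0.5) return "spread-at-" + i;
        // 3. Fall back to overall finger flexion and overall finger roll.
        return (f.overallFlexion > 0.5) ? "curved"
             : (f.overallRoll > 0.5)    ? "rolled" : "flat";
    }
}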
To train the orientation nodes, all seventeen signers were asked to hold the initial pose of FATHER, NICE, PROUD, PLEASE, THING and ASIDE. On average, the orientation module accurately recognized 94.8% of the samples. The worst recognition rate corresponded to horizontal postures, where the threshold is blurred by the deviations introduced by signers' accents, since they were asked to hold their poses, not to hold their hand in a certain position.

3.2.1. Aliases

Since accelerometers do not detect angular positions around the gravity vector, 10 postures were impossible to discriminate based on finger bending or abduction around the gravity vector. These postures are called aliases. This aliasing reduced the number of recognizable postures from 53 to 43. The highest accuracy (100%) corresponded to the vertical palm with knuckles pointing down used to sign PROUD; the worst accuracy rate corresponded to postures C and E, with 68%, for a recognition average of 84%.

3.3. Locations

By looking at the initial and final position of the hand during the execution of each sign in the lexicon, eleven regions in the signing space were identified: head, cheek, chin, right shoulder, chest, left shoulder, stomach, elbow, far head, far chest and far stomach. To train the recognizer, four signers were asked to locate their hand at the initial poses of several signs that start or finish at those regions: FATHER, KNOW, TOMORROW, WINE, THANK YOU, NOTHING, WHERE, TOILET, PLEASE, SORRY, KING, QUEEN, COFFEE, PROUD, DRINK, GOD, YOU, FRENCH FRIES and THING. Volunteers were chosen based on their heights so they cover the full range of heights among the group of volunteers.

Figure 1 shows the initial and final locations captured with the two-link skeleton as executed by the middle-height signer (1.70 m). Figure 1a corresponds to locations close to the body and Figure 1b corresponds to locations away from the body. A human silhouette is superimposed on the plane to show locations related to the signer's body. The plane y-z is parallel to the signer's chest, with positive values of y running from the right shoulder to the left shoulder and positive values of z above the right shoulder.

Similar to orientations and postures, locations are solved using a decision tree; thresholds on y and z boundaries are set at least 4 around the mean, and 3 on x, due to limitations imposed by the instrumentation.

Figure 1. a) Far locations (far head, far chest, far stomach, elbow). b) Close locations (head, shoulder, cheek, chin, chest, stomach).

The overall accuracy rate was 98.1%: head 98%, cheek 95.5%, chin 97.5%, shoulder 96.5%, chest 99.5%, left shoulder 98.5%, far chest 99.5%, elbow 94.5%, stomach, far head and far stomach 100%. The skeleton system does not need an external reference source, and it is immune to ambient noise; that makes it a better choice for a portable instrument than infrared and magnetic trackers.

3.4. Movements

Movements of the one-handed signs considered in this work are described by means of two movement primitives: curviness [Bevilaqua2001] and direction. Both metrics are orientation and scale independent. As with hand postures and locations, the exact movement varies from signer to signer and from trial to trial. Six directions (up, down, right, left, towards, and away) and two levels of curviness (straight and circular) were identified in the lexicon, which gave a total of twelve different movements. The same four signers were asked to perform the six basic movements along the main axes and the two curves ten times each. Directions left and right were classified with less than 100% accuracy (77% and 75%), reducing the overall accuracy to 92%. A curviness greater than 4 discriminated circles from straight lines with 100% accuracy, but only signs with straight movements were implemented in the recognition algorithm.
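One concrete reading of these two primitives is sketched below. The curviness formula (arc length divided by chord length) and the axis conventions are our assumptions, chosen to be consistent with the description of Figure 1, not the paper's exact definitions:

/* Illustrative movement-primitive extraction; the curviness formula and
 * axis convention (y: right-to-left, z: up, x: away from the body) are
 * our assumptions, not the paper's exact definitions. */
public class MovementPrimitives {

    /* Dominant direction among up/down/left/right/towards/away,
     * taken from the net displacement of the trajectory. */
    static String direction(double[][] path) {
        double[] a = path[0], b = path[path.length - 1];
        double dx = b[0]-a[0], dy = b[1]-a[1], dz = b[2]-a[2];
        double ax = Math.abs(dx), ay = Math.abs(dy), az = Math.abs(dz);
        if (az >= ax && az >= ay) return dz > 0 ? "up" : "down";
        if (ay >= ax)             return dy > 0 ? "left" : "right";
        return dx > 0 ? "away" : "towards";
    }

    /* Curviness: arc length over chord length; a near-closed circle has a
     * tiny chord, so its curviness easily exceeds the threshold of 4. */
    static double curviness(double[][] path) {
        double arc = 0;
        for (int i = 1; i < path.length; i++) arc += dist(path[i-1], path[i]);
        double chord = dist(path[0], path[path.length - 1]);
        return chord > 1e-9 ? arc / chord : Double.POSITIVE_INFINITY;
    }

    static double dist(double[] p, double[] q) {
        double dx = p[0]-q[0], dy = p[1]-q[1], dz = p[2]-q[2];
        return Math.sqrt(dx*dx + dy*dy + dz*dz);
    }
}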
4. Search Engine

A variation of template matching called conditional template matching was used to classify complete signs. Conditional template matching compares the incoming vector of phonemes (captured with the instrument) against a pre-stored file of patterns, component by component, and stops the comparison when a condition is met (a sketch is given after the list):
- For all patterns in the lexicon, extract a list of signs matching the initial posture captured by the AcceleGlove. This is the first list of candidate signs.
- For all patterns in the list of candidates, select the signs matching the initial location captured by the two-link skeleton. This is the new list of candidate signs.

Repeat the matching and creation of new lists of candidates by using movement, final posture and final location.

Stop when all components have been used OR when there is only one sign on the list after matching the initial location. That sign on the list is called 'the most likely'.
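The following is a minimal Java sketch of this candidate-filtering loop, under the simplifying assumption that each stored pattern is a fixed vector of five components (initial posture, initial location, movement, final posture, final location); class and field names are ours:

import java.util.ArrayList;
import java.util.List;

/* Minimal sketch of conditional template matching; assumes a fixed
 * five-component PMP vector per pattern. Names are ours. */
public class ConditionalMatcher {
    static class Pattern {
        final String gloss;
        final String[] components;  // initial posture, initial location,
                                    // movement, final posture, final location
        Pattern(String g, String... c) { gloss = g; components = c; }
    }

    /* Filters the lexicon component by component; stops as soon as a
     * single candidate remains (the "most likely" sign). */
    static List<Pattern> match(String[] captured, List<Pattern> lexicon) {
        List<Pattern> candidates = new ArrayList<Pattern>(lexicon);
        for (int c = 0; c < captured.length; c++) {
            List<Pattern> next = new ArrayList<Pattern>();
            for (Pattern p : candidates)
                if (p.components[c].equals(captured[c])) next.add(p);
            candidates = next;
            if (candidates.size() <= 1) break;  // stop condition met
        }
        return candidates;
    }
}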
The search algorithm can be seen as a decision tree with a variable number of nodes. The expected probability of finding a given sign is inversely proportional to the depth of the tree. In other words, it is more likely to recognize a sign if it is the only one in the lexicon performed with a certain initial pose (such as PROUD), and it is less likely to recognize two signs when only the final pose makes them different (such as GOOD and BAD).

4.1. Evaluation

An initial evaluation used only 30 signs taken from Starner (1998), Vogler (1999), and Waldron (1995): BEAUTIFUL, BLACK, BROWN, DINNER, DON'T LIKE, FATHER, FOOD, GOOD, HE, HUNGRY, I, LIE, LIKE, LOOK, MAN, MOTHER, PILL, RED, SEE, SORRY, STUPID, TAKE, TELEPHONE, THANK YOU, THEY, WATER, WE, WOMAN, YELLOW, and YOU. The PMP sequences reflect the citation forms as found in Costello [1999] and in the Ultimate ASL Dictionary [IDRT2001]. The overall recognition rate was 98%, since almost all of them have different initial poses.

4.2. Scalability

Since any new sign is a combination of the same phonemes, the lexicon can be expanded without retraining the search algorithm. When tested on 176 one-handed signs performed by one signer, the overall recognition rate reached 95%.

5. Conclusions and Future Work

The model, instrumentation and recognition algorithm explained in this work represent a framework for a more complex system where a larger lexicon can be recognized by extending the patterns to include non-manual gestures, when the required instrumentation to detect them becomes available.

Work in the immediate future will incorporate a second PMP sequence for the non-dominant hand, and migrate the recognition program to a wearable computer for a truly portable electronic translator. The long-term objective shall include a grammar correction module to rearrange the sequence of translated glosses and correct for tenses, gender, and number as needed by the spoken language.

6. References

- Bauer, B., Hienz, H., and Kraiss, K., 2000. Video-Based Continuous Sign Language Recognition Using Statistical Methods. IEEE 2000, pp. 463-466.
- Bevilacqua, F., Naugle, L., and Valverde, I., 2001. Virtual Dance and Music Environment Using Motion Capture. Proc. of the IEEE Multimedia Technology and Applications Conference, Irvine, CA.
- Costello, Elaine, 1999. Random House Webster's Concise American Sign Language Dictionary. Random House Inc., NY.
- Fels, Sidney S., and Hinton, Geoffrey E., 1993. Glove Talk - A Neural-Network Interface Between a Data-Glove and a Speech Synthesizer. IEEE Transactions on Neural Networks, vol. 4, No. 1, January.
- Grimes, G., 1983. US Patent 4,414,537. November.
- Hernandez, Jose L., Kyriakopoulos, N., Lindeman, R., 2002. The AcceleGlove: A Whole-Hand Input Device for Virtual Reality. ACM SIGGRAPH Conference Abstracts and Applications 2002, p. 259.
- Hernandez, Jose L., Kyriakopoulos, N., Lindeman, R., 2002b. A Multi-Class Pattern Recognition of Practical Finger Spelling Translation. IEEE International Conference on Multimodal Interfaces ICMI'02, October 2002, pp. 185-190.
- IDRT, 2001. The Ultimate American Sign Language Dictionary. The Institute for Disabilities Research and Training Inc. Copyright 2001.
- Kramer, J., and Leifer, L., 1988. The Talking Glove: An Expressive and Receptive Verbal Communication Aid for the Deaf, Deaf-Blind, and Nonvocal. SIGCAPH 39, pp. 12-15 (spring 1988).
- Liddell, S., and Johnson, R., 1989. American Sign Language: The phonological base. Sign Language Studies, 64: 195-277.
- Perlmutter, D., 1988. A mosaic theory of American Sign Language syllable structure. Paper presented at the Second Conference on Theoretical Issues in Sign Language Research, Gallaudet University, Washington, DC.
- Sandler, W., 1986. The Spreading Hand Autosegment of American Sign Language. Sign Language Studies 50: 1-28.
- Starner, T., Weaver, J., and Pentland, A., 1998. A Wearable Computer Based American Sign Language Recognizer. MIT Media Lab, Technical Report 425.
- Stokoe, William C., Armstrong, David F., Wilcox, Sherman E., 1995. Gesture and the Nature of Language. Cambridge University Press.
- Uras, C., and Verri, A., 1994. On the Recognition of the Alphabet of the Sign Language through Size Functions. Dipartimento di Fisica, Università di Genova. Proceedings of the 12th IAPR Int. Conf. on Pattern Recognition, Conference B: Computer Vision and Image Processing, Vol. 2, 1994, pp. 334-338.
- Vamplew, P., 1996. Recognition of Sign Language Using Neural Networks. Ph.D. Thesis, Department of Computer Science, University of Tasmania.
- Vogler, C., and Metaxas, D., 1999. Toward Scalability in ASL Recognition: Breaking Down Signs into Phonemes. Gesture Workshop '99, Gif-sur-Yvette, France, March 17-19.
- Waldron, Manjula B., 1995. Isolated ASL Sign Recognition System for Deaf Persons. IEEE Trans. on Rehabilitation Engineering, vol. 3, No. 3, September.
- Website: http://home.gwu.edu/~jreboll/signlist.txt
Development of a new „SignWriter“ Program

Daniel Thomas Ulrich Noelpp


[email protected]
Moos bei Köniz, Berne, Switzerland
http://www.signwriter.org/

Abstract
The „Sutton SignWriting“ system is a practical writing system for deaf sign languages. The symbols describe shape, location and movement of hands as well as facial expressions and other signing information. „SignWriter Java 1.5/Swing“ is being developed as the successor to „SignWriter DOS“, a program for typing and editing „SignWriting“ texts, used by school children, teachers, linguists and Deaf people. The new Java version 1.5 („Tiger“) is used in development, with Swing as the graphical user interface.

1. The new program

A „SignWriter Java 1.5/Swing“ program is being developed as the successor to „SignWriter DOS“ (programmed by Richard Gleaves), using the new Java 1.5 („Tiger“) version and the Swing graphical user interface library. The existing „SignWriter DOS“ program is a simple, yet powerful program for typing and editing „SignWriting“ texts. As many school children, teachers and linguists are already using this program for their everyday work, it is important that the typing conventions are not changed very much. Support for SGN files (the „SignWriter DOS“ file format for „SignWriting“ texts) is important as well. In summary, former users shouldn't need to change their way of working with „SignWriting“, or not very much.

There are some new features, however: a friendlier user interface (thanks to Swing of Java 1.5) is implemented, which is also easier for new users to understand. And because there are different „alphabets“ in use, a multi-alphabet capability seems to be important, too. The old symbols of „SignWriter DOS“ are retrofitted into the framework of the multi-alphabet capability; expressed in a simpler way, „SignWriter Java 1.5/Swing“ understands the old „alphabet“, but can work with and convert to the new ones. Another important feature is the support for SWML files (an XML file format to store „SignWriting“ texts, developed by Antônio Carlos da Rocha Costa).

It is hoped that the new „SignWriter“ program will be accepted by the SignWriting community as the successor to „SignWriter DOS“. Public release is planned for autumn 2004.

2. About „SignWriting“ and the old program

„Sutton SignWriting“, developed by Valerie Sutton, is a practical writing system which can be used for all the sign languages of the world. The symbols of „SignWriting“ describe the shape, location and movements of the hands, as well as the facial expressions which a signer makes and other signing information. This writing system gives Deaf people the possibility of writing to each other, making notes and reading text written in their native language.

In the eighties, Richard Gleaves developed the first „SignWriter“ program, which made it possible to type „SignWriting“ on the computer. The latest version, 4.4, is now eight years old. It is excellent software from the early days of personal computers, but it has become somewhat outdated. The computer resources at that time were limited and the operating systems were very different from those of today. The user interface no longer meets the expectations which today's users have. One of the biggest drawbacks of this earlier version is that it only runs under a pure DOS system. Modern Mac OS, Windows NT, 2000 and XP all require a DOS virtual machine to start „SignWriter DOS“. There are other shortcomings: low resolution of the symbols, which leads to visible pixelization (a zigzag effect on round curves or oblique lines), and inverted display (white on black). These are all reasons why a successor to „SignWriter DOS“ is urgently needed by the SignWriting community.

3. Demonstration and Discussion

The program is being redeveloped from scratch using the new version 1.5 of Java and with the Swing graphical user interface library. Development is open source. „SignWriter“ is layered onto an alphabet package called „signwriter.alphabet“ which knows about the symbols and is modeled after Sutton's „SymbolBank“. It is hoped that especially the „alphabet“ package can be reused in other projects outside „SignWriter Java“.

The diagram shows some Java interfaces and classes which make up the programmer's interface to the alphabet. This interface is multi-alphabet capable. The programmer loads an Alphabet object using the Factory.loadAlphabet() method. From the alphabet one can manage the symbols and base symbols. The package is immutable: once loaded, it is impossible to destroy the alphabet by mistake. For symbols within a sign there is another class called SignSymbol outside the package (not shown in the diagram). There are many more technical details interesting for developers, but because the audience of the demonstration includes end users as well, we will stop here.
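As a speculative illustration of how this interface might be used: Factory.loadAlphabet(), Alphabet and SignSymbol are named above, but the argument and the accessor in the loop are our guesses, so this fragment would compile only against the real (not yet released) package:

import signwriter.alphabet.Alphabet;
import signwriter.alphabet.Factory;

/* Speculative usage of the alphabet package; only the loadAlphabet()
 * call and the Alphabet type come from the description above. */
public class AlphabetDemo {
    public static void main(String[] args) throws Exception {
        // Load an immutable Alphabet object; "sw-dos" is an invented
        // identifier standing for the old SignWriter DOS symbol set.
        Alphabet alphabet = Factory.loadAlphabet("sw-dos");

        // Hypothetical accessor: enumerate the base symbols it manages.
        for (Object baseSymbol : alphabet.getBaseSymbols()) {
            System.out.println(baseSymbol);
        }
    }
}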
Please see Fig. 1 at the bottom of the paper for a UML class diagram.

The new features of Java 1.5 are used in the program. They include genericity (especially useful for collections of objects like the symbols of a sign; Sign.getParts() returns a list of sign parts with the type List<Part>, for example), the enhanced for loop for easier iteration through collections, and many others. Additionally important for end users is the improved look-and-feel of Swing, which gives Java applications a more modern and friendlier appearance than before.
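For instance, the two language features just mentioned combine as follows; Sign, Part and getParts() mirror the names above, while these stub definitions are ours, added only to make the fragment self-contained:

import java.util.Arrays;
import java.util.List;

/* Stub types for demonstration; the real ones live in the signwriter packages. */
class Part {
    private final String name;
    Part(String name) { this.name = name; }
    public String toString() { return name; }
}

class Sign {
    private final List<Part> parts;
    Sign(List<Part> parts) { this.parts = parts; }
    List<Part> getParts() { return parts; }      // genericity: List<Part>, no casts
}

class PartsDemo {
    public static void main(String[] args) {
        Sign sign = new Sign(Arrays.asList(new Part("hand"), new Part("movement")));
        for (Part part : sign.getParts()) {      // enhanced for loop (Java 1.5)
            System.out.println(part);
        }
    }
}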
The demonstration is an opportunity to show and discuss design decisions, diagrams, screenshots and last-minute experiences, and to play with the latest development version of the unfinished software. Developers can ask questions about the inner workings; end users about the features and the look-and-feel. It is a big opportunity for the team as well: we need the feedback. Without feedback we don't know whether we are doing the right thing. You have an impact on the development.

Be warned, however. The software is unfinished and not even in alpha stage. Things might not work at all.

4. About Daniel Noelpp

Born Deaf in Switzerland in 1970, he attended a residential school for Deaf children near Berne. Later, he was „mainstreamed“ into a school with hearing children. He received his college diploma in 1989. After several years studying at the University of Berne, he worked as a Software Engineer for the same University as well as for several companies in Switzerland. In 2000, he worked for six months as a Software Consultant in Pune, India. At present, he is attending HTI (University of Applied Sciences) in Berne and developing „SignWriter Java 1.5/Swing“ at home.

5. The team members

The software is not developed by Daniel Noelpp alone. The other members of the team are Günel Hayirli (HTI student, hearing) and Matthias Noelpp (electrical engineer, hard of hearing).

6. Donations

We thank Ingvild Roald for the generous financial support! The project team is working hard with rather limited resources. If you are willing to give a donation to the development, it is appreciated very much. It is planned to put a list of supporters and donators prominently in the About menu of SignWriter. Would you like to be included in this list? Please contact Daniel Noelpp.

7. References

Sutton, V. (2004) The International Movement-Writing Alphabet – The IMWA Sign-Symbol-Sequence for All Sign Language and Gesture. La Jolla: Deaf Action Committee for SignWriting.
Sutton, V. (2002) Sutton's Sign-Symbol-Sequence 2002. La Jolla: Deaf Action Committee for SignWriting.
Sutton, V. (2002) Sutton's SymbolBank: Sign-Symbol-Sequence 1999 compared to Sign-Symbol-Sequence 1995. La Jolla: Deaf Action Committee for SignWriting.
Sutton, V. (1999) Lessons in SignWriting – Textbook and Workbook. La Jolla: Deaf Action Committee for SignWriting (2nd ed.).
Sutton, V. & Gleaves, R. (1995) SignWriter – The world's first sign language processor. La Jolla: Deaf Action Committee for SignWriting.
Costa, A. C. R. & Dimuro, G. P. (2001) A SignWriting-Based Approach to Sign Language Processing. Escola de Informática, Universidade Católica de Pelotas.
Sun Microsystems, Inc. (2004) J2SE 1.5 „Tiger“ Feature List. Sun Microsystems, Inc., Santa Clara, CA.
Sun Microsystems, Inc. (2004) Java 2 SDK, Standard Edition, Version 1.5.0 – Summary of New Features and Enhancements. Sun Microsystems, Inc., Santa Clara, CA.
Fig. 1: UML class diagram for package signwriter.alphabet

An Overview of the SiGML Notation and SiGMLSigning Software System
Ralph Elliott, John Glauert, Vince Jennings, Richard Kennaway
School of Computing Sciences
University of East Anglia
Norwich NR4 7TJ, UK
{re,jrwg,vjj,jrk}@cmp.uea.ac.uk
Abstract
We present an overview of the SiGML notation, an XML application developed to support the definition of Sign Language sequences
for performance by a computer-generated virtual human, or avatar. We also describe SiGMLSigning, a software framework which uses
synthetic animation techniques to provide real-time animation of sign language sequences expressed in SiGML.

1. Introduction

We have developed the SiGML notation (Elliott et al., 2001) to support our work in the ViSiCAST and eSIGN projects (Glauert, 2002; Glauert et al., 2004). These projects have been concerned with the development of techniques for the generation of sign language performances by a computer-generated virtual human, or avatar.

The name SiGML is an abbreviation for “Signing Gesture Markup Language”. SiGML is an XML application (Bray et al., 2004). Thus, SiGML data is represented as plain text in computer systems. SiGML encompasses several data formats used at different stages in the generation of virtual human animations, but its most prominent rôle is as the interface notation used in a prototype system supporting the generation of signed animation from natural language text. This system was a major part of the ViSiCAST project; as outlined in (Elliott et al., 2000), it contains two major subsystems:

• A “front-end” which uses natural language processing techniques to translate (English) text into an equivalent Sign Language form, for which a phonetic-level description is generated.

• A “back-end” which uses 3-D animation technology (together with artificial language processing) to generate a virtual human animation from the given phonetic-level description.

The natural language subsystem is designed to support output for several different national sign languages. Thus, it divides into a common initial stage, producing a language-neutral semantic representation (using DRT), followed by a stage specific to the target sign language. The most fully developed of the latter is that for British Sign Language (BSL) (BDA, 1992), which uses HPSG as the supporting grammatical formalism. More details on this work by our colleagues, Marshall and Safar, can be found in (Safar and Marshall, 2001; Safar and Marshall, 2002b; Safar and Marshall, 2002a; Safar and Marshall, 2002c).

The interface between the two subsystems is the SiGML notation, specifically the SiGML module we refer to as “gestural” SiGML. In the following section we describe gestural SiGML in more detail, concentrating on its relation to HamNoSys, the long established notation system for sign language transcription developed by our partners at the University of Hamburg. We then give a brief overview of SiGMLSigning, the back-end software subsystem identified above. We conclude with a simple example.

2. Gestural SiGML and HamNoSys

As we have indicated, gestural SiGML is based on HamNoSys (Prillwitz et al., 1989), that is, the Hamburg Notation System. This notation has been developed to support phonetic-level transcription of sign language performance by (real) human signers, and is intended to provide a model of sign language phonetics that is independent of any particular sign language. We have developed gestural SiGML with the explicit intention of formulating a model of signing gesture production which respects HamNoSys’s model of sign language phonetics. At the start of the ViSiCAST project, HamNoSys stood at version 3. In preparation for the development of gestural SiGML, an initial phase of the ViSiCAST project saw the development of HamNoSys version 4 (Hanke et al., 2000; Hanke and Schmaling, 2002). As far as the manual aspects of signing are concerned, HamNoSys 4 does not radically alter the already well-established features of HamNoSys 3, but generalises and regularises several of those features. The more prominent changes in HamNoSys 4 occur in connection with the non-manual aspects of signing, for which a far more comprehensive framework is provided than was previously available. Following HamNoSys, gestural SiGML includes both a manual component, concerned with the configuration and actions of the hands, and a non-manual component, concerned with other linguistically significant features of signing such as head movement, eye movement, eye gaze, and mouthing. In the rest of this section we outline some general features of the SiGML notation before briefly describing the two components in turn.

2.1. General Features of Gestural SiGML

Considered as XML, a valid SiGML document is a pure element hierarchy: every element is constrained by the DTD (Kennaway et al., 2002) either to have element content or to be empty; that is, no SiGML element contains any embedded text, although of course it can, and in most cases does, contain attribute definitions. A SiGML document defines a sequence of “signing units”. Typically, a signing unit is an explicit gestural definition for a single
sign, but it may also be a direct definition of avatar animation parameters, or an indirect reference to another SiGML document. A gestural sign definition is represented by a <hamgestural_sign> element. Since it is intended that any HamNoSys sign definition can be represented in SiGML, we also allow a tokenised form of a HamNoSys sign, represented by a <hns_sign> element. For convenience of reference each of these sign elements has a gloss attribute, giving a (spoken language) gloss of the sign’s meaning.

2.2. Manual SiGML

The manual component of a SiGML sign is represented by a <sign_manual> element. SiGML ascribes the same general structure to the manual component of a sign as does HamNoSys: an initial configuration followed by a sequence of actions or motions, which may well themselves be composite. Each of these components may involve both hands or just one hand, usually the signer’s “dominant” hand (i.e. the right hand for a right-handed signer). The initial configuration is a hand configuration, together with a location for that configuration. The configuration for each hand defines its hand shape, and its orientation in 3-D space. This orientation is specified as two components: extended finger direction (the direction of the metacarpal of the index finger) and palm orientation (the rotation of the palm about the axis defined by the other component). There is a basic set of a dozen standard handshapes, such as a fist, a flat hand, and a “cee” formed by the thumb and index finger. Many variations of these can be defined by specifying adjustments to the position of the thumb, various forms of bending of some or all fingers, and specific forms of contact or crossing between pairs of fingers. Hand shapes exemplify HamNoSys’s rather “operational” approach to the structure of feature definition: a simple instance of the given feature can be specified with no more than one or two symbols, while a more complex instance is obtained by appending additional modifier symbols defining how the required instance can be obtained from a simpler one.

In general terms, the location of a hand is defined with reference to a site on the signer’s body, head, arm or (other) hand, and a rough measure of the proximity of the hand to that site. With some misgivings, we have retained in SiGML the HamNoSys concept of a “hand constellation”, a special form of location which allows the definition of a potentially quite elaborate configuration of the hands as a pair, with (optionally) a location of this configuration relative to the body.

SiGML structures motions in a broadly similar fashion to HamNoSys, although SiGML tends to relegate to the level of informal semantics physical constraints to which HamNoSys gives direct syntactic embodiment. There is a repertoire of primitive motions, which may be combined in temporal sequence or in parallel, that is, concurrently, to any extent that makes physical sense. In SiGML, there are two other forms of structured motion (both inspired by comparable features in HamNoSys):

• Targeted motion: a motion for which an explicit target location (possibly a hand constellation) is specified.

• Repeated motion: various forms of single or multiple repetition of a given motion.

The simplest form of motion is a straight line motion in a given direction (any of the 26 directions defined by a non-zero position vector each of whose individual 3-D coordinates is either zero or one, or half-way between two adjacent directions of this kind). A straight line motion may be modified in a wide range of ways, including changing the distance moved, and tracing a curved, wavy or zig-zag path to the given end point. Other forms of simple motion include circular and elliptical motions (again with a wide range of variants), fluttering of the fingers, and several forms of wrist motion.

2.3. Non-Manual SiGML

The non-manual component of a SiGML sign is represented by a <sign_nonmanual> element. As described in (Elliott et al., 2004), the internal structure of this element closely follows non-manual feature definitions in HamNoSys 4. Thus, non-manual actions are partitioned into a hierarchy of tiers, corresponding to distinct articulators, as follows:

• Shoulder movements
• Body movements
• Head movements
• Eye gaze
• Facial expression: Eye-Brows, Eye-Lids, and Nose
• Mouthing: Mouth Pictures and Mouth Gestures.

Here, “facial expression” refers solely to those expressive uses of the face which are phonetically significant; by contrast, those uses which express the signer’s attitude or emotions about what is being articulated, important though they may be, cannot at present be expressed in SiGML (nor in HamNoSys). The two forms of mouthing reflect the distinction between motion of the lips and tongue caused by spoken accompaniment to signing (mouth pictures), and other phonetically significant motions of the lips, tongue, jaw and cheeks (mouth gestures). A mouth gesture often has a relatively elaborate internal structure which SiGML does not attempt to reflect, instead just identifying the unanalysed whole by a single label.

3. SiGMLSigning Animation Software System

SiGMLSigning is the software system we have developed, with support from partners in the ViSiCAST and eSIGN projects, to generate virtual-human signing animations on-screen from a sign sequence specified in SiGML. Architecturally, this system can be viewed as a pipeline of three processing stages, together with a control module which coordinates and schedules the transfer of data between these stages, stores the data they generate, and provides a programmable control interface. In its current form, the software is packaged as a set of ActiveX controls, which allow it to be deployed relatively easily in applications and
HTML pages on Microsoft Windows systems. The three processing stages are:

• SiGML Input and Pre-processing
• Animation Generation
• Virtual Human Animation

The interface between the first two stages is a sequence of gestural SiGML sign definitions; the interface between the second and third stages is a sequence of animation parameter sets, one set for each frame in the final animation. We outline each of these stages in turn, taking them in reverse order, in order to highlight the context each stage defines for its predecessor.

The final stage uses conventional 3-D animation technology. An avatar is represented by a virtual skeleton – a connected hierarchy of virtual bones – and a surface mesh – a connected tissue consisting of thousands of small, coloured, textured polygons. The configuration of these polygons determines the appearance of the avatar. The position and orientation of every polygon is determined (as part of the avatar’s definition) by the position and orientation of one or more of the avatar’s virtual bones. Hence a static posture of the avatar’s surface appearance is completely determined by a static posture of its virtual skeleton: standard 3-D rendering techniques, using a combination of software and special-purpose graphics hardware, can be relied on to produce the one from the other. So, an animation of the avatar is defined simply by the appropriate sequence of static skeleton configurations, one for each animation frame (typically at the rate of 25 fps). A refinement of this system allows the avatar’s appearance (in each frame) to be further modified by applying predefined distortions, known as morph targets or morphs, directly to the surface mesh. This technique is especially useful to us in defining facial non-manual gestures. The supplier of an avatar must therefore provide, as a minimum, a description of the physical structure of the avatar’s skeleton and a list of its available morphs, together with a simple rendering interface which (i) allows a skeleton configuration to be specified (together with morph weights, if required), and (ii) accepts a request to render the corresponding posture.
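A minimal Java rendering of this two-operation interface might look as follows; the type and method names are hypothetical, since the paper specifies the two operations but not an actual API:

/* Hypothetical sketch of the avatar rendering interface described above;
 * the two operations come from the text, the names and types are ours. */
public interface AvatarRenderer {
    /* Operation (i): specify a skeleton configuration, i.e. one rotation
     * per virtual bone, optionally together with weights for the morphs. */
    void setPosture(double[][] boneRotations, double[] morphWeights);

    /* Operation (ii): render the posture most recently specified. */
    void renderFrame();
}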
The preceding stage, at the heart of the SiGMLSigning system, is the animation generation stage, performed by a module called AnimGen. This maps a given sequence of gestural SiGML sign descriptions to the corresponding stream of avatar animation parameters. This stream is avatar-specific, since it depends crucially on the definition of the avatar’s physical characteristics provided by the avatar supplier. Indeed, we have found that avatar-independent sign synthesis depends crucially on the specification by the avatar supplier of the locations (relative to the skeleton) of quite a large number of sites on the avatar’s surface mesh, in addition to the basic physical characteristics already mentioned. The task of this stage, therefore, is to derive precise numerical animation parameters from the physically relatively imprecise SiGML sign definitions. The manner in which this is done currently, and some of the issues that arise, have been described more fully elsewhere (Kennaway, 2001; Kennaway, 2003; Elliott et al., 2004).

The first processing stage performs relatively straightforward pre-processing of the SiGML input. Its most basic function is to decompose this input into individual sign definitions, so that each can be handled in the appropriate manner: the <hamgestural_sign>s can be fed directly to the AnimGen stage, the <hns_sign>s are first passed through a HamNoSys-to-(gestural-)SiGML translator, while those containing pre-generated animation data are just converted directly to the internal stored format output by the AnimGen stage, which is by-passed in this case. The HamNoSys-to-SiGML translation takes the form of an additional processing pipeline: conventional context-free parsing techniques (augmented with backtracking to account for HamNoSys’s many syntactic ambiguities) are used to generate a syntax tree, which is then transcribed into an intermediate XML form, called HamNoSysML or HML; gestural SiGML is then generated from this using an XSLT transform (Clark, 1999; Kay, 2000).

The SiGMLSigning software system is thus a “scriptable” virtual human signing animation system, accepting as input arbitrary signing sequences expressed in SiGML, and providing the corresponding animation on any avatar which supports the simple rendering interface described above. Finally, it is noteworthy that the core animation module, AnimGen, generates frames at a sufficiently high rate that the animation appears almost instantaneously in response to the SiGML input.

4. A Simple Example

The following is the HamNoSys sequence for a very simple gesture (which does not represent any actual sign):

[HamNoSys symbols not reproducible here]

Here, the first symbol specifies the hand shape, a fist with the index finger extended; the second and third symbols specify the orientation of the hand: the index finger points outwards from the signer’s body, with the palm facing to the left; no initial location is explicitly specified for the hand, so a default, neutral position in front of the signer’s body is assumed; the final symbol specifies a straight movement from this initial position in an outwards direction, that is, away from the signer’s body. The insertion of a few more symbols into this example results in a genuine sign, namely the DGS (German Sign Language) sign “going-to”:

[HamNoSys symbols not reproducible here]

Here, the hand shape has a modifier specifying that the thumb is extended, the initial finger direction is now upwards-and-outwards, the outward motion has an upward arc modifier attached to it, and this motion is composed in parallel with a change of finger direction to downwards-and-outwards. The whole is prefixed with a symbol specifying motion of both hands in parallel, with the initial configuration of the non-dominant hand mirroring that of the
explicitly specified dominant hand. The HNS-SiGML form of this is:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE sigml SYSTEM .../sigml.dtd>
<sigml>
  <hns_sign gloss="DGS_going-to">
    <hamnosys_manual>
      <hamsymmpar/>
      <hamfinger2/>
      <hamthumboutmod/>
      <hamextfingeruo/>
      <hampalml/>
      <hamparbegin/>
      <hammoveo/>
      <hamarcu/>
      <hamreplace/>
      <hamextfingerdo/>
      <hamparend/>
    </hamnosys_manual>
  </hns_sign>
</sigml>

This is parsed during the input/pre-processing stage into the intermediate HML form shown (at the end of the paper) in Figure 2. In this easily generated but rather verbose format, an element typically corresponds to a HamNoSys syntactic category, while an attribute typically corresponds to an individual HamNoSys symbol, although the HamNoSys parallel composition brackets and the HML <paraction1> elements provide a counter-example to this general rule of thumb.

The XSLT translation which is applied to the HML form shown in Figure 2 produces the much flatter Gestural SiGML form shown immediately below:

<sigml>
  <hamgestural_sign gloss="DGS_going-to">
    <sign_manual both_hands="true">
      <handconfig handshape="finger2" thumbpos="out"/>
      <handconfig extfidir="uo"/>
      <handconfig palmor="l"/>
      <par_motion>
        <directedmotion direction="o" curve="u"/>
        <tgt_motion>
          <changeposture/>
          <handconfig extfidir="do"/>
        </tgt_motion>
      </par_motion>
    </sign_manual>
  </hamgestural_sign>
</sigml>

The synthetic animation module, AnimGen, pre-processes this Gestural SiGML into a more explicit form of SiGML in which the hand-shape information is reduced to numerical measures of joint angles (on a scale of 1 to 4), and the rôle of both hands is made explicit. This explicit form is shown (at the end of the paper) in Figure 3.

The stream of animation data output by AnimGen is extremely voluminous, and is usually passed directly from the computer system’s internal memory to the avatar rendering module. However, if desired, this data stream may be recorded for future reference in a file, in which case it is stored in SiGML’s CAS (Character Animation Stream) format. A few lines of the output for our “going-to” example on the VGuido avatar, developed by our eSIGN project partner Televirtual, are shown in Figure 4.

The animation generated for this sign in isolation has a duration of about 320ms (preceded by another 320ms while the avatar’s hands move from the rest position to the initial position of the sign itself). In Figure 1 below we show the animation frames for the start and finish of this sign.

Acknowledgements

We acknowledge with thanks financial support from the European Union, and assistance from our partners in the ViSiCAST and eSIGN projects.

5. References

BDA, 1992. Dictionary of British Sign Language. Faber and Faber.
Bray, T., J. Paoli, C.M. Sperberg, E. Mahler, and F. Yergeau (eds.), 2004. Extensible Markup Language (XML) 1.0. http://www.w3.org/TR/REC-xml/.
Clark, J., 1999. XSL Transformations (XSLT) Version 1.0. http://www.w3.org/TR/xslt.
Elliott, R., J.R.W. Glauert, and J.R. Kennaway, 2004. A framework for non-manual gestures in a synthetic signing system. In Proc. Cambridge Workshop Series on Universal Access and Assistive Technology (CWUAAT).
Elliott, R., J.R.W. Glauert, J.R. Kennaway, and I. Marshall, 2000. Development of language processing support for the ViSiCAST project. In ASSETS 2000 4th International ACM SIGCAPH Conference on Assistive Technologies. Washington DC, USA.
Elliott, R., J.R.W. Glauert, J.R. Kennaway, and K.J. Parsons, 2001. D5-2: SiGML definition. Working document, ViSiCAST Project.
Glauert, J.R.W., 2002. ViSiCAST: Sign language using virtual humans. In International Conference on Assistive Technology ICAT 2002. Derby: BCS.
Glauert, J.R.W., J.R. Kennaway, R. Elliott, and B-J. Theobald, 2004. Virtual human signing as expressive animation. In Proc. AISB-2004: Symposium on Language, Speech and Gesture for Expressive Characters.
Hanke, T., G. Langer, C. Metzger, and C. Schmaling, 2000. D5-1: Interface definitions. Working document, ViSiCAST Project.
Hanke, T. and C. Schmaling, 2002. HamNoSys 4 (Course Notes). http://www.sign-lang.uni-hamburg.de/Projekte/HamNoSys/HNS4.0/HNS4.0eng/Contents.html.
Kay, M., 2000. XSLT – Programmer’s Reference. Wrox Press Ltd.
Kennaway, J.R., 2001. Synthetic animation of deaf signing gestures. In 4th International Workshop on Gesture and Sign Language Based Human-Computer Interaction, LNAI. Springer-Verlag.
Kennaway, J.R., 2003. Experience with and requirements for a gesture description language for synthetic animation. In 5th International Workshop on Gesture and Sign Language Based Human-Computer Interaction, LNAI, to appear. Springer-Verlag.
Figure 1: Animation frames for the “Going-To” Example.

Kennaway, J.R., R. Elliott, J.R.W. Glauert, and K.J. Parsons, 2002. SiGML Document Type Definition (DTD). http://www.visicast.cmp.uea.ac.uk/sigml/sigml.dtd.
Prillwitz, S., R. Leven, H. Zienert, T. Hanke, J. Henning, et al., 1989. HamNoSys Version 2: Hamburg Notation System for Sign Languages — An Introductory Guide, volume 5 of International Studies on Sign Language and the Communication of the Deaf. SIGNUM Press, Hamburg.
Safar, E. and I. Marshall, 2001. Translation of English Text to a DRS-based, Sign Language Oriented Semantic Representation. In Conference sur le Traitement Automatique des Langues Naturelles (TALN), volume 2.
Safar, E. and I. Marshall, 2002a. An Intermediate Semantic Representation Extracted from English Text For Sign Language Generation. In 7th Symposium on Logic and Language.
Safar, E. and I. Marshall, 2002b. Sign Language Synthesis using HPSG. In 9th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI).
Safar, E. and I. Marshall, 2002c. Sign language translation via DRT and HPSG. In 3rd International Conference on Intelligent Text Processing and Computational Linguistics (CICLing).
<hamnosysml>
<sign gloss="DGS_going-to">
<hamnosys_sign>
<sign2>
<symmoperator att_par_or_lr="hamsymmpar"/>
<minitialconfig2>
<handconfig2>
<handshape2>
<handshape1 handshapeclass="ham_finger2" thumbpos="ham_thumb_out"/>
</handshape2>
<extfidir2>
<extfidir1 extfidir="direction_uo"/>
</extfidir2>
<palmor2>
<palmor1 palmor="ham_palm_l"/>
</palmor2>
</handconfig2>
</minitialconfig2>
<action2t>
<action1t>
<action1>
<par_action1>
<action1>
<simplemovement>
<straightmovement
arc="ham_arc_u" movement="ham_move_o"/>
</simplemovement>
</action1>
<action1>
<simplemovement>
<replacement>
<extfidir1
extfidir="direction_do"/>
</replacement>
</simplemovement>
</action1>
</par_action1>
</action1>
</action1t>
</action2t>
</sign2>
</hamnosys_sign>
</sign>
</hamnosysml>

Figure 2: Intermediate HML form for the “Going-To” Example.

<sigml>
<hamgestural_sign gloss="dgs_going-to">
<sign_manual both_hands="true">
<handconfig handshape="finger2" thumbpos="out"
bend2="0.00 0.00 0.00 0.00"
bend3="4.00 4.00 4.00 0.00"
bend4="4.00 4.00 4.00 0.00"
bend5="4.00 4.00 4.00 0.00"
bend1="-0.30 2.20 2.20 0.30 0.00" />
<split_handconfig>
<handconfig extfidir="uo" palmor="l"/>
<handconfig extfidir="uo" palmor="r"/>
</split_handconfig>
<handconstellation contact="medium">
<location location="palm" bodyside="nondom" contact="touch"/>
<location location="palm" bodyside="dom" contact="touch"/>
<location location="chest" contact="medium"/>
</handconstellation>
<par_motion manner="targetted">
<directedmotion manner="targetted" direction="o" size="medium"
curve="u" curve_size="medium" ellipse_direction="l"/>
<tgt_motion manner="targetted">
<split_handconfig>
<handconfig extfidir="do"/>
<handconfig extfidir="do"/>
</split_handconfig>
<handconstellation contact="medium">
<location location="palm" bodyside="nondom" contact="touch"/>
<location location="palm" bodyside="dom" contact="touch"/>
</handconstellation>
</tgt_motion>
</par_motion>
</sign_manual>
</hamgestural_sign>
</sigml>

Figure 3: Explicit low-level SiGML for the “Going-To” Example.
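The explicit SiGML above lends itself to straightforward post-processing with standard XML tools. The following minimal sketch is not part of the authors' software; the file name going_to.sigml is an assumption. It reads back the per-finger joint-angle measures from a listing like Figure 3:

import xml.etree.ElementTree as ET

# Minimal sketch: read the explicit SiGML of Figure 3, assumed saved as
# "going_to.sigml", and print the numerical joint-angle measures.
tree = ET.parse("going_to.sigml")
for handconfig in tree.iter("handconfig"):
    for finger in ("bend1", "bend2", "bend3", "bend4", "bend5"):
        value = handconfig.get(finger)
        if value is not None:
            # each bend attribute is a space-separated list of joint measures
            print(finger, [float(x) for x in value.split()])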

<CAS Version="CAS2.0" Avatar="VGuido">


<Frames Count="32">
<Frame Duration="20.0000" BoneCount="67" MorphCount="42">
<Morph Name="eee" Value="0.0000"/>
....
<Bone Name="ROOT">
<Position x="-0.0007" y="-0.0501" z="-0.0496"/>
<QRotation x="-0.0286" y="-0.7137" z="0.0276" w="0.6993"/>
</Bone>
....
</Frame>
....
</Frames>
</CAS>

Figure 4: Character Animation Stream (CAS) Data for the “Going-To” Example.
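A CAS stream can be inspected in the same way. A minimal sketch, assuming the stream is saved as going_to.cas and that the Duration attribute is in milliseconds (an assumption; the paper does not state the unit):

import xml.etree.ElementTree as ET

# Minimal sketch: frame count, total duration and root-bone rotation of a
# CAS stream, assumed saved as "going_to.cas"; element names follow Figure 4.
root = ET.parse("going_to.cas").getroot()
frames = root.find("Frames")
total = 0.0
for frame in frames.findall("Frame"):
    total += float(frame.get("Duration"))
    for bone in frame.findall("Bone"):
        if bone.get("Name") == "ROOT":
            q = bone.find("QRotation")  # quaternion driving the root joint
            print({axis: float(val) for axis, val in q.attrib.items()})
print("frames:", frames.get("Count"), "total duration:", total)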

Statistical Sign Language Translation
Jan Bungeroth, Hermann Ney
Lehrstuhl für Informatik VI, Computer Science Department
RWTH Aachen University, D-52056 Aachen, Germany
{bungeroth, ney}@informatik.rwth-aachen.de
Abstract
In the field of machine translation, significant progress has been made by using statistical methods. In this paper we suggest a statistical machine translation system between Sign Language and written language, especially for the language pair German Sign Language (DGS) and German. After introducing the system's architecture, statistical machine translation in general, and notation systems for Sign Language, the corpus processing is sketched. Finally, preliminary translation results are presented.

1. Introduction
The current progress in statistical machine translation suggests applying these methods to automatic Sign Language translation. This paper presents a first approach to such an application and discusses its advantages and disadvantages.

Deaf people, while fluent in their local Sign Language, often experience comprehension problems when they read written text or even lip-read spoken language. Thus, for assisting the Deaf to communicate in a world of spoken languages, translation is needed. Currently human interpreters fill this gap, but their service is expensive and not always available. While a machine translation system cannot fully replace an interpreter, it offers instant help in everyday communication.

We therefore propose a system for translating a Sign Language into a spoken language and vice versa. A complete system translating from Sign Language to spoken language needs a gesture recognizer as input, the translation system, and a speech synthesizer as output. The complete system translating from spoken language to Sign Language needs a speech recognizer as input, the translation system, and a graphical avatar as output. In this paper the focus is on the translation part. Figure 1 presents a schematic overview of such a system.

Figure 1: Automatic Sign Language translation system

2. Related Work

In recent years several groups have shown interest in machine translation for Sign Languages.

• In our group, Bauer et al. (1999) proposed a framework for statistical Sign Language translation. The authors suggested translating recognized video-based continuous Sign Language into spoken language.

• Other recent work was done by Sáfár and Marshall (2002) for translating English into British Sign Language using a rule-based approach. Here the grammar was modeled utilizing the HPSG formalism. The system is able to translate simple sentences.

• Huenerfauth (2004) introduces a rule-based concept for translating English text to American Sign Language (ASL).

• Also van Zijl and Barker (2003) propose another rule-based concept for translating English text to South African Sign Language (SASL).

Huenerfauth argues that a rule-based approach is better suited for Sign Language translation than statistical models, because large corpora are difficult to obtain, and concludes that a rule-based approach is more appropriate than a statistical one.

For our work, we do not share this conclusion: distinct corpora for Sign Languages are planned and already being worked on. Additionally, the optimization of the statistical translation process for scarce resources, as suggested e.g. by Nießen and Ney (2000), allows for further improvement.

3. Statistical Machine Translation

Until recently, only rule-based systems were used for natural language translation. Such systems typically require hand-written rules and dictionaries. However, over the last ten years a new approach has evolved, namely the statistical approach. This approach makes use of statistical decision theory and statistical learning. Such a system is trained using a set of sentence pairs. In recent evaluations like Chinese-to-English¹ and Arabic-to-English translation, it was found that these statistical approaches were comparable or superior to conventional systems.

In statistical machine translation a source sentence f_1^J = f_1 ... f_J is transformed into a target sentence e_1^I = e_1 ... e_I by choosing the sentence with the highest probability from all possible target sentences. This is given by Bayes' decision rule:

    ê_1^I = argmax_{e_1^I} { Pr(e_1^I) · Pr(f_1^J | e_1^I) }

Several statistical models are used to estimate the free parameters from large training data (e.g. see Brown et al. (1993), Och and Ney (2000)). One target word position is assigned to each source word position by alignments. Figure 2 shows the general architecture of the statistical translation approach.
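As a toy illustration of this decision rule (all sentences and probabilities below are invented; in the real system, Pr(e) comes from a language model and Pr(f|e) from lexicon and alignment models estimated on the training corpus, and the search ranges over far more candidates):

import math

lm = {  # Pr(e): toy language model over candidate DGS target sentences
    "DU WARTEN BIS TEE KOMMEN": 0.6,
    "TEE KOMMEN DU": 0.4,
}
tm = {  # Pr(f|e): toy translation model for one German source sentence
    ("du wartest darauf daß der Tee kommt", "DU WARTEN BIS TEE KOMMEN"): 0.7,
    ("du wartest darauf daß der Tee kommt", "TEE KOMMEN DU"): 0.1,
}

def translate(f):
    # Bayes' decision rule: argmax over e of Pr(e) * Pr(f|e), in log space
    return max(lm, key=lambda e: math.log(lm[e]) + math.log(tm.get((f, e), 1e-12)))

print(translate("du wartest darauf daß der Tee kommt"))
# -> DU WARTEN BIS TEE KOMMEN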
Figure 2: Architecture of the translation approach based on Bayes' decision rule. (The source language text is transformed into f_1^J; a global search maximizes Pr(e_1^I) · Pr(f_1^J | e_1^I) over e_1^I, using a lexicon model, an alignment model and a language model; a final transformation yields the target language text.)

4. Notation Systems

Several different notations and phonological systems are common in Sign Language research. When dealing with Sign Language translation, an appropriate Sign Language representation is necessary to transfer data from and to the sign recognizer and the presentation avatar. Furthermore, a word- or phoneme-based notation is needed for the internal alignment with the written words of the spoken language. A corpus based on such a notation system should qualify for training and testing a statistical machine translation system, but it might need pre- or postprocessing. The following notation systems are introduced:

• Glosses are written words, where one gloss represents one sign. Additional markings provide further information, e.g. non-manual signs. Unfortunately no gloss standard exists, which results in inconsistently annotated corpora.

• The notation system introduced by Stokoe (1960) was the very first phonological symbol system for ASL. It divides signs into movement (sig), hand shape (dez) and location (tab), which occur simultaneously. As it focuses on ASL, its application to other Sign Languages is not always possible. An ASCII encoding of the Stokoe system is available².

• The Hamburg Notation System HamNoSys (Prillwitz, 1989) is a more general form of the Stokoe system. Figure 3 shows an English sentence in gloss notation with markings and the corresponding HamNoSys glyphs.

• Liddell and Johnson (Liddell, 1984) suggest a sequential division of the sign stream into movement and hold segments. This avoids the simultaneous occurrence of phonemes.

Figure 3: Example for HamNoSys and gloss notation, taken from Prillwitz (1989)

5. Corpus Preparation

Statistical machine translation systems are trained using bilingual corpora containing full sentences. But two major problems arise when dealing with Sign Language. The first problem is the lack of large corpora. For example, in written language the Hansards corpus, with French and English sentences from debates of the Canadian Parliament, contains about 1,470,000 sentences. For Sign Language we have not found a corpus with more than 2000 sentences. The second problem is the lack of a notation standard.

¹ http://nist.gov/speech/tests/mt/
² http://world.std.com/~mam/ASCII-Stokoe.html

The existing corpora use gloss notations, which are too difficult to learn from such limited corpora. Furthermore, inconsistent use of the notation system complicates the problem.

As a starting basis, the corpus collected by the DESIRE Team Aachen³, consisting of 1399 sentences in DGS and German, was investigated, as it was one of the biggest available to us. Table 1 shows the details of this corpus, where singletons are words occurring only once. Note the very high number of singletons. This comes from the high diversity of the sentences. In addition, every word with a non-manual sign, e.g. (1), is counted as an extra word.

(1) neg-HABEN
    "not have"

                          DGS    German
no. of sentence pairs        1399
no. of running words      5480      8888
no. of distinct words     2531      2081
no. of singleton words    1887      1379

Table 1: DESIRE corpus statistics

This is not usable for statistical machine translation. Thus, for first experiments, a small corpus was built from the DESIRE corpus. Several considerations were made.

Brackets indicating a non-manual sign on a whole phrase or sentence are expanded. Consider sentence (2).

(2) WAHL+ERGEBNIS qu-{WISSEN DU}
    "Do you know the election results?"

Table 2 shows the ASCII representation of this sentence before and after expanding the brackets.

WAHL+ERGEBNIS qu-{WISSEN DU}
WAHL+ERGEBNIS qu-WISSEN qu-DU

Table 2: Expanding brackets in the corpus file

Additional information on locus agreement was deleted, as it cannot be learned. E.g. in the phrase (3) the 'arbeit' refers to a place in signing space. This information is deleted. After the translation to DGS it can be partially reconstructed by rules.

(3) ARBEITEN X'arbeit'
    "at work"

When suitable, the non-manual signs were treated as single words. As an example, (4) is processed as seen in Table 3, so that it can be mapped to the German translation "nicht mögen". But (5) is kept intact, so that it can be mapped to the German "unmöglich".

(4) neg-MÖGEN
    "to like not"

(5) neg-MÖGLICH
    "impossible"

neg-MÖGEN
neg MÖGEN

Table 3: Separating non-manual signs in the corpus file

These methods were used to form the new corpus of 200 sentences. In this corpus the number of singletons is kept low for better training. In addition, most words or word forms have an entry in a bilingual manual lexicon. Table 4 gives an overview of the corpus. While this is not enough training data for a fully-fledged translation system, it allows the first experiments, which we discuss in section 6.

                                      DGS   German
Training:  no. of sentence pairs         167
           no. of running words       845      828
           no. of distinct words       73      142
           no. of singleton words      15       48
Testing:   no. of sentence pairs          33
           no. of running words       157      161
           no. of distinct words       43       74
           no. of singleton words      18       40

Table 4: The small DGS/German corpus statistics

³ http://www.germanistik.rwth-aachen.de/desire
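The three preparation steps can be pictured as plain text transformations on the gloss file. The following is a minimal sketch; the regular expressions and the list of separable glosses are illustrative assumptions, not the actual DESIRE conventions:

import re

SEPARABLE = {"neg-MÖGEN"}  # glosses whose non-manual tag has its own German word

def expand_brackets(gloss):
    """qu-{WISSEN DU} -> qu-WISSEN qu-DU: distribute a phrase-scope tag."""
    def repl(m):
        tag, phrase = m.group(1), m.group(2)
        return " ".join(tag + "-" + w for w in phrase.split())
    return re.sub(r"(\w+)-\{([^}]*)\}", repl, gloss)

def delete_locus(gloss):
    """Remove locus-agreement markers such as X'arbeit'."""
    return re.sub(r"\s*X'[^']*'", "", gloss)

def separate_nonmanual(gloss):
    """neg-MÖGEN -> neg MÖGEN, only for tags mapped to a German word."""
    return " ".join(w.replace("-", " ", 1) if w in SEPARABLE else w
                    for w in gloss.split())

line = "WAHL+ERGEBNIS qu-{WISSEN DU}"
print(separate_nonmanual(delete_locus(expand_brackets(line))))
# -> WAHL+ERGEBNIS qu-WISSEN qu-DU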

6. Results

For translation experiments, training and testing data are needed, as well as an objective error measure. The corpus shown in Table 4 is divided into training samples (83% of the sentences) and testing samples (17% of the sentences). The training is performed using various statistical models like IBM Models 1-4 (Brown et al., 1993) and others like Hidden Markov Models (HMM) (Och and Ney, 2000). Figure 4 shows the alignment of a sentence pair obtained in training. For testing, the test sentences in the source language are translated and compared with the known target sentences. These translation results are then evaluated.

Figure 4: Trained alignment of a sentence pair (alignment no. 58): the German words "hast du gestern abend Nachrichten gesehen ?" aligned with the DGS glosses GESTERN ABEND NACHRICHTEN SEHEN qu-GEWESEN qu-DU

We use the following objective evaluation criteria for error measurement (a small sketch of both measures follows below):

• mWER: The word error rate (WER) is computed as the minimum number of substitution, insertion and deletion operations that have to be performed to convert the generated sentence into the target sentence. This performance criterion is widely used in speech recognition. The minimum is computed using a dynamic programming algorithm and is typically referred to as the edit or Levenshtein distance. In addition, for the multi-reference WER (mWER) not only one but a set of reference translation sentences is used (Nießen et al., 2000).

• mPER: The position-independent word error rate (PER) compares the words of the two sentences without considering the word order. The PER is less than or equal to the WER. The multi-reference PER (mPER) again considers a set of reference translation sentences.
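A minimal sketch of both measures (this is not the evaluation tool of Nießen et al. (2000); the PER variant shown is one common simplification):

from collections import Counter

def wer(hyp, ref):
    """Word error rate: Levenshtein distance over words / reference length."""
    d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        d[i][0] = i
    for j in range(len(ref) + 1):
        d[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            d[i][j] = min(d[i - 1][j] + 1,  # deletion
                          d[i][j - 1] + 1,  # insertion
                          d[i - 1][j - 1] + (hyp[i - 1] != ref[j - 1]))
    return d[len(hyp)][len(ref)] / len(ref)

def per(hyp, ref):
    """Position-independent error rate: compares bags of words, ignores order."""
    matches = sum((Counter(hyp) & Counter(ref)).values())
    return (max(len(hyp), len(ref)) - matches) / len(ref)

def mwer(hyp, refs):
    """Multi-reference WER: score against the closest reference."""
    return min(wer(hyp, r) for r in refs)

hyp = "FRISCH ÄPFEL UND BANANEN SCHMECKEN GUT".split()
ref = "BANANEN FRISCH UND ÄPFEL SCHMECKEN GUT".split()
print(wer(hyp, ref), per(hyp, ref))  # word-order errors raise WER but not PER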
We performed the translation from German to DGS on the small corpus. Table 6 shows the mWER and mPER error rates for our experiments. As a reference, the baseline is a simple word-to-word translation. We then applied our models for the training of alignment models to improve the results.

                       mWER [%]   mPER [%]
single word                85.4       43.9
alignment templates        59.9       23.6

Table 6: Testing results for German to DGS

The examples in Table 5 show translations from our test corpus. The first sentence is a correct translation, while the second sentence is in partial disorder. The last sentence shows a wrong word order and missing words.

German                                    automatic DGS translation                manual DGS translation
du wartest darauf daß der Tee kommt       DU WARTEN BIS TEE KOMMEN                 DU WARTEN BIS TEE KOMMEN
frische Bananen und Äpfel schmecken gut   FRISCH ÄPFEL UND BANANEN SCHMECKEN GUT   BANANEN FRISCH UND ÄPFEL SCHMECKEN GUT
ich mag nicht fliegen                     ICH NICHT UNKNOWN fliegen                FLIEGEN ICH neg MÖGEN

Table 5: Translated sentence pairs for German and DGS

7. Summary

For the translation of spoken language into Sign Language, we propose statistical machine translation. Such a system is trained with bilingual corpora. While Sign Language corpora are still rare, we demonstrated how such a corpus can be prepared for the translation system. Furthermore, we performed first experiments on a small German-DGS corpus and presented results. While this is meant only as a small-scale example and a proof of concept, we are confident of applying our methods to real-world conditions and corpora.

Future work includes the construction of a more suitable corpus and further improvement of the translation performance. Especially, we expect performance gains from the use of better dictionaries and of linguistic knowledge like morpho-syntactic information.

8. References

B. Bauer, S. Nießen, and H. Hienz. 1999. Towards an automatic Sign Language translation system. In Proc. of the Int. Workshop on Physicality and Tangibility in Interaction, Siena, Italy.
P. F. Brown, S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263–311.
M. Huenerfauth. 2004. A multi-path architecture for machine translation of English text into American Sign Language animation. In Proc. Student Workshop at the Human Language Technologies conference HLT-NAACL, Boston, MA, USA.
S. Liddell. 1984. Think and believe: Sequentiality in American Sign Language. Language, 60(2):372–399.
S. Nießen and H. Ney. 2000. Improving SMT quality with morpho-syntactic analysis. In Proc. of the 18th Int. Conf. on Computational Linguistics, Saarbrücken, Germany.
S. Nießen, F. J. Och, G. Leusch, and H. Ney. 2000. An evaluation tool for machine translation: Fast evaluation for machine translation research. In Proc. of the Second Int. Conf. on Language Resources and Evaluation (LREC), pages 39–45, Athens, Greece.
F. J. Och and H. Ney. 2000. A comparison of alignment models for statistical machine translation. In Proc. of the 18th Int. Conf. on Computational Linguistics, pages 1086–1090, Saarbrücken, Germany.
S. Prillwitz. 1989. HamNoSys. Version 2.0; Hamburg Notation System for Sign Language. An Introductory Guide. Signum Verlag.
E. Sáfár and I. Marshall. 2002. Sign Language generation using HPSG. In Proc. Int. Conf. on Theoretical and Methodological Issues in Machine Translation (TMI), pages 105–114, Japan.
W. Stokoe. 1960. Sign Language structure: An outline of the visual communication systems of the American deaf. Studies in Linguistics, Occasional Papers, 8.
L. van Zijl and D. Barker. 2003. A South African Sign Language machine translation system. In Proc. 2nd Int. Conf. on Computer Graphics, Virtual Reality, Visualisation and Interaction in Africa, pages 49–52, Cape Town, South Africa.

Computer support for SignWriting written form of sign language
Guylhem Aznar, Patrice Dalle
TCI team, IRIT lab www.irit.fr
<[email protected]>, <[email protected]>

Abstract
SignWriting's thesaurus is very large. It consists of 425 basic symbols, split into 60 groups from 10 categories. Each basic symbol can have 4 different representations, 6 different fillings and 16 different spatial rotations.

While SignWriting is more and more used by the deaf community, it currently lacks complete and platform-neutral computer support to let signwriters share documents regardless of the applications and the underlying operating system they may be using.

Based on previous research, various propositions have been made, resulting in multiple incompatible systems. The main problem currently is the lack of a consistent basis upon which compatibility could be built: the most advanced and most used system, SWML [1], is multiplatform thanks to Java, but requires dedicated applications, like the previous attempts. Moreover, the use of an XML-based representation requires dozens of lines of code for each symbol, resulting in oversized files which cannot be parsed, used or read with standard tools. XML linking to bitmap pictures for on-screen representation prevents the integration of a real font system, needed for true portability, and causes scalability problems. Moreover, like previous systems, SWML still comes with a complex user interface, a little easier to learn but slower, symbols being entered via the mouse.

Even if this advanced approach helped the signwriter community, replacing the manual insertion of GIF graphic files for each symbol, at the moment the signwriting community must revert to screenshots and pictures to ensure documents can be shared and read, resulting in little reusability for both users and researchers, and low computational possibilities, worsened by the absence of signwriting optical recognition software. Guylhem Aznar, a first-year medical resident and a PhD student in Computer Science in Pr. Patrice Dalle's TCI team at IRIT (Toulouse, France), is proposing a unicode-based representation for SignWriting with a suite of free software tools running on GNU/Linux but also supporting non-free operating systems.

This unicode-based approach puts a strong emphasis on facilitating communication and compatibility through a unicode reconstruction engine. Usage and computer entry are also made simpler thanks to different possibilities of human interaction: keyboard, mouse and sensitive-area (handwriting) support, which all result in the same unicode-text output. This output can then be shared, reused or studied easily. The choice of unicode over XML facilitates integration in existing software.

The system works in layers: the entry layer, the keycode layer, the unicode layer, the rendering layer and the font layer. These layers are independent and therefore easy to adapt and improve.

In the keycode layer, each signwriting "basic symbol" is coded by a different number called its "internal name". A basic symbol is first positioned geometrically by "positioning elements" defining concentric circles and the respective angular position of the basic symbol on these circles. The basic symbols can be completed by additional information regarding the possible variations, such as spatial rotations, required in order to form the complete symbol. These "additional information elements", like the basic symbols and the positioning elements, are also coded by one or more numbers, also called internal names. All these internal names are linked to their respective meanings in a mapping table. Additional internal names can be defined following the evolution of the SignWriting standard. Finally, "delimiters" are used to group basic symbols into complete signwriting units.

In the unicode layer, another mapping table is used: the internal names are mapped to unique unicode characters. One or more internal names can be mapped to a unicode character, but each unicode character can only have one mapping. This non-bijective approach is required to follow the unicode standard.

In the entry layer, signwriting symbols can be entered with different peripherals like a keyboard or a mouse. The mouse-driven graphical input system will be completed by other entry modes in the future. Following the traditional key-mapping entry mode, a table maps internal names to physical keys on the keyboard. Multiple keyboard mapping tables allow different physical layouts for different countries, or following user preferences. The entry layer is separated from the rest of the system; it is only tied to the system by its dependency on the unicode layer, required in order to output unicode characters following the keycode layer specifications.

In the rendering layer, a unicode reconstruction engine like Gnome's Pango transforms the flow of unicode characters into a graphical representation, i.e. a complete signwriting symbol. It is not yet suitable for display: all elements are still numbers (then called "external names") and must be replaced by graphics. The transformation is coded by a set of rules [3] describing the possible combinations and the outputs, as is done for unicode Arabic and Indic language support.
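As a sketch of the keycode and unicode layers just described (all internal names, code points and symbol meanings below are invented for illustration; the actual tables, and the code points eventually reserved for SignWriting, are not those shown):

# Keycode layer: every entity (basic symbol, positioning element, delimiter)
# is a number, its "internal name"; values and meanings are invented here.
MEANINGS = {
    0x101: "basic symbol: index finger",
    0x201: "positioning element: first concentric circle",
    0x900: "delimiter: end of signwriting unit",
}
# Unicode layer: internal names map to unique characters (here in a Private
# Use Area); several internal names may share one character, but each
# character carries exactly one mapping back.
TO_UNICODE = {0x101: 0xF0101, 0x201: 0xF0201, 0x900: 0xF0900}

def encode(internal_names):
    """Turn a flow of internal names into plain unicode text."""
    return "".join(chr(TO_UNICODE[n]) for n in internal_names)

unit = [0x201, 0x101, 0x900]          # circle, symbol on it, end of unit
for n in unit:
    print(hex(TO_UNICODE[n]), "-", MEANINGS[n])
print(repr(encode(unit)))             # portable text any editor can store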

In the font layer, a font subsystem like Gnome's Freetype/xft2, which supports both traditional bitmap fonts and vectorial fonts, takes care of the graphical representation, replacing external names by their corresponding graphical symbols. Different fonts can of course be used.

Considering a symbol has been entered through the entry layer, it must then be transcribed into a series of unicode characters following these steps (a sketch follows below):
- first, a positioning element is used to define a circle. If this circle is preceded by another circle before the initial delimiter, it is embedded in that circle. A special type of circle is used to define the contour of the face;
- then, basic symbols are positioned on that circle, with positioning elements to define their angular position, followed by additional information elements if these basic symbols need rotations, special fillings, etc.;
- finally, a delimiter is used to mark the end of the signwriting unit.
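Under the same invented mapping as above, the three steps can be sketched as a small assembly function:

PUA = 0xF0000  # invented Private Use Area base for the toy mapping

def cp(internal_name):
    return chr(PUA + internal_name)  # one character per internal name

def signwriting_unit(circles, delimiter=0x900):
    """circles: list of (positioning_element, [(symbol, [modifiers]), ...])."""
    out = []
    for positioning, symbols in circles:
        out.append(cp(positioning))                # step 1: define a circle
        for symbol, modifiers in symbols:
            out.append(cp(symbol))                 # step 2: symbol placed on it
            out.extend(cp(m) for m in modifiers)   # rotations, fillings, ...
    out.append(cp(delimiter))                      # step 3: close the unit
    return "".join(out)

unit = signwriting_unit([(0x201, [(0x101, [0x301])])])
print([hex(ord(c)) for c in unit])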
The internal names of these entities are never used directly – instead, unicode characters are used, which allows existing software to process signwriting. These unicode characters are then mapped to the internal names, and the rendering layer geometrically and spatially reconstructs a complete signwriting unit in the form of external names. The font layer then replaces this information by the graphical drawing of the complete unit.

Currently, the different layers are under work. They do not require the same amount of work: the most complicated part is the definition of rules for the rendering layer [4], the hardest task is drawing fonts, the most important is the keycode layer, to provide a quick replacement for SWML, and the longest part is reserving enough space in unicode for a correct signwriting implementation. The latter may eventually be impossible, in which case "private" unicode areas will have to be used. This should only cause some minor changes in the unicode layer, but will damage the portability benefits of using unicode.

This entire "text-like" layered approach makes a clear separation between the various sub-systems used, providing a solid base upon which new sub-systems can be built (for example, handwriting recognition in the entry layer), and any layer can be upgraded (e.g. adding additional vectorial fonts, or supporting a new SignWriting standard) without requiring a full system redesign. Applications following Gnome's API can immediately take advantage of signwriting support, which means a whole desktop suite of software is made available for free to deaf users. Moreover, signwriting features (e.g. writing from top to bottom) no longer need special handling through specific applications, thanks to Gnome localisation support.

An additional advantage is the portability of the model. Support on GNU/Linux-based PDAs requires no further work. Windows or MacOS support would require minimal support in the entry layer and at some specific points in the font layer. The upcoming support of Windows and MacOS by Gnome applications means these steps could also simply be removed in the short term. Moreover, SignWriting transcription in standardized unicode text means the text can be subject to automated computer analysis, exchanged by researchers, etc. Possible evolutions of the system include a statistical approach for auto-completion and handwriting recognition, and will certainly focus on the user interface with the design of specific Gnome Accessibility features.

1. References:
[1] Rosenberg, A. Writing Signed Language, "In Support of Adopting an ASL Writing System", Dept. of Linguistics, Univ. of Kansas, USA, 1999. http://www.signwriting.org/forums/research/rese010.html
[2] Antonio Carlos da Rocha Costa and Gracaliz Pereira Dimuro, "A SignWriting-Based Approach to Sign Language Processing", Universidade Catholica de Pelotas, Brasil, 2001. http://www.techfak.unibielefeld.de/ags/wbski/gw2001book/draftpapers/gw38.pdf
[3] Klaus Lagally, "ArabTeX: typesetting arabic with vowels and ligatures", 1992. http://citeseer.nj.nec.com/rd/64325440%2C95192%2C1%2C0.25%2CDownload/http://citeseer.nj.nec.com/cache/papers/cs/350/ftp:zSzzSzftp.informatik.unistuttgart.dezSzpubzSzlibraryzSzncstrl.ustuttgart_fizSzTR-1992-07zSzTR-1992-07.pdf/lagally92arabtex.pdf
[4] Michel Fanton, Certal-Inalco, "Finite State Automata and Arabic Writing". http://acl.ldc.upenn.edu/W/W98/W98-1004.pdf

Chinese Sign Language Synthesis and Its Applications
Yiqiang Chen, Wen Gao, Changshui Yang, Dalong Jiang, Cunbao Ge
Institute of computing technology, Chinese Academy of Sciences, Beijing, China, 100080
{yqchen, wgao, csyang, dljiang, cbge}@jdl.ac.cn

Abstract
Sign Language is the communication language of the deaf community. Every region of the world may have its own sign language, and in China there are over 20.57 million deaf people using many varieties of sign language. Hence a standard Chinese Sign Language for the deaf has been revised several times by the Chinese Deaf-mute Association, supported by the Chinese Government. The updated standard Chinese Sign Language makes it easy to communicate with deaf people anywhere in China.

"How can we learn it in a short time and in a convenient way?" The traditional face-to-face and tape-recorder teaching methods cannot express meaning well, due to time and space limitations. Therefore a Chinese Sign Language synthesis system has been developed. The system uses artificial intelligence and computer graphics technology to demonstrate Chinese Sign Language, for any user, by means of a 3-dimensional virtual human: the software demonstrates the standard Chinese Sign Language rendering of the Chinese text you type in (Fig. 1).

Fig. 1: Chinese Sign Language Synthesis System

This system has integrated advanced research results and key technologies from both home and abroad. Three new technologies were developed in our system. The first is realistic face animation: in sign language, about one third of the words in the whole sign vocabulary must carry facial expressions to make the gesture understood, so face animation and expression synthesis are very important for the whole system. The second is motion retargeting technology, which can retarget the standard data to any given character model, making the animation data signer-independent. The third is a synchronization model between gesture and lip motion.

The system also has several notable characteristics. First, it covers a large vocabulary of Chinese Sign Language, totaling 5596 sign language terms, 30 Chinese finger-spelling signs and 24,817 synonyms, containing almost all of the contents of middle school and elementary school textbooks in China. Second, it realizes interactive study, not limited by time and space. Third, users can choose from several human images or models. And last, it offers a clear interface, easy operation and free addition of new signs. In experiments, the software was given a score of 92.98 for the visual quality and understandability of finger spelling, 88.98 for words, and 87.58 for sentences by students from Deaf-mute schools. The system has great significance for standardizing, studying and popularizing Chinese Sign Language, and plays a very important role for the hearing-impaired society. It could be used in all kinds of service businesses and public places, and could bring great convenience to deaf people's life and study. Its applications include WebSigner (Fig. 2(a)), TVSigner (Fig. 2(b)) and OnlineSigner (Fig. 2(c)).

Fig. 2: The system applications: (a) WebSigner, (b) TVSigner, (c) OnlineSigner


WebSigner (a) can be used to help deaf people obtain information from the Internet in their accustomed way. TVSigner (b) can generate a virtual signer for TV, helping deaf people watch TV. OnlineSigner (c) can be used for Chinese Sign Language learning, allowing you to learn standard Chinese Sign Language in a short time.

“Progetto e-LIS@”
Paola Laterza and Claudio Baj
ALBA Piccola Soc. Coop. Sociale Via San Donato, 81 – 10144 TORINO (Italy) E-mail: [email protected]

"Progetto e-LIS@" is the presentation of a work-in- from the fist-shape: first the thumb, then the index finger
progress, which was started in November 2000 by two and the numbers from zero to five. The thumb represents
Italian scholars, Paola Laterza (who is a hearing the number 1, the index 2, and so on, up to 5 with the
psychologist) and Claudio Baj, a Deaf LIS teacher. Their open hand. Then there are also two fingers that appear
aim is to find a system of cataloguing signs in order to simultaneously, then three, and so on, up to the point of
create a complete but flexible multimedial dictionary, to having five extended fingers and an open hand. When
be used both by competent Italian Sign Language users there are two handshapes that have the same two extended
and by competent users of Italian. This research presents a fingers, preference is given to the one with two joined
new way of ordering signs, different from the usual fingers rather than to the one with open fingers, because
alphabetical one, and is more congenial to the signing the latter looks more open from the visual level (e.g. “H”
community's linguistic needs, which are clearly oriented vs. “V”). In cataloguing the handshapes, reference is
to the visual-corporeal channel rather than to the written- made to the dominant hand, even if a sign requires both
oral one. In fact, there are Italian/Sign Language hands with different handshapes. The handshape symbols
dictionaries based on the alphabetical order, but there is have been taken from the dictionary by Radutzky (1992).
none that goes from Sign Language to the written-oral
language (Italian). Special attention has been paid to how 2 nd version (15th January, 2001)
signs are systematised: so far the handshape parameter Figure 2: 2nd version
has been explored in detail, but in the near future we plan In the second version we maintained the same criteria
to associate it with two more parameters, viz. location and as in the first, but a few slight changes were made in the
orientation. At a later date movement and non-manual choice of the principal handshapes. We felt the need for a
signals will also be included among the cataloguing further criterion which would allow us to flexibly insert as
criteria. The objective is not only to put signs in order many handshapes as possible by following an order that
according to a more flexible and therefore acceptable will not create confusion. Therefore we saw the addition
system for signers (like the alphabetical order satisfies of the subgroup criterion as a useful innovation. The
hearing people's phonological needs), but also to allow for principal handshapes are still 14, but with some
the quick search of signs in the multimedial dictionary. variations. The sequence of the hanshapes were changed
The paper describes how, after elaborating different vis à vis the previous version. However, the number of
versions in their step-by-step research, the two researchers principal handshapes remained unchanged. The new order
decided that the present format was more functional, of principal handshapes was as follows: A – S – G – I – L
practical and economical from the point of view of the – Y – H – V – Ycor – 3 – 4 – 3/5 – B – 5. Over and above
dictionary as an instrument. They will present the results the 14 principal handshapes, we started to include other
already obtained in their research as well as their “subordinate” handshapes, putting them in subgroups
intermediate findings to demonstrate their chosen work dependent on the principal ones. The subgroups were
method but also to receive feedback from other Italian and catalogued according to the position of the fingers in the
European realities. principal handshapes, from which, with progressive
curving, bending or closing movements, one finally
1. HANDSHAPES reached the subordinate subgroups. After singling out the
1 st version (27th November, 2000) subgroup criterion we chose to add a further criterion to
Figure 1: 1st version order the handshapes within the subgroups themselves.
Our first step was to single out a number of so-called According to this criterion, the subordinate handshapes
“principal” handshapes, chosen from the ones that follow a contrasting closing-opening movement, followed
appeared clearest, best-defined, with extended fingers and by the principal handshapes: starting from the maximum
in alignment with the hand, easy to remember for either opening of the principal handshape, the subgroup is
experienced or inexperienced signers. 14 handshapes were shaped by the progressive closing of the fingers (e.g. L,
chosen: As – A – S – G – I – L – Y – H – V – Ycor – 3 – cc, Lp, Lq, Lch, T). In this version 37 handshapes were
4 – B – 5. These were ordered by starting from the closed catalogued.
fist and progressing to the open hand, since we recognized
the fist as the origin of all the other handshapes (cf.
Volterra 1987). Subsequently one finger at a time appears

3rd version A (12th March, 2001)
Figure 3: 3rd version A

Here we followed up our previous findings and tried to add more and more handshapes, while at the same time maintaining clarity and linearity. To facilitate our research for the multimedia dictionary, we decided to subdivide the subgroups further, creating branches of the principal handshapes. In the previous version each subgroup was linear, and the handshapes (both curved and flat) were collocated within it and ordered according to a very arbitrary criterion of closure, based on the impression of more or less filling of the visual space. Here, on the other hand, some branches were drawn up from those handshapes which, starting from the principal one, follow a movement of flat closure, while other branches follow a movement of curved closure. This version includes 53 handshapes, of which 20 are principal, and represents an attempt to list and order all the handshapes existing, in our opinion, in Italian Sign Language. The principal handshapes are: As – A – S – G – I – L – Y – H – V – Ycor – Hs – 3 – Ys – W – 4str – 4 – 3/5 – B – Bs – 5.

In this version we started to systematize the criteria; some remained unchanged, while other new ones were created from the previous versions.

First criterion: the order of the principal handshapes proceeds from the closed fist to the progressive extension of one finger at a time, from the thumb to the little finger, and subsequently of two, three, four and five fingers extended at the same time. In the first five handshapes, each finger is withdrawn to leave space for the following one, following the numerical order from the thumb to the little finger. The same principle guides the order of the handshapes formed by pairs of fingers, by threes, fours or fives.

Second criterion: among the principal handshapes, according to the principle of progressive opening of the hand, those with joined fingers precede those with the same but separated fingers.

Third criterion: having chosen to consider all handshapes as independent of each other, we decided that a linear, sequential list of 53 handshapes would be difficult to implement. To overcome the difficulties that a very long list would cause in cataloguing, learning, memorizing and use, already during the second version we opted for the creation of subgroups. As "principal" handshapes we chose handshapes which contrast clearly with each other and are easy to perform from the motorial point of view. The subgroups consist of those "subordinate" handshapes that present limited distinctive features and are more difficult to perform.

Fourth criterion: since the principal handshapes, chosen from the clearest and most distinct, are performed with the fingers in an extended position and in alignment with the hand, the movement used to order the subgroups follows the progressive closure of the fingers, contrary to the movement of progressive opening of the principal handshapes.

Fifth criterion: since an enormous variety of subordinate handshapes exists within the subgroups, we tried a further subdivision to create more order. Different branches originate from a principal handshape, depending on the typology of the closure movement (i.e. flat or circular). The flat handshapes, moving towards progressive closure with extended fingers, precede the handshapes with curved fingers, since the latter enclose a more limited area of the palm, while the former leave a wider opening.

Sixth criterion: in signs where both hands are used, the handshapes are sometimes different. In cataloguing these cases, reference is made to the dominant hand (i.e. for right-handed people the right hand, and for the left-handed the left hand).

3rd version B (6th February, 2002)
Figure 4: 3rd version B

In the following version the previously elaborated criteria underwent some more changes; moreover, five new handshapes were added, reaching a total of 58. Some moves were also carried out, to better satisfy the recognized criteria. We thus have 20 principal handshapes: S – G – Yi – I – L – Y – H – V – Ycor – Hs – 3 – Ys – Wstr – W – 4str – 4 – 3/5 – B – Bs – 5. The new handshapes are those that are used very little but are present in LIS and have never been catalogued officially.

First criterion: the principal handshapes have fingers extending from the fist at a right angle, not bent, while the other fingers are closed, i.e. they have contact with the palm of the hand. They have been singled out among those handshapes which correspond to the numbers "1" to "5" on one hand, starting from the thumb and ending at the little finger.

Second criterion: the principal handshapes follow the movement of progressive extension of the hand from a closed to an open position, from "1" to "5", from the thumb to the little finger, following the intermediate passages.

Third criterion: in the subordinate handshapes the fingers are in a bent position and, if they have contact with parts of the hand, it is not with the palm (as in the closed handshapes), but almost exclusively with the fleshy tip of the thumb. Other contacts between fingers are only considered as part of a movement of large closure. Subordinate handshapes are grouped together in subgroups.

Fourth criterion: since the fingers are straight and not bent in the principal handshapes, it is self-evident that the subordinate handshapes follow a movement of progressive closure within the subgroups, in contrast with the movement of progressive opening of the principal handshapes.

Fifth criterion: different branches originate within a subgroup from a principal handshape, depending on the typology of the closure movement applied, i.e. flat or circular. Flat handshapes closing progressively with extended fingers precede the ones where the fingers are curved, since the latter occupy a more limited area of the palm, while the former allow for a larger opening.

Sixth criterion: in signs where both hands are used (Volterra, 52), the handshapes are sometimes different. For purposes of cataloguing, in these cases reference is made to the dominant hand (i.e. the right hand for right-handed people and vice versa for the left-handed).

For the purposes of this research, the latter version is presently considered the most functional, the clearest and the simplest for ordering signs.

The criteria that have been emphasized are definitive in the present state of the art.

2. COUNTERCHECKS

During the work-in-progress, once the criteria for cataloguing the signs had been established, we looked for counterarguments and confutations which could show which of these criteria were fundamental, superfluous or arbitrary, keeping version 3B as the reference point. In this way we started to build up new versions.

1st countercheck (3rd July, 2002): inversion of criterion 2, main handshapes
Figure 5: 1st countercheck

The 21 principal handshapes were put into a particular order by inverting criterion 2, i.e. going from the maximum to the minimum opening, to see whether this criterion is fundamental or arbitrary. Building up the scheme, this criterion proved to be arbitrary, since no exclusion or confusion of handshapes results from the inversion of the order. This countercheck did not include subgroups. In the next counterchecks we shall see whether the order remains functional when subgroups and other criteria are added.

2nd countercheck (2nd October, 2002): confutation of criteria 1 and 3, linear sequence
Figure 6: 2nd countercheck

The 58 handshapes were ordered according to the progressive opening of the hand without creating subgroups (criterion 3), and therefore in a linear sequence. We saw that, in this way, groupings of handshapes according to finger positions did not take place if no distinction between principal and subordinate handshapes (criterion 1) was made. The sequence of resulting handshapes was therefore determined randomly and exclusively through the perception of the hand more or less filling the visual space. Moreover, in such a sequence it was impossible to single out a simple logic for understanding and memorizing: remembering 58 elements without any clear, precise reference points proved to be difficult. Thus we concluded that it was necessary to single out principal handshapes and subgroups in order to produce an applicable order. Therefore criteria 1 and 3 proved to be fundamental. The order in which fingers open up could be inverted, from the open hand to the fist, but there were no structural changes and no handshapes were excluded. In this way the arbitrariness of criterion 2 was confirmed.

3rd countercheck (9th April, 2004): inversion of criterion 2, with subgroups
Figure 7: 3rd countercheck

Inverting the order of the principal handshapes, going against criterion 2 (i.e. from the open hand to the closed fist) and following the creation of subgroups according to criteria 3, 4, 5 and 6, leads to the reproduction of version 3B in reverse, but without making it less clear or organized. In this way the arbitrariness of criterion 2 is proved.

4th countercheck (13th November, 2002): confutation of criterion 5, subdivision according to the order of finger movement
Figure 8: 4th countercheck, version A; Figure 9: 4th countercheck, version B

When this countercheck was started, 15 principal handshapes and 7 subgroups were selected. The principal handshapes were singled out according to criterion 2 (i.e. moving from the fist to the open hand, following the sequential appearance of fingers from 0 to 5), but considering handshapes with joined fingers as subordinate. Within the subgroups, criterion 4 (i.e. progressive closure of the fingers, but without distinguishing straight and curved finger positions) was followed, in contrast with criterion 5. It was thereby proved that without criterion 5, especially in the "5th finger" subgroup, the attempt to create a sequence becomes confused, since it is difficult to clearly identify "more open" or "more closed" handshapes. Criterion 5 is therefore fundamental. (This countercheck proved to be similar to version 3B in many respects, but it was useful in verifying the importance of criterion 5.)

In the version following this countercheck, criterion 2 was mainly followed, thereby distinguishing as principal handshapes both the ones with joined fingers and the ones with open fingers (e.g. "U" vs. "V") as movements of maximum opening. Therefore 20 principal handshapes were singled out. Moreover, criterion 5 was also taken into consideration. In fact, this countercheck produced subgroups which were very similar to version 3B, with a few minor changes. What makes it different from version 3B are the principal handshape families, created according to the appearance of fingers: the "fist" family, the "1st finger" family, the "2nd finger" family, the "3rd finger" family, the "4th finger" family and the "5th finger" family, which could prove useful for better categorizing and memorizing handshapes. But the negative consequence lies in the additional passages that must be carried out to reach the desired handshape, which could be a further source of confusion.

In the present state of the art we have proved that criteria 1, 3 and 5 are fundamental, while criterion 2 is arbitrary.
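To illustrate how the validated criteria (1, 3 and 5) could drive quick sign lookup in the multimedia dictionary, here is a minimal sketch; only the principal-handshape list is taken from version 3B above, while the subgroup data are invented for the example:

# Principal handshapes of version 3B, in catalogue order (criteria 1-2).
PRINCIPAL = ["S", "G", "Yi", "I", "L", "Y", "H", "V", "Ycor", "Hs",
             "3", "Ys", "Wstr", "W", "4str", "4", "3/5", "B", "Bs", "5"]

# Subordinate handshapes (illustrative entries): each carries its principal
# handshape, its branch (criterion 5: 0 = flat closure, 1 = curved closure)
# and its step along the progressive closure of the subgroup (criterion 4).
SUBORDINATE = {"cc": ("L", 0, 1), "Lp": ("L", 0, 2), "T": ("L", 0, 5)}

def sort_key(handshape):
    if handshape in PRINCIPAL:                      # principals come first
        return (PRINCIPAL.index(handshape), -1, -1)
    principal, branch, step = SUBORDINATE[handshape]
    return (PRINCIPAL.index(principal), branch, step)

print(sorted(["T", "L", "cc", "S"], key=sort_key))  # -> ['S', 'L', 'cc', 'T']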

3. Bibliographical References

Baker, C. & Cokely, D. (1980). American Sign Language: a teacher's resource text on grammar and culture. Silver Spring (USA): T.J. Publishers Inc.
Fischer, S.D. & Siple, P. (1990). Theoretical Issues in Sign Language Research, Vol. 1. Chicago (USA): The University of Chicago Press.
Kanda, K. (1994). "A computer dictionary of Japanese Sign Language". In Ahlgren, I., Bergman, B. & Brennan, M., Perspectives on Sign Language Usage. Durham (GB): The International Sign Linguistics Association.
Kyle, J.G. & Woll, B. (1985). Sign Language: The study of deaf people and their language. Cambridge: Cambridge University Press.
Lucas, C., Bayley, R. & Valli, C. (2001). Sociolinguistic Variation in American Sign Language. Washington DC (USA): Gallaudet University Press.
Radutzky, E. (1992). Dizionario bilingue elementare della lingua italiana dei segni. Roma (I): Edizione Kappa.
Volterra, V. (1987). La Lingua Italiana dei Segni: la comunicazione visivo-gestuale dei sordi. Bologna (I): Il Mulino.
