03 Iwn Sanskrit Lexicon
Amba Kulkarni
Department of Sanskrit Studies, University of Hyderabad, Hyderabad
[email protected]
Sivaja S. Nair
Department of Sanskrit Studies, University of Hyderabad, Hyderabad
[email protected]
Abstract
The Sanskrit kośas such as Amarakośa, Vaijayantīkośa etc. have a built-in knowledge structure of their own which, apart from revealing the ontological classification, provides a holistic view of various concepts. The knowledge in these kośas concerns many culture-specific, non-observational facts. In this paper we present a few representative examples of the concept clusters from two Sanskrit kośas: Amarakośa and Vaijayantīkośa. There is a necessity to make these valuable resources available in suitable e-form so that the NLP community working in Indian Languages can be benefitted.
Key Words: Amarakośa, Vaijayantīkośa
Introduction
The Indian tradition of transmitting knowledge orally is on the verge of vanishing. As oral transmission demands, Indian traditional educational culture was organised to be formal and intensive, as opposed to the modern culture, which is more informal and extensive (Wood, 1985). In traditional circumstances, a child would receive his education largely by oral transmission, mainly through rote-learning. The method employed was recitation and remembering. A child was taught the alphabet (varṇamālā); he would then memorise a few verses and subhāṣitas, and then start reciting a dictionary of synonymous words, the Amarakośa, till it was memorised. It would typically take anywhere between 6 months and a year to memorise a list of approximately 10,000 Sanskrit words arranged as a list of synonyms. A close inspection of the structure of the Amarakośa gives much more insight into the way the words are organised. When a student memorises it, it appears in the beginning as a linear list of words; as he starts understanding the meaning of the words, reads the commentaries on this text and starts using these words, the linear structure unfolds into a knowledge web with various links.
There are many other kośas as well, such as Vaijayantīkośa, Hārāvalī, Trikāṇḍaśeṣa, etc. The modern efforts of building ConceptNet, aimed at building a network around various concepts, have some parallels with the structure of these Sanskrit kośas.
In this paper, we present a few samples of knowledge structure from two important kośas, viz. Vaijayantīkośa and Amarakośa, and compare the knowledge structure involved with that of ConceptNet.
Amarakośa
Amarakośa, primarily named Nāmaliṅgānuśāsana (a work that deals with nouns and their genders), is dated to the 4th century A.D. (Oka, 1981) and is the most celebrated and authoritative ancient thesaurus of Sanskrit, with around 60 commentaries and translations into modern Indian as well as foreign languages such as Chinese, Tibetan, French, etc. (Patkar, 1981). It is considered an essential requisite for a Sanskrit scholar, and as such a child is asked to memorise it even before he starts his studies formally. It consists of 1608 verses composed in anuṣṭup meter and is divided into three chapters called Kāṇḍas.
Being divided into three Kāṇḍas, the work is also known as Trikāṇḍī. The anuṣṭup meter follows the rule "śloke ṣaṣṭhaṃ guru jñeyaṃ sarvatra laghu pañcamam" (in a śloka the sixth syllable is always heavy and the fifth always light).
2.0.1 Classification
Each of the three Kāṇḍas is further subdivided into various vargas. The classification of the three kāṇḍas into 25 vargas is as below.
Prathamakāṇḍam: svargavargaḥ (heaven), vyomavargaḥ (sky), digvargaḥ (direction), kālavargaḥ (time), dhīvargaḥ (cognition), śabdādivargaḥ (sound), nāṭyavargaḥ (drama), pātālabhogivargaḥ (nether world), narakavargaḥ (hell), vārivargaḥ (water).
Dvitīyakāṇḍam: bhūmivargaḥ (earth), puravargaḥ (towns or cities), śailavargaḥ (mountains), vanauṣadhivargaḥ (forests and medicines), siṃhādivargaḥ (lions and other animals), manuṣyavargaḥ (mankind), brahmavargaḥ (priest tribe), kṣatriyavargaḥ (military tribe), vaiśyavargaḥ (business tribe), śūdravargaḥ (mixed classes).
Tṛtīyakāṇḍam: viśeṣyanighnavargaḥ (adjective), saṃkīrṇavargaḥ (miscellaneous), nānārthavargaḥ (polysemous), avyayavargaḥ (indeclinables), liṅgādisaṅgrahavargaḥ (gender).
Amarakośa contains 11,580 content words (tokens). Some of the tokens are repeated either within a kāṇḍa or across kāṇḍas, leading to only 9,031 types. The kāṇḍa-wise distribution of the tokens and types is shown in Table 1. The words are typically organised as sets of synonymous words.
Vaijayantīkośa
Vaijayantīkośa is a voluminous lexicon by Yādavaprakāśa. It contains approximately 18,000 tokens. The lexicon is divided into two broad divisions, viz. synonym sets and polysemous words. The synonym sets are further divided into five classes or kāṇḍas, viz. svarga (heaven), antarīkṣa (sky), bhūmi (earth), pātāla (nether world) and sāmānya (miscellaneous). The polysemous words are classified into three classes based on the number of syllables they contain, viz. two, three and more than three. Thus Vaijayantīkośa has eight classes, which are further sub-divided into two or more sub-sections called adhyāyas. There are forty-three adhyāyas in total. The adhyāyas of the first five classes include the following.
Ādidevādhyāyaḥ (supreme deity), Lokapālādhyāyaḥ (guardian deities), Yakṣādhyāyaḥ (semi-divine beings), Vanādhyāyaḥ (forest), Paśusaṅgrahādhyāyaḥ (animals), Manuṣyādhyāyaḥ (mankind), Brāhmaṇādhyāyaḥ (priest tribe), Kṣatriyādhyāyaḥ (military tribe), Śūdrādhyāyaḥ (mixed class), Vaiśyādhyāyaḥ (business tribe), Purādhyāyaḥ (towns and cities), Bhūtādhyāyaḥ (living beings).
ConceptNet
ConceptNet (http://web.media.mit.edu/hugo/conceptnet/) is a commonsense knowledge base and natural-language-processing toolkit. It is a semantic network of commonsense knowledge. It aims to give computers access to commonsense knowledge, the kind of information that ordinary people know but usually leave unstated. ConceptNet is generated automatically from the English sentences of the Open Mind Common Sense (OMCS) corpus. Fig. 1 shows the ConceptNet representation of the sentence "wake up in the morning and drink coffee".
Figure 1: Knowledge representation in ConceptNet
5 Knowledge structure in Sanskrit Kośas
After examining the entries in Amarakośa and Vaijayantīkośa, we noticed that, except for the polysemous words (nānārthavarga), all the synsets in a class show some semantic relation to the class they belong to, and sometimes even to the preceding or following synsets. These semantic relations are of various kinds. They may be classified as hierarchical or associative. The hypernym, indicating a more general term, and the hyponym, showing a more specific term, are examples of hierarchical relations. Similarly, the holonym-meronym relation, marking the whole-part relation, is also a hierarchical relation. In addition, various other relations are indicated by the adjacency of the synsets. These may be termed associative relations, which indicate some kind of association of one synset with another. This association may be among human beings, or of certain objects with certain other objects. We illustrate below some such relations with examples.
Figure 2: Relations of Viṣṇu from Amarakośa
Thus we see here that the association of synsets indicates various kinship terms such as father, sibling, wife, son etc. Knowledge of such kinship terms is an essential database for understanding various texts on our religion. Next we see various special instruments that are associated with or used by Viṣṇu, such as conch, discus, sword, jewel, bow, etc., and then the charioteer, vehicle, ministers etc. Thus all the synsets together give a holistic view of Viṣṇu.
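The distinction drawn above between hierarchical and associative relations can be sketched as a small labelled graph. The following Python sketch is only an illustration: the class name SynsetGraph, the relation labels and the sample entries around Viṣṇu are assumptions made here, and do not reproduce the data model or the e-text of either kośa.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Relation labels grouped by kind. These label names are illustrative
# assumptions, not the inventory used in the kosas.
HIERARCHICAL = {"is_a_kind_of", "part_of"}
ASSOCIATIVE = {"wife", "vehicle", "conch", "discus"}


@dataclass
class SynsetGraph:
    """Directed graph: nodes are synset heads, edges carry a relation label."""

    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, source: str, relation: str, target: str) -> None:
        """Record one labelled link, rejecting labels of unknown kind."""
        if relation not in HIERARCHICAL | ASSOCIATIVE:
            raise ValueError(f"unknown relation: {relation}")
        self.edges[source].append((relation, target))

    def related(self, source: str, kind: str) -> list:
        """Return (relation, target) pairs of the requested kind."""
        labels = HIERARCHICAL if kind == "hierarchical" else ASSOCIATIVE
        return [(r, t) for r, t in self.edges.get(source, []) if r in labels]


# Hypothetical sample entries for the synset headed by Visnu.
graph = SynsetGraph()
graph.add("Visnu", "is_a_kind_of", "deity")
graph.add("Visnu", "wife", "Laksmi")
graph.add("Visnu", "vehicle", "Garuda")
graph.add("Visnu", "conch", "Pancajanya")
graph.add("Visnu", "discus", "Sudarsana")

print(graph.related("Visnu", "hierarchical"))
print(graph.related("Visnu", "associative"))
```

Keeping the relation label on every edge lets one synset take part in both kinds of relation at once, which is how the adjacent synsets of a class appear to behave in the examples discussed here.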
Figure 3: Relations of Viṣṇu from Vaijayantīkośa
The next example concerns the concept of time. The relevant synsets of the Amarakośa are:
Day (1.4.2)
Morning (1.4.2 - 1.4.3)
Twilight (1.4.3)
Evening (1.4.3)
First four hours of a day (1.4.3)
Second four hours of a day (1.4.3)
Third four hours of a day (1.4.3)
Period of the day (1.4.3)
Night (1.4.3 - 1.4.4)
A dark night (1.4.5)
A moonlight night (1.4.5)
A night and two days (1.4.5)
First part of night (1.4.6)
Midnight (1.4.6)
Sequence of nights (1.4.6)
Space of three hours (1.4.6)
Last day of the half month (1.4.7)
Precise moment of the full or the new moon (1.4.7)
Full moon day (1.4.7)
Full moon whole day (1.4.8)
Full moon with a little gibbous on part of a day (1.4.8)
No moon day (1.4.8)
Waning crescent (1.4.9)
No moon whole day (1.4.9)
In this example, we see various concepts, in Indian tradition, associated with the concept of time. These may be broadly classified into the concepts associated with the apparent solar motion and those associated with the lunar motion.
A further set of time-related synsets (2.1.79 - 2.1.81):
Half of a lunar month (pakṣa) (2.1.79)
First (white) pakṣa of a lunar month (2.1.79)
Second (black) pakṣa of a lunar month (2.1.79)
Pakṣa in which the tithis become shorter (2.1.79)
Pakṣa in which the tithis become longer (2.1.79)
Short fifteenth day of the second (kṛṣṇa) pakṣa (2.1.80)
Short first day of a pakṣa (2.1.80)
Lunar month (2.1.80)
Solar month (month in which the sun passes to another rāśi) (2.1.80)
Star month (2.1.81)
Month of 30 days (2.1.81)
Thus we see that these structures provide a complete picture of the Indian calendar, which is an essential part of Indian culture.
The following example is from the Amarakośa. The words here refer to the king, the military, ministers, various categories of people engaged in the services of kings, etc.
Man of the military tribe (2.8.1)
King (2.8.1)
Universal monarch (2.8.2)
An emperor (2.8.2)
King over a country (2.8.2)
Paramount sovereign (2.8.3)
Multitude of kings (2.8.3)
Multitude of military tribe (2.8.4)
Minister (2.8.4)
Deputy minister (2.8.4)
Priest (2.8.5)
Judge (2.8.5)
King's companions (2.8.5)
Body guards of a king (2.8.6)
Warder (2.8.6)
Superintendent (2.8.6)
Village superintendent (2.8.7)
Superintendent of many villages (2.8.7)
Superintendent of gold (2.8.7)
Superintendent of silver (2.8.7)
Superintendent of the women's apartments (2.8.8)
Outside guard of the women's apartment (2.8.8)
Attendant of a king (2.8.9)
Eunuch (2.8.9)
Prince whose territories lie on the frontiers of those of the enemy (2.8.9)
Neighbouring prince (2.8.9)
Prince whose territories lie beyond those of the friend (2.8.10)
Enemy in the rear (2.8.10)
This again gives a good background of the military structure in ancient days, throwing light on the social structure of those days.
There are a few other relations, such as avayavāvayavī (part-whole relation), parāparajāti (is-a-kind-of relation), janyajanaka (child-parent relation), patipatnī (husband-wife relation), svasvāmī (master-possession relation) and ājīvikā (livelihood).
There is a fundamental difference between the development of ConceptNet and the knowledge found in the Sanskrit kośas. ConceptNet is aimed at "capturing the common sense". This common sense typically concerns behavioural observations, social norms etc., and typically focuses on actions.
The Sanskrit kośas, on the other hand, contain words which (a) are culture specific (svarga, viṣṇu, etc.), (b) reveal social or man-made structures (kingdom, house, etc.), and (c) throw light on various social practices, in addition to observational facts such as the classification of animals, the flora and fauna etc.
This structure thus brings out many facts, mostly non-observational, that are part of our culture. In this sense the structure of these kośas is complementary to what ConceptNet provides and is an essential part of the Indian culture.
There is a necessity to make these valuable resources available in suitable e-form so that the NLP community working in Indian Languages can be benefitted.
7 Computational Tools
The team at the Department of Sanskrit Studies, University of Hyderabad, has already taken a lead by starting a pilot study of the Amarakośa. There is a need to explore and build e-structures with the other Sanskrit kośas mentioned above in order to make this knowledge available for NLP applications.
References
Bharati Akshar, Kulkarni Amba and Nair S. Sivaja (2008) Use of Amarakosha and Hindi WordNet in Building a Network of Sanskrit Words, International Conference on Natural Language Processing.
Colebrooke H.T. (1808) Kosha or Dictionary of the Sungskrita Language by Umura Singha with an English interpretation and annotations.
Liu, H. and Singh, P. (2004) ConceptNet - a practical commonsense reasoning tool-kit, BT Technology Journal, Vol 22 No 4, October.
Liu, H. and Singh, P. (2004) Commonsense Reasoning in and over Natural Language, Proceedings of the 8th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Wellington, New Zealand.
Moosath, Parameshvaran T.C. (1914) Paarameswarii: Malayalam commentary of Amarakosha, National Book Stall, Kottayam.
Moosath, Parameshvaran T.C. (1956) Triveni: Malayalam commentary of Amarakosha, National Book Stall, Kottayam.
Nair S. Sivaja, Swain Pritilaxmi and Kulkarni Amba (2009) Developing network of Sanskrit words across Part-Of-Speech categories, CSATS'09.
... Ksheraswamin, Upasana ... Varanasi.
Oppert, Gustav (1893) The Vaijayantī of Yādavaprakāśa, Madras Sanskrit and Vernacular Text Publication Society, Madras.
Pandit, Sivadatta (1915) Namalinganusasana (Amarakosha) with the commentary (Vyakhyasudha or Ramasrami) of Bhanuji Dikshit, Tukaram Javaji, proprietor of the Nirnaya Sagar Press, Bombay.
Patkar M.M. (1981) History of Sanskrit Lexicography, Munshiram Manoharlal Publishers Pvt. Ltd., Delhi.