Knowledge Representation in Sanskrit and Artificial Intelligence - NASA
RIACS, NASA Ames Research Center, Moffett Field, California 94305

Abstract

In the past twenty years, much time, effort, and money have been expended on designing an unambiguous representation of natural languages to make them accessible to computer processing. These efforts have centered around creating schemata designed to parallel logical relations with relations expressed by the syntax and semantics of natural languages, which are clearly cumbersome and ambiguous in their function as vehicles for the transmission of logical data. Understandably, there is a widespread belief that natural languages are unsuitable for the transmission of many ideas that artificial languages can render with great precision and mathematical rigor.

But this dichotomy, which has served as a premise underlying much work in the areas of linguistics and artificial intelligence, is a false one. There is at least one language, Sanskrit, which for the duration of almost 1000 years was a living spoken language with a considerable literature of its own. Besides works of literary value, there was a long philosophical and grammatical tradition that has continued to exist with undiminished vigor until the present century. Among the accomplishments of the grammarians can be reckoned a method for paraphrasing Sanskrit in a manner that is identical not only in essence but in form with current work in Artificial Intelligence. This article demonstrates that a natural language can serve as an artificial language also, and that much work in AI has been reinventing a wheel millennia old.

First, a typical Knowledge Representation Scheme (using Semantic Nets) will be laid out, followed by an outline of the method used by the ancient Indian Grammarians to analyze sentences unambiguously. Finally, the clear parallelism between the two will be demonstrated, and the theoretical implications of this equivalence will be given.

Semantic Nets

For the sake of comparison, a brief overview of semantic nets will be given, and examples will be included that will be compared to the Indian approach. After early attempts at machine translation (which were based to a large extent on simple dictionary look-up) failed in their effort to teach a computer to understand natural language, work in AI turned to Knowledge Representation.

Since translation is not simply a map from lexical item to lexical item, and since ambiguity is inherent in a large number of utterances, some means is required to encode what the actual meaning of a sentence is. Clearly, there must be a representation of meaning independent of words used. Another problem is the interference of syntax. In some sentences (for example active/passive) syntax is, for all intents and purposes, independent of meaning. Here one would like to eliminate considerations of syntax. In other sentences the syntax contributes to the meaning and here one wishes to extract it.

I will consider a “prototypical” semantic net system similar to that of Lindsay, Norman, and Rumelhart in the hopes that it is fairly representative of basic semantic net theory. Taking a simple example first, one would represent “John gave the ball to Mary” as in Figure 1. Here five nodes connected by four labeled arcs capture the entire meaning of the sentence. This information can be stored as a series of “triples”:

give, agent, John

give, object, ball

give, recipient, Mary

give, time, past.
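The triples above can be held in a very small data structure. The following is a minimal sketch in Python, not a reconstruction of any particular semantic net system; the function names `build_net` and `nodes` are my own and are purely illustrative.

```python
from collections import defaultdict

# Each triple is (node, arc label, node), mirroring the list above.
TRIPLES = [
    ("give", "agent", "John"),
    ("give", "object", "ball"),
    ("give", "recipient", "Mary"),
    ("give", "time", "past"),
]

def build_net(triples):
    """Index the triples as node -> {arc label -> target node}."""
    net = defaultdict(dict)
    for head, arc, tail in triples:
        net[head][arc] = tail
    return net

def nodes(triples):
    """Collect every distinct node named by the triples."""
    found = set()
    for head, _arc, tail in triples:
        found.update((head, tail))
    return found

net = build_net(TRIPLES)
print(net["give"]["agent"])   # the agent of the giving event
print(len(nodes(TRIPLES)))    # number of distinct nodes in the net
```

Because every arc hangs off the verb node, a question such as "who received the ball?" reduces to a single lookup, `net["give"]["recipient"]`, with no reference to word order.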

Note that grammatical information has been transformed into an arc and a node (past tense). A more complicated example will illustrate embedded sentences and changes of state:

Figure 1. (Semantic net; node labels: John, Mary, book, past.)

“John told Mary that the train moved out of the station at 3 o’clock.”
Figure 2.

As shown in Figure 2, there was a change in state in which the train moved to some unspecified location from the station. It went to the former at 3:00 and from the latter at 3:00. Now one can routinely convert the net to triples as before.
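One way such a conversion might look is sketched below in Python. The node and arc names are assumptions of mine rather than a transcription of Figure 2: in particular, `"loc?"` stands for the unspecified destination, and a single `time` arc is used for the move where the net records the time of both departure and arrival. The point is only that the embedded sentence becomes a node that is the object of the telling.

```python
# Hypothetical triples for "John told Mary that the train moved out of
# the station at 3 o'clock." The arc labels are illustrative assumptions.
TRIPLES = [
    ("tell", "agent", "John"),
    ("tell", "recipient", "Mary"),
    ("tell", "time", "past"),
    ("tell", "object", "move"),    # the embedded sentence is itself a node
    ("move", "object", "train"),
    ("move", "from", "station"),
    ("move", "to", "loc?"),        # unspecified location
    ("move", "time", "3:00"),
]

def describe(node, triples, depth=0):
    """Return indented 'arc: value' lines, expanding embedded events."""
    heads = {head for head, _arc, _tail in triples}
    lines = []
    for head, arc, tail in triples:
        if head != node:
            continue
        lines.append("  " * depth + f"{arc}: {tail}")
        if tail in heads:          # the value is another event node
            lines.extend(describe(tail, triples, depth + 1))
    return lines

print("\n".join(describe("tell", TRIPLES)))
```

The recursion assumes the net contains no cycles, which holds for this example.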

The verb is given central significance in this scheme and is considered the focus and distinguishing aspect of the sentence. However, there are other sentence types which differ fundamentally from the above examples. Figure 3 illustrates a sentence that is one of “state” rather than of “event.” Other nets could represent statements of time, location or more complicated structures.

A verb, say, “give,” has been taken as primitive, but what is the meaning of “give” itself? Is it only definable in terms of the structure it generates? Clearly two verbs can generate the same structure. One can take a set-theoretic approach and treat a particular give as an element of “giving events,” itself a subset of ALL-EVENTS. An example of this approach is given in Figure 4 (“John, a programmer living at Maple St., gives a book to Mary, who is a lawyer”). If one were to “read” this semantic net, one would have a very long text of awkward English: “There is a ‘John’ who is an element of the ‘Persons’ set and who is the person who lives at ADR1, where ADR1 is a subset of ADDRESS-EVENTS, itself a subset of ‘ALL EVENTS’, and has location ‘37 Maple St.’, an element of Addresses; and who is a ‘worker’ of ‘occupation 1’ . . . etc.”
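The set-theoretic reading can be made concrete with two small tables, one for element-of links and one for subset-of links. This is a rough Python sketch under my own assumptions: the identifier `give1` for the particular giving event and the helper names `ancestors` and `belongs_to` are hypothetical and do not come from Figure 4.

```python
# element -> the set it is an element of
ELEMENT_OF = {
    "John": "Persons",
    "give1": "giving-events",      # hypothetical name for one giving event
    "37 Maple St.": "Addresses",
}

# set -> the set it is a subset of
SUBSET_OF = {
    "giving-events": "ALL-EVENTS",
    "ADDRESS-EVENTS": "ALL-EVENTS",
}

def ancestors(item):
    """List every set containing item: membership first, then subset links."""
    chain = []
    current = ELEMENT_OF.get(item, item)
    if current != item:
        chain.append(current)
    while current in SUBSET_OF:
        current = SUBSET_OF[current]
        chain.append(current)
    return chain

def belongs_to(item, target):
    """True if target is reachable from item through the hierarchy."""
    return target in ancestors(item)

print(ancestors("give1"))
print(belongs_to("John", "ALL-EVENTS"))
```

Nothing in the sentence "John gives a book to Mary" states that a giving is an event, yet the inference that `give1` belongs to ALL-EVENTS falls out of the hierarchy directly.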

The degree to which a semantic net (or any unambiguous, non-syntactic representation) is cumbersome and odd-sounding in a natural language is the degree to which that language is “natural” and deviates from the precise or “artificial.” As we shall see, there was a language spoken among an ancient scientific community that has a deviation of zero.

The hierarchical structure of the above net and the explicit descriptions of set-relations are essential to really capture the meaning of the sentence and to facilitate inference. It is believed by most in the AI and general linguistic community that natural languages do not make such seemingly trivial hierarchies explicit. Below is a description of a natural language, Shastric Sanskrit, where for the past millennia successful attempts have been made to encode such information.

Shastric Sanskrit

The sentence:

(1) “Caitra goes to the village.” (graamam gacchati caitra)

receives in the analysis given by an eighteenth-century Sanskrit Grammarian from Maharashtra, India, the following paraphrase:

(2) “There is an activity which leads to a connection-activity which has as Agent no one other than Caitra, specified by singularity, [which] is taking place in the present and which has as Object something not different from ‘village’.”
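Paraphrase (2) can be set down in the same triple form used for the semantic net earlier. The rendering below is only one possible reading, sketched in Python: the arc labels `result` and `number` are my own assumptions, as is the decision to attach singularity to the agent rather than to the activity.

```python
# One hypothetical encoding of paraphrase (2) as (node, arc, node) triples.
PARAPHRASE = [
    ("activity", "result", "connection-activity"),
    ("activity", "agent", "Caitra"),
    ("Caitra", "number", "singular"),
    ("activity", "time", "present"),
    ("activity", "object", "village"),
]

# Arc labels used for "John gave the ball to Mary" in the semantic net example.
GIVE_ARCS = {"agent", "object", "recipient", "time"}

def arcs_of(node, triples):
    """Return {arc label: target} for every arc leaving node."""
    return {arc: tail for head, arc, tail in triples if head == node}

activity = arcs_of("activity", PARAPHRASE)
shared = sorted(GIVE_ARCS & activity.keys())
print(activity["agent"], activity["object"])
print(shared)   # arc labels common to both representations
```

Each clause of the paraphrase supplies one arc, which is why no further parsing is needed to recover the agent, object, and time of the action.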

The author, Nagesha, is one of a group of three or four prominent theoreticians who stand at the end of a long tradition of investigation. Its beginnings date to the middle of the first millennium B.C. when the morphology and phonological structure of the language, as well as the framework for its syntactic description were codified by Panini. His successors elucidated the brief, algebraic formulations that he had used as grammatical rules and where possible tried to improve upon them. A great deal of fervent grammatical research took place between the fourth century B.C. and the fourth century A.D. and culminated in the seminal work, the Vakyapadiya by Bhartrhari. Little was done subsequently to advance the study of syntax, until the so-called “New Grammarian” school appeared in the early part of the sixteenth century with the publication of Bhattoji Dikshita’s Vaiyakarana-bhusanasara and its commentary by his relative Kaundabhatta, who worked from Benares. Nagesha (1730-1810) was responsible for a major work, the Vaiyakaranasiddhantamanjusa, or Treasury of definitive statements of grammarians, which was condensed later into the earlier described work. These books have not yet been translated.

The reasoning of these authors is couched in a style of language that had been developed especially to formulate logical relations with scientific precision. It is a terse, very condensed form of Sanskrit, which paradoxically at times becomes so abstruse that a commentary is necessary to clarify it.

One of the main differences between the Indian approach to language analysis and that of most of the current linguistic theories is that the analysis of the sentence was not based on a noun-phrase model with its attending binary parsing technique but instead on a conception that viewed the sentence as springing from the semantic message that the speaker wished to convey. In its origins, sentence description was phrased in terms of a generative model: From a number of primitive syntactic categories (verbal action, agents, object, etc.) the structure of the sentence was derived so that every word of a sentence could be referred back to the syntactic input categories. Secondarily and at a later period in history, the model was reversed to establish a method for analytical descriptions. In the analysis of the Indian grammarians, every sentence expresses an action that is conveyed both by the verb and by a set of “auxiliaries.” The verbal action (kriya- “action” or sadhya- “that which is to be accomplished”) is represented by the verbal root of the verb form; the “auxiliary activities” by the nominals (nouns, adjectives, indeclinables) and their case endings (one of six).

The meaning of the verb is said to be both vyapara (action, activity, cause), and phala (fruit, result, effect). Syntactically, its meaning is invariably linked with the meaning of the verb “to do”. Therefore, in order to discover the meaning of any verb it is sufficient to answer the question: “What does he do?” The answer would yield a phrase in which the meaning of the direct object corresponds to the verbal meaning. For example, “he goes” would yield the paraphrase: “