Most people pretty much take language for granted, and don't even think about it unless they're confronted with someone who doesn't understand them and can't be understood by them in turn. In such a case, we say that the other person speaks "a different language" than we do, but what does that really mean?

Does the other person have incomprehensible thoughts in their head that we can't understand? No; internally, all (normal) people think in pretty much the same manner, with the same general streams of consciousness, and the same general methodology. There are cultural differences, some of which relate to language, but the underlying structure of human thought is fairly universal. The difficulty that arises with someone who speaks a different language is merely that, although we could understand the thoughts and ideas of the other person, we can't correlate the sounds they are using with the ideas they are trying to convey. (Incidentally, how do children correlate the language they hear with the reality that surrounds them when they initially learn to speak? That's a rather large open problem for neurologists, psychologists, and computer scientists.)

Attempts have been made to create language-independent representations for the foundational level of thought, and one of the best known is Conceptual Dependency (CD). The purpose of conceptual dependency structures is to represent a concept at a fundamentally low level; this representation can then be processed by a computer, or can be translated into any human language.

It isn't as simple as it sounds! For example, here is a slide excerpted from the PowerPoint presentation I linked to above. It displays a (possible) representation for the sentence "Jan kicked the cat."

Yes, sentence examples in artificial intelligence are always mildly violent or sexual. Draw your own conclusions. This graphical depiction of a conceptual dependency isn't simple to explain (view the presentation), but that should give you the flavor. CD is very versatile, and any concept that can be expressed in spoken or written language can be built into a dependency structure.
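
To make that a little more concrete, here is a rough Python sketch of how such a structure might be held as data. This is my own simplification, not a transcription of the slide: the `Conceptualization` class, its field names, and the `gloss` helper are all hypothetical. The one piece borrowed from CD itself is `PROPEL`, one of the theory's primitive acts, meaning "apply force to". Under that assumption, "Jan kicked the cat" becomes roughly "Jan applied force to a foot, directed at the cat, in the past."

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Conceptualization:
    """One CD-style event: an actor performs a primitive act on an object.

    Hypothetical, simplified layout for illustration only.
    """
    actor: str
    act: str                             # primitive act, e.g. "PROPEL"
    obj: str
    direction_to: Optional[str] = None   # what the object is directed at
    tense: str = "present"               # stored as data; not used by gloss()


# One possible encoding of "Jan kicked the cat."
kick = Conceptualization(
    actor="Jan", act="PROPEL", obj="foot", direction_to="cat", tense="past"
)

# Crude English glosses for primitive acts (only the one used here).
ACT_GLOSS = {"PROPEL": "applied force to"}


def gloss(c: Conceptualization) -> str:
    """Render the structure as stilted English, ignoring tense."""
    text = f"{c.actor} {ACT_GLOSS[c.act]} {c.obj}"
    if c.direction_to:
        text += f" toward {c.direction_to}"
    return text


print(gloss(kick))
```

Notice that the word "kick" appears nowhere in the data. The structure records what happened, and a separate generator for English, French, or anything else would be responsible for choosing the verb.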

So what? Well, these structures can be encoded digitally and fed into a computer program that can analyze them and extract meaning from them. For example, many pieces of software have been written that use CD to read short stories, and then answer questions such as "Who was the main character?" "Why did the knight kill the dragon?" and that sort of thing. Some systems can even read stories, and then generate their own stories based on what they learn from their reading. "Generate a story about bravery." (The stories aren't usually very good.)
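
Here is a toy Python sketch of what answering those two questions could look like once a story has been reduced to structured events. Everything in it is assumed for illustration: the `Event` class, the `reason` link, and the plain-verb act labels (a real CD system would break "kill" down into primitive acts and state changes). Picking the most frequent actor as the "main character" is just a simple heuristic of mine, not something taken from any actual story-understanding program.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical story event with an optional causal link."""
    actor: str
    act: str                            # plain verb label, not a CD primitive
    obj: str
    reason: Optional["Event"] = None    # the earlier event that motivated this one


# A three-event story, encoded by hand.
burn = Event("dragon", "burn", "village")
ride = Event("knight", "ride-to", "cave", reason=burn)
kill = Event("knight", "kill", "dragon", reason=burn)
story = [burn, ride, kill]


def main_character(events: List[Event]) -> str:
    """Heuristic: the actor who appears in the most events."""
    return Counter(e.actor for e in events).most_common(1)[0][0]


def why(events: List[Event], actor: str, act: str, obj: str) -> Optional[Event]:
    """Follow the causal link of the matching event, if there is one."""
    for e in events:
        if (e.actor, e.act, e.obj) == (actor, act, obj):
            return e.reason
    return None


print(main_character(story))
print(why(story, "knight", "kill", "dragon"))
```

The point is only that once meaning is stored as structure rather than as sentences, "why" questions turn into following links between events.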

What's more, CD can serve as an intermediate step in language translation. Rather than building roughly X*X translation engines to translate each of X languages directly into every other, only 2X engines need to be built: X to translate each language into CD, and X to translate CD into each language. The idea doesn't work perfectly yet, but the concept is sound.
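
The arithmetic is easy to check with a few lines of Python. The function names are mine, and I'm counting one direct engine per ordered pair of distinct languages, which gives X*(X-1): slightly less than the X*X quoted above, but it grows the same way.

```python
def direct_engines(num_languages: int) -> int:
    """Engines needed to translate every language directly into every other."""
    return num_languages * (num_languages - 1)


def pivot_engines(num_languages: int) -> int:
    """Engines needed with CD in the middle: one into CD and one out, per language."""
    return 2 * num_languages


for x in (3, 10, 100):
    print(x, direct_engines(x), pivot_engines(x))
```

With 10 languages that is 90 direct engines against 20, and with 100 languages it is 9,900 against 200. With only three languages the two approaches tie at six engines each, so the intermediate representation only pays off as the number of languages grows.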

Human language is a fantastic tool for sharing the information that's otherwise trapped inside our brains, but don't be fooled into thinking that language is the same as thought. Thought and language are closely related -- and most people actually do think in streams of language -- but by isolating them we can reach a greater understanding of their interaction and operation than we can if we are forced to consider them together.



