Automating "Translation"

I've written a very simple script that "translates" Tibetan—or rather, it does as much Tibetan work as many of us do in the university-level classroom.

I'm going to pick on Rockwell here, because that's the book I was taught Tibetan from. All I've done for the script is input the glossary from Rockwell—the script simply converts the Tibetan symbols to Rockwell's given English equivalent.

In other words, I've outsourced the tedious work that every student is taught to do in the grammar-translation classroom, "using the glossary to look up words one-by-one" and then "writing out the English words one-by-one," and given those jobs to the computer.

What I've done is given the computer the following tasks:

  1. Recognize the Tibetan symbol(s)
  2. Look the symbol string up in the glossary
  3. Replace the Tibetan symbols with English symbols
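To make those three steps concrete, here is a minimal Python sketch of the idea. It is not the actual script: the five-entry GLOSSARY is a tiny stand-in built from the glosses shown in this post, the function name gloss is made up for illustration, and the longest-match-first lookup over tsheg-separated syllables is an assumption about how such a lookup could work. For simplicity it drops the leading shad instead of passing it through.

```python
TSHEG = "\u0f0b"  # the syllable separator (tsheg)
SHAD = "\u0f0d"   # the clause/verse-line marker (shad)

# Hypothetical stand-in for the full glossary: Tibetan string -> English gloss.
GLOSSARY = {
    "སངས་རྒྱས": "Buddha",
    "ཆོས": "dharma",
    "དང": "*AND*",
    "དགེ་འདུན": "Buddhist community, sangha",
    "ལ": "*TO(la)*",
}


def gloss(text, glossary=GLOSSARY):
    """Replace each glossary match in `text` with its bracketed English gloss."""
    # Step 1: recognize the symbols by splitting the string into syllables.
    syllables = [s for s in text.strip(SHAD + " ").split(TSHEG) if s]
    longest = max(len(key.split(TSHEG)) for key in glossary)

    out = []
    i = 0
    while i < len(syllables):
        # Step 2: look up the longest run of syllables that is in the glossary.
        for n in range(min(longest, len(syllables) - i), 0, -1):
            key = TSHEG.join(syllables[i:i + n])
            if key in glossary:
                # Step 3: replace the Tibetan with the English gloss.
                out.append(f"[{glossary[key]}]")
                i += n
                break
        else:
            # No entry found: leave the syllable untouched.
            out.append(syllables[i])
            i += 1
    return " ".join(out)


print(gloss("།སངས་རྒྱས་ཆོས་དང་དགེ་འདུན་ལ"))
```

Trying the longest run first is what lets a two-syllable entry such as the one for "Buddha" win over any single-syllable entries that might share its first syllable.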

Using the Script

For example, I can feed the following symbol string to the computer:

Input: 

  1. །སངས་རྒྱས་ཆོས་དང་དགེ་འདུན་ལ
    །སྒོ་གསུམ་གུས་པས་སྐྱབས་སུ་མཆི KP 3A:6 (verse)
  2. ་་་བླ་མས་བུ་ཆེན་རྣམས་ལ་ཆོས་དང་གདམས་ངག་གི་དཀོར་མཛོད་རྣམས་ཁ་ཕྱེ་་་ MINT 88:11-2...

What the script spits out is:

Output: 

  1. ། [Buddha] [dharma] [*AND*] [Buddhist community, sangha] [*TO(la)*]
    [./,/;/:/?/!]
    [three gates = body, speech, mind] [respect, devotion] [*AGENT/INST*] [refuge] [*TO*(ladon)] [to go] KP 3A:6 (verse)
  2. ་ [lama] [*AGENT/INST*] [son] [great] [*PLURAL*] [*TO(la)*] [dharma] [*AND*] [oral instruction] [*OF*] [treasury] [*PLURAL*] [pf. to open] ་ MINT 88:11-2...

Now I'm ready to "translate"! Using the grammar clues [*GRAMMAR*], all I need to do now is re-arrange the English words into a sentence that makes sense. (This is exactly what many Tibetan students are still doing to this day!)

But wait... I have to ask, couldn't I give this output to just about any native English speaker, even one who doesn't know a lick of Tibetan? With a few simple instructions (try reading it backwards; word order is SOV; etc.), they, too, could begin "learning Tibetan" without ever actually seeing even one word of Tibetan.
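Even the "try reading it backwards" instruction is mechanical enough to hand to the computer. The sketch below is only an illustration: the function name read_backwards is made up, and it assumes the bracketed output format shown above, simply reversing the order of the bracketed glosses.

```python
import re


def read_backwards(glossed):
    """Return the bracketed glosses of a glossed line in reverse order."""
    # Each gloss is one [...] group; anything outside brackets is ignored.
    tokens = re.findall(r"\[[^\]]*\]", glossed)
    return " ".join(reversed(tokens))


print(read_backwards("[Buddha] [dharma] [*AND*] [Buddhist community, sangha] [*TO(la)*]"))
```

The result still has to be smoothed into an English sentence by hand, but none of that smoothing requires looking at Tibetan.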

The question I'd like Tibetan teachers and students to think very hard about is this: If the computer can easily provide the English for us, what "Tibetan" work is really being done?

Reading

I've mentioned in a past post that "reading" is a complex process that requires: 1) decoding and 2) comprehension. What the script does is a version of the first bit of work for the student: decoding.

And, I'm arguing, that's all the Tibetan work a student can do in the grammar-translation classroom anyway. When we're taught in English, and we decode Tibetan into English, all of our comprehension necessarily comes in English! We never learn how to comprehend Tibetan.

If students don't learn how to comprehend the source text (to truly read it), how can we expect them to learn how to translate?! If students can't learn how to think in Tibetan, how can they begin to understand an author who thinks in Tibetan?!

Machine Translation

One final point: This is not machine translation. Even if we started asking the computer to re-arrange the words according to the grammar (which we could do), it wouldn't be anything like how modern MT (machine translation) works.

Early on, this was exactly the method computer scientists tried to use to build a working MT system. But natural languages are idiomatic; they break their own rules; they are metaphorical, not literal; they are, in the end, so much more than just bare vocabulary and grammar that this approach simply doesn't work.

So much so that every working MT system, like Google Translate, is based on other methods, such as statistical analysis at the multi-word level.

If even machines can't make word-by-word translation work (and they are much, much faster at looking up words in the glossary than humans are), then why do we still expect Tibetan students to?!
