A certain amount of Wolfram|Alpha input is actually quite language independent—because it’s really in math, or chemistry, or some other international notation, or because it’s asking about something (like a place) that’s always referred to by the same name.
But inevitably many inputs do depend on human language, and in fact even now about 5% of all inputs try to use a language other than English.
The approach our team took during the initial development of Wolfram|Alpha was to accumulate large corpora of linguistic usage in different areas, then to abstract from them rules and meta-rules that could be slotted into Wolfram|Alpha’s linguistic processing system.
Now that Wolfram|Alpha has been released, our team has a major new—and more accurate—source, at least for English: the millions and millions of actual inputs that are given to the system.
So what’s involved in generalizing to other languages? A certain amount can be done by word- or phrase-wise translation. Often there will be multiple translations at this level. And when there are several words or phrases together, there will often be a combinatorial explosion in the number of possibilities.
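To make the combinatorial explosion concrete, here is a minimal sketch of word-wise translation in Python. It is only an illustration, not how Wolfram|Alpha's linguistic processing works: the translation table, its entries, and the function names are all hypothetical. Each input word maps to a list of candidate translations, and the candidates for a phrase are every combination of one choice per word.

```python
from itertools import product
from math import prod

# Hypothetical word-level translation table (Spanish -> English candidates).
# The entries are invented for illustration only.
CANDIDATES = {
    "banco": ["bank", "bench"],
    "de": ["of", "from"],
    "madrid": ["Madrid"],
}


def word_options(phrase, table):
    """List the candidate translations for each word; unknown words pass through unchanged."""
    return [table.get(word, [word]) for word in phrase.lower().split()]


def candidate_translations(phrase, table):
    """Return every phrase obtained by picking one candidate per word."""
    return [" ".join(choice) for choice in product(*word_options(phrase, table))]


def candidate_count(phrase, table):
    """Count the candidates without enumerating them: the product of per-word option counts."""
    return prod(len(options) for options in word_options(phrase, table))


if __name__ == "__main__":
    for candidate in candidate_translations("banco de Madrid", CANDIDATES):
        print(candidate)
    print(candidate_count("banco de Madrid", CANDIDATES))
```

With two options for each of two words and one for the third, the phrase already has four candidates. The count is the product of the per-word option counts, so it grows multiplicatively as phrases get longer, and something beyond word-wise lookup is needed to pick among the candidates.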
The generalization of Wolfram|Alpha to all major human languages is a huge undertaking. But it’s one that we’re committed to pursuing.
Source: Wolfram|Alpha Blog
