Wednesday, 30 September 2009

Facebook lets everyone crowdsource translations

When Facebook began expanding internationally, instead of paying professionals to translate its site into other languages, it turned to its users. They quickly made it available in 65 languages.
Now Facebook is turning around and offering that technology to other sites. Called Translations for Facebook Connect, it’s available to any of the 15,000 partners who use the company’s Connect service. (Launched last year, Connect lets people use their Facebook ID to log in to other online destinations.)
It works for Web sites, apps and widgets. Publishers and developers insert a few lines of JavaScript into their source code, pick which languages they want and then turn to Facebook users to translate content.
Sharing its translation strategy is not a completely altruistic move: by dangling the carrot of free translations in exchange for using Facebook Connect, the startup can expand its reach and deter companies from using competing ID systems.
It can also start collecting more substantial amounts of data on how people translate language, which it could later feed into translation algorithms. In contrast, Google has relied on programs, rather than humans, for its translation efforts. Then again, Google has an extraordinarily rich data set to extrapolate rules from, with book search and pretty much every translated site on the web. For now, Facebook has only its own interface to pull translation data from, but it could have much more if it drew on all the mundane and colloquial conversations happening every day on the site.
From Dave Ellis, a Facebook engineer and computational linguistics expert:

Social natural language processing is (in a sense) in its infancy. We hope to capture aspects of its evolution, just as the field comes to better describe and understand ongoing changes in human languages. We expect more fine-grained analyses to follow, using our framework to compare and contrast a variety of languages (from Bantu to Balinese) and phenomena (inside jokes, cross-linguistic usage of l33t and txt msg terms).
