Winds of Change.NET: Liberty. Discovery. Humanity. Victory.

Call for Information: Arabic and Farsi Machine Translation

Jeff Jarvis unfortunately buries his lede in a post re Spirit of America fundraising: Both SixApart (TypePad) and Blogger have agreed to help put together usable Arabic versions so the circle of Middle East bloggers can be expanded beyond those with English-language skills. Tip of the hat and best wishes to them, as well as to SoA's own project to produce an Arabic blogging tool.

So let me do the VC thing and ask the "lead the duck" question: Assuming we can foster an Arabic blogging community, how do we stay connected across the language barrier? Yes, there are dedicated folks working at manually translating blog posts and comments at places like Sarmad's Road of a Nation Forums, but it's a grind and slows the action to a crawl.

One part of the answer is to throw cheap cycles at the problem, that is, machine translation. Yes, I'm fully aware of the limitations of the technology, but a cheap something is better than an expensive nothing. And we can all find common ground in making fun of the translation software, at the least.

To that end, I'd like to ask the help of readers to identify best-of-breed technology sources for translation between English and Arabic and Farsi. I'm already aware of Language Weaver and Meaningful Machines on the Arabic-to-English front. Since much of the development in these areas has been funded (quite openly) by DARPA and In-Q-Tel, there's more available for the path into English than the other way, but we're going to need both, so please send in any tips. The current state of the art has moved to corpus-based translation, but an older rule-based system would be better than nothing. Right now, this is a technology survey; we'll worry about how to get the systems integration and business deals done later. Have a clue for me? Drop it in the comments.

6 Comments

Tim, it's not my area of expertise, but you might touch base with the AI / natural language people at SUNY Albany ... they're doing work for DARPA now (a new relationship IIRC) but might be able to point you in some good directions.

Ack, clicked too fast. A good person to start with is Tomek Strzalkowski.

I should add a few practical qualifiers: It needs to be server-based, preferably with a web services interface or capable of having one constructed. It should be either a commercial product or, if academic, one constructed with forethought for scalability and operations.
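
To make "capable of having a web services interface constructed" a bit more concrete, here is a minimal sketch of the sort of thin wrapper that could sit in front of whatever engine turns up. Everything in it is an assumption for illustration: the JSON field names (text, source, target), the language codes, the status codes, and the echo_backend placeholder, which only tags the text and does no translation. It is not the API of Language Weaver, Meaningful Machines, or any other real product.

```python
import json
from typing import Callable, Tuple

# Hypothetical backend signature: (text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]

# Assumed language pairs for this sketch: Arabic and Farsi, to and from English.
SUPPORTED = {("ar", "en"), ("en", "ar"), ("fa", "en"), ("en", "fa")}


def _error(status: int, message: str) -> Tuple[int, bytes]:
    """Build an (HTTP status, JSON body) error pair."""
    return status, json.dumps({"error": message}).encode("utf-8")


def handle_translate(body: bytes, backend: Translator) -> Tuple[int, bytes]:
    """Turn one JSON request body into an (HTTP status, JSON response body) pair."""
    try:
        req = json.loads(body.decode("utf-8"))
        text, src, dst = req["text"], req["source"], req["target"]
    except (UnicodeDecodeError, ValueError, KeyError, TypeError):
        return _error(400, "expected JSON with text, source, target")
    if not all(isinstance(v, str) for v in (text, src, dst)):
        return _error(400, "text, source and target must be strings")
    if (src, dst) not in SUPPORTED:
        return _error(422, "unsupported language pair")
    translated = backend(text, src, dst)
    # ensure_ascii=False keeps Arabic and Farsi script readable in the response.
    payload = json.dumps({"translation": translated}, ensure_ascii=False)
    return 200, payload.encode("utf-8")


def echo_backend(text: str, src: str, dst: str) -> str:
    """Placeholder only: tags the text instead of translating it."""
    return f"[{src}->{dst}] {text}"
```

Because the engine is passed in as a plain function, a commercial product or an academic system could be swapped in behind the same interface. The handler itself is independent of any particular web server, so it could be mounted under whatever HTTP framework the hosting side prefers.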

This is slightly off topic but related - at least for me it is.

From that biiiggg country that begins with a "see", I have been reading the Iraqi blogs daily since they began. Until about 5 or 6 days ago. Blogspot is blocked here and I was able to access the sites in a roundabout way - through the Google translator. Now I can't reach any Blogspot site.

So, as you are making your plans, please try to find a way to make English translations available to all the world.

It would mean a lot to me.

Good point, JFarr, and it introduces the related (not 'off') topic of firewall subversion to allow communication against the desire of oppressive regimes. Perversely, I suspect the current most advanced techniques in that regard come from spammers on one hand, and p2p networks on the other.

I used to use Ajeeb when it was free:

http://english.ajeeb.com/

It was good enough in many cases. Not perfect by any standard.
