Winds of Change.NET: Liberty. Discovery. Humanity. Victory.


Call for Information: Arabic and Farsi Machine Translation


Jeff Jarvis unfortunately buries his lead in a post re Spirit of America fundraising: Both SixApart (TypePad) and Blogger have agreed to help put together usable Arabic versions so the circle of Middle East bloggers can be expanded beyond those with English language skills. Tip of the hat and best wishes to them, as well as to SoA's own project to produce an Arabic blogging tool.

So let me do the VC thing and ask the "lead the duck" question: Assuming we can foster an Arabic blogging community, how do we stay connected across the language barrier? Yes, there are dedicated folks working at manually translating blog posts and comments at places like Sarmad's Road of a Nation Forums, but it's a grind and slows the action to a crawl.

One part of the answer is to throw cheap cycles at the problem, that is, machine translation. Yes, I'm fully aware of the limitations of the technology, but a cheap something is better than an expensive nothing. And we can all find common ground in making fun of the translation software, at the least.

To that end, I'd like to ask the help of readers to identify best-of-breed technology sources for translation between English and Arabic and Farsi. I'm already aware of Language Weaver and Meaningful Machines on the Arabic-to-English front. Since much of the development in these areas has been funded (quite openly) by DARPA and In-Q-Tel, there's more available for the path into English than the other way, but we're going to need both, so please send in any tips. The current state of the art has moved to corpus-based translation, but an older rule-based system would be better than nothing. Right now, this is a technology survey; we'll worry about how to get the systems integration and business deals done later. Have a clue for me? Drop it in the comments.

6 Comments

Tim, it's not my area of expertise, but you might touch base with the AI / natural language people at SUNY Albany ... they're doing work for DARPA now (a new relationship IIRC) but might be able to point you in some good directions.

Ack, clicked too fast. A good person to start with is Tomek Strzalkowski.

I should add a few practical qualifiers: Needs to be server-based, preferably with a web services interface or capable of having one constructed. Either a commercial product or, if academic, constructed with forethought for scalability and operations.
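To make the "web services interface" qualifier concrete, here is a minimal sketch of what such a wrapper might look like. Everything in it is an assumption for illustration: the JSON field names (`text`, `source`, `target`), the `handle_request` and `app` functions, and above all `echo_backend`, which is a placeholder that only tags the text and does no translation. A real engine, commercial or academic, would be plugged in behind the same function signature.

```python
import json
from typing import Callable, Dict, Tuple

# Hypothetical signature: (text, source_lang, target_lang) -> translated text.
Backend = Callable[[str, str, str], str]

# The language pairs this survey cares about, in both directions.
SUPPORTED_PAIRS = {("en", "ar"), ("ar", "en"), ("en", "fa"), ("fa", "en")}


def echo_backend(text: str, source: str, target: str) -> str:
    """Placeholder for a real MT engine; it only tags the input text."""
    return f"[{source}->{target}] {text}"


def handle_request(body: bytes, backend: Backend = echo_backend) -> Tuple[int, Dict[str, str]]:
    """Validate a JSON request body and return (HTTP status, response payload)."""
    try:
        req = json.loads(body.decode("utf-8"))
        text, source, target = req["text"], req["source"], req["target"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected JSON with text, source, target"}
    if not all(isinstance(v, str) for v in (text, source, target)):
        return 400, {"error": "text, source, target must be strings"}
    if (source, target) not in SUPPORTED_PAIRS:
        return 422, {"error": f"unsupported language pair {source}->{target}"}
    return 200, {"translation": backend(text, source, target)}


def app(environ, start_response):
    """WSGI entry point: read a JSON body from the request, reply with JSON."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    status, payload = handle_request(environ["wsgi.input"].read(length))
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    reason = {200: "OK", 400: "Bad Request", 422: "Unprocessable Entity"}[status]
    start_response(
        f"{status} {reason}",
        [("Content-Type", "application/json; charset=utf-8"),
         ("Content-Length", str(len(data)))],
    )
    return [data]
```

For a quick local trial, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` would serve this. Keeping the engine behind a plain function is the point of the sketch: a blogging tool would talk to one stable interface, and the engine behind it could be swapped as the survey turns up better candidates.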

This is slightly off topic but related - at least for me it is.

From that biiiggg country that begins with a "see", I have been reading the Iraqi blogs daily since they began. Until about 5 or 6 days ago. Blogspot is blocked here, and I was able to access the sites in a roundabout way: through the Google translator. Now I can't reach any Blogspot site.

So, as you are making your plans, please try to find a way to make English translations available to all the world.

It would mean a lot to me.

Good point, JFarr, and it introduces the related (not 'off') topic of firewall subversion to allow communication against the desire of oppressive regimes. Perversely, I suspect the current most advanced techniques in that regard come from spammers on one hand, and p2p networks on the other.

I used to use Ajeeb when it was free:

http://english.ajeeb.com/

It was good enough in many cases. Not perfect by any standard.




