Tuesday, March 23, 2010

Started Beta testing!

Start Your Translation Engines!

The CrisisTerp site instance is up and running on my slicehost space! After a good bit of debugging today, it seems to be working well.

Initial Bugs

Bug: Turns out the moses-python code I wrote was a complete debacle. It was good for exactly one translation; after completing that one, my moses-python wrapper would crash the entire site.

Solution: Python multiprocessing to the rescue! I was able to use multiprocessing to run a translation job in its own process, store the translation result, wipe the moses-python memory space, and re-instantiate that code for subsequent jobs. A very simple fix, though not a long-term one.
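For the curious, here is a minimal sketch of the one-process-per-job idea. It is not the actual CrisisTerp code: fake_translate is a hypothetical stand-in for the moses-python call, translate_isolated and its timeout are names made up for illustration, and the "fork" start method assumes a POSIX host.

```python
import multiprocessing as mp


def fake_translate(text):
    # Hypothetical stand-in for the real moses-python call.
    return text.upper()


def _worker(conn, translate, text):
    # Runs in the child process: translate once, send the result back, exit.
    try:
        conn.send(("ok", translate(text)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def translate_isolated(text, translate=fake_translate, timeout=30.0):
    """Run one translation in a throwaway child process and return its result."""
    # Assumption: "fork" is available (POSIX only). The child's memory,
    # including whatever state the wrapper leaves behind, is released on exit.
    ctx = mp.get_context("fork")
    parent_conn, child_conn = ctx.Pipe(duplex=False)  # (receive end, send end)
    proc = ctx.Process(target=_worker, args=(child_conn, translate, text))
    proc.start()
    child_conn.close()  # the parent only reads
    try:
        if not parent_conn.poll(timeout):
            raise TimeoutError("translation did not finish in time")
        try:
            status, payload = parent_conn.recv()
        except EOFError:
            # The child died (e.g. crashed) before sending anything.
            raise RuntimeError("translation process exited without a result")
    finally:
        proc.join(timeout=1.0)
        if proc.is_alive():
            proc.terminate()
            proc.join()
        parent_conn.close()
    if status != "ok":
        raise RuntimeError(payload)
    return payload
```

Each call pays the cost of starting a fresh process, which fits the "simple, not long-term" nature of the fix: a crash in the wrapper takes down only the child, not the site.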

Bug: For some unknown sequences, moses generates an "|UNK" sub-string in the translation output. Simple enough to fix with a regex.
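A rough sketch of that kind of cleanup, assuming the marker shows up as one or more literal "|UNK" chunks glued onto a token (strip_unk is a hypothetical helper name, and the sample strings are made up):

```python
import re

# Matches one or more consecutive "|UNK" markers.
UNK_MARKER = re.compile(r"(?:\|UNK)+")


def strip_unk(translation):
    """Remove "|UNK" markers but keep the untranslated token itself."""
    return UNK_MARKER.sub("", translation)
```

For example, strip_unk("need tant|UNK now") gives back "need tant now", leaving the unknown word in place for the reader.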

Bug: Google Analytics. I wanted to track usage statistics and that sort of thing to see if it's worth maintaining a site (slicehost ain't free). I had some issues with the JavaScript Google gave me, but I managed to work around them. Google Analytics is now tracking CrisisTerp! On that note, I can't say enough positive things about CherryPy. Web development isn't my forte, but CherryPy has made all the difference!

Thought: If the site generates enough traffic, then I'll probably consider monetizing parts of it (AdSense on the site or on this blog) to reduce the cost of running it. If the traffic load is heavy enough, I may need to consider purchasing more space from slicehost. In the event I choose to go this route, I'll make the dollar amounts (costs, etc.) publicly available. If the income ends up exceeding the costs, then I'll donate the extra money to the Red Cross. Seems fitting.

Next Step?

I need to package the site code, my parallel corpora, and the moses-python wrapper for deployment on the ccmts Google Code page OR on a new Google Code page. I'm tempted to fire up a new page... then again, that's a lot more work than it's worth.

Future Projects

I have a lot of ideas for where to take this system.
  • I have some additional languages I'd like to target, so expect to see more language support added in the future.
  • Web service integration features (a registration form for a UUID so I can better track bandwidth consumption).
  • Integration with social networking sites!
That about wraps it up for now. I'm off to Boston for PAX this Friday, so I'll be out of the loop. Let it be known, I'll be working toward the packaging of corpora and source code next week!
ct
