Research on building Word Lens for Microsoft’s platforms (let’s do it)

The Word Lens app is in the headlines. It is simply awesome. It had some bugs too, but the post about those bugs dates back to 2010. Since then the developers tuned the algorithm, and Google ended up buying the app. It pains me that this app will never make its way to Windows Phone and the Windows Store, because Google owns it and, as you all know, Google’s strategy is not to publish apps for platforms with a small user base. So I thought about implementing a similar app for Microsoft’s platforms. I may not succeed, but I would like to share the ideas I had and the research I have done with my mates; perhaps this post will motivate them to build an app like this.

The BIG Idea

OCR has a very long history. In 1914, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code. In the late 1920s and into the 1930s, Goldberg developed what he called a “Statistical Machine” for searching microfilm archives using an optical code recognition system. In 1931 he was granted US Patent number 1,838,389 for the invention. The patent was acquired by IBM.

The company behind OmniPage started in the OCR business in the 1970s, and today OmniPage supports 125 languages. Then came Tesseract (yes, the same Tesseract whose SDK we use to implement OCR in Android apps), an offline OCR engine whose development began in 1985. ABBYY, founded in 1989, released its own OCR software, and most scanner hardware vendors adopted it and bundled ABBYY FineReader with their scanners. ABBYY was a major player until smartphones came to dominate the world. Some people still use it.

With the arrival of the iPhone, smartphones started to dominate the world, and then came Android and Windows Phone. By 2010, people felt they had a PC in their pocket. They carried essential documents with them, which opened the gate for most PC software vendors to develop apps for smartphones. Many developers started building OCR apps that scan a picture and extract the text from it, and Augmented Reality came to smartphones as well. Meanwhile, Google introduced Google Goggles to its Android app store, and Windows Phone came with an app called “Bing Vision” that scans an image, extracts the text on it and translates it into other languages. Below is an image of the Bing Vision app. Remember, Bing Vision is not powered by Augmented Reality, but it supports more languages than the current Word Lens app.

Not a buggy app, but not efficient in the past

A developer, a game developer in fact, started to build an app named “Word Lens” that combines Augmented Reality and OCR on a smartphone and translates single words into other languages. Some early blog posts say the app was not quite good enough.

See the image above, taken from this post made in 2010, and you can see how badly Word Lens performed back then. The image shows that the app had trouble identifying the right things: the combination of Augmented Reality and OCR worked here, but the result was useless.

See the picture above, which I got from the same blog post. The app translated a text whose correct and meaningful translation is “if you’re going to vomit, put your hands here.” The obvious suspect here is the translation server. The post claims the original text was in Spanish, but by 2010 most Spanish -> English translation servers were working perfectly. So the problem lies with the algorithm the app used in 2010. There is an AndroidCentral.com post titled “Word Lens review – great translations for single words”, and yes, at that time Word Lens was not good at translating a whole sentence. The translation servers were not at fault, as they had been able to translate sentences since 2005, so once again it was the algorithm used in 2010-2012 that caused the problem.

Continuous tuning makes perfect

Good apps do not always enjoy a super-duper start, but they keep getting tuned and go on to enjoy a long stay in the marketplace. Word Lens did exactly that. Although it did not have a great start, within 6 months it was in the news. The developers worked continuously on their algorithm to perfect it, and they impressed Google enough that Google bought the firm.

The Algorithm

Today I was thinking about implementing the same app for Windows Phone. I searched the internet for more than 3 hours and finally ended up with the general algorithm used in Word Lens. It is posted on stackoverflow.com as an answer to a question. The person who answered may be John DeWeese, nicknamed jd. He summarizes the algorithm as follows:

  1. copy the image from the camera and get its grayscale component
  2. level out the image so the text stands out clearly against the background
  3. draw boxes around things that look like characters & sentences
  4. do OCR: match the pixels in each box against a database of characters — this is actually pretty hard!
  5. collect the characters into words, look up in a dictionary (this is hard too, because there will be mistakes in the OCR)
  6. draw the results back onto the image
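
Steps 1 to 3 of the list above can be sketched in a few lines. This is only a minimal illustration in plain Python (chosen so it runs without any libraries; the same logic ports directly to C#), not the code Word Lens actually uses. The function names `to_grayscale`, `level` and `find_boxes`, the fixed mid-range threshold, and the 4-connected flood fill are all assumptions made for this example.

```python
from collections import deque

def to_grayscale(rgb_rows):
    """Step 1: turn rows of (r, g, b) tuples into 0-255 luma values."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb_rows]

def level(gray_rows):
    """Step 2: stretch contrast to the full 0-255 range, then binarize.
    Returns rows of booleans, True where the pixel is dark ("ink")."""
    flat = [v for row in gray_rows for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing stands out from the background
        return [[False] * len(row) for row in gray_rows]
    stretched = [[(v - lo) * 255 // (hi - lo) for v in row] for row in gray_rows]
    return [[v < 128 for v in row] for row in stretched]

def find_boxes(ink_rows):
    """Step 3: bounding boxes (left, top, right, bottom) of connected ink blobs."""
    height = len(ink_rows)
    width = len(ink_rows[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not ink_rows[y][x] or seen[y][x]:
                continue
            # Flood fill one blob, growing its bounding box as we go.
            left, top, right, bottom = x, y, x, y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and ink_rows[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)

# Tiny 5x3 test image with two vertical strokes.
W, B = (255, 255, 255), (0, 0, 0)
image = [
    [W, B, W, W, B],
    [W, B, W, W, B],
    [W, W, W, W, W],
]
print(find_boxes(level(to_grayscale(image))))  # [(1, 0, 1, 1), (4, 0, 4, 1)]
```

A real camera frame would need a smarter, locally adaptive threshold and some merging of nearby boxes into words and lines, but the overall shape of the pipeline stays the same.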

This algorithm is yet to be implemented on Microsoft platforms, but it is not impossible; we can do it. All 6 steps are quite feasible in C#. Time is the only constraint. I am working on it, and I ask my mates, the good Windows and Windows Phone developers, to work on it too.
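
Step 5 of the list is hard because the OCR output contains mistakes, so an exact dictionary lookup is not enough. Here is a rough sketch of a lookup that tolerates such mistakes, using `get_close_matches` from Python's standard `difflib` module. The tiny Spanish -> English `DICTIONARY`, the helper names `translate_word` and `translate_words`, and the 0.75 similarity cutoff are hypothetical values chosen for illustration; nothing here is taken from Word Lens itself.

```python
from difflib import get_close_matches

# Hypothetical word list; a real app would load a full dictionary.
DICTIONARY = {
    "salida": "exit",
    "peligro": "danger",
    "abierto": "open",
    "cerrado": "closed",
}

def translate_word(ocr_word, dictionary=DICTIONARY, cutoff=0.75):
    """Translate the dictionary entry closest to ocr_word.
    Returns None when no entry is similar enough."""
    word = ocr_word.lower()
    if word in dictionary:  # exact hit, no fuzzy matching needed
        return dictionary[word]
    matches = get_close_matches(word, list(dictionary), n=1, cutoff=cutoff)
    return dictionary[matches[0]] if matches else None

def translate_words(ocr_words, dictionary=DICTIONARY):
    """Translate each word, keeping the original text when nothing matches."""
    return [translate_word(w, dictionary) or w for w in ocr_words]

# "Sa1ida" and "cerrad0" simulate typical OCR confusions (l/1, o/0).
print(translate_words(["Sa1ida", "cerrad0", "xyzzy"]))  # ['exit', 'closed', 'xyzzy']
```

The cutoff is a trade-off: set it too low and random noise gets "translated", set it too high and words with a single misread character are dropped.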

Some links that may help make this possible

I previously worked on image processing and OCR for a 2013 Imagine Cup project, so I have some good knowledge of the area and maintain a URL directory of resources. Here are some links that anyone can refer to in order to build this.

 
