OCR Sharp - Enterprise OCR for C# Developers

OCR Sharp is a new end-to-end optical character recognition (OCR) technology for developers on the .NET platform. It is developed in C#.

It is uniquely easy to install, and uses advanced natural language processing models and spellchecking technology to achieve accuracy up to 10 times greater than conventional OCR packages such as Tesseract.

Why do developers need OCR Sharp?

C# developers and those developing for the .NET platform and runtime are most commonly engaged to develop business systems and intranet applications. The purpose of these applications is to help businesses become more efficient by moving away from the paper world and toward a knowledge-based enterprise. OCR technology allows old paper resources to be digitized, and allows newly generated paper documents to enter the digital knowledge base as they are produced.

OCR is a key technology in moving towards a paperless office.

What problems does OCR technology present?

OCR technology is still relatively in its infancy. Although OCR technology to date has become quite accurate at identifying individual letters, it is not as smart as a human being. It doesn't truly understand language, and therefore it makes mistakes all the time, giving as little as 90% accuracy in identifying words. This is clearly not suitable for an entirely automated process, and can eat up thousands of man-hours in manual editing and correction.

OCR Sharp moves the ball forward by using predictive models and a true understanding of natural language to “read” documents, understand words in context, and produce digitized text that is an order of magnitude more accurate than we’re used to from OCR technology.

OCR Sharp is in a promising beta, and we can expect its initial launch on the .NET marketplace and NuGet within 2016.