Optical Character Recognition | OCR

Text Recognition, Machine Learning & Document Search

Optical Character Recognition (OCR), also known as text recognition, is a technology for converting text, tables and even drawings from digitally scanned physical documents into machine-readable text or code. An OCR system combines hardware, such as a scanner or camera that digitizes the page, with software that recognizes the characters in the resulting image. Advanced OCR services use Machine Learning (ML) for more sophisticated pattern and character recognition, which lets them identify languages, diagrams, tabular data and handwriting styles.

OCR is often used to digitize purchase receipts, postal mail, and legal or historic documents. The output can be stored as database entries in an application or saved in traditional document formats such as PDF, spreadsheets or word-processing files such as Microsoft Word documents. The advantage of OCR is that once the information is machine-readable, it can be searched, copied and pasted, repurposed into other media or used in calculations.
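
As a rough illustration of what becomes possible once a document is machine-readable, the Python sketch below searches the text of a receipt and totals its line items. The receipt text, the `LINE_ITEM` pattern and the helper names `parse_line_items` and `search` are all hypothetical, invented for this example; real OCR output is usually noisier and would need more forgiving parsing.

```python
import re
from decimal import Decimal

# Hypothetical OCR output for a scanned purchase receipt (illustrative data only).
ocr_text = """\
CORNER MARKET
Coffee beans      12.50
Oat milk           3.25
Paper filters      4.00
"""

# Assumed layout: a label, some whitespace, then an amount such as 12.50 at the end of the line.
LINE_ITEM = re.compile(r"^(?P<label>.+?)\s+(?P<amount>\d+\.\d{2})$")


def parse_line_items(text):
    """Return (label, amount) pairs for each line that ends in a price."""
    items = []
    for line in text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if match:
            items.append((match.group("label"), Decimal(match.group("amount"))))
    return items


def search(text, term):
    """Return the lines that contain term, ignoring case."""
    return [line for line in text.splitlines() if term.lower() in line.lower()]


items = parse_line_items(ocr_text)
# Decimal avoids floating-point rounding when adding currency amounts.
total = sum((amount for _, amount in items), Decimal("0"))

print(total)
print(search(ocr_text, "milk"))
```

The header line is skipped because it does not end in an amount, so only the three priced lines contribute to the total. In a real application the same parsed values could be written to database entries or a spreadsheet.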

iFactory has experience building OCR applications, and we can also assist with incorporating OCR web services into existing client applications.