How to make a simple Optical Character Recognition script.

Published Apr 26, 2018. Last updated May 25, 2018.

Optical character recognition (OCR) is the recognition of typed, handwritten, or printed text in an image and its conversion into machine-readable text. OCR can automate tasks that would otherwise need a human: banks use it to process checks without human involvement, it can generate the content of documents from their scanned images, and it can also help visually impaired people.
For this OCR script we'll use Microsoft's Computer Vision API. We'll make the API call from Python with a POST request, and the response comes back in JSON format.
To get started you need a Microsoft account, after which you can get a free 30-day subscription to the Computer Vision API. You will need your secret subscription key, which looks similar to this: 98f714r6vb2e193018b28fg1u9b3b0d7e7

# Define the base URL for the API call.
base_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/"
ocr_url = base_url + "ocr"

# Define the subscription key and the header that carries it.
sub = "98f714r6vb2e193018b28fg1u9b3b0d7e7"
headers = {'Ocp-Apim-Subscription-Key': sub}
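
Since the subscription key is secret, it may be safer to keep it out of the script itself. The following is a minimal sketch of one way to do that, assuming the key has been stored in an environment variable. The variable name VISION_SUBSCRIPTION_KEY and the helper build_headers are illustrative choices, not part of the API.

```python
import os


def build_headers(env_var="VISION_SUBSCRIPTION_KEY"):
    """Build the request headers from a key stored in the environment.

    The environment variable name is an assumption made for this sketch;
    use whatever name suits your setup.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError("Set the %s environment variable first." % env_var)
    # Same header name as in the snippet above.
    return {'Ocp-Apim-Subscription-Key': key}


# Usage, once the variable is set in your shell:
#   headers = build_headers()
```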

The Microsoft OCR API is quite flexible and accepts several parameters depending on the use case. Here we define two: the language, set to 'unk' so that the service detects the language automatically (English in our case), and detectOrientation, set to true so that the service detects the orientation of the text. We also need the URL of the image we want to run OCR on (a local image can be uploaded instead), so we'll define that as well.

# Define the language and orientation-detection parameters.
params = {'language': 'unk', 'detectOrientation': 'true'}

# Define the image URL.
img = "https://quotefancy.com/download/18846/original/wallpaper.jpg"
data = {'url': img}

The image at the link above is shown here:
wallpaper.jpg
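
As mentioned earlier, a local image can be uploaded instead of passing a URL. Below is a minimal sketch of how that request could be assembled, assuming the image bytes are sent as the raw request body with a Content-Type of application/octet-stream. The helper name build_local_image_request and the file name in the usage comment are hypothetical.

```python
def build_local_image_request(image_bytes, subscription_key):
    """Return keyword arguments for requests.post() that upload raw image bytes.

    This is an illustrative helper, not part of the API. It only assembles
    the request; nothing is sent over the network here.
    """
    headers = {
        'Ocp-Apim-Subscription-Key': subscription_key,
        # Tells the service that the body is binary image data, not JSON.
        'Content-Type': 'application/octet-stream',
    }
    params = {'language': 'unk', 'detectOrientation': 'true'}
    return {'headers': headers, 'params': params, 'data': image_bytes}


# Usage (the file name is an assumption):
#   with open("wallpaper.jpg", "rb") as f:
#       kwargs = build_local_image_request(f.read(), sub)
#   response = requests.post(ocr_url, **kwargs)
```

Note that the image goes in the data argument rather than json, because the body is binary rather than a JSON document.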

Now we'll import requests and make a POST request, passing ocr_url, headers, params, and the JSON body.

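import requests

# Send the POST request to the OCR endpoint.
response = requests.post(ocr_url, headers=headers, params=params, json=data)
# Raise an exception if the API returned an HTTP error status.
response.raise_for_status()
# Parse the JSON body into a Python dict.
analysis = response.json()
print(analysis)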

The JSON output of the above script contains the detected language, the orientation, and the text angle, along with bounding box coordinates for each region, line, and word. Here's the output:

{  
   'language':'en',
   'orientation':'Up',
   'textAngle':0.0,
   'regions':[  
      {  
         'boundingBox':'689,768,2462,1049',
         'lines':[  
            {  
               'boundingBox':'689,768,2462,180',
               'words':[  
                  {  
                     'boundingBox':'689,768,541,158',
                     'text':'Work'
                  },
                  {  
                     'boundingBox':'1293,768,450,158',
                     'text':'hard'
                  },
                  {  
                     'boundingBox':'1816,768,158,156',
                     'text':'in'
                  },
                  {  
                     'boundingBox':'2041,768,771,180',
                     'text':'silence,'
                  },
                  {  
                     'boundingBox':'2889,768,262,158',
                     'text':'Let'
                  }
               ]
            },
            {  
               'boundingBox':'689,1037,2454,181',
               'words':[  
                  {  
                     'boundingBox':'689,1075,399,143',
                     'text':'your'
                  },
                  {  
                     'boundingBox':'1135,1074,722,103',
                     'text':'success'
                  },
                  {  
                     'boundingBox':'1918,1037,217,140',
                     'text':'be'
                  },
                  {  
                     'boundingBox':'2184,1075,399,143',
                     'text':'your'
                  },
                  {  
                     'boundingBox':'2638,1037,505,140',
                     'text':'noise.'
                  }
               ]
            },
            {  
               'boundingBox':'1717,1358,408,52',
               'words':[  
                  {  
                     'boundingBox':'1717,1359,173,51',
                     'text':'Frank'
                  },
                  {  
                     'boundingBox':'1913,1358,212,52',
                     'text':'Ocean'
                  }
               ]
            },
            {  
               'boundingBox':'1782,1765,276,52',
               'words':[  
                  {  
                     'boundingBox':'1782,1765,276,52',
                     'text':'@quoteßancu'
                  }
               ]
            }
         ]
      }
   ]
}
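
To turn this nested structure back into readable text, one option is to walk regions, then lines, then words, and join the words of each line. The sketch below does that on a small excerpt of the output above. The function names extract_lines and parse_bounding_box are illustrative, and the bounding box is assumed to be in left, top, width, height order.

```python
def extract_lines(analysis):
    """Return the recognized text as a list of strings, one per line."""
    lines = []
    for region in analysis.get('regions', []):
        for line in region.get('lines', []):
            words = [word['text'] for word in line.get('words', [])]
            lines.append(' '.join(words))
    return lines


def parse_bounding_box(box):
    """Split a 'left,top,width,height' string into a tuple of four ints."""
    left, top, width, height = (int(value) for value in box.split(','))
    return left, top, width, height


# A small excerpt of the response shown above.
sample = {
    'regions': [{
        'boundingBox': '689,768,2462,1049',
        'lines': [{
            'boundingBox': '1717,1358,408,52',
            'words': [
                {'boundingBox': '1717,1359,173,51', 'text': 'Frank'},
                {'boundingBox': '1913,1358,212,52', 'text': 'Ocean'},
            ],
        }],
    }],
}

print(extract_lines(sample))                   # ['Frank Ocean']
print(parse_bounding_box('1717,1358,408,52'))  # (1717, 1358, 408, 52)
```

With the full response, extract_lines(analysis) would give one string per detected line of the quote.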

Enjoy!

P.S. If you need any clarification, please post a comment.

Discover and read more posts from Akhand Pratap Mishra