Microsoft Project Oxford OCR

Introduction

Recently, Microsoft released a set of artificial-intelligence-based vision, speech, and language APIs. The vision set includes the Computer Vision, Face, Emotion, and Video APIs. The speech set includes the Speech, Speaker Recognition, and Custom Recognition Intelligent Service APIs. The language set includes the Spell Check, Language Understanding Intelligent Service, and Web Language Model APIs. Today, I'm going to test the optical character recognition (OCR) feature of the Computer Vision API.

Prerequisites

A Microsoft account
A subscription to Microsoft Project Oxford

Demo

Here is the demo site, which does not include a subscription key. You will need to fill in your own subscription key to make it work properly.
[Screenshot: start]

After you enter the subscription key and the image URL, the result is returned beneath the image.
[Screenshot: result]

Code

This simple page lets the user enter a subscription key and an image URL, then sends a request to the Microsoft Project Oxford API using jQuery's ajax function. Two request headers must be set before the AJAX (asynchronous HTTP) request is sent: Content-Type and Ocp-Apim-Subscription-Key. When the result arrives, it is displayed as a JSON string below the image.
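Before the full page, here is a minimal sketch of just the request settings, pulled out into a function so they are easier to read and test. The helper name buildOcrRequest and its parameters are my own invention for illustration, not part of the API. It uses jQuery's contentType and headers options instead of beforeSend, and, as far as I understand, language=unk asks the service to detect the language automatically.

```javascript
// Sketch: build the settings object for the OCR request without sending it.
// buildOcrRequest is a hypothetical helper, not part of the original page or the API.
function buildOcrRequest(subscriptionKey, imageUrl, language, detectOrientation) {
  var endpoint = "https://api.projectoxford.ai/vision/v1/ocr";
  // Default to the same query values the page uses: language=unk, detectOrientation=true.
  var query = "language=" + encodeURIComponent(language || "unk") +
              "&detectOrientation=" + (detectOrientation === false ? "false" : "true");
  return {
    type: "POST",
    url: endpoint + "?" + query,
    // jQuery sends this value as the Content-Type request header.
    contentType: "application/json",
    headers: { "Ocp-Apim-Subscription-Key": subscriptionKey },
    // The request body is a JSON object holding the image URL.
    data: JSON.stringify({ Url: imageUrl }),
    dataType: "json"
  };
}
```

The returned object could be passed straight to $.ajax(), for example $.ajax(buildOcrRequest(key, url)). The complete page, which makes the same request inline, follows.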

<script src="https://code.jquery.com/jquery-2.2.1.min.js"></script>
<script type="text/javascript">
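	// Read the form values, preview the image, and send its URL to the OCR endpoint.
	function submitForm(){
		var sbkey = $("#key").val();
		var imageUrl = $("#url").val();
		// Show the image that is being analyzed.
		$("#img").attr("src", imageUrl);
		$.ajax({
			type: "POST",
			url: "https://api.projectoxford.ai/vision/v1/ocr?language=unk&detectOrientation=true",
			// The request body is a JSON object holding the image URL.
			data: JSON.stringify({ Url: imageUrl }),
			dataType: "json",
			beforeSend: function(xhrObj) {
				// Request headers
				xhrObj.setRequestHeader("Content-Type", "application/json");
				xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key", sbkey);
			}
		}).done(function(data){
			// Display the raw JSON result below the image.
			$("#result").text(JSON.stringify(data));
		}).fail(function(err){
			// Show the HTTP status and the error body returned by the service.
			alert(err.status + " " + err.statusText + "\n" + err.responseText);
		});
	}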
</script>
<form name="imageForm" id="imageForm">
	Subscription key:<br>
	<input type="text" name="key" id="key"><br>
	Image url:<br>
	<input type="text" name="Url" id="url" value="http://www.sinaimg.cn/dy/slidenews//21_img/2013_45/52267_2677262_838873.jpg">
	<input type="button" value="Submit" onclick="submitForm()">
</form>
<img id="img" alt="image" width="50%">
<div id="result"></div>
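The page above prints the raw JSON, which is hard to read. As a rough sketch, the recognized text could be pulled out of the result like this. I am assuming the result nests regions, then lines, then words, each word carrying a text field; extractLines and the sample object are made up for illustration and are not real service output.

```javascript
// Sketch: turn an OCR result into plain text, one string per detected line.
// Assumed shape: { regions: [ { lines: [ { words: [ { text: "..." } ] } ] } ] }.
function extractLines(result) {
  var lines = [];
  (result.regions || []).forEach(function (region) {
    (region.lines || []).forEach(function (line) {
      // Collect the text of every word in this line.
      var words = (line.words || []).map(function (word) { return word.text; });
      lines.push(words.join(" "));
    });
  });
  return lines;
}

// Hypothetical response, used only to illustrate the assumed shape.
var sample = {
  language: "en",
  regions: [
    { lines: [
      { words: [{ text: "Hello" }, { text: "world" }] },
      { words: [{ text: "OCR" }] }
    ] }
  ]
};

console.log(extractLines(sample).join("\n"));
```

Inside the done callback, extractLines(data) could replace JSON.stringify(data). Joining words with a space suits English; for languages written without spaces, such as Chinese, joining with an empty string is probably the better choice.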

Conclusion

A growing number of such services are provided by vendors such as Microsoft, Google, and IBM. We can embrace these tools to strengthen and enrich the functionality of our applications. A saying comes to mind:

Don't reinvent the wheel, just realign it. 
Anthony J. D'Angelo

Though reinventing the wheel is another great way to learn the background knowledge of a domain, it is more efficient and takes less effort to use these APIs.