Automate Form Processing 2 | Next Journey with Microsoft Form Recognizer

Microsoft’s Form Recognizer

At Build 2019 back in May, Microsoft announced that the Form Recognizer API would be one of the new features added to Azure Cognitive Services. Form Recognizer is still in private preview at the time of writing, but you can request early access on their page here.

I was lucky enough to be granted access.

What’s Form Recognizer?

  • Applies advanced machine learning to accurately extract text, key/value pairs, and tables from documents.
  • Tailors its understanding to your documents, both on-premises and in the cloud.
  • Easily extracts text and structure with a simple REST API.
  • Can run on edge devices with container services.
  • It can even read receipts! (I will prepare another topic just for this!)

New approach with Form Recognizer

With Form Recognizer, we can now use a single service to achieve what we did using the first approach.

To automate form processing, we need to utilize several APIs of Form Recognizer:

  • Prepare a sample set of forms: gather several sample forms that do not contain business-sensitive data. (In my tests, training failed for PDFs of around 980KB or larger.)
  • Upload them to Azure Blob Storage.
  • Pass the Shared Access Signature (SAS) URL to the /formrecognizer/v1.0-preview/custom/train API endpoint.
  • Then submit your test forms to /formrecognizer/v1.0-preview/custom/models/{modelId}/analyze to analyze their content.
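The train and analyze calls from the steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the official client: only the two endpoint paths come from this post. The `Ocp-Apim-Subscription-Key` header, the `{"source": ...}` JSON body, and sending the raw form bytes with an `application/pdf` content type are my assumptions about the preview API, and the function names are made up for illustration.

```python
import json
import urllib.request

# Path prefix taken from the endpoints mentioned in this post.
API_PREFIX = "/formrecognizer/v1.0-preview/custom"


def build_train_request(endpoint, subscription_key, sas_url):
    """Build (but do not send) the POST request that starts training."""
    # Assumption: the service expects the SAS URL in a JSON "source" field.
    body = json.dumps({"source": sas_url}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + API_PREFIX + "/train",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the usual Cognitive Services key header.
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
    )


def build_analyze_request(endpoint, subscription_key, model_id, form_bytes,
                          content_type="application/pdf"):
    """Build (but do not send) the POST request that analyzes one form."""
    # Assumption: the raw file bytes are sent as the request body.
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}{API_PREFIX}/models/{model_id}/analyze",
        data=form_bytes,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
    )


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, call `send(build_train_request(...))` with your own endpoint, key and SAS URL, then pass the model ID you get back from training to `build_analyze_request`. Building the request separately from sending it makes it easy to inspect the URL and headers before anything goes over the wire.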

Conclusion

After testing, the text extracted by the Form Recognizer API looks really promising. However, we developers, or Microsoft, need to overcome a few challenges and limitations that exist at the time of writing before going live:

  • The 4MB limit on the training data set is really meh… not feasible. (Even one of my PDFs, at only 980KB, couldn’t go through training.)
  • It is not yet clear how best to extract the data as key/value pairs, and some fields were not matched into key/value pairs at all. :(
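Until the size limits are relaxed, a quick local check before uploading can save a failed training run. The sketch below is only an illustration: the 4MB total is the limit mentioned above, the 980KB per-file threshold is simply what I observed in my own test (it is not a documented limit), treating 1MB as 1024 × 1024 bytes is an assumption, and `check_training_set` is a made-up helper name.

```python
from pathlib import Path

# 4MB training-set limit mentioned above (assuming 1MB = 1024 * 1024 bytes).
MAX_TOTAL_BYTES = 4 * 1024 * 1024
# Empirical threshold from my test: a ~980KB PDF failed to train.
MAX_FILE_BYTES = 980 * 1024


def check_training_set(folder, pattern="*.pdf"):
    """Return (ok, oversized_files, total_bytes) for the files in folder."""
    sizes = {p.name: p.stat().st_size
             for p in sorted(Path(folder).glob(pattern))}
    # Files at or above the per-file threshold are flagged by name.
    oversized = [name for name, size in sizes.items()
                 if size >= MAX_FILE_BYTES]
    total = sum(sizes.values())
    ok = not oversized and total <= MAX_TOTAL_BYTES
    return ok, oversized, total
```

Calling `check_training_set("samples")` on a local folder tells you which PDFs to shrink or leave out before uploading to Blob Storage.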

But overall, I am confident in it even though it is in private preview, as it already provides the KEY-PAIR values which I love the most. I will try it again when it reaches public preview, or perhaps production, and I can’t wait to integrate it with my app!

Please follow me for more upcoming AI topics:
Follow me @ Twitter: @hmheng
Subscribe to my channel @ YouTube: http://bit.ly/hmheng_yt
More slides @ SlideShare: https://www.slideshare.net/HiangMengHengMarvin
Blog: http://www.techconnect.io

Marvin Heng

A tech specialist and Microsoft MVP who loves to try new things and publishes sample code on GitHub.