Tuesdays with Tom - What's an Intelligent Document Processing Pipeline?
So what is an Intelligent Document Processing (IDP) pipeline? We'll touch on what it means to configure an IDP pipeline in the cloud, and on its four components.
May 3, 2021
With Sherpa, BP3's Intelligent Document Processing Services, you provide the documents and we take care of the rest. Our team of experts in our Document Service Center configures and manages your document pipeline in the AWS cloud to create a reliable, maintenance-free stream of data to run your business.
So, what do we mean when we say that we configure an Intelligent Document Processing Pipeline in the cloud?
An Intelligent Document Processing (IDP) pipeline is made up of four components:
Optical Character Recognition (OCR) engine - reads document images and extracts the characters and the physical position of each word on the page.
Data Extractors - take the data from the OCR engine and return the specific, structured information that you want to extract from the document.
Human-In-The-Loop (HITL) - Sometimes humans need to review or correct pieces of extracted data. This could be due to poor scan quality, stray markings on the page, or even new document types or formats entering the pipeline. The HITL component contains user interfaces that are specifically designed to allow humans to be super-efficient at reviewing and correcting potential data extraction errors.
Orchestration Engine - Intelligent Document Processing is a business process that relies on efficient orchestration between machine and human tasks. The orchestration engine ties all the components together to create a robust and scalable solution that can process millions of documents a day as easily as it can process one.
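To make the four components concrete, here is a minimal sketch in Python of how they could fit together. It is an illustration only, not BP3's implementation: every name in it (`Word`, `Field`, `fake_ocr`, `extract_total`, `simulated_reviewer`, `run_pipeline`) is hypothetical, the OCR output is hard-coded, and routing a field to a human when its confidence score falls below a threshold is an assumed policy rather than something the components above prescribe.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Word:
    text: str
    box: Tuple[int, int, int, int]  # (left, top, right, bottom) position on the page
    confidence: float               # assumed OCR confidence score between 0 and 1


@dataclass
class Field:
    name: str
    value: str
    confidence: float
    reviewed: bool = False


def fake_ocr(image_id: str) -> List[Word]:
    """Stand-in for the OCR engine: returns each word with its position."""
    return [
        Word("Invoice", (10, 10, 80, 25), 0.99),
        Word("Total:", (10, 40, 60, 55), 0.98),
        Word("1O4.50", (70, 40, 130, 55), 0.62),  # poor scan: letter O read instead of zero
    ]


def extract_total(words: List[Word]) -> List[Field]:
    """Stand-in data extractor: the word that follows 'Total:' is the total."""
    for i, word in enumerate(words[:-1]):
        if word.text == "Total:":
            nxt = words[i + 1]
            return [Field("total", nxt.text, nxt.confidence)]
    return []


def simulated_reviewer(field: Field) -> str:
    """Stand-in for the HITL step: a person corrects the misread character."""
    return field.value.replace("O", "0")


def run_pipeline(
    image_id: str,
    ocr: Callable[[str], List[Word]],
    extractor: Callable[[List[Word]], List[Field]],
    review: Callable[[Field], str],
    threshold: float = 0.9,  # assumed cut-off for sending a field to a human
) -> Dict[str, str]:
    """Orchestration: OCR, then extraction, then human review only where needed."""
    result: Dict[str, str] = {}
    for field in extractor(ocr(image_id)):
        if field.confidence < threshold:
            field.value = review(field)
            field.reviewed = True
        result[field.name] = field.value
    return result


result = run_pipeline("invoice-001.png", fake_ocr, extract_total, simulated_reviewer)
print(result)  # {'total': '104.50'}
```

Because the orchestration function receives the other three components as arguments, each one can be swapped for a real service without changing the flow, and only low-confidence fields ever reach a person.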
At BP3, our Intelligent Document Processing pipelines are powered by AWS components and AI engines. These technologies enable us to provide scalable and secure IDP Services that meet the unique needs of our clients.