Capture
Capture, corroborate, and contextualize data found in any type of document.
Capture
Our proprietary capture engine routes each document to the optimal digitization tool. We employ a variety of techniques, including, but not limited to, machine learning, computer vision, and micro-templates.
Human-in-the-Loop workflows
Whenever possible, we fully automate data extraction and validation. When our software cannot read a document with full machine-only confidence, Ocrolus systematically loops in our team of human reviewers to verify the data fields that could not be confirmed automatically. We then run a series of machine checks to provide additional quality control on the reviewers' work.
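The hand-off described above can be pictured as a confidence gate. The following is a minimal sketch, not Ocrolus's actual implementation: the `ExtractedField` type, the `route_fields` function, and the 0.99 threshold are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """One field read from a document (hypothetical structure)."""
    name: str
    value: str
    confidence: float  # machine confidence in [0.0, 1.0]


def route_fields(fields, threshold=0.99):
    """Split fields into auto-accepted ones and ones needing human review.

    The threshold is an assumed value, not a documented Ocrolus setting.
    """
    auto_accepted, needs_review = [], []
    for field in fields:
        if field.confidence >= threshold:
            auto_accepted.append(field)
        else:
            needs_review.append(field)
    return auto_accepted, needs_review


fields = [
    ExtractedField("account_number", "12345678", 0.999),
    ExtractedField("ending_balance", "1,204.50", 0.82),
]
accepted, review_queue = route_fields(fields)
print([f.name for f in accepted])      # ['account_number']
print([f.name for f in review_queue])  # ['ending_balance']
```

Only the fields in `review_queue` would be sent to a reviewer; everything else passes through untouched, which is what keeps the human step small.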
Benefits of our Human-in-the-Loop workflow
- Parallel processing: Documents are split into dozens of discrete tasks that multiple human reviewers can complete, all at once, in near real-time.
- Smarter processing: Thanks to our flywheel producing labeled training data for machine learning, Ocrolus gets smarter with every document we verify.
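The parallel processing benefit can be illustrated with a small sketch. It assumes a document is split by page ranges and that each task is independent. The function names (`split_into_tasks`, `review_task`, `review_document`) and the task size are hypothetical, and `review_task` is a stand-in for the work a human reviewer would do.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(page_count, pages_per_task=1):
    """Return a list of page-number lists, one per review task (hypothetical)."""
    pages = list(range(1, page_count + 1))
    return [pages[i:i + pages_per_task] for i in range(0, len(pages), pages_per_task)]


def review_task(task_pages):
    """Stand-in for one reviewer verifying a task; it only echoes the pages."""
    return {"pages": task_pages, "verified": True}


def review_document(page_count, reviewers=4):
    """Fan the tasks out to several workers and collect the results."""
    tasks = split_into_tasks(page_count, pages_per_task=2)
    # Executor.map runs tasks concurrently but yields results in input order.
    with ThreadPoolExecutor(max_workers=reviewers) as pool:
        return list(pool.map(review_task, tasks))


results = review_document(page_count=6)
print(len(results))  # 3
print(results[0])    # {'pages': [1, 2], 'verified': True}
```

Because the tasks do not depend on one another, total turnaround is bounded by the slowest task rather than by the sum of all tasks.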
Next Steps
For a step-by-step walkthrough of how to upload and capture an example bank statement, see the Getting started with bank statements guide.
For instructions on how to capture a specific file type, view any of the following guides:
- Mixed documents: PDF files containing more than one type of document
Example - Mixed document captured success response
{
  "status": 200,
  "message": "OK",
  "response": {
    "mixed_uploaded_docs": [
      {
        "created_ts": "2019-05-13T19:10:43Z",
        "name": "document.pdf",
        "checksum": "xxxxc0a1dd4081b470bb7587b6f969f3",
        "page_count": 6,
        "pk": 1031111,
        "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
      }
    ]
  }
}
- Bank statements: PDF bank statements
Example - Bank statement captured success response
{
  "status": 200,
  "message": "OK",
  "response": {
    "uploaded_docs": [
      {
        "created_ts": "2018-06-20T19:47:12Z",
        "name": "bank-statement.pdf",
        "checksum": "50a62f72f4b5b80b18f75fbed690dc47",
        "page_count": 16,
        "pk": 1,
        "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
      }
    ]
  }
}
- PDF forms: General PDF forms (such as paystubs, tax documents, and loan applications)
Example - PDF captured success response
{
  "status": 200,
  "message": "OK",
  "response": {
    "pk": 8597,
    "name": "Book 123",
    "created_ts": "2019-08-30T19:28:30Z",
    "book_status": "ACTIVE",
    "forms": [
      {
        "type": "ISO_APP",
        "pk": 100,
        "form_config_pk": 2,
        "raw_fields": {...}
      }
    ]
  }
}
- Paystubs: Verify income with a complete paystub capture
Example - Full paystub extraction success response
{
  "book_uuid": "60299d7a-fc4d-11ea-adc1-0242ac120002",
  "doc_uuid": "5ed38d6c-8d8b-47ba-b773-214d6ad4cc6e",
  "doc_page_numbers": [3, 4],
  "uuid": "1f0cc882-54b5-4bea-88ac-bbc23ab867e6",
  "employer": {...},
  "employee": {...},
  "employment_details": {...},
  "paystub_details": {...},
  "net_pay": {...},
  "earnings": {...},
  "deductions": {...}
}
- Image groups: Non-PDF file formats, such as JPEG, PNG, BMP, and TIFF
Example - Image group captured success response
{
  "status": 200,
  "message": "OK",
  "response": {
    "image_group_pk": 87479
  }
}
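As a rough illustration of consuming the upload responses shown above, the sketch below pulls the name, UUID, and page count of each uploaded document out of a response body. It assumes only the fields visible in the examples; the helper name `uploaded_documents` is hypothetical, and the error handling is a guess at reasonable behavior rather than documented API semantics.

```python
import json


def uploaded_documents(payload):
    """Return (name, uuid, page_count) tuples from an upload response.

    Looks under both "uploaded_docs" and "mixed_uploaded_docs", the two
    keys that appear in the bank statement and mixed document examples.
    """
    if payload.get("status") != 200:
        raise ValueError(f"upload failed: {payload.get('message')}")
    response = payload.get("response", {})
    docs = response.get("uploaded_docs", []) + response.get("mixed_uploaded_docs", [])
    return [(d["name"], d["uuid"], d["page_count"]) for d in docs]


# Body modeled on the bank statement example response.
raw = """
{
  "status": 200,
  "message": "OK",
  "response": {
    "uploaded_docs": [
      {
        "created_ts": "2018-06-20T19:47:12Z",
        "name": "bank-statement.pdf",
        "checksum": "50a62f72f4b5b80b18f75fbed690dc47",
        "page_count": 16,
        "pk": 1,
        "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
      }
    ]
  }
}
"""
docs = uploaded_documents(json.loads(raw))
print(docs)  # [('bank-statement.pdf', '8dc88eed-2b6f-414a-82f2-a99bd181cf6e', 16)]
```

The PDF form, paystub, and image group responses have different shapes, so they would each need their own accessor rather than this one.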