Capture

Capture, corroborate, and contextualize data found in any type of document.

Capture

Our proprietary capture engine routes each document to the optimal digitization tool. The techniques we employ include machine learning, computer vision, and micro-templates.

Human-in-the-Loop workflows

Whenever possible, we fully automate data extraction and validation. When we cannot read a document with perfect machine-only confidence, Ocrolus systematically loops in our team of human reviewers to verify the data fields that our software could not automatically confirm. We then run a series of machine checks on the reviewers' output as additional quality control.
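The routing decision described above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `ExtractedField` type, the `route_fields` helper, the numeric confidence score, and the threshold value are hypothetical and do not describe Ocrolus internals.

```python
from dataclasses import dataclass

# Hypothetical threshold: the text only says "perfect machine-only confidence",
# so this sketch treats 1.0 as the bar for skipping human review.
CONFIDENCE_THRESHOLD = 1.0


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed score in the range [0.0, 1.0]


def route_fields(fields, threshold=CONFIDENCE_THRESHOLD):
    """Split fields into auto-accepted ones and ones needing human review."""
    auto_accepted, needs_review = [], []
    for field in fields:
        if field.confidence >= threshold:
            auto_accepted.append(field)
        else:
            needs_review.append(field)
    return auto_accepted, needs_review


# Illustrative values only, not real extraction output.
fields = [
    ExtractedField("account_number", "12345", 1.0),
    ExtractedField("ending_balance", "1,204.18", 0.82),
]
auto_accepted, needs_review = route_fields(fields)
print([f.name for f in needs_review])
```

Only the fields that fall below the threshold go to a reviewer, so a document that is mostly machine-readable generates very little manual work.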

Benefits of our Human-in-the-Loop workflow

  • Parallel processing: Documents are split into dozens of discrete tasks that multiple human reviewers can complete, all at once, in near real-time.
  • Smarter processing: Thanks to our flywheel producing labeled training data for machine learning, Ocrolus gets smarter with every document we verify.
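As a rough illustration of the parallel-processing idea, the sketch below splits a document into page-range tasks and runs them concurrently. It assumes tasks are divided by page range, which the text does not specify; `split_into_tasks` and `review_task` are hypothetical helpers, and the worker function merely stands in for a human reviewer.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(page_count, pages_per_task=1):
    """Return 1-indexed, inclusive page ranges, one per review task."""
    return [
        (start, min(start + pages_per_task - 1, page_count))
        for start in range(1, page_count + 1, pages_per_task)
    ]


def review_task(page_range):
    """Placeholder for one reviewer verifying one slice of the document."""
    return {"pages": page_range, "verified": True}


# A 16-page document split into four tasks that run at the same time.
tasks = split_into_tasks(page_count=16, pages_per_task=4)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(review_task, tasks))

print(tasks)
```

Because each task is independent, total turnaround depends on the slowest task rather than on the sum of all of them.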

Next Steps

For a step-by-step walkthrough of how to upload and capture an example bank statement, view the Getting started with bank statements guide.

For instructions on how to capture a specific file type, view any of the following guides:

  • Mixed documents: PDF files containing more than one type of document

    Example - Mixed document captured success response

{
	"status": 200,
	"message": "OK",
	"response": {
		"mixed_uploaded_docs": [
			{
				"created_ts": "2019-05-13T19:10:43Z",
				"name": "document.pdf",
				"checksum": "xxxxc0a1dd4081b470bb7587b6f969f3",
				"page_count": 6,
				"pk": 1031111,
				"uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
			}
		]
	}
}
  • Bank statements: PDF bank statements

    Example - Bank statement captured success response

{
	"status": 200,
	"message": "OK",
	"response": {
		"uploaded_docs": [
			{
				"created_ts": "2018-06-20T19:47:12Z",
				"name": "bank-statement.pdf",
				"checksum": "50a62f72f4b5b80b18f75fbed690dc47",
				"page_count": 16,
				"pk": 1,
				"uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
			}
		]
	}
}
  • PDF forms: General PDF forms, such as paystubs, tax documents, and loan applications

    Example - PDF captured success response

{
	"status": 200,
	"message": "OK",
	"response": {
		"pk": 8597,
		"name": "Book 123",
		"created_ts": "2019-08-30T19:28:30Z",
		"book_status": "ACTIVE",
		"forms": [
			{
				"type": "ISO_APP",
				"pk": 100,
				"form_config_pk": 2,
				"raw_fields": {...}
			}
		]
	}
}
  • Paystubs: Verify income with a complete paystub capture

    Example - Full paystub extraction success response

{
	"book_uuid": "60299d7a-fc4d-11ea-adc1-0242ac120002",
	"doc_uuid": "5ed38d6c-8d8b-47ba-b773-214d6ad4cc6e",
	"doc_page_numbers": [3,4],
	"uuid": "1f0cc882-54b5-4bea-88ac-bbc23ab867e6",
	"employer": {...},
	"employee": {...},
	"employment_details": {...},
	"paystub_details": {...},
	"net_pay": {...},
	"earnings": {...},
	"deductions": {...}
}
  • Image groups: Non-PDF file formats, such as JPEG, PNG, BMP, and TIFF

    Example - Image group captured success response

{
	"status": 200,
	"message": "OK",
	"response": {
		"image_group_pk": 87479
	}
}
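
Most of the success responses above share the same envelope: `status`, `message`, and a `response` object. The following is a minimal sketch of reading the document identifiers out of such a response, using only the Python standard library. The `uploaded_doc_ids` helper is hypothetical, and the payload is the bank statement example from this page, not output from a live call.

```python
import json

# Bank statement example from this page, used as a static sample payload.
raw = """
{
  "status": 200,
  "message": "OK",
  "response": {
    "uploaded_docs": [
      {
        "created_ts": "2018-06-20T19:47:12Z",
        "name": "bank-statement.pdf",
        "checksum": "50a62f72f4b5b80b18f75fbed690dc47",
        "page_count": 16,
        "pk": 1,
        "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
      }
    ]
  }
}
"""


def uploaded_doc_ids(payload):
    """Return (pk, uuid) pairs for each document listed in a success response."""
    body = json.loads(payload)
    if body.get("status") != 200:
        raise ValueError(f"unexpected status: {body.get('status')!r}")
    response = body.get("response", {})
    # Bank statement uploads use "uploaded_docs"; mixed uploads use
    # "mixed_uploaded_docs" (both key names appear in the examples above).
    docs = response.get("uploaded_docs") or response.get("mixed_uploaded_docs") or []
    return [(doc["pk"], doc["uuid"]) for doc in docs]


print(uploaded_doc_ids(raw))
```

Checking `status` before reading `response` keeps an error payload from being mistaken for an empty upload list.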