Capture

Capture, corroborate, and contextualize data found in any type of document.

Capture

Our proprietary capture engine routes each document to the optimal digitization tool. We employ a variety of techniques, including machine learning, computer vision, and micro-templates.

Human-in-the-Loop workflows

Whenever possible, we fully automate data extraction and validation. When we cannot read a document with perfect machine-only confidence, Ocrolus systematically loops in our team of human reviewers to verify the data fields that our software could not automatically confirm. We then run a series of machine checks as additional quality control on the reviewers' work.
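
The routing described above can be pictured with a small sketch. This is illustrative only and is not Ocrolus's implementation: the `ExtractedField` type, the `route_fields` function, the per-field `confidence` score, and the threshold value are all hypothetical names and assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: "perfect machine-only confidence" is modeled as a score of 1.0.
AUTO_ACCEPT_CONFIDENCE = 1.0


@dataclass
class ExtractedField:
    """A hypothetical extracted data field with a machine confidence score."""
    name: str
    value: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def route_fields(
    fields: List[ExtractedField],
    threshold: float = AUTO_ACCEPT_CONFIDENCE,
) -> Tuple[List[ExtractedField], List[ExtractedField]]:
    """Split fields into (auto-accepted, needs human review)."""
    auto_accepted, needs_review = [], []
    for field in fields:
        if field.confidence >= threshold:
            auto_accepted.append(field)
        else:
            needs_review.append(field)
    return auto_accepted, needs_review


fields = [
    ExtractedField("account_number", "12345", 1.0),
    ExtractedField("ending_balance", "1,204.50", 0.82),
]
auto_accepted, needs_review = route_fields(fields)
print([f.name for f in auto_accepted])  # fields confirmed by software alone
print([f.name for f in needs_review])   # fields sent to human reviewers
```

Only the fields that fall below the threshold reach a reviewer, so reviewers work on individual fields rather than on whole documents.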

Benefits of our Human-in-the-Loop workflow

  • Parallel processing: Documents are split into dozens of discrete tasks that multiple human reviewers can complete simultaneously, in near real time.
  • Smarter processing: Thanks to our flywheel producing labeled training data for machine learning, Ocrolus gets smarter with every document we verify.

Next Steps

For a step-by-step walkthrough of how to upload and capture an example bank statement, view the Getting started with bank statements guide.

For instructions on how to capture a specific file type, view any of the following guides:

  • Mixed documents: PDF files containing more than one type of document

    Example - Mixed document captured success response

{
    "status": 200,
    "message": "OK",
    "response": {
        "mixed_uploaded_docs": [
            {
                "created_ts": "2019-05-13T19:10:43Z",
                "name": "document.pdf",
                "checksum": "xxxxc0a1dd4081b470bb7587b6f969f3",
                "page_count": 6,
                "pk": 1031111,
                "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
            }
        ]
    }
}
  • Bank statements: PDF bank statements

    Example - Bank statement captured success response

{
    "status": 200,
    "message": "OK",
    "response": {
        "uploaded_docs": [
            {
                "created_ts": "2018-06-20T19:47:12Z",
                "name": "bank-statement.pdf",
                "checksum": "50a62f72f4b5b80b18f75fbed690dc47",
                "page_count": 16,
                "pk": 1,
                "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
            }
        ]
    }
}
  • PDF forms: General PDF forms, such as paystubs, tax documents, and loan applications

    Example - PDF captured success response

{
    "status": 200,
    "message": "OK",
    "response": {
        "pk": 8597,
        "name": "Book 123",
        "created_ts": "2019-08-30T19:28:30Z",
        "book_status": "ACTIVE",
        "forms": [
            {
                "type": "ISO_APP",
                "pk": 100,
                "form_config_pk": 2,
                "raw_fields": {...}
            }
        ]
    }
}
  • Paystubs: Verify income with a complete paystub capture

    Example - Full paystub extraction success response

{
    "book_uuid": "60299d7a-fc4d-11ea-adc1-0242ac120002",
    "doc_uuid": "5ed38d6c-8d8b-47ba-b773-214d6ad4cc6e",
    "doc_page_numbers": [3,4],
    "uuid": "1f0cc882-54b5-4bea-88ac-bbc23ab867e6",
    "employer": {...},
    "employee": {...},
    "employment_details": {...},
    "paystub_details": {...},
    "net_pay": {...},
    "earnings": {...},
    "deductions": {...}
}
  • Image groups: Non-PDF file formats, such as JPEG, PNG, BMP, and TIFF

    Example - Image group captured success response

{
    "status": 200,
    "message": "OK",
    "response": {
        "image_group_pk": 87479
    }
}
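
As a rough illustration of how a client might consume the upload responses shown above, the following sketch pulls the document UUIDs out of a success response. It assumes the response body matches the mixed-document and bank-statement examples (a `status` of 200 and a list of documents under `response`); the `uploaded_doc_uuids` helper and its `list_key` parameter are hypothetical and are not part of any Ocrolus client library.

```python
import json
from typing import List


def uploaded_doc_uuids(payload: str, list_key: str = "uploaded_docs") -> List[str]:
    """Return the UUID of each uploaded document in a success response.

    list_key is the name of the list under "response", for example
    "uploaded_docs" or "mixed_uploaded_docs" as in the examples above.
    """
    body = json.loads(payload)
    if body.get("status") != 200:
        raise ValueError(f"unexpected status: {body.get('status')!r}")
    return [doc["uuid"] for doc in body["response"][list_key]]


# Sample payload copied from the bank statement example above.
sample = """
{
    "status": 200,
    "message": "OK",
    "response": {
        "uploaded_docs": [
            {
                "created_ts": "2018-06-20T19:47:12Z",
                "name": "bank-statement.pdf",
                "checksum": "50a62f72f4b5b80b18f75fbed690dc47",
                "page_count": 16,
                "pk": 1,
                "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
            }
        ]
    }
}
"""

uuids = uploaded_doc_uuids(sample)
print(uuids)
```

Passing `list_key="mixed_uploaded_docs"` handles the mixed-document response in the same way. The PDF form and image group responses have a different shape and would need their own handling.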

See also