You can use Capture to capture, validate, and interpret data from a wide range of documents. Our advanced capture engine routes each document to the most suitable digitization tool to optimize both accuracy and efficiency. By leveraging technologies such as machine learning, computer vision, and micro-templates, we can intelligently classify and extract data from even the most complex document types. These automated processes ensure that documents are not only digitized but also verified and contextualized for further analysis and decision-making, giving you reliable and actionable insights.

Human-in-the-loop (HITL) workflows

While we aim to fully automate data extraction and validation, there are cases where machine-only confidence isn’t enough. In these instances, Ocrolus automatically engages human reviewers to verify data fields that our software couldn’t confirm. After the review, additional machine checks are performed to enhance quality control.
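
The routing idea described above can be illustrated with a minimal sketch. It is not Ocrolus's implementation: the `ExtractedField` type, the `route_fields` function, the 0-to-1 confidence score, and the `REVIEW_THRESHOLD` value are all hypothetical and exist only to show how low-confidence fields might be separated out for human review.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the source does not say how confidence is measured.
REVIEW_THRESHOLD = 0.9


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed to be a score between 0 and 1


def route_fields(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into machine-accepted ones and ones needing human review."""
    accepted, needs_review = [], []
    for field in fields:
        if field.confidence >= threshold:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review
```

In this sketch, anything returned in `needs_review` would go to a human reviewer, and the reviewed values could then be passed through further machine checks.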

Benefits of the HITL workflow

  • Parallel Processing: Documents are divided into smaller tasks, allowing multiple human reviewers to work simultaneously and complete the process in near real time.
  • Smarter Processing: Each verified document contributes labeled data, continuously improving our machine learning models and making the system smarter with every interaction.

Upload and capture bank statement data

For detailed instructions on uploading and capturing a bank statement, refer to the Getting started with bank statements guide. For instructions on how to process a specific file type, see the following guides:

  • Mixed documents: A mixed PDF document is a single-page or multi-page PDF containing different form types that you want Ocrolus to classify.

    • JSON example:
      {
        "status": 200,
        "message": "OK",
        "response": {
          "mixed_uploaded_docs": [
            {
              "created_ts": "2019-05-13T19:10:43Z",
              "name": "document.pdf",
              "checksum": "xxxxc0a1dd4081b470bb7587b6f969f3",
              "page_count": 6,
              "pk": 1031111,
              "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
            }
          ]
        }
      }
      
  • Bank statements: A bank statement is a financial document that summarizes account transactions.

    • JSON example:
      {
        "status": 200,
        "message": "OK",
        "response": {
          "uploaded_docs": [
            {
              "created_ts": "2018-06-20T19:47:12Z",
              "name": "bank-statement.pdf",
              "checksum": "50a62f72f4b5b80b18f75fbed690dc47",
              "page_count": 16,
              "pk": 1,
              "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
            }
          ]
        }
      }
      
  • PDF forms: General PDF forms such as pay stubs, tax documents, loan applications, and so on.

    • JSON example:
      {
        "status": 200,
        "message": "OK",
        "response": {
          "pk": 8597,
          "name": "Book 123",
          "created_ts": "2019-08-30T19:28:30Z",
          "book_status": "ACTIVE",
          "forms": [
            {
              "type": "ISO_APP",
              "pk": 100,
              "form_config_pk": 2,
              "raw_fields": {...}
            }
          ]
        }
      }
      
  • Pay stubs: Verify income with a complete pay stub capture.

    • JSON example:
      {
        "book_uuid": "60299d7a-fc4d-11ea-adc1-0242ac120002",
        "doc_uuid": "5ed38d6c-8d8b-47ba-b773-214d6ad4cc6e",
        "doc_page_numbers": [3, 4],
        "uuid": "1f0cc882-54b5-4bea-88ac-bbc23ab867e6",
        "employer": {...},
        "employee": {...},
        "employment_details": {...},
        "paystub_details": {...},
        "net_pay": {...},
        "earnings": {...},
        "deductions": {...}
      }
      
  • Image groups: Non-PDF file formats, such as JPEG, PNG, BMP, and TIFF.

    • JSON example:
      {
        "status": 200,
        "message": "OK",
        "response": {
          "image_group_pk": 87479
        }
      }
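
Once an upload returns, you typically need the document identifiers for later calls. The following is a minimal sketch of reading them out of a response shaped like the bank statement and mixed document examples above. The helper name `uploaded_doc_ids` is hypothetical and not part of any Ocrolus SDK, and the sketch assumes that a `status` of 200 indicates success.

```python
import json


def uploaded_doc_ids(payload, key="uploaded_docs"):
    """Return (pk, uuid) pairs from an upload response.

    `key` is the list name inside "response": "uploaded_docs" for bank
    statements or "mixed_uploaded_docs" for mixed documents.
    """
    if payload.get("status") != 200:
        # Assumption: any status other than 200 means the upload failed.
        raise ValueError(f"upload failed: {payload.get('message')}")
    docs = payload.get("response", {}).get(key, [])
    return [(doc["pk"], doc["uuid"]) for doc in docs]


# Sample payload copied from the bank statement example.
sample = json.loads("""
{
  "status": 200,
  "message": "OK",
  "response": {
    "uploaded_docs": [
      {
        "created_ts": "2018-06-20T19:47:12Z",
        "name": "bank-statement.pdf",
        "checksum": "50a62f72f4b5b80b18f75fbed690dc47",
        "page_count": 16,
        "pk": 1,
        "uuid": "8dc88eed-2b6f-414a-82f2-a99bd181cf6e"
      }
    ]
  }
}
""")

print(uploaded_doc_ids(sample))
# prints [(1, '8dc88eed-2b6f-414a-82f2-a99bd181cf6e')]
```

For a mixed document upload, pass `key="mixed_uploaded_docs"` to read the same fields from that response.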