Authenticity score

In addition to comprehensive fraud signals that reveal the context and method of tampering, we provide a single Authenticity score that indicates the likelihood that a document is genuine.

The Authenticity score ranges from 0 to 100. It takes into account both what was tampered with and our confidence in each signal.

We categorize scores as follows:

0-30: VERY LOW authenticity
31-60: LOW authenticity
61-80: MEDIUM authenticity
81-100: HIGH authenticity
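
The bands above can be expressed as a small lookup. This is a minimal sketch: the function name `authenticity_category` is hypothetical and is not part of the API.

```python
def authenticity_category(score: int) -> str:
    """Map an Authenticity score (0-100) to its documented category."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score <= 30:
        return "VERY LOW"
    if score <= 60:
        return "LOW"
    if score <= 80:
        return "MEDIUM"
    return "HIGH"


print(authenticity_category(20))  # VERY LOW
print(authenticity_category(61))  # MEDIUM
```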

Reason codes are returned with the score to make it transparent. They also serve as high-level signals that you can use in simple logical rules to route a document. Use the score as a signal for prioritizing documents in your workflow.

You may find that documents scoring below or above a certain threshold can be rejected or approved without manual review.
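
As an illustration of a simple logical rule built on reason codes, the sketch below routes a document based on the codes it carries. It assumes each reason code is an object with `code`, `confidence`, and `description` fields, as in the API example in this document. The route names and the `ESCALATE_CODES` set are hypothetical choices, not product behavior.

```python
from __future__ import annotations

# Hypothetical: reason codes that a team has decided to always escalate.
ESCALATE_CODES = {"110-H"}


def route_by_reason_codes(reason_codes: list[dict]) -> str:
    """Return a route name for a document based on its reason codes."""
    codes = {rc["code"] for rc in reason_codes}
    if codes & ESCALATE_CODES:
        return "escalate"
    # Any other high-confidence finding goes to a human reviewer.
    if any(rc["confidence"] == "HIGH" for rc in reason_codes):
        return "manual_review"
    return "standard"


example = [
    {"code": "120-M", "confidence": "MEDIUM",
     "description": "bank statement balance info tampered"},
]
print(route_by_reason_codes(example))  # standard
```

Rules like this can be combined with a score threshold, so that the score sets the priority and the reason codes decide the route.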

Authenticity score in the Dashboard

The Authenticity Score column in the Book List displays the lowest authenticity score of any document in a Book, allowing you to quickly identify potential fraud. This column is sortable, enabling you to prioritize your workflow by focusing on documents with low scores that may indicate severe fraud, or high scores that can be approved with minimal review. Similarly, the Book Overview page shows the lowest score from any document in the upload. The authenticity score is also available on the Detect tab of the document detail page, providing additional information such as reason codes.
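
The Book-level value described above is the minimum over the Book's document scores. A minimal sketch of that roll-up follows. The function name is hypothetical, and representing unscored documents as `None` (and skipping them) is an assumption made for illustration.

```python
from __future__ import annotations

from typing import Optional


def book_authenticity_score(document_scores: list[Optional[int]]) -> Optional[int]:
    """Return the lowest document score in a Book, ignoring unscored documents."""
    scored = [s for s in document_scores if s is not None]
    # A Book with no scored documents has no Book-level score.
    return min(scored) if scored else None


print(book_authenticity_score([85, 20, None, 70]))  # 20
```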

In the Detect tab of the document detail, we display reason codes associated with the score. With each reason code, we share our confidence in that finding.

Reason codes that dramatically affect the score are highlighted in bold red text. Reason codes serve as high-level signals that can also inform your decision.

Examples

[Example screenshots for each category: VERY LOW, LOW, MEDIUM, HIGH]

Choosing a threshold

The score is derived from an assessment of both the severity of identified signals and the corresponding confidence levels.

As a general guideline, scores below 61 are categorized as low authenticity. However, if you are less concerned with balance tampering, for example, you might treat only scores of 35 and under as low authenticity. Conversely, if any instance of tampering is grounds for rejection, you might treat every score below 70 as low authenticity.
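
The three cutoffs mentioned above can be kept in one configurable place. In this sketch the policy names (`default`, `lenient`, `strict`) and the function name are hypothetical. Only the numeric cutoffs come from the guideline.

```python
# Scores strictly below the cutoff are treated as low authenticity.
LOW_AUTHENTICITY_CUTOFFS = {
    "default": 61,  # scores below 61
    "lenient": 36,  # scores of 35 and under
    "strict": 70,   # scores below 70
}


def is_low_authenticity(score: int, policy: str = "default") -> bool:
    """Return True if the score counts as low authenticity under the policy."""
    return score < LOW_AUTHENTICITY_CUTOFFS[policy]


print(is_low_authenticity(40))             # True
print(is_low_authenticity(40, "lenient"))  # False
print(is_low_authenticity(65, "strict"))   # True
```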

Authenticity score in the API

The Authenticity score is returned at the document level. The response includes the numerical score and the reason codes.

"form_authenticity": {
  "version": "1.0",
  "score": 20,
  "reason_codes": [
    {
      "code": "110-H",
      "confidence": "HIGH",
      "description": "bank statement account info tampered"
    },
    {
      "code": "120-M",
      "confidence": "MEDIUM",
      "description": "bank statement balance info tampered"
    }
  ]
}

Each reason code consists of a distinctive identifier code (e.g., 110-H), a descriptive label such as "bank statement account info tampered", and a confidence level (HIGH, MEDIUM, or LOW) indicating our degree of confidence in that signal. In this example, we have high confidence that account information was tampered with and medium confidence that balance information was tampered with.
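
A minimal sketch of reading these fields in Python follows. The sample payload reuses the values from the example above. Wrapping the fragment in an outer object is an assumption made so that it parses as JSON, and the function name `summarize_authenticity` is hypothetical. Where `form_authenticity` sits inside a full API response is not shown here.

```python
from __future__ import annotations

import json

# Assumption: the fragment is wrapped in an outer object so it is valid JSON.
SAMPLE_RESPONSE = """
{
  "form_authenticity": {
    "version": "1.0",
    "score": 20,
    "reason_codes": [
      {"code": "110-H", "confidence": "HIGH",
       "description": "bank statement account info tampered"},
      {"code": "120-M", "confidence": "MEDIUM",
       "description": "bank statement balance info tampered"}
    ]
  }
}
"""


def summarize_authenticity(document: dict) -> tuple[int, list[str]]:
    """Return the score and one readable line per reason code."""
    authenticity = document["form_authenticity"]
    findings = [
        f'{rc["code"]} ({rc["confidence"]}): {rc["description"]}'
        for rc in authenticity["reason_codes"]
    ]
    return authenticity["score"], findings


score, findings = summarize_authenticity(json.loads(SAMPLE_RESPONSE))
print(score)
for line in findings:
    print(line)
```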

Using Authenticity score to optimize your workflow

Dashboard users

In only three clicks, you can navigate to the visualizations and signals for your most severe fraud cases:

Sort by Authenticity score in the dashboard to find high-risk documents that need urgent review.
Click on Books with low scores.
Then, click on documents within that Book with the lowest scores to review detailed findings.

A similar approach can be used to identify low-risk documents that can be moved forward with minimal review.

A daily or weekly workflow might involve filtering by date range to include documents from the previous day or week, then sorting by score to prioritize which documents need review.

API users

The score itself can be used to automate workflows based on specified thresholds. For example, a score of 30 or below indicates high confidence that the document was created using a template, while a score of 45 or below suggests tampering with balance or earnings information.
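
The two thresholds above could drive an automated triage step like the sketch below. The action names (`reject`, `manual_review`, `continue`) and the function name are hypothetical. Only the cutoffs of 30 and 45 and their interpretations come from the text above.

```python
def triage(score: int) -> str:
    """Pick a hypothetical workflow action from an Authenticity score."""
    if score <= 30:
        # High confidence that the document was created using a template.
        return "reject"
    if score <= 45:
        # Suggests tampering with balance or earnings information.
        return "manual_review"
    return "continue"


for s in (20, 45, 80):
    print(s, triage(s))
```

Which action to attach to each band is a business decision. The mapping shown is only one possibility.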

FAQs

How do False Positives affect the score?
Because the authenticity score factors in our confidence in each signal, true positives should generally receive lower scores than false positives and can therefore be prioritized in your workflow.
Why am I not seeing scores of 0 or 100?
Because we cannot be 100% sure of the authenticity of a document, we do not return scores of 0 or 100. The lowest possible score is 10. The highest possible score on an image is 80, and the highest possible score on an e-pdf is 90.
Why are some reason codes in bold on the dashboard?
Reason codes that warrant special attention, such as "bank statement account number tampered" with high confidence, are shown in bold red text.
Why do I see an asterisk next to some documents?
An asterisk indicates that the document is non-parsable.
Why do some docs not have scores?
The authenticity score launched on November 15, 2023, and scores were backfilled for all supported documents submitted in 2023. Documents submitted before January 1, 2023, and documents of unsupported types do not have authenticity scores.
