The Client: A Top-Five Originator

This bank is one of the largest in the United States. It is a leading lender offering a range of quality home loans, including government and conventional. These loans are provided through multiple channels.

Paradatec for Mortgage provides a unique approach for companies wishing to reduce the manual labor costs and increase the accuracy levels associated with classifying and capturing data from loan documents.

The financial services industry is challenged with managing large volumes of documents with varying layouts containing immense amounts of data, some of which is critical for compliance. The traditional manual process for classifying and keying data from these documents is time-consuming, error-prone, and costly due to the sheer volume and complexity of the documents. Because these documents originate from many different systems and sources, standardizing forms is not possible, so an acceptable automation solution must be able to cope with this variability.


The mortgage lending industry presents a number of unique challenges for classifying and extracting data from key documents. This is due in part to the large volumes of disparate document variations found in most loan files.

  • A typical incoming mortgage loan file may contain 250 to 600+ pages of documents in various sizes, comprising more than 250 potential document types. Older loan files may grow to well over 1,000 pages.
  • Manually sorting each set of loan documents is a labor-intensive and error-prone effort, typically requiring the addition of document separator pages if the file is to be scanned.
  • Due to the sheer labor effort required, the typical level of detailed document sorting possible with a manual approach is very “coarse”. In other words, only the most critical documents and document groups are classified rather than attempting to identify all specific document types. An example of this limitation might be a manual grouping of a series of specific documents into a “Credit Documents Group” rather than breaking these out specifically by document types such as bank statements, credit reports, and brokerage statements.
  • To remain viable in this extremely competitive market segment, organizations are looking for ways to reduce costs and streamline their processes.
  • In addition to the challenges described above, this top five originator was looking for a solution to help automate the laborious task of providing data for a number of audit-centric applications. These ad hoc projects commonly had tight timelines and covered a wide range of loans and millions of pages to be audited.

Project Description

At the start of the project, this top five originator had in place a sophisticated document capture infrastructure feeding a well-known enterprise content management system. What was missing from this infrastructure was an advanced recognition module that could deal with the document variations expected in an organization serving borrowers across the nation. The Paradatec solution provided a seamless interface to this capture infrastructure. This greatly simplified the implementation, because the existing interfaces to both front-end scanning and back-end image storage were largely unaffected by the addition of the recognition technology.

The Paradatec system was selected after an exhaustive evaluation process. A competing solution was initially selected. However, after months of tests it was determined that the Paradatec system had a number of capabilities that surpassed other solutions previously tested or reviewed:

  • Paradatec was by far the fastest technology available to read / OCR mortgage documents. Pre-production technical due diligence showed empirically that the system was capable of processing approximately 1 million images per day on a single twelve-core server.
  • The Paradatec system was able to use one set of rules to process and recognize all document variations. Because of the extremely large number of documents (and variations of each) that this top five originator encounters, it required the flexibility offered by a non-template-based ADR (Automated Document Recognition) and data extraction solution.
  • Paradatec for Mortgage offered pre-built mortgage logic which “understands” the vast majority of the document types and variations that were required to be recognized. This feature of the Paradatec offering allowed this originator to rapidly implement an ADR and data extraction solution for their specific needs.
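
To put the throughput figure above in perspective, the following back-of-the-envelope sketch converts it into per-second and per-core rates, and into an approximate number of loan files per day using the 250 to 600 page range quoted earlier. It is illustrative arithmetic only, not a benchmark: it assumes the server runs continuously for 24 hours and that one image corresponds to one page, neither of which is stated in the case study. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # assumption: continuous 24-hour operation


def throughput(images_per_day: int, cores: int) -> dict:
    """Convert a daily image count into per-second and per-core rates."""
    per_second = images_per_day / SECONDS_PER_DAY
    return {
        "images_per_second": per_second,
        "images_per_second_per_core": per_second / cores,
    }


def files_per_day(images_per_day: int, pages_per_file: int) -> int:
    """Whole loan files processed per day, assuming one image per page."""
    return images_per_day // pages_per_file


# Figures quoted in the case study: ~1 million images/day, twelve cores.
stats = throughput(1_000_000, 12)
small_files = files_per_day(1_000_000, 250)  # files at 250 pages each
large_files = files_per_day(1_000_000, 600)  # files at 600 pages each

print(f"{stats['images_per_second']:.2f} images/s overall")
print(f"{stats['images_per_second_per_core']:.2f} images/s per core")
print(f"{large_files} to {small_files} loan files per day")
```

Under these assumptions, the quoted rate works out to roughly one image per second per core, or on the order of a few thousand typical loan files per day on a single server.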


The project was successfully implemented and released to production on time. As a result of this experience with both the Paradatec staff and the Paradatec solution, this customer is prepared to act as a reference on behalf of Paradatec. Prospective clients are encouraged to take advantage of this opportunity.

Per Neil Fraser, Director of US Operations, “To be chosen by such a high-profile client for a project of this size was a vote of confidence for Paradatec and our technology. I would encourage other similarly placed clients to reach out to Paradatec to set up a one-day blind test. In just a day it is possible to see just what this technology can do, right out of the box.”

Further Information

You can download the complete case study here: