Five Essential Market Standards for Cash Flow Score Development

Modern cash flow underwriting, with the benefit of open banking and digital bank transaction data, can greatly improve the results of traditional credit underwriting. Cash flow scores have been shown to be highly predictive of credit default risk, including for thin-file and no-file consumers, and orthogonal to traditional credit assessment, increasing predictive power by as much as 30% over conventional scores. There is growing evidence that for certain types of loans (like cash advances and small dollar loans), cash flow underwriting does a better job than traditional underwriting on a stand-alone basis. As a result, cash flow underwriting is gaining rapid market adoption, and a number of cash flow-based credit scores have emerged.

Cash flow-based credit scores differ from traditional credit scores in important ways, perhaps most of all in how the underlying data used to develop the score is collected and processed. While historical credit report information on hundreds of millions of consumers going back many years is available to lenders and traditional credit score companies for model development, cash flow data is generally consumer-permissioned for each use case, and no large third-party repository of historical data exists. This reality creates a chicken-and-egg problem, both for lenders looking to get started using cash flow data and for cash flow scoring companies.

Prism Data began as part of a consumer lending company, which helped us overcome this challenge initially by using data collected through years of our own cash flow underwriting for billions of dollars of originations (Prism was subsequently spun out into an independent business). In more recent years, we have partnered with dozens of banks and lenders that share performance data with us as part of a data consortium we’ve formed for the advancement of cash flow scores and other analytical use-cases. 

In this post, we’ll outline five foundational principles that we adhere to in the operation of this data consortium and the development of general-purpose cash flow scores. In doing so, we hope to highlight not just the practices that we employ in the creation of our products but also standards for the prudent, ethical and effective development of cash flow scores in the market more broadly. 

I. Score development datasets must include observations from a sufficient number of different lenders 

Unlike custom scorecards built for individual institutions, general-purpose credit scores are designed to be broadly applicable across the lending ecosystem. This means the development population needs to reflect the diversity of the entire market, which starts with a large number and mix of lenders.

Data from many different lenders should be included in score development, and no single lender or product segment should comprise a disproportionate percentage of the sample. Indeed, that is exactly how conventional credit scoring models are developed. Including too few lenders in a development sample—in the interest of specialization, or because of a simple lack of data—introduces significant risks, both to those that intend to utilize the resulting score and to those that have contributed data to the model development consortium. 

Our latest generation of scores (v4) was developed using performance data from many different lenders, with no single lender comprising more than 20% of a development sample.
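
As a rough illustration of how a concentration limit like this could be checked, the sketch below computes each lender's share of a development sample and flags any lender above a cap. The function names, the lender labels, and the sample itself are hypothetical; only the 20% figure comes from the text above.

```python
from collections import Counter

def lender_shares(lender_ids):
    """Return each lender's share of the development sample."""
    counts = Counter(lender_ids)
    total = sum(counts.values())
    return {lender: n / total for lender, n in counts.items()}

def over_cap(lender_ids, cap=0.20):
    """List lenders whose share of observations exceeds the cap."""
    shares = lender_shares(lender_ids)
    return sorted(lender for lender, share in shares.items() if share > cap)

# Hypothetical sample: one lender label per observation.
sample = ["A"] * 30 + ["B"] * 20 + ["C"] * 20 + ["D"] * 15 + ["E"] * 15

print(lender_shares(sample))
print(over_cap(sample))  # lender "A" holds 30 of the 100 rows
```

In practice the same check would likely be repeated within each product type and credit segment, since a lender can dominate a sub-population without dominating the sample as a whole.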

This is essential for the following reasons:

  1. Broad Market Representation. A general-purpose score should work across banks, non-bank lenders, fintechs, and credit unions alike. Including a diverse representation of lenders is essential to creating such a broadly effective model that is not overfit for a particular population.
  2. Model Generalizability. The goal is to build a model that performs well regardless of the lender’s credit policies or customer acquisition strategy. A diverse dataset trains the model to recognize universal patterns, not lender-specific quirks.
  3. Diverse Credit Products. Each lender may offer a different mix of loans—personal, home, auto, Buy-Now-Pay-Later (BNPL), credit cards, etc. Including many lenders helps the model learn from this full spectrum of credit behaviors.
  4. Behavioral and Demographic Diversity. Consumers vary by region, employment type, income level, and credit behavior. The more lenders you include, the better your chance of meeting the needs of lenders serving a wide range of consumers.
  5. Bias and Risk Mitigation. Over-reliance on one or two data sources can embed institutional biases into your model. A wider lender base improves fairness, which matters especially when the score is used in regulated environments.
  6. Protection of Client Confidentiality. Contributors to a data consortium benefit from the cumulative learnings that result from aggregating their results with the results of many other lenders in a protected manner that does not expose the proprietary performance data of any contributor to the others. Participants must be protected from “leakage” of their proprietary information into the market, which can occur if a derivative scoring product relies too heavily on an individual lender’s performance. 

Note that these principles apply even to models created for a particular segment of the market or type of lending product. Though the training sample may focus on a single product type or a narrower band of the credit spectrum, it should still include observations from a sufficient number of different lenders to avoid overfitting and proprietary information leakage.

Ultimately, a general-purpose score is only as good as the data it's built on. Including a broad and balanced set of lenders not only improves model performance but also builds trust with customers and regulators.
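
One simple way to look for lender-specific overfitting is to measure how well a score rank-orders borrowers within each lender separately, rather than only on the pooled sample. The sketch below does this with a pairwise AUC. It assumes that a higher score means lower risk and that each record is a (lender, score, bad flag) tuple; the data and function names are made up for illustration.

```python
from collections import defaultdict

def auc(scores, bads):
    """Probability that a random good outscores a random bad.

    Assumes higher score = lower risk; ties count as half a win.
    """
    good = [s for s, b in zip(scores, bads) if b == 0]
    bad = [s for s, b in zip(scores, bads) if b == 1]
    if not good or not bad:
        return None  # undefined without both outcomes present
    wins = sum((g > b) + 0.5 * (g == b) for g in good for b in bad)
    return wins / (len(good) * len(bad))

def auc_by_lender(rows):
    """rows: iterable of (lender, score, bad_flag) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for lender, score, bad in rows:
        grouped[lender][0].append(score)
        grouped[lender][1].append(bad)
    return {lender: auc(s, b) for lender, (s, b) in grouped.items()}

# Hypothetical scored observations from two lenders.
rows = [
    ("A", 720, 0), ("A", 650, 0), ("A", 600, 1), ("A", 580, 1),
    ("B", 700, 0), ("B", 610, 1), ("B", 640, 0), ("B", 660, 1),
]
print(auc_by_lender(rows))  # {'A': 1.0, 'B': 0.75}
```

A large gap between lenders, or between lenders that contributed training data and those that did not, would be a signal worth investigating. The pairwise loop is quadratic and only suitable for small samples; a rank-based AUC would be the usual choice at scale.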

II. The development sample must cover a significant period of time and control for differing macroeconomic conditions

The economic and credit environment is ever-shifting, and the industry has experienced a number of major macroeconomic changes over the past decade. Specific public policy programs and initiatives, such as the COVID-driven stimulus and student loan repayment pause, have also impacted consumer credit behaviors and the performance of loans in recent years. 

A well-performing cash flow score must maintain the ability to measure risk and rank-order borrowers across a time horizon spanning a variety of macroeconomic conditions. To accomplish this, the underlying data used to develop a score must span a sufficiently long time period, the length of which may vary a bit based on the target of the model and its intended use. 

For a typical credit scoring model predicting the likelihood of default in the next year, at least 5 years of history (and ideally considerably more) should be included in the development sample. Macroeconomic factors and conditions present during the sample period should also be taken into account, and may indicate the need for an even longer observation window.

Models that are reliant on a narrow window of historical data risk overindexing on the prevailing economic conditions during that window of time and may rapidly degrade when applied in a different environment.
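
A basic sanity check on time coverage might look like the sketch below, which measures the span of a development sample and counts observations per calendar year to expose thin or missing periods. The dates and function names are hypothetical, and the five-year threshold is simply the minimum suggested above.

```python
from collections import Counter
from datetime import date

def coverage_years(origination_dates):
    """Span between the earliest and latest observation, in years."""
    span_days = (max(origination_dates) - min(origination_dates)).days
    return span_days / 365.25

def observations_per_year(origination_dates):
    """Count observations by calendar year to expose thin or missing years."""
    counts = Counter(d.year for d in origination_dates)
    return dict(sorted(counts.items()))

# Hypothetical origination dates, one per observation.
dates = [
    date(2018, 3, 1), date(2019, 7, 15), date(2020, 4, 2),
    date(2021, 11, 30), date(2022, 6, 10), date(2023, 9, 5),
]

print(observations_per_year(dates))
print(coverage_years(dates) >= 5)  # True: the span is roughly 5.5 years
```

Span alone is not sufficient. A sample can cover five years and still be dominated by one period, which is why the per-year counts matter, and why rank-ordering would typically also be evaluated year by year on held-out data.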

III. Scoring models must be transparent and well-documented 

There should be no mysteries regarding exactly how a general-purpose scoring model has been developed and tested. Transparency is essential for a model to gain approval for use and to withstand scrutiny from external stakeholders like regulators and debt investors.

All of the steps taken to accumulate, select and prepare the underlying data must be clearly documented and disclosed, including all sampling, filtering, and standardization applied. Each feature used in the model should be reviewed for regulatory compliance and the features and selection process should be cataloged in model governance documentation. Finally, only modeling techniques that can support regulatory requirements for explainability, like the need to provide the specific, principal reasons for taking an adverse action, should be deployed. 
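
To make the explainability point concrete, here is a minimal sketch of one common way to derive principal reasons from an additive, points-based scorecard: rank features by how many points the applicant lost relative to the best attainable value. This is an illustrative assumption, not a description of how any particular score derives its reasons, and the feature names and point values are invented.

```python
def principal_reasons(applicant_points, max_points, top_n=2):
    """Rank features by points lost versus the best attainable value."""
    shortfalls = {
        feature: max_points[feature] - points
        for feature, points in applicant_points.items()
    }
    ranked = sorted(shortfalls.items(), key=lambda kv: kv[1], reverse=True)
    # Only features where the applicant actually lost points are reasons.
    return [feature for feature, lost in ranked[:top_n] if lost > 0]

# Hypothetical feature names and point values, for illustration only.
max_points = {"avg_balance": 60, "income_stability": 50, "overdraft_count": 40}
applicant = {"avg_balance": 20, "income_stability": 45, "overdraft_count": 10}

print(principal_reasons(applicant, max_points))
# ['avg_balance', 'overdraft_count']
```

This approach works because each feature's contribution to an additive scorecard is directly observable. Models without that property need a separate, documented attribution method that can be defended to regulators.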

IV. Scores should be monitored in production and redeveloped periodically  

Once in production, the statistical performance of a score should be actively monitored on an ongoing basis by the scoring provider. If model performance degrades—which can occur for a number of reasons, such as changes in the economic environment or in consumer behavior—the score should be redeveloped with fresh data.
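
One widely used drift indicator is the population stability index (PSI), which compares the distribution of scores in production against the development sample. The text above does not name a specific metric, so treat this as one possible sketch; the band counts are hypothetical.

```python
import math

def psi(expected_counts, actual_counts, floor=1e-6):
    """Population stability index across matching score bands."""
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / e_total, floor)  # floor avoids log(0) on empty bands
        a_pct = max(a / a_total, floor)
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total

# Hypothetical counts per score band, lowest band first.
development = [100, 200, 400, 200, 100]
production = [150, 250, 350, 150, 100]

print(round(psi(development, production), 4))
```

A PSI of zero means the two distributions match, and larger values mean more shift; a frequently cited rule of thumb treats values above roughly 0.1 as worth a closer look. PSI only detects population shift, so it would normally sit alongside direct performance measures such as rank-ordering on matured loans.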

Cash flow underwriting may be particularly susceptible to changes in the marketplace because transaction data constantly evolves: new merchants emerge, new financial products are created, and new behaviors develop. In the past ten years, for instance, the growth of P2P payments and BNPL has changed the complexion of bank account transaction data. Because of this, cash flow score providers must also devote significant ongoing time and effort to keeping their underlying data processing and categorization current with the market.

V. Data consortiums must protect the contributions of participating lenders 

As established in the score development standards above, lenders have much to gain by participating in cash flow data consortiums. Broad and diverse datasets allow for the creation of stronger scores and other analytical products, which benefit all participants. Companies that manage cash flow data consortiums are also well positioned to solve common problems like transaction data categorization or taxonomy on behalf of all participants. This is exactly the process that has allowed for the development of advanced scores and analytical techniques based on credit bureau data. In the ordinary course, lenders report their performance data back to the credit bureaus, which format it in a standardized way and then produce derivative products and services that deliver analytical value.

At the same time, it is essential that the proprietary information of each lender that contributes data to such a consortium is protected. Proper safeguarding requires, but is not limited to, the following:

  1. Strong prohibitions against sharing or selling one individual lender’s performance data to another.
  2. Collection of only the minimum data elements necessary to inform score performance and improvement.  
  3. Rigorous data security standards to protect sensitive information. Note that particularly sensitive Personally Identifiable Information (PII) is not required for a cash flow data consortium. 
  4. High model-building standards regarding the number of lenders included in a score development sample. As noted above, score development datasets should include performance data from many different lenders, and no individual lender’s results should make up a disproportionate share of the training sample.

Questions to Ask Any Cash Flow Score Provider

  • Where was the training data for the score obtained?
  • How many lenders were included in the score development sample?
  • What credit product types are included in the score development sample?
  • What credit segment(s) are included (e.g., subprime, near prime, prime, super prime)?
  • What time period(s) does the training data come from?
  • What testing has been done to ensure the score is performant as a general-purpose score and not overfit to particular training populations?
  • How is raw transaction data processed and categorized by the score provider as part of their score development process? How is accuracy of categorization ensured?
  • Are the scores explainable? Do they come with adverse action reasons, and how are those reasons derived? Have they been tested in the market with consumers and regulators?
  • What risk management and model governance procedures are in place? 
  • How are models reviewed for Fair Lending and FCRA compliance? 
  • What procedures are in place for the ongoing monitoring of score performance?
  • If clients are asked to share proprietary performance data back to the score provider, how is that done and what protections are in place to secure that proprietary information?
  • What minimum standards does the score provider adhere to in order to ensure the proprietary performance data of an individual client cannot be accessed by competitors and does not “leak” into the market through derivative scoring products?
  • How does the scoring provider support client oversight requirements, including sampling requests from regulators during periodic examinations? 

Final Thoughts

Credit scoring in any form is a currency of trust. As cash flow-based credit scores continue to gain adoption in the financial services industry, high standards for model development are necessary.

Just because open banking data is relatively new does not mean that model developers can shirk their regulatory and ethical responsibilities to build, deploy, and monitor models in an empirically derived, statistically sound, and explainable way. The same regulations and expectations should apply to all credit models. In fact, developers of new approaches and technologies should err on the side of greater conservatism and transparency, and should expect tougher questions from users and regulators.

Adherence to the principles described above is essential for cash flow underwriting models to be performant, reliable, and compliant with applicable regulations—in other words, worthy of trust from banks, lenders, investors, regulators, and ultimately consumers. 

About Prism Data

At Prism Data, we pioneered the cash flow-based credit score with our CashScore® product, a simple three-digit measurement of creditworthiness similar to a conventional credit score but based on bank account transaction data rather than credit bureau data. CashScore has been shown to be highly predictive of credit default risk, including for thin-file and no-file consumers, and orthogonal or additive to traditional credit assessment. 

Our flagship CashScore predicts the likelihood that a borrower will go “bad” in the next twelve months and can be used to approve or decline loan applications (and to make a host of other decisions), just like a conventional score. In addition to the flagship score, Prism Data also offers CashScore Detect, a short-term risk score designed to identify the risk of first-party fraud and first payment default, and CashScore Extend, a credit risk score focused specifically on cash advance and small-dollar lending.  

After collecting millions of historical cash flow underwriting records going back many years, we released CashScore v3 in 2022—the industry’s first general-purpose and “consortium-based” cash flow underwriting score—and CashScore v4 in 2024. These scores, similar to conventional credit scores, are based on results from many different lenders across a long period of time, providing robust and diverse training data representing a wide range of products, economic conditions, and risk levels. The result is market-leading predictive power in the cash flow underwriting category and a score that is relied upon today by dozens of clients ranging from top 10 credit card issuers to some of the fastest growing fintechs in the country.