Tom Petro is a board director of Univest Financial Corp and chairs the risk management and trust committees. He is the former CEO of Fox Chase Bank and author of the book “AI in the Boardroom: Preparing Leaders for Responsible Governance.” He writes and speaks frequently on AI governance, strategy and oversight, drawing on decades of leadership at the intersection of business, technology and capital.
The Right Questions To Ask About AI in the Boardroom
A bank director delves into fairness, accuracy and risk appetite in artificial intelligence models used by vendors or inside the bank.
In the boardroom, we directors set high bars. In manufacturing, we expect zero injuries and projects delivered on time. In finance, we demand resilience and returns. It’s natural to bring that same “both/and” instinct to artificial intelligence: we want systems to be fair and accurate.
That sounds right. It also sounds simple.
But in algorithms, “both/and” has limits. Fairness can mean different things, and those meanings can pull against each other. Accuracy, too, has layers. What kind of mistake we’re willing to make matters as much as how many mistakes we make. Boards that demand “fair and accurate” without defining either can turn a strategic choice into a hidden fault line.
The truth is algorithms don’t duck the question. They optimize for something. If the board does not set the perimeter, the choice is made by the vendor, the developer, or the operating team — often invisibly, and without alignment to our values.
This article brings the issue down to earth for bank directors. As more banks and their vendors begin to use AI in underwriting and other lending decisions, we unpack how different definitions of fairness play out in practice. Then we show how boards can embed those choices into a risk appetite statement (RAS) that aligns AI perimeters with strategy, culture and trust.
Why Fair and Accurate Don’t Always Travel Together
Think of AI as a scoreboard. How we keep score determines who wins. If we track only total points, one team wins; if we track defensive stops, another might. Same game, different scoreboard, different outcome.
Algorithms work the same way. The measure we ask them to optimize becomes the “game” they play. In lending, three fairness scoreboards come up again and again:
1. Consistent Scoring (Calibration): A score means the same thing for everyone.
2. Balanced Mistakes (Equalized Errors): Errors are spread more evenly across groups.
3. Similar Treatment (Individual Fairness): Applicants with similar profiles are treated the same.
Each scoreboard sounds reasonable. Each benefits some borrowers and disadvantages others. And each creates different exposures for the bank. Suppose our bank uses an AI model to help flag which small-business loan applications should be fast-tracked for human review. The data, the training, the system — all the same. The only difference is which scoreboard we tell the model to optimize.
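To make the scoreboards concrete, here is a minimal sketch, in plain Python, of what each one actually checks in the fast-track scenario above. The applicant records, group labels, scores and outcomes are invented for illustration; they are not drawn from any real model, vendor or portfolio.

```python
# Illustrative only: toy records of (group, model_score, repaid, fast_tracked).
applicants = [
    ("A", 0.8, True,  True), ("A", 0.8, True,  True), ("A", 0.4, False, False),
    ("B", 0.8, True,  True), ("B", 0.8, False, True), ("B", 0.4, True,  False),
]

def repayment_rate_at_score(rows, group, score):
    """Consistent scoring: at a given score, does every group repay at the same rate?"""
    matched = [repaid for g, s, repaid, _ in rows if g == group and s == score]
    return sum(matched) / len(matched)

def missed_creditworthy_rate(rows, group):
    """Balanced mistakes: how often are creditworthy applicants in a group passed over?"""
    creditworthy = [fast for g, _, repaid, fast in rows if g == group and repaid]
    return sum(1 for fast in creditworthy if not fast) / len(creditworthy)

def similar_treatment_violations(rows, tolerance=0.05):
    """Similar treatment: near-identical scores should receive the same decision."""
    return [(a, b) for i, a in enumerate(rows) for b in rows[i + 1:]
            if abs(a[1] - b[1]) <= tolerance and a[3] != b[3]]

# Same score, same meaning in every group?
print(repayment_rate_at_score(applicants, "A", 0.8), repayment_rate_at_score(applicants, "B", 0.8))
# Are mistakes spread evenly across groups?
print(missed_creditworthy_rate(applicants, "A"), missed_creditworthy_rate(applicants, "B"))
# Do similar applicants get the same outcome?
print(len(similar_treatment_violations(applicants)))
```

Each function is a different definition of “fair,” and a model can satisfy one while failing another on the very same data.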
Consistent Scoring (Calibration)
- Who’s helped: Prime borrowers with long histories, well-documented files, and strong personal credit.
- Who’s hurt: Thin-file or credit-invisible applicants, often first-time entrepreneurs and disproportionately Black, Hispanic, and immigrant owners.
- Why: Reliability of scores is preserved, but uneven error burdens remain.
Balanced Mistakes (Equalized Errors)
- Who’s helped: Near-prime and historically under-approved borrowers, especially minority- and first-time owners, who gain access that past scoring overlooked.
- Who pays the cost: The institution, in the form of more false positives (borrowers who default), higher pricing complexity, or tightened capital buffers.
- Why: This redistributes the burden of mistakes, opening access to borrowers the consistent-scoring method rejected while excluding some who qualified under that standard. It also adds credit risk the bank must price for or absorb.
Similar Treatment (Individual Fairness)
- Who’s helped: Applicants with full, high-quality files that match profiles of already-approved peers.
- Who’s hurt: Those whose data is incomplete, even if they are creditworthy — again, disproportionately thin-file and minority borrowers.
- Why: Fairness here depends entirely on what’s observable in the data.
Now picture these on the same dashboard. Turn one up, and the others pull down. None is neutral. Each represents a perimeter choice that determines who is included and who is left out.
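To see why turning one dial moves the others, here is a deliberately simplified arithmetic sketch with invented numbers: two applicant pools scored by the same model, with approvals calibrated so that 90% of approved applicants in each pool repay. Because the pools’ underlying repayment rates differ, the error burdens cannot also come out equal.

```python
# Invented numbers, purely for illustration: two applicant pools scored by the
# same model. Approvals are calibrated (90% of approved applicants in EACH pool
# repay), yet the error burdens come out unequal because the pools' underlying
# repayment rates differ.

def rates(total, creditworthy, approved, approved_who_repay):
    precision = approved_who_repay / approved                                      # consistent scoring
    missed_creditworthy = (creditworthy - approved_who_repay) / creditworthy       # creditworthy passed over
    bad_approval_rate = (approved - approved_who_repay) / (total - creditworthy)   # non-creditworthy approved
    return precision, missed_creditworthy, bad_approval_rate

# Pool A: 100 applicants, 80 creditworthy; 70 approved, 63 of whom repay.
# Pool B: 100 applicants, 40 creditworthy; 30 approved, 27 of whom repay.
for name, counts in [("Pool A", (100, 80, 70, 63)), ("Pool B", (100, 40, 30, 27))]:
    p, fnr, fpr = rates(*counts)
    print(f"{name}: repay rate among approvals={p:.2f}, "
          f"creditworthy missed={fnr:.2f}, non-creditworthy approved={fpr:.2f}")

# Expected output (rounded):
# Pool A: repay rate among approvals=0.90, creditworthy missed=0.21, non-creditworthy approved=0.35
# Pool B: repay rate among approvals=0.90, creditworthy missed=0.33, non-creditworthy approved=0.05
```

The specific numbers do not matter. What matters is that once the underlying repayment rates differ, holding the consistent-scoring dial steady forces the balanced-mistakes dial to move.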
The Hidden Risk if Boards Don’t Choose
When boards stay at the slogan level of “fair and accurate,” the scoreboard is chosen for us. That can happen in three ways:
- Developer defaults: Data scientists pick the metric that fits their tools.
- Vendor settings: Third-party systems ship with preset fairness modes.
- Operational pressure: Teams select what helps them meet loan growth and conversion goals.
Each path leaves directors blind to choices that shape markets, reputations and regulatory exposure.
Declaring the Perimeter With Risk Appetite
This is where risk appetite statements come in. At their best, RASs make boundaries visible: they declare who is inside the perimeter, who is outside, and what trade-offs we accept to draw the line there.
Fairness provides a sharp test case. Translating the three scoreboards above into risk appetite language could look like this:
- Consistent Scoring: We accept a perimeter that favors uniformity and defensibility, even if it limits access for underserved groups.
- Balanced Mistakes: We accept a perimeter that expands access across groups, even at the cost of higher credit losses within our tolerance.
- Similar Treatment: We accept a perimeter that rewards full, documented files, even if it leaves thin-file borrowers outside the line.
Each is concise. Each signals values. Each spells out a trade-off regulators, customers, and investors can see and understand.
Other Perimeter Choices AI Forces Us To Draw
Fairness is not the only boundary AI forces us to draw. Similar tensions appear in other high-impact domains:
- Accuracy Versus Speed: Do we prize real-time decisions, or do we slow down to reduce error?
- Transparency Versus Performance: Do we prefer explainable but less accurate models, or more accurate black boxes?
- Innovation Versus Compliance: How close to today’s regulatory lines are we prepared to go in pursuit of growth?
- Customer Autonomy Versus Engagement: Do we maximize engagement, even if it reduces customer choice?
Each of these is a perimeter question. Each belongs in risk appetite.
What Good Looks Like
For high-impact use cases, strong AI-related RASs should answer three questions clearly:
1. Who is inside the boundary — and why?
2. Who is excluded — and why?
3. What trade-offs are we accepting to draw the line there?
An example outside lending makes the point:
We have a moderate appetite for AI-driven personalization that enhances the customer experience, provided it is transparent and respects customer choice. We have no appetite for practices that obscure options or exploit vulnerable groups. We are willing to accept lower short-term revenue for long-term trust and regulatory alignment.
The statement is short. It is strategic. And it sets a clear perimeter the board can defend.
Questions Directors Should Be Asking Now
At the next risk appetite review, directors should probe:
- For each high-impact AI use case, what perimeter is being set?
- Who is included and excluded by that perimeter?
- Are those trade-offs spelled out in the RAS, or left implicit?
- What triggers escalation when outcomes drift outside the boundary?
- How are third-party vendors bound to respect the perimeters we set?
Precision Over Slogans
Directors are right to want both fairness and accuracy. But unless we specify what kind of fairness matters for a given use case, and how much accuracy we are prepared to trade to achieve it, we’re not making policy — we’re reciting slogans.
The algorithms are already running. The boundaries are already being set. The only question is whether directors will declare those perimeters deliberately or let them be drawn by default.