The Language of AI Governance


There is much ado these days regarding the language facility that popular AI tools such as ChatGPT, Midjourney, Microsoft Copilot, and Google Gemini appear to display. What do these systems comprehend? Do they understand? Are the underlying architectures a big leap toward artificial general intelligence (AGI) or an entertaining dead end? This article is not about that. It is, instead, a loosely related thread on how the language we use to describe AI systems affects our ability to govern them effectively.

It may sound ridiculous, but effective AI governance relies on not governing “AI.” After all, what is AI? To the broad public, AI is often synonymous with ChatGPT and other generative AI tools.

To your colleagues, AI may be a hazy mishmash ranging from traditional machine learning (ML) and statistical algorithms straight through to the buzzy apps du jour. In a sense, all parties are right. The AI moniker encapsulates a broad spectrum of analytic (call them algorithmic, if you prefer) techniques. AI governance is, therefore, an exercise in portfolio management.

Emerging regulations put a premium on the risk level of AI-enabled applications. But evaluating risk requires a coherent understanding of the inherent capabilities and fallibilities of the underlying AI technique(s), as well as how algorithmic outputs will be applied within the context of a given business process or product—especially as products and services increasingly utilize multiple techniques in concert to deliver satisfactory results.

What AI regulations perhaps miss—or take for granted—is the underlying literacy required to make a qualified risk assessment. An application may be flagged as potentially high-risk on its contextual merits (in healthcare or social services, for example). However, the realized risk profile is highly dependent on the techniques applied. While transformer-based large language models (LLMs) are the hottest algorithms on the block, they are not appropriate (or necessary) for every circumstance in which natural-sounding text needs to be generated. Likewise, in the case of medical diagnosis, a simple (and transparent) logistic regression sometimes outperforms a sophisticated, black-box deep-learning model.
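To make that comparison concrete, a minimal sketch follows of how a team might benchmark a transparent baseline against a more complex model before concluding the complex technique is warranted. It assumes scikit-learn and uses a synthetic dataset as a stand-in for a real diagnostic problem; the model choices and metric are illustrative, not prescriptive.

```python
# Minimal sketch: compare a transparent logistic regression baseline against a
# black-box model before assuming the complex technique is necessary.
# The dataset is synthetic and purely illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic tabular data standing in for a diagnostic task
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "black_box_mlp": MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```

If the simpler, more interpretable model performs comparably on the agreed validation data, the governance case for the black-box alternative has to be made on something other than raw performance.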

Governance comes down to ensuring the right analytic tools are applied to the right jobs in the right way. Can we use AI to solve this problem? Is AI the best tool for the job? Questions of this ilk are common yet nonsensical when it comes to practical, on-the-ground governance. Naming the specific technique(s) under consideration is far more cumbersome but far more effective.

Overly loose or broad descriptive language can be equally problematic throughout the AI product lifecycle. Our human tendency to ascribe human traits to inanimate objects is well-documented.

This tendency to anthropomorphize is ruthlessly exploited by makers of AI systems. We give them common human monikers and speak of their ability to converse, hallucinate, ideate. We compare their performance to humans and make allowances for or against them according to our personal tech predilections. We speak of them as decision makers, colleagues, companions. We use personal pronouns when writing about them. This personification of systems that rely on conversational interfaces and ML is perfectly predictable and understandable. Yet none of it lightens the governance load, particularly given the incredible power of language to engage and beguile.

When it comes to defining expected outcomes, acceptable errors, risk tolerances, or validation scenarios (the list goes on), word choices matter. Common idioms and anthropomorphized language do not belong in your analytics or product lifecycle—AI or otherwise.

Rather than leaning on personified traits, be explicit, even clinical, about a system’s properties. Does it matter if an LLM-enabled chatbot hallucinates? That’s hard to answer without being specific about your tolerance for erroneous outputs or logical inconsistencies. Does the material the system references to generate those beautiful images matter? It’s better to ask what level of potential copyright infringement is palatable. Even simple changes such as referring to the output of an AI-enabled system as an output, rather than a reply, can heighten the team’s objectivity at every stage of product development. Save the (responsible use of) human-friendly analogies and evocative language for product marketing and PR.
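One way to force that clinical precision is to write tolerances down as a structured artifact rather than a turn of phrase. The sketch below is a hypothetical illustration: the field names and thresholds are invented for this example and would need to be adapted to an organization's own risk and validation process.

```python
# Minimal sketch of stating tolerances explicitly instead of asking whether a
# system "hallucinates." All fields and thresholds are hypothetical examples.
from dataclasses import dataclass

@dataclass(frozen=True)
class OutputTolerances:
    """Clinical description of acceptable system behavior for one use case."""
    max_factual_error_rate: float          # share of outputs containing verifiably wrong claims
    max_logical_inconsistency_rate: float  # share of outputs contradicting prior outputs
    max_source_similarity: float           # similarity threshold triggering copyright review
    requires_human_review: bool            # whether outputs ship without sign-off

# Example: tolerances for a customer-facing drafting assistant
drafting_assistant = OutputTolerances(
    max_factual_error_rate=0.01,
    max_logical_inconsistency_rate=0.02,
    max_source_similarity=0.80,
    requires_human_review=True,
)
print(drafting_assistant)
```

A specification like this keeps the conversation on measurable outputs and thresholds, which is where validation scenarios and sign-offs can actually attach.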

Here is a final, perhaps slightly farther afield, reflection. Often, what we don’t say in conversation is more important than what we do. What we emphasize speaks volumes about what we don’t. There is incredible pressure today to deploy AI in all aspects of business. Some of the hubbub arises from the fear of missing out (FOMO) and some from a sincere desire to be ahead of the curve or do the right thing. While the hype is calming somewhat, the broader echo chamber favors saying yes, rather than no, when it comes to anything labeled AI.

The challenge for teams, then, is to ensure that equal time is spent contemplating not just “Why?” but “Why not?” Consider how much time decision makers at all levels spend discussing capabilities and value versus limitations, risks, and harms. Chances are, the proportions skew heavily toward “Why?” Or, alternatively, the “Why not?” conversation is scoped strictly in relation to the “Why?” If so, it may be time to flip the script. Start with “Why not?” or use separate teams to consider both sides of the coin.

This may not change the proportion of AI-labeled projects getting a bright green light, but equal airtime should change the proportion that are deployed mindfully.


