AI in Diligence: The Risk No One Models
- Iain Duncan


Introduction
It will come as no surprise to readers of the RingStone Tech blog that the last couple of years have seen a dramatic increase in AI-related technical diligence. However, what is a bit surprising for us on these engagements is how we are being asked to answer what would normally be business model decisions. The nature of the field is so technical, and the developments so new, that investors are turning to multiple diligence teams to answer fundamental questions about the target business. Is the business defensible? Is the offering unique? Can it be profitable?
Having worked in the diligence space for seven years now, I’ve personally been on AI- and ML-focused diligences in which the target companies ranged from wildly successful to being sold at a loss. I’ve had the privilege of working alongside our subject matter expert, Dr. George Tzanetakis, who has been lecturing and researching AI and ML for over 20 years, including as guest faculty at Google Research. In this blog post, I’m going to share a common, and potentially deadly, mistake we frequently see in AI-focused companies: misunderstanding the business ramifications of machine learning uncertainty.
This is a mistake so profound that it can be unfixable without pivoting to a completely different business model. To understand this all-too-common mistake, let’s review some fundamentals of machine learning.
Predictive Machine Learning
AI in 2026 has become pretty much synonymous with machine learning, and more specifically with deep learning, in which multi-layer neural networks are used for probabilistic prediction problems. Since the early 2010s, with the advances of GPUs and various algorithmic developments, deep learning approaches have provided significantly better results in a variety of classification problems. This is the case whether we’re talking about systems that identify objects in photos, extract meaning from recorded audio, or read and answer questions using written language. The keyword here is predictive. In deep learning problems, every answer is a prediction of the likelihood of various continuations or connections.
This is fundamentally different from the so-called “classical AI” approaches of the 80s and 90s, in which “expert systems” made decisions using symbolic logic and complex rules – lots and lots of rules. Expert systems are predictable. Give them the same inputs, and you get the same output, unless random variability is deliberately added to the system. If the system can solve the problem, it will generally always solve the same problem in the same way. Conversely, if it can’t solve the problem, it will fail to answer, and do so every time.
Deep learning systems are pretty much the opposite. Because an answer is fundamentally a prediction, the system will always give us an answer. The prediction might just be a bad one. This also means that, given identical input, the answer may not be the same in some cases.
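The contrast between the two paradigms can be sketched in a few lines of code. This is a minimal illustration, not taken from any real system: the rule table, item names, and labels are all invented for the example. The point is structural: the rule-based function either answers deterministically or fails explicitly, while the probabilistic classifier always produces some answer, even on input it has never meaningfully seen.

```python
import math

# Hypothetical rule table standing in for an "expert system" knowledge base.
RULES = {"fries": 3.49, "mcnuggets_10": 5.99}

def expert_system_price(item: str) -> float:
    """Classical style: same input, same output; unknown input, explicit failure."""
    if item not in RULES:
        raise KeyError(f"no rule for {item!r}")
    return RULES[item]

def softmax(logits: list[float]) -> list[float]:
    """Turn raw model scores into a probability distribution that sums to 1."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classifier_answer(logits: list[float], labels: list[str]) -> tuple[str, float]:
    """Predictive style: always returns the most likely label -- there is
    no 'refuse to answer' branch, only a (possibly low) confidence score."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]
```

Note that even when the logits are near-uniform noise, `classifier_answer` still hands back a label with a confidence attached; nothing in the math distinguishes a well-founded prediction from a wild guess.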
Given a sufficiently large and appropriate training set, these predictions can be remarkably accurate, remarkably often. But at least some of the time, they make errors. Unlike the errors a person makes, these can seem truly nonsensical to anyone with an understanding of the problem domain. Worse, the system can make them over and over again without necessarily learning from previous mistakes.
The McDonald's AI-powered ordering system provided an unintentionally hilarious example of this when it mistakenly interpreted orders as being for hundreds of McNuggets or butter on ice cream. While human order-takers would know from context and experience that they must have heard the order wrong, the AI saw nothing wrong with this outlandish order. Dr. Balaji Lakshminarayanan, senior research scientist on the Google DeepMind team, put this very succinctly: “Neural networks do not know when they don’t know.”
The Big Mistake
Now, we should hardly be surprised that this fundamental element of machine learning is so widely misunderstood. We have some of the most well-financed companies in history doing everything they can to convince us that incorrect predictions are mere “hallucinations,” with the term implying the errors are some kind of temporary deviant thinking that’ll go away as soon as the models sober up. LLM vendors have been proclaiming that they are closing in on artificial general intelligence, in which AI systems would achieve human-like intelligence. But while perhaps diminishing, the hallucinations are stubbornly sticking around.
What does all this have to do with the potentially deadly mistake we see in AI companies during diligence? And not just during diligence, but also in the wider market? This mistake is building a business on a value proposition that is only viable if the AI output can be guaranteed to be correct. Get this right, and deep learning can be a game-changer. Get it wrong, and your business can be stuck trying to get off the ground.
Machine learning systems can be amazing at recognizing patterns that are difficult or impossible for people to recognize, and they can do it very quickly. We’re seeing this capability harnessed for a huge variety of problems, from computer vision, to summarizing documents, to “talking” with customers for support. In some use cases, occasional incorrect predictions are not much of a problem, and are vastly outweighed by the benefits. But in others, the problems can be insurmountable. While details must be redacted and the names changed to protect the innocent, I can attest that I have seen companies that avoided this mistake doing very, very well, and others who made the mistake struggling to get any real traction.
Stories From The Field
One successful company I assessed was (in general terms) using machine recognition to find candidate items in a massive catalog. The value proposition was that the time needed for highly skilled, well-paid human technicians to search for these items (a necessary part of their work) was cut dramatically. The important part is that the system didn’t need to be right every time for this to be highly valuable. The technicians were shown a list of candidate items and selected the right candidate - or none at all - based on their expert judgment. In the best-case scenario, significant time was saved. In the worst-case scenario, the technician saw no matching item and reverted to the slower pre-machine-learning workflow. In no case were items selected and used without a human expert checking the selection. Not only did this save labor, but the savings were easy to measure and demonstrate, making for a very successful business proposition, and the downside of occasional errors was negligible.
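The workflow described above can be sketched as a simple suggest-then-confirm loop. This is a hypothetical sketch, not the assessed company's actual code: the function names, the score threshold, and the top-k cutoff are all illustrative assumptions. The structure is what matters: the model proposes ranked candidates, the expert either picks one or rejects them all, and rejection falls back to the pre-ML manual workflow rather than to a wrong answer.

```python
from typing import Callable, Optional

def suggest_candidates(scores: dict[str, float], k: int = 5,
                       min_score: float = 0.2) -> list[str]:
    """Return up to k candidate item IDs, best first, dropping weak matches."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, score in ranked[:k] if score >= min_score]

def resolve_item(scores: dict[str, float],
                 expert_pick: Callable[[list[str]], Optional[str]],
                 manual_search: Callable[[], str]) -> str:
    """Human-in-the-loop resolution: the model only ever suggests;
    the expert decides, and can always fall back to manual search."""
    candidates = suggest_candidates(scores)
    choice = expert_pick(candidates) if candidates else None
    # Worst case: no usable suggestion, so the old workflow still applies.
    return choice if choice is not None else manual_search()
```

The design choice worth noting is that a bad model prediction costs only the few seconds the expert spends rejecting it; it can never silently become the final answer.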
In a similar vein, I also assessed a healthcare company that was using sophisticated machine-learning systems to help professionals make diagnoses. The business had a solid value proposition because the suggestions from the ML models helped the experts work better and faster, but didn’t attempt to replace expert oversight.
On the other hand, I have also conducted diligence on a target that was attempting to automate part of the ordering process for retailers. The work being automated away was not particularly expensive: a retail clerk counting stock for reorders. The AI's mistakes, however, were costly, because incorrect quantities resulted in potential spoilage or wasteful returns. While prototypes were successful in garnering interest from external stakeholders, without a 100% accuracy rate the value from the automation was simply not compelling enough to land the big contracts the business model depended on, and the company was slowly sinking.
Avoiding The Big Mistake
To avoid making such a high-stakes mistake, companies making products that depend on AI predictions need to critically and unflinchingly ask, “What happens if the model’s wrong?” This seems like such a simple question that one might think businesses forgetting to ask it would be rare. But rather than being limited to startups, we’re seeing this mistake cause significant losses all over, even in large and established companies. Air Canada was recently ordered by a tribunal to compensate a customer after refusing to honor wildly incorrect advice on air fares given by its website chatbot, with the story making the news and damaging an already fragile reputation. Taco Bell rolled out AI ordering, only to have orders for 18,000 cups of water crash the entire system. Legal firms relying on AI document summarization have been sanctioned for citing fabricated case law, to the point that some firms are now banning AI use for legal research.
All of these are manifestations of the same basic mistake. In each case, nobody thought hard about the ramifications of occasional wildly off-base predictions. Everyone bought the sales pitch from AI vendors that “100% accuracy is just around the corner.” This is somewhat understandable when we look at the fear of missing out gripping the industry right now and the messaging from AI company CEOs. But if we listen to the researchers themselves - the ones building the systems - we should have healthy skepticism that hallucinations will go away any time soon, if ever, or that machine learning models will stop occasionally making outlandish predictions. A recent study from the RAND think tank put it this way:
“Many business leaders also do not realize that AI algorithms are inherently probabilistic: Every AI model incorporates some degree of randomness and uncertainty. Business leaders who expect repeatability and certainty can be disappointed when the model fails to live up to their expectations.”
Conclusion
While this veers into conjecture, I believe that a major reason this mistake is so widespread is that those of us who grew up with computers prior to the age of machine learning have an ingrained expectation that computers are predictable, logical machines that always do the same thing in a given situation. We have a deep-seated belief that computers always follow instructions precisely. Machine learning turns this paradigm so thoroughly on its head that it’s difficult to adjust our perspectives, even when we know better.
This is exacerbated to no small degree by the fact that we now have machine learning systems that actually talk to us. When we think of talking artificial intelligence, we imagine cultural characters such as Star Trek’s Data or the ship computer - rational and logical to a fault. On some level, regardless of the evidence, we feel that artificial intelligence suggesting butter on ice cream or eating gravel must just be a temporary road bump on the path to a perfectly rational decision maker.
The reality is that machine learning models represent a completely different paradigm of computing compared to classical programming. While we are all used to bugs in software, we have no historical frame of reference for programs confidently speaking to us in our own tongue while occasionally spouting nonsense. Combine this with the sales pitches coming from AI vendors, and we have a dangerous and almost irresistible temptation to believe these systems can do more than they can.
Ultimately, how we get our answers doesn’t change the business fundamentals. We still need to look at what happens when we’re wrong, what happens when we’re right, and whether the potential impacts of disasters are too great for our value proposition. Evaluating the cost of unlikely errors is a notorious blind spot for us fallible humans, but it’s a problem that will remain critical regardless of advances in technology.
About the Author
Iain C.T. Duncan has spent the last 20 years working in the software industry as a CTO, developer, software architect, and consultant, building B2B and internal web applications at startups, agencies, and early-stage companies in Canada and the U.S.
As a practitioner at RingStone, he works with private equity firms globally in an advisory capacity, conducting technical diligence on early and mid-stage companies and preparing companies for the diligence process. He has worked in the diligence sector for six years and has been involved in hundreds of diligence efforts as a practitioner, reviewer, and trainer. Before the diligence sector, he worked on software in e-commerce, association management, non-profit fundraising, genomics, and online education, among others.
An active open-source developer and researcher, Iain is currently completing an interdisciplinary PhD in Computer Science and Music at the University of Victoria, working on applications of Scheme Lisp to algorithmic music and music pedagogy. He is the author of the Scheme for Max open-source programming extension to the Max/MSP audio programming platform and is the founder and developer of the online music education platform SeriousMusicTraining.com.



