
From hype to systems thinking

Hard-earned lessons from the enterprise reality of GenAI


Workers on the GenAI ground have learned painful lessons over the past years. There are burnouts; there are teams that built a shiny demo that did nothing in production; and there are those that went the full loop, where a pilot finally made it to production. Many companies will wake up and ask how to get the AI slop out of their systems when it is far too late, because the mantra was: good is good enough. That mantra was not least inspired by executives pushing to finally collect on the ROI they were promised. The problem is that the delayed negative feedback of this mantra only becomes visible much later, unless it is measured upfront and consciously accepted before deployment.

Avoiding these issues takes many things, which means AI can be extremely costly to develop; unless you reach the ROI stage, those costs are hard to justify. Judicious decisions therefore need to be made on feasibility, and the only way to make such choices is by measuring, starting before your GenAI project does. Even GenAI pilots are often very costly, so think carefully about creating a decision process where these choices are made upfront and the messenger is not shot for pointing out issues. There are, however, lessons collected by observing organizations that can help you get to ROI more reliably. You may ignore them at your own peril.

You will need systems thinking

If you work in GenAI, you see systems where others may see isolated problems. Imagine a process in which work instructions dictate how a case should be handled. During data discovery, you look into the cases and find that, instead of reflecting the work instructions, the data violates several of their assumptions. The messy reality: the process has people checking quality and sending cases back, so the case handlers stop following the work instructions and start optimizing for the quality check instead, covering far more than necessary just to avoid rework. Here, the system is doing exactly what it is incentivized to do. The indicators of satisfaction are defined incompletely, and the system works obediently toward its goals while producing unintended results. Misaligned incentives like these cause data quality issues.

More importantly, such issues are hard to grapple with at an organizational level, and much harder for individual data scientists tasked with solving problems using data produced by such a process. You have to link tightly with other departments that often have no incentive to deal with your problem and, more importantly, redefine the incentives of the process itself. If you do not solve this problem, you will ultimately use GenAI as a force multiplier that amplifies the organizational dysfunction, producing output aligned with the flawed data of the original mis-incentivized process. Suddenly you are not only building an AI application; you are cleaning up a process that should not exist in its current shape, often without a clear mandate to do so. To turn the ship around, you need to specify new indicators. These could be implemented using GenAI itself, to measure quality rather than just create output, but only once the dependencies around data quality have been sufficiently resolved.

An AI project is not a standalone model. It usually integrates into a product. It pulls data from various sources, sometimes in real time, and writes to others. All these data sources need accessible, exposed APIs. There are legal and risk constraints, especially in regulated industries. These requirements demand cross-functional teams that cut across organizational boundaries, and you will see how an organization that has always optimized locally struggles to rebalance when it suddenly has to think jointly.

Without a clear understanding of how roles expand in scope and authority, how incentives need to be reinvented, and what system-level change is required, GenAI applications may fail to deliver the return. The organizational chart has an enormous effect on how information flows, much as physical structure defines a system's performance; and because changes to organizational structure are rarely simple or quick, it is hard to use as a leverage point.

One way to think about this is to build cross-functional “synthetic” units and give them the right goals to function within old structures until new structures can be found. This is where paradigms come in: what was the previous culture of information flow between the units being joined? What are the perceptions of solutioning, and how is expertise distributed within the synthetic unit to evaluate paradigms coming from outside? External hype influences the goals an organization sets. That means decisions on how authority and trust are distributed in these synthetic units matter more than some may realize.

The GenAI hype is the prevailing agreement on reality that teams have to operate in, which is why their function often extends into educating their organization. At the start, for example, foundation models were touted as an omnipotent solution. Nowadays the consensus is that evaluation of AI solutions is essential: they do fail, and they fail in very unexpected ways if no quality assurance is applied. AI projects go along with a radical reinvention of the organization at a systems level. This requires bold choices, meta-level thinking, and flexibility from all parties involved.

You will need strategy and processes around that strategy

One of the most powerful ways to influence a complex system is through its goal. If the goal is not clearly defined, the outcome will not align with the organization’s goals. That is why strategy is not an afterthought here; it is a central question of alignment.

“Technology happens. It’s strategy that decides whether it’s a disaster or an advantage.” — Andy Grove

I will be honest: I never understood the importance of strategy before. It felt like something for managers, not for me. I was building things, moving along… Working in GenAI for two years changed my mind completely. I cannot think of a transformation that would be possible without strategy. Especially in a hype-driven environment, strategy is about carefully choosing what to do and what not to do. I do not believe in purely top-down strategy, which is often too removed from both the reality of the work being done and the possibilities of the technology. But I also do not believe that individual contributors like Joe and Anna can or should fix this on their own. Their focus should be on gathering information, selecting the right use cases, and building the GenAI applications their management requests.

Have a use case discovery process and pipeline

Joint ownership between Business and IT. Use case discovery must be a joint responsibility; it does not work without Business involvement. The goal is to make work easier where it actually happens. This works best when technical experts define what is possible (for example, classification, extraction, or summarization) and help users understand which patterns may apply to their process. Business should then help decide which pain points to solve based on this understanding.

Business value. Many organizations struggle to quantify business value. It is often unclear where value is created or where delays generate cost. KPIs are rarely defined close enough to the problem to measure improvement. If a case takes x days, but most of that time is waiting or unrelated delay, are you actually making a dent with your potential GenAI solution? Without explicit feedback loops and baselines, improvement cannot be quantified. Even worse, without measurements representing these issues in the system, it may be very hard to understand the true value of what you propose to build, and you may invest in the wrong idea.
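The waiting-time point can be made concrete with a back-of-the-envelope bound (the numbers below are made up for illustration): if only the hands-on portion of a case is automatable, total cycle time barely moves, in the same spirit as Amdahl's law.

```python
def best_case_cycle_time_reduction(total_days: float,
                                   hands_on_days: float,
                                   automation_speedup: float) -> float:
    """Upper bound on relative cycle-time reduction (Amdahl-style).

    Only the hands-on portion can be accelerated; waiting and
    handoff delays are untouched by the GenAI solution.
    """
    waiting_days = total_days - hands_on_days
    new_total = waiting_days + hands_on_days / automation_speedup
    return 1 - new_total / total_days

# A case takes 10 days, but only 1 day is actual hands-on work.
# Even a 5x speedup on that work cuts total cycle time by just 8%.
print(f"{best_case_cycle_time_reduction(10.0, 1.0, 5.0):.0%}")  # 8%
```

Running this kind of calculation before a pilot forces the baseline conversation: where does the time actually go, and what fraction of it can your solution touch?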

Structure discovery top-down. Without shared patterns or blueprints, sub-organizations reinvent evaluation frameworks independently. Providing structure allows teams to focus on gathering the information needed to decide which use cases to pursue first.

Is there a moat? Most GenAI applications have no moat. Unless you have strong reasons to believe you can add value beyond what a vendor or startup can provide, and are willing to own the risk yourself, you may want to let it rest and buy later. In GenAI, a moat usually exists where quality is hard to define and deeply tied to your own process and understanding, for example an interpretation of a policy within your organization. You are making a significant bet, but crucially, not every bet needs to be internal.

Check feasibility and ground it in data

Data quality and integration

Workers on the ground have learned painful lessons and no longer believe the hype. Models may be powerful language computers, but if your process has had quality issues for years, your data will reflect that. Making a model understand what it needs to do often requires significant upfront investment. Domain-specific jargon, unstructured data, and feature dependencies on other products can easily add months. Sometimes taking care of data or clarifying prerequisite architecture decisions must come before building anything, and that should be acceptable.
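One cheap way to surface these quality issues before building anything is a pre-build data audit: measure how often historical cases violate the assumptions the work instructions imply. The sketch below is a minimal illustration; the field names and violation rule are hypothetical, not from the source.

```python
# Hypothetical pre-build data audit: quantify how often historical cases
# violate the assumptions a model (or its evaluation) would rely on.
REQUIRED_FIELDS = {"case_id", "category", "resolution_text"}

def violation_rate(cases: list[dict]) -> float:
    """Fraction of cases missing required fields or carrying empty values."""
    def violates(case: dict) -> bool:
        return any(not case.get(f) for f in REQUIRED_FIELDS)
    return sum(violates(c) for c in cases) / len(cases)

cases = [
    {"case_id": "1", "category": "refund", "resolution_text": "Refunded."},
    {"case_id": "2", "category": "", "resolution_text": "Escalated."},  # empty field
    {"case_id": "3", "category": "billing"},                            # missing field
]
print(f"{violation_rate(cases):.0%} of cases violate assumptions")  # 67%
```

A number like this turns "the data might be messy" into a concrete prerequisite that can be budgeted and sequenced before model work starts.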

Experienced people

Experienced people are your champions. Technology is only a small part of the transformation. What actually changes is how decisions are made and how work gets done. You are often retrofitting quality onto historically low-quality processes. Documentation may define what should be correct, but that knowledge is often tacit or outdated. Policies branch endlessly, and the most experienced people become essential to disambiguate what was never clearly defined or what junior coworkers cannot decisively articulate. You will find processes that make you question how they ever reached this state. You will encounter static data fields that were assumed to be dynamic. You will push back on product decisions and optimize within changing processes, which creates a moving target, often without formal authority. This is why strong product ownership matters. A product owner must understand trade-offs, break strategy into executable pieces, and define what success looks like.

You will need to build trust

GenAI is a human problem at every level

GenAI will replace jobs.

Being dishonest about this does not work. People are already aware, even if some narratives exaggerate the effect. There are processes that no longer require humans and can be handled more reliably through automation. At the same time, GenAI is built in collaboration with the very people whose work may change or disappear.

This creates a delicate human dynamic. Organizations have a responsibility to support people through this transition and help them redefine their work, often by focusing on creative or judgment-heavy tasks. In large enterprises, this matters even more, as data and process maturity can take years. Trust also becomes critical when deploying systems. Compliance and risk teams will scrutinize your work, and rightly so. Foundation models do not understand what quality means in your organization; it will be your job to define that.
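Because the quality definition lives in your organization rather than in the model, one common approach is to encode it as a labeled gold set plus a scoring rule, and gate deployment on the result. The harness below is a minimal sketch under that assumption; `toy_model`, the gold examples, and the 0.9 threshold are all stand-ins.

```python
# Minimal evaluation harness sketch: the organization's quality definition
# lives in the gold set and the scoring rule, not in the foundation model.
def evaluate(model, gold_set: list[tuple[str, str]], threshold: float = 0.9):
    """Score a model against organization-defined expected outputs."""
    correct = sum(model(inp) == expected for inp, expected in gold_set)
    accuracy = correct / len(gold_set)
    return accuracy, accuracy >= threshold  # gate deployment on the threshold

# Stand-in model for illustration; in practice this wraps an LLM call.
def toy_model(text: str) -> str:
    return "approve" if "complete" in text else "reject"

gold = [("docs complete", "approve"), ("docs missing", "reject"),
        ("forms complete", "approve"), ("unsigned forms", "approve")]
accuracy, passes_gate = evaluate(toy_model, gold)
print(accuracy, passes_gate)  # 0.75 False
```

The same numbers that drive this gate are what compliance and risk teams will want to see, which is why building the gold set with them rather than around them pays off.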

A shiny demo may achieve 70% performance, but if your process depends on correctness, being wrong 30% of the time is unacceptable, especially when junior staff are the human in the loop. Defining and measuring quality becomes the responsibility of the builders. It is, however, the responsibility of everyone involved to see measuring quality as an advantage: the same measurements that improve performance can satisfy regulatory requirements such as the EU AI Act. But this only works if incentives across departments are aligned around quality.

Trust is also necessary toward users, and application builders need to handle that trust more carefully than is currently being done. Human-in-the-loop systems only create value if users spend less time correcting output than doing the work themselves. Errors should be understandable. Feedback should be easy. Users should not need to debug systems; their role is to give honest feedback. Frustrating the party you place your trust in may have unintended consequences.

Last but not least, I recommend identifying your AI champions. Work closely with them. Minimize layers between users and builders; too much indirection creates misunderstanding and alignment failure. Sitting with users and understanding their pain points is one of the fastest ways to build trust and value.
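The claim that human-in-the-loop only creates value when correction is cheaper than doing the work can be written as a simple break-even inequality. The function and all example numbers below are illustrative assumptions, not measurements from any real deployment.

```python
def hitl_saves_time(t_manual: float, t_review: float,
                    error_rate: float, t_correct: float) -> bool:
    """Human-in-the-loop pays off only if reviewing plus the expected
    correction time is cheaper than doing the task manually."""
    expected_hitl_time = t_review + error_rate * t_correct
    return expected_hitl_time < t_manual

# A 70%-accurate demo: 30% of outputs need correction. If correcting a
# wrong output costs more than the manual task (redo plus figuring out
# what went wrong), the system costs users time instead of saving it.
print(hitl_saves_time(t_manual=10, t_review=3, error_rate=0.3, t_correct=25))  # False
```

Note what the inequality hides: it assumes users catch every error during review, so in practice it is an optimistic bound on the value of the loop.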

Let’s end at the beginning: You will need buy-in

GenAI is transformation, not application development. It requires top-down support. Leadership must stand behind it, especially when uncomfortable truths about data quality, technical debt, and organizational readiness surface. Many organizations lost time by misunderstanding the problem as purely technical. Those starting later may benefit from hindsight, but only if leadership allows these lessons to travel upward and downward through the organization. Understanding takes time. Alignment takes time. This information must move, and be accepted, for transformation to succeed.
