A generative AI reset: Rewiring to turn potential into value in 2024
The generative AI payoff may only come when companies
do deeper organizational surgery on their business.
It’s time for a generative AI (gen AI) reset. The initial enthusiasm and flurry of activity
in 2023 are giving way to second thoughts and recalibrations as companies realize that
capturing gen AI’s enormous potential value is harder than expected.
With 2024 shaping up to be the year for gen AI to prove its value, companies should
keep in mind the hard lessons learned with digital and AI transformations: competitive
advantage comes from building organizational and technological capabilities to broadly
innovate, deploy, and improve solutions at scale—in effect, rewiring the business for
distributed digital and AI innovation.
Companies looking to score early wins with gen AI should move quickly. But those hoping
that gen AI offers a shortcut past the tough—and necessary—organizational surgery
are likely to meet with disappointing results. Launching pilots is (relatively) easy; getting
pilots to scale and create meaningful value is hard because doing so requires a broad set of
changes to the way work actually gets done.
Let’s briefly look at what this has meant for one Pacific region telecommunications
company. The company hired a chief data and AI officer with a mandate to “enable the
organization to create value with data and AI.” The chief data and AI officer worked with
the business to develop the strategic vision and implement the road map for the use cases.
After a scan of domains (that is, customer journeys or functions) and use case opportunities
across the enterprise, leadership prioritized the home-servicing/maintenance domain to
pilot and then scale as part of a larger sequencing of initiatives. They targeted, in particular,
the development of a gen AI tool to help dispatchers and service operators better predict
the types of calls and parts needed when servicing homes.
Leadership put in place cross-functional product teams with shared objectives and
incentives to build the gen AI tool. As part of an effort to upskill the entire enterprise to
better work with data and gen AI tools, they also set up a data and AI academy, which
the dispatchers and service operators enrolled in as part of their training. To provide
the technology and data underpinnings for gen AI, the chief data and AI officer also
selected a large language model (LLM) and cloud provider that could meet the needs of
the domain as well as serve other parts of the enterprise. The chief data and AI officer
also oversaw the implementation of a data architecture so that the clean and reliable
data (including service histories and inventory databases) needed to build the gen AI tool
could be delivered quickly and responsibly.
Our book Rewired: The McKinsey Guide to Outcompeting in the Age of Digital and AI (Wiley,
June 2023) provides a detailed manual on the six capabilities needed to deliver the kind of
broad change that harnesses digital and AI technology. In this article, we will explore how
to extend each of those capabilities to implement a successful gen AI program at scale.
While recognizing that these are still early days and that there is much more to learn, our
experience has shown that breaking open the gen AI opportunity requires companies to
rewire how they work in the following ways.
Much of gen AI’s near-term value is closely tied to its ability to help people do their
current jobs better. In this way, gen AI tools act as copilots that work side by side with
an employee, creating an initial block of code that a developer can adapt, for example,
or drafting a requisition order for a new part that a maintenance worker in the field
can review and submit (see sidebar “Copilot examples across three generative AI
archetypes”). This means companies should be focusing on where copilot technology can
have the biggest impact on their priority programs.
Sidebar: Copilot examples across three generative AI archetypes

• “Taker” copilots help real estate customers sift through property options and find the most promising one, write code for a developer, and summarize investor transcripts.

• “Shaper” copilots provide recommendations to sales reps for upselling customers by connecting generative AI tools to customer relationship management systems, financial systems, and customer behavior histories; create virtual assistants to personalize treatments for patients; and recommend solutions for maintenance workers based on historical data.

• “Maker” copilots are foundation models that lab scientists at pharmaceutical companies can use to find and test new and better drugs more quickly.

Some industrial companies, for example, have identified maintenance as a critical domain for their business. Reviewing maintenance reports and spending time with workers on the front lines can help determine where a gen AI copilot could make a big difference, such as in identifying issues with equipment failures quickly and early on. A gen AI copilot can also help identify root causes of truck breakdowns and recommend resolutions much more quickly than usual, as well as act as an ongoing source for best practices or standard operating procedures.

The challenge with copilots is figuring out how to generate revenue from increased productivity. In the case of customer service centers, for example, companies can stop recruiting new agents and use attrition to potentially achieve real financial gains. Defining the plans for how to generate revenue from the increased productivity up front, therefore, is crucial to capturing the value.

Upskill the talent you have but be clear about the gen-AI-specific skills you need

By now, most companies have a decent understanding of the technical gen AI skills they need, such as model fine-tuning, vector database administration, prompt engineering, and context engineering. In many cases, these are skills that you can train your existing workforce to develop. Those with existing AI and machine learning (ML) capabilities have a strong head start. Data engineers, for example, can learn multimodal processing and vector database management, MLOps (ML operations) engineers can extend their skills to LLMOps (LLM operations), and data scientists can develop prompt engineering, bias detection, and fine-tuning skills.
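To make skills such as prompt engineering and in-context learning concrete, here is a minimal sketch, assuming a hypothetical call_llm helper that forwards the assembled prompt to whatever approved model the organization uses; the few-shot examples and category labels are purely illustrative.

```python
# Minimal prompt-engineering sketch: a few-shot ("in-context learning") prompt
# that asks a model to classify home-service call notes into a parts category.
# call_llm is a stand-in for the organization's approved model endpoint.

FEW_SHOT_EXAMPLES = [
    ("Water heater pilot light will not stay lit", "ignition assembly"),
    ("Thermostat screen blank after power outage", "thermostat unit"),
]

def build_prompt(note: str) -> str:
    """Assemble instructions, labeled examples, and the new case into one prompt."""
    lines = [
        "You classify home-service call notes into the single most likely part category.",
        "Answer with the category only.",
        "",
    ]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Note: {text}\nCategory: {label}\n")
    lines.append(f"Note: {note}\nCategory:")
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    # Placeholder: in practice this would call the company's approved LLM gateway.
    return "ignition assembly"

if __name__ == "__main__":
    print(call_llm(build_prompt("Furnace clicks repeatedly but never ignites")))
```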
It took one financial-services company three months to train its
best data scientists to a high level of competence. While courses and documentation
are available—many LLM providers have boot camps for developers—we have found
that the most effective way to build capabilities at scale is through apprenticeship,
training people to then train others, and building communities of practitioners. Rotating
experts through teams to train others, scheduling regular sessions for people to share
learnings, and hosting biweekly documentation review sessions are practices that have
proven successful in building communities of practitioners (see sidebar “A sample of new
generative AI skills needed”).
It’s important to bear in mind that successful gen AI skills are about more than coding
proficiency. Our experience in developing our own gen AI platform, Lilli, showed us that
the best gen AI technical talent has design skills to uncover where to focus solutions,
contextual understanding to ensure the most relevant and high-quality answers are
generated, collaboration skills to work well with knowledge experts (to test and validate
answers and develop an appropriate curation approach), strong forensic skills to figure
out causes of breakdowns (is the issue the data, the interpretation of the user’s intent, the
quality of metadata on embeddings, or something else?), and anticipation skills to conceive
of and plan for possible outcomes and to put the right kind of tracking into their code. A
pure coder who doesn’t intrinsically have these skills may not be as useful a team member.
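As an illustration of the forensic and anticipation skills described above, the following sketch, built around hypothetical retriever and model objects, records the intermediate signals a team would need to diagnose whether a bad answer stems from the data, the interpretation of the user's intent, or the metadata on embeddings.

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("genai.trace")

def answer_with_tracking(question: str, retriever, model) -> str:
    """Answer a question while recording the evidence needed for later debugging."""
    trace = {"question": question, "ts": time.time()}

    intent = model.interpret_intent(question)      # hypothetical call
    trace["intent"] = intent

    passages = retriever.search(intent, top_k=5)   # hypothetical call
    trace["passages"] = [
        {"id": p.id, "score": p.score, "metadata": p.metadata} for p in passages
    ]

    answer = model.generate(question, passages)    # hypothetical call
    trace["answer"] = answer

    # One structured log line per request: enough to tell whether a bad answer
    # came from the data, the intent interpretation, or the embedding metadata.
    log.info(json.dumps(trace, default=str))
    return answer
```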
While current upskilling is largely based on a “learn on the job” approach, we see a rapid
market emerging for people who have learned these skills over the past year. That skill
growth is moving quickly. GitHub reported that developers were working on gen AI projects
“in big numbers,” and that 65,000 public gen AI projects were created on its platform in
2023—a jump of almost 250 percent over the previous year. If your company is just starting
its gen AI journey, you could consider hiring two or three senior engineers who have built a
gen AI shaper product for their companies. This could greatly accelerate your efforts.
While developing Lilli, our team had its mind on scale when it created an open plug-in architecture and set standards for how APIs should function and be built. They developed standardized tooling and infrastructure where teams could securely experiment and access a GPT LLM, a gateway with preapproved APIs, and a self-serve developer portal. Our goal is that this approach, over time, can help shift “Lilli as a product” (that a handful of teams use to build specific solutions) to “Lilli as a platform” (that teams across the enterprise can access to build other products).
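The following is a rough sketch of what a gateway with preapproved APIs might look like, assuming a hypothetical internal endpoint, model catalog, and environment variable; it is not Lilli's actual interface, just an illustration of routing every request through one governed entry point.

```python
import os
import requests

GATEWAY_URL = "https://genai-gateway.internal/v1/complete"          # hypothetical endpoint
APPROVED_MODELS = {"general-chat", "code-assist", "doc-summarize"}  # illustrative catalog

def gateway_complete(model: str, prompt: str, team_token: str | None = None) -> str:
    """Send a prompt through the shared gateway rather than to a provider directly,
    so security review, logging, and model approval happen in one place."""
    if model not in APPROVED_MODELS:
        raise ValueError(f"{model!r} is not in the preapproved model catalog")
    resp = requests.post(
        GATEWAY_URL,
        json={"model": model, "prompt": prompt},
        headers={"Authorization": f"Bearer {team_token or os.environ['GATEWAY_TOKEN']}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["completion"]
```

Centralizing calls this way is what lets security review, usage logging, and model approval happen once rather than separately in every team's codebase.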
For teams developing gen AI solutions, squad composition will be similar to AI teams but with data engineers and data scientists with gen AI experience and more contributors from risk management, compliance, and legal functions. The general idea of staffing squads with resources that are federated from the different expertise areas will not change, but the skill composition of a gen-AI-intensive squad will.

Sidebar: A sample of new generative AI skills needed

The following are examples of new skills needed for the successful deployment of generative AI tools:

• data scientist:
– prompt engineering
– in-context learning
– bias detection
– pattern identification
– reinforcement learning from human feedback
– hyperparameter/large language model fine-tuning; transfer learning

• data engineer:
– data wrangling and data warehousing
– data pipeline construction
– multimodal processing
– vector database management

Set up the technology architecture to scale

Building a gen AI model is often relatively straightforward, but making it fully operational at scale is a different matter entirely. We’ve seen engineers build a basic chatbot in a week, but releasing a stable, accurate, and compliant version that scales can take four months. That’s why, our experience shows, the actual model costs may be less than 10 to 15 percent of the total costs of the solution.

Building for scale doesn’t mean building a new technology architecture. But it does mean focusing on a few core decisions that simplify and speed up processes without breaking the bank. Three such decisions stand out:
• Focus on reusing your technology. Reusing code
can increase the development speed of gen AI use
cases by 30 to 50 percent. One good approach is
simply creating a source for approved tools, code,
and components. A financial-services company, for
example, created a library of production-grade tools that had been approved by both the
security and legal teams and made it available for teams to use. More important is taking
the time to identify and
build those capabilities that are common across the
most priority use cases. The same financial-services
company, for example, identified three components that
could be reused for more than 100 identified use cases.
By building those first, they were able to generate a
significant portion of the code base for all the identified
use cases—essentially giving every application a big
head start.
• Focus the architecture on enabling efficient connections between gen AI models
and internal systems. For gen AI models to work effectively in the shaper archetype,
they need access to a business’s data and applications. Advances in integration and
orchestration frameworks have significantly reduced the effort required to make
those connections. But laying out what those integrations are and how to enable
them is critical to ensure these models work efficiently and to avoid the complexity
that creates technical debt (the “tax” a company pays in terms of time and resources
needed to redress existing technology issues). Chief information officers and chief
technology officers can define reference architectures and integration standards for
their organizations. Key elements should include a model hub, which contains trained
and approved models that can be provisioned on demand; standard APIs that act as
bridges connecting gen AI models to applications or data; and context management
and caching, which speed up processing by providing models with relevant information
from enterprise data sources.
• Build up your testing and quality assurance capabilities. Our own experience building
Lilli taught us to prioritize testing over development. Our team invested in not only
developing testing protocols for each stage of development but also aligning the entire
team so that, for example, it was clear who specifically needed to sign off on each stage
of the process. This slowed down initial development but sped up the overall delivery
pace and quality by cutting back on errors and the time needed to fix mistakes.
• Be targeted in ramping up your data quality and data augmentation efforts. While
data quality has always been an important issue, the scale and scope of data that gen
AI models can use—especially unstructured data—has made this issue much more
consequential. For this reason, it’s critical to get the data foundations right, from
clarifying decision rights to defining clear data processes to establishing taxonomies
so models can access the data they need. The companies that do this well tie their
data quality and augmentation efforts to the specific AI/gen AI application and use
case—you don’t need this data foundation to extend to every corner of the enterprise.
This could mean, for example, developing a new data repository for all equipment
specifications and reported issues to better support maintenance copilot applications.
• Understand what value is locked into your unstructured data. Most organizations have
traditionally focused their data efforts on structured data (values that can be organized
in tables, such as prices and features). But the real value from LLMs comes from their
ability to work with unstructured data (for example, PowerPoint slides, videos, and
text). Companies can map out which unstructured data sources are most valuable and
establish metadata tagging standards so models can process the data and teams can
find what they need (tagging is particularly important to help companies remove data
from models as well, if necessary). Be creative in thinking about data opportunities.
Some companies, for example, are interviewing senior employees as they retire
and feeding that captured institutional knowledge into an LLM to help improve their
copilot performance.
• Optimize to lower costs at scale. There is often as much as a tenfold difference
between what companies pay for data and what they could be paying if they optimized
their data infrastructure and underlying costs. This issue often stems from companies
scaling their proofs of concept without optimizing their data approach. Two costs
generally stand out. One is storage costs arising from companies uploading terabytes
of data into the cloud and wanting that data available 24/7. In practice, companies
rarely need more than 10 percent of their data to have that level of availability, and
accessing the rest over a 24- or 48-hour period is a much cheaper option. The other
costs relate to computation with models that require on-call access to thousands of
processors to run. This is especially the case when companies are building their own
models (the maker archetype) but also when they are using pretrained models and
running them with their own data and use cases (the shaper archetype). Companies
could take a close look at how they can optimize computation costs on cloud platforms—
for instance, putting some models in a queue to run when processors aren’t being used
(such as when Americans go to bed and consumption of computing services like Netflix
decreases) is a much cheaper option.
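As a simple illustration of the queuing idea in that last point, here is a minimal sketch that holds non-urgent batch inference jobs until an assumed off-peak window; the window hours and the run_batch_job helper are placeholders for whatever scheduler and cloud pricing actually apply.

```python
import datetime
import queue
import time

OFF_PEAK_START, OFF_PEAK_END = 1, 5   # assumed cheap window, 01:00-05:00 local time

def in_off_peak(now: datetime.datetime | None = None) -> bool:
    """Return True while compute is assumed to be cheap."""
    hour = (now or datetime.datetime.now()).hour
    return OFF_PEAK_START <= hour < OFF_PEAK_END

def run_batch_job(job: str) -> None:
    # Placeholder for submitting the job to the actual compute cluster.
    print(f"running {job}")

def drain_when_cheap(jobs: "queue.Queue[str]", poll_seconds: int = 600) -> None:
    """Hold non-urgent inference jobs until the off-peak window, then run them."""
    while not jobs.empty():
        if in_off_peak():
            run_batch_job(jobs.get())
        else:
            time.sleep(poll_seconds)   # wait for processors to free up overnight
```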
One insurance company, for example, created a gen AI tool to help manage claims. As
part of the tool, it listed all the guardrails that had been put in place, and for each answer
provided a link to the sentence or page of the relevant policy documents. The company
also used an LLM to generate many variations of the same question to ensure answer
consistency. These steps, among others, were critical to helping end users build trust in
the tool.
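A minimal sketch of that consistency check might look like the following, where the paraphrase and ask_claims_tool helpers are hypothetical stand-ins for the insurer's actual components: generate several rewordings of a question, collect the tool's answers, and measure how often they agree.

```python
from collections import Counter

def paraphrase(question: str, n: int = 5) -> list[str]:
    # Placeholder: in practice an LLM would generate n rewordings of the question.
    return [question] * n

def ask_claims_tool(question: str) -> str:
    # Placeholder for the gen AI claims tool, which also returns policy citations.
    return "Covered under section 4.2"

def consistency_report(question: str) -> dict:
    """Ask the same question several ways and measure how often answers agree."""
    answers = [ask_claims_tool(q) for q in paraphrase(question)]
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    return {
        "question": question,
        "top_answer": top_answer,
        "agreement": top_count / len(answers),   # 1.0 means fully consistent
        "variants_tested": len(answers),
    }

if __name__ == "__main__":
    print(consistency_report("Is water damage from a burst pipe covered?"))
```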
Part of the training for maintenance teams using a gen AI tool should be to help them
understand the limitations of models and how best to get the right answers. That includes
teaching workers strategies to get to the best answer as fast as possible by starting with
broad questions and then narrowing them down. This provides the model with more context,
and it also helps counter the bias of people who think they already know the answer.
Having model interfaces that look and feel the same as existing tools also helps
users feel less pressured to learn something new each time a new application is introduced.
Getting to scale means that businesses will need to stop building one-off solutions that
are hard to use for other similar use cases. One global energy and materials company, for
example, has established ease of reuse as a key requirement for all gen AI models, and
has found in early iterations that 50 to 60 percent of its components can be reused. This
means setting standards for developing gen AI assets (for example, prompts and context)
that can be easily reused for other cases.
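One way to make such assets reusable, sketched below under the assumption of a central, versioned prompt catalog (all names are hypothetical), is to treat prompts and context as shared components that teams pull rather than rewrite.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PromptAsset:
    """A reviewed, versioned prompt that any team can pull instead of rewriting it."""
    name: str
    version: str
    template: str

    def render(self, **kwargs: str) -> str:
        return self.template.format(**kwargs)

# Illustrative shared catalog; in practice this would live in a central repository.
CATALOG = {
    "maintenance/root-cause": PromptAsset(
        name="maintenance/root-cause",
        version="1.2.0",
        template=(
            "You are assisting a field technician.\n"
            "Equipment history:\n{history}\n"
            "Reported issue: {issue}\n"
            "List the three most likely root causes with one sentence of evidence each."
        ),
    ),
}

prompt = CATALOG["maintenance/root-cause"].render(
    history="Compressor replaced 2022; error E14 twice last month",
    issue="Unit shuts off after ten minutes",
)
```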
While many of the risk issues relating to gen AI are evolutions of discussions that were
already brewing—for instance, data privacy, security, bias risk, job displacement, and
intellectual property protection—gen AI has greatly expanded that risk landscape.
Just 21 percent of companies reporting AI adoption say they have established policies
governing employees’ use of gen AI technologies.
In some ways, this article is premature—so much is changing that we’ll likely have a profoundly
different understanding of gen AI and its capabilities in a year’s time. But the core truths
of finding value and driving change will still apply. How well companies have learned those
lessons may largely determine how successful they’ll be in capturing that value.
Eric Lamarre is a senior partner in McKinsey’s Boston office, Alex Singla is a senior partner in the
Chicago office, Alexander Sukharevsky is a senior partner in the London office, and Rodney Zemmel is
a senior partner in the New York office.
The authors wish to thank Michael Chui, Juan Couto, Ben Ellencweig, Josh Gartner, Bryce Hall, Holger
Harreis, Phil Hudelson, Suzana Iacob, Sid Kamath, Neerav Kingsland, Kitti Lakner, Robert Levin, Matej
Macak, Lapo Mori, Alex Peluffo, Aldo Rosales, Erik Roth, Abdul Wahab Shaikh, and Stephen Xu for their
contributions to this article.