Hi, this is Wayne again with a topic “Generative AI Impact on Commerce: Kate Kellogg”.
You did that perfectly, Dave, and we are starting right on time with our three faculty speakers, the first of whom is Kate Kellogg. Kate, welcome. I’m delighted to be here today to talk to you about the impact of generative AI on knowledge workers, and I’m going to be talking about an experiment that I did with some colleagues from Harvard Business School, Wharton, Warwick Business School, and some colleagues from Boston Consulting Group. The big picture is that generative AI has the potential to provide tremendous gains in performance for knowledge workers, but it raises three key challenges that we found in our experiment: the jagged frontier of AI capabilities, the tricky problem of creativity, and generative AI as a skill leveler. Organizational leaders need to address these challenges in order to facilitate effective implementation of generative AI in their organizations. Let me tell you about the experiment.
We did the experiment with 758 BCG consultants and we randomized them into two groups. One of the groups we gave an idea generation and business writing task, which was inside the frontier of generative AI capabilities, and the second group we gave a problem-solving task that was specifically designed to be outside the frontier of generative AI capabilities.
Within each group there were three conditions: some people had no access to AI, some people had access to GPT-4, and some people we gave a brief prompt engineering overview before the experiment. We tested this on different outcomes, and the first finding is that within the frontier, generative AI can provide tremendous gains in productivity, efficiency, and the quality of work. Here’s the inside-the-frontier task that we gave the consultants. We said: you’re working for a footwear company; generate ideas for a new shoe; pick the best idea and explain why; describe a potential prototype shoe in vivid detail; come up with a list of steps. So we gave them a series of subtasks, and what we found is that inside the frontier, the people who were given access to generative AI performed much better.
They accomplished 12.5% more work, 26% faster, with 40% higher quality. This held across every subtask and every regression, whether they were graded by human graders or by GPT-4. So, although we found these great gains in performance, one of the challenges we found is what we call this jagged frontier of capabilities. For the outside-the-frontier task, which we carefully designed, anyone who has ever done a consulting interview will recognize this kind of problem. We said the CEO wants to understand performance by the company’s three brands: men’s, women’s, and kids’, and the CEO has to pick one of these brands. We’re going to give you interviews from company insiders and an Excel spreadsheet with financial data broken down by the brands, and we want you to tell us which brand the CEO should pick to focus on, give a rationale for the choice, and also suggest innovative and tactical actions the CEO can take.
Here, the people with AI did worse. The consultants without AI got the brand correct 85% of the time, versus 71% of the time with AI. What happened is that the people who were given AI took GPT-4’s misleading output at face value and performed worse. Perhaps as interestingly, our brief training backfired: the people we gave the brief upfront training got the answer correct only 60% of the time. We are unpacking these results right now, but we’re wondering whether this was because people were overconfident when they had this initial upfront training.
We also found that generative AI can be highly convincing even when incorrect. The graph on the left here shows that the people who got the brand recommendation correct increased the quality of their recommendation when they had access to AI, but the people on the right who got the incorrect recommendation were also rated as having high-quality recommendations. So even when judged by human graders, generative AI can be deceptively convincing if you don’t know where the frontier lies. In summary, generative AI can boost productivity and quality inside the frontier but be counterproductive outside the frontier.
What this means for organizational leaders is that, if they’re going to introduce AI in their organizations, they need to develop solutions to address this problem. In our experiment we didn’t test solutions, but we have experience from studying predictive AI over the last number of years to suggest some solutions that could be tested in future research. For this issue of differential effects inside and outside the frontier, what leaders can test is agreeing on the highest-value use cases for generative AI in their organizations and guiding knowledge workers to use generative AI for those use cases and not others. They can also create a center of excellence to improve the accuracy of AI with their own software layer on top of the public LLM. Another issue this raises is that AI use could create problems for downstream stakeholders; in the case of consultants that could be clients, and in the medical setting this could be doctors using generative AI and affecting patients. So leaders need to conduct preemptive risk analysis for these different stakeholder groups and establish AI governance standards to guide AI use. The second issue we surfaced in our experiment is what we call the tricky problem of creativity.
Here was the inside-the-frontier task again, and, as you can see, we asked people to be creative as they came up with these ideas for the new shoe. Indeed, we found that people with GPT-4 had higher answer quality: they had higher creativity when they used GPT-4 and when they accepted more of the GPT-4 output in their answer. So for individuals, the use of GPT-4 allowed them to be more creative, but we also found that as a collective, it reduced creativity: generative AI alone, and humans with generative AI, produced more similar ideas than humans alone did. What this suggests is that while generative AI can increase idea quality for individuals, it can lead to collective idea convergence, which can be problematic for organizations. So what can organizations do about this? They can test whether the use of multiple LLMs helps to increase both the quality and variability of ideas, and they can identify human-AI practices that increase both quality and variability.
One thing we’re looking at now is that, for everybody in the experiment who had access to generative AI, we have detailed logs of how they interacted with it, and so right now we’re looking to see if there are certain practices that were associated with both increased quality and greater variability. Finally, we found the challenge of generative AI as a skill leveler: consultants who were below-average performers on the initial assessment test increased their performance by 43% with generative AI, versus those above average, who increased their performance by only 17%.
This suggests that lower-skilled workers benefited from generative AI use more than higher-skilled workers, and other researchers are beginning to find this as well. For organizational leaders, this suggests there is going to be a need for reskilling and role reconfiguration inside organizations. For this issue of allowing lower-skilled workers to operate at a higher level, what leaders can do is assign lower-skilled workers to tasks they can perform with AI and train them to use AI effectively. For example, you can imagine in a medical setting that, with the use of AI, medical assistants can now do things that doctors used to need to do, and so you would need to train them well to use generative AI for those tasks.
But what it also means is that now they’re going to be doing tasks that higher-skilled workers were doing before, so you need some kind of role reconfiguration. In a past study that I did with predictive AI, I found that something called experimentalist governance can work well, which is essentially running many local experiments where teams experiment with using AI and reconfiguring their roles, and then also having a central review team, composed of workers from each position within the team, who review the results, remove local roadblocks, and select the best solutions for scaling across the organization. I’ve talked about a lot of things today, but the most important thing to take away from the presentation is that even for highly skilled knowledge workers, generative AI can yield tremendous gains in performance, but it can also raise particular challenges, and leaders will need to put in place solutions to these challenges in order for knowledge workers to effectively use AI. Thank you. [Applause] We don’t have a lot of time for Q…