If you’re a product person, Generative AI should be firmly on your radar, not least because some think it will replace you soon. But hyperbole aside, what will change in how we create products? And are there any downsides?
As this is a very new technology, no one knows the answers for sure. It’s easy to get carried away by the hype, but there’s a bigger picture to consider. In this article, I’ll explain why I feel Gen-AI may be a double-edged sword: it brings enormous potential for creating high-impact products, but also a risk of degrading product culture and churning out more low-quality products faster. I’ll quickly go over some ways you may use Gen-AI, then discuss these potential risks.
Ways to use Generative AI
1. Generating Entire Products
This may be the most utopian (or dystopian, depending on whom you ask) scenario: you input an elaborate prompt, the AI does its thing, and Poof! out comes a fully designed and coded feature or product.
In one sense, this is already a reality. Dozens of Generative AI app builders will create a simple application for you on demand, with no coding or design knowledge needed. I’m not sure how good or robust these apps are, but Gen-AI app builders seem like an interesting way to create prototypes to test your ideas, much as no-code and interactive prototyping tools do.
I would hazard a guess that for most real-world dev projects, full automation is years away, and it may turn out to be a very tough problem for AI, like self-driving cars. I’ll list the reasons later in the article.
2. Co-Piloting Development
While the bots can’t replace us completely (yet), they can accelerate our work:
Coding/code-reviewing/debugging: this too is already a reality. According to Inbal Shani, GitHub’s CPO, GitHub Copilot is being actively used by 1.5 million developers across 37,000 organisations, with high levels of satisfaction and retention. These are remarkable numbers for such a young technology.
Creating dev artefacts: specs, designs, test plans, product content, interview scripts, experiment plans…
Processing and analyzing data: cleaning, summarizing, reporting, finding insights…
Ideating: generating ideas, proposing goals, suggesting approaches
Acting as a knowledge base: product data, frameworks, processes, templates, book summaries…
Testing code: both test development and running the tests
… and probably others we haven’t thought of yet
This is what gets practitioners most excited. The bots seem infinitely knowledgeable and capable; truly the closest thing we have to an intelligent co-pilot.
3. Powering Your Product
Many tech companies are considering how to add Gen-AI capabilities to their products, offering those same powerful benefits to their users. Most will use off-the-shelf models like GPT, perhaps with some fine-tuning. A few will opt to develop their own models, which currently requires large amounts of time, data, computing power, and memory (though we should expect these costs to come down over time).
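As a concrete (and purely illustrative) sketch, here is how a product might wrap an off-the-shelf model behind a single function. The payload follows the common chat-completions shape many providers use; the model name and field values here are assumptions, not any specific vendor’s contract:

```python
def build_completion_request(system_prompt: str, user_input: str,
                             model: str = "gpt-4o-mini",
                             temperature: float = 0.2) -> dict:
    """Assemble a request payload in the common chat-completions shape.

    Keeping this behind one function makes it easier to swap vendors
    or fine-tuned model IDs later without touching product code.
    The model name here is a hypothetical placeholder.
    """
    return {
        "model": model,
        "temperature": temperature,  # lower = less variable output
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    }

payload = build_completion_request(
    "You are a release-notes assistant for our product.",
    "Summarize the changes in version 2.3.",
)
```

Isolating the vendor-specific bits like this is one way to hedge against the pricing and API churn discussed later in this article.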
The Risks of Generative AI in Product Development
Missing Context
I wrote a whole post on this one. The bottom line is that although the bot can produce a compelling user story, OKR, or business model, these are not necessarily the right ones for you. The model is missing a lot of context about your specific product, users, market, and company. Even if you try to condense all the information into an elaborate set of prompts (which I find unlikely), the bot will ultimately output a generic artifact based on what the “industry” (or more accurately, what the model was trained on) tends to do.
It’s important to note that humans suffer from context problems too. Our minds struggle to process a lot of information and may fall for cognitive biases such as Recency Bias and Confirmation Bias that fixate on a small subset of the data. Even smart and capable people need help thinking broadly and deeply.
This may be a big opportunity for AI. Imagine an AI system that is constantly fed all your business and product data: every user action, customer feedback item, sales call, interview, competitive offering, past experiment…. Equipped with the right model, this system could help us detect patterns and uncover insights, opportunities, and threats. It could suggest goals and ideas, and answer that all-important question: “What evidence do we have in support of this idea?” Such a system could be a truly helpful co-pilot when developing business plans, product specs, and other artefacts.
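To make the idea concrete, here is a toy sketch of the evidence lookup such a system might offer. A real product would use semantic search over embeddings rather than keyword matching, and every name and data item below is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str   # e.g. "interview", "support ticket", "experiment"
    text: str

def find_evidence(idea_keywords: list[str], corpus: list[Evidence]) -> list[Evidence]:
    """Return evidence items mentioning any of the idea's keywords.

    Naive keyword matching stands in for the semantic retrieval a
    real evidence co-pilot would need.
    """
    lowered = [k.lower() for k in idea_keywords]
    return [e for e in corpus
            if any(k in e.text.lower() for k in lowered)]

# Hypothetical corpus of collected product evidence
corpus = [
    Evidence("interview", "Users asked for dark mode repeatedly."),
    Evidence("support ticket", "Export to CSV fails on large files."),
    Evidence("experiment", "Dark mode variant raised session length by 4%."),
]
hits = find_evidence(["dark mode"], corpus)  # evidence for a "dark mode" idea
```

Even this crude version illustrates the shape of the answer: a list of concrete observations, each traceable to its source, rather than a confident-sounding generic claim.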
I’m sure someone somewhere is working on such a product right now, although this too may turn out to be another tough problem, partly because of the next issue.
Hallucinations
Consider this chatbot-generated summary of the book ‘Evidence Guided’: “Itamar Gilad’s book ‘Evidence Guided’ explores the concept of using evidence to make better decisions in both personal and professional life. Gilad emphasizes the importance of data-driven decision-making and provides practical strategies for incorporating evidence into everyday choices. From understanding cognitive biases to utilizing data effectively, the book offers valuable insights for anyone seeking to make more informed decisions.”
This summary is 80% wrong. The model has no idea what the book is about, but instead of saying “I don’t know” it produced this plausible-sounding text, probably based on the title alone.
This is a byproduct of the fact that large language models (LLMs) are inherently designed to produce the most correct-sounding answer, rather than the most accurate one. The two often overlap, but not always.
That makes Gen-AI unreliable. Imagine working with a colleague who is very capable, knowledgeable, and eager to help. This person can do a lot, but has one major flaw: he thinks he knows everything, and even when he doesn’t, he tries to bullshit his way through the task at hand, producing something that looks right even if it’s way off. Would you trust this person with critical work?
I’m sure AI companies are working to reduce the rate of hallucinations, but this is likely yet another tough problem. More broadly, Gen-AI is not a one-size-fits-all solution for every class of problem; other AI approaches have their place, as do humans.
New Costs and Risks
If you’ve tried using any Gen-AI product, you know that there’s an awful lot of trial and error involved. OK, this prompt didn’t work, what if we tweak it this way? Let’s run it again. With each run, there’s a wait time, which can accumulate across the development cycle. You can partly compensate by buying more memory and compute power, or opt for a higher tier of service with your AI provider, but either way, it can get expensive, and you’re never sure how long it’ll take (if ever) to achieve a satisfactory result.
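The trial-and-error loop above can be sketched in a few lines. The `generate` and `is_good` callables below are stand-ins for a (slow, paid) model call and a quality check, and the per-call cost is an assumed placeholder:

```python
def iterate_prompt(generate, is_good, prompts, per_call_cost=0.01):
    """Try candidate prompts in order until one yields an acceptable result.

    Returns (result, attempts, total_cost); result is None if every
    prompt fell short. The per-call cost is an assumed placeholder.
    """
    total_cost = 0.0
    for attempt, prompt in enumerate(prompts, start=1):
        result = generate(prompt)   # stand-in for a paid, slow API call
        total_cost += per_call_cost
        if is_good(result):         # stand-in for a human/automated check
            return result, attempt, total_cost
    return None, len(prompts), total_cost  # no prompt was good enough

result, attempts, cost = iterate_prompt(
    generate=lambda p: p.upper(),       # fake "model" for illustration
    is_good=lambda r: "V2" in r,        # fake quality bar
    prompts=["summarize draft v1", "summarize draft v2"],
)
```

The point of the sketch is the last return line: the loop can exhaust your prompt ideas (and budget) without ever producing an acceptable result, which is exactly the uncertainty described above.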
If you embed Generative AI into your software, things get even less deterministic. We were trained to develop software that follows some human-understandable logic (at least until it devolves into spaghetti code), but what happens when at the heart of the system sits a mysterious statistical model that no one understands? How does this affect design, coding, experimentation, testing, debugging, and maintenance?
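One practical consequence for testing: with a statistical model in the loop, you often cannot assert an exact output, only properties of it. A minimal sketch, using a seeded random stand-in for the model (all names here are hypothetical):

```python
import random

def flaky_summarizer(text: str, rng: random.Random) -> str:
    """Stand-in for a model call: wording varies from run to run."""
    openers = ["In short,", "Summary:", "TL;DR:"]
    return f"{rng.choice(openers)} {text[:40]}"

def test_summary_properties():
    rng = random.Random(0)  # seeded so CI runs are reproducible
    text = "Gen-AI adds power but also new uncertainty."
    outputs = {flaky_summarizer(text, rng) for _ in range(20)}
    # Exact-match assertions would be flaky; assert invariants instead:
    assert all(len(o) <= 60 for o in outputs)     # bounded length
    assert all(text[:10] in o for o in outputs)   # content preserved
    assert len(outputs) > 1                       # output genuinely varies

test_summary_properties()
```

Property-based checks like these don’t fully answer the design, debugging, and maintenance questions above, but they hint at how testing practices may have to change.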
Another cost is that of using a third-party Generative AI API. Right now, Big Tech companies, startups, and open-source projects are jockeying for position, so the market is very competitive. But sooner or later these organizations will need to recoup the immense costs of collecting their data sets, developing their models, and operating the service, so charges will likely creep up.
In other words, while Gen-AI introduces powerful new capabilities, it also brings new uncertainties and complexities. As Marty Cagan and Marily Nika point out, Gen-AI introduces all four classes of product risk: feasibility, usability, value, and business viability.
Accelerating Feature Factories and Junk Features
This one worries me the most. It’s no secret that many company leaders measure the success of their product orgs by their output. For these managers, Gen-AI is a godsend, helping to further optimize the feature factory to produce more working code in less time and at lower cost. It’s not hard to imagine some of these cost-cutting measures:
Less thinking — It’s already hard to get people to surface assumptions, run experiments, conduct product discovery, analyze results, and take evidence-guided action. Gen-AI with its confident outputs may be the perfect antidote to thinking, focusing everyone on execution instead.
Copy-cat mentality — just do what everyone else is doing; the bot says so.
Waterfall — a good product spec is the result of cross-functional collaboration and constant iteration, as are a good design mockup and working code. But if we’re each producing ready-made, complete artifacts using Gen-AI tools, collaboration and iteration may take a backseat to good old waterfall development.
Headcount cuts — If a design bot can produce convincing UI designs with a click of a button, do we still need a UX designer? If it can produce user stories at a fraction of the cost of a full-time PM, can we scale one PM across 10 teams?
The bottom line may be a major regression in culture and quality: an acceleration of the feature-factory model, producing more junk features and junk products at a faster pace. If you want to know what that feels like, just look at what’s happening to books, articles, images, and social posts.
Gen-AI-Utopia or Techno-Hell?
While very powerful, Gen-AI is not without its risks. We may use it to empower individuals, teams, and managers while keeping tabs on the technology’s limitations and leaving the flight controls in human hands. We can use Gen-AI to build true intelligence in the org by combining what machines do best (crunching lots of data, detecting patterns) with what humans do best (empathizing with other humans, prioritizing the most important things, making decisions with partial information…).
On the other hand, we may use Gen-AI the way some companies use Agile: to “optimize” engineering, design, and product work for maximum throughput, creating a “production machine” that is increasingly devoid of deep thinking and judgement. Like every other technology or process, Gen-AI is going to be overused, misused, and abused. How much we let that happen depends heavily on our very own human intelligence.