{"id":2769,"date":"2026-02-16T12:24:09","date_gmt":"2026-02-16T11:24:09","guid":{"rendered":"https:\/\/www.innotivum.com\/?page_id=2769"},"modified":"2026-03-05T12:03:16","modified_gmt":"2026-03-05T11:03:16","slug":"spm-blog-43-human-judgment-required-status-of-ai","status":"publish","type":"page","link":"https:\/\/www.innotivum.com\/de\/publications\/spm-blog\/spm-blog-43-human-judgment-required-status-of-ai\/","title":{"rendered":"SPM Blog 43: Human Judgment Required \u2013 The Status of AI and Its Implications for Product Managers (and everybody else)"},"content":{"rendered":"

[vc_row][vc_column el_class=“inno–head-icon“][vc_icon icon_fontawesome=“fa fa-info-circle“ el_class=“inno–icon“][vc_custom_heading text=“SPM Blog 43: Human Judgment Required \u2013 The Status of AI and Its Implications for Product Managers (and everybody else)“ font_container=“tag:h1|text_align:left“ use_theme_fonts=“yes“][\/vc_column][\/vc_row][vc_row][vc_column][vc_single_image image=“2771″ img_size=“600×400″ alignment=“center“]\r\n

\r\n \r\n
AI tools based on Large Language Models (LLMs) have become ubiquitous over the last three years. While initial use tended to be naive and exploratory, use in professional environments is becoming standard and can help product managers a lot. In this post, we look at recent research results on AI, ideas on AI as a game changer for industries, severe limitations of LLM-based AI products, and what this means for the practice and education of product managers. In summary: use them responsibly, but don't trust them. Human judgment is needed more than ever.

Since OpenAI launched ChatGPT in November 2022, large language models (LLMs) have become almost synonymous with the term AI in the public eye. Since then, a number of additional LLM-based tools like Gemini, Copilot, Claude, Perplexity etc. have been launched, with new versions appearing quite frequently, some of them free, some at a price. We all have experimented with them, and most people find them useful. The outputs often seem surprisingly good, but we are biased: when we ask a question via prompt, it is usually about a subject that we do not know much about, so if the resulting output looks plausible, we consider it useful and assume it is correct. Subject matter experts are frequently more critical. The quality of the outputs depends on the quality of the data used to train the LLM.

We are seeing significant productivity increases with LLMs in a number of areas like software development, text creation and improvement, and many more. Vendors continue to be super enthusiastic and invest enormous amounts of money, in particular into infrastructure, without having a proven business model. Market valuations are also enormous. However, B2B customers are experiencing high failure rates of AI projects and unconvincing ROIs.

Gartner positioned Generative AI (LLMs are part of this category) as heading toward the “trough of disillusionment” in their latest Gartner AI Hype Cycle (Gartner, June 2025). In other words, in spite of all the hype, we have learned that LLMs do not solve all the problems of the world. These learnings are reflected in quite a number of recent publications, both from researchers and practitioners.

ISPMA Fellow Sangeet Paul Choudary published a book in August 2025 entitled “Reshuffle”. He claims that AI is not just a productivity tool: its real power comes from structural changes in companies, or even in whole industries, that are based on AI's coordination capabilities. There are no implementations of this vision yet, but when you look at a tool that was launched last November under the name Clawdbot and is now called OpenClaw (acquired by OpenAI in February 2026), you can already envision what this looks like. OpenClaw is designed as a personal assistant that runs on your PC or laptop. You give all access rights to OpenClaw, specify which LLMs you want it to use, and then OpenClaw acts on your behalf. It manages all your messages through all channels, answers messages, pays invoices if you give it access to your account or credit card, and installs software, all of that autonomously. It is quite amazing to experience this (for security reasons, better in a sandbox), and you can easily extrapolate the experience with this tool into a corporate environment like the one Choudary describes. Whether you really want to give so much control over your life to such a software program is a question you need to answer for yourself.
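
As a thought experiment on what such a sandbox could enforce, here is a minimal sketch of a deny-by-default capability gate in Python. The capability names and functions are illustrative assumptions for this post, not OpenClaw's actual interface.

```python
# Illustrative sketch only: a deny-by-default capability gate for an
# autonomous assistant. The capability names are assumptions for this
# example, not OpenClaw's actual API.

ALLOWED = {"read_messages", "draft_replies"}  # deliberately excludes
                                              # "pay_invoice" and "install_software"

def may_perform(capability: str) -> bool:
    """Grant only explicitly allowed capabilities; deny everything else."""
    return capability in ALLOWED

if __name__ == "__main__":
    for cap in ("read_messages", "pay_invoice", "install_software"):
        print(f"{cap}: {'allowed' if may_perform(cap) else 'denied'}")
```

The point of the design is that the assistant starts with nothing and receives narrowly scoped rights, rather than receiving "all access rights" and being trusted to behave.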

One thing we have learned is that LLMs hallucinate, i.e. produce outputs that are plausible but factually incorrect, inconsistent, or outright fabricated. In my software life, that is close to the definition of a defect, a bug that needs to get fixed. “Hallucination” sounds much better, more human, more intelligent. Why was a new term needed? That question is answered by Sourav Banerjee, Ayushi Agarwal & Saloni Singla (2025) and Michał P. Karpowicz (2025). They provide mathematical proof that hallucinations are inherent to LLM technology. It is not a bug, it is a feature: unwanted, but not fixable, even with the best training data.

You may say, well, this is just theory. As long as hallucination rates are lower than human error rates, there is no problem. So we need to look at numbers. On the internet you can find benchmark results for hallucination rates. One is published by Vectara on GitHub. Their benchmark is fairly simple: it is just about summarizing a short text document. Even for this simple case, they come up with hallucination rates between 4 and 8% for the top 25 LLMs. These numbers have not improved with new versions of the LLMs. For more complex benchmarks, you can find much higher hallucination rates, in the range of 40 to 70 percent, clearly higher than human error rates.
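
To make concrete what such a benchmark measures, here is a minimal sketch of the computation, assuming a hypothetical `summarize` function (the LLM under test) and a hypothetical `judge_consistent` function (a factual-consistency checker). Vectara's actual pipeline uses its own evaluation model and dataset; this is only the shape of the metric.

```python
# Sketch of a summarization hallucination benchmark in the spirit of
# Vectara's leaderboard. `summarize` (the LLM under test) and
# `judge_consistent` (a factual-consistency checker) are hypothetical
# placeholders, not Vectara's actual code.

from typing import Callable, Iterable

def hallucination_rate(
    documents: Iterable[str],
    summarize: Callable[[str], str],
    judge_consistent: Callable[[str, str], bool],
) -> float:
    """Fraction of summaries that contain claims unsupported by the source."""
    docs = list(documents)
    flagged = sum(1 for doc in docs if not judge_consistent(doc, summarize(doc)))
    return flagged / len(docs)
```

A rate of 4 to 8% on this metric means that roughly one in every 12 to 25 summaries of a short, fully provided source text still contains fabricated content.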

Recommendation: given the enormous productivity boost that you can get from these tools, use them, but use them responsibly. That means making sure that the tool and its underlying data are a good fit for the problem or question you are facing. And don't trust the outputs: this technology is inherently error-prone. Human judgment is required, in particular when you apply these tools in critical application areas like healthcare, finance, or legal. Nevertheless, LLM-based AI tools can help product managers with most of their tasks.

If you are a product manager of an LLM-based product, you need to be aware of the limitations of the technology and the risks. And you need to take the customers' perspective into account. B2B customers are increasingly aware of the risks that come with AI technology. The large global insurance group Allianz has just published its “Risk Barometer 2026”, which is based on a global survey. In B2B customers' view, AI has moved up from #10 to #2 in the list of the most critical risk areas (2026 compared to 2025), with #1 being cybersecurity risks. For AI, customers see operational risks like business interruption, in particular errors cascading through automated workflows. They see legal and compliance risks, in particular liability for harmful outcomes. And they see reputational risks, in particular through unethical AI use. If you want your product to be successful, you need to address these customer concerns proactively.

The risk aspects become even more important when the product is intended to be an autonomous AI Agent. For critical application areas like healthcare, finance, legal, or the military, you had better reduce the level of autonomy and keep the human in the loop for decision-making.
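
At the code level, "human in the loop" can be as simple as the following sketch, which assumes a hypothetical `Action` type with a risk label supplied by an upstream policy check. Real agent frameworks differ; this is an illustration of the pattern, not a reference implementation.

```python
# Minimal human-in-the-loop sketch: low-risk actions run autonomously,
# high-risk ones (payments, medical, legal, military) require explicit
# human approval. Action and its risk label are assumptions for this
# illustration, not a specific framework's API.

from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class Action:
    description: str
    risk: str  # "low" or "high", assumed to come from an upstream policy check

def execute_with_oversight(
    action: Action,
    run: Callable[[Action], Any],
    approve: Callable[[str], bool],
) -> Optional[Any]:
    """Execute the action, requiring human approval when the risk is high."""
    if action.risk == "high" and not approve(action.description):
        return None  # rejected: the agent must re-plan or stop
    return run(action)
```

The design choice is that autonomy is the exception granted per action, not the default, which is exactly what reducing the level of autonomy means in practice.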

As a product manager, you also need to influence and monitor what software developers are doing. LLMs can give software development a significant productivity increase. One approach is called Vibe Coding: software code is generated from natural language descriptions. ISPMA Fellow Frédéric Pattyn has just published the book “The Vibe Coding Trap”. In it he states that product managers can succeed when they treat Vibe Coding as a learning amplifier, but not as a delivery guarantee. With the increased speed of development, decision-making becomes the bottleneck. Teams can gain remarkable improvements in time to value, but they pay with significant issues regarding correctness, completeness, maintainability, and understandability. The price for the amazing speed can easily be significant technical debt that backfires later. The book gives practical advice on how to deal with these challenges.

In November 2025, Andrew Ng, professor of computer science at Stanford and former head of AI at Google, gave career advice in AI to his computer science students. He stated that the ratio between product managers and software engineers is changing: historically it was 1 to 8; it is currently moving to 1 to 4, with the prospect of 1 to 1. So product management becomes the bottleneck. If we manage to teach software engineers product management, maybe only one person will be needed in the future. But Ng himself reported that he failed at Google when he tried to turn software engineers into product managers.

Product management education needs to focus on the use of LLM-based AI, in particular the applicability of an AI tool and its underlying data to a question or problem, and on judgment, i.e. the ability of the product manager to decide whether the output of the tool is reasonable, realistic, and useful. The second focus area is the management of LLM-based products, in particular the business perspective, LLM technology, data science, and risk assessment and management.

Globally, companies have significantly reduced hiring juniors over the last two years, under the assumption that junior tasks will be done by AI as a team member in the future, with the required judgment coming from senior experts. This is obviously not a sustainable approach: where should future senior experts come from if companies do not hire juniors today? My recommendation is: start hiring juniors again and develop them into knowledgeable and responsible users of AI tools. Make them work with senior experts on the judgment part so that they can learn. And establish governance rules for the company that help reduce AI-induced risks.

To summarize: LLM-based AI products can generate significant productivity increases, but they come with severe limitations and risks. The fundamental question is: is LLM technology, with its described limitations, capable of carrying the implementation of Sangeet Paul Choudary's vision? Silicon Valley is betting on it – big time. I doubt it. That is why I am saying: the human being is needed in this more than ever.

Addendum, March 5, 2026:

Some readers have told me that my article is too pessimistic. I disagree; I see my article as optimistic and hopeful. My claim is that LLM technology has unfixable limitations that result in big liability risks for vendors, which will keep them from going all the way to fully autonomous Agentic AI systems. The conflict between the U.S. Department of Defense, Anthropic, and OpenAI over the last couple of days seems to support my position: Anthropic and OpenAI insisted on restricting the use of their products with regard to domestic surveillance and autonomous weapons.

We are bombarded with so much information in the area of AI, conflicting, sometimes dystopian. Ravi Venkatesan, former head of Microsoft India, said in his keynote at the SPM Summit India at IIM Bangalore on Feb. 21, 2026, that autonomous Agentic AI would mean the end of traditional employment. His message to students was: prepare yourselves to work as entrepreneurs and use AI to implement your ideas fast and at low cost.

In his post “AI Winter Is Coming”, dated Feb. 23, 2026, Gabriel Steinhardt (BlackBlot, Israel) predicts that the AI industry will begin to collapse within 1.5 to 2.5 years, and that product managers will be forced more and more into technical and operational tasks without strategic direction. In his view, the only hope is the bursting of the AI bubble.

On Feb. 22, 2026, the financial analysts of Citrini Research published an article in which they outline a scenario for 2028 based on ubiquitous use of autonomous Agentic AI. Their outlook is dystopian, not only in terms of the resulting unemployment, but also regarding the impact on global economies and stock markets.

My thinking does not tend to be dystopian, but we as societies, states, and people need to define guardrails regarding the use of AI. The European Union has already established the EU AI Act, which defines legal restrictions based on risk categories. In the US, a number of experts, researchers as well as industry veterans, have published warnings over the last couple of years. Even some CEOs of AI companies have asked for legal restrictions. The US government has not acted so far, probably because it does not want to restrict itself in the military area and in its race against China. Other countries still have to define their positions.

This is not just relevant for product managers. It is relevant for everybody.

References:

Gartner AI Hype Cycle 2025: https://www.gartner.com/en/newsroom/press-releases/2025-08-05-gartner-hype-cycle-identifies-top-ai-innovations-in-2025

Sangeet Paul Choudary: Reshuffle – Who Wins When AI Restacks the Knowledge Economy, 2025: https://platforms.substack.com/p/reshuffle-my-next-book-is-now-available

Sourav Banerjee, Ayushi Agarwal & Saloni Singla (United We Care, Los Angeles): LLMs Will Always Hallucinate, and We Need to Live with This. In: Arai, K. (ed.): Intelligent Systems and Applications, IntelliSys 2025, Lecture Notes in Networks and Systems, vol. 1554, Springer, August 2025; preprint: https://arxiv.org/pdf/2409.05746

Michał P. Karpowicz (Samsung AI Center, Poland): On the Fundamental Impossibility of Hallucination Control in Large Language Models, May 2025: https://arxiv.org/pdf/2506.06382

Vectara Hallucination Leaderboard: https://github.com/vectara/hallucination-leaderboard/

Allianz Risk Barometer 2026: https://commercial.allianz.com/content/dam/onemarketing/commercial/commercial/reports/allianz-risk-barometer-2026.pdf

Frédéric Pattyn: The Vibe Coding Trap: How AI Accelerates Delivery - and Quietly Breaks Responsibility, 2026: https://www.amazon.com/dp/B0GGC668P5/

Andrew Ng: Career Advice in AI, Nov. 2025: https://www.youtube.com/watch?v=AuZoDsNmG_s (starting at 5:30)

Gabriel Steinhardt: AI Winter Is Coming, Feb. 23, 2026: https://www.linkedin.com/pulse/ai-winter-coming-gabriel-steinhardt-pntfc/

Citrini Research: The 2028 Global Intelligence Crisis, Feb. 22, 2026: https://www.citriniresearch.com/p/2028gic

EU AI Act: https://www.europarl.europa.eu/topics/en/article/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence

Special thanks to Prof. Rahul De' and to ISPMA Fellows Frédéric Pattyn, Vibhuti R. Singh, Gerald Heller, and Andrey Saltan for helpful feedback and discussions.

The content of this article was the subject of my presentation at ISPMA's SPM Summit India 2026 at IIM Bangalore on Feb. 21, 2026: https://spmsummit.org/india

For my training courses, consulting, and my books on software product management, see www.innotivum.com.
