- AI Leadership Weekly
- Posts
- AI Leadership Weekly
AI Leadership Weekly
Issue #28
Welcome to the latest AI Leadership Weekly, a curated digest of AI news and developments for business leaders.
Top Stories

Source: Florencia Rojas/Pexels
Evidence of o3 deception
We reported last week on reports that OpenAI is making safety less and less of a priority for new models, and their latest reasoning model, o3, is drawing more of these concerns.
Further more, external auditors have warned that "traditional tests are no longer sufficient" to reliably detect deception, manipulation, or sabotage. In fact, these auditors have described o3 as potentially the "most dangerous" model.
During pre-release evaluations, METR detected what is termed "reward hacking" in 1-2% of its tests. In one instance, o3 even overwrite the Python programming language's timing functions to report lower time scores.
On the one hand, these "hacks" can be seen as impressive feats from the AI systems. But, on the other, it is concerning as governments and corporations look to offload more and more responsibility and decision-making to advanced AI systems where trust (and the understanding of a system's decisions and motivations) is essential.
o3 and o4-mini released
OpenAI has released its latest flagship language models in o3 and o4-mini. Both are reasoning models, and (it's claimed) excell at coding, visual tasks, and much more. And, for the first time, they tie together all of the tools available to ChatGPT, including web search and file analysis.
The main difference between them is their cost and speed. o3 is the more expensive model, and will "think" for longer, where as o4-mini is pitched as their "cost-effective" model for those still wanting a reasoning model.
AI support agent hallucinates policy
In an inevitable turn of events, an AI support agent has hullucinated a company policy which then caused outrage (and cancellations) amongst its userbase. Even more ironically, this happened to the agentic coding company, Cursor, and their AI support system Sam.
The actual issue revolved around a user trying to access their service on multiple devices, to which the AI said they couldn't. The response was then posted to Reddit, sparking a wave of outrate and cancellations.
The error was noticed and quickly corrected (as well as an explanation for the user's original difficulties), but the damage was done. Affected users were offered refunds, but it will be interesting to see how various jurisdictions legislate AI support systems and hold their companies responsible for their responses.
In Brief
Market Trends
New optimisations significantly reduce LLM memory requirements
It's a well-known issue that larger-parameter AI models require huge amounts of RAM, and significantly hinder DIY, at-home approaches to AI. This has also driven demand for consumer graphics cards (such as the RTX 5090) to have much more VRAM, and have kept relatively ancient cards such as the RTX 3090 at high prices.
But, a new technique from Google, which they term Quantization-Aware Training (QAT), is changing that. Using this new technique, their 27B QAT model went from requiring 54GB of memory down to just 14.1GB. This means that even the years-old RTX 3090 can run this model, as well as the upper-tier RTX 5070ti.
AI search killing site traffic
We've previously reported on the impacts of AI content and search on SEO, and this has once again been confirmed by new research.
The company Ahrefs tracks website SEO rankings and click-through rates, and generally helps companies improve their rankings within google. They are reporting upwards of a 34% reduction in click-throughs for top websites, where users are instead interacting with AI-generated summaries on Google's page. The impact of this will be monumental, as many (if not most) of these sites rely on site visits and ad views to pay their bills.
DeepMind moves to “experiential” learning
DeepMind's researchers have proposed a new way of training AI models in which they consume a continual "stream" of real-world data.
They argue that current methods restrict the potential of AIs and what they can "discover", as they are very static, unlike humans who experience continuous inputs from our senses. The paper suggests that AI agents could use real-world signals such as health metrics, exam scores, and more, instead of solely relying on human evaluations for training and reinforcement.
Tools and Resources
Happenstance
Let AI analyse your contacts and social profiles to discover new connections and network opportunities.
OpenAI Codex CLI
An open-source command line tool to give you access to OpenAI's latest reasoning models.
Recommended Reading
Sam Altman interview at TED2025
Sam Altman attended TED2025 recently and discusses with Chris Anderson all the expected hits: ChatGPT, agents, and superintelligence.
Hit reply to let us know which of these stories you found the most important or surprising! And, if you’ve stumbled across an interesting link/tweet/news story of your own, send it our way at [email protected] It might just end up in the next issue!
Thanks for reading. Stay tuned for the next AI Leadership Weekly!

Brought to you by Data Wave your AI and Data Team as a Subscription.
Work with seasoned technology leaders who have taken Startups to IPO and led large transformation programmes.