👴🏼 Age Against the Machine

NBA in AR?

Plus: DIY Your Own Open-Source AI

Read time: 5 minutes

OpenAI’s board has finally responded to Elon Musk’s sneaky takeover offer. But Musk isn’t backing down: he just released Grok-3, billed as the “smartest AI on Earth,” and it currently sits at #1 on the Arena leaderboard.

PRESENTED BY ARTISAN

10x Your Outbound With Our AI BDR

Imagine your calendar filling with qualified sales meetings, on autopilot. That's Ava's job. She's an AI BDR who automates your entire outbound demand generation.

Ava operates within the Artisan platform, which consolidates every tool you need for outbound:

  • 300M+ High-Quality B2B Prospects

  • Automated Lead Enrichment With 10+ Data Sources Included

  • Full Email Deliverability Management

  • Personalization Waterfall using LinkedIn, Twitter, Web Scraping & More

AI INSIGHTS

🤖 AI Models Are Getting Smarter, But That’s Not the Real Story

A new study put top AI models through the cognitive assessments doctors use to screen for dementia, and the results were surprising.

While AI excels in structured exams, it struggles with reasoning, memory, and spatial awareness—key functions in real-world decision-making.

Here’s what stood out.

The Study

Researchers administered the Montreal Cognitive Assessment (MoCA) to the major AI models: ChatGPT-4 and ChatGPT-4o (OpenAI), Claude 3.5 Sonnet (Anthropic), and Gemini 1.0 and 1.5 (Google).

They also put the models through four classic neuropsychological tasks (a sketch of how such items translate into chat prompts follows the list):

  • Navon figure (visual processing)

  • Cookie theft picture (language comprehension)

  • Poppelreuter figure (object recognition)

  • Stroop test (cognitive control)
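
Since the chatbots only take text and images, each test item becomes a scripted prompt exchange. Here’s a minimal sketch of how a delayed-recall item might be administered via the OpenAI Python SDK; the five-word list is MoCA’s standard one, but the phrasing, distractor task, and scoring are illustrative assumptions, not the study’s exact protocol.

```python
# Minimal sketch: administering a MoCA-style delayed-recall item to a chat
# model. The word list is MoCA's standard one; the exact phrasing, distractor
# task, and scoring here are illustrative assumptions, not the study's
# protocol. Requires the `openai` package and an OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()
WORDS = ["face", "velvet", "church", "daisy", "red"]
history = []  # the whole conversation is resent each turn, as in a chat UI

def ask(prompt: str) -> str:
    """Send one turn and keep the growing conversation as context."""
    history.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(model="gpt-4o", messages=history)
    text = reply.choices[0].message.content
    history.append({"role": "assistant", "content": text})
    return text

ask(f"Remember these five words: {', '.join(WORDS)}. I'll ask for them later.")
ask("Now count backwards from 100 by sevens.")  # distractor: serial-sevens task
recall = ask("Which five words did I ask you to remember?")  # delayed recall

score = sum(w in recall.lower() for w in WORDS)  # one point per recalled word
print(f"Delayed recall score: {score}/{len(WORDS)}")
```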

The Key Findings

1. AI Struggles with Cognitive Tasks

  • ChatGPT-4o scored highest (26/30), followed by ChatGPT-4 and Claude 3.5 Sonnet (25/30 each).

  • Gemini 1.0 scored lowest (16/30); on the MoCA, anything below 26 signals cognitive impairment in humans.

2. AI Is Bad at Spatial Reasoning

  • All the models failed the visuospatial drawing tasks, such as trail-making, cube-copying, and clock-drawing, making errors similar to those seen in dementia patients.

3. Memory Problems Are Real

  • Gemini models failed the delayed recall test, struggling to remember basic details. Google has since launched a feature that lets Gemini remember past chats to improve its responses; I think this update is a step toward fixing the issue.

4. Lack of Awareness & Empathy

  • Most models couldn’t state their location, a sign of disorientation.

  • In the cookie theft task, none of the models reacted to the child about to fall, a classic sign of impaired emotional reasoning.

5. Older Models Show Cognitive Decline

  • Older versions performed worse than their newer counterparts.

  • Gemini 1.0 vs. Gemini 1.5? A six-point drop. On this test, AI doesn’t just get outdated; it seems to decline like an aging patient.

Why It Matters: Most people assume AI is always getting smarter, but this study suggests it might not be that simple. While AI dominates structured tests, its struggles with reasoning, memory, and spatial tasks raise concerns, especially in healthcare applications. If models show signs of cognitive impairment, should we really trust them with critical medical decisions? The findings challenge the idea that AI is on the verge of replacing human doctors, highlighting how its fundamental weaknesses in cognition could limit real-world use. Until AI can improve spatial reasoning, awareness, and memory, human oversight isn’t just important—it’s essential.

IN PARTNERSHIP WITH 1440 MEDIA

Join over 4 million Americans who start their day with 1440

Your daily digest for unbiased, fact-centric news. From politics and sports to global events, business, and culture, we cover it all by analyzing over 100 sources. Our concise, 5-minute read lands in your inbox each morning at no cost. Experience news without the noise; let 1440 help you make up your mind. Each email is edited to be as unbiased as humanly possible and is triple-checked (by hand!) to ensure that you’re getting the truth, the whole truth, and nothing but the truth.

TODAY IN AI

AI HIGHLIGHTS

🚨 OpenAI’s board rejected Elon Musk’s $97.4B offer, calling it a legal tactic in his lawsuit against the company.

🏀 NBA in AR? It’s happening. App builder Ian Panchèvre tested the NBA’s new “tabletop projection” feature, letting Vision Pro users watch live games in augmented reality—bringing the court right to your table.

🚀 The latest GPT-4o update brings better creative writing, coding, and instruction-following, pushing the model even further in AI-powered problem-solving.

🤔 Gemini can remember past chats to make responses more relevant. Pick up where you left off, get quick summaries, or revisit past topics. You’re in control—view, edit, or delete any chat whenever you want.

🔍 Perplexity’s new Deep Research tool delivers expert-level analysis for free, competing with OpenAI’s $200/month service. It outperforms Google Gemini and DeepSeek in speed and accuracy and is available now.

🏆 Elon Musk’s Grok-3 (codename "chocolate") just hit #1 in Arena, making history as the first model to break 1400 points—a milestone that’s getting harder to reach.

DAILY AI FUNDRAISING

xAI is raising $10 billion, pushing its valuation to $75 billion—up from $50 billion last year. The funds will fuel AI advancements, including its Grok chatbot, which is gaining traction.

NEW EMPOWERED AI TOOLS

  1. 🔓 Supavec is an open-source RAG-as-a-service platform for AI retrieval (see the RAG sketch after this list)

  2. 💻 UI2Code.ai turns UI designs into clean and usable code

  3. 🎯 Ariadna chats about career goals and connects you to opportunities

  4. 📊 Crustdata delivers real-time contact lists that update with job changes

  5. 🧑‍💻 Nia is a coding teammate that understands your entire codebase
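
Quick context on item 1: RAG (retrieval-augmented generation) means embedding your documents, fetching the most relevant ones for each query, and handing them to the model as context; Supavec packages that loop as a hosted service. Below is a bare-bones sketch of the pattern using OpenAI embeddings purely for illustration; it is not Supavec’s API, and the toy documents are just this issue’s headlines.

```python
# Bare-bones sketch of the retrieval-augmented generation (RAG) pattern that
# services like Supavec package up; illustrative only, not Supavec's API.
# Requires the `openai` and `numpy` packages and an OPENAI_API_KEY.
import numpy as np
from openai import OpenAI

client = OpenAI()
DOCS = [
    "SWE-Bench-Verified scores models on real GitHub issues.",
    "Grok-3 was the first model to break 1400 points in Arena.",
    "Perplexity's Deep Research delivers expert-level analysis for free.",
]

def embed(texts: list[str]) -> np.ndarray:
    """Turn texts into vectors; OpenAI embeddings come back unit-normalized."""
    out = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in out.data])

doc_vecs = embed(DOCS)  # in a real service this index lives in a vector DB

def answer(question: str) -> str:
    # Retrieve: dot product equals cosine similarity on unit vectors.
    best_doc = DOCS[int(np.argmax(doc_vecs @ embed([question])[0]))]
    # Augment + generate: the model answers grounded in the retrieved text.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Context: {best_doc}\n\nQuestion: {question}"}])
    return reply.choices[0].message.content

print(answer("Which model first broke 1400 in Arena?"))
```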

AI QUICK HITS

  1. 🍑 AI-Powered Sex Dolls See Sales Jump in 2025 Due to Upgraded User Experience (Link)

  2. 🍏 Apple Plans AI-Powered Boost for Vision Pro and Spatial Content (Link)

  3. 👀 AI Optical Illusions Sort Humans From Bots. Could This Be the Next CAPTCHA? (Link)

  4. 🔒 South Korea Pauses DeepSeek AI App Downloads Over Privacy Issues (Link)

  5. 🛠️ NYC Expert Says Companies Should Go DIY with Open-Source AI (Link)

AI CHART

Chart: Software engineering is changing

Over the past few months, AI models have made a huge leap in coding abilities. Before, improvements between GPT-3.5 and GPT-4o were noticeable, but models still weren’t reliable enough to handle complex coding tasks. Now? That’s changing fast.

What’s Driving the Shift?

A key benchmark, SWE-Bench-Verified, tests AI models on real-world GitHub issues from open-source projects: the model must produce a fix that the project’s own test suite accepts. OpenAI’s latest models have made a massive jump here, showing they can resolve issues much more like human developers do.
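
To make the chart concrete: for each task the model is handed a real GitHub issue, must emit a patch, and the project’s own tests decide whether the issue counts as resolved. Here’s a conceptual sketch of that loop; helper names like generate_patch are hypothetical stand-ins, and the real SWE-bench harness additionally runs everything in isolated containers and checks that previously passing tests still pass.

```python
# Conceptual sketch of a SWE-Bench-style evaluation loop. `generate_patch`
# is a hypothetical stand-in for a real model call; the actual harness runs
# each task in an isolated container and also re-runs previously passing
# tests to catch regressions.
import subprocess
from dataclasses import dataclass

@dataclass
class Task:
    repo_dir: str            # project checkout at the issue's base commit
    issue_text: str          # the real GitHub issue description
    fail_to_pass: list[str]  # tests that must flip from failing to passing

def generate_patch(model, issue: str, repo_dir: str) -> str:
    """Hypothetical: prompt the model with the issue (plus retrieved code
    context) and return its proposed fix as a unified diff."""
    return ""  # placeholder; a real harness calls the model API here

def evaluate(model, tasks: list[Task]) -> float:
    resolved = 0
    for t in tasks:
        patch = generate_patch(model, t.issue_text, t.repo_dir)
        # Apply the model's diff exactly as a human contributor's PR would be.
        applied = subprocess.run(["git", "apply", "-"], input=patch,
                                 text=True, cwd=t.repo_dir)
        if applied.returncode != 0:
            continue  # an unappliable patch counts as unresolved
        # A task is "resolved" only if the designated failing tests now pass.
        tests = subprocess.run(["python", "-m", "pytest", *t.fail_to_pass],
                               cwd=t.repo_dir)
        resolved += tests.returncode == 0
    return resolved / len(tasks)  # the percentage shown on the leaderboard
```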

The Big Upgrade: o1-Pro vs. Older Models

  • Sonnet 3.5 & GPT-4o were decent but inconsistent—like working with a capable but unreliable coding assistant.

  • o1-Pro? Feels like working with someone who can do the work for you.

  • o3-mini-high? Not quite as good as o1-Pro, but still an exceptional coding model.

What’s Next?

  • o3-Pro could be a game-changer—but we might have to wait for GPT-5 to see it.

  • Anthropic is rumored to be launching a powerful reasoning-based Claude model that could outperform o3 in coding.

Takeaway?

If you decided on an AI model for coding 6+ months ago, it’s time to reconsider. Coding with o1-Pro or o3-mini-high could dramatically improve your workflow.

AI JOBS

  • Nokia: Decentralized Medical AI Research Intern (Link)

  • Google: AI Sales Specialist, Startups, Google Cloud (Link)

  • Onestream: Artificial Intelligence Engineer Position (Link)

  • Databricks: AI Research Engineer for Large-Scale LLMs (Link)

We read your emails, comments, and poll replies daily

How would you rate today’s newsletter?

Your feedback helps us create the best newsletter possible

Hit reply and say Hello – we'd love to hear from you!

Like what you're reading? Forward it to friends, and they can sign up here.

Cheers,
The AI Fire Team
