🤔 GPT-4.5 Is More Human Than Human?

AI's 'gold-standard' therapy skills

Read time: 5 minutes

AI keeps surprising us. From time to time, it even seems more human than a real person. AI agents can now create copies of themselves, and maybe one day they'll create something "more human". Just for fun 😂😂.

IN PARTNERSHIP WITH SIMPLY BUSINESS

Is your small business risk-ready?

A client has an allergic reaction to some face cream you applied. A customer trips over a ladder you left on the lawn. You neglect to carry the “2” on a client's tax return and they have to pay a penalty. Just about every small business faces financial risks, but more than half of them (56%) aren’t protected with insurance.

Without insurance, you could find yourself having to cover costly claims. Even if it’s not your fault, you may still have to pay the legal expenses to defend yourself.

Getting covered doesn’t have to be confusing, time-consuming, or expensive. Simply Business can help find affordable coverages many small businesses need in just minutes. All from leading insurers and all online, 24/7. They’ve done that for more than 1 million businesses worldwide.

Simply Business offers same-day coverage for most policies and their licensed agents can answer your questions on the phone.

Readers, when it’s time to insure your business, be sure with Simply Business.

AI INSIGHTS

🧠 AI Just Passed the Turing Test – And Fooled Most People

This is the first controlled, empirical study showing that a modern LLM (specifically GPT-4.5) can pass the original three-party Turing Test, in which a judge chats with a human and an AI at the same time and then picks which one is human. Judges often mistook the AI for the real person. The experiment was randomized and controlled, and it compared four systems: GPT-4.5, LLaMa-3.1, GPT-4o, and ELIZA.

Key takeaways:

  • GPT-4.5 with PERSONA prompt:
    ➡️ Judged to be the human 73% of the time, more often than the real person 😳
    ➡️ Passed the Turing Test in both participant groups

  • LLaMa-3.1 with PERSONA:
    ➡️ 56% win rate overall
    ➡️ Passed only in the Prolific (non-student) group

  • Without PERSONA: Both GPT-4.5 and LLaMa dropped to ~36–38% win rates

  • Baselines:

    • GPT-4o: 21% win rate.

    • ELIZA: 23% win rate.

      → Both were picked as the human significantly less often than chance (50%).

  • 💡 What People Did to Spot the AI

    • Most common tactic: Small talk (61%)

    • Most effective: Jailbreak prompts or “weird” questions to trick AI

    • Least helpful: Asking about the weather

      → Judges relied on emotional tone, flow, and vibe—not logic.
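
For readers wondering what "significantly worse than chance" means in practice, here is a minimal sketch, not the study's own analysis. It assumes "chance" is a 50% win rate and runs a one-sided exact binomial test. The win rates are the ones listed above; the 100 trials per system and the function name `p_value_below_chance` are our own hypothetical choices, since the summary does not give sample sizes.

```python
from math import comb


def p_value_below_chance(wins: int, trials: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X <= wins) if the true win rate were `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(wins + 1)
    )


# Win rates are taken from the takeaways above.
# TRIALS is a hypothetical count used only for illustration.
TRIALS = 100
win_rates = {
    "GPT-4.5 + PERSONA": 0.73,
    "LLaMa-3.1 + PERSONA": 0.56,
    "ELIZA": 0.23,
    "GPT-4o": 0.21,
}

for name, rate in win_rates.items():
    wins = round(rate * TRIALS)
    p = p_value_below_chance(wins, TRIALS)
    verdict = "below chance" if p < 0.05 else "not below chance"
    print(f"{name}: {wins}/{TRIALS} wins, p = {p:.3g} ({verdict})")
```

Under these assumed counts, a system "passes" when its win rate is not significantly below 50%, which is how we read the pass/fail labels above. The real verdicts depend on the study's actual sample sizes.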

Why it matters: This study doesn't mean GPT-4.5 is truly intelligent, but it shows just how convincingly it can imitate human conversation - often better than real people. With the right prompt, the model can pass as human in most short chats. That has big implications. It means AI could quietly replace humans in tasks like customer service or social media interaction. And while that might sound efficient, it also opens the door to deception, fake personas, and misinformation. So while we're not talking about real "thinking" machines yet, we are talking about tools that can blend in with people and manipulate them.

PRESENTED BY ARTISAN

Hire Ava, the Industry-Leading AI BDR

Your BDR team is wasting time on things AI can automate. Our AI BDR Ava automates your entire outbound demand generation so you can get leads delivered to your inbox on autopilot.

She operates within the Artisan platform, which consolidates every tool you need for outbound:

  • 300M+ High-Quality B2B Prospects, including E-Commerce and Local Business Leads

  • Automated Lead Enrichment With 10+ Data Sources

  • Full Email Deliverability Management

  • Multi-Channel Outreach Across Email & LinkedIn

  • Human-Level Personalization

TODAY IN AI

AI HIGHLIGHTS

🎓 Anthropic launches Claude for Education, built for students and faculty. New “Learning Mode” teaches critical thinking with follow-up questions, concept highlights, and templates for essays, outlines, and study guides.

🕶️ Meta’s next-gen smart glasses, codenamed Hypernova, are coming late 2025—with a built-in display in the lens! You’ll see app icons, photos, and more, all controlled by gestures or touch. Oh, and of course, there’s an AI assistant inside.

🧠 Dartmouth just completed the first clinical trial of Therabot, a generative AI therapy chatbot - and the results are surprisingly strong. Users experienced a 51% average reduction in depression symptoms and a 31% drop in generalized anxiety. People said they could trust and talk to Therabot like a real therapist.

💸 Running top-tier AI isn't cheap - not even close. The Arc Prize Foundation just updated its cost estimate for running o3 high on ARC-AGI: from $3,000 to a jaw-dropping $30,000 per task, a 10x spike. Why? o3 high uses 172x more compute than o3 low for the same problem.

🤖 LeCun says LLMs like ChatGPT and Claude will be obsolete in 3–5 years. They lack reasoning, planning, and real-world understanding. His advice? “Don’t work on LLMs... build next-gen AI that removes their limits.”

🎯 Genspark just launched Super Agent - an AI that creates recipe-style videos, plans full trips using maps and research tools, and even books restaurants for you. It outperforms Deep Research and Manus on the GAIA benchmark. Genspark started out as an AI search engine, but it has shut that product down and gone all-in on AI agents.

📝 NotebookLM is rolling out a new way to gather info for your projects. Simply define your topic, and it curates relevant sources from across the web, making research a breeze.

💰 AI Daily Fundraising: Replit is in talks to raise $200M at a $3B valuation—nearly 3x growth. With 30M+ users, it’s leading the “vibecoding” wave where AI builds apps from plain text prompts.

AI SOURCES FROM AI FIRE

🎉 Hey friends, we’ve been quiet for a bit, but we never forgot you—our amazing readers!

To all our loyal trivia lovers: thank you for sticking with us. It's finally time to celebrate the champs of our December 2024 and January 2025 trivia polls! You've guessed, voted, and outsmarted the rest (seriously, some of those questions were tough 😵). The winners list for February and March is coming next week - keep an eye out 👀.

Big congrats to all the winners - you rock 💥💥

  • chagge@*****rengineering.com

  • *****@prismagraphic.com

  • *****kaushik17@yahoo.com

  • steven.*****y@mysticlake.com

  • *****x23@gmail.com

  • 1*****@gmail.com

  • j*****@aurorak12.org

  • tapani*****sto@gmail.com

NEW EMPOWERED AI TOOLS

  1. 🎬 Premiere Pro now extends videos using AI scene generation magic (Link)

  2. 🗣️ Synclabs Lipsync-2 copies your voice perfectly - no training needed (Link)

  3. 🎮 Tinder’s Game Game helps you practice flirting with AI personas (Link)

  4. 🌍 MiniMax Speech-02 speaks 30+ languages with stunning voice realism (Link)

  5. 🕶️ visionOS 2.4 adds Apple Intelligence and spatial iPhone experiences (Link)

AI QUICK HITS

  1. 🎓 George Mason Launches Virginia’s First Public AI Master’s Program (Link)

  2. 🧠 15 Moments That Shaped Microsoft’s Big AI Vision (Link)

  3. 💡 Lightmatter Unveils New Photonics Breakthrough for Smarter AI Chips (Link)

  4. 🗣️ Meta’s MoCha Brings Talking Characters to Life with AI (Link)

  5. 🧪 OpenAI Launches PaperBench to Test AI Research Skills (Link)

AI CHART

[Chart: AI task growth for self-exploring agents over iterations]

AI agents building other AI agents - on their own.

That’s the core of what Emergence is working on: an autonomous multi-agent system where smart AI workers form teams, solve tasks, and even improve themselves without needing a human to step in every time.

🔧 How It Works

At the center of it all is something called the Orchestrator. Think of it as the manager that breaks tasks into steps and assigns agents. And if no agent fits the job? It just writes new ones.

And yes, it can:

  • Plan tasks

  • Generate code

  • Simulate outcomes

  • Learn from feedback

Human oversight is still in place for safety—but most of the heavy lifting is automated.
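
To make the loop above concrete, here is a toy sketch in Python. It is not Emergence's actual code or API. Every name in it (`Orchestrator`, `plan`, `_write_agent`, the "then"-separated task format) is hypothetical, and the "agent writing" step is a stand-in that returns a stub function instead of generating real code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# In this toy model, an agent is just a function that handles one step.
Agent = Callable[[str], str]


@dataclass
class Orchestrator:
    """Hypothetical manager: plans steps, assigns agents, creates missing ones."""

    registry: Dict[str, Agent] = field(default_factory=dict)
    created: List[str] = field(default_factory=list)

    def plan(self, task: str) -> List[str]:
        # Stand-in planner: split the task on the word "then".
        return [step.strip() for step in task.split(" then ") if step.strip()]

    def _skill_for(self, step: str) -> str:
        # Stand-in skill matcher: the first word of the step names the skill.
        return step.split()[0].lower()

    def _write_agent(self, skill: str) -> Agent:
        # Stand-in for code generation: build a stub agent for the new skill.
        def agent(step: str) -> str:
            return f"[{skill}-agent] done: {step}"

        self.created.append(skill)
        return agent

    def run(self, task: str) -> List[str]:
        results = []
        for step in self.plan(task):
            skill = self._skill_for(step)
            if skill not in self.registry:
                # No agent fits the job, so create one and register it.
                self.registry[skill] = self._write_agent(skill)
            results.append(self.registry[skill](step))
        return results


orchestrator = Orchestrator()
for line in orchestrator.run("load yield data then rank chips by yield then load test logs"):
    print(line)
print("agents created:", orchestrator.created)
```

Notice that the second "load" step reuses the agent created for the first one. That reuse is the part of the idea the sketch is meant to show: new agents are only written when the registry has nothing that fits.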

📊 Real Use Case: Semiconductor Troubleshooting

A client needed help finding chips with low production yield. The Orchestrator:

  • Broke down the task

  • Found no current agent could solve it

  • Wrote brand-new agents to handle the job

  • Then anticipated future related tasks and created agents for those too

📈 The More It Works, The Smarter It Gets

Emergence tracks how:

  • Task complexity increases

  • Number of agents grows

  • Success rate improves over time

That’s recursive self-improvement in action—like evolution, but digital.

💡 New Agent Types Added
Now includes:

  • API Agents

  • Web Agents

  • Data and Connector Agents

  • Text Intelligence Agents

All deployable across cloud setups, with an SDK + registry to plug in your own agents too.

One real-world example already in production? Multi-agent software testing automation—fully AI-driven.

📉 Of Course, There Are Risks

  • Simulations can go wrong if training is biased

  • Agents might optimize for the wrong goal

  • Too many agents = infrastructure overload

That’s why Emergence builds in guardrails, human-in-the-loop checkpoints, and strict verification systems.

📊 Results Are Already Impressive

  • Code agents went from 7% to 60% accuracy on SWE-bench (2024–2025)

  • Emergence’s web agents hit 80% success on WebVoyager (GPT-4 only managed 30%)

Open-source tools like AutoGen, LangGraph, CrewAI, and Emergence’s own orchestrator are helping teams build multi-agent systems faster than ever.

🧬 Inspired by Nature, Backed by Science
This isn’t just engineering—it echoes biology and foundational computer science:

  • Recursive learning (Turing)

  • Self-assembling systems (von Neumann)

  • Natural processes like simulation, feedback, and adaptation

🔮 The Big Shift
We’re moving from:

“What can humans build with AI?”
to
“What can AI build—for us?”

Humans set the direction. AI systems build the solution.

And soon, those systems might just be building themselves.

AI JOBS

  • Lensa: AI/ML Engineer (Link)

  • BMO US: Artificial Intelligence Product Owner (Link)

  • NewtonX: AI Engineer (Remote) (Link)

  • Accuro Group: Generative AI Developer Charlotte, NC (Onsite) (Link)

We read your emails, comments, and poll replies daily

How would you rate today’s newsletter?

Your feedback helps us create the best newsletter possible

Hit reply and say Hello – we'd love to hear from you!

Like what you're reading? Forward it to friends, and they can sign up here.

Cheers,
The AI Fire Team
