Andrej Karpathy: The Best AI Educator Working Today



Andrej Karpathy left OpenAI in February 2024, and within months it became clear that his most important work wasn’t the research papers or the Tesla Autopilot system or even his second stint at OpenAI during the GPT-4 era. It was the YouTube videos. The ones where he sits down with a code editor and a whiteboard and explains, from scratch, how neural networks actually work — not the hand-wavy marketing version, but the real math, the real code, the real intuition. In a field full of people who either oversimplify or overcomplicate, Karpathy found a third path: genuine depth, communicated clearly. And right now, when everyone from developers to executives is trying to actually understand AI rather than just talk about it, that makes him one of the most valuable voices in the entire space — alongside a broader group of AI thinkers worth following.

Who Karpathy Actually Is (And Why That Background Matters)

Andrej Karpathy did his PhD at Stanford under Fei-Fei Li, working on convolutional neural networks and image captioning at a time when deep learning was still fighting for legitimacy. He joined OpenAI as a founding research scientist in 2015, then left for Tesla in 2017 to run Autopilot — a job that required not just theoretical knowledge but the brutal real-world discipline of making neural networks work on hardware at 70 mph. He returned to OpenAI in February 2023, shortly before the launch of GPT-4, then departed again in February 2024 to focus on education and independent projects.

That arc matters. Karpathy has trained models from scratch, deployed them in production at scale, and worked at the frontier of research. When he explains backpropagation or attention mechanisms, he’s not summarizing someone else’s work — he’s explaining things he’s done hundreds of times. That’s a rare combination in a field where researchers, engineers, and educators are usually three different populations.

He’s also been remarkably consistent about what he thinks AI is and isn’t. While others were swinging between “AGI is five years away” and “this is all a bubble,” Karpathy has stayed focused on the engineering reality: what these systems can actually do, where they fail, and how to think about them honestly.

The YouTube Catalog: What’s Actually in It

Karpathy’s YouTube channel (youtube.com/@AndrejKarpathy) is small by influencer standards — a few dozen videos — but the depth-per-video ratio is extraordinary. A few highlights:

  • The Neural Networks: Zero to Hero series — Starting with “The spelled-out intro to neural networks and backpropagation: building micrograd,” this series builds a working neural network library from scratch in Python. Not a wrapper around PyTorch. Actual autograd, from nothing. It’s probably the best single resource for understanding what training a neural network really means at a mechanical level.
  • Let’s build GPT: from scratch, in code, spelled out — A two-hour video where Karpathy builds a small GPT from scratch, implementing the transformer architecture piece by piece. He doesn’t skip the hard parts. By the end you understand why attention works, not just that it does.
  • Let’s build the GPT Tokenizer — A deep dive into tokenization, the step most people skip. Understanding why some models insist that 9.11 is greater than 9.9 starts here: numbers and words get split into tokens in ways that have little to do with their meaning.
  • Intro to Large Language Models — A one-hour talk structured as a genuine explainer for a technical-but-not-ML audience. Covers what LLMs are, how they’re trained, what RLHF does, and what “jailbreaking” actually means at a systems level.

These aren’t polished studio productions. They’re screen recordings with a microphone. But the quality of thinking is what makes them work. Karpathy pauses when something is confusing, backs up, tries a different angle. It models good technical thinking in a way that slick explainer videos never do — something that also distinguishes the best AI podcasts where researchers actually open up.

His Mental Models: How Karpathy Thinks About AI

Beyond the tutorials, Karpathy has developed a set of framings that have genuinely influenced how the field talks about itself. A few that are worth understanding:

Software 1.0 vs. Software 2.0

In a 2017 Medium post (and revisited frequently since), Karpathy described the shift from classical programming as “Software 1.0” — where humans write explicit instructions — to “Software 2.0,” where neural networks are trained on data and the “code” lives in the weights. He extended this later with the concept of “Software 3.0,” where the program is written in natural language prompts. This framing is genuinely useful because it clarifies what’s different about AI-native development: you’re not debugging logic, you’re debugging data and training procedures and prompts. Different skill set, different failure modes.
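A toy contrast makes the framing tangible. The task below (Fahrenheit to Celsius) and the code are my illustration, not from Karpathy’s post: in 1.0 a human writes the rule, in 2.0 the rule is fit from labeled data, and in 3.0 the “program” is a prompt.

```python
# Software 1.0: a human writes the logic explicitly.
def f_to_c_v1(f):
    return (f - 32) * 5 / 9

# Software 2.0: the "code" lives in parameters fit to data.
# Here, ordinary least squares on three labeled examples.
xs = [32.0, 212.0, 98.6]   # fahrenheit
ys = [0.0, 100.0, 37.0]    # celsius
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - w * mx

def f_to_c_v2(f):
    return w * f + b  # behavior determined by the fitted weights, not hand-written logic

# Software 3.0: the program is a natural-language prompt for an LLM.
prompt = "Convert {f} degrees Fahrenheit to Celsius. Reply with just the number."
```

Debugging differs accordingly: v1 fails because the formula is wrong, v2 fails because the data or fitting procedure is wrong, and v3 fails because the prompt (or the model behind it) is wrong.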

LLMs as a “Reasoning Engine” Rather Than a Knowledge Store

Karpathy has been consistent about not treating LLMs as databases. The knowledge in the weights is compressed and lossy. Where LLMs shine is reasoning over information you give them — retrieval augmented generation, tool use, agentic loops where the model is doing something with context rather than reciting facts. This distinction matters practically: if you’re building something that needs factual accuracy, you need retrieval. If you need flexible reasoning over structured inputs, pure LLM is often fine.
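The retrieval-augmented pattern can be sketched in a few lines. Everything below is illustrative: the word-overlap scorer stands in for embedding similarity, and the final string is the prompt a real pipeline would send to an LLM.

```python
# Minimal retrieval-augmented-generation loop (toy sketch, no real LLM call).
docs = [
    "The Eiffel Tower is 330 metres tall.",
    "Python was created by Guido van Rossum.",
    "The Pacific is the largest ocean on Earth.",
]

def retrieve(query: str, corpus: list, k: int = 1) -> list:
    # Score documents by word overlap with the query -- a stand-in for
    # embedding similarity in a production retriever.
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query, docs))
    # In a real pipeline this prompt goes to an LLM; here we return it
    # so the structure of the loop is visible.
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

print(answer("How tall is the Eiffel Tower?"))
```

The division of labor is the point: the retriever supplies the facts, and the model reasons over them — rather than being trusted to recall the facts from its lossy weights.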

The “Vibe Coding” Concept

In early 2025, Karpathy coined the term “vibe coding” in a post on X — the practice of describing what you want to a coding AI (Cursor, Claude, Copilot) and iterating on the output without necessarily reading every line of generated code. He was describing a real shift in how developers, especially solo builders, were actually working. The term immediately sparked debate about whether this was good or bad engineering practice, which was somewhat beside the point — it was an accurate description of something that was already happening, and naming it let the field have a more honest conversation about the tradeoffs.

Karpathy on the State of AI: Where He Actually Stands

Karpathy doesn’t post manifestos. His views come out in talks, in X posts, in the asides inside his tutorials. But a coherent picture emerges:

  • AGI timeline — Consistently cautious; he focuses on capability gaps rather than hype, and has noted that current LLMs have hallucination and reliability problems that matter a lot for real deployment. For a structured way to evaluate any model’s proximity to AGI, see this scored checklist.
  • AI safety — Takes it seriously but isn’t a doomer; he has said alignment is an important research problem without claiming catastrophe is imminent.

Ty Sutherland

Ty Sutherland is the Chief Editor of AI Rising Trends. Living in what he believes to be the most transformative era in history, Ty is deeply captivated by the potential of emerging technologies like the metaverse and artificial intelligence, and envisions a future where these innovations enhance every facet of human existence. Eager to champion the adoption of AI for humanity’s collective betterment, he emphasizes the urgency of integrating AI into our professional and personal lives, cautioning that those who lag behind risk obsolescence. AI Rising Trends stands as a testament to that mission, dedicated to spotlighting the latest AI advancements and offering guidance on harnessing these tools to elevate one’s life.
