Vol. MCMLXXXIV Issue 42

BANISHING GRADIENTS

America's Loss Function

Local Man Pretty Sure He Could Train An LLM Himself Given Enough GPUs

'It's basically just matrix multiplication,' explains man who has never multiplied a matrix

Rita Chen (Culture & Society Reporter) · 3 min read
Photo: Computer hardware and cables (Unsplash)

DENVER, CO — Area man Todd Reynolds, 34, expressed confidence in a recent conversation that he “could definitely train a large language model” comparable to those developed by leading AI labs, if only he had access to “enough GPUs and maybe a weekend.”

“People make it sound so complicated, but it’s basically just matrix multiplication,” explained Reynolds, a project manager at a marketing firm who has never formally studied linear algebra or machine learning. “I’ve been listening to a lot of podcasts about AI, and I feel like I pretty much get the gist. Attention is all you need, right? I’m very attentive.”

Reynolds, who describes his technical background as “I built a website once using WordPress,” outlined his proposed approach to training a frontier AI model.

“You just need a lot of text, right? I have access to the entire internet. And you feed it through some neural networks — I’m not totally clear on what those are, but I know they’re called ‘neural’ because they’re like brains — and then you do the gradient descent thing. Descending gradients can’t be that hard. Gravity does it automatically.”

When asked about specific technical challenges like tokenization, architectural decisions, or alignment procedures, Reynolds waved dismissively.

“Those are implementation details,” he said. “I’m a big picture guy. Once I have the big picture figured out, the details just fall into place. That’s how I approached my last project at work. Admittedly, that project failed, but this is different because I’m really passionate about AI.”

Reynolds estimated that with “probably $50,000 in cloud computing credits and maybe a month or two,” he could produce results comparable to models that required billions of dollars and hundreds of researchers to develop.

“The thing about these big AI labs is they’re bloated with bureaucracy,” he explained. “I’d be leaner. More agile. Just me, some GPUs, and the entire corpus of human knowledge. How hard could it be?”

Friends and family have reportedly grown accustomed to Reynolds’s confident pronouncements about technical subjects.

“Last month he was going to build his own cryptocurrency,” said Reynolds’s roommate, Derek Simmons. “Before that, it was a revolutionary new social media app. He spent about three days on each project before discovering they were ‘harder than expected.’ But I’m sure the AI thing will go differently.”

Reynolds acknowledged that he hadn’t actually written any code yet but insisted this was a “strategic choice.”

“I’m still in the research phase,” he said. “Mostly watching YouTube videos and reading Twitter threads. Some of these threads are really long. Like, some guy wrote a whole explanation of how transformers work. I didn’t read it all, but I saved it, which is basically the same thing.”

When shown the technical specifications for training a model like GPT-4 — including the need for thousands of specialized GPUs running for months, petabytes of carefully curated data, and teams of PhD researchers fine-tuning every aspect of the process — Reynolds remained undeterred.

“That’s how they did it,” he said. “But I think there are probably shortcuts they’re not seeing. Like, what if you trained it on just the really good parts of the internet? Has anyone tried that? I bet I’m the first person to think of that.”

At press time, Reynolds had registered the domain name “toddgpt.ai” and was drafting a pitch deck for potential investors, describing his competitive advantage as “hustle and the ability to think different.”