Tuesday, January 28, 2025

'MAHA': The Unexpected Coalition of Nutritionists, Mushroom Shamans and Moms • Honestly with Bari Weiss


Current AI scaling laws are showing diminishing returns, forcing AI labs to change course | (Nov. 2024) TechCrunch


AI labs traveling the road to super-intelligent systems are realizing they might have to take a detour.

“AI scaling laws,” the methods and expectations that labs have used to increase the capabilities of their models for the last five years, are now showing signs of diminishing returns, according to several AI investors, founders, and CEOs who spoke with TechCrunch. Their sentiments echo recent reports that indicate models inside leading AI labs are improving more slowly than they used to.

Everyone now seems to be admitting you can’t just use more compute and more data while pretraining large language models and expect them to turn into some sort of all-knowing digital god. Maybe that sounds obvious, but these scaling laws were a key factor in developing ChatGPT, making it better, and likely influencing many CEOs to make bold predictions about AGI arriving in just a few years.

OpenAI and Safe Super Intelligence co-founder Ilya Sutskever told Reuters last week that “everyone is looking for the next thing” to scale their AI models. Earlier this month, a16z co-founder Marc Andreessen said in a podcast that AI models currently seem to be converging at the same ceiling on capabilities.

But now, almost immediately after these concerning trends started to emerge, AI CEOs, researchers, and investors are already declaring we’re in a new era of scaling laws. “Test-time compute,” which gives AI models more time and compute to “think” before answering a question, is an especially promising contender to be the next big thing.

“We are seeing the emergence of a new scaling law,” said Microsoft CEO Satya Nadella onstage at Microsoft Ignite on Tuesday, referring to the test-time compute research underpinning OpenAI’s o1 model.


He’s not the only one now pointing to o1 as the future.


“We’re now in the second era of scaling laws, which is test-time scaling,” said Andreessen Horowitz partner Anjney Midha, who also sits on the board of Mistral and was an angel investor in Anthropic, in a recent interview with TechCrunch.

If the unexpected success — and now, the sudden slowing — of the previous AI scaling laws tell us anything, it’s that it is very hard to predict how and when AI models will improve.

Regardless, there seems to be a paradigm shift underway: The ways AI labs try to advance their models for the next five years likely won’t resemble the last five.

What are AI scaling laws?

The rapid AI model improvements that OpenAI, Google, Meta, and Anthropic have achieved since 2020 can largely be attributed to one key insight: use more compute and more data during an AI model’s pretraining phase.

When researchers give machine learning systems abundant resources during this phase — in which AI identifies and stores patterns in large datasets — models have tended to perform better at predicting the next word or phrase.
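The pretraining relationship described above is often summarized as a power law in parameters and data. A minimal sketch, using approximately the constants reported in the “Chinchilla” paper (Hoffmann et al., 2022), shows why returns diminish as models grow:

```python
# Illustrative "Chinchilla-style" pretraining scaling law: loss falls as a
# power law in parameter count (N) and training tokens (D). The constants are
# roughly the fitted values from Hoffmann et al. (2022); treat this as an
# illustration, not any current lab's actual numbers.
def predicted_loss(n_params: float, n_tokens: float) -> float:
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Diminishing returns: each 10x jump in parameters buys less loss reduction.
gain_small = predicted_loss(1e9, 1e12) - predicted_loss(1e10, 1e12)
gain_large = predicted_loss(1e11, 1e12) - predicted_loss(1e12, 1e12)
```

Under this functional form, going from 1 billion to 10 billion parameters cuts predicted loss far more than going from 100 billion to 1 trillion, which is exactly the flattening curve the article describes.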

This first generation of AI scaling laws pushed the envelope of what computers could do, as engineers increased the number of GPUs used and the quantity of data they were fed. Even if this particular method has run its course, it has already redrawn the map. Every Big Tech company has basically gone all in on AI, while Nvidia, which supplies the GPUs all these companies train their models on, is now the most valuable publicly traded company in the world.

But these investments were also made with the expectation that scaling would continue as expected.

It’s important to note that scaling laws are not laws of nature, physics, math, or government. They’re not guaranteed by anything, or anyone, to continue at the same pace. Even Moore’s Law, another famous scaling law, eventually petered out — though it certainly had a longer run.

“If you just put in more compute, you put in more data, you make the model bigger — there are diminishing returns,” said Anyscale co-founder and former CEO Robert Nishihara in an interview with TechCrunch. “In order to keep the scaling laws going, in order to keep the rate of progress increasing, we also need new ideas.”

Nishihara is quite familiar with AI scaling laws. Anyscale reached a billion-dollar valuation by developing software that helps OpenAI and other AI model developers scale their AI training workloads to tens of thousands of GPUs. Anyscale has been one of the biggest beneficiaries of pretraining scaling laws around compute, but even its co-founder recognizes that the season is changing.

“When you’ve read a million reviews on Yelp, maybe the next reviews on Yelp don’t give you that much,” said Nishihara, referring to the limitations of scaling data. “But that’s pretraining. The methodology around post-training, I would say, is quite immature and has a lot of room left to improve.”

To be clear, AI model developers will likely continue chasing after larger compute clusters and bigger datasets for pretraining, and there’s probably more improvement to eke out of those methods. Elon Musk recently finished building a supercomputer with 100,000 GPUs, dubbed Colossus, to train xAI’s next models. There will be more, and larger, clusters to come.

But trends suggest exponential growth is not possible by simply using more GPUs with existing strategies, so new methods are suddenly getting more attention.

Test-time compute: The AI industry’s next big bet

When OpenAI released a preview of its o1 model, the startup announced it was part of a new series of models separate from GPT.

OpenAI improved its GPT models largely through traditional scaling laws: more data, more power during pretraining. But now that method reportedly isn’t gaining them much. The o1 framework of models relies on a new concept, test-time compute, so called because the computing resources are used after a prompt, not before. The technique hasn’t been explored much yet in the context of neural networks, but is already showing promise.

Some are already pointing to test-time compute as the next method to scale AI systems.

“A number of experiments are showing that even though pretraining scaling laws may be slowing, the test-time scaling laws — where you give the model more compute at inference — can give increasing gains in performance,” said a16z’s Midha.

“OpenAI’s new ‘o’ series pushes [chain-of-thought] further, and requires far more computing resources, and therefore energy, to do so,” said famed AI researcher Yoshua Bengio in an op-ed on Tuesday. “We thus see a new form of computational scaling appear. Not just more training data and larger models but more time spent ‘thinking’ about answers.”

Over a period of 10 to 30 seconds, OpenAI’s o1 model re-prompts itself several times, breaking down a large problem into a series of smaller ones. Despite ChatGPT saying it is “thinking,” it isn’t doing what humans do — although our internal problem-solving methods, which benefit from clear restatement of a problem and stepwise solutions, were key inspirations for the method.
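The re-prompting loop described above can be sketched in a few lines. This is a toy illustration of sequential test-time compute, not OpenAI’s actual o1 implementation; `ask_model` is a stand-in for any LLM call:

```python
# Toy sketch of sequential test-time compute: decompose a problem into
# sub-steps, solve each with a fresh prompt, then synthesize an answer.
from typing import Callable

def solve_with_decomposition(problem: str, ask_model: Callable[[str], str]) -> str:
    # 1. Ask the model to split the problem into sub-steps.
    plan = ask_model(f"Break this problem into numbered sub-steps:\n{problem}")
    steps = [line for line in plan.splitlines() if line.strip()]
    # 2. Re-prompt once per sub-step, feeding earlier results forward.
    notes: list[str] = []
    for step in steps:
        notes.append(ask_model(f"Solve this sub-step given notes {notes}:\n{step}"))
    # 3. One final prompt to synthesize the partial results.
    return ask_model(f"Combine these partial results into an answer for "
                     f"{problem!r}:\n{notes}")
```

The extra compute is spent at inference time: a problem with k sub-steps costs k + 2 model calls instead of one.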

A decade or so back, Noam Brown, who now leads OpenAI’s work on o1, was trying to build AI systems that could beat humans at poker. During a recent talk, Brown says he noticed at the time how human poker players took time to consider different scenarios before playing a hand. In 2017, he introduced a method to let a model “think” for 30 seconds before playing. In that time, the AI was playing different subgames, figuring out how different scenarios would play out to determine the best move.

Ultimately, the AI performed seven times better than his past attempts.

Granted, Brown’s research in 2017 did not use neural networks, which weren’t as popular at the time. However, MIT researchers released a paper last week showing that test-time compute significantly improves an AI model’s performance on reasoning tasks.

It’s not immediately clear how test-time compute would scale. It could mean that AI systems need a really long time to think about hard questions; maybe hours or even days. Another approach could be letting an AI model “think” through a question on lots of chips simultaneously.
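The “lots of chips simultaneously” variant is usually some form of parallel sampling: draw several candidate answers at once and keep the best one under a scoring rule. A hedged sketch, with `generate` and `score` as placeholders for a model call and a verifier:

```python
# Best-of-n sampling: a simple parallel form of test-time compute.
from concurrent.futures import ThreadPoolExecutor

def best_of_n(question, generate, score, n=8):
    # Fan the same question out across n workers (one per "chip"),
    # each with its own seed, then keep the highest-scoring candidate.
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda seed: generate(question, seed), range(n)))
    return max(candidates, key=score)
```

Scaling here means raising n: more samples per question rather than more seconds per sample, which is why it maps naturally onto fast inference hardware.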

If test-time compute does take off as the next place to scale AI systems, Midha says the demand for AI chips that specialize in high-speed inference could go up dramatically. This could be good news for startups such as Groq or Cerebras, which specialize in fast AI inference chips. If finding the answer is just as compute-heavy as training the model, the “pick and shovel” providers in AI win again.


The AI world is not yet panicking

Most of the AI world doesn’t seem to be losing its cool about these old scaling laws slowing down. Even if test-time compute does not prove to be the next wave of scaling, some feel we’re only scratching the surface of applications for current AI models.

New popular products could buy AI model developers some time to figure out new ways to improve the underlying models.

“I’m completely convinced we’re going to see at least 10 to 20x gains in model performance just through pure application-level work, just allowing the models to shine through intelligent prompting, UX decisions, and passing context at the right time into the models,” said Midha.

For example, ChatGPT’s Advanced Voice Mode is one of the more impressive applications of current AI models. However, it was largely an innovation in user experience, not necessarily the underlying tech. You can see how further UX innovations, such as giving that feature access to the web or applications on your phone, would make the product that much better.

Kian Katanforoosh, the CEO of AI startup Workera and a Stanford adjunct lecturer on deep learning, tells TechCrunch that companies building AI applications, like his, don’t necessarily need exponentially smarter models to build better products. He also says the products around current models have a lot of room to get better.

“Let’s say you build AI applications and your AI hallucinates on a specific task,” said Katanforoosh. “There are two ways that you can avoid that. Either the LLM has to get better and it will stop hallucinating, or the tooling around it has to get better and you’ll have opportunities to fix the issue.”
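A minimal sketch of that second path: validate-and-retry tooling wrapped around an unchanged model. Here `call_llm` and `is_grounded` are placeholders for a model call and any application-specific check (schema validation, citation lookup, and so on):

```python
# Tooling-side hallucination mitigation: retry the same model with feedback
# instead of waiting for a smarter model.
def answer_with_checks(prompt, call_llm, is_grounded, max_retries=3):
    for _ in range(max_retries):
        answer = call_llm(prompt)
        if is_grounded(answer):
            return answer
        # Feed the failure back into the next attempt.
        prompt += f"\n\nPrevious answer failed validation: {answer!r}. Try again."
    return None  # surface the failure rather than return an unchecked answer
```

The design choice worth noting: the wrapper returns None on repeated failure, so the application can fall back gracefully instead of shipping a hallucinated answer.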

Whatever the case is for the frontier of AI research, users probably won’t feel the effects of these shifts for some time. That said, AI labs will do whatever is necessary to continue shipping bigger, smarter, and faster models at the same rapid pace. That means several leading tech companies could now pivot how they’re pushing the boundaries of AI.


A Chinese lab has released a 'reasoning' AI model to rival OpenAI's o1 | (Nov. 2024) TechCrunch


A Chinese lab has unveiled what appears to be one of the first “reasoning” AI models to rival OpenAI’s o1.

On Wednesday, DeepSeek, an AI research company funded by quantitative traders, released a preview of DeepSeek-R1, which the firm claims is a reasoning model competitive with o1.

Unlike most models, reasoning models effectively fact-check themselves by spending more time considering a question or query. This helps them avoid some of the pitfalls that normally trip up models.

Similar to o1, DeepSeek-R1 reasons through tasks, planning ahead, and performing a series of actions that help the model arrive at an answer. This can take a while. Like o1, depending on the complexity of the question, DeepSeek-R1 might “think” for tens of seconds before answering.


DeepSeek claims that DeepSeek-R1 (or DeepSeek-R1-Lite-Preview, to be precise) performs on par with OpenAI’s o1-preview model on two popular AI benchmarks, AIME and MATH. AIME draws on problems from a competitive high school math exam, while MATH is a collection of word problems. But the model isn’t perfect. Some commentators on X noted that DeepSeek-R1 struggles with tic-tac-toe and other logic problems (as does o1).

DeepSeek can also be easily jailbroken — that is, prompted in such a way that it ignores safeguards. One X user got the model to give a detailed meth recipe.

And DeepSeek-R1 appears to block queries deemed too politically sensitive. In our testing, the model refused to answer questions about Chinese leader Xi Jinping, Tiananmen Square, and the geopolitical implications of China invading Taiwan.

The behavior is likely the result of pressure from the Chinese government on AI projects in the region. Models in China must undergo benchmarking by China’s internet regulator to ensure their responses “embody core socialist values.” Reportedly, the government has gone so far as to propose a blacklist of sources that can’t be used to train models — the result being that many Chinese AI systems decline to respond to topics that might raise the ire of regulators.

The increased attention on reasoning models comes as the viability of “scaling laws,” long-held theories that throwing more data and computing power at a model would continuously increase its capabilities, comes under scrutiny. A flurry of press reports suggests that models from major AI labs including OpenAI, Google, and Anthropic aren’t improving as dramatically as they once did.

That’s led to a scramble for new AI approaches, architectures, and development techniques. One is test-time compute, which underpins models like o1 and DeepSeek-R1. Also known as inference compute, test-time compute essentially gives models extra processing time to complete tasks.

“We are seeing the emergence of a new scaling law,” Microsoft CEO Satya Nadella said this week during a keynote at Microsoft’s Ignite conference, referencing test-time compute.

DeepSeek, which says that it plans to open source DeepSeek-R1 and release an API, is a curious operation. It’s backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

One of DeepSeek’s first models, a general-purpose text- and image-analyzing model called DeepSeek-V2, forced competitors like ByteDance, Baidu, and Alibaba to cut the usage prices for some of their models — and make others completely free.

High-Flyer builds its own server clusters for model training, the most recent of which reportedly has 10,000 Nvidia A100 GPUs and cost 1 billion yuan (~$138 million). Founded by Liang Wenfeng, a computer science graduate, High-Flyer aims to achieve “superintelligent” AI through its DeepSeek org.


Sunday, January 26, 2025

Part #4: The innovation feast and you


Part #4: The innovation feast and you

In this issue:

  • A middle-class jobs boom

  • The most beautiful thing about physical innovation

  • What I hope my kids do with their lives

  • Introducing the space gun

Dear Rational Optimist,

 

Thanks for all the emails. I’ve never received 450+ responses to a single question before. We asked if you wanted us to write about practical ways to use AI in your life, and you gave a resounding, “Yes.” We’re writing The Rational Optimist’s Guide to AI as we speak.

 

Thanks also for your responses to a preferred ROS first meetup location. It looks like Boston is in the lead, with Washington, DC a close second. More soon.

 

We’ve kicked off the new year by analyzing the sudden burst of rapid physical innovation happening all around us. Today, we’ll conclude our series by getting practical.

 

What should we, the humans living through this rapid transformation, do? How should we plan for our families, our careers, our retirements? What should you encourage your kids to pursue?

 

If you missed prior issues, see below:

 

 

In 2011, early Facebook employee Jeff Hammerbacher bemoaned: “The best minds of my generation are thinking about how to make people click ads… that sucks.”

 

But 2024 marked the great pivot. There’s been a huge shift in what ambitious young entrepreneurs are building.

 

They're creating nuclear reactors that fit in shipping containers. Robots that dance through warehouses. Cancer-hunting pills. Planes that will streak across the Atlantic at twice the speed of sound. Self-landing rockets—taller than the Statue of Liberty. Microgravity drug factories that orbit Earth.

 

These aren’t little conveniences like being able to watch The Sopranos reruns on your iPhone. They’re life-changing transformations.

 

This is just a preview of the next 20 years. Change is happening like an avalanche, fast.

 

The last innovation feast in the mid-20th century gave us nuclear power, supersonic jets, and the Space Race. It also delivered the greatest middle-class boom ever. Could this burst in innovation shrink the wealth gap again? It already is.

 

FactSet data shows the net worth for the bottom half of Americans has grown 65% since 2020—more than double any other cohort. Never happened before.

 

New Bureau of Labor Statistics data shows low-wage workers have experienced the largest pay increase of any income group over the past five years (inflation adjusted).

 

It’s interesting that the two fastest-growing occupations in America today…

 

#1: Wind turbine service technician

 

#2: Solar photovoltaic installer

 

… don’t require a college degree.

 

The innovation avalanche is already creating lots of good jobs for people who work with their hands. Manufacturing employment is at its highest since 2008. Direct employment in data centers is growing 8X faster than overall employment.

 

The physical world is hiring.

 

The most beautiful thing about physical innovation is we all get to live in the better world we build. The interstate highways built in the 1950s made everyone's life better. The jet engine sped up travel for everyone. As we continue to crack the code of disease, we can all live longer.

 

Here’s how Rational Optimists should prepare for what comes next.

 

#1: Investors. In 1980, businesses that made and sold physical stuff ruled. But today, the digital world dominates, as this table shows:

The pendulum is swinging back. Tomorrow's titans will build robots, rockets, reactors, and realities we can touch.

 

Kodak and Xerox won’t be resurrected. New innovators that make physical “stuff” will rise up to control the top 10 list. This shift has already begun, with Tesla and chipmaker Taiwan Semiconductor in spots #8 and #9.

 

Here are real-world innovators that could crack the “top 10” list by 2030:

 

  1. Anduril (drones)
     

  2. ASML (chipmaking machines)
     

  3. Boston Dynamics, KUKA (robotics)
     

  4. BYD (world's top EV maker)
     

  5. Contemporary Amperex Technology Co. (world's largest battery maker)
     

  6. Eli Lilly, Novo Nordisk (semaglutide)
     

  7. Moderna (mRNA)
     

  8. OpenAI, Anthropic (AI)
     

  9. SpaceX (spaceships)
     

  10. Waymo (robotaxis)

 

Even more exciting are the hundreds of under-the-radar startups working to achieve the impossible. The more I learn about and talk to these innovators, the more bullish I get about the future.

 

A sneak peek at one of The Rational Optimist Society’s big projects for 2025: building a database of all the innovative startups across America and the world, pushing boundaries. As a Rational Optimist Society member, you’ll know about the coolest companies doing the coolest stuff.

 

#2: Entrepreneurs. While working at SpaceX, Chris Hansen realized the only way to power humanity on Mars was with portable nuclear reactors. Now his startup, Radiant, is building microreactors to provide power on Mars, as well as Earth.

 

Charlie Munger said ideas worth billions hide in cheap books. Every Rational Optimist should study the Five Frontiers of innovation. There are countless billion-dollar business ideas in there. Someone will act on them and build a self-made fortune. Why not you? Why not your kids?

 

Joining the right startup early can change your life. Since going public in 1986, Microsoft has created at least eight billionaires and 12,000 millionaires. Most of these folks were simply early employees who benefited from Microsoft’s stock option program.

 

#3: Parents and grandparents. In the 1980s, parents rushed their kids to Japanese language lessons, certain it was the future of business. Wrong.

 

Until recently, the #1 piece of advice to a young person starting out might have been, “Learn to code.” But AI chatbots now pump out clean code at the click of a button. Coding classes might be tomorrow's dusty Japanese dictionaries.

 

Help your kids understand that the opportunities of tomorrow won't look like those of the last few decades. Don't assume good jobs require coding skills or burying yourself in college debt. I’ll be telling my kids to seek opportunities in fast-growing industries like energy, manufacturing, robotics, construction, and AI.

 

Example: Aerospace and nuclear engineering have been dead-end career choices for 40 years. All of a sudden, they’re now some of the most sought-after skill sets in the world. Look at this Wall Street Journal headline:

Source: The Wall Street Journal

 

Another headline said: “America is trying to electrify. There aren’t enough electricians.”

 

The kids (and grandkids) of ROS members are in good hands. You know about the big changes underway, and you know the only good option is to lean in. More on this in The Rational Optimist’s Guide to AI.

 

#4. Lastly, spread rational optimism. As a member of The Rational Optimist Society, you’re “in the know” about the cutting-edge innovations reshaping our world.

 

But your neighbors probably aren’t. Change that!

 

Tell someone about robots performing microscopic surgery… nuclear reactors that fit in shipping containers… scientists making breakthrough medicines in orbital factories.

 

Share a story that makes their eyes light up with possibility. Do it tonight. And tomorrow night, too.

 

This matters. Everyone who understands this transformation is another person ready to seize its opportunities. Another voice ready to support bold innovation. Another builder of tomorrow.

 

Fifty years from now, children will see charts showing what happened in 2024. They might ask what role you played in America's great awakening. Will you be able to tell them you helped push it forward?

 

The future isn't something that happens to us. It's something we build. And America is building again. The innovation avalanche is just beginning. Pull up a chair and invite your friends.

 

Here’s the link to join The Rational Optimist Society.

 

Introducing the space gun

 

Last week, SpaceX caught Starship—the biggest rocket ever—for a second time. I’ll never get tired of this:

Source: Thomas Godden on X

 

Next up: Later this year, SpaceX and NASA will send an unmanned Starship to test landing on the moon.

 

Then, in 2027, we’ll send men back to the moon for the first time since 1972. How’s that for a revival of the innovation spirit? As Mike Solana at Pirate Wires says, “Moon should be a state.”

 

If we're serious about building moon bases and cities on Mars, we're going to need a lot of “stuff” in space. That’s a challenge because getting stuff to space is extremely expensive.

 

SpaceX has slashed launch costs by 98% since 2000. Now, a startup in Oakland, CA is trying to push it further. Longshot Space is building a gun so massive, your body could slide down its barrel like a playground tunnel.

 

Rockets are expensive because they need to protect fragile humans and delicate equipment. But concrete, steel, and water don't need a gentle ride to space. They can handle a rougher journey at higher speeds.

 

Longshot’s hypersonic “space gun” is like a sophisticated potato cannon that can fire concrete, water, steel, and other “stuff” into orbit. Here’s “V1,” a 60-foot barrel that can launch small projectiles at 3,000 mph:

Source: Longshot Space

 

Longshot’s goal is to build a 15-km barrel capable of launching shipping containers into orbit for just $10 per kg. That’s 99% less than what it costs to hitch a ride on a SpaceX rocket today.
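A quick sanity check on that arithmetic: a $10/kg price being “99% less” implies a baseline near $1,000/kg, which is roughly the ballpark of published per-kilogram figures for today’s reusable rockets (exact numbers vary by mission and payload):

```python
# If $10/kg is "99% less" than current launch pricing, the implied baseline
# is longshot_cost / (1 - 0.99). Figures are the newsletter's claims, not
# quoted launch prices.
longshot_cost_per_kg = 10
implied_baseline = longshot_cost_per_kg / (1 - 0.99)
print(round(implied_baseline))  # roughly 1000 ($/kg)
```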

 

Sounds farfetched… until you know the US military had a working prototype 60 years ago.

 

In Project HARP (High Altitude Research Project), America built a 120-foot-long cannon pointing straight up into the sky in Barbados. HARP’s gun shot a projectile 112 miles into the air at up to 7,000 feet per second, roughly Mach 6. That altitude record for a gun-launched projectile still stands today. Time to go back to the future.

 

Longshot just won a $2 million Air Force contract to build a 500-meter-long gun in the Nevada desert. This beast will launch 100-kg payloads at hypersonic speeds. The acceleration would turn a human body into paste. But it’s no problem for sturdy cargo.

 

The space race of the 1960s was a contest between global superpowers. Today, it's between hoodie-wearing entrepreneurs working in California warehouses.

 

From space guns launching cargo to reusable rockets, we're watching the birth of a new space economy. The next decade will likely see more breakthroughs than the past 50 years combined.

 

Longshot’s success could help open up the cosmos for construction.