Change for the Machines (with apologies to Pat Cadigan)
AI has been about a model in which money equals compute equals big data equals valuations. So the money going in was there to finance compute, on the assumption that that was where the value lay, and every company's valuation was really a measure of how much compute it had, which made the whole exercise largely fake. Valuations were effectively a count of H100s (compute capacity), as if that correlated with better models (even though most of the companies are just wrappers).
DeepSeek and other Chinese models broke that, which pissed everyone off, private and public investors included, because it cast doubt on the valuation methodology: the assumption that energy, compute, data centres and chip counts were essentially fixed costs, and that the value of companies (and their output) could be measured on those alone.
It basically cast doubt on the last two years of public and private markets, not just for AI companies but for the entire stack: energy, chips, data centres and so on. Everyone felt like a dummy, even though this had been happening for a while. We knew it a couple of years earlier when Meta released Llama, and it was clear that much smaller models could be trained at much lower cost and still achieve many of the same goals. In that case it was better software engineering in the open source community. Perhaps because Llama was open sourced (despite originating in a hyperscale company) it attracted less attention, although the Google memo "we have no moat" should have been a clue.
https://www.theverge.com/2023/7/10/23790132/google-memo-moat-ai-leak-demis-hassabis
One of the ironies the DeepSeek debacle exacerbates is that a constraint which pushed them towards greater efficiency was the export restrictions on higher-end GPUs: as with the open source research community, less is more. Constraint is exactly what had already driven people in the open source (often academic or hobbyist) community to develop affordable AI technologies. In fact, outside the LLM/GenAI world, many machine learning tools have long proven perfectly useful and usable running on laptops against large datasets ("big data"); the sketch below illustrates the point.
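A purely illustrative sketch of that point, assuming nothing beyond a stock scikit-learn install: classic out-of-core learning, where the model is updated one chunk at a time so memory use stays flat however large the full dataset is. The synthetic chunks here stand in for data that would normally be streamed from disk.

```python
# Illustrative sketch only: out-of-core learning on a laptop with scikit-learn.
# Each chunk is seen once via partial_fit, so total dataset size is bounded by
# disk, not RAM. The data is synthetic; real chunks would come from files.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier()                              # linear model trained by SGD
classes = np.array([0, 1])

for _ in range(100):                               # e.g. 100 chunks of 10k rows
    X = rng.normal(size=(10_000, 50))
    y = (X[:, 0] + 0.1 * rng.normal(size=10_000) > 0).astype(int)
    clf.partial_fit(X, y, classes=classes)         # one pass over this chunk only

X_test = rng.normal(size=(1_000, 50))
y_test = (X_test[:, 0] > 0).astype(int)
print("held-out accuracy:", clf.score(X_test, y_test))
```

Nothing in that loop needs a GPU, let alone a data centre; the limiting factor is I/O, not compute.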
Denials at the time came thick and fast, perhaps because the huge investment in the new emperors was not ready to see them disrobed. Perhaps, too, OpenAI et al. were deliberately trying to create artificially high barriers to entry around their market. For investors actually interested in innovation this is ironic, given that the entire direction of travel in computing has been to lower barriers so that innovation can proceed with as little friction as possible (the internet, cloud, processors, compilers, operating systems, SDKs and app stores, and so on).
That lowering runs right up to and including future chip design, and certainly covers things like edge compute, federated machine learning and, of course, all things decentralised... and dare one also say warfare: cyber and hybrid war have only increased the asymmetry in the cost of effective weapons. The military example is an important one, often overlooked, and I think a large part of the world is scrambling to figure it out; the same goes for cyber attacks.
However, it is more than artificially high barriers to entry; it also creates artificial, or at least inflated, markets, because money goes out as investment and comes back in as infrastructure spend (think of the Microsoft or NVIDIA investments) that isn't actually needed, all resting on a faulty way of valuing the assets.
What is best for innovation, and what usually happens in innovation?
We might think they would learn the lesson: barriers always come down, things get commoditised, and things get cheaper and easier. This is not always just second-system (or indeed third-version) syndrome: a better understanding of the domain can lead to major efficiencies, and sometimes these arrive combined with other useful innovations.

One example arose from work in explainable AI (XAI), where tooling was built to uncover which structures within a neural network ("deep learning") were responsible for detecting or recognising which input features, and hence for classifying an input in some manner. The same explainability tools also allow one to shrink the neural network significantly by discarding nodes and edges that serve no useful classifier function. This has been used in face recognition on camera phones to make smaller, faster and potentially more accurate models. The cost of training increases somewhat, but the payoff is that the cost of inference (done billions of times rather than just "once") is massively reduced; in some models the approach can be applied during training to reduce training cost too. So an innovation driven by one required feature (explainability) delivers efficiency gains as well.
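The XAI tooling in question is not named above, so what follows is only a minimal sketch of the general idea in PyTorch, with every detail (the toy model, the relevance score, the keep-half threshold) an assumption made for illustration: score each hidden unit by a simple attribution measure, then rebuild a smaller network keeping only the most relevant units.

```python
# Minimal sketch of explainability-driven pruning (illustrative assumptions
# throughout, not the production tooling referred to in the text).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy classifier: 64 inputs -> 256 hidden units -> 10 classes.
model = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 10))
x = torch.randn(512, 64)                    # stand-in for a real batch
y = torch.randint(0, 10, (512,))

# Forward pass, keeping the hidden activations so we can attribute to them.
hidden = model[1](model[0](x))
hidden.retain_grad()
loss = nn.functional.cross_entropy(model[2](hidden), y)
loss.backward()

# Relevance per hidden unit: mean |activation * gradient| over the batch,
# a crude stand-in for a proper XAI attribution method.
relevance = (hidden * hidden.grad).abs().mean(dim=0)

# Keep the most relevant half and copy the surviving rows/columns of the
# weight matrices into a smaller network.
keep = relevance.argsort(descending=True)[:128]
small = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
with torch.no_grad():
    small[0].weight.copy_(model[0].weight[keep])
    small[0].bias.copy_(model[0].bias[keep])
    small[2].weight.copy_(model[2].weight[:, keep])
    small[2].bias.copy_(model[2].bias)

print(f"hidden units pruned: 256 -> {len(keep)}")
```

The asymmetry described above shows up even in this toy: the attribution pass adds a little to training cost, but every subsequent inference runs through half as many units.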
Another angle on this has been the combination of physics models (in weather prediction and heavy engineering) with neural nets. There is a mutual benefit: the computational cost of running the physics model comes down, and the neural network itself is easier to optimise. Recent advances (e.g. the Aardvark weather predictor, see https://arxiv.org/abs/2404.00411) actually move the partial differential equations into approximations inside the neural net (neural operators for the PDE), gaining huge efficiency while retaining the fundamental explainability of the original physics. Applying the same technique to continual updates of the models from real-world inputs is another huge win.
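Aardvark itself learns an end-to-end forecasting system; as a far simpler illustration of folding physics into a network, here is a toy physics-informed loss in PyTorch (every detail is an assumption for illustration, not the paper's method). A small net u(x, t) is trained so that it satisfies the 1D advection equation u_t + c*u_x = 0 with initial condition u(x, 0) = sin(x), whose exact solution is sin(x - c*t).

```python
# Toy physics-informed training loop (illustrative assumptions throughout):
# the PDE residual, computed by autodiff, is part of the loss, so the physics
# is baked directly into the network rather than solved separately.
import torch
import torch.nn as nn

torch.manual_seed(0)
c = 1.0
net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    # Random collocation points (x in [0, 2*pi], t in [0, 1]) where the
    # PDE residual u_t + c*u_x is penalised.
    xt = torch.rand(256, 2) * torch.tensor([6.2832, 1.0])
    xt.requires_grad_(True)
    u = net(xt)
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, 0], grads[:, 1]
    pde_residual = (u_t + c * u_x).pow(2).mean()

    # Initial condition u(x, 0) = sin(x).
    x0 = torch.rand(256, 1) * 6.2832
    u0 = net(torch.cat([x0, torch.zeros_like(x0)], dim=1))
    ic_residual = (u0 - torch.sin(x0)).pow(2).mean()

    loss = pde_residual + ic_residual
    opt.zero_grad()
    loss.backward()
    opt.step()

# Compare against the analytic solution sin(x - c*t) at one test point.
test = torch.tensor([[1.0, 0.5]])
print(net(test).item(), torch.sin(test[:, 0] - c * test[:, 1]).item())
```

The point is the shape of the computation: the expensive part of the classical pipeline (stepping the PDE) becomes something the network absorbs, while the physics stays visible in the loss rather than disappearing into a black box. The same structure is what makes it cheap to keep nudging the model with fresh real-world observations.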
Profligacy gets in the way of such giant steps.