New algorithm promises to slash AI power consumption by 95 percent

Cal Jeffrey

A hot potato: As more companies jump on the AI bandwagon, the energy consumption of AI models is becoming an urgent concern. While the most prominent players – Nvidia, Microsoft, and OpenAI – have downplayed the situation, one company claims it has come up with the solution.

Researchers at BitEnergy AI have developed a technique that could dramatically reduce AI power consumption without sacrificing too much accuracy and speed. The study claims that the method could cut energy usage by up to 95 percent. The team calls the breakthrough Linear-Complexity Multiplication or L-Mul for short. The computational process uses integer additions, which require much less energy and fewer steps than floating-point multiplications for AI-related tasks.

Floating-point numbers are used extensively in AI computations when handling very large or very small numbers. These numbers are like scientific notation in binary form and allow AI systems to execute complex calculations precisely. However, this precision comes at a cost.
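To make the "scientific notation in binary form" comparison concrete, here is a minimal sketch that splits a 32-bit float into its sign, exponent, and mantissa. It assumes the standard IEEE 754 single-precision layout and normal (non-zero, non-denormal) values; the helper name `decompose_float32` is ours, not something from the study.

```python
import struct

def decompose_float32(x: float):
    """Split a normal float32 into sign, unbiased exponent, and fractional mantissa."""
    # Reinterpret the 4 bytes of the float as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                          # 1 bit
    exponent = ((bits >> 23) & 0xFF) - 127     # 8 bits, with the IEEE 754 bias removed
    mantissa = (bits & 0x7FFFFF) / 2**23       # 23 bits, read as a fraction in [0, 1)
    return sign, exponent, mantissa

sign, exponent, mantissa = decompose_float32(6.5)
# Rebuild the number: (-1)^sign * (1 + mantissa) * 2^exponent
value = (-1) ** sign * (1 + mantissa) * 2 ** exponent
print(sign, exponent, mantissa, value)  # 0 2 0.625 6.5
```

Multiplying two such numbers means adding the exponents and multiplying the two `(1 + mantissa)` terms, and that mantissa multiplication is the expensive part.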

The growing energy demands of the AI boom have reached a concerning level, with some models requiring vast amounts of electricity. For example, ChatGPT uses electricity equivalent to 18,000 US homes (564 MWh daily). Analysts at the Cambridge Centre for Alternative Finance estimate that the AI industry could consume between 85 and 134 TWh annually by 2027.
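As a quick sanity check on those figures, the arithmetic can be worked through in a few lines. The inputs are the numbers quoted above; the variable names are ours.

```python
DAILY_MWH = 564    # ChatGPT's daily consumption, as quoted in the article
HOMES = 18_000     # number of US homes, as quoted in the article

# Implied consumption per household, in kWh per day.
kwh_per_home_per_day = DAILY_MWH * 1_000 / HOMES

# The same daily figure extrapolated to a full year, in TWh.
annual_twh = DAILY_MWH * 365 / 1_000_000

print(f"{kwh_per_home_per_day:.1f} kWh per home per day")  # 31.3
print(f"{annual_twh:.3f} TWh per year")                    # 0.206
```

On these numbers, ChatGPT alone would account for roughly 0.2 TWh per year, a small slice of the 85 to 134 TWh projected for the industry as a whole.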

The L-Mul algorithm addresses this excessive waste of energy by approximating complex floating-point multiplications with simpler integer additions. In testing, AI models maintained accuracy while reducing energy consumption by 95 percent for tensor multiplications and 80 percent for dot products.
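The general idea of replacing a floating-point multiplication with an integer addition can be sketched as follows. This is not the paper's exact L-Mul (the article does not give its details); it is the classic bit-pattern trick that adds the two floats' integer representations and subtracts the exponent bias. It assumes positive, normal float32 inputs, and the function names are hypothetical.

```python
import struct

ONE_BITS = 0x3F800000  # bit pattern of 1.0 in float32, i.e. the exponent bias 127 << 23

def float_to_bits(x: float) -> int:
    """Reinterpret a float32 as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    """Reinterpret an unsigned 32-bit integer as a float32."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

def approx_mul(x: float, y: float) -> float:
    """Approximate x * y for positive, normal float32 values with one integer add.

    Adding the bit patterns adds the exponents and the mantissas at once;
    the mantissa cross term (xm * ym) is simply dropped.
    """
    return bits_to_float(float_to_bits(x) + float_to_bits(y) - ONE_BITS)

print(approx_mul(2.0, 4.0))  # 8.0 (exact, since both mantissas are zero)
print(approx_mul(1.5, 1.5))  # 2.0 (exact product is 2.25)
```

Dropping the cross term is what costs accuracy: in this plain version the worst-case relative error is about 11 percent, which is why a practical scheme needs some form of correction to stay close to the exact result.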

The L-Mul technique also delivers proportionally enhanced performance. The algorithm exceeds current 8-bit computational standards, achieving higher precision with fewer bit-level calculations. Tests covering various AI tasks, including natural language processing and machine vision, demonstrated only a 0.07-percent performance decrease – a small tradeoff when factored into the energy savings.

Transformer-based models, like GPT, can benefit the most from L-Mul, as the algorithm integrates seamlessly into the attention mechanism, a crucial yet energy-intensive component of these systems. Tests on popular AI models, such as Llama and Mistral, have even shown improved accuracy with some tasks. However, there is good news and bad news.

The bad news is that L-Mul currently requires specialized hardware, and contemporary AI processors are not optimized to take advantage of the technique. The good news is that plans for developing specialized hardware and programming APIs are in the works, paving the way for more energy-efficient AI within a reasonable timeframe.

The only other obstacle would be companies, notably Nvidia, hampering adoption efforts, which is a genuine possibility. The GPU manufacturer has made a reputation for itself as the go-to hardware developer for AI applications. It is doubtful it will throw its hands up to more energy-efficient hardware when it holds the lion's share of the market.

For those who live for complex mathematical solutions, a preprint version of the study is posted on Cornell University's arXiv library.

 
It wouldn’t surprise me if NVIDIA has already managed to integrate this into its software stack and is planning on implementing L-Mul in whatever comes after Blackwell (or creating an entirely new accelerator line).
 
So fp4 is not coarse enough? What next, fp1? Heck, let's remove all significant digits. Why not use fixed-point math? You can get away with integers and still get fp-like accuracy.
 
Algorithm? What is that?

That is the word that we now use "AI" for.

So, shouldn't the title read that they're using "AI" to find ways to slash power for "AI"?
 
This is an opportunity for Intel and AMD.
They have FPGAs, which can be a temporary solution until an L-Mul ASIC is available.
 
Machine learning is all about complex math. L-Mul is not AI, it's just another algorithm used to reach a "solution" for a given problem.

You may have heard the word "tensor". In easy-to-understand terms, a tensor is a matrix that contains multiple matrices (think of it as a 3-dimensional matrix, from 0 to n). Once tensors reach a certain size, brute-forcing the calculations becomes extremely compute-intensive, which is why special algorithms that simplify things are used, some of which have smaller or bigger margins of error.

AI as we know it today is just ML with the added goal of mimicking humans in the tasks it's given.

Here's a simple article that gives you a few more examples of what kind of math is used in ML and subsequently AI:
https://www.geeksforgeeks.org/machine-learning-mathematics/
 
I saw this paper yesterday and found it interesting too, so I wrote (with "little" help from AI) a benchmark program to test it.
Here are the results.
[Attached image: FbVC0GF.jpeg]
 
I am always quite perplexed whenever someone finds a much simpler and better solution to a common problem that has puzzled so many. We have tons of mathematicians and programmers looking at these, then bam, someone comes up with a solution that is... 20x better energy-wise. Sounds too good to be true; I do hope it's true, though.
 
It's all about keeping the margin of error as small as possible.
 
Stop calling it AI. It is neither artificial nor intelligent. Large data models are very useful but they are not thinking.
 
If they can outperform 99% of humans without even doing any real thinking, well, that's an achievement. Neural networks can think because their architecture is very similar to the human brain (they simulate the human brain); the only difference is that they don't have self-consciousness. Animals can also think. They don't do complex thinking, but they do simple thinking.
 
AI models taking up as much power as 18,000 homes? No wonder my lights flicker every time I run ChatGPT.
 