Excel may make chatbots much more useful

For all their abilities, LLMs like GPT-4 struggle with math and logic. Everyone’s favorite spreadsheet may help change that.

Despite the many things that advanced chatbots like OpenAI’s GPT-4, Google’s Bard, and Anthropic’s Claude can do, these AIs do have a substantial Achilles heel: they’re pretty bad at math.

This makes sense when you consider that they are large language models (LLMs). Trained on the vast corpus of the internet, they work essentially from text they have “read.” If you ask one to add two numbers, it does not truly add them the way a calculator does: it predicts the answer based on the text it has been trained on, and replies with that.

Claude itself actually explained it best when Semafor’s Gina Chua prompted it to do some math.

“I am not actually able to do mathematical calculations,” the chatbot responded to Chua. “While I can have conversations about math and numbers, I do not have a built in calculator … I simply treated the question as another language input, and responded with the sum I was trained to give for that specific set of numbers.”

But one new application may provide LLMs with true mathematical powers: a promised integration into Excel, part of Microsoft’s plans to launch an AI Copilot that works with its 365 apps, including Excel, PowerPoint, and Word.

Having access to Excel’s tools could let an LLM hand numbers and logic to software that actually computes them, rather than predicting an answer from text.
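To make the idea concrete, here is a minimal sketch of what handing arithmetic to a tool could look like. It is not how Microsoft's Copilot or any particular chatbot works; the function name `evaluate` and the overall setup are assumptions for illustration. The point is only that the language model would supply the expression as text, and ordinary software would compute the result.

```python
import ast
import operator

# Arithmetic operators this hypothetical "calculator tool" is willing to run.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression: str):
    """Compute a plain arithmetic expression exactly, rejecting anything else."""

    def walk(node):
        # A bare number such as 1234 or 2.5.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        # A binary operation such as a * b, using only the allowed operators.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        # A negated value such as -4.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


# The model would only need to produce the text "1234 * 5678";
# the multiplication itself is done by the tool.
product = evaluate("1234 * 5678")
```

In this sketch the model never has to "know" the product. It only has to recognize that a calculation is needed and write it down correctly.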

“I have been working on this problem, and I’d say math/logic is one of the biggest weaknesses/limitations of LLMs,” Nazneen Rajani, robustness research lead at AI company Hugging Face, tells Freethink via email.

LLMs are currently not reliable when counting, Rajani says. Even a simple prompt like “write a sentence about x that is y words long” almost always produces an incorrect result; the LLM just doesn’t respond with the correct number of words.

[Image: ChatGPT’s answers don’t quite add up. Credit: Nazneen Rajani]

Telling the LLM to “think” about it “step-by-step” can help it to avoid or correct mistakes, but “I’d not trust the calculations without validating them myself,” Rajani says.
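Validating a chatbot's word count does not have to be done by hand. The sketch below shows one simple way to check it, assuming a "word" is any run of characters separated by whitespace; the function names and the sample sentence are made up for illustration.

```python
def word_count(sentence: str) -> int:
    """Count whitespace-separated words in a sentence."""
    return len(sentence.split())


def meets_length(sentence: str, target_words: int) -> bool:
    """Return True only if the sentence has exactly the requested number of words."""
    return word_count(sentence) == target_words


# A hypothetical chatbot reply to "write a sentence about Excel that is 10 words long".
reply = "Excel can handle numbers, text, and dates."
ok = meets_length(reply, 10)  # False: the reply is 7 words long, not 10
```

A check like this catches the miss immediately, which is the kind of validation Rajani describes doing herself.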

But having access to Excel could help LLMs better understand data beyond words and images.

“Excel perhaps adds a lot more structure to the data, and having a model fine-tuned on this structural data would definitely boost the performance of an LLM on Excel-specific tasks,” Rajani says.

As Chua points out, that would at the very least mean an LLM that could perform basic arithmetic correctly. But Excel is more than a calculator: it is essentially a database program that can handle not only numbers but also text, dates, and much else.

If LLMs can successfully incorporate Excel, or other means of accessing math and logic capabilities, they could be prompted to do things like create an accurate budget and modify it for various scenarios, all with conversational prompts, or search for patterns in data merely by asking natural human questions.
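As a rough illustration of the budget idea, the sketch below shows the kind of exact calculation a spreadsheet-style tool would do once a chatbot has turned a request like "what if my rent goes up to $1,800?" into structured changes. Every number, category, and function name here is invented for the example; it does not describe any real Copilot feature.

```python
def monthly_balance(income: float, expenses: dict) -> float:
    """Income minus the sum of all expense lines."""
    return income - sum(expenses.values())


def apply_scenario(expenses: dict, changes: dict) -> dict:
    """Return a copy of the budget with some lines overridden or added."""
    return {**expenses, **changes}


# Hypothetical baseline budget.
income = 4000
expenses = {"rent": 1500, "food": 600, "transport": 200}

baseline = monthly_balance(income, expenses)  # 4000 - 2300 = 1700

# Scenario: rent rises to 1800, everything else unchanged.
higher_rent = apply_scenario(expenses, {"rent": 1800})
scenario = monthly_balance(income, higher_rent)  # 4000 - 2600 = 1400
```

The conversational part is translating the question into the `changes` dictionary. The arithmetic itself is computed, not predicted.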

However, Microsoft has yet to make its AI Copilot available to the public, so just how close we are to an LLM that can crunch the numbers is still a bit of an unknown variable.

