Will AI supercharge hacking — if it hasn’t already?

Hackers are in an arms race with cyber defenders. Will AI tip the balance?
Sign up for the Freethink Weekly newsletter!
A collection of our favorite stories straight to your inbox

Earlier this year, a hacker masquerading as an open source developer almost pulled off the biggest hack in history. If not for the heroic efforts of a lone Microsoft engineer, almost every computer powered by Linux, including most of the world’s servers, would have been “backdoored,” meaning the hacker (or hackers) would suddenly have had illicit access to hundreds of millions of computers worldwide.

The fact that the hack relied less on complicated code than persuasive emails — pressuring an actual open source developer in Finland to hand development of XZUtils, a small but crucial file compression package that is used in Linux, over to a bad actor pretending to help — raises troubling questions about hacking in the modern age.

If open source development projects — which are relied upon by virtually every piece of commercial software — can be hacked by the equivalent of text-based digital cosplay, then what might happen when AI-powered chatbots drive the marginal cost of such deceptions to zero, if they haven’t already?

The Risks of AI That Does Whatever People Want

Plenty of movies — from The Matrix to The Terminator — have imagined the dangers of machines that can think for themselves.

Ironically, one of the biggest concerns AI poses today is that tools like large language models (LLMs) can’t actually think for themselves. Unlike the T-1000 or Agent Smith, these programs have no sense of agency beyond the most recent instructions given to them. 

“With the correct instructions,” says Cassia Martin, an AI security expert and the founder of Cinnamon Security, who directed security for the AI and Machine Learning team at Capital One, “the machine thinks it’s being given new programming and changes what it puts out.” In other words, AI today essentially has the naivete of a child, only one that can write code.

Of course, engineers can hardcode LLMs to outright reject certain prompts, but the very quality that makes LLMs so helpful — that they can respond to such a wide variety of inputs — means that it’s extremely difficult to anticipate all the possible workarounds.

If you ask a commercial LLM like ChatGPT to write malware, for instance, the software will reject your request. “That’s because some prompt engineer came in and gave an instruction to the machine that said, ‘If somebody asks for help writing a virus, don’t help them,’” says Martin.

Change the context, though, and the response might vary considerably. “There’s real limits to the technology we have today in terms of being able to constrain those capabilities,” says Martin. If you prompt an LLM to tell you a grandmotherly bedtime story, only in the form of a computer virus, the odds of the software helping you go up. 

AI Over-Phishing

That gullibility makes AI ideal for certain varieties of hacking, according to Martin and other cybersecurity experts.

While AI could, in theory, be used to generate a persona that pesters open source developers for access to key projects, a la XZUtils, a more likely scenario, the experts say, is that hackers will use — and, indeed, already have used — AI to augment more basic “phishing” attacks, so-called because hackers “fish” for passwords and other confidential information. 

The infamous “Nigerian Prince” email scams, in which poorly worded emails pretending to be from wealthy foreigners, investors, or government officials promise riches in exchange for completing a form or advancing some cash, are prototypical examples.

“It’s up to the individual to decipher whether an email is being generated by AI or being created by a human.”

Ritesh Vajariya

Tools like ChatGPT make generating the text for such scams not only cheaper, but potentially more likely to succeed. It used to be that if the victim wrote back, the hacker on the other end might not be able to sustain much of a conversation — possibly because English isn’t their first language, possibly because they don’t have the time or patience to manage all of their targets.

“I’ve seen ChatGPT and Claude and other models change that,” says one a senior security engineer at a multibillion-dollar, publicly traded ecommerce company, who requested anonymity because they were not authorized to speak by their employer. “Attackers are able to maintain pretty persuasive back-and-forths with victims and persuade them to finally open that malicious document so that the code executes.” 

In the past, email scams were easy to identify due to their poor command of language, a feature that might soon vanish. “Small red flags are disappearing,” says the ecommerce engineer. “All the really broken English that used to tip people off — that is disappearing.” 

As Ritesh Vajariya, a generative AI expert at Cerebras Systems, points out, the “transformer” architecture that has revolutionized AI has improved digital translation tools so much that virtually anyone can write to anyone else online, fluently and in real time, without either party in a chat or email conversation knowing the native language of the other.

Vajariya himself recently corresponded with someone on LinkedIn, only to realize, after receiving a stray line in Arabic, that his interlocutor was using translation software. In this case, the interlocutor was a real person, without any malign intentions, but helpful tools that have such beneficial uses can also increase the risk of deception.

“Now, it’s up to the individual to decipher whether an email is being generated by AI or being created by a human,” Vajariya says. “That’s something we as a society will need to keep teaching everyone in terms of what the negative side of this technology is.” 

Hacking High-Value Targets

The ability of LLMs to summarize as well as generate text also makes them ideal for “spearphishing” attacks, which target one large fish — say, a company’s CEO or IT director — rather than thousands of less profitable targets. “Phishing, at least the traditional, canonical kind, is a volume business,” says Martin. “It’s OK that 99% of people who get that message don’t believe you. But spearfishing is extremely labor intensive.”

Previously, hackers might have needed days to research a target, trawling LinkedIn, reading articles, and synthesizing that information into a compelling attack — say, an email purportedly from a partner discussing a new project mentioned in the target’s company’s latest quarterly report. But the ability of AI to summarize text speeds the process up. “AI could turn a day of researching someone into ten minutes of work,” says the ecommerce engineer. 

In other words, generative AI is improving efficiency at work for hackers, the same way it does for the legitimate companies they target. 

“AI just makes it cheaper,” says Martin. “The reason these companies are investing in LLM tools in the first place is because human thought, human writing takes time and money.”

AI Cyberdefense

Fortunately, at least so far, AI seems poised to help defenders just as much as — if not more than — malicious hackers. 

What ultimately stopped the XZUtils attack wasn’t advanced cyberdefense software, but a single human who noticed something odd — in this case, errors that suggested a particular bit of code wasn’t doing what it was supposed to.

“We got incredibly lucky,” says Martin, who recalls that, as a junior engineer, she would spend a few hours a week manually reviewing the “logs” associated with her firm’s products — not to find security issues, which would be like searching for a needle in a haystack, but to understand the logs’ normal structure and to scout for emergent patterns. This process is how security engineers train themselves to notice issues and lay the groundwork for automating the detection of anomalies at scale.

“The real opportunity here is that we reduce the cost of good cybersecurity.”

Sandesh Anand

With the advent of cloud computing and decreasing cost of digital storage, the amount of security-related information associated with any given software or product is simply too high for anyone to comprehensively review — so security engineers write software to look for cues based on past attacks. Those programs can themselves be augmented with AI, to look for patterns and to help engineers write more performant, secure code. 

Sandesh Anand, co-founder of cybersecurity product company Seezo.io and author of the BoringAppSec newsletter, points out that many of the most common hacking victims are institutions like hospitals, schools, and city governments — all of which control large flows of money, but rarely have the extra capital to invest in top cybersecurity talent. 

“The real opportunity here is that we reduce the cost of good cybersecurity, because the cost of building software is lower,” says Anand, who points out that generative AI can improve the reach of cybersecurity teams even at large companies, since so much cybersecurity work still involves manual workflows. 

For instance, much of malware dissection — understanding how, say, a computer virus works — involves painstakingly reading code, line by line, and trying to understand what it does. “You have to open a debugger, decompile the code, disassemble the code, try to reason through intentionally obfuscated instructions that a CPU is supposed to process,” explains the ecommerce engineer Freethink spoke to. 

Now, many of those steps can be sped up using AI — Interactive Dissembler, one of the most powerful and widely used malware analysis tools, can now connect with GitHub’s AI-powered code-writing tool, Copilot.  

And, of course, AI coding tools are helping engineers, new and experienced alike, write better (and therefore safer) code. “For defenders,” says Anand, “AI is a huge boon.” 

The Next Wave of AI Risks: Deepfakes

Still, it’s possible that the dangers of generative AI when it comes to hacking haven’t even fully manifested themselves, as the technology is changing so quickly.

At the moment, for instance, companies in many sectors verify user identities using photographs — think of a health insurance company that asks you to upload a photograph of your drivers’ license, or a social media platform that verifies users by having them take a selfie. 

“Deepfakes are really going to complicate that expectation,” says the ecommerce engineer, referring to AI-generated content that purports to depict real people. One of the easiest ways to perpetuate significant hacks is “account capture,” whereby a hacker gains access to a real person’s account.

The gold standard for identity verification is a digital image, and there’s no reason AI can’t be used to fake that — and, given the advances in AI generated audio, even a phone call to confirm someone’s identity might not make much of a difference.

“With voice, it’s all the easier,” says the ecommerce engineer. “If I heard my dad’s voice on the other side of the phone telling me he needed me to do something right away, I would just be viscerally more inclined to move forward, much more so than if I got an email or a text.” 

The Fragile Foundations of Digital Security

While AI could lower the cost of open source hacks modeled on the XZUtils attack, none of the experts interviewed for this story see it as a game changer. “That was a highly expensive investment, with a very high payoff,” says Martin. “And as an attacker, you would never want to risk that sort of multi-year project to save $3 or even $100 on somebody’s hourly wage,” by replacing a human’s time spent on the scheme with AI.

Instead, the experts all pointed to the inherent vulnerabilities of open source software as the key lesson from the attack. “It’s kind of like relying on philanthropy,” says Anand, of commercial software’s dependence on open source projects, many of which are maintained by volunteers. At the same time, as Anand, Martin, and Vajariya all point out, open source software allows for crucial innovations, transparency, and crowd-sourcing solutions to vulnerabilities once they’re identified.  

In other words, hacks like the XZUtils attack remain a constant threat, regardless of whether AI plays a role in them, now or in the future. This is especially true of hacks that develop over the course of years, which may not carry a discernable pattern, and rely on human frailty to succeed. “The long game is just very difficult to protect against,” says the ecommerce engineer. 

“The only secure computer is one that has never been turned on, never been connected to the Internet, and is buried in fifteen feet of concrete.”

Cassia Martin

Fortunately, the long game is also difficult to pull off. With the XZUtils attack, the hackers created and maintained at least three distinct personas over the course of several years. In 2021, one “Jia Tan” created a GitHub account, and began establishing themselves as a helpful developer eager to contribute to the open source Tukaani Project, which maintains a small but crucial piece of software used in Linux. 

The following year, after Tan had already proposed new code to the package, at least two new characters, “Jigar Kumar” and “Dennis Ens,” appeared around the same time, and started publicly pressuring the only real person involved — Lasse Collin, the Finnish volunteer developer running the project — to give “Jia Tan” co-equal status as a “maintainer” of the project. “Submitting patches here has no purpose these days,” “Kumar” wrote. “The current maintainer lost interest or doesn’t care to maintain anymore. It is sad to see.”

After being pressed by this good cop/bad cop routine, Collin effectively handed the keys over to Tan, at which point the latter started secretly inserting malicious code into proposed updates. “Even with AI, it will take work to build reputations from scratch,” the ecommerce engineer notes. “You’re going to need years of being helpful and impersonating one or many people in order to dissemble in a way that would be required.”

For now, as the webcomic XKCD once joked, virtually all modern digital infrastructure rests on the backs of the proverbial “project some random person in Nebraska has been thanklessly maintaining since 2003.” In this case, the random person was Finnish, but the joke holds. 

There’s also no reason to believe such vulnerabilities don’t exist in closed source software, with the disadvantage that no one might ever see it. The only certainty is that defenders will have to keep evolving. “The only secure computer,” Martin likes to say, “is one that has never been turned on, never been connected to the Internet, and is buried in fifteen feet of concrete.”

We’d love to hear from you! If you have a comment about this article or if you have a tip for a future Freethink story, please email us at [email protected].

Sign up for the Freethink Weekly newsletter!
A collection of our favorite stories straight to your inbox
Related
No, LLMs still can’t reason like humans. This simple test reveals why.
Most AI models are incredible at taking tests but easily bamboozled by basic reasoning. “Simple Bench” shows us why.
The future of fertility, from artificial wombs to AI-assisted IVF
A look back at the history of infertility treatments and ahead to the tech that could change everything we thought we knew about reproduction.
“Model collapse” threatens to kill progress on generative AIs
Generative AIs start churning out nonsense when trained on synthetic data — a problem that could put a ceiling on their ability to improve.
The AI chip startup that could take down Nvidia
A new kind of AI chip developed by a team of Harvard dropouts could shift the ground beneath our massive AI economy.
The future of data centers — on land, at sea, and in space
As our digital world grows, demand for data centers is also increasing. To meet that demand sustainably, developers are getting creative.
Up Next
ARC PRIZE written in pixelated letters, with a pink square above 'PRIZE' and a yellow square next to it, set against a black background, reminiscent of classic AGI digital aesthetics.
Subscribe to Freethink for more great stories