“This DNA is not real”: Why scientists are deepfaking the human genome

Researchers taught an AI to make artificial genomes, possibly opening new doors for genetic research.

Researchers have taught an AI to make artificial genomes — possibly overcoming the problem of how to protect people’s genetic information while also amassing enough DNA for research.

Generative adversarial networks (GANs) pit two neural networks against each other to produce new, synthetic data that is so good it can pass for real data. One network, the generator, creates candidates; the other, the discriminator, tries to tell them apart from real examples. Examples have been popping up all over the web, from generated pictures to videos (a la “this city does not exist”). AIs can even generate convincing news articles, food blogs, or human faces.
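For readers who want to see the tug-of-war in miniature, here is a toy sketch in Python. It is not the researchers' method, and every name and number in it is an assumption made for illustration: the "real" data is just a bell curve centered on 4, the generator is a single line (`a * noise + b`), and the discriminator is a one-input logistic classifier. Real GANs use deep networks on both sides, but the back-and-forth is the same idea.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow on extreme values.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def train_toy_gan(steps=2000, batch=64, lr=0.01, seed=0):
    """Toy 1-D GAN (illustrative only; all parameters are hypothetical)."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: fake = a * noise + b
    w, c = 0.1, 0.0   # discriminator: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]  # stand-in "real" data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: raise log D(real) + log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x in real:
            p = sigmoid(w * x + c)
            grad_w += (1.0 - p) * x
            grad_c += 1.0 - p
        for x in fake:
            p = sigmoid(w * x + c)
            grad_w -= p * x
            grad_c -= p
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Generator step: raise log D(fake), i.e. try to fool the discriminator.
        grad_a = grad_b = 0.0
        for z, x in zip(noise, fake):
            p = sigmoid(w * x + c)
            grad_a += (1.0 - p) * w * z
            grad_b += (1.0 - p) * w
        a += lr * grad_a / batch
        b += lr * grad_b / batch

    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print(f"generator: fake = {a:.2f} * noise + {b:.2f}")
```

Each round, the discriminator gets a little better at spotting fakes, and the generator then shifts its output in whatever direction makes the discriminator more likely to call it real. In this toy setup that means the generator's offset `b` drifts upward from 0 toward the real data's center.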

Now, researchers from Estonia are applying deepfake techniques to human DNA. They created an algorithm that repeatedly generates the genetic code of people who don’t exist.

Deepfaking Human DNA

It may seem simple: randomly mix A, T, C, and G, the letters that make up the genetic code, and voila, a human genetic sequence. But not just any random pattern of the letters will work. The AI needs to understand humans at the molecular level. This AI has figured it out.

Like deepfaked cities and faces, the artificial genomes are a convincing copy of the real thing: in this case a viable person, a human who, the researchers believe, really could exist but doesn’t.

Most importantly, they could play a key role in genetic research.

“A known limitation in the field (of genetic studies) is the reduced access to many genetic databases due to concerns about violations of individual privacy,” the team writes in their study, published in PLOS Genetics.

The team reports that these “artificial genomes” mimic real genomes so closely that the two are indistinguishable. But since they aren’t real, researchers can mine the data without worrying about privacy. They can experiment with genomes without actual people giving up their private information.

Protecting the privacy of the people behind genetic information is challenging and often limits how researchers can use that DNA and their willingness to share datasets. But with artificial genomes, researchers don’t have to worry about many of these ethical privacy concerns.

Faking Something You Don’t Fully Understand

The process of using GANs to generate synthetic genomes isn’t akin to making a deepfake of a person’s face. A face is something we are all familiar with, and there are countless examples with which to train the AI.

But there is so much about DNA and the genome that remains a mystery.

“My initial take is that it is interesting, but I’m not sure I see real practical implications for research right now,” Deanna Church, vice president of the Mammalian Business Area and Software Strategy at the biotech company Inscripta, told Futurism.

“Just because you can’t computationally distinguish these generated genomes from real genomes doesn’t mean they’ve really preserved functional motifs and domains that are important — there is much of this we still don’t understand.”

Even if the artificial genomes resolve the privacy hurdle in genetic research, they raise some possible new concerns.

“In the near term, it’s going to get easier for bad actors to create fake personas that can stand up to even the most rigorous inspection. Not that we envision a scenario where a scam artist needs to provide a fake transcript of their genome, but the unknown unknowns are where security holes tend to grow the fastest,” writes Tristan Greene in The Next Web.

