How ambitious is Obama’s gene project?

President Obama wants to map the genes of a population the size of Austin, Texas, a figure that far exceeds the number of Americans whose entire genomes have been sequenced so far.

Scientists have already analyzed genes for tens of thousands of people in the United States. But that’s peanuts compared with a new target the White House set earlier this year, when Obama announced an initiative to collect genetic information from 1 million Americans. Most of the analysis thus far has been limited to sections of DNA, an easier task than scrutinizing a person’s entire genome.

Researchers agree they need a massive collection of genomic data to conduct the kind of research leading to new types of personalized, targeted cures, often for individuals with rare or particularly aggressive diseases — which is why the administration laid out such an ambitious goal.

“Analyzing data from one of the largest research populations ever assembled will teach us more about the connections between us than ever before,” Obama said in a January speech announcing the initiative. “And this new information will help doctors discover the causes, and one day the cures, of some of the most deadly diseases that we face.”

But officials admit that, at least for now, the 1 million target is more idealistic than realistic. The first aim is just to get the project, dubbed the “precision medicine initiative,” off the ground.

“I think the notion of 1 million was sort of a placeholder that’s aspirational,” said Gary Gibbons, director of the National Heart, Lung and Blood Institute at the National Institutes of Health.

“A way to say we want to do something that’s large scale and national — something that reflects the diversity of our country and something where we can leverage the capabilities of this technology,” he said.

More than a decade after NIH mapped the first human genome, no one knows precisely how many Americans have had all their genes sequenced. Much of the genetic mapping has taken place in the U.S., but Beijing is also a major center for genomic sequencing.

Illumina, a leader in the genome sequencing business, said last year that about 65,000 entire genomes had been sequenced globally, an estimate based on its customers. Mike Snyder, director of Stanford University’s Center for Genomics and Personalized Medicine, estimates that about 100,000 whole genomes and exomes have been sequenced worldwide.

Officials hope to tap into some of that existing work as they start gathering volunteers for the precision medicine initiative. Gibbons said that as of this month, NIH has heard from more than 150 groups that have each said they may be able to offer at least 10,000 volunteers from already-formed cohorts.

One of the largest pools includes 380,000 veterans assembled by the Department of Veterans Affairs, which announced an effort in 2011 to gather genetic information from 1 million veterans for medical research.

The agency has conducted whole genome sequencing on just 2,000 of the volunteers so far. But researchers have done more limited kinds of analysis on a larger scale. About half the volunteers have been genotyped, a process that catalogues genetic variations at certain points along the genome instead of mapping out the entire thing.

And researchers have carried out exome sequencing — looking at only the protein-coding genes — on samples from 25,000 of the veterans. The VA says it expects to have half a million volunteers by October 2016, potentially giving NIH a huge pool of participants.

Private health providers are also working to amass large collections of genomic data. The largest is held by Kaiser Permanente of Northern California, which has collected samples from more than 210,000 people since 2005. Researchers have analyzed about half of the Kaiser volunteers so far, but mostly through genotyping and not whole genome sequencing.

NIH hasn’t announced any formal agreement with the VA or Kaiser, and tapping into the cohorts isn’t as simple as just sharing the information. Researchers likely would have to re-obtain consent from each individual to include that person’s genetic material in a massive new initiative. And there will be lots more data to collect if researchers decide to do whole genome sequencing on each volunteer instead of a more limited analysis.

Researchers will also have to make the data usable through electronic health records, an ongoing challenge in precision medicine. To reach any useful findings about which genetic mutations contribute to an individual’s condition, scientists must comb through massive sets of data.

“The hard part is assembling the cohort, choosing the right cohort and having good records linked to that DNA,” Snyder said. “That’s going to be the hard part by a lot.”
