Posted on March 16, 2021

DNA Databases Are Too White

Tina Hesman Saey, Science News, March 4, 2021

It’s been two decades since the Human Genome Project first unveiled a rough draft of our genetic instruction book. The promise of that medical moon shot was that doctors would soon be able to look at an individual’s DNA and prescribe the right medicines for that person’s illness or even prevent certain diseases.

That promise, known as precision medicine, has yet to be fulfilled in any widespread way. True, researchers are getting clues about some genetic variants linked to certain conditions and some that affect how drugs work in the body. But many of those advances have benefited just one group: people whose ancestral roots stem from Europe. In other words, white people.

Instead of a truly human genome that represents everyone, “what we have is essentially a European genome,” says Constance Hilliard, an evolutionary historian at the University of North Texas in Denton. “That data doesn’t work for anybody apart from people of European ancestry.”

She’s talking about more than the Human Genome Project’s reference genome. That database is just one of many that researchers are using to develop precision medicine strategies. Often those genetic databases draw on data mainly from white participants. But race isn’t the issue. The problem is that collectively, those data add up to a catalog of genetic variants that don’t represent the full range of human genetic diversity.

When people of African, Asian, Native American or Pacific Island ancestry get a DNA test to determine if they inherited a variant that may cause cancer or if a particular drug will work for them, they’re often left with more questions than answers. The results often reveal “variants of uncertain significance,” leaving doctors with too little useful information. This happens less often for people of European descent. That disparity could change if genetic studies included a more diverse group of participants, researchers agree {snip}.

One solution, Hilliard suggests, is to make customized reference genomes for populations that face worse health outcomes, for example groups whose members die from cancer or heart disease at higher rates than others.

And the more specific the better. For instance, African Americans who descended from enslaved people have geographic and ecological origins as well as evolutionary and social histories distinct from those of recent African immigrants to the United States. Those histories have left stamps in the DNA that can make a difference in people’s health today. The same goes for Indigenous people from various parts of the world and Latino people from Mexico versus the Caribbean or Central or South America.

Researchers have made efforts to boost diversity among participants in genetic studies, but there is still a long way to go. How to involve more people of diverse backgrounds — which goes beyond race and ethnicity to include geographic, social and economic diversity — in genetic research is fraught with thorny ethical questions.


Some of our readers asked how genetic research got to this state in the first place. Why is genetic research so white and what do we do about it?

Let’s start with the project that makes precision medicine even a possibility: the Human Genome Project, which produced the human reference genome, a sort of master blueprint of the genetic makeup of humans. The reference genome was built initially from the DNA of people who answered an ad in the Buffalo News in 1997.

Although many people think the reference genome is mostly white, it’s not, says Valerie Schneider, a staff scientist at the U.S. National Library of Medicine and a member of the Genome Reference Consortium, the group charged with maintaining the reference genome. The database is a mishmash of more than 60 people’s DNA.

An African American man, dubbed RP11, contributed 70 percent of the DNA in the reference genome. About half of his DNA was inherited from European ancestors, and half from ancestors from sub-Saharan Africa. Another 10 people, including at least one East Asian person and seven of European descent, together contributed about 23 percent of the DNA. And more than 50 people’s DNA is represented in the remaining 7 percent of the reference, Schneider says. Information about the racial and ethnic backgrounds of most of the contributors is unknown, she says.

All humans have basically the same DNA. Any two people are 99.9 percent genetically identical. That’s why having a reference genome makes sense. But the 0.1 percent difference between individuals — all the spelling variations, typos, insertions and deletions sprinkled throughout the text of the human instruction book — contributes to differences in health and disease.

Much of what is known about how that 0.1 percent genetic difference affects health comes from a type of research called genome-wide association studies, or GWAS. In such studies, scientists compare DNA from people with a particular disease with DNA from those who don’t have the disease. The aim is to uncover common genetic variants that might explain why one person is susceptible to that illness while another isn’t.

In 2018, people of European ancestry made up more than 78 percent of GWAS participants, researchers reported in Cell in 2019. That’s an improvement from 2009, when 96 percent of participants had European ancestors, researchers reported in Nature.


Most of the research funded by the major supporter of U.S. biomedical research, the National Institutes of Health, is done by scientists who identify as white, says Sam Oh, an epidemiologist at the University of California, San Francisco. Black and Hispanic researchers collectively receive about 6 percent of research project grants, according to NIH data.


Hilliard’s hypothesis is that precision medicine, which tailors treatments based on a person’s genetic data, lifestyle, environment and physiology, is more likely to succeed when researchers consider the histories of groups that have worse health outcomes. For instance, Black Americans descended from enslaved people have higher rates of kidney disease and high blood pressure, and higher death rates from certain cancers than other U.S. racial and ethnic groups.


Some doctors and researchers advocate for racialized medicine, in which race is used as a proxy for a patient’s genetic makeup and treatments are tailored accordingly. But racialized medicine can backfire. Take the blood thinner clopidogrel, sold under the brand name Plavix. It is prescribed to people at risk of heart attack or stroke. An enzyme called CYP2C19 converts the drug to its active form in the liver.


Inactive versions of CYP2C19 are more common among Asians and Pacific Islanders than among people of African or European ancestry. But just saying that the drug won’t work for someone who ticked the Pacific Islander box on a medical history form is too simplistic. About 60 to 70 percent of people from the Melanesian island nation of Vanuatu carry the inactive forms. But only about 4 percent of fellow Pacific Islanders from Fiji and the Polynesian islands of Samoa, Tonga and the Cook Islands, and 8 percent of New Zealand’s Maori people have the inactive forms.


Assuming that someone has a poorly performing enzyme based on their ethnicity is unhelpful, according to Nuala Helsby of the University of Auckland in New Zealand. These examples “reiterate the importance of assessing the individual patient rather than relying on inappropriate ethnicity-based assumptions for drug dosing decisions,” she wrote in the British Journal of Clinical Pharmacology in 2016.

A far better approach than either assuming that ethnicity indicates genetic makeup or that everyone is like Europeans is to analyze a person’s DNA and have a precise reference genome to compare it against, Hilliard says. Deciding which genomes to create should be based on known health disparities.

“We have to stop talking about race, and we have to stop talking about color blindness.”{snip}


Recruiting people from all over the world to participate in genetic research might seem like the way to increase diversity, but that’s a fallacy, Hilliard says. If you really want genetic diversity, look to Africa, she says.

Humans originated in Africa, and the continent is home to the most genetically diverse people in the world. Ancestors of Europeans, Asians, Native Americans and Pacific Islanders carry only part of that diversity, so sequencing genomes from geographically dispersed people won’t capture the full range of variants. But sequencing genomes of 3 million people in Africa could accomplish that task, medical geneticist Ambroise Wonkam of the University of Cape Town in South Africa proposed February 10 in Nature (SN Online: 2/22/21).


Some countries have begun building specialized reference genomes. China compiled a reference of the world’s largest ethnic group, Han Chinese. A recent analysis indicates that Han Chinese people can be divided into six subgroups hailing from different parts of the country. China’s genome project is also compiling data on nine ethnic minorities within its borders. Denmark, Japan and South Korea also are creating country-specific reference genomes and cataloging genetic variants that might contribute to health problems that their populations face. Whether this approach will improve medical care remains to be seen.


Many respondents to our survey expressed concern that even well-intentioned scientists might do research that ultimately increases bias and discrimination toward certain groups. As one reader put it, “The idea of diversity is being stretched into an arena where racial differences will be emphasized and commonalities minimized. This is truly the entry to a racist philosophy.”

Another reader commented, “The fear is that any differences that are found would be exploited by those who want to denigrate others.” Another added, “The idea that there are large genetic differences between populations is a can of worms, isn’t it?”

Indeed, the Chinese government has come under fire for using DNA to identify members of the Uighur Muslim ethnic group, singling them out for surveillance and sending some to “reeducation camps.”