Posted on September 13, 2021

3 Million Sample Size GWAS on Educational Attainment

Steve Sailer, Unz Review, September 11, 2021


A big study led by James J. Lee in 2018 broke the million-sample barrier by stitching together dozens of databases from various studies that pair DNA data with whatever demographic data was collected from each volunteer.

Say you are signing up for a study of kidney health that includes getting your DNA scanned. They aren’t very likely to give you an IQ test as part of this kidney study (why would they?), but they might very well include a question asking you to self-report your highest level of educational attainment. So, educational attainment has been the focus of big genome-wide association studies (GWAS).

At this month’s International Society for Intelligence Research get-together, Lee presented highlights from his upcoming updated educational attainment study. By making a deal with 23andMe, they’ve boosted the sample size to three million, with moderate boosts in correlation score.

Russell Warne of Utah Valley U., author of In the Know, reported on the ISIR talks. Warne writes:

Lee: This brings the total of identified SNPs for educational attainment to nearly 4,000. Those SNPs explain 7% of educational attainment variation. Using all SNPs explains 12-16% of educational attainment. #ISIR2021
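
For readers who think in correlations rather than variance explained, the percentages quoted above convert by taking a square root. Here is a minimal sketch, assuming the quoted figures are shares of variance explained (R squared) and that a single predictor score is being correlated with the outcome. The function name is my own, not anything from the study.

```python
import math


def r_from_variance_explained(r_squared):
    """Convert a share of variance explained (R^2) into a correlation (r).

    Assumes one predictor score and a simple linear relationship,
    where R^2 is just the square of the correlation.
    """
    return math.sqrt(r_squared)


# Figures quoted in the talk summary: 7% for the identified SNPs,
# 12-16% when all SNPs are used.
for label, r2 in [("identified SNPs", 0.07),
                  ("all SNPs, low end", 0.12),
                  ("all SNPs, high end", 0.16)]:
    print(f"{label}: R^2 = {r2:.2f}, r = {r_from_variance_explained(r2):.2f}")
```

So 16 percent of variance corresponds to a correlation of 0.40, and 7 percent to roughly 0.26.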

The top decile for polygenic score for educational attainment is about nine times more likely to graduate from college than the bottom decile. Back in 2018, with the million-plus sample, the top 20% was five times more likely to graduate from college than the bottom 20%.
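
To see how a score that explains a modest share of variance can still separate the top and bottom deciles this sharply, here is a rough sketch. It is not the study's method. It assumes a simple liability-threshold model: everyone has a normally distributed "propensity to graduate," the polygenic score correlates with that propensity at r, and the people above a cutoff graduate. The 35 percent graduation rate and the function names are hypothetical placeholders of mine, and the sketch is not meant to reproduce the reported ratios.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def graduation_prob_in_quantile_range(r, base_rate, lo, hi, steps=2000):
    """Average P(graduate) for people whose score lies between quantiles lo and hi.

    Assumed model (not from the study): propensity = r * score + noise,
    with graduation whenever propensity exceeds a threshold chosen so that
    base_rate of the whole population graduates. Requires 0 <= r < 1.
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - base_rate)
    residual_sd = (1.0 - r * r) ** 0.5
    total = 0.0
    for i in range(steps):
        # Midpoint rule over evenly spaced quantiles inside (lo, hi).
        u = lo + (hi - lo) * (i + 0.5) / steps
        score = STD_NORMAL.inv_cdf(u)
        total += 1.0 - STD_NORMAL.cdf((threshold - r * score) / residual_sd)
    return total / steps


def decile_ratio(r, base_rate):
    """Ratio of graduation probability, top score decile over bottom decile."""
    top = graduation_prob_in_quantile_range(r, base_rate, 0.9, 1.0)
    bottom = graduation_prob_in_quantile_range(r, base_rate, 0.0, 0.1)
    return top / bottom


# Hypothetical inputs: a 35% graduation rate, and correlations roughly
# matching 7%, 12%, and 16% of variance explained.
for r in (0.26, 0.35, 0.40):
    print(f"r = {r:.2f}: top/bottom decile ratio = {decile_ratio(r, 0.35):.1f}")
```

In this toy model the ratio climbs as the correlation rises, and it also depends heavily on the assumed graduation rate, so a ratio in one sample is not directly comparable to a ratio in another with a different base rate.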

Of course, few graduate from college solely owing to their own personal genes. College is often expensive in direct costs and always expensive in opportunity costs (wages forgone). Your parents having enough money for you to go to college can of course make a big difference. And your parents might have enough money for you in part because of their genes, so educational attainment is more complex than, say, IQ.

Lee: Attenuation is not as strong for height, BMI, or IQ. Much more of the GWAS signal we find in IQ is causal. #ISIR2021

The following sounds interesting from a dating-and-mating perspective, but hard to wrap one’s head around:

Lee: Polygenic scores for educational attainment can be used to predict disease, and predictive power increases when combining that with a polygenic score for the disease. But that improved prediction is due often to the environmental confounding. #ISIR2021

The polygenic scores for educational attainment complement the polygenic scores for predicting the disease for a number of diseases. The biggest boost from adding educational attainment PGS to the prediction model is from Type 2 diabetes, which, indeed, fits stereotypes about the type of person likely to suffer from diabetes.

But, keep in mind, these prediction models for disease are, at least so far, not terribly predictive.