All of Us now has health data on 740,000 people in the US (but little on Vitamin D)
Short version:
All of Us is fundamentally an EHR-plus-genomics biobank, not a standardized nutritional survey. That distinction drives everything about what it can and can't do for vitamin D work. It is the opposite of NHANES: NHANES gives you a standardized LC-MS/MS 25(OH)D on a random sample but almost no genetics, while All of Us gives you deep genetics on a huge, diverse cohort but a messy, opportunistic vitamin D phenotype. Here's how that plays out across the four questions: levels, genes, dosing, and cofactors.
Vitamin D levels - not uniformly tested
25(OH)D levels: weak, and this is the catch. There is no program-wide vitamin D assay. All of Us did not draw blood and run a standardized 25(OH)D the way NHANES or UK Biobank did. Every vitamin D value comes from electronic health record laboratory tests: opportunistic clinical measurements that land in the OMOP measurement table, keyed by LOINC code. Three problems stack on top of each other. First, the assays are heterogeneous (DiaSorin, Roche, and LC-MS/MS all mixed and unstandardized, which is exactly the problem the Vitamin D Standardization Program, VDSP, was set up to address). Second, the measurements are non-random: someone had a 25(OH)D drawn because a clinician had a reason, so there is substantial selection and indication bias. Third, labs exist only for the EHR-consented subset. As of CDRv8 (released February 2025), that is EHR data from over 393,000 participants out of the broader cohort, and only a fraction of those will have a 25(OH)D on file. There is no program-level 24,25(OH)₂D, VDBP, or free vitamin D. This is why there is no headline All of Us 25(OH)D GWAS the way there is for UK Biobank: the phenotype is the bottleneck.
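To make the "OMOP measurement table keyed by LOINC" point concrete, here is a minimal sketch of pulling each person's earliest usable 25(OH)D value and harmonizing units. It runs against a toy in-memory SQLite database, not the real All of Us workbench. The table and column names follow the OMOP CDM, but the concept IDs, the rows, and the two-code LOINC list are assumptions for illustration and should be verified before any real use. Converting units does nothing about assay heterogeneity or indication bias.

```python
import sqlite3

# Example LOINC codes for serum/plasma 25(OH)D. Assumption: a starting list to
# verify against the LOINC database, not a complete or authoritative set.
LOINC_25OHD = ("1989-3", "62292-8")
NMOL_L_PER_NG_ML = 2.496  # 1 ng/mL of 25(OH)D is about 2.496 nmol/L


def build_toy_cdm():
    """Create a tiny OMOP-like database; all IDs and values are made up."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE concept (concept_id INTEGER, concept_code TEXT, vocabulary_id TEXT);
        CREATE TABLE measurement (person_id INTEGER, measurement_concept_id INTEGER,
                                  measurement_date TEXT, value_as_number REAL,
                                  unit_source_value TEXT);
    """)
    con.executemany("INSERT INTO concept VALUES (?, ?, ?)", [
        (1, "1989-3", "LOINC"),
        (2, "62292-8", "LOINC"),
        (3, "0000-0", "LOINC"),  # hypothetical code standing in for an unrelated lab
    ])
    con.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?, ?)", [
        (101, 1, "2021-03-01", 22.0, "ng/mL"),
        (101, 2, "2022-03-01", 31.0, "ng/mL"),   # later repeat, ignored
        (102, 2, "2020-06-15", 74.88, "nmol/L"),  # needs unit conversion
        (103, 3, "2021-01-01", 2.0, "mg/dL"),     # not a vitamin D test
        (104, 1, "2021-05-05", 40.0, None),       # unit missing, skipped
    ])
    return con


def first_25ohd_ng_ml(con):
    """Return {person_id: earliest usable 25(OH)D in ng/mL, rounded to 0.1}."""
    placeholders = ",".join("?" for _ in LOINC_25OHD)
    rows = con.execute(f"""
        SELECT m.person_id, m.value_as_number, m.unit_source_value
        FROM measurement AS m
        JOIN concept AS c ON c.concept_id = m.measurement_concept_id
        WHERE c.vocabulary_id = 'LOINC' AND c.concept_code IN ({placeholders})
        ORDER BY m.person_id, m.measurement_date
    """, LOINC_25OHD)
    first = {}
    for person_id, value, unit in rows:
        unit = (unit or "").strip().lower()
        if value is None:
            continue
        if unit == "ng/ml":
            ng_ml = value
        elif unit == "nmol/l":
            ng_ml = value / NMOL_L_PER_NG_ML
        else:
            continue  # unknown unit: skip rather than guess
        first.setdefault(person_id, round(ng_ml, 1))  # keep the earliest only
    return first


if __name__ == "__main__":
    print(first_25ohd_ng_ml(build_toy_cdm()))
```

Skipping rows with unknown units rather than guessing is a deliberate choice here: with opportunistic EHR labs, a wrong unit assumption is off by a factor of about 2.5.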
Vitamin D genes - good information?
Vitamin D genes — this is the strong part. Genomics is where All of Us actually beats most alternatives. Clinical-grade whole-genome sequencing for 245,388 participants in the 2024 release (more now), plus genotyping arrays for over 447,000. So everything you care about is directly genotyped or sequenced, not imputed: GC (rs4588, rs7041), CYP2R1, DHCR7/NADSYN1, CYP24A1, CYP27B1, and the VDR SNPs (FokI, BsmI, TaqI, ApaI). And the real differentiator — 77% of individuals with genomic data are from groups historically under-represented in biomedical research, including 46% who self-identify with a racial or ethnic minority group. That diversity is exactly what's missing from the GC/CYP2R1 literature, which is heavily European. The limitation is that to run a vitamin D genetic association you'd have to build the 25(OH)D phenotype from those messy EHR labs above — so the genetics are excellent but chained to a weak outcome variable.
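As a small illustration of working with this candidate panel, the sketch below lists the variants and converts genotype calls into 0/1/2 allele counts, the usual coding for an additive association model. The GC rsIDs come from the text above; the VDR rsIDs are the identifiers commonly used for the named restriction-site SNPs. The participant IDs, genotype strings, and the choice of which allele to count are all hypothetical; real All of Us genotypes would come from the program's genomic files, which this sketch does not touch.

```python
# Candidate variants named in the text. The other genes (CYP2R1, DHCR7/NADSYN1,
# CYP24A1, CYP27B1) are omitted because the text names no specific rsIDs.
CANDIDATE_SNPS = {
    "GC": ["rs4588", "rs7041"],
    "VDR": ["rs2228570", "rs1544410", "rs731236", "rs7975232"],  # FokI, BsmI, TaqI, ApaI
}


def allele_dosage(genotype, counted_allele):
    """Count copies (0, 1 or 2) of counted_allele in a call like 'C/A' or 'C|A'.

    Returns None for missing or malformed calls such as './.'.
    """
    alleles = genotype.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles:
        return None
    return sum(allele == counted_allele for allele in alleles)


def dosage_table(genotypes, counted):
    """Map {person: {rsid: call}} to {person: {rsid: dosage}} for rsIDs in counted."""
    return {
        person: {rsid: allele_dosage(calls.get(rsid, "./."), allele)
                 for rsid, allele in counted.items()}
        for person, calls in genotypes.items()
    }


if __name__ == "__main__":
    # Hypothetical calls and counted alleles, for illustration only.
    toy_genotypes = {
        101: {"rs4588": "C/A", "rs7041": "G/T"},
        102: {"rs4588": "A/A", "rs7041": "./."},
    }
    counted_alleles = {"rs4588": "A", "rs7041": "T"}
    print(dosage_table(toy_genotypes, counted_alleles))
```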
Little information on dosing - generally just D2 prescriptions
Dosing — prescription yes, OTC mostly no. Prescription vitamin D (ergocalciferol 50,000 IU, calcitriol, paricalcitol, doxercalciferol) shows up reasonably in the OMOP drug_exposure table, since EHR data types include medications. But the way most people actually take D — OTC D3 softgels — is captured poorly to not at all, because it never generates a prescription record. Self-report via the surveys is thin on dose and useless on adherence. So you can study prescription vitamin D exposure as a time-varying covariate, but you cannot reconstruct real supplemental D3 dosing regimens the way you'd want for a non-daily-equals-daily argument. That kind of question is better served by the pharmacy-dispensing datasets — the Leumit/TriNetX approach that used pharmacy-dispensed supplementation as a time-varying covariate is the template All of Us can't quite match on the OTC side.
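The "time-varying covariate" idea can be sketched as splitting each person's follow-up into exposed and unexposed segments from prescription start and end dates, such as the `drug_exposure_start_date` and `drug_exposure_end_date` columns in OMOP `drug_exposure`. This is a minimal illustration under assumptions, not the method of any cited study: intervals are treated as half-open, overlapping prescriptions are merged, and the dates are invented. OTC D3 use would still be invisible to it.

```python
from datetime import date


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) prescription intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def split_follow_up(follow_start, follow_end, prescriptions):
    """Return [(start, end, exposed)] segments covering [follow_start, follow_end).

    Prescription intervals are clipped to the follow-up window; gaps between
    them become unexposed segments.
    """
    segments = []
    cursor = follow_start
    for start, end in merge_intervals(prescriptions):
        start, end = max(start, follow_start), min(end, follow_end)
        if start >= end:
            continue  # prescription lies entirely outside follow-up
        if cursor < start:
            segments.append((cursor, start, False))
        segments.append((start, end, True))
        cursor = end
    if cursor < follow_end:
        segments.append((cursor, follow_end, False))
    return segments


if __name__ == "__main__":
    # Invented dates for one hypothetical participant.
    prescriptions = [
        (date(2021, 3, 1), date(2021, 6, 1)),
        (date(2021, 5, 1), date(2021, 8, 1)),    # overlaps the first
        (date(2020, 11, 1), date(2020, 12, 1)),  # before follow-up starts
    ]
    for segment in split_follow_up(date(2021, 1, 1), date(2022, 1, 1), prescriptions):
        print(segment)
```

Each segment can then become one row in a start-stop (counting process) dataset for a survival model, with the exposed flag as the covariate.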
Virtually nothing about co-factors
Cofactors — the weakest area. Magnesium, K2, boron, zinc, vitamin A intake essentially don't exist as clean data. Serum magnesium is a common EHR lab so you'll get spotty opportunistic values; zinc labs are rarer; vitamin K and boron are effectively never measured clinically. Cofactor intake would have to come from supplement self-report, which has the same OTC blind spot as dosing. So cofactor analysis is close to a non-starter here beyond crude serum-magnesium associations.
Poor latitude information for most, and none for people living on reservations
[Claude AI - July 2026](https://claude.ai/share/12620f3d-4763-43d1-ac9f-1214f95fa5a1)
All of Us Research Program hits major milestone for precision medicine
With data from nearly 750,000 participants, the NIH initiative is now the largest integrated genomic and electronic health record database in the world, research officials said.
Related in VitaminDwiki
- Low Mortality if more than 20 ng of Vitamin D (UK Biobank)
- Air pollution is less of a problem in hypertensives having higher Vitamin D levels (UK Biobank)
Vitamin D Tests
- Vitamin D test result can vary by 40 ng (10 reasons)
- How to compare Vitamin D measurements from multiple studies
- Huge variation in Vitamin D test results between 4 testers for 8 people
- Vitamin D testing accuracies, including dried blood spot
- Vitamin D deficiency of a group - 15 pcnt to 48 pcnt (depends on tester used)