MVP Data Available for Research
Through centralized data collection, cleaning, and curation, MVP has a wealth of health records, self-reported surveys, and genetic data available for research, with generation of other omics data underway. MVP researchers contribute to the curation of phenotypes, and all MVP phenotype definitions are stored in the Centralized Interactive Phenomics Resource (CIPHER), a publicly accessible phenotype knowledgebase. Note: CIPHER does not contain patient-level data. It stores algorithms (instructions, or “recipes”) for using MVP data to define health conditions.
Applying for Access to MVP Data
VA-affiliated researchers can apply for access through the funding opportunities in the Office of Research and Development. Please refer to all documentation and guidance located here. Note: Access is only available to VA system users. The guidance document for requesting a feasibility check from MVP can be found there as well. VA researchers can also apply for select types of non-VA federal funding.
When joining MVP, Veterans contribute the following information for research:
VA Health Records
The VA electronic health record (EHR) contains records for millions of Veterans including the roughly 9 million Veterans currently using the VA and millions more who used care in the past. It contains patient data from inpatient and outpatient visits including diagnoses, procedures, laboratory tests, prescriptions, clinical notes, reports and imaging. VA was one of the first hospital systems to adopt an EHR system in the 1980s and the current system has been in use for over 20 years.
Access to MVP data is available to VA researchers on approved federally funded projects. The program is working to expand MVP data access by increasing computational capacity and assessing the regulatory landscape, but there is currently no mechanism for studies led by non-VA researchers.
VA is establishing a Data Commons where MVP data will be available to the broader research community in the coming years.
Self-reported surveys
- The MVP Baseline and Lifestyle Surveys collect information on Veterans’ health and wellbeing, including military experiences and exposures, family medical history, dietary habits, and much more. These surveys are requested from every participant in MVP and have been in use since the program launched in 2011.
- In 2016, a Gulf War Era Survey was launched to collect information from a subset of participants who served during that era.
- In response to the COVID-19 pandemic, the MVP COVID-19 Survey was developed and administered to participants between May 2020 and September 2021 to understand how the pandemic affected Veterans in particular.
Genetic and omic data
Veterans provide a blood sample, which is processed for DNA and plasma aliquots for genotyping and other analyses including Whole Genome Sequencing (WGS), methylation, proteomics, and metabolomics. The remaining sample is stored for future use in a VA Central Biorepository.
Other data sources
MVP requests additional data from sources both internal and external to VA, based on the needs of research projects. This data is integrated into the MVP repository for active MVP enrollees. Other data sources include:
- National Death Index (NDI): NDI contains date and cause of death obtained from state vital statistics offices. The data also includes ICD descriptions for the underlying cause of death and descriptions of additional conditions. It supplements the death records in the VA and is provisioned by request to approved MVP projects.
- Centers for Medicare and Medicaid Services (CMS): CMS data is provisioned by request to approved MVP projects. It covers healthcare information captured by Medicare or Medicaid for active MVP enrollees, such as demographics, beneficiary summaries, inpatient and outpatient visits, vital status, facility and long-term care information, and prescription drugs.
Data snapshot: Data details
Surveys completed*
- 600,000+ Baseline Survey
- 485,000+ Lifestyle Survey
- 45,000+ Gulf War Survey
- 255,000+ COVID-19 Survey

*Reflects 1,025,000+ Veteran enrollees
Omics data
- Genotype array data is available to approved researchers, and other omics data continue to become available.
- ~650,000 individuals genotyped using a custom Affymetrix genotype array
  - Imputed to a hybrid 1000 Genomes/African Genome Resource reference panel
  - Imputed to the TOPMed reference panel
- Minority-specific genotype array with over 750,000 genetic variants, including over 300,000 that are more common in minority populations and relevant to their health and well-being (processing underway)
- ~140,000 whole genome sequences (processing underway)
- 40,000 methylation arrays available
- Metabolomics and proteomics pilots underway