SelfDecode Discovers Optimal Genotype Imputation Models for Accurate and Reliable Results

MIAMI, Nov. 1, 2022 /PRNewswire/ -- SelfDecode has released the results of a study it performed using the most popular tools for genetic phasing and imputation to identify the factors that maximize imputation accuracy.

Genotype imputation is an important process within genetic analysis and risk estimation. Most commercial DNA chips read only about 500,000 to 700,000 variants out of the more than 3 billion base pairs in the human genome, which leaves many gaps in the information they provide.
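To put those figures in perspective, the short sketch below works out what share of genome positions a chip reads directly. It is only an illustration of the arithmetic: the function name `chip_coverage` is hypothetical, and the genome size is rounded to 3 billion base pairs as in the text above.

```python
# Rough illustration: share of the genome that a DNA chip reads directly.
# Assumption: genome size rounded to 3 billion base pairs, as in the text.
GENOME_BASE_PAIRS = 3_000_000_000


def chip_coverage(variants_read: int) -> float:
    """Return the fraction of genome positions covered by the chip."""
    return variants_read / GENOME_BASE_PAIRS


for variants in (500_000, 700_000):
    print(f"{variants:,} variants = {chip_coverage(variants):.4%} of positions")
# 500,000 variants = 0.0167% of positions
# 700,000 variants = 0.0233% of positions
```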

Using genotype imputation, those gaps can be filled in, which is important for downstream applications, such as estimating one’s genetic risk of heart disease.

“How well we can impute a person’s genome depends on a number of factors, including the reference populations used, phasing (which is how the data is prepared), and which software or tools end up being used,” said Dr. Puya Yazdi, Chief Science Officer at SelfDecode. “That is why our research and development team tested, compared, and benchmarked cutting-edge phasing and imputation software against different chips, data preparation methods, and different reference datasets. We tested 144 combinations in total!”

The goal of this study was to identify methods to maximize imputation accuracy. The imputation software used in the study included Beagle5.4, Impute5, ShapeIT4, Minimac4, and Eagle2.4.1.

The SelfDecode team also compared imputation accuracy metrics to understand which are the most reliable and how they can be used in different scenarios.
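The release does not say which metrics were compared. As a hedged illustration only, the sketch below computes two measures commonly used in the imputation literature: genotype concordance and the squared correlation (r²) between imputed dosages and true genotypes. The function names and the sample data are hypothetical and are not taken from the SelfDecode study.

```python
# Illustrative sketch of two common imputation accuracy metrics.
# Assumption: genotypes are coded as alternate-allele counts (0, 1, 2) and
# imputed values are dosages in the range [0, 2]. The data are made up.


def concordance(true_genotypes, imputed_dosages):
    """Fraction of sites where the rounded dosage equals the true genotype."""
    matches = sum(
        1 for truth, dosage in zip(true_genotypes, imputed_dosages)
        if int(dosage + 0.5) == truth  # round half up to the nearest genotype
    )
    return matches / len(true_genotypes)


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and dosages."""
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")  # undefined when either series has no variation
    return cov * cov / (var_t * var_d)


truth = [0, 1, 2, 1, 0, 2]
dosages = [0.1, 0.9, 1.8, 1.2, 0.0, 1.4]
print(f"concordance = {concordance(truth, dosages):.3f}")
print(f"dosage r2   = {dosage_r2(truth, dosages):.3f}")
```

One reason to compare metrics is that they can behave differently for rare variants: when nearly every true genotype is 0, concordance can look high even if the few carriers are imputed poorly, whereas r² is more sensitive to those errors.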

As a result of the study, SelfDecode has created a processing and comparison pipeline that can help researchers design better chips by choosing SNPs that maximize phasing and imputation accuracy.

Additionally, researchers can use this information to choose the best combination of phasing and imputation tools for their chip, datasets, and computational needs in order to produce optimal results.

“Most importantly, we’ve found that all of the current state-of-the-art tools have limits and drawbacks. For example, they are not accurate enough to impute rare and ultra-rare variants. Our team is working on overcoming some of these limits,” said Dr. Yazdi. “We are currently working on our in-house imputation tool by employing the latest scientific advances, including AI and machine learning. Even greater advancements in our imputation pipeline are just around the corner!”

Visit to learn more about the research and development team at SelfDecode and the types of health-related insights you can learn from your genes.


SelfDecode is a fast-growing precision health company providing consumers and companies with advanced genetic analysis and personalized insights. Using AI, SelfDecode combines DNA, lab, and environmental data to provide science-based health advice tailored to each individual. Learn more at

Contact: Victoria Shelton 

SOURCE SelfDecode