Genomic-QC: large-scale genomic data mining to assess the quality of HiPSC lines

Cell & Gene Therapy Insights 2019; 5(2), 203–219

DOI: 10.18609/cgti.2019.020

Published: 28 March 2019
Research Article
Ravi Prabhkar More, Mahendra Rao, Odity Mukherjee

Human induced pluripotent stem cells (HiPSC) are increasingly used as input material to make differentiated cells for drug discovery, toxicity assays and cell-based therapy. Across these uses, the quality and consistency of the cells and of what they make are paramount. Several concerns remain when considering the use of HiPSC-derived differentiated cells. These include the presence of integrated vectors used to generate HiPSC; the presence of mutations (germline or somatic) that may exist in the donor sample and/or be introduced during derivation; the absence of detailed clinical information about donors; and the lack of tracking of the stability and integrity of HiPSC as they are propagated in culture. Although specific tests exist to address each of these concerns, running all of them can rapidly become time-consuming and prohibitively expensive. To address these issues, we propose a strategy based on mining the wealth of information present in whole genome sequencing (WGS) data and describe a comprehensive Genomic-QC report (dx.doi.org/10.17504/protocols.io.vuae6se) generated by mining mapped and unmapped reads to provide: (1) a basic QC profile of a cell line; (2) an assessment of cell line integrity and purity; (3) an analysis of foreign DNA for cell line sterility and genomic fidelity; (4) a cell line mutation burden profile; and (5) inferred donor-specific information. In this manuscript, we propose a set of sequencing-data-based tests that are relatively cheap but very informative and, as such, should be included as for information only (FIO) tests alongside the regular battery of mandatory tests for cell line QC analysis.