J Clin Bioinforma. 2014 Dec 05;4(1):15. doi: 10.1186/s13336-014-0015-z. eCollection 2014.

Copy number variation analysis based on AluScan sequences.

Journal of clinical bioinformatics

Jian-Feng Yang, Xiao-Fan Ding, Lei Chen, Wai-Kin Mat, Michelle Zhi Xu, Jin-Fei Chen, Jian-Min Wang, Lin Xu, Wai-Sang Poon, Ava Kwong, Gilberto Ka-Kit Leung, Tze-Ching Tan, Chi-Hung Yu, Yue-Bin Ke, Xin-Yun Xu, Xiao-Yan Ke, Ronald Cw Ma, Juliana Cn Chan, Wei-Qing Wan, Li-Wei Zhang, Yogesh Kumar, Shui-Ying Tsang, Shao Li, Hong-Yang Wang, Hong Xue

Affiliations

  1. Division of Life Science and Applied Genomics Centre, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China.
  2. National Center for Liver Cancer Research and Eastern Hepatobiliary Surgery Hospital, 225 Changhai Road, Shanghai, 200438 China.
  3. Department of Oncology, Nanjing First Hospital, No. 68 Changle Road, Nanjing, 210006 China.
  4. Department of Hematology, Changhai Hospital, Second Military Medical University, 174 Changhai Road, Shanghai, 200433 China.
  5. Department of Thoracic Surgery, Jiangsu Key Laboratory of Molecular and Translational Cancer Research, Nanjing Medical University Affiliated Cancer Hospital, Cancer Institute of Jiangsu Province, Baiziting 42, Nanjing, 210009 China.
  6. Division of Neurosurgery, Department of Surgery, Prince of Wales Hospital, Chinese University of Hong Kong, 30-32 Ngan Shing Street, Sha Tin, Hong Kong, China.
  7. Division of Neurosurgery, Department of Surgery, Li Ka Shing Faculty of Medicine, University of Hong Kong, Queen Mary Hospital, 102 Pokfulam Road, Hong Kong, China.
  8. Department of Neurosurgery, Queen Elizabeth Hospital, 30 Gascoigne Road, Kowloon, Hong Kong, China.
  9. Shenzhen Center for Disease Control and Prevention, No 8 Longyuan Road, Nanshan district, Shenzhen City, 518055 China.
  10. Nanjing Brain Hospital and Nanjing Institute of Neuropsychiatry, Nanjing Medical University, Nanjing, 210029 China.
  11. Department of Medicine and Therapeutics, 9th floor, Clinical Sciences Building, The Prince of Wales Hospital, Shatin, Hong Kong.
  12. Department of Neurosurgery, Beijing Tiantan Hospital, 6 Tiantan Xili, Dongcheng District, Capital Medical University, Beijing, 100050 China.
  13. MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST, Department of Automation, Tsinghua University, Beijing, 100084 China.
  14. International Cooperation Laboratory on Signal Transduction, Eastern Hepatobiliary Surgery Hospital, 225 Changhai Road, Shanghai, 200438 China.

PMID: 25558350 PMCID: PMC4273479 DOI: 10.1186/s13336-014-0015-z

Abstract

BACKGROUND: AluScan combines inter-Alu PCR, using multiple Alu-based primers with opposite orientations, with next-generation sequencing to capture a large number of Alu-proximal genomic sequences for investigation. Because it requires only sub-microgram quantities of DNA, it facilitates the examination of large numbers of samples. However, the special features of AluScan data make it difficult to call copy number variation (CNV) directly with calling algorithms designed for whole genome sequencing (WGS) or exome sequencing.

RESULTS: In this study, an AluScanCNV package was assembled for efficient CNV calling from AluScan sequencing data. It applies a Geary-Hinkley transformation (GHT) to read-depth ratios, taken either between paired test-control samples or between test samples and a reference template constructed from reference samples, to call localized CNVs. A GISTIC-like algorithm is then used to identify recurrent CNVs, and circular binary segmentation (CBS) to reveal large extended CNVs. To evaluate the utility of CNVs called from AluScan data, the AluScans from 23 non-cancer and 38 cancer genomes were analyzed. The glioma samples yielded the familiar extended copy-number losses on chromosomes 1p and 9. The recurrent somatic CNVs identified from liver cancer samples were similar to those reported for liver cancer WGS, showing a striking enrichment of copy-number gains in chromosomes 1q and 8q. When localized or recurrent CNV-features capable of distinguishing between liver and non-liver cancer samples were selected by correlation-based machine learning, a highly accurate separation of the liver and non-liver cancer classes was attained.
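
The read-depth step described above can be illustrated with a minimal Python sketch. It is not the AluScanCNV implementation. It shows only the general Geary-Hinkley idea: the ratio of two approximately normal read counts is transformed into a statistic that is approximately standard normal under the null hypothesis of no copy-number change. The function names, the Poisson approximation (variance equal to mean), the independence of test and control counts (rho = 0), the library-size scaling and the significance threshold `alpha` are all assumptions made for illustration.

```python
import math


def geary_hinkley(r, mu_x, mu_y, var_x, var_y, rho=0.0):
    """Geary-Hinkley transform of a ratio r = x / y.

    x and y are treated as normal variables with the given means and
    variances and correlation rho; the result is approximately N(0, 1).
    """
    denom = math.sqrt(
        var_y * r * r - 2.0 * rho * math.sqrt(var_x * var_y) * r + var_x
    )
    return (mu_y * r - mu_x) / denom


def call_window(test_reads, control_reads, test_total, control_total, alpha=0.01):
    """Classify one genomic window as 'gain', 'loss' or 'neutral'.

    Hypothetical helper: expected depths under the null are derived from
    the pooled window count and the library-size ratio, and variances are
    approximated by the means (Poisson assumption).
    """
    if control_reads <= 0:
        raise ValueError("control window has no reads; ratio is undefined")

    scale = test_total / control_total            # library-size ratio
    mu_y = (test_reads + control_reads) / (1.0 + scale)  # expected control depth
    mu_x = scale * mu_y                           # expected test depth under the null
    r = test_reads / control_reads                # observed read-depth ratio

    t = geary_hinkley(r, mu_x, mu_y, var_x=mu_x, var_y=mu_y)
    p = math.erfc(abs(t) / math.sqrt(2.0))        # two-sided normal p-value

    if p >= alpha:
        return "neutral", t, p
    return ("gain" if t > 0 else "loss"), t, p
```

For example, `call_window(200, 100, 1_000_000, 1_000_000)` classifies a window whose test depth is twice the control depth, while equal depths give a statistic of zero. In practice the window size, the variance model and the threshold would have to be tuned to the data.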

CONCLUSIONS: The results obtained from non-cancer and cancerous tissues indicate that the AluScanCNV package can be employed to call localized, recurrent and extended CNVs from AluScan sequences. Moreover, both the localized and recurrent CNVs identified by this method can be subjected to machine-learning selection to yield distinguishing CNV-features capable of separating liver cancers from other types of cancers. Since the method is applicable to any human DNA sample, with or without a paired control, it can also be employed to analyze the constitutional CNVs of individuals.

Keywords: AluScan sequencing; CNV calling; Cancer classification; Machine learning

