PLoS One. 2014 Jul 11;9(7):e101850. doi: 10.1371/journal.pone.0101850. eCollection 2014.

Quickly finding orthologs as reciprocal best hits with BLAT, LAST, and UBLAST: how much do we miss?

Natalie Ward, Gabriel Moreno-Hagelsieb

Affiliations

  1. Department of Biology, Wilfrid Laurier University, Waterloo, Ontario, Canada.

PMID: 25013894 PMCID: PMC4094424 DOI: 10.1371/journal.pone.0101850

Abstract

Reciprocal Best Hits (RBHs) are a common proxy for orthology in comparative genomics. Essentially, an RBH is found when the proteins encoded by two genes, each in a different genome, find each other as the best-scoring match in the other genome. NCBI's BLAST is the software most commonly used for the sequence comparisons needed to find RBHs. Since sequence comparison can be time-consuming, we decided to compare the number and quality of RBHs detected using algorithms that run in a fraction of the time BLAST requires. We tested BLAT, LAST, and UBLAST. All three programs ran in a hundredth to a 25th of the time required to run BLAST. A reduction in the number of homologs and RBHs found by the faster algorithms, compared to BLAST, becomes apparent as the genomes compared become more dissimilar, with BLAT, a program optimized for quickly finding very similar sequences, missing both the most homologs and the most RBHs. Though LAST produced the numbers of homologs and RBHs closest to those produced with BLAST, UBLAST was very close: between dissimilar genomes, either program found between 0.6 and 0.8 times as many RBHs as BLAST, while between more similar genomes the differences were barely apparent. UBLAST ran faster than LAST, making it the best option among the programs tested.
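
The RBH definition given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the search results from any of the programs have already been parsed into (query, subject, score) tuples, one list per search direction. The names `best_hits` and `reciprocal_best_hits`, the toy identifiers, and the scores are hypothetical. Ties are broken by keeping the first hit seen, and no E-value or coverage filtering is applied, since the abstract does not describe those details.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record type: (query_id, subject_id, score); higher score is better.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its single best-scoring subject (first seen wins ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data (invented for illustration): genome A proteins searched against B, and B against A.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0),
          ("a2", "b2", 180.0), ("a3", "b1", 120.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a2", 175.0),
          ("b2", "a1", 88.0), ("b3", "a3", 60.0)]

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

In this toy case `a3` finds `b1` as its best hit, but `b1` prefers `a1`, so the pair is not reciprocal and is excluded. Running the same function on the results of two different search programs and dividing the pair counts would give a ratio of the kind the abstract reports.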
