

The Jones group focuses on understanding the genomic landscape of cancer using high-throughput sequencing data and machine learning methods. The group is based at Canada's Michael Smith Genome Sciences Centre in Vancouver, Canada, and is associated with the University of British Columbia. These efforts aim to support the Personalized Oncogenomics (POG) project at BC Cancer and precision cancer medicine in general.

Jake Lever completed his PhD in the group, where he focussed on building and using biomedical natural language processing (BioNLP) tools to extract relevant cancer knowledge. These methods are aimed at the vast PubMed and PubMed Central Open Access corpora. He is now a lecturer at the University of Glasgow, where he continues research in this area. Below are various projects that we hope will be valuable to the research community.



CIViCmine aids curation of the CIViC database for known cancer biomarkers for diagnosis, prognosis, predisposition and drug resistance. This knowledge is invaluable for personalized cancer projects, where it helps select treatments for individual patients. To assist in curation and to provide a high-quality knowledge base in this area, cancer biomarkers have been mined from abstracts and full-text papers. The resulting data can be viewed with the associated web viewer. The paper is available in Genome Medicine.


CancerMine uses text mining to extract known drivers, oncogenes and tumor suppressors discussed in the literature. Understanding the role of different genes in different cancer types is essential for precision cancer efforts. The project data can be viewed with the associated web viewer and downloaded from Zenodo. This work has been published in Nature Methods, and a preprint is available on bioRxiv.


Kindred is our relation extraction tool that uses a supervised learning approach. The code and associated paper are freely available. It is the successor to our BioNLP'16 Shared Task-winning VERSE tool.


PubRunner is our framework to keep text mining results up to date. Built during a placement at the NCBI with Ben Busby, this framework manages the download of large corpora, the execution of text mining tools and the upload of results. A short paper on an early prototype can be found here, and a full paper is forthcoming.