Research

The Jones group focuses on understanding the genomic landscape of cancer using high-throughput sequencing data and machine learning methods. The group is based at Canada's Michael Smith Genome Sciences Centre in Vancouver and is associated with the University of British Columbia. This work aims to support the Personalized Oncogenomics (POG) project at BC Cancer and precision cancer medicine in general.

Jake Lever, a PhD student in the group, has focussed on building and using biomedical natural language processing (BioNLP) tools to extract relevant cancer knowledge. These methods are applied to the vast PubMed and PubMed Central Open Access corpora. Below are various projects that we hope will be valuable to the research community.

Projects

CIViCmine

CIViCmine aids curation of the CIViC database of known cancer biomarkers for diagnosis, prognosis, predisposition and drug resistance. This knowledge is invaluable to personalized cancer projects, where it helps select treatments for individual patients. To assist in curation and to provide a high-quality knowledge base in this area, cancer biomarkers have been mined from abstracts and full-text papers. The resulting data can be viewed with the associated web viewer. A paper is forthcoming.

CancerMine

CancerMine uses text mining to extract known drivers, oncogenes and tumor suppressors discussed in the literature. Understanding the role of different genes in different cancer types is essential for precision cancer efforts. The project data can be viewed with the associated web viewer and downloaded from Zenodo. A preprint is available at bioRxiv.

Kindred

Kindred is our relation extraction tool that uses a supervised learning approach. The code and associated paper are freely available. It is the successor to VERSE, our tool that won the BioNLP'16 Shared Task.

PubRunner

PubRunner is our framework for keeping text mining results up to date. Built during a placement at the NCBI with Ben Busby, it manages the download of large corpora, the execution of text mining tools and the upload of results. A short paper on an early prototype can be found here and a full paper is forthcoming.