Research

The Jones group focuses on understanding the genomic landscape of cancer using high-throughput sequencing data and machine learning methods. The group is based at Canada's Michael Smith Genome Sciences Centre in Vancouver, Canada, and is associated with the University of British Columbia. This work aims to support the Personalized Oncogenomics (POG) project at BC Cancer and precision cancer medicine in general.

Jake Lever, a former PhD student in the group, focussed on building and using biomedical natural language processing (BioNLP) tools to extract relevant cancer knowledge. These methods are applied to the vast PubMed and PubMed Central Open Access corpora. Below are various projects that we hope will be valuable to the research community. Jake is now a postdoctoral researcher in the Helix group at Stanford University.

Projects

CIViCmine

The CIViC database catalogues known cancer biomarkers for diagnosis, prognosis, predisposition and drug resistance. This knowledge is invaluable to personalized cancer projects, where it helps select treatments for individual patients. To assist in curating this resource and to provide a high-quality knowledge base in this area, cancer biomarkers have been mined from abstracts and full-text papers. The resulting data can be viewed with the associated web viewer. A paper is forthcoming.

CancerMine

Understanding the role of different genes in different cancer types is essential for precision cancer efforts. This project uses text mining to extract known drivers, oncogenes and tumor suppressors discussed in the literature. The resulting data can be viewed with the associated web viewer and downloaded here. A paper is forthcoming.

Kindred

Kindred is our relation extraction tool that uses a supervised learning approach. The code and associated paper are freely available. It is the successor to our BioNLP'16 Shared Task winning VERSE tool.

PubRunner

PubRunner is our framework to keep text mining results up to date. Built during a placement at the NCBI with Ben Busby, this framework manages the download of large corpora, the execution of text mining tools and the upload of results. A short paper on an early prototype can be found here and a full paper is forthcoming.