In the post-genomic era, drug discovery efforts are faced with an exponential growth of large, systematically derived datasets. These are fuelled by faster and smarter technologies, and span the breadth of the drug discovery process. Advances in Next Generation Sequencing, Genome Wide Association Studies, proteomics, RNAi studies, chemical screening and structural biology mean that the cancer translational research field can tap into an unprecedented wealth of data.
However, to realise the therapeutic opportunities these data can deliver, we must integrate and represent them effectively, and provide a holistic view of the knowledge they yield.
To address this challenge, we have developed a cancer-focussed knowledge base, canSAR, that integrates heterogeneous biological, chemical, pharmacological and other data of clinical relevance.
The database is modular and extensible to allow for future data growth. The public version of canSAR (v1.0) contains ~8 million experimentally derived measurements, ~700,000 unique biologically active chemical structures and data for >1,000 cancer cell lines. These data are collated from a number of public sources, and collectively annotated to ensure seamless integration. canSAR will also contain annotated molecular target data representing the human genome and a number of model organisms. Context-driven data are generated 'on the fly' from collaborator databases such as ChEMBL, ROCK and ArrayExpress.
canSAR is accessed through a user-friendly web-based interface that supports flexible querying.
canSAR aims to accelerate knowledge provision to cancer researchers, and to help users from different disciplines use the full breadth of these chemogenomic data to answer specific questions. For example, users can:
- View Target Synopses to summarise publicly known data about a target and its variants including structural annotation, cancer-relevant mutations, chemical screening data, pathway and interaction networks, and identify cell lines that express the target for experimental analysis.
- View the bioactivity profile for a compound or chemical hit series, and identify likely off-target activity and selectivity issues.
- View chemical annotation of a protein interaction network, and use this to select the best chemical intervention points for this network.
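
As an illustration of the second use case above, the following is a minimal sketch of how a bioactivity profile might be summarised and screened for possible off-target activity. It does not use canSAR's interface or data: the records, the compound and target names, the function names and the one-log-unit selectivity window are all hypothetical, chosen only to show the idea.

```python
from collections import defaultdict
from statistics import median

# Hypothetical bioactivity records: (compound, target, pIC50).
# These are illustrative values, not data taken from canSAR.
RECORDS = [
    ("CMPD-1", "KINASE_A", 8.2),
    ("CMPD-1", "KINASE_A", 8.0),
    ("CMPD-1", "KINASE_B", 7.6),
    ("CMPD-1", "KINASE_C", 5.1),
]


def bioactivity_profile(records, compound):
    """Return the median pIC50 per target for one compound."""
    by_target = defaultdict(list)
    for cmpd, target, pic50 in records:
        if cmpd == compound:
            by_target[target].append(pic50)
    return {target: median(values) for target, values in by_target.items()}


def off_target_flags(profile, primary, window=1.0):
    """List targets whose potency lies within `window` log units of the
    primary target (an assumed, arbitrary selectivity threshold)."""
    reference = profile[primary]
    return sorted(
        target
        for target, potency in profile.items()
        if target != primary and reference - potency < window
    )


profile = bioactivity_profile(RECORDS, "CMPD-1")
flags = off_target_flags(profile, primary="KINASE_A")
print(profile)
print(flags)
```

In this toy profile, the compound is only about half a log unit less potent against KINASE_B than against its primary target, so KINASE_B is flagged as a possible selectivity issue, while KINASE_C falls well outside the window.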
canSAR is supported by Cancer Research UK grant number C309/A8274.