This repo contains:
- client-side code for spliceailookup.broadinstitute.org - contained within the index.html file and hosted via GitHub Pages.
- server-side code for SpliceAI and Pangolin REST APIs - contained within the google_cloud_run_services/ subdirectory and hosted on Google Cloud Run.
The SpliceAI and Pangolin APIs are available at the following URLs:
https://spliceai-37-xwkwwwxdwq-uc.a.run.app - SpliceAI for variants on GRCh37
https://spliceai-38-xwkwwwxdwq-uc.a.run.app - SpliceAI for variants on GRCh38
https://pangolin-37-xwkwwwxdwq-uc.a.run.app - Pangolin for variants on GRCh37
https://pangolin-38-xwkwwwxdwq-uc.a.run.app - Pangolin for variants on GRCh38
WARNING: the APIs are intended for interactive use only, and support no more than a few requests per user per minute.
To process many variants in batch, please install and run the underlying models directly on your local infrastructure.
Their source code is available at https://github.com/bw2/SpliceAI and https://github.com/bw2/Pangolin.
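For interactive scripts that still issue a handful of requests, a small client-side throttle can help stay within the limit described above. The following is a minimal sketch, not part of this repo: the `Throttle` class and the 20-second spacing are assumptions chosen for illustration, since the text only says "a few requests per user per minute".

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait().

    Hypothetical helper for illustration; not part of this repo.
    """

    def __init__(self, min_interval_seconds, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval_seconds
        self._clock = clock      # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last_call = None   # time of the previous wait() call, if any

    def wait(self):
        """Block until min_interval has elapsed since the last call.

        Returns the number of seconds actually slept.
        """
        waited = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_call = self._clock()
        return waited


# Assumed spacing: one request every 20 seconds (about 3 per minute).
throttle = Throttle(min_interval_seconds=20.0)
```

Calling `throttle.wait()` immediately before each request spaces the requests out. For anything resembling batch processing, running the models locally as recommended above remains the right approach.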
To query the API, select the appropriate base URL above, and then use the following endpoints and arguments:
/spliceai/?hg=38&distance=50&variant=chr8-140300616-T-G
Get SpliceAI scores for the given variant.
- variant (required) a variant in the format "chrom-pos-ref-alt"
- hg (required) can be 37 or 38
- distance (optional) distance parameter of SpliceAI model (default: 50)
- mask (optional) either 0 for raw scores or 1 for masked scores (default: 0). Splicing changes that strengthen annotated splice sites or weaken unannotated splice sites are typically much less pathogenic than changes that weaken annotated splice sites or strengthen unannotated splice sites. When mask is 1, the delta scores of the former, less pathogenic changes are set to 0. SpliceAI developers recommend raw scores (0) for alternative splicing analysis and masked scores (1) for variant interpretation.
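Checking the "chrom-pos-ref-alt" format before sending a request can catch typos early. The sketch below is a hypothetical client-side helper; the rules it applies (exactly four dash-separated fields, a positive integer position, alleles made only of A, C, G, T) are assumptions, and the server may accept or reject other forms.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(text: str) -> Variant:
    """Split a "chrom-pos-ref-alt" string into its four fields.

    Raises ValueError if the string does not match the assumed format.
    """
    parts = text.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected chrom-pos-ref-alt, got: {text!r}")
    chrom, pos, ref, alt = parts
    if not chrom:
        raise ValueError("chromosome is empty")
    if not re.fullmatch(r"[0-9]+", pos) or int(pos) < 1:
        raise ValueError(f"position must be a positive integer, got: {pos!r}")
    for allele in (ref, alt):
        # Assumption: alleles are plain nucleotide strings.
        if not re.fullmatch(r"[ACGTacgt]+", allele):
            raise ValueError(f"allele must contain only A, C, G, T, got: {allele!r}")
    return Variant(chrom, int(pos), ref.upper(), alt.upper())


variant = parse_variant("chr8-140300616-T-G")
```

Here `variant.chrom` is `"chr8"` and `variant.pos` is the integer `140300616`.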
/pangolin/?hg=38&distance=50&variant=chr8-140300616-T-G
Get Pangolin scores for the given variant.
- variant (required) a variant in the format "chrom-pos-ref-alt"
- hg (required) can be 37 or 38
- distance (optional) distance parameter of Pangolin model (default: 50)
- mask (optional) either 0 for raw scores or 1 for masked scores (default: 0). Splicing changes that strengthen annotated splice sites or weaken unannotated splice sites are typically much less pathogenic than changes that weaken annotated splice sites or strengthen unannotated splice sites. When mask is 1, the delta scores of the former, less pathogenic changes are set to 0. SpliceAI developers recommend raw scores (0) for alternative splicing analysis and masked scores (1) for variant interpretation.
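The endpoints and arguments above can be combined into a request URL as in the following sketch. The function names `build_query_url` and `fetch_scores` are hypothetical, as is the `base_url` override. The response format is not described here, so `fetch_scores` returns the raw response text without parsing it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URLs copied from the list above, keyed by (tool, genome build).
BASE_URLS = {
    ("spliceai", 37): "https://spliceai-37-xwkwwwxdwq-uc.a.run.app",
    ("spliceai", 38): "https://spliceai-38-xwkwwwxdwq-uc.a.run.app",
    ("pangolin", 37): "https://pangolin-37-xwkwwwxdwq-uc.a.run.app",
    ("pangolin", 38): "https://pangolin-38-xwkwwwxdwq-uc.a.run.app",
}


def build_query_url(tool, hg, variant, distance=50, mask=0, base_url=None):
    """Build the query URL for one variant.

    tool is "spliceai" or "pangolin"; hg is 37 or 38; mask is 0 or 1.
    base_url is a hypothetical override for pointing at another server.
    """
    if tool not in ("spliceai", "pangolin"):
        raise ValueError(f"unknown tool: {tool!r}")
    if hg not in (37, 38):
        raise ValueError("hg must be 37 or 38")
    if mask not in (0, 1):
        raise ValueError("mask must be 0 or 1")
    base = base_url or BASE_URLS[(tool, hg)]
    query = urlencode({"hg": hg, "distance": distance, "mask": mask, "variant": variant})
    return f"{base.rstrip('/')}/{tool}/?{query}"


def fetch_scores(url, timeout=60):
    """Send the request and return the response body as text (left unparsed)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_query_url("spliceai", 38, "chr8-140300616-T-G")
# body = fetch_scores(url)  # uncomment to send the request (interactive use only)
```

Keeping URL construction separate from the network call makes the arguments easy to inspect before anything is sent.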
The steps below describe how to install the API server on your local infrastructure. The details will vary depending on your OS, etc. If you run into issues, please submit them to the issue tracker.
- Install pytorch as described in the Pangolin installation docs
- Install the modified versions of SpliceAI and Pangolin from https://github.com/bw2/SpliceAI and https://github.com/bw2/Pangolin
- Install and start a Redis server. It caches previously computed API server responses so that they don't have to be computed again.
- Download reference fasta files: hg19.fa and hg38.fa
- Generate annotation files using the steps in the annotations README.
- Start the API server on localhost port 8080. To modify server options, edit the start_local_server.sh script:
$ git clone git@github.com:broadinstitute/SpliceAI-lookup.git # clone this repo
$ cd SpliceAI-lookup
$ python3 -m pip install -r requirements.txt # install python dependencies
$ ./start_local_server.sh
The server uses ~1.5 GB of RAM per server thread.
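The role Redis plays in the installation steps above, caching previously computed responses, follows a common look-up-then-compute pattern. The sketch below illustrates that pattern only and is not the code in server.py: `get_or_compute`, `cache_key`, and the key layout are hypothetical, and `DictCache` is an in-memory stand-in so the example runs without a Redis server. Any object with matching `get` and `set` methods, such as a redis-py client, could be passed instead.

```python
import json


class DictCache:
    """In-memory stand-in exposing the get/set subset of a Redis client."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)  # None when the key is absent, as with Redis

    def set(self, key, value):
        self._store[key] = value


def cache_key(tool, hg, distance, mask, variant):
    """Hypothetical key layout: one entry per distinct combination of arguments."""
    return f"{tool}:hg{hg}:d{distance}:m{mask}:{variant}"


def get_or_compute(cache, key, compute):
    """Return the cached result for key, computing and storing it on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = compute()  # the expensive step, e.g. running the model
    cache.set(key, json.dumps(result))
    return result


cache = DictCache()
key = cache_key("spliceai", 38, 50, 0, "chr8-140300616-T-G")
calls = []


def fake_model():
    # Placeholder for the real computation; the returned data is made up.
    calls.append(1)
    return {"variant": "chr8-140300616-T-G", "scores": []}


first = get_or_compute(cache, key, fake_model)
second = get_or_compute(cache, key, fake_model)  # served from the cache
```

After both calls, `fake_model` has run only once, which is the saving the cache is meant to provide.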
The spliceailookup.broadinstitute.org front-end is contained within index.html. It uses ES6 JavaScript with Semantic UI and jQuery, along with a custom version of igv.js that includes new track types for visualizing the SpliceAI and Pangolin scores.
The original server-side code is in server.py and uses the Flask library. It is designed to run on a plain Linux or macOS machine.
The new server-side code is in the google_cloud_run_services/ subdirectory and includes Dockerfiles and scripts for deploying SpliceAI and Pangolin API services to Google Cloud Run.