title: Gradio-IGV
emoji: 🧬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.23.3
app_file: app.py
pinned: false
license: bsd-3-clause
tags:
- gradio-custom-component
gradio-igv
Embed IGV.js in your app to visualize genomics data.
Installation
pip install gradio_igv
Quickstart
The following quickstart builds an app that views a CRAM file in IGV and displays a table of the first 20 visible reads, sorted by position.
import gradio as gr
from gradio_igv import IGV, IGVContext, AlignmentTrackLoad, FeatureContext, parse_locus
import pandas as pd
public_cram = "https://s3.amazonaws.com/1000genomes/data/HG00103/alignment/HG00103.alt_bwamem_GRCh38DH.20150718.GBR.low_coverage.cram"
default_igv_context = IGVContext(
    genome="hg38",
).update_locus("BRCA1").add_track(
    AlignmentTrackLoad(
        name="HG00103",
        url=public_cram,
        indexURL=f"{public_cram}.crai",
        order=1,
        height=200,
        colorBy="strand",
        oauthToken=None,  # Public file so no auth needed; otherwise inferred by URL type using environment
    )
)
def summarize_visible_alignments(igv_context):
    loci = parse_locus(igv_context.locus)
    feature_ctx = FeatureContext(
        files=[public_cram],
        names=["HG00103"],
        loci=loci,
    )
    reads = list(feature_ctx.features["HG00103"])
    df = pd.DataFrame({
        "Read Name": [read.query_name for read in reads],
        "Pos": [read.reference_start for read in reads],
        "MAPQ": [read.mapq for read in reads],
    }).sort_values(by='Pos')
    return df.head(20)
with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column(scale=3):
            igv_component = IGV(value=default_igv_context, label="IGV Browser")
        with gr.Column(scale=1):
            alignment_summary = gr.DataFrame(value=pd.DataFrame(), label="Alignment Summary", max_height=800)
    igv_component.locuschange(summarize_visible_alignments, [igv_component], [alignment_summary])

if __name__ == "__main__":
    demo.launch()
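The summarizing step in the quickstart can be tried without a CRAM file or a running app. The sketch below uses a stand-in `Read` tuple (an assumption for illustration, carrying only the three attributes the quickstart reads) and plain dictionaries instead of a pandas DataFrame. It shows that the table holds the reads with the lowest start positions, not the first reads encountered.

```python
from collections import namedtuple

# Stand-in for an aligned read (hypothetical; only the attributes the quickstart uses).
Read = namedtuple("Read", ["query_name", "reference_start", "mapq"])


def summarize_reads(reads, limit=20):
    """Build one row per read, sort by start position, keep the first `limit` rows."""
    rows = [
        {"Read Name": r.query_name, "Pos": r.reference_start, "MAPQ": r.mapq}
        for r in reads
    ]
    rows.sort(key=lambda row: row["Pos"])
    return rows[:limit]


reads = [Read("r3", 300, 60), Read("r1", 100, 0), Read("r2", 200, 30)]
for row in summarize_reads(reads, limit=2):
    print(row)
```

With `limit=2`, the read starting at position 300 is dropped even though it came first in the input.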
Usage
The library contains a few fundamental classes that work together to manipulate the IGV.js instance in your app.
IGV
This is the main component that you will use to embed IGV.js in your app. It takes an IGVContext object as input and returns an IGVContext object as output. The IGVContext object is used to configure the IGV instance.
IGVContext
This object is used to configure the IGV instance. It should be initialized using a choice of reference genome (generally either hg19 or hg38, but you can also use the ReferenceGenome object to link to a custom FASTA). You can then update the locus, add tracks, and configure the IGV instance using the following methods:
- update_locus: Update the current locus of the IGV instance using a string.
- add_track: Add a track to the IGV instance using a TrackLoad object.
- remove_track: Remove a track from the IGV instance using a string.
- update_genome: Update the reference genome of the IGV instance using a string or a ReferenceGenome object.
TrackLoad
The TrackLoad class has several subclasses that make it easier to load and configure different kinds of tracks. Their defaults are meant to reflect those of IGV.js. The following classes are available:
- AnnotationTrackLoad: Load an annotation track using e.g. a GFF, GTF, or BED file.
- AlignmentTrackLoad: Load an alignment track using a CRAM or BAM file.
- VariantTrackLoad: Load a variant track using a VCF or BCF file.
At the moment, these are the only track types that can be loaded. Feel free to open an issue if you have a use case for more!
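As a rough illustration of how file types line up with the three classes, the sketch below picks a class name from a file URL's extension. The helper `track_class_for` and its mapping are hypothetical and not part of gradio_igv. Stripping a trailing `.gz` or `.bgz` is also an assumption about compressed inputs.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension -> TrackLoad subclass name, following the list above.
# Illustrative only: this table is not defined by gradio_igv.
TRACK_CLASS_BY_EXTENSION = {
    ".gff": "AnnotationTrackLoad",
    ".gtf": "AnnotationTrackLoad",
    ".bed": "AnnotationTrackLoad",
    ".cram": "AlignmentTrackLoad",
    ".bam": "AlignmentTrackLoad",
    ".vcf": "VariantTrackLoad",
    ".bcf": "VariantTrackLoad",
}


def track_class_for(url: str) -> str:
    """Return the name of the TrackLoad subclass that matches the file extension."""
    suffixes = [s.lower() for s in PurePosixPath(urlparse(url).path).suffixes]
    # Assumption: ignore a trailing compression suffix such as .gz or .bgz.
    if suffixes and suffixes[-1] in (".gz", ".bgz"):
        suffixes = suffixes[:-1]
    if not suffixes or suffixes[-1] not in TRACK_CLASS_BY_EXTENSION:
        raise ValueError(f"no supported track type for {url!r}")
    return TRACK_CLASS_BY_EXTENSION[suffixes[-1]]


print(track_class_for("https://example.org/data/sample.low_coverage.cram"))
print(track_class_for("https://example.org/data/calls.vcf.gz"))
```

The returned value is only a string. In an app you would still construct the corresponding class yourself, as the Quickstart does with AlignmentTrackLoad.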
To see the full set of customizable parameters, check the input fields of each class and compare them with the IGV.js documentation. Note that OAuth token generation should be handled automatically based on the URL provided if the file lives in Google Cloud, Azure, or AWS, assuming your environment is configured appropriately to access the file.
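To make "based on the URL provided" more concrete, here is a minimal sketch of how a URL could be classified by cloud provider from its scheme and host name. The function `storage_provider` is hypothetical. How gradio_igv actually decides, and how it obtains a token afterwards, is not shown here and may differ.

```python
from typing import Optional
from urllib.parse import urlparse


def storage_provider(url: str) -> Optional[str]:
    """Guess the cloud provider hosting `url`, or return None if unrecognized."""
    parsed = urlparse(url)
    scheme = parsed.scheme.lower()
    host = (parsed.hostname or "").lower()

    # Google Cloud Storage: gs:// URIs or the storage.googleapis.com endpoint.
    if scheme == "gs" or host == "storage.googleapis.com" or host.endswith(".storage.googleapis.com"):
        return "gcloud"
    # Amazon S3: s3:// URIs or any amazonaws.com host.
    if scheme == "s3" or host == "amazonaws.com" or host.endswith(".amazonaws.com"):
        return "aws"
    # Azure Blob Storage: <account>.blob.core.windows.net.
    if host.endswith(".blob.core.windows.net"):
        return "azure"
    return None


print(storage_provider("gs://my-bucket/sample.bam"))
print(storage_provider("https://example.org/sample.bam"))
```

A public file, like the CRAM in the Quickstart, needs no token even when a provider is recognized, which is why the Quickstart passes `oauthToken=None` explicitly.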
FeatureContext
This is a convenience object that you can create with the same file names and loci as the IGVContext in order to fetch features from those files. It is useful for fetching the features that are currently visible in the IGV instance, as in the Quickstart example above, and it works well with the parse_locus function used there. The features attribute is a dictionary whose keys are the names provided for each file and whose values are iterables over the visible features. This uses pysam under the hood for genomics file I/O.
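For readers unfamiliar with IGV locus strings, the sketch below shows one way a string such as `chr17:43,044,295-43,125,483` can be split into a chromosome, start, and end. It is not the library's parse_locus. The name `parse_locus_sketch`, the `Locus` tuple, and the list return type are assumptions made for illustration, and gene symbols such as `BRCA1` are deliberately not handled.

```python
import re
from typing import List, NamedTuple


class Locus(NamedTuple):
    chrom: str
    start: int
    end: int


# chrom:start-end, where the coordinates may contain thousands separators.
_LOCUS_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_locus_sketch(locus: str) -> List[Locus]:
    """Parse a whitespace-separated list of chrom:start-end ranges."""
    loci = []
    for part in locus.split():
        match = _LOCUS_RE.match(part)
        if match is None:
            raise ValueError(f"not a chrom:start-end locus: {part!r}")
        start = int(match["start"].replace(",", ""))
        end = int(match["end"].replace(",", ""))
        if start > end:
            raise ValueError(f"start is after end in {part!r}")
        loci.append(Locus(match["chrom"], start, end))
    return loci


print(parse_locus_sketch("chr17:43,044,295-43,125,483"))
```

Splitting on whitespace lets the same helper accept several ranges in one string, which is one plausible reason the Quickstart stores the result in a variable named `loci`.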
Recipes
Be sure to check out the recipes directory for more complete examples of how to use the library, along with some tools that might be useful, including:
- SV-Breakpoint-Visualizer: A tool to visualize structural variant breakpoints in IGV with statistics on reads covering the edges.