Visualisation and Analysis

ORILINX produces bedGraph output that can be loaded into genome browsers to explore predicted replication origins alongside other annotations. This section covers browser-based tools (UCSC Genome Browser, IGV, JBrowse) and scripted plotting in Python and R.

Overview

bedGraph files from ORILINX can be loaded into most genome browsers and track-plotting libraries, including:

UCSC Genome Browser - Web-based, no installation needed
IGV (Integrative Genomics Viewer) - Desktop application with advanced features
JBrowse - Lightweight web browser for genomic data
Gviz (R) or pyBigWig (Python) - Programmatic access and plotting (pyBigWig reads BigWig files, so convert the bedGraph first)

The basic workflow is:

Run ORILINX to generate bedGraph files
Load them into a genome browser
Compare with other genomic features (genes, regulatory elements, etc.)
Interpret results in biological context

UCSC Genome Browser

The UCSC Genome Browser is a free, web-based tool that requires no installation.

Basic Usage

Generate your ORILINX results:

orilinx --fasta_path hg38.fa --output_dir results --sequence_names chr8

Go to https://genome.ucsc.edu/
Select your genome (e.g., “Human” and “Dec. 2013 (GRCh38/hg38)”)
Navigate to your region of interest (e.g., type “chr8:128862888-128870405” in the search box)
Click “Add Custom Tracks” and upload your bedGraph file:
- Copy the contents of results/chr8.bedGraph or upload the file directly
- Set the display height and colour
- Click “Submit”
Your ORILINX scores will appear as a histogram track

Customising Track Appearance

You can customise how your track appears by adding a track line as the first line of your bedGraph file. Note that UCSC spells the attribute color (American spelling) and expects type=bedGraph for this format:

track type=bedGraph name="ORILINX Origins" description="Predicted replication origins" color=50,50,200 viewLimits=0:1

To prepend this line to your bedGraph file:

echo 'track type=bedGraph name="ORILINX Origins" description="Predicted replication origins" color=50,50,200 viewLimits=0:1' > formatted.bedGraph
cat results/chr8.bedGraph >> formatted.bedGraph

Then upload formatted.bedGraph to UCSC.
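
The same header step can be scripted in Python, which is convenient when formatting many chromosomes. The sketch below is illustrative only: the function name with_track_header and its parameters are assumptions, not part of ORILINX. It builds a UCSC track line (using the color attribute and type=bedGraph) and drops any header lines already present so the output carries exactly one.

```python
def with_track_header(lines, name, description, color="50,50,200", view_limits="0:1"):
    """Return bedGraph data lines preceded by a single UCSC track line.

    Hypothetical helper for illustration; `lines` is any iterable of strings,
    for example an open file handle.
    """
    header = (
        f'track type=bedGraph name="{name}" description="{description}" '
        f"color={color} viewLimits={view_limits}"
    )
    # Drop blank lines and any existing track/browser lines
    data = [
        ln.rstrip("\n")
        for ln in lines
        if ln.strip() and not ln.startswith(("track", "browser"))
    ]
    return [header] + data


# Two in-memory windows stand in for a real file here
example = ["chr8\t0\t1000\t0.12\n", "chr8\t1000\t2000\t0.85\n"]
formatted = with_track_header(example, "ORILINX Origins", "Predicted replication origins")
print("\n".join(formatted))
```

For a real file, pass open('results/chr8.bedGraph') as the first argument and write the returned lines to formatted.bedGraph.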

Converting to BigWig for faster loading

For large files, convert bedGraph to BigWig format for faster loading:

# Download BigWig tools if needed
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bedGraphToBigWig
chmod +x bedGraphToBigWig

# Obtain chrom sizes
curl https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes > hg38.chrom.sizes

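# Convert (input must be sorted by chromosome, then start, and contain no track or browser lines)
LC_ALL=C sort -k1,1 -k2,2n results/chr8.bedGraph > results/chr8.sorted.bedGraph
./bedGraphToBigWig results/chr8.sorted.bedGraph hg38.chrom.sizes results/chr8.bw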

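BigWig is a binary format, so it cannot be pasted or uploaded into the UCSC custom track form like a bedGraph. Instead, host the .bw file at a web-accessible URL and submit a track line that points to it with bigDataUrl (replace the URL with your own):

track type=bigWig name="ORILINX Origins" description="Predicted replication origins" bigDataUrl=https://your.server/path/chr8.bw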
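
Conversion tends to fail on input that is unsorted, overlapping, or still carrying header lines, so it can help to check the bedGraph beforehand. The following is a minimal sketch, assuming a standard four-column bedGraph; check_bedgraph is a hypothetical helper, and it tests only the conditions listed in its docstring rather than everything bedGraphToBigWig validates.

```python
def check_bedgraph(lines):
    """Return a list of problems found in bedGraph lines (empty if none).

    Checks for: header/comment lines, rows with fewer than 4 columns,
    chromosomes that are not grouped together, and intervals that start
    before the previous interval on the same chromosome ends.
    """
    problems = []
    seen_chroms = set()
    prev_chrom, prev_end = None, 0
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        if line.startswith(("track", "browser", "#")):
            problems.append(f"line {n}: remove header/comment line")
            continue
        fields = line.split()
        if len(fields) < 4:
            problems.append(f"line {n}: expected 4 columns, found {len(fields)}")
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom != prev_chrom:
            if chrom in seen_chroms:
                problems.append(f"line {n}: {chrom} rows are not contiguous")
            seen_chroms.add(chrom)
            prev_chrom, prev_end = chrom, 0
        if start < prev_end:
            problems.append(f"line {n}: interval starts before the previous one ends")
        prev_end = max(prev_end, end)
    return problems


sorted_rows = ["chr8\t0\t1000\t0.1", "chr8\t1000\t2000\t0.8"]
unsorted_rows = ["chr8\t1000\t2000\t0.8", "chr8\t0\t1000\t0.1"]
print(check_bedgraph(sorted_rows))
print(check_bedgraph(unsorted_rows))
```

An empty list means none of these problems were found; any message points at the first line to fix, usually by sorting with sort -k1,1 -k2,2n.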

IGV (Integrative Genomics Viewer)

IGV is a desktop application that offers more control and advanced features than web browsers.

Installation

Download from http://software.broadinstitute.org/software/igv/
Install for your operating system (Mac, Windows, Linux)
Launch IGV

Loading ORILINX Data

Open IGV and select your genome from the genome drop-down in the toolbar, or load a custom one via Genomes → Load Genome from File
Load your bedGraph file:
- File → Load from File
- Select results/chr8.bedGraph
Navigate to your region of interest using the search box
IGV will display your ORILINX scores as a bar graph

Tips for IGV

Zoom in/out: Use the zoom controls or scroll wheel
Compare tracks: Load multiple bedGraph files simultaneously to compare regions
Overlay with annotations: Load gene annotations, ChIP-seq, or other genomic data for context
Export images: Right-click on tracks to save publication-quality figures
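Graph type and scale: Right-click the track to switch between bar chart, points, line plot and heatmap, or to set a fixed data range (e.g. 0 to 1) so that tracks are comparable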

Coloured tracks

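bedGraph has only four columns and no colour field, so a per-window colour has to be carried in a BED file instead. Create a colour-coded BED9 track from the bedGraph based on score thresholds:

# High confidence origins (score > 0.7) in red
# Medium confidence (0.3-0.7) in yellow
# Low confidence (<= 0.3) in blue
# The colour goes in the itemRgb column (9); the BED score column (5) is the
# probability rescaled to 0-1000
awk 'BEGIN {FS=OFS="\t"; print "track name=\"ORILINX classes\" itemRgb=\"On\""}
     /^(track|browser|#)/ {next}
     {if ($4 > 0.7) colour="255,0,0";
      else if ($4 > 0.3) colour="255,255,0";
      else colour="0,0,255";
      print $1, $2, $3, "win" NR, int($4 * 1000 + 0.5), ".", $2, $3, colour}' results/chr8.bedGraph > coloured.bed

Then load coloured.bed in IGV.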
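
The same thresholds can also be applied in Python. The sketch below maps each score to an RGB string and writes BED9 rows with the colour in the itemRgb column (column 9), which is where BED-aware browsers generally read per-feature colours from. The function names and the choice to store the probability in the name column are illustrative assumptions, not ORILINX output conventions.

```python
def score_to_rgb(score, high=0.7, low=0.3):
    """Map a probability to an RGB string using the thresholds above."""
    if score > high:
        return "255,0,0"      # high confidence: red
    if score > low:
        return "255,255,0"    # medium confidence: yellow
    return "0,0,255"          # low confidence: blue


def bedgraph_to_bed9(lines):
    """Convert four-column bedGraph lines to tab-separated BED9 rows."""
    rows = []
    for line in lines:
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, score = line.split()[:4]
        prob = float(score)
        # The BED score column must be an integer between 0 and 1000
        bed_score = min(1000, max(0, int(round(prob * 1000))))
        rows.append("\t".join([
            chrom, start, end,
            f"{prob:.3f}",        # name column: the probability, for hover text
            str(bed_score), ".",  # score and strand (unstranded)
            start, end,           # thickStart / thickEnd
            score_to_rgb(prob),   # itemRgb
        ]))
    return rows


example = ["chr8\t0\t1000\t0.12", "chr8\t1000\t2000\t0.85"]
bed_rows = bedgraph_to_bed9(example)
print("\n".join(bed_rows))
```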

JBrowse

JBrowse is a lightweight, embeddable genome browser suitable for web-based visualisation.

Using JBrowse Online

Visit https://jbrowse.org/jbrowse/
Select your reference genome
Add tracks:
- Click “Add Track”
- Paste the URL to your bedGraph file or upload it directly
Navigate to your region of interest

Self-hosted JBrowse

For more control, you can host JBrowse on your own server:

# Install JBrowse (see documentation)
wget https://github.com/GMOD/jbrowse/releases/download/1.11.6/JBrowse-1.11.6.zip
unzip JBrowse-1.11.6.zip

# Configure data directory
cd JBrowse-1.11.6
./bin/prepare-refseqs.pl --fasta hg38.fa

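# Add ORILINX track
# flatfile-to-json.pl formats feature files (GFF3/BED), not bedGraph, so convert the
# scores to BigWig first (see "Converting to BigWig for faster loading") and register
# the result as a quantitative track. Option names follow JBrowse 1's add-bw-track.pl;
# confirm them for your version with ./bin/add-bw-track.pl --help
cp /path/to/results/chr8.bw data/
./bin/add-bw-track.pl --bw_url chr8.bw --label orilinx_chr8 --key "ORILINX chr8" --plot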

In Python

Use Python for programmatic visualisation and analysis of ORILINX results.

Basic plot with Matplotlib

import pandas as pd
import matplotlib.pyplot as plt

# Read CSV output
df = pd.read_csv('results/chr8.csv')

# Plot probability scores
plt.figure(figsize=(14, 4))
plt.plot(df['start'], df['probability'], linewidth=0.5)
plt.fill_between(df['start'], df['probability'], alpha=0.3)
plt.xlabel('Genomic Position (bp)')
plt.ylabel('Origin Probability')
plt.title('ORILINX Predictions - Chr8')
plt.tight_layout()
plt.savefig('origins_plot.png', dpi=300)
plt.show()

Finding high-confidence origins

import pandas as pd

df = pd.read_csv('results/chr8.csv')

# Filter for high-confidence origins (>0.7 probability)
origins = df[df['probability'] > 0.7]

print(f"Found {len(origins)} high-confidence origins")
print(origins[['start', 'end', 'probability']])

# Export for further analysis
origins.to_csv('high_confidence_origins.csv', index=False)
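
Neighbouring windows above the threshold usually belong to the same predicted origin, so it is often more useful to report merged regions than individual windows. The following is a minimal sketch of that merging step. It assumes windows are sorted by start and that adjacent windows share a boundary (start equals the previous end); merge_windows and its gap parameter are hypothetical names introduced here.

```python
def merge_windows(windows, threshold=0.7, gap=0):
    """Merge touching windows above `threshold` into regions.

    `windows` is an iterable of (start, end, probability) tuples sorted by
    start. Windows whose start is within `gap` bp of the previous region's
    end are merged. Returns a list of (start, end, max_probability).
    """
    regions = []
    for start, end, prob in windows:
        if prob <= threshold:
            continue
        if regions and start <= regions[-1][1] + gap:
            prev_start, prev_end, prev_max = regions[-1]
            regions[-1] = (prev_start, max(prev_end, end), max(prev_max, prob))
        else:
            regions.append((start, end, prob))
    return regions


windows = [
    (0, 1000, 0.20),
    (1000, 2000, 0.80),
    (2000, 3000, 0.90),
    (3000, 4000, 0.10),
    (4000, 5000, 0.75),
]
merged = merge_windows(windows)
print(merged)
```

With a pandas DataFrame, the input can be produced with df[['start', 'end', 'probability']].itertuples(index=False, name=None). If the CSV uses closed coordinates, where the next window starts one base after the previous end, set gap=1.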

Interactive visualisation with Plotly

import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('results/chr8.csv')

fig = go.Figure()

fig.add_trace(go.Scatter(
    x=df['start'],
    y=df['probability'],
    mode='lines',
    name='ORILINX Probability',
    fill='tozeroy'
))

# Add threshold lines
fig.add_hline(y=0.7, line_dash="dash", line_color="red",
              annotation_text="High confidence", annotation_position="right")
fig.add_hline(y=0.3, line_dash="dash", line_color="orange",
              annotation_text="Low confidence", annotation_position="right")

fig.update_layout(
    title='ORILINX Predictions with Confidence Thresholds',
    xaxis_title='Genomic Position (bp)',
    yaxis_title='Origin Probability',
    hovermode='x unified'
)

fig.show()
fig.write_html('origins_interactive.html')

Comparison of multiple regions

import pandas as pd
import matplotlib.pyplot as plt

# Load multiple regions
regions = {}
for region in ['chr1', 'chr8', 'chrX']:
    regions[region] = pd.read_csv(f'results/{region}.csv')

# Plot comparison
fig, axes = plt.subplots(len(regions), 1, figsize=(14, 3*len(regions)))

for idx, (region, df) in enumerate(regions.items()):
    axes[idx].plot(df['start'], df['probability'], linewidth=0.5)
    axes[idx].fill_between(df['start'], df['probability'], alpha=0.3)
    axes[idx].set_title(f'{region} ORILINX Predictions')
    axes[idx].set_ylabel('Probability')
    if idx == len(regions) - 1:
        axes[idx].set_xlabel('Genomic Position (bp)')

plt.tight_layout()
plt.savefig('multiregion_comparison.png', dpi=300)
plt.show()
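
Alongside the plots, a few summary numbers per region make the comparison easier to report. This is a small sketch using only the standard library; the summarise function and the 0.7 threshold default are illustrative choices, and the example probabilities are made up for demonstration.

```python
from statistics import mean


def summarise(probabilities, threshold=0.7):
    """Summarise one region's window probabilities."""
    probs = list(probabilities)
    if not probs:
        return {"windows": 0, "mean": None, "high": 0, "fraction_high": None}
    n_high = sum(p > threshold for p in probs)
    return {
        "windows": len(probs),
        "mean": mean(probs),
        "high": n_high,                          # windows above the threshold
        "fraction_high": n_high / len(probs),
    }


# Made-up values for demonstration only
demo = {"chr1": [0.1, 0.8, 0.9, 0.2], "chr8": [0.4, 0.5]}
for region, probs in demo.items():
    print(region, summarise(probs))
```

With the DataFrames loaded above, each call would be summarise(df['probability']).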

In R

Use R for statistical analysis and publication-quality figures.

Basic plot with ggplot2

library(ggplot2)

# Read CSV output
df <- read.csv('results/chr8.csv')

# Create plot (linewidth replaces the deprecated size argument for lines in ggplot2 >= 3.4.0)
ggplot(df, aes(x=start, y=probability)) +
  geom_line(linewidth=0.2) +
  geom_area(alpha=0.3) +
  theme_minimal() +
  labs(
    title = 'ORILINX Predictions - Chr8',
    x = 'Genomic Position (bp)',
    y = 'Origin Probability'
  ) +
  theme(text=element_text(size=12))

ggsave('origins_plot.png', width=14, height=4, dpi=300)

Finding and annotating peaks

library(dplyr)
library(ggplot2)

df <- read.csv('results/chr8.csv')

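# Group consecutive windows above the 0.7 threshold into candidate origin regions
# (a threshold-based grouping, not a local-maxima search)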
df <- df %>%
  mutate(
    is_peak = probability > 0.7,
    peak_id = cumsum(c(TRUE, diff(is_peak) != 0)) * is_peak
  )

peaks <- df %>%
  filter(is_peak) %>%
  group_by(peak_id) %>%
  summarise(
    peak_start = min(start),
    peak_end = max(end),
    peak_probability = max(probability),
    .groups = 'drop'
  )

print(peaks)
write.csv(peaks, 'predicted_origins.csv', row.names=FALSE)

Genome browser-style visualisation with Gviz

library(Gviz)
library(GenomicRanges)

# Read ORILINX data
df <- read.csv('results/chr8.csv')

# Convert to GRanges
gr <- GRanges(
  seqnames = df$chrom,
  ranges = IRanges(start = df$start, end = df$end),
  score = df$probability
)

# Create DataTrack
dtrack <- DataTrack(
  range = gr,
  name = "ORILINX",
  type = "histogram",
  col.histogram = "steelblue",
  fill.histogram = "steelblue"
)

# Plot
plotTracks(dtrack, from=128862888, to=128870405, chromosome="chr8")

Combining with other genomic features

For biological interpretation, visualise ORILINX results alongside:

Gene annotations
ChIP-seq peaks
Copy number variation
Evolutionary conservation
Chromatin accessibility

Example: Adding genes to your plot

import pandas as pd
import matplotlib.pyplot as plt

# Load ORILINX results and gene annotations
origins = pd.read_csv('results/chr8.csv')
genes = pd.read_csv('gene_annotations.csv')  # Your gene file; expects columns name, start, end

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 6), sharex=True)

# Plot ORILINX scores
ax1.plot(origins['start'], origins['probability'], linewidth=0.5)
ax1.fill_between(origins['start'], origins['probability'], alpha=0.3)
ax1.set_ylabel('ORILINX Probability')
ax1.set_title('Chr8 - Origins and Gene Structure')

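# Plot genes as horizontal bars and write each gene name above its bar
for _, gene in genes.iterrows():
    ax2.barh(0, gene['end'] - gene['start'], left=gene['start'], height=0.5)
    ax2.text(gene['start'], 0.3, gene['name'], fontsize=8, va='bottom')
ax2.set_yticks([])
ax2.set_ylabel('Genes')
ax2.set_xlabel('Genomic Position (bp)')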

plt.tight_layout()
plt.savefig('origins_with_genes.png', dpi=300)
plt.show()
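
Beyond plotting, it can be useful to list which genes each predicted origin overlaps. The sketch below is one simple way to do that, assuming half-open coordinates on a single chromosome and a gene table with name, start and end fields as in the example above. The function name and gene names are placeholders, and the linear scan is adequate for one chromosome but not tuned for genome-wide use.

```python
def overlapping_genes(region, genes):
    """Return names of genes overlapping `region`.

    `region` is (start, end); `genes` is an iterable of (name, start, end).
    Two half-open intervals overlap when each starts before the other ends.
    """
    r_start, r_end = region
    return [
        name
        for name, g_start, g_end in genes
        if g_start < r_end and r_start < g_end
    ]


# Placeholder genes and regions for demonstration only
genes = [("GENE_A", 500, 1500), ("GENE_B", 5000, 6000)]
regions = [(1000, 3000), (4000, 5000)]
hits = {region: overlapping_genes(region, genes) for region in regions}
for region, names in hits.items():
    print(region, names or "no overlapping gene")
```

With pandas, the gene tuples can be taken from genes[['name', 'start', 'end']].itertuples(index=False, name=None).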