Intersection#

This section covers common patterns for finding overlapping genomic features using GIQL’s spatial operators.

Finding Overlapping Features#

Basic Overlap Query#

Find all features in table A that overlap with any feature in table B:

SELECT DISTINCT a.*
FROM features_a a, features_b b
WHERE a.interval INTERSECTS b.interval

Use case: Identify variants that fall within gene regions.

Get All Overlap Pairs#

Return every pair of overlapping features (may produce duplicates if one feature overlaps multiple others):

SELECT a.*, b.*
FROM features_a a, features_b b
WHERE a.interval INTERSECTS b.interval

Use case: Generate a full overlap matrix for downstream analysis.

Query Against a Specific Region#

Find features overlapping a literal genomic range:

SELECT * FROM variants
WHERE interval INTERSECTS 'chr1:1000000-2000000'

Use case: Extract all data for a specific chromosomal region.

Filtering by Overlap#

Excluding Overlaps#

Find features in A that do NOT overlap with any feature in B:

SELECT a.*
FROM features_a a
LEFT JOIN features_b b ON a.interval INTERSECTS b.interval
WHERE b.chrom IS NULL

Use case: Find regulatory regions that don’t overlap with known genes, or identify variants outside of exonic regions.

Features with Any Overlap (Unique)#

Return each feature from A only once, regardless of how many B features it overlaps:

SELECT DISTINCT a.*
FROM features_a a
INNER JOIN features_b b ON a.interval INTERSECTS b.interval

Use case: Get a deduplicated list of features that have at least one overlap.

Counting Overlaps#

Count Overlapping Features#

Count how many B features each A feature overlaps:

SELECT a.*, COUNT(b.name) AS overlap_count
FROM features_a a
LEFT JOIN features_b b ON a.interval INTERSECTS b.interval
GROUP BY a.chrom, a.start, a.end, a.name, a.score, a.strand

Use case: Calculate how many enhancers each gene overlaps with, or count variants per feature.

Filter by Overlap Count#

Find features that overlap at least N other features:

SELECT a.*
FROM features_a a
INNER JOIN features_b b ON a.interval INTERSECTS b.interval
GROUP BY a.chrom, a.start, a.end, a.name, a.score, a.strand
HAVING COUNT(*) >= 3

Use case: Identify hotspot regions with high feature density.

Strand-Specific Operations#

Same-Strand Overlaps#

Find overlapping features on the same strand:

SELECT a.*, b.name AS b_name
FROM features_a a, features_b b
WHERE a.interval INTERSECTS b.interval
  AND a.strand = b.strand

Use case: Find sense-strand overlaps for transcript analysis.

Opposite-Strand Overlaps#

Find overlapping features on opposite strands:

SELECT a.*, b.name AS b_name
FROM features_a a, features_b b
WHERE a.interval INTERSECTS b.interval
  AND a.strand != b.strand
  AND a.strand IN ('+', '-')
  AND b.strand IN ('+', '-')

Use case: Identify antisense overlaps or convergent transcription.

Overlap Fraction Requirements#

Minimum Overlap Fraction of A#

Find overlaps where at least 50% of feature A is covered:

SELECT a.*
FROM features_a a, features_b b
WHERE a.interval INTERSECTS b.interval
  AND (
      LEAST(a.end, b.end) - GREATEST(a.start, b.start)
  ) >= 0.5 * (a.end - a.start)

Use case: Ensure substantial overlap rather than just touching edges.

Minimum Overlap Fraction of B#

Find overlaps where at least 50% of feature B is covered:

SELECT a.*
FROM features_a a, features_b b
WHERE a.interval INTERSECTS b.interval
  AND (
      LEAST(a.end, b.end) - GREATEST(a.start, b.start)
  ) >= 0.5 * (b.end - b.start)

Use case: Find features that substantially cover smaller annotations.

Reciprocal Overlap#

Require both features to have at least 50% mutual overlap:

WITH overlap_calcs AS (
    SELECT
        a.*,
        b.name AS b_name,
        (LEAST(a.end, b.end) - GREATEST(a.start, b.start)) AS overlap_bp,
        (a.end - a.start) AS a_length,
        (b.end - b.start) AS b_length
    FROM features_a a, features_b b
    WHERE a.interval INTERSECTS b.interval
)
SELECT *
FROM overlap_calcs
WHERE overlap_bp >= 0.5 * a_length
  AND overlap_bp >= 0.5 * b_length

Use case: Find high-confidence overlaps where features mutually cover each other.

Join Patterns#

Left Outer Join#

Report all features from A, with B information where available:

SELECT a.*, b.name AS overlapping_feature
FROM features_a a
LEFT JOIN features_b b ON a.interval INTERSECTS b.interval

Use case: Annotate features with overlap information while keeping all records.

Calculate Overlap Amount#

Return the overlap size in base pairs:

SELECT
    a.*,
    b.name AS b_name,
    (LEAST(a.end, b.end) - GREATEST(a.start, b.start)) AS overlap_bp
FROM features_a a, features_b b
WHERE a.interval INTERSECTS b.interval

Use case: Quantify the extent of each overlap.

Overlap with NULL Handling#

Report overlap amount for all A features, with 0 for non-overlapping:

SELECT
    a.*,
    b.name AS b_name,
    CASE
        WHEN b.chrom IS NULL THEN 0
        ELSE LEAST(a.end, b.end) - GREATEST(a.start, b.start)
    END AS overlap_bp
FROM features_a a
LEFT JOIN features_b b ON a.interval INTERSECTS b.interval

Use case: Create a complete overlap report including non-overlapping features.

Multi-Table Operations#

Union Multiple Sources#

Intersect A with features from multiple B tables:

WITH all_b_features AS (
    SELECT * FROM features_b1
    UNION ALL
    SELECT * FROM features_b2
    UNION ALL
    SELECT * FROM features_b3
)
SELECT DISTINCT a.*
FROM features_a a
INNER JOIN all_b_features b ON a.interval INTERSECTS b.interval

Use case: Find features overlapping any region from multiple annotation sources.

Track Overlap Source#

Know which source table each overlap came from:

WITH all_b_features AS (
    SELECT *, 'source1' AS source FROM features_b1
    UNION ALL
    SELECT *, 'source2' AS source FROM features_b2
    UNION ALL
    SELECT *, 'source3' AS source FROM features_b3
)
SELECT a.*, b.name AS overlap_name, b.source
FROM features_a a
INNER JOIN all_b_features b ON a.interval INTERSECTS b.interval

Use case: Track which annotation database each overlap originated from.

Complex Filtering#

Overlap with Quality Filters#

Combine spatial and attribute filters:

SELECT v.*, g.name AS gene_name
FROM variants v
INNER JOIN genes g ON v.interval INTERSECTS g.interval
WHERE v.quality >= 30
  AND g.biotype = 'protein_coding'
ORDER BY v.chrom, v.start

Use case: Find high-quality variants in protein-coding genes.

Specific Target Genes#

Find overlaps with a specific set of genes:

SELECT v.*, g.name AS gene_name
FROM variants v
INNER JOIN genes g ON v.interval INTERSECTS g.interval
WHERE g.name IN ('BRCA1', 'BRCA2', 'TP53', 'EGFR')
ORDER BY g.name, v.start

Use case: Extract variants in clinically relevant genes.