Spatial Relationship Operators
Spatial relationship operators test positional relationships between genomic ranges. These are the core operators for determining whether genomic intervals overlap, contain, or are contained within other intervals.
INTERSECTS
Returns true when two genomic ranges overlap by at least one base pair.
Description
The INTERSECTS operator is the most commonly used spatial operator. It tests
whether two genomic intervals share any overlapping bases. Two intervals intersect
if they are on the same chromosome and their coordinate ranges overlap.
Mathematically, intervals [start_a, end_a) and [start_b, end_b) intersect when:
They are on the same chromosome, AND
start_a < end_bANDstart_b < end_a
Syntax
-- Compare against a literal range
interval INTERSECTS 'chr1:1000-2000'
-- Compare against another genomic column (joins)
a.interval INTERSECTS b.interval
-- With set quantifiers
interval INTERSECTS ANY('chr1:1000-2000', 'chr2:5000-6000')
Parameters
- interval
A genomic column registered with the engine via
register_table_schema().- literal_range
A string literal specifying a genomic range in the format
'chromosome:start-end'.- other_interval
Another genomic column from the same or different table (for joins).
Return Value
Boolean: true if the ranges overlap, false otherwise.
Examples
Basic Usage:
Find all variants that overlap a specific genomic region:
cursor = engine.execute("""
SELECT * FROM variants
WHERE interval INTERSECTS 'chr1:1000-2000'
""")
Column-to-Column Joins:
Find variants that overlap with any gene:
cursor = engine.execute("""
SELECT v.*, g.name AS gene_name
FROM variants v
INNER JOIN genes g ON v.interval INTERSECTS g.interval
""")
With WHERE Clause:
Find overlapping features with additional filtering:
cursor = engine.execute("""
SELECT v.*, g.name
FROM variants v
INNER JOIN genes g ON v.interval INTERSECTS g.interval
WHERE v.quality >= 30
AND g.biotype = 'protein_coding'
""")
Left Outer Join:
Find all variants, with gene information where available:
cursor = engine.execute("""
SELECT v.*, g.name AS gene_name
FROM variants v
LEFT JOIN genes g ON v.interval INTERSECTS g.interval
""")
Backend Compatibility
Backend |
Support |
Notes |
|---|---|---|
DuckDB |
Full |
|
SQLite |
Full |
|
PostgreSQL |
Planned |
Targeted for future release |
Performance Notes
Create indexes on
(chromosome, start_pos, end_pos)for better join performanceWhen joining large tables, consider filtering by chromosome first
The generated SQL uses efficient range comparison predicates
CONTAINS
Returns true when one genomic range fully contains another.
Description
The CONTAINS operator tests whether one genomic interval completely encompasses
another. The containing interval must cover the entire span of the contained interval.
Mathematically, interval [start_a, end_a) contains [start_b, end_b) when:
They are on the same chromosome, AND
start_a <= start_bANDend_a >= end_b
Syntax
-- Check if interval contains a point
interval CONTAINS 'chr1:1500'
-- Check if interval contains a range
interval CONTAINS 'chr1:1200-1800'
-- Column-to-column comparison
gene.interval CONTAINS variant.interval
Parameters
- interval
A genomic column registered with the engine.
- literal_range
A string literal specifying a genomic point or range.
- other_interval
Another genomic column for comparisons.
Return Value
Boolean: true if the first range fully contains the second, false otherwise.
Examples
Point Containment:
Find genes that contain a specific position:
cursor = engine.execute("""
SELECT * FROM genes
WHERE interval CONTAINS 'chr1:1500'
""")
Range Containment:
Find large features that fully contain smaller features:
cursor = engine.execute("""
SELECT g.name AS gene_name, e.name AS exon_name
FROM genes g
INNER JOIN exons e ON g.interval CONTAINS e.interval
""")
Filtering Fully Contained Variants:
Find variants that are completely within gene boundaries:
cursor = engine.execute("""
SELECT v.*
FROM variants v
INNER JOIN genes g ON g.interval CONTAINS v.interval
""")
Backend Compatibility
Backend |
Support |
Notes |
|---|---|---|
DuckDB |
Full |
|
SQLite |
Full |
|
PostgreSQL |
Planned |
Related Operators
WITHIN - Inverse of CONTAINS
INTERSECTS - Partial overlap (less strict)
WITHIN
Returns true when one genomic range is fully contained within another.
Description
The WITHIN operator is the inverse of CONTAINS. It tests whether a genomic
interval falls completely inside another interval.
Mathematically, interval [start_a, end_a) is within [start_b, end_b) when:
They are on the same chromosome, AND
start_a >= start_bANDend_a <= end_b
Syntax
-- Check if interval is within a range
interval WITHIN 'chr1:1000-5000'
-- Column-to-column comparison
variant.interval WITHIN gene.interval
Parameters
- interval
A genomic column registered with the engine.
- literal_range
A string literal specifying the containing range.
- other_interval
Another genomic column for comparisons.
Return Value
Boolean: true if the first range is fully within the second, false otherwise.
Examples
Filter to Region:
Find all features within a specific genomic window:
cursor = engine.execute("""
SELECT * FROM features
WHERE interval WITHIN 'chr1:1000000-2000000'
""")
Find Nested Features:
Find exons that are completely within their parent gene:
cursor = engine.execute("""
SELECT e.*, g.name AS gene_name
FROM exons e
INNER JOIN genes g ON e.interval WITHIN g.interval
""")
Backend Compatibility
Backend |
Support |
Notes |
|---|---|---|
DuckDB |
Full |
|
SQLite |
Full |
|
PostgreSQL |
Planned |
Related Operators
CONTAINS - Inverse of WITHIN
INTERSECTS - Partial overlap (less strict)