API Reference
This section documents the GIQL Python API.
GIQL - Genomic Interval Query Language.
A SQL dialect for genomic range queries with multi-database support.
This package provides:
- GIQL dialect extending SQL with spatial operators
- Query engine supporting multiple backends (DuckDB, SQLite)
- Range parser for genomic coordinate strings
- Schema management for genomic data
- class giql.GIQLEngine(target_dialect='duckdb', connection=None, db_path=':memory:', verbose=False, **dialect_options)[source]
Bases: object
Multi-backend GIQL query engine.
Supports multiple SQL databases by transpiling GIQL syntax to standard SQL. It can work with DuckDB, SQLite, and other backends.
Examples
Query a pandas DataFrame with DuckDB:
```python
import pandas as pd

from giql import GIQLEngine

df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "chromosome": ["chr1", "chr1", "chr2"],
        "start_pos": [1500, 10500, 500],
        "end_pos": [1600, 10600, 600],
    }
)

with GIQLEngine(target_dialect="duckdb") as engine:
    engine.conn.register("variants", df)
    cursor = engine.execute(
        "SELECT * FROM variants WHERE interval INTERSECTS 'chr1:1000-2000'"
    )
    for row in cursor:
        print(row)
```
Load from CSV:
```python
with GIQLEngine(target_dialect="duckdb") as engine:
    engine.load_csv("variants", "variants.csv")
    cursor = engine.execute(
        "SELECT * FROM variants WHERE interval INTERSECTS 'chr1:1000-2000'"
    )
    # Process rows lazily
    while True:
        row = cursor.fetchone()
        if row is None:
            break
        print(row)
```
Using SQLite backend:
```python
with GIQLEngine(target_dialect="sqlite", db_path="data.db") as engine:
    cursor = engine.execute(
        "SELECT * FROM variants WHERE interval INTERSECTS 'chr1:1000-2000'"
    )
    # Materialize all results at once
    results = cursor.fetchall()
```
- __init__(target_dialect='duckdb', connection=None, db_path=':memory:', verbose=False, **dialect_options)[source]
Initialize engine.
- Parameters:
target_dialect (Literal['duckdb', 'sqlite'] | str) – Target SQL dialect (‘duckdb’, ‘sqlite’, ‘standard’)
connection – Existing database connection (optional)
db_path (str) – Database path or connection string
verbose (bool) – Print transpiled SQL
dialect_options – Additional options for specific dialects
- close()[source]
Close database connection.
Only closes connections created by the engine. If an external connection was provided during initialization, it is not closed.
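The ownership rule described above can be illustrated with a minimal sketch. The `ConnectionOwner` class below is hypothetical and uses the standard-library `sqlite3` module as a stand-in; it is not GIQL's implementation, only one way such a rule might be expressed.

```python
import sqlite3


class ConnectionOwner:
    """Hypothetical wrapper illustrating the ownership rule; not part of GIQL."""

    def __init__(self, connection=None, db_path=":memory:"):
        # Remember whether this object created the connection itself.
        self._owns_connection = connection is None
        self.conn = connection if connection is not None else sqlite3.connect(db_path)

    def close(self):
        # Close only connections created here; leave caller-supplied ones open.
        if self._owns_connection and self.conn is not None:
            self.conn.close()
        self.conn = None


external = sqlite3.connect(":memory:")
wrapper = ConnectionOwner(connection=external)
wrapper.close()

# The caller's connection is still usable after the wrapper is closed.
still_open = external.execute("SELECT 1").fetchone()
external.close()
```

Under this pattern, whoever opens a connection remains responsible for closing it.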
- execute(giql)[source]
Execute a GIQL query and return a database cursor.
Parses the GIQL syntax, transpiles it to the target SQL dialect, and executes the query, returning a cursor for lazy iteration.
- Parameters:
giql (str) – Query string with GIQL genomic extensions
- Returns:
Database cursor (DB-API 2.0 compatible) that can be iterated
- Raises:
ValueError – If the query cannot be parsed, transpiled, or executed
- Return type:
CursorLike
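Because the returned cursor is described as DB-API 2.0 compatible, results can presumably also be consumed in fixed-size batches with `fetchmany`. The sketch below uses a plain `sqlite3` cursor as a stand-in for the cursor returned by `execute`; the `iter_batches` helper is hypothetical, and whether a given backend's `CursorLike` supports `fetchmany` is an assumption.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from a DB-API 2.0 cursor until it is exhausted."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield batch


# Stand-in for a cursor returned by engine.execute(...): a plain sqlite3 cursor.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (id INTEGER, chromosome TEXT)")
conn.executemany(
    "INSERT INTO variants VALUES (?, ?)",
    [(i, "chr1") for i in range(5)],
)
cursor = conn.execute("SELECT id FROM variants ORDER BY id")

# Five rows read two at a time arrive as batches of 2, 2, and 1 rows.
batch_sizes = [len(batch) for batch in iter_batches(cursor, batch_size=2)]
conn.close()
```

Batching keeps memory bounded like `fetchone` while avoiding a Python-level call per row.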
- register_table_schema(table_name, columns, genomic_column='interval', chrom_col='chromosome', start_col='start_pos', end_col='end_pos', strand_col='strand', coordinate_system='0based', interval_type='half_open')[source]
Register schema for a table.
This method tells the engine how genomic ranges are stored in the table, mapping logical genomic column names to physical column names.
- Parameters:
table_name (str) – Table name
genomic_column (str) – Logical name for genomic position
chrom_col (str) – Physical chromosome column
start_col (str) – Physical start position column
end_col (str) – Physical end position column
strand_col (str | None) – Physical strand column (optional)
coordinate_system (str) – Coordinate system: “0based” or “1based” (default: “0based”)
interval_type (str) – Interval endpoint handling: “half_open” or “closed” (default: “half_open”)
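To make the `coordinate_system` and `interval_type` settings concrete, the following sketch converts an interval under any of the four combinations to the 0-based half-open convention. The function `to_zero_based_half_open` is hypothetical and reflects the common genomics interpretation of these terms; it is not taken from GIQL, whose internal handling may differ.

```python
def to_zero_based_half_open(start, end, coordinate_system="0based", interval_type="half_open"):
    """Convert (start, end) to 0-based half-open coordinates.

    Hypothetical helper: assumes the usual meaning of the four conventions.
    """
    if coordinate_system not in ("0based", "1based"):
        raise ValueError(f"Unknown coordinate system: {coordinate_system!r}")
    if interval_type not in ("half_open", "closed"):
        raise ValueError(f"Unknown interval type: {interval_type!r}")

    if coordinate_system == "1based":
        # Shift both endpoints so the first base is position 0.
        start -= 1
        end -= 1
    if interval_type == "closed":
        # A closed end includes its last base; half-open excludes it.
        end += 1
    return start, end


# The same 101-base interval written under each convention.
same_interval = {
    to_zero_based_half_open(99, 200, "0based", "half_open"),
    to_zero_based_half_open(99, 199, "0based", "closed"),
    to_zero_based_half_open(100, 200, "1based", "closed"),
    to_zero_based_half_open(100, 201, "1based", "half_open"),
}
```

All four calls describe the same bases, so `same_interval` contains the single tuple `(99, 200)`.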
- transpile(giql)[source]
Transpile a GIQL query to the engine’s target SQL dialect.
Parses the GIQL syntax and transpiles it to the target SQL dialect without executing it. Useful for debugging or generating SQL for external use.
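As a rough illustration of what such a transpilation might amount to, the sketch below parses a region string and evaluates an overlap predicate in plain SQL against the sample rows from the DataFrame example. The `parse_region` helper, the `OVERLAP_SQL` text, and the half-open overlap test are assumptions for illustration; the SQL that GIQL actually emits may differ.

```python
import re
import sqlite3

# Hypothetical parser for strings such as 'chr1:1000-2000'.
REGION_PATTERN = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(region):
    match = REGION_PATTERN.match(region)
    if match is None:
        raise ValueError(f"Not a region string: {region!r}")
    return match["chrom"], int(match["start"]), int(match["end"])


# Two half-open intervals on the same chromosome overlap when each one
# starts before the other ends.
OVERLAP_SQL = (
    "SELECT id FROM variants "
    "WHERE chromosome = ? AND start_pos < ? AND end_pos > ?"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variants "
    "(id INTEGER, chromosome TEXT, start_pos INTEGER, end_pos INTEGER)"
)
conn.executemany(
    "INSERT INTO variants VALUES (?, ?, ?, ?)",
    [(1, "chr1", 1500, 1600), (2, "chr1", 10500, 10600), (3, "chr2", 500, 600)],
)

chrom, start, end = parse_region("chr1:1000-2000")
# Bind the query end against start_pos and the query start against end_pos.
matching_ids = [row[0] for row in conn.execute(OVERLAP_SQL, (chrom, end, start))]
conn.close()
```

Only the first row (chr1, 1500 to 1600) falls within chr1:1000-2000, so `matching_ids` is `[1]`.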
- Parameters:
giql (str) – Query string with GIQL genomic extensions
- Returns:
Transpiled SQL query string in the target dialect
- Raises:
ValueError – If the query cannot be parsed or transpiled
- Return type: