API Reference

Complete reference for blq's Python API and MCP tool schemas. For tutorial-style usage, see Python API Guide and MCP Guide.

Source files: src/blq/query.py, src/blq/storage.py, src/blq/services/, src/blq/serve.py


BlqStorage

Module: blq.storage

Low-level storage interface backed by BIRD (DuckDB tables + content-addressed blobs). Query methods return DuckDBPyRelation objects; call .df() for a DataFrame or .fetchall() for tuples.

Construction

from blq.storage import BlqStorage

storage = BlqStorage.open()                  # auto-find .bird from cwd
storage = BlqStorage.open("/path/to/.bird")  # explicit path

Supports context manager:

with BlqStorage.open() as storage:
    errors = storage.errors().df()

Properties

| Property | Type | Description |
| --- | --- | --- |
| path | Path | Path to .bird directory |
| connection | duckdb.DuckDBPyConnection | Underlying DuckDB connection |

Data Checks

storage.has_data() -> bool       # any runs exist
storage.has_runs() -> bool       # alias for has_data()
storage.has_events() -> bool     # any parsed events exist

Run Queries

storage.runs(limit: int | None = None) -> DuckDBPyRelation
All runs with aggregated event counts, newest first. Columns: run_id, source_name, source_type, command, tag, started_at, completed_at, exit_code, cwd, executable_path, hostname, platform, arch, git_commit, git_branch, git_dirty, ci, event_count, error_count, warning_count.

storage.run(run_id: int) -> DuckDBPyRelation
storage.latest_run_id() -> int | None

Event Queries

storage.events(
    run_id: int | None = None,
    severity: str | list[str] | None = None,
    limit: int | None = None,
) -> DuckDBPyRelation
storage.errors(run_id: int | None = None, limit: int = 20) -> DuckDBPyRelation
storage.warnings(run_id: int | None = None, limit: int = 20) -> DuckDBPyRelation
storage.event(run_serial: int, event_id: int) -> dict[str, Any] | None
storage.error_count(run_id: int | None = None) -> int
storage.warning_count(run_id: int | None = None) -> int

Status

storage.status() -> DuckDBPyRelation       # blq_status() summary
storage.source_status() -> DuckDBPyRelation # per-source latest run

Output

storage.get_output(run_id: str | int, stream: str | None = None) -> bytes | None
storage.get_output_info(run_id: str | int) -> list[dict[str, Any]]
stream accepts 'stdout', 'stderr', 'combined', or None (any).

SQL

storage.sql(query: str, params: list | None = None)
    -> DuckDBPyRelation | DuckDBPyConnection
Without params, returns a relation. With params (using ? placeholders), returns a connection result. Both support .fetchall() and .fetchone().

storage.sql("SELECT * FROM blq_load_events() WHERE fingerprint = ?", [fp])
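
The placeholder pattern above can be tried without a blq project. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, since it shares the ? placeholder style; blq itself runs on DuckDB, and the events table and its rows here are invented for illustration.

```python
import sqlite3

# Stand-in database: sqlite3 is used only because it ships with Python and
# accepts the same "?" placeholders. The table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (fingerprint TEXT, severity TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("abc123", "error", "undefined name 'foo'"),
        ("def456", "warning", "unused import"),
        ("abc123", "error", "undefined name 'foo'"),
    ],
)

fp = "abc123"

# The value is bound by the driver, never interpolated into the SQL text.
rows = conn.execute(
    "SELECT severity, message FROM events WHERE fingerprint = ?", [fp]
).fetchall()

# fetchone() returns the first matching row, or None if nothing matches.
first = conn.execute(
    "SELECT severity FROM events WHERE fingerprint = ?", [fp]
).fetchone()
```

Binding values this way avoids quoting problems when a fingerprint or message contains quote characters.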

Write Operations

storage.write_run(
    run_meta: dict[str, Any],
    events: list[dict[str, Any]] | None = None,
    output: bytes | None = None,
) -> str  # returns invocation UUID

run_meta keys: command, source_name, source_type, exit_code, started_at, completed_at, cwd, hostname, platform, arch, git_commit, git_branch, git_dirty, ci, environment.

Maintenance

storage.prune(days: int = 30) -> int                    # remove old data
storage.prune_by_max_runs(max_runs: int) -> int          # keep N per source
storage.prune_by_size(max_size_mb: int) -> int            # cap total output size
storage.cleanup_blobs() -> tuple[int, int]                # (deleted, bytes_freed)
storage.total_output_size() -> int                        # total bytes
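
As a rough illustration of what "keep N per source" means for prune_by_max_runs, the sketch below selects which runs would fall outside the newest N for each source. It is not blq's implementation: the function name runs_to_prune and the use of plain dicts with run_id, source_name, and started_at keys are assumptions made for the example.

```python
from collections import defaultdict

def runs_to_prune(runs, max_runs):
    """Return run_ids that fall outside the newest `max_runs` of each source.

    Hypothetical helper: `runs` is a list of dicts with run_id, source_name
    and started_at keys, mirroring column names from storage.runs().
    """
    by_source = defaultdict(list)
    for run in runs:
        by_source[run["source_name"]].append(run)

    doomed = []
    for source_runs in by_source.values():
        # Newest first, then everything past the first max_runs is dropped.
        source_runs.sort(key=lambda r: r["started_at"], reverse=True)
        doomed.extend(r["run_id"] for r in source_runs[max_runs:])
    return sorted(doomed)

runs = [
    {"run_id": 1, "source_name": "build", "started_at": 1},
    {"run_id": 2, "source_name": "build", "started_at": 2},
    {"run_id": 3, "source_name": "build", "started_at": 3},
    {"run_id": 4, "source_name": "test", "started_at": 4},
]
pruned = runs_to_prune(runs, max_runs=2)
```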

LogStore

Module: blq.query

Higher-level query API with a fluent LogQuery builder. Terminal methods return pandas DataFrames.

Construction

from blq.query import LogStore

store = LogStore.open()                    # auto-find .bird
store = LogStore.open("/path/to/.bird")    # explicit path
store = LogStore("/path/to/.bird")         # direct init
store = LogStore.from_parquet_root("path") # raw parquet directory

Properties

| Property | Type | Description |
| --- | --- | --- |
| path | Path | .bird directory path |
| logs_path | Path | Logs subdirectory path |
| connection | duckdb.DuckDBPyConnection | DuckDB connection |

Query Methods

store.events() -> LogQuery            # all events
store.errors() -> LogQuery            # severity='error'
store.warnings() -> LogQuery          # severity='warning'
store.run(run_id: int) -> LogQuery    # events from one run
store.runs() -> pd.DataFrame          # run summaries
store.latest_run() -> int | None      # most recent run_id
store.event(run_id: int, event_id: int) -> dict[str, Any] | None
store.has_data() -> bool

LogQuery

Module: blq.query

Fluent query builder wrapping a DuckDB relation. All operations are deferred until a terminal method is called.

Construction

from blq.query import LogQuery

LogQuery.from_file(path, format="auto", conn=None) -> LogQuery
LogQuery.from_content(content: str, format="auto", conn=None) -> LogQuery
LogQuery.from_sql(conn, sql, params=None) -> LogQuery
LogQuery.from_table(conn, table_name) -> LogQuery
LogQuery.from_parquet(path, conn=None, hive_partitioning=True) -> LogQuery
LogQuery.from_relation(rel, conn) -> LogQuery

Filtering

.filter(severity="error")              # exact match
.filter(severity=["error", "warning"]) # IN clause
.filter(ref_file="%main%")             # ILIKE pattern
.filter(severity="!info")              # NOT equal
.filter(ref_line=100)                  # numeric equality
.filter(severity=None)                 # IS NULL
.filter("ref_line > 100")              # raw SQL condition
.exclude(severity="info")              # NOT (severity = 'info')
.where("ref_line BETWEEN 10 AND 50")   # raw SQL WHERE
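
To make the keyword rules above concrete, here is a minimal sketch of how each value shape could map to a SQL condition. It mirrors the comments in the listing and is not taken from blq's source; the helper names condition and _quote are hypothetical, and a real implementation would more likely bind parameters than build strings.

```python
def _quote(value):
    """Render a Python value as a SQL literal (illustrative, not hardened)."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def condition(column, value):
    """Translate one keyword filter into a SQL condition string."""
    if value is None:
        return f"{column} IS NULL"
    if isinstance(value, (list, tuple)):
        return f"{column} IN ({', '.join(_quote(v) for v in value)})"
    if isinstance(value, str):
        if value.startswith("!"):
            # Leading "!" negates the match.
            return f"{column} != {_quote(value[1:])}"
        if "%" in value:
            # A "%" wildcard switches to case-insensitive pattern matching.
            return f"{column} ILIKE {_quote(value)}"
    return f"{column} = {_quote(value)}"
```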

Projection

.select(*columns: str) -> LogQuery
.order_by(*columns: str, desc: bool = False) -> LogQuery
.limit(n: int) -> LogQuery

Terminal Methods

.df() -> pd.DataFrame
.fetchall() -> list[tuple]
.fetchone() -> tuple | None
.count() -> int
.exists() -> bool
.show(n: int = 10) -> None          # print to stdout
.explain() -> str                    # query plan
.describe() -> pd.DataFrame         # statistics

Inspection

.columns -> list[str]
.dtypes -> list[str]

Aggregation

.group_by(*columns) -> LogQueryGrouped
.value_counts(column: str) -> pd.DataFrame

LogQueryGrouped

Module: blq.query

Returned by LogQuery.group_by(). All methods return pd.DataFrame.

grouped = query.group_by("ref_file")
grouped.count() -> pd.DataFrame
grouped.sum(column: str) -> pd.DataFrame
grouped.avg(column: str) -> pd.DataFrame
grouped.min(column: str) -> pd.DataFrame
grouped.max(column: str) -> pd.DataFrame
grouped.agg(**aggregations: str) -> pd.DataFrame

agg example: .agg(total="COUNT(*)", first_line="MIN(ref_line)")
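
One way to read the agg keyword form is as alias=expression pairs in a GROUP BY query. The sketch below builds that SQL and runs it against an in-memory sqlite3 table as a stand-in for DuckDB; the helper grouped_agg_sql and the sample rows are assumptions for illustration, not blq internals.

```python
import sqlite3

def grouped_agg_sql(table, group_cols, **aggregations):
    """Build a GROUP BY statement from alias=expression keyword arguments."""
    keys = ", ".join(group_cols)
    aggs = ", ".join(f"{expr} AS {alias}" for alias, expr in aggregations.items())
    return f"SELECT {keys}, {aggs} FROM {table} GROUP BY {keys} ORDER BY {keys}"

# Invented sample data: two events in a.py, one in b.py.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ref_file TEXT, ref_line INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("a.py", 10), ("a.py", 3), ("b.py", 7)],
)

sql = grouped_agg_sql(
    "events", ["ref_file"], total="COUNT(*)", first_line="MIN(ref_line)"
)
rows = conn.execute(sql).fetchall()
```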


Service Layer

Module: blq.services

Pure business logic shared by CLI and MCP. All query functions take BlqStorage as the first argument and return structured dicts/lists.

Refs

from blq.services import parse_ref, resolve_run_ref, ParsedRef

parse_ref(ref: str) -> ParsedRef
Parses ref strings into structured form. Raises ValueError on invalid input.

| Input | Result |
| --- | --- |
| "5" | ParsedRef(run_serial=5) |
| "build:3" | ParsedRef(tag="build", run_serial=3) |
| "test:5:2" | ParsedRef(tag="test", run_serial=5, event_id=2) |
| "5:2" | ParsedRef(run_serial=5, event_id=2) |
| "~1" | ParsedRef(relative=1) |
| "test:~2" | ParsedRef(tag="test", relative=2) |
| UUID | ParsedRef(uuid="...") |

ParsedRef properties: is_relative -> bool, run_ref -> str.
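
The ref grammar in the table above can be sketched as a small standalone parser. This is an illustrative reimplementation, not blq's parse_ref: the names Ref, parse, and UUID_RE are hypothetical, and behavior for inputs outside the table (for example a bare tag, or a relative ref followed by an event id) is a guess.

```python
import re
from dataclasses import dataclass
from typing import Optional

UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

@dataclass
class Ref:
    """Hypothetical stand-in for ParsedRef, with the fields from the table."""
    tag: Optional[str] = None
    run_serial: Optional[int] = None
    event_id: Optional[int] = None
    relative: Optional[int] = None
    uuid: Optional[str] = None

def parse(ref: str) -> Ref:
    if UUID_RE.fullmatch(ref):
        return Ref(uuid=ref)
    parts = ref.split(":")
    result = Ref()
    # A leading part that is not a run number is treated as the tag.
    if not re.fullmatch(r"~?\d+", parts[0]):
        if not re.fullmatch(r"[A-Za-z_][\w.-]*", parts[0]):
            raise ValueError(f"invalid ref: {ref!r}")
        result.tag = parts.pop(0)
    if len(parts) not in (1, 2):
        raise ValueError(f"invalid ref: {ref!r}")
    run = parts[0]
    if re.fullmatch(r"~\d+", run):
        result.relative = int(run[1:])   # "~N": N runs back
    elif re.fullmatch(r"\d+", run):
        result.run_serial = int(run)
    else:
        raise ValueError(f"invalid ref: {ref!r}")
    if len(parts) == 2:
        if not re.fullmatch(r"\d+", parts[1]):
            raise ValueError(f"invalid ref: {ref!r}")
        result.event_id = int(parts[1])
    return result
```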

resolve_run_ref(storage: BlqStorage, ref: str) -> dict | None
Resolves a ref string to a run data dict. Returns None if not found.

Queries

from blq.services import query_status, query_history, query_events, query_diff

query_status(storage: BlqStorage) -> list[dict[str, Any]]
Returns per-source status: name, status, error_count, warning_count, last_run, run_ref, run_serial.

query_history(
    storage: BlqStorage,
    limit: int = 20,
    source: str | None = None,
    status: str | None = None,   # 'running', 'completed', 'orphaned'
) -> list[dict[str, Any]]
Returns: run_ref, run_serial, source_name, status, error_count, warning_count, started_at, exit_code, command, git_commit, git_branch, git_dirty.

query_events(
    storage: BlqStorage,
    severity: str | None = None,       # 'error', 'warning', or comma-separated
    run_id: int | None = None,
    source: str | None = None,
    file_pattern: str | None = None,
    limit: int = 20,
    default_to_latest: bool = False,
    suppressed_fingerprints: list[str] | None = None,
    all_runs: bool = False,
) -> dict[str, Any]                    # {"events": [...], "total_count": int}

query_diff(storage: BlqStorage, run1: int, run2: int) -> dict[str, Any]
Returns: summary (run1_errors, run2_errors, fixed, new, unchanged), fixed (list), new (list).
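
The shape of the diff result can be illustrated with set logic over error identities. The sketch below assumes errors are matched between runs by their fingerprint field, which this reference does not state explicitly; the function name diff_errors and the sample events are hypothetical.

```python
def diff_errors(run1_errors, run2_errors):
    """Classify errors as fixed, new, or unchanged between two runs.

    Assumption: two errors are "the same" when their fingerprints match.
    """
    base = {e["fingerprint"]: e for e in run1_errors}
    curr = {e["fingerprint"]: e for e in run2_errors}
    fixed = [base[fp] for fp in base if fp not in curr]
    new = [curr[fp] for fp in curr if fp not in base]
    unchanged = [fp for fp in base if fp in curr]
    return {
        "summary": {
            "run1_errors": len(run1_errors),
            "run2_errors": len(run2_errors),
            "fixed": len(fixed),
            "new": len(new),
            "unchanged": len(unchanged),
        },
        "fixed": fixed,
        "new": new,
    }

run1 = [{"fingerprint": "a", "message": "old bug"},
        {"fingerprint": "b", "message": "still here"}]
run2 = [{"fingerprint": "b", "message": "still here"},
        {"fingerprint": "c", "message": "regression"}]
result = diff_errors(run1, run2)
```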

Inspect

from blq.services import (
    get_source_context, get_log_context,
    get_git_context, get_fingerprint_history,
)
get_source_context(
    ref_file: str | None, ref_line: int | None,
    source_root: Path, context_lines: int = 3,
) -> str | None

get_log_context(
    storage: BlqStorage | None, run_id: int,
    log_line_start: int | None, log_line_end: int | None,
    context_lines: int = 3,
) -> str | None

get_git_context(
    ref_file: str | None, ref_line: int | None,
    source_root: Path, history_limit: int = 2,
) -> dict[str, Any] | None    # {file, line, blame, recent_commits}

get_fingerprint_history(
    storage: BlqStorage | None, fingerprint: str | None,
) -> dict[str, Any] | None    # {fingerprint, first_seen, last_seen, occurrences, is_regression}

Execution

from blq.services import run_result_to_concise

run_result_to_concise(full_result: dict[str, Any], source_name: str) -> dict[str, Any]
Converts a RunResult.to_json() dict into the concise response format with keys: run_ref, cmd, status, exit_code, duration_sec, summary, output_stats. Conditionally includes errors (max 10), warnings (max 5), infos (max 5).
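
The conditional, capped event lists described above can be sketched as follows. Only the caps (10, 5, 5) and key names come from this reference; the helper name capped_events and the assumption that each event dict carries a severity key of 'error', 'warning', or 'info' are illustrative.

```python
# Output key and maximum length per severity, as documented above.
CAPS = {"error": ("errors", 10), "warning": ("warnings", 5), "info": ("infos", 5)}

def capped_events(events):
    """Group events by severity, truncate each group, and omit empty groups."""
    out = {}
    for severity, (key, cap) in CAPS.items():
        matching = [e for e in events if e.get("severity") == severity]
        if matching:                 # key is only present when there is data
            out[key] = matching[:cap]
    return out

events = [{"severity": "error", "message": f"e{i}"} for i in range(12)]
events.append({"severity": "info", "message": "note"})
concise = capped_events(events)
```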


MCP Tools

The MCP server is started via blq mcp serve. Tools are callable by any MCP client.

run

Run a registered command and capture output.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| command | str | required | Registered command name |
| args | dict[str,str] \| list[str] \| None | None | Named args (dict) or positional args (list) |
| extra | list[str] \| None | None | Passthrough arguments appended to command |
| timeout | int \| None | None | Timeout in seconds |
| lines | str \| None | None | Line selection for inline output (e.g. '+20-') |
| commands | list[str] \| None | None | Batch mode: run multiple commands in sequence |
| stop_on_failure | bool | True | Stop batch on first failure |

Returns: {run_ref, cmd, status, exit_code, duration_sec, summary, output_stats, errors?, warnings?}
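
For clients that speak the protocol directly, a tool invocation is a JSON-RPC 2.0 request with method tools/call, as defined by the MCP specification. The sketch below only builds such a request for the run tool; the helper name tool_call_request and the command name "build" are placeholders, and transport setup (stdio or HTTP) is omitted.

```python
import json

def tool_call_request(request_id, tool, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

# "build" is a placeholder for whatever command name has been registered.
request = tool_call_request(1, "run", {"command": "build", "timeout": 300})
payload = json.dumps(request)
```

Most MCP client libraries construct this envelope for you, so in practice only the tool name and the arguments dict need to be supplied.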

exec

Execute an ad-hoc shell command. Do not use pipes or redirects; use the output tool to filter captured output instead.

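| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| command | str | required | Shell command (no pipes) |
| args | list[str] \| None | None | Additional arguments |
| timeout | int \| None | None | Timeout in seconds |
| shell | bool | False | Allow shell syntax |
| lines | str \| None | None | Inline output line selection |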
lines str \| None None Inline output line selection

Returns: Same shape as run.

status

No parameters. Returns {sources: [{name, status, error_count, warning_count, last_run, run_ref}]}.

events

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | int | 20 | Max events |
| run_id | int \| None | None | Filter by run serial |
| source | str \| None | None | Filter by source name |
| severity | str \| None | None | 'error', 'warning', or comma-separated |
| file_pattern | str \| None | None | SQL LIKE pattern for ref_file |
| all_runs | bool | False | Show all runs (default: most recent only) |
| run_ids | list[int] \| None | None | Batch mode: multiple run IDs |
| limit_per_run | int | 10 | Max events per run in batch mode |

Returns: {events: [...], total_count: int}

inspect

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ref | str | required | Event ref (e.g. "build:1:3") |
| lines | int | 5 | Context lines before/after |
| include_log_context | bool | True | Include log output context |
| include_source_context | bool | True | Include source file context |
| include_git_context | bool | False | Include git blame/history |
| include_fingerprint_history | bool | False | Include occurrence history |
| refs | list[str] \| None | None | Batch mode: multiple refs |

Returns: {ref, severity, ref_file, ref_line, message, log_context?, source_context?, git_context?, fingerprint_history?}

info

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ref | str \| None | None | Run ref or UUID. None = most recent |
| head | int \| None | None | First N lines of output |
| tail | int \| None | None | Last N lines of output |
| errors | bool | False | Include error events |
| warnings | bool | False | Include warning events |
| severity | str \| None | None | Filter events by severity |
| limit | int | 20 | Max events |
| context | int \| None | None | Log context lines around each event |

Returns: {run_ref, status, exit_code, command, started_at, events?, output?, summary?}

history

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| limit | int | 20 | Max runs |
| source | str \| None | None | Filter by source name |
| status | str \| None | None | 'running', 'completed', 'orphaned' |

Returns: {runs: [...]}

query

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sql | str \| None | None | Raw SQL query |
| filter | str \| None | None | Filter expression (e.g. "severity=error ref_file~test") |
| limit | int | 100 | Max rows |

Filter syntax: key=value (exact), key=v1,v2 (IN), key~pattern (ILIKE), key!=value (not equal). Space-separated filters are AND'd.
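
The filter syntax can be illustrated with a small tokenizer that turns an expression into (column, operator, value) clauses. This is a sketch of the grammar as described above, not blq's parser: the names TOKEN and parse_filter are hypothetical, values containing spaces are not handled, and whether ~ patterns get implicit % wildcards is left open.

```python
import re

# key, then one of the operators "!=", "=", "~", then a non-empty value.
TOKEN = re.compile(r"(?P<key>\w+)(?P<op>!=|=|~)(?P<value>.+)")

def parse_filter(expr):
    """Split a filter expression into clauses that are AND'd together."""
    clauses = []
    for token in expr.split():
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"invalid filter term: {token!r}")
        key, op, value = match.group("key", "op", "value")
        if op == "=" and "," in value:
            clauses.append((key, "IN", value.split(",")))
        elif op == "~":
            clauses.append((key, "ILIKE", value))
        else:
            clauses.append((key, op, value))   # "=" or "!="
    return clauses
```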

Returns: {columns, rows, row_count}

output

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ref | str | required | Run ref (e.g. '5', 'test:3', '~1') |
| stream | str \| None | None | 'stdout', 'stderr', 'combined' |
| tail | int \| None | None | Last N lines |
| head | int \| None | None | First N lines |
| grep | str \| None | None | Regex search pattern |
| context | int | 0 | Context lines around grep matches |
| lines | str \| None | None | Line spec (e.g. '100-200') |
| debug_formats | bool | False | Show format detection info |

Returns: {output, byte_length, total_lines, ...}
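
The interaction between grep and context can be sketched as a regex search that also keeps a window of neighboring lines. This is an illustration of the idea only; the function name grep_with_context is hypothetical, and details such as how the tool marks gaps between match groups are not covered.

```python
import re

def grep_with_context(text, pattern, context=0):
    """Return lines matching `pattern` plus `context` lines on each side."""
    lines = text.splitlines()
    regex = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    # Overlapping windows collapse because indices are kept in a set.
    return [lines[i] for i in sorted(keep)]

log = "compiling\nlinking\nERROR: missing symbol\ncleanup\ndone"
matches = grep_with_context(log, r"ERROR", context=1)
```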

diff

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| run1 | int | required | Baseline run serial |
| run2 | int | required | Comparison run serial |

Returns: {summary: {run1_errors, run2_errors, fixed, new, unchanged}, fixed: [...], new: [...]}

commands

No parameters. Returns {commands: [{name, cmd, description, ...}]}.

register_command

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | required | Command name |
| cmd | str \| None | None | Command string |
| tpl | str \| None | None | Template with {param} placeholders |
| defaults | dict[str,str] \| None | None | Default template parameter values |
| description | str | "" | Description |
| timeout | int \| None | None | Timeout in seconds |
| capture | bool | True | Capture and parse output |
| force | bool | False | Overwrite existing |
| format | str \| None | None | Log format hint |
| run_now | bool | False | Run immediately after registering |
| lines | str \| None | None | Default output line selection |
| sandbox | str \| dict \| None | None | Sandbox preset or spec dict |
| lock | str \| None | None | Lock name for concurrency control |

Returns: {success, command, run?}

unregister_command

| Parameter | Type | Default |
| --- | --- | --- |
| name | str | required |

Returns: {success: bool}

clean

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mode | str | "data" | 'data', 'prune', 'schema', 'full' |
| confirm | bool | False | Must be True to proceed |
| days | int \| None | None | Prune: remove older than N days |
| max_runs | int \| None | None | Prune: keep N per source |
| max_size_mb | int \| None | None | Prune: cap total output size |

Returns: {success, message, mode}

report

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ref | str \| None | None | Run ref (default: latest) |
| baseline | str \| None | None | Baseline run, branch, or commit |
| warnings | bool | False | Include warnings |
| summary_only | bool | False | Omit individual error details |
| error_limit | int | 20 | Max errors in details |
| file_limit | int | 10 | Max files in breakdown |

Returns: {report: "markdown...", run_id, total_errors, total_warnings, has_baseline}

ci_check

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| baseline | str \| None | None | Baseline (auto-detects main/master) |
| fail_on_any | bool | False | Fail on any errors |
| run_id | int \| None | None | Run to check (default: auto-detect) |

Returns: {status: 'OK'|'FAIL', current_run_id, new_errors?, fixed?}

ci_generate

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| commands | list[str] \| None | None | Commands to generate (default: all) |
| shell | str | "bash" | 'bash', 'sh', 'zsh' |

Returns: {scripts: [{name, content, ...}]}

sandbox_info

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| command | str \| None | None | Command name (omit for all) |

Returns: JSON with sandbox spec, grades, and resource metrics.


MCP Resources

| URI | Type | Description |
| --- | --- | --- |
| blq://status | JSON | Current per-source status |
| blq://runs | JSON | Run history (up to 100) |
| blq://events | JSON | Recent error events |
| blq://event/{ref} | JSON | Single event details |
| blq://errors | JSON | Recent errors (up to 50) |
| blq://errors/{run_serial} | JSON | Errors for a specific run |
| blq://warnings | JSON | Recent warnings (up to 50) |
| blq://warnings/{run_serial} | JSON | Warnings for a specific run |
| blq://context/{ref} | JSON | Log context around an event |
| blq://commands | JSON | Registered commands |
| blq://guide | Markdown | Agent usage guide |