Developer documentation for psdl-lang v0.4
pip install psdl-lang # Core package
pip install psdl-lang[omop] # With OMOP adapter
pip install psdl-lang[fhir] # With FHIR adapter
pip install psdl-lang[full] # All adapters

from psdl import (
    # Parsing
    PSDLParser, PSDLScenario, parse_scenario,
    # IR types
    Signal, SignalGroup, ClinicalDomain, TrendExpr, LogicExpr,
    # Compilation (v0.3)
    compile_scenario, ScenarioCompiler, ScenarioIR,
    # Dataset Spec (RFC-0004)
    load_dataset_spec, DatasetSpec, Binding,
    # Evaluation
    SinglePatientEvaluator, InMemoryBackend,
    # AST types
    parse_trend_expression, parse_logic_expression,
    extract_operators, extract_terms,
    # Adapters
    get_omop_adapter, get_fhir_adapter,
    # Examples
    examples,
)

Parse PSDL YAML scenarios into PSDLScenario objects.
from psdl import PSDLParser
parser = PSDLParser()
# Parse from file
scenario = parser.parse_file("scenario.yaml")
# Parse from string (loose mode — default)
yaml_content = """
scenario: MyScenario
version: "1.0"
signals:
  HR:
    ref: heart_rate
logic:
  alert: { when: HR > 120 }
"""
scenario = parser.parse_string(yaml_content)
# Strict mode — also validate against spec/schema.json before parsing
scenario = parser.parse_string(yaml_content, strict=True)
scenario = parser.parse_file("scenario.yaml", strict=True)

Methods:

| Method | Returns | Description |
|---|---|---|
| parse_string(yaml, source="<string>", strict=False) | PSDLScenario | Parse a YAML string. With strict=True, validates against spec/schema.json first; schema violations surface as PSDLParseError. |
| parse_file(path, strict=False) | PSDLScenario | Parse a YAML file. Same strict= semantics. |

Top-level scenario shape: the parser accepts both forms.
# Flat form (legacy, still supported)
scenario: MyScenario
version: "1.0"
# Schema form (used by all bundled examples)
scenario:
  name: MyScenario
  version: "1.0"
  description: "Optional"
  tags: ["aki", "nephrology"]

Convenience function:
from psdl import parse_scenario
scenario = parse_scenario("scenario.yaml") # loose
scenario = parse_scenario(yaml_content, strict=True) # strict

Parsed scenario representation.
scenario = parser.parse_file("scenario.yaml")
print(scenario.name) # Scenario name
print(scenario.version) # Version string
print(scenario.signals) # Dict[str, Signal]
print(scenario.trends) # Dict[str, TrendExpr]
print(scenario.logic) # Dict[str, LogicExpr]
print(scenario.population) # Optional population criteria
print(scenario.audit) # Audit metadata (intent, rationale, provenance)

Attributes:

| Attribute | Type | Description |
|---|---|---|
| name | str | Scenario name |
| version | str | Scenario version |
| signals | Dict[str, Signal] | Signal definitions |
| trends | Dict[str, TrendExpr] | Trend definitions |
| logic | Dict[str, LogicExpr] | Logic rules |
| signal_groups | Dict[str, SignalGroup] | Bulk data-request groups and custom panels (RFC-0009). Defaults to empty. |
| population | Optional[PopulationCriteria] | Population filters |
| audit | Optional[AuditBlock] | Audit metadata |

Optional, per-scenario named collections of signals — either a domain-level bulk request or an author-defined custom panel. Groups are a data-extraction declaration only: they are consumed by the Dataset Spec layer to fulfill bulk data requests and have zero interaction with trends or logic.
signal_groups:
  # Domain-level: pull every laboratory concept for the cohort
  all_labs:
    domain: laboratory
    description: "All lab results for cohort patients"
  # Custom panel: author-defined subset of declared signals
  renal_panel:
    members: [creatinine, hemoglobin, dialysis_active]
    description: "Renal function monitoring panel"

from psdl import SignalGroup, ClinicalDomain
g = SignalGroup(
    name="all_labs",
    description="All lab results",
    domain=ClinicalDomain.LABORATORY,
)

Validation rules (Phase 1):
- description is required.
- domain and members are mutually exclusive; exactly one must be set.
- members entries must reference signals declared in the same scenario; unknown references raise PSDLParseError at parse time.
- domain must be a valid ClinicalDomain enum value.
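
The four rules above can be sketched as a standalone check. This is a minimal illustration over plain dictionaries, not the psdl-lang parser: `check_group` and `VALID_DOMAINS` are hypothetical names, the lowercase domain strings are an assumption based on the `domain: laboratory` example, and the real parser raises PSDLParseError instead of returning messages.

```python
from typing import Dict, List, Set

# Assumed lowercase spellings of the ClinicalDomain members named in this document.
VALID_DOMAINS: Set[str] = {
    "laboratory", "vital_sign", "condition", "medication",
    "procedure", "observation", "demographic",
}


def check_group(name: str, group: Dict, declared_signals: Set[str]) -> List[str]:
    """Return the rule violations for one signal group (empty list if valid)."""
    errors: List[str] = []
    if not group.get("description"):
        errors.append(f"{name}: description is required")
    has_domain = "domain" in group
    has_members = "members" in group
    # Exactly one of domain/members: both set or both missing is an error.
    if has_domain == has_members:
        errors.append(f"{name}: set exactly one of domain or members")
    if has_domain and group["domain"] not in VALID_DOMAINS:
        errors.append(f"{name}: invalid domain {group['domain']!r}")
    if has_members:
        unknown = sorted(set(group["members"]) - declared_signals)
        if unknown:
            errors.append(f"{name}: unknown signals {unknown}")
    return errors


signals = {"creatinine", "hemoglobin"}
ok = check_group(
    "all_labs", {"domain": "laboratory", "description": "All labs"}, signals
)
bad = check_group(
    "renal_panel",
    {"members": ["creatinine", "dialysis_active"], "description": "Renal"},
    signals,
)
```

Here `ok` is empty, while `bad` reports that `dialysis_active` is not a declared signal.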
Compile a scenario to IR with cryptographic hashes for audit trails.
from psdl import compile_scenario
# From file path
ir = compile_scenario("scenario.yaml")
# From PSDLScenario object
ir = compile_scenario(scenario)
# Access hashes
print(ir.spec_hash) # SHA-256 of input YAML
print(ir.ir_hash) # SHA-256 of compiled IR
print(ir.toolchain_hash) # SHA-256 of compiler version

Parameters:

| Parameter | Type | Description |
|---|---|---|
| source | str \| PSDLScenario | File path or parsed scenario |

Returns: ScenarioIR
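
This document only states that the hashes are SHA-256 digests; how each input is serialized before hashing is not described. As a rough sketch of the general idea, the snippet below hashes raw specification text and a canonical JSON encoding of a toy IR dictionary using the standard library. The canonicalization (sorted keys, compact separators) and the names `sha256_hex` and `canonical_json` are assumptions for illustration, not psdl-lang's actual scheme.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def canonical_json(obj) -> str:
    """Serialize with sorted keys so equal dicts always hash identically."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


# Hash of the input specification text (assumed: raw UTF-8 bytes).
spec_text = 'scenario: MyScenario\nversion: "1.0"\n'
spec_hash = sha256_hex(spec_text.encode("utf-8"))

# Hash of a toy compiled representation (assumed: canonical JSON).
toy_ir = {"signals": {"HR": {"ref": "heart_rate"}}, "logic": {"alert": "HR > 120"}}
ir_hash = sha256_hex(canonical_json(toy_ir).encode("utf-8"))

# Any edit to the spec text changes its hash, which is what makes it auditable.
edited_hash = sha256_hex(spec_text.replace("1.0", "1.1").encode("utf-8"))
```

Sorting keys before hashing matters because two dictionaries with the same content but different insertion order would otherwise serialize, and therefore hash, differently.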
Compiled intermediate representation with DAG ordering and hashes.
ir = compile_scenario("scenario.yaml")
# Metadata
print(ir.scenario_name)
print(ir.scenario_version)
print(ir.psdl_version)
print(ir.compiled_at) # datetime
# Hashes (for audit)
print(ir.spec_hash)
print(ir.ir_hash)
print(ir.toolchain_hash)
# DAG ordering
print(ir.dag.evaluation_order) # Ordered list of (type, name)
# Export for audit
artifact = ir.to_artifact() # Dict for JSON serialization

Key Attributes:

| Attribute | Type | Description |
|---|---|---|
| spec_hash | str | SHA-256 of input specification |
| ir_hash | str | SHA-256 of compiled IR |
| toolchain_hash | str | SHA-256 of compiler version |
| dag | DependencyDAG | Evaluation order |
| diagnostics | CompilationDiagnostics | Warnings/errors |

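
To make the idea of a DAG-derived evaluation order concrete, here is a small sketch using the standard library's `graphlib` (Python 3.9+). It is not the DependencyDAG implementation; the node names and the `has_cycle` helper are hypothetical, and the (type, name) tuples merely mirror the shape described for `evaluation_order`.

```python
from graphlib import CycleError, TopologicalSorter

# Each node maps to the set of nodes it depends on.
dependencies = {
    ("signal", "Cr"): set(),
    ("trend", "cr_delta"): {("signal", "Cr")},
    ("logic", "cr_rising"): {("trend", "cr_delta")},
    ("logic", "aki_risk"): {("logic", "cr_rising")},
}

# static_order() yields every node after all of its dependencies.
evaluation_order = list(TopologicalSorter(dependencies).static_order())


def has_cycle(graph) -> bool:
    """True if the dependency graph cannot be ordered (circular reference)."""
    try:
        list(TopologicalSorter(graph).static_order())
    except CycleError:
        return True
    return False


circular = has_cycle({"a": {"b"}, "b": {"a"}})
```

A cycle check of this kind corresponds to what the circular-dependency diagnostics report at compile time.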
Low-level compiler with fine-grained control.
from psdl import ScenarioCompiler, PSDLParser
parser = PSDLParser()
scenario = parser.parse_file("scenario.yaml")
compiler = ScenarioCompiler()
ir = compiler.compile(scenario)
# Access diagnostics
for warning in ir.diagnostics.warnings:
    print(f"Warning: {warning.message}")

Each diagnostic carries a stable code (see psdl.core.compile.DiagnosticCode) so callers can filter or escalate selectively.

| Code | Severity | Meaning |
|---|---|---|
| S100 | error | Signal not found |
| S101 | error | Duplicate signal name |
| S102 | error | Invalid signal ref |
| T100 | error | Trend expression invalid |
| T101 | error | Trend references unknown signal |
| T102 | error | Trend uses unknown operator |
| T103 | error | Trend has invalid window |
| T104 | error | Trend contains comparison (v0.3+ rejects this; comparisons belong in logic) |
| L100 | error | Logic expression invalid |
| L101 | error | Logic references unknown trend or logic term |
| L102 | error | Logic has circular reference |
| L103 | error | Logic type mismatch |
| D100 | error | DAG circular dependency |
| D101 | error | DAG unreachable node |
| W100 | warning | Signal defined but never used |
| W101 | warning | Trend defined but never used in logic |
| W102 | warning | Deprecated syntax |
| W103 | warning | Performance hint |
| W104 | warning | Transitively unused signal: referenced by a trend, but every trend that references it is itself unused (RFC #7) |

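
The difference between W100, W101, and W104 is easiest to see on a toy dependency map. The sketch below is an illustration only: `analyze_usage` is a hypothetical helper, the inputs are plain dictionaries rather than compiler structures, and it assumes a signal counts as used when a live trend or any logic rule references it.

```python
from typing import Dict, Set, Tuple


def analyze_usage(
    signals: Set[str],
    trend_deps: Dict[str, Set[str]],
    logic_deps: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (unused_signals, unused_trends, transitively_unused_signals)."""
    logic_refs = set().union(*logic_deps.values())
    trend_refs = set().union(*trend_deps.values())
    # W101: trends that no logic rule references.
    unused_trends = set(trend_deps) - logic_refs
    # W100: signals that nothing references at all.
    unused_signals = signals - trend_refs - logic_refs
    # Names reachable through live trends or directly from logic.
    live_refs = logic_refs.union(
        *(deps for trend, deps in trend_deps.items() if trend not in unused_trends)
    )
    # W104: referenced by some trend, but only by unused ones.
    transitively_unused = (trend_refs & signals) - live_refs
    return unused_signals, unused_trends, transitively_unused


unused_signals, unused_trends, transitively_unused = analyze_usage(
    signals={"Cr", "HR", "K"},
    trend_deps={"cr_delta": {"Cr"}, "hr_slope": {"HR"}},
    logic_deps={"aki_risk": {"cr_delta"}},
)
```

In this example `K` is never referenced (W100), `hr_slope` is never used in logic (W101), and `HR` is referenced only by that unused trend (W104).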
The DependencyAnalysis attached to each ScenarioIR carries unused_signals, unused_trends, and transitively_unused_signals for programmatic access:
ir = compile_scenario("scenario.yaml")
dep = ir.compilation.dependency_analysis
dead_in_eval_graph = dep.transitively_unused_signals # set[str]

Evaluate a scenario against single patient data.
from psdl import SinglePatientEvaluator, InMemoryBackend, compile_scenario
from datetime import datetime, timedelta
# Setup backend with data
backend = InMemoryBackend()
now = datetime.now()
backend.add_observation(123, "Cr", 1.0, now - timedelta(hours=6))
backend.add_observation(123, "Cr", 1.5, now)
# From compiled IR (recommended)
ir = compile_scenario("scenario.yaml")
evaluator = SinglePatientEvaluator.from_ir(ir, backend)
# Or from scenario directly
evaluator = SinglePatientEvaluator(scenario, backend)
# Evaluate
result = evaluator.evaluate(patient_id=123, reference_time=now)
print(result.is_triggered) # bool
print(result.triggered_logic) # List[str]
print(result.trend_values) # Dict[str, float]
print(result.logic_values) # Dict[str, bool]

Methods:

| Method | Returns | Description |
|---|---|---|
| evaluate(patient_id, reference_time) | EvaluationResult | Evaluate scenario |
| from_ir(ir, backend) | SinglePatientEvaluator | Create from compiled IR |

In-memory data backend for testing and development.
from psdl import InMemoryBackend, DataPoint
from datetime import datetime
backend = InMemoryBackend()
# Add single observation
backend.add_observation(
    patient_id=123,
    signal_name="Cr",
    value=1.5,
    timestamp=datetime.now(),
)
# Add multiple data points
backend.add_patient_data(123, {
    "Cr": [
        DataPoint(timestamp=datetime(2024, 1, 1, 10, 0), value=1.0),
        DataPoint(timestamp=datetime(2024, 1, 1, 16, 0), value=1.5),
    ],
    "HR": [
        DataPoint(timestamp=datetime(2024, 1, 1, 10, 0), value=72),
    ],
})
# Query data
data = backend.get_signal_data(123, "Cr", window_seconds=3600)

Methods:

| Method | Description |
|---|---|
| add_observation(patient_id, signal_name, value, timestamp) | Add single data point |
| add_patient_data(patient_id, data_dict) | Add multiple signals |
| get_signal_data(patient_id, signal_name, window_seconds) | Query data |

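
For readers who want to see what a windowed query over in-memory observations might involve, the following is a minimal standalone sketch. It is not InMemoryBackend: `TinyBackend`, `Point`, and `delta` are hypothetical, the explicit `reference_time` argument is added here so the example is deterministic, and treating delta as "last value minus first value in the window" is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Point:
    timestamp: datetime
    value: float


class TinyBackend:
    """Stores observations per (patient_id, signal_name) pair."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[int, str], List[Point]] = defaultdict(list)

    def add_observation(self, patient_id, signal_name, value, timestamp) -> None:
        self._data[(patient_id, signal_name)].append(Point(timestamp, value))

    def get_signal_data(self, patient_id, signal_name, window_seconds, reference_time):
        """Points within [reference_time - window, reference_time], oldest first."""
        start = reference_time - timedelta(seconds=window_seconds)
        points = self._data.get((patient_id, signal_name), [])
        in_window = [p for p in points if start <= p.timestamp <= reference_time]
        return sorted(in_window, key=lambda p: p.timestamp)


def delta(points: List[Point]) -> Optional[float]:
    """Last value minus first value, or None with fewer than two points."""
    if len(points) < 2:
        return None
    return points[-1].value - points[0].value


t0 = datetime(2024, 1, 1, 10, 0)
t1 = t0 + timedelta(hours=6)
tiny = TinyBackend()
tiny.add_observation(123, "Cr", 1.0, t0)
tiny.add_observation(123, "Cr", 1.5, t1)

last_hour = tiny.get_signal_data(123, "Cr", 3600, reference_time=t1)
last_six_hours = tiny.get_signal_data(123, "Cr", 6 * 3600, reference_time=t1)
```

With a one-hour window only the latest point is visible, so no delta can be computed; widening the window to six hours brings both points into view.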
Result of scenario evaluation.
result = evaluator.evaluate(patient_id=123)
# Basic results
print(result.is_triggered) # Any logic rule triggered?
print(result.triggered_logic) # List of triggered rule names
print(result.highest_severity) # Highest severity level
# Detailed values
print(result.trend_values) # {"cr_delta": 0.5, ...}
print(result.logic_values) # {"aki_risk": True, ...}
# v0.3: Standardized output
standard = result.to_standard_result()

Parse a trend expression into AST.
from psdl import parse_trend_expression
ast = parse_trend_expression("delta(Cr, 6h)")
print(ast.temporal.operator) # "delta"
print(ast.temporal.signal) # "Cr"
print(ast.temporal.window) # WindowSpec(value=6, unit='h')

Parse a logic expression into AST.
from psdl import parse_logic_expression
ast = parse_logic_expression("cr_rising AND cr_high")
# Returns AndExpr with operands
ast = parse_logic_expression("cr_delta > 0.3")
# Returns ComparisonExpr

Extract operator calls from an expression string.
from psdl import extract_operators
ops = extract_operators("delta(Cr, 6h) > 0.3")
# [TemporalCall(operator='delta', signal='Cr', window=WindowSpec(6, 'h'))]

Extract term references from a logic expression.
from psdl import extract_terms
terms = extract_terms("cr_rising AND bp_low OR shock_index")
# ['cr_rising', 'bp_low', 'shock_index']

from psdl import (
    # Trend expressions
    TrendExpression, # Wrapper for temporal call
    TemporalCall, # delta(Cr, 6h)
    WindowSpec, # 6h, 30m, etc.
    # Logic expressions
    LogicNode, # Union type for all logic nodes
    TermRef, # Reference to a trend/logic name
    ComparisonExpr, # cr_delta > 0.3
    AndExpr, # a AND b AND c
    OrExpr, # a OR b
    NotExpr, # NOT a
)

TemporalCall:
@dataclass
class TemporalCall:
    operator: str # delta, slope, ema, sma, min, max, count, last
    signal: str # Signal name
    window: Optional[WindowSpec]
    percentile: Optional[int] # For percentile operator

WindowSpec:
@dataclass
class WindowSpec:
    value: int # Numeric value
    unit: str # 's', 'm', 'h', 'd', 'w'

    @property
    def seconds(self) -> int:
        """Window duration in seconds"""

from psdl import SinglePatientEvaluator, get_omop_adapter
OMOPAdapter = get_omop_adapter()
adapter = OMOPAdapter(
    connection_string="postgresql://user:pass@host/db",
    cdm_schema="cdm",
    vocab_schema="vocab",
)
# Use with evaluator
evaluator = SinglePatientEvaluator(scenario, adapter)

from psdl import SinglePatientEvaluator, get_fhir_adapter
FHIRAdapter = get_fhir_adapter()
adapter = FHIRAdapter(
    base_url="http://hapi.fhir.org/baseR4",
    auth_token="optional_bearer_token",
)
# Use with evaluator
evaluator = SinglePatientEvaluator(scenario, adapter)

The Dataset Specification provides a portable binding layer that maps semantic signal references to physical data locations. This enables the same PSDL scenario to run across different datasets.
Scenario (WHAT)  →  DatasetSpec (WHERE)    →  Adapter (HOW)
      |                    |                       |
      v                    v                       v
 "creatinine"       table: measurement         SQL query
                    filter: concept_id=X       execution
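
The WHAT → WHERE → HOW split in the diagram can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the DatasetSpec or adapter code: `resolve` and `build_query` are hypothetical helpers, the dictionary layout loosely follows the YAML example in this document, and the `?` placeholder style is just one common choice for parameterized SQL.

```python
from typing import Dict, List, Tuple

# WHERE: a toy dataset spec (field names follow the YAML example in this document).
CONVENTIONS = {
    "patient_id_field": "person_id",
    "default_time_field": "measurement_datetime",
    "schema": "cdm",
}
ELEMENTS = {
    "creatinine": {
        "table": "measurement",
        "value_field": "value_as_number",
        "filter": {"concept_id": 3016723},
    },
}


def resolve(signal_ref: str) -> Dict:
    """Map a semantic reference to a physical binding (KeyError if unknown)."""
    element = ELEMENTS[signal_ref]
    return {
        "table": f'{CONVENTIONS["schema"]}.{element["table"]}',
        "value_field": element["value_field"],
        "time_field": CONVENTIONS["default_time_field"],
        "patient_field": CONVENTIONS["patient_id_field"],
        "filter": dict(element["filter"]),
    }


def build_query(binding: Dict, patient_id: int) -> Tuple[str, List]:
    """HOW: render a parameterized SELECT for one patient from a binding."""
    conditions = [f'{binding["patient_field"]} = ?']
    params: List = [patient_id]
    for column, value in binding["filter"].items():
        conditions.append(f"{column} = ?")
        params.append(value)
    sql = (
        f'SELECT {binding["time_field"]}, {binding["value_field"]} '
        f'FROM {binding["table"]} WHERE ' + " AND ".join(conditions)
    )
    return sql, params


# WHAT: the scenario only names "creatinine"; the rest comes from the spec.
sql, params = build_query(resolve("creatinine"), patient_id=123)
```

Because the scenario only carries the semantic name, swapping in a different `ELEMENTS` mapping retargets the same scenario to another dataset without touching the scenario itself.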
Load and validate a Dataset Specification from YAML.
from psdl import load_dataset_spec
# Load a dataset spec (mandatory validation)
spec = load_dataset_spec("dataset_specs/mimic_iv_omop.yaml")
# Resolve a signal reference to physical binding
binding = spec.resolve("creatinine")
print(binding.table) # "measurement"
print(binding.filter_expr) # "concept_id = 3016723"

Parameters:

| Parameter | Type | Description |
|---|---|---|
| path | str \| Path | Path to dataset spec YAML file |

Returns: DatasetSpec
Raises:

| Exception | When |
|---|---|
| FileNotFoundError | File doesn't exist |
| DatasetValidationError | Spec fails schema validation |

Important: Always use load_dataset_spec() to load specs. Direct construction bypasses validation and will raise DatasetSpecError when using resolve().
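
One way such a "validated loader only" rule can be enforced is a flag that the loader sets and that resolution checks. The sketch below shows that pattern in isolation; `TinySpec`, `SpecError`, `load_tiny_spec`, and `can_resolve` are hypothetical names, and nothing here is claimed to match how psdl-lang implements the check.

```python
class SpecError(Exception):
    """Hypothetical stand-in for a dataset spec error."""


class TinySpec:
    def __init__(self, elements, validated=False):
        self.elements = elements
        self.is_validated = validated

    def resolve(self, signal_ref):
        # Refuse to resolve anything on a spec that skipped validation.
        if not self.is_validated:
            raise SpecError("spec was not loaded through the validating loader")
        return self.elements[signal_ref]


def load_tiny_spec(raw):
    """Validate first, then construct with the validated flag set."""
    elements = raw.get("elements", {})
    for name, element in elements.items():
        missing = {"table", "value_field"} - set(element)
        if missing:
            raise SpecError(f"{name}: missing {sorted(missing)}")
    return TinySpec(elements, validated=True)


def can_resolve(spec, signal_ref) -> bool:
    """True if resolve() succeeds, False if the spec rejects the call."""
    try:
        spec.resolve(signal_ref)
    except SpecError:
        return False
    return True


raw = {"elements": {"creatinine": {"table": "measurement", "value_field": "value_as_number"}}}
loaded = load_tiny_spec(raw)
direct = TinySpec(raw["elements"])  # bypasses validation
```

Here `can_resolve(loaded, "creatinine")` succeeds, while the same call on `direct` is rejected because the flag was never set.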
Dataset specification with semantic-to-physical bindings.
spec = load_dataset_spec("dataset_specs/omop_cdm_v54.yaml")
# Metadata
print(spec.name) # "OMOP CDM v5.4"
print(spec.version) # "1.0.0"
print(spec.data_model) # "omop"
print(spec.psdl_version) # "0.3"
# Validation status
print(spec.is_validated) # True (if loaded via load_dataset_spec)
# Audit info
print(spec.checksum) # SHA-256 of source file
print(spec.source_path) # Path to YAML file
# List available elements
print(spec.list_elements())
# ['creatinine', 'potassium', 'heart_rate', ...]
# List by kind
print(spec.list_elements_by_kind("lab"))
# ['creatinine', 'potassium', 'bun', ...]

Methods:

| Method | Returns | Description |
|---|---|---|
| resolve(signal_ref) | Binding | Resolve semantic ref to physical binding |
| list_elements() | List[str] | List all element names |
| list_elements_by_kind(kind) | List[str] | List elements of specific kind |
| get_valueset(name) | ValuesetSpec \| None | Get valueset by name |
| to_dict() | Dict | Convert to dictionary |

Properties:

| Property | Type | Description |
|---|---|---|
| is_validated | bool | Whether loaded via load_dataset_spec() |
| checksum | str \| None | SHA-256 of source YAML |
| source_path | Path \| None | Path to source file |

Resolved physical binding - the contract between DatasetSpec and adapters.
binding = spec.resolve("creatinine")
print(binding.table) # "measurement"
print(binding.value_field) # "value_as_number"
print(binding.time_field) # "measurement_datetime"
print(binding.patient_field) # "person_id"
print(binding.filter_predicates) # FilterPredicateSet (v0.4, preferred)
print(binding.filter_expr) # "concept_id = 3016723" (deprecated, use filter_predicates)
print(binding.unit) # "mg/dL"
print(binding.value_type) # "numeric"

Attributes:

| Attribute | Type | Description |
|---|---|---|
| table | str | Table name (with schema if configured) |
| value_field | str | Column containing the value |
| time_field | str | Column containing timestamp |
| patient_field | str | Column containing patient ID |
| filter_predicates | FilterPredicateSet | Structured filter predicates (v0.4) |
| filter_expr | str | SQL-like filter expression (deprecated in v0.4, use filter_predicates) |
| unit | str \| None | Expected unit |
| value_type | str | Value type (numeric, string, etc.) |
| transform | str \| None | Optional transform expression |

from psdl import load_dataset_spec
from psdl.adapters.omop import OMOPBackend, OMOPConfig
# Load dataset spec
spec = load_dataset_spec("dataset_specs/mimic_iv_omop.yaml")
# Configure OMOP backend with dataset spec
config = OMOPConfig(
    connection_string="postgresql://user:pass@host/db",
    cdm_schema="cdm",
)
backend = OMOPBackend(config, dataset_spec=spec)
# Signals in scenarios are resolved through the spec
# scenario.yaml: signals.Cr.ref = "creatinine"
# → spec resolves "creatinine" to measurement table with concept_id filter

psdl_version: "0.3"
dataset:
  name: "My Hospital OMOP"
  version: "1.0.0"
  description: "OMOP CDM 5.4 with local mappings"
data_model: omop
conventions:
  patient_id_field: person_id
  default_time_field: measurement_datetime
  schema: cdm
  unit_strategy: strict
elements:
  creatinine:
    table: measurement
    value_field: value_as_number
    filter:
      concept_id: 3016723
    unit: mg/dL
    kind: lab
  heart_rate:
    table: measurement
    value_field: value_as_number
    filter:
      concept_id: 3027018
    unit: beats/min
    kind: vital

| Type | Description |
|---|---|
| DatasetSpec | Main specification class |
| Binding | Resolved physical binding |
| ElementSpec | Element definition |
| FilterSpec | Filter criteria |
| Conventions | Global conventions |
| ValuesetSpec | Valueset definition |
| DatasetSpecError | Base exception |
| DatasetValidationError | Validation failure |
| BindingResolutionError | Unknown signal reference |

from psdl import examples
# List available scenarios
print(examples.list_scenarios())
# ['aki_detection', 'sepsis_screening', 'hyperkalemia_detection', ...]
# Load a scenario
scenario = examples.get_scenario("aki_detection")
# Get scenario file path
path = examples.get_scenario_path("aki_detection")

| Type | Description |
|---|---|
| Signal | Signal definition (ref, clinical_domain, unit) |
| TrendExpr | Trend expression with metadata |
| LogicExpr | Logic rule with severity |
| DataPoint | Single observation (timestamp, value) |
| EvaluationResult | Evaluation output |

| Type | Description |
|---|---|
| ClinicalDomain | Vendor-neutral clinical domain enum (LABORATORY, VITAL_SIGN, CONDITION, MEDICATION, PROCEDURE, OBSERVATION, DEMOGRAPHIC) |
| FilterPredicate | Single structured filter predicate (field, operator, value) |
| FilterPredicateSet | Ordered set of predicates (AND semantics) |
| BatchRuntime | Abstract base for batch scenario execution |
| SQLBatchRuntime | SQL-specific batch runtime with dialect rendering |

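
The table describes a filter predicate as a (field, operator, value) triple and a predicate set as an ordered collection joined with AND. A minimal sketch of rendering such a set into a parameterized WHERE clause follows; `Predicate`, `ALLOWED_OPERATORS`, and `render_where` are hypothetical names, and the supported operator list is an assumption rather than the library's.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence, Tuple

# Assumed operator whitelist; anything else is rejected instead of interpolated.
ALLOWED_OPERATORS = {"=", "!=", "<", "<=", ">", ">=", "IN"}


@dataclass(frozen=True)
class Predicate:
    field: str
    operator: str
    value: Any


def render_where(predicates: Sequence[Predicate]) -> Tuple[str, List[Any]]:
    """Join predicates with AND, keeping values out of the SQL text."""
    clauses: List[str] = []
    params: List[Any] = []
    for p in predicates:
        if p.operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {p.operator!r}")
        if p.operator == "IN":
            values = list(p.value)
            placeholders = ", ".join("?" for _ in values)
            clauses.append(f"{p.field} IN ({placeholders})")
            params.extend(values)
        else:
            clauses.append(f"{p.field} {p.operator} ?")
            params.append(p.value)
    return " AND ".join(clauses), params


where, where_params = render_where([
    Predicate("concept_id", "=", 3016723),
    Predicate("unit_concept_id", "IN", (1, 2)),
])
```

Keeping predicates structured, rather than as a free-form string like `filter_expr`, is what allows values to be passed as parameters and the clause to be rendered differently per SQL dialect.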
| Type | Description |
|---|---|
| WindowSpec | Time window (value + unit) |
| TemporalCall | Operator call (delta, slope, etc.) |
| TrendExpression | Numeric trend expression |
| ComparisonExpr | Comparison (>, <, ==, etc.) |
| AndExpr | Logical AND |
| OrExpr | Logical OR |
| NotExpr | Logical NOT |
| TermRef | Reference to named term |
| LogicNode | Union of all logic AST types |

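
A time window is a value plus one of the units `s`, `m`, `h`, `d`, `w`. The sketch below shows how a string such as `6h` could be parsed and converted to seconds. It is a standalone illustration: `Window`, `parse_window`, and the regular expression are hypothetical and do not reproduce the library's parser.

```python
import re
from dataclasses import dataclass

# Seconds per unit for the five units listed for WindowSpec.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

_WINDOW_RE = re.compile(r"^(\d+)([smhdw])$")


@dataclass(frozen=True)
class Window:
    value: int
    unit: str

    @property
    def seconds(self) -> int:
        """Window duration in seconds."""
        return self.value * UNIT_SECONDS[self.unit]


def parse_window(text: str) -> Window:
    """Parse strings like '6h' or '30m'; raise ValueError on anything else."""
    match = _WINDOW_RE.match(text.strip())
    if match is None:
        raise ValueError(f"invalid window: {text!r}")
    return Window(value=int(match.group(1)), unit=match.group(2))


six_hours = parse_window("6h")
half_hour = parse_window("30m")
```

The `seconds` value is what a backend query such as `get_signal_data(..., window_seconds=...)` would ultimately need.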
| Type | Description |
|---|---|
| ScenarioIR | Compiled intermediate representation |
| DependencyDAG | Evaluation order graph |
| CompilationDiagnostics | Warnings and errors |

| Type | Description |
|---|---|
| DatasetSpec | Main specification class |
| Binding | Resolved physical binding |
| ElementSpec | Element definition with table/field info |
| FilterSpec | Filter criteria (concept_id, source_value, etc.) |
| Conventions | Global conventions (patient_id_field, schema, etc.) |
| ValuesetSpec | Valueset definition (inline or file reference) |
| DatasetSpecError | Base exception for dataset errors |
| DatasetValidationError | Schema validation failure |
| BindingResolutionError | Unknown signal reference |

| Version | Changes |
|---|---|
| 0.4.0 | RFC-0008 Vendor-Neutral Foundation: ClinicalDomain, FilterPredicate/FilterPredicateSet, BatchRuntime/SQLBatchRuntime, concept_id deprecated, DataBackend lifecycle |
| 0.3.1 | Dataset Spec API (RFC-0004), mandatory validation |
| 0.3.0 | v0.3 architecture, compile_scenario(), AST exposure |
| 0.2.0 | Streaming support, FHIR adapter |
| 0.1.0 | Initial release |
Last updated: March 9, 2026