StochStack

prototype 09

Data Agent

A natural-language data catalog for clinical development: what data exists, where it lives, who controls access, and what schema it has.

Ask the data agent

Match confidence: 0%

Matched datasets

Global Claims Longitudinal

Source: external · Claims

Oncology · Cardiovascular · Immunology · CNS

Owner: RWE Strategy

Access owner: Data Governance Office

Storage: lakehouse table

Refresh: monthly

Granularity: patient-level (de-identified)

Schema fields

patient_token (string) - Tokenized patient key
diagnosis_code (string) - ICD-10 diagnosis code
procedure_code (string) - CPT/HCPCS procedure code
service_date (date) - Date of claim service
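As an illustration only, a catalog entry like the one above could be held in a small typed record. This is a minimal sketch, not the prototype's actual data model: the class names `SchemaField` and `CatalogEntry`, the attribute names, and the helper `field_names` are all assumptions; only the values are taken from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaField:
    """One row of a 'Schema fields' list (hypothetical structure)."""
    name: str
    dtype: str
    description: str


@dataclass(frozen=True)
class CatalogEntry:
    """One dataset card; attribute names are assumed, values come from the page."""
    name: str
    owner: str
    access_owner: str
    storage: str
    refresh: str
    granularity: str
    schema_fields: tuple = ()


claims = CatalogEntry(
    name="Global Claims Longitudinal",
    owner="RWE Strategy",
    access_owner="Data Governance Office",
    storage="lakehouse table",
    refresh="monthly",
    granularity="patient-level (de-identified)",
    schema_fields=(
        SchemaField("patient_token", "string", "Tokenized patient key"),
        SchemaField("diagnosis_code", "string", "ICD-10 diagnosis code"),
        SchemaField("procedure_code", "string", "CPT/HCPCS procedure code"),
        SchemaField("service_date", "date", "Date of claim service"),
    ),
)


def field_names(entry: CatalogEntry) -> list:
    """Return the schema field names of an entry, in listing order."""
    return [f.name for f in entry.schema_fields]
```

Frozen dataclasses are used here so that an entry cannot be changed by accident after it is loaded; a mutable record would work equally well for a sketch.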

EHR Oncology Outcomes Repository

Source: external · RWE

Oncology

Owner: Translational Medicine Data

Access owner: RWE Access Committee

Storage: secure workspace

Refresh: quarterly

Granularity: patient-level with longitudinal labs

Schema fields

patient_token (string) - Tokenized patient key
tumor_stage (string) - Clinical stage at diagnosis
biomarker_panel (json) - Molecular marker panel
treatment_line (integer) - Line of therapy index

Site Startup and Activation Ledger

Source: internal · Clinical Operations

Oncology · Immunology · CNS · Cardiovascular

Owner: Global Clinical Operations

Access owner: Clinical Ops PMO

Storage: warehouse mart

Refresh: daily

Granularity: site-study-week

Schema fields

study_id (string) - Study identifier
site_id (string) - Site identifier
startup_status (string) - Current startup milestone status
cycle_days (integer) - Days elapsed in startup cycle

Patient Screening Funnel

Source: internal · Patient

Oncology · Immunology

Owner: Study Operations Analytics

Access owner: Patient Data Privacy Board

Storage: secure mart

Refresh: weekly

Granularity: patient-screening event

Schema fields

screening_id (string) - Screening event id
screen_fail_reason (string) - Primary reason for failure
site_id (string) - Site where screening happened
age_band (string) - Age bucket

RBQM Unified Risk Signals

Source: internal · Quality

Oncology · CNS · Cardiovascular

Owner: RBQM Center of Excellence

Access owner: Quality Governance

Storage: feature store

Refresh: daily

Granularity: site-study-day

Schema fields

risk_signal_id (string) - Signal id
kri_name (string) - Key risk indicator name
kri_value (float) - Observed indicator value
severity (string) - Risk severity band

Regulatory Submission Document Graph

Source: internal · Regulatory

Oncology · Immunology · Cardiovascular

Owner: Regulatory Operations

Access owner: Regulatory Document Control

Storage: document index + graph

Refresh: daily

Granularity: document-version

Schema fields

document_id (string) - Document unique id
document_type (string) - Protocol/CSR/IB/etc
version (string) - Version number
study_id (string) - Associated study id

Access guidance

Run a query first. For each matched dataset, the agent returns the owner, the access level, and the recommended request path.

update log

Prototype Change Log

  1. 2026-03-01 · v0.1.0

    Initial Data Agent Catalog Prototype

    • Added enterprise clinical data catalog across internal/external domains.
    • Added natural-language query endpoint with dataset matching and confidence score.
    • Added access metadata response: owner, access level, storage type, and schema preview.
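
The change log mentions dataset matching with a confidence score but does not say how either is computed. The sketch below shows one simple way it could work, assuming plain token overlap between the query and each dataset's name and schema field names. The function names `tokenize` and `match`, the `CATALOG` mapping, and the scoring rule are hypothetical and are not taken from the prototype.

```python
import re

# Searchable text per dataset: name plus schema field names from the catalog.
# Only three datasets are included to keep the sketch short.
CATALOG = {
    "Global Claims Longitudinal":
        "patient_token diagnosis_code procedure_code service_date",
    "EHR Oncology Outcomes Repository":
        "patient_token tumor_stage biomarker_panel treatment_line",
    "Site Startup and Activation Ledger":
        "study_id site_id startup_status cycle_days",
}


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match(query, catalog):
    """Return (dataset, confidence %) pairs, best match first.

    Confidence is the share of query tokens found in the dataset text
    (an assumed scoring rule). An empty query matches nothing.
    """
    query_tokens = tokenize(query)
    if not query_tokens:
        return []
    results = []
    for name, fields in catalog.items():
        overlap = query_tokens & tokenize(name + " " + fields)
        if overlap:
            score = round(100 * len(overlap) / len(query_tokens))
            results.append((name, score))
    # Highest confidence first; ties broken alphabetically by name.
    return sorted(results, key=lambda r: (-r[1], r[0]))
```

For example, `match("tumor stage", CATALOG)` returns only the EHR Oncology Outcomes Repository, because it is the only entry whose fields contain both words. A production matcher would more likely use embeddings or a search index; token overlap is used here only because it is easy to read.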