Copilot API

The Copilot struct is the canonical entry point for every AI-driven workflow in pccx-lab. It wraps:

an optional reference to the currently-loaded NpuTrace
the pccx KV260 reference HardwareModel
the full TraceAnalyzer registry from pccx-core
the UvmStrategy registry from pccx_ai_copilot::uvm

Consumers construct one instance per trace load and then drive it through the methods below.

Construction

use pccx_ai_copilot::Copilot;

let c = Copilot::new(Some(&trace));
// or
let c = Copilot::new(None);   // no trace yet; investigate() returns []

Callers that want to extend the analyzer registry declare the instance as mutable (let mut c) and call c.analyzers.push(...) before the first investigate() call.

Methods

`investigate() -> Vec<AnalysisReport>`

Runs every registered analyzer against the trace. Returns reports in registry order. Empty when no trace is loaded.
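
The registry-order behaviour described above can be sketched with stand-in types. Everything in this block is hypothetical: the real NpuTrace, AnalysisReport, TraceAnalyzer, and Copilot definitions live in pccx-core and pccx-ai-copilot and may use different fields and method names, and the real constructor pre-populates the registry instead of starting empty.

```rust
// Hypothetical stand-ins; the real types may differ.
pub struct NpuTrace {
    pub total_cycles: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct AnalysisReport {
    pub analyzer_id: String,
    pub summary: String,
}

pub trait TraceAnalyzer {
    fn id(&self) -> &str;
    fn analyze(&self, trace: &NpuTrace) -> AnalysisReport;
}

pub struct Copilot<'a> {
    pub trace: Option<&'a NpuTrace>,
    pub analyzers: Vec<Box<dyn TraceAnalyzer>>,
}

impl<'a> Copilot<'a> {
    // Assumption: this sketch starts with an empty registry.
    pub fn new(trace: Option<&'a NpuTrace>) -> Self {
        Self { trace, analyzers: Vec::new() }
    }

    pub fn investigate(&self) -> Vec<AnalysisReport> {
        match self.trace {
            // No trace loaded: return an empty Vec rather than an error.
            None => Vec::new(),
            // Iterating the Vec preserves insertion order, so the reports
            // come back in registry order.
            Some(trace) => self.analyzers.iter().map(|a| a.analyze(trace)).collect(),
        }
    }
}

// Hypothetical analyzer that reports the cycle count under a given id.
pub struct CycleCounter(pub &'static str);

impl TraceAnalyzer for CycleCounter {
    fn id(&self) -> &str {
        self.0
    }

    fn analyze(&self, trace: &NpuTrace) -> AnalysisReport {
        AnalysisReport {
            analyzer_id: self.id().to_string(),
            summary: format!("{} cycles", trace.total_cycles),
        }
    }
}
```

Pushing two CycleCounter values with different ids and calling investigate() yields two reports in push order; the same registry with no trace yields an empty Vec.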

`investigate_summary() -> String`

Collapses all reports into a single LLM-ready system prompt. After a header line, each report contributes one line of the form "- [<analyzer_id>] <summary>". Each summary is capped at 500 characters, so the prompt length grows at most linearly with the number of analyzers.

let prompt = c.investigate_summary();
// Trace analysis summary (automated):
// - [roofline] AI 0.69 ops/byte · 321.4 GOPS (0% of peak) · memory-bound
// - [dma_util] DMA SATURATED: read 46% + write 46% pinned — compute only 4%
// - [bottleneck] 19179 windows · DmaRead×12787, DmaWrite×6392
// ...

Feed this straight into an LLM’s system channel.
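
A minimal sketch of how such a prompt could be assembled, assuming a stand-in AnalysisReport with analyzer_id and summary fields. The function name render_summary and the constant MAX_SUMMARY_CHARS are hypothetical; only the header line, the line format, and the 500-character figure come from the description above.

```rust
// Hypothetical stand-in for pccx-core's AnalysisReport.
pub struct AnalysisReport {
    pub analyzer_id: String,
    pub summary: String,
}

// Assumed cap, taken from the 500-character bound stated above.
const MAX_SUMMARY_CHARS: usize = 500;

pub fn render_summary(reports: &[AnalysisReport]) -> String {
    let mut out = String::from("Trace analysis summary (automated):\n");
    for r in reports {
        // Clip by chars, not bytes, so multi-byte symbols such as "·" or "×"
        // are never split in the middle.
        let clipped: String = r.summary.chars().take(MAX_SUMMARY_CHARS).collect();
        out.push_str(&format!("- [{}] {}\n", r.analyzer_id, clipped));
    }
    out
}
```

Clipping each summary independently keeps one verbose analyzer from crowding the others out of the prompt.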

`rank_by_severity() -> Vec<AnalysisReport>`

Server-side equivalent of the dashboard’s severity sort. Returns the reports reordered so error-class findings come first, then warnings, then informational entries. Ties break on registry declaration order so the output is stable across repeated calls.

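let top = c.rank_by_severity();
// The Vec is empty when no trace is loaded, so avoid indexing with top[0].
if let Some(first) = top.first() {
    println!("most urgent: {}", first.summary);
}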
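
The ordering rule can be illustrated with a stable sort. This is a sketch under assumptions: the Severity enum and the severity field on the stand-in AnalysisReport are hypothetical, and the real crate may represent severity differently.

```rust
// Hypothetical severity levels. Deriving Ord on an enum orders variants by
// declaration, so Error < Warning < Info.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Error,
    Warning,
    Info,
}

// Hypothetical stand-in for pccx-core's AnalysisReport.
#[derive(Debug, Clone, PartialEq)]
pub struct AnalysisReport {
    pub analyzer_id: String,
    pub severity: Severity,
    pub summary: String,
}

pub fn rank_by_severity(reports: &[AnalysisReport]) -> Vec<AnalysisReport> {
    let mut ranked = reports.to_vec();
    // slice::sort_by_key is a stable sort: reports with equal severity keep
    // their relative (registry) order, which makes the output repeatable.
    ranked.sort_by_key(|r| r.severity);
    ranked
}
```

An unstable sort such as sort_unstable_by_key would not guarantee the tie-break on registry order.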

`explain(analyzer_id: &str) -> Option<String>`

Long-form Markdown explainer for a single analyzer. Pulls every matching research::Citation and formats them alongside the analyzer’s description and its latest finding.

let md = c.explain("kv_cache_pressure").unwrap();
// # KV-Cache Pressure (kv_cache_pressure)
// Projects per-decode KV-cache footprint against L2 URAM; cites …
//
// ## Latest finding
// HBM-SPILL: decode 512 tokens → 60000 KB KV …
//
// ## Research lineage
// - **QServe …** (2024) — [2405.04532](…). _Why_: …

Returns None when the id is not in the registry, so the UI can distinguish “unknown id” from “no trace loaded”.

`suggest_fix(intent: &str) -> (String, String)`

Maps a natural-language intent to a (strategy_slug, SystemVerilog_stub) pair. If the intent contains a strategy slug verbatim, that slug is used; otherwise the keyword rules below select one.

let (slug, sv) = c.suggest_fix("please reduce DMA barrier contention");
// slug == "barrier_reduction"
// sv   == "class barrier_reduction_seq extends uvm_sequence; …"

Keyword routing — the Copilot looks for these substrings (in priority order) and maps them to a canonical strategy:

| Intent keywords | Strategy | Research lineage |
| --- | --- | --- |
| `barrier`, `sync` | `barrier_reduction` | (none) |
| `prefetch`, `dma` | `l2_prefetch` | Packing-Prefetch (arXiv 2508.08457) |
| `thermal`, `tdp`, `power`, `heat` | `back_pressure_gate` | KV260 7 W PL TDP; Hybrid-Systolic ISLPED 2025 |
| `latency`, `tail`, `p95`, `p99`, `jitter` | `kv_cache_thrash_probe` | HERMES 2025 tail-latency model |
| `early exit`, `edge-cloud` | `early_exit_decoder` | arXiv 2505.21594 |
| `speculative`, `draft`, `spec decode` | `speculative_draft_probe` | OpenPangu NPU (arXiv 2603.03383, 1.35×) |
| `evict`, `sparsif`, `kv drop` | `sparsified_kv_eviction` | LoopServe + EVICPRESS |
| `w4a8`, `kv4`, `quantize`, `qserve`, `qoq` | `qoq_kv4_quantize` | QServe (arXiv 2405.04532) + QQQ |
| `matryoshka`, `subnet`, `e2b`, `e4b` | `matryoshka_subnet_switch` | arXiv 2205.13147 |
| `wavelet`, `low-rank`, `long context` | `wavelet_attention_probe` | arXiv 2312.07590 |
| `flash`, `tile`, `softmax` | `flash_attention_tile_probe` | FlashAttention-2/3 |
| `moe`, `expert`, `mixture` | `kv_cache_thrash_probe` | Switch Transformer (arXiv 2101.03961) |
| `throughput`, `core` | `back_pressure_gate` | (none) |
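
The priority-ordered substring lookup can be sketched as a first-match scan over a rule table. This is an illustration, not the crate's implementation: the names RULES and route_intent are hypothetical, only the first four table rows are reproduced, and both the case-insensitive matching and the None fallback for unmatched intents are assumptions.

```rust
// First four rows of the routing table, kept in priority order.
const RULES: &[(&[&str], &str)] = &[
    (&["barrier", "sync"], "barrier_reduction"),
    (&["prefetch", "dma"], "l2_prefetch"),
    (&["thermal", "tdp", "power", "heat"], "back_pressure_gate"),
    (&["latency", "tail", "p95", "p99", "jitter"], "kv_cache_thrash_probe"),
];

// Returns the strategy slug of the first rule with a keyword that occurs
// in the intent, or None when no rule matches (assumed fallback).
pub fn route_intent(intent: &str) -> Option<&'static str> {
    // Assumption: matching is case-insensitive, so "DMA" hits `dma`.
    let lowered = intent.to_lowercase();
    RULES
        .iter()
        .find(|(keywords, _)| keywords.iter().any(|k| lowered.contains(*k)))
        .map(|(_, slug)| *slug)
}
```

With this ordering, "please reduce DMA barrier contention" resolves to barrier_reduction even though it also contains `dma`, because the `barrier` rule is checked first. Plain substring matching also means an intent containing "details" would hit the `tail` keyword, so rule order and keyword choice both matter.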

`generate_report(synth: Option<&SynthReport>) -> String`

Delegates to pccx_core::report::render_markdown. Pass a parsed SynthReport (from synth_report::load_from_files or synth_runner::run) to include utilisation and timing tables.

Registry extension

// Runtime-register an extra analyzer before investigate().
let mut c = Copilot::new(Some(&trace));
c.analyzers.push(Box::new(MyCustomAnalyzer));
let reports = c.investigate();

The strategy registry is queried via uvm::strategies_for_analyzer(id). The Copilot uses this to map any analyzer’s findings to a candidate mitigation set without hard-coded keyword matching.

Using it from automation

use pccx_ai_copilot::Copilot;
use pccx_core::{PccxFile, NpuTrace};
use std::fs::File;

fn summarise(path: &str) -> anyhow::Result<String> {
    let mut f = File::open(path)?;
    let pccx = PccxFile::read(&mut f)?;
    let trace = NpuTrace::from_payload(&pccx.payload)?;
    let c = Copilot::new(Some(&trace));
    Ok(c.investigate_summary())
}

Pairs naturally with pccx_analyze <trace.pccx> for CI pipelines — the CLI prints the equivalent of investigate_summary() by default.

Error handling

Copilot is infallible by design. Every method returns a value even when the trace is missing (empty Vec / "No trace loaded" string / empty SV stub / None from explain). The caller’s empty-state branch is the correct place to decide whether to render a welcome screen, toast, or CI skip — never inside the copilot.
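
A caller-side empty-state branch might look like the sketch below. The CiStep enum and plan_ci_step function are hypothetical names, and the input is simplified to a slice of summary strings standing in for the reports returned by investigate().

```rust
// Hypothetical decision type for a CI wrapper around the copilot.
#[derive(Debug, PartialEq)]
pub enum CiStep {
    Skip(&'static str),
    Publish(String),
}

// The copilot never fails, so the caller inspects the (possibly empty)
// result and decides what an empty state means in its own context.
pub fn plan_ci_step(summaries: &[String]) -> CiStep {
    if summaries.is_empty() {
        // Empty result: treated here as "no trace loaded".
        CiStep::Skip("no trace loaded")
    } else {
        CiStep::Publish(summaries.join("\n"))
    }
}
```

A dashboard would branch on the same emptiness check but render a welcome screen or toast instead of skipping.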
