RTL Source Reference (v001)

This section is the authoritative browser for every SystemVerilog module in the archived v001 NPU: 64 files across 8 categories, plus the host-side C API. Every file under codes/v001/ is reachable from here through a language-aware literalinclude. Click through to read the actual source with syntax highlighting, without visiting a separate repository.

See also

pccx: Parallel Compute Core eXecutor

High-level block diagram and core roles.

pccx ISA Specification

64-bit VLIW instruction set backing this RTL.

v001 is frozen. Active RTL is in hwkim-dev/pccx-FPGA-NPU-LLM-kv260 and documented under RTL Source Reference (v002).

Top level

NPU_top wrapper, BF16 barrel shifter.

Packages & Constants

ISA package, device / type / arch packages, interface defs.

Controller

AXI-Lite frontend, decoder, dispatcher, global scheduler.

Matrix Core (GEMM)

32×32 systolic array with DSP48E2 MACs.

Vector Core (GEMV)

Parallel μV-cores with reduction tree.

CVO Core (SFU)

Softmax / GELU / CORDIC non-linear engine.

Memory Control

L2 URAM cache, dispatcher, HP buffers, CVO bridge.

Preprocess

Feature-map cache + BF16→fixed-point pipeline.

Library

BF16 math, general algorithms, FIFO queue primitives.

Host API (C)

pccx_v1 HAL + high-level C interface under sw/driver/.
