pccx Documentation

Welcome to the pccx (Parallel Compute Core eXecutor) documentation. pccx is a scalable NPU architecture for accelerating Transformer-based LLMs on edge devices. Select a section from the sidebar to begin.

Ecosystem

RTL Implementation

github.com/hwkim-dev/pccx-FPGA-NPU-LLM-kv260

The active v002 SystemVerilog sources: ISA package, controller, compute cores (GEMM / GEMV / CVO), and memory hierarchy. The target device is the Xilinx Kria KV260 (Zynq UltraScale+ ZU5EV).

Every v002 RTL reference page on this site links back to the exact .sv file in that repository.

Open the pccx-FPGA-NPU-LLM-kv260 repository on GitHub

Documentation source

github.com/hwkim-dev/pccx — the Sphinx project powering this site.

Open the pccx documentation repository on GitHub

Author portfolio

hwkim-dev.github.io/hwkim-dev — blog, other projects, about.

Open the hwkim-dev portfolio site

Tooling & Lab

pccx-lab

Performance simulator and AI-integrated profiler built for the pccx NPU. Combines pre-RTL bottleneck detection, UVM co-simulation, and LLM-driven testbench generation in one workflow.

Work in Progress

Source: github.com/hwkim-dev/pccx-lab

Open the pccx-lab simulator and profiler

Design rationale

Why pccx-lab is one repo, not five. Module boundary rules (core/, ui/, uvm_bridge/, ai_copilot/).

Read the pccx-lab design rationale

v002 Architecture

Target Hardware

Archive

Toolchain Demos