pccx
Plot gallery

This gallery contains an automatically generated page for every plots/plot_*.py script. Each script produces an IEEE-style figure using the scienceplots matplotlib style and is rendered once per language. The scripts are shared across languages, and captions stay in English for numerics.

See CLAUDE.md §6 (Plotting Conventions) for authoring rules.
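
To make the "one page per script, rendered once per language" rule concrete, here is a minimal sketch of the mapping from script names to gallery pages. Sphinx-Gallery performs the real discovery and rendering; the function name `gallery_pages`, the language codes, the output layout, and the example script names are all assumptions made for illustration and are not taken from the project.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Assumed language codes (the site offers English and Korean versions).
LANGUAGES = ("en", "ko")


def gallery_pages(script_paths, languages=LANGUAGES):
    """Map each plots/plot_*.py script to one output page per language.

    The output layout (<lang>/plots/<stem>.html) is hypothetical and only
    illustrates the rule; it is not the project's actual build layout.
    """
    pages = []
    for path in sorted(script_paths):
        p = PurePosixPath(path)
        # Only scripts in plots/ that match plot_*.py become gallery pages.
        if p.parent.name != "plots" or not fnmatch(p.name, "plot_*.py"):
            continue
        # The same shared script is rendered once for every language.
        for lang in languages:
            pages.append(f"{lang}/plots/{p.stem}.html")
    return pages


if __name__ == "__main__":
    # Hypothetical script names, used only to exercise the mapping.
    for page in gallery_pages(["plots/plot_bandwidth.py", "plots/helper.py"]):
        print(page)
```

A helper such as plots/helper.py is skipped because it does not match the plot_*.py pattern, so it can be shared between scripts without producing a page of its own.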

Batch size vs achieved HP-AXI bandwidth

Gallery generated by Sphinx-Gallery

Copyright © 2026, hwkim
Last updated on 2026-04-19