Siddhesh Shinde
I build the invisible machines under your software — caches, pipelines, accelerators and silicon layouts. I care about cycles, joules and millimeters.
The engineer behind the silicon

I'm Siddhesh Shinde, a Computer Engineering grad student at NC State, obsessed with squeezing every last cycle, joule and millimeter out of silicon. My work spans cache & memory systems, out-of-order pipelines, CNN accelerators in SystemVerilog and full RTL-to-GDSII physical design in Synopsys ICC2.
I love the moment when an abstract architecture idea becomes a waveform, then a post-route layout, then real performance. Recent highlights: a 16-tap FIR accelerator achieving a ~30.6× speedup over a software baseline, a 130 nm PnR flow with a 43% PPAT improvement at an 8.24 ns critical path, and a streamed CNN engine with sliding-window and ping-pong buffering.
Academic Trace
M.S. Computer Engineering
Coursework: Embedded Systems Architecture · Architecture of Parallel Computers · Neural Networks & Deep Learning · ASIC Verification · Microprocessor Architecture · ASIC & FPGA Design with Verilog.
B.E. Electronics & Telecommunication
Coursework: VLSI Design · Digital & Analog Circuits · Control Systems · Power Electronics. Strong foundation in CMOS, DSP and embedded design.
Featured Work
Architecture, RTL, verification and physical-design projects, each with a live visualization.
Computer Architecture Simulator Suite
C/C++ simulators for branch prediction & configurable L1/L2 cache hierarchies.
Branch Predictor: Bimodal · Gshare · Hybrid
Trace-driven comparison of bimodal, gshare and hybrid predictors with deep stats.
Out-of-Order Superscalar Pipeline Simulator
Cycle-accurate dynamic instruction scheduling simulator for RISC-V.
DRAM-Streamed CNN Accelerator (RTL)
4×4 Conv + LeakyReLU + 2×2 AvgPool streamed CNN engine in SystemVerilog.
I2C Multi-Bus Controller — Verification & Coverage
Layered SystemVerilog verification env for a Wishbone-controlled I2CMB.
Intel Processor Testbench — RTL-to-GDSII (SkyWater 130 nm)
Full PnR flow on an open-PDK processor block in Synopsys ICC2.
FIR Filter Accelerator for RISC-V SoC
16-tap parallel-MAC FIR IP with DMA, ~30.6× speedup over a software baseline.
Tech Stack
Programming
RTL & Design
EDA Tools
Concepts
Experience
Project Intern · RFID Attendance System (Custom PCB)
- Designed, integrated and debugged an RFID-based attendance system on a custom PCB using a microcontroller, RFID reader, LCD and GSM module for real-time identification and notification.
- Performed board bring-up, prototype validation and PCB debugging of the 5 V power subsystem using an oscilloscope and a DMM, resolving voltage stability and ripple issues.
- Owned the firmware ↔ hardware integration loop and delivered a working classroom prototype.
Let's build something fast
Have a role, collab, or chip to tape out? Drop a signal.