FPGA 2018: Program

Note: Detailed descriptions of the workshops and panel are available here.



Sunday February 25

Workshops 9am – 2:30pm


FPGA-based Accelerated Cloud Computing with AWS EC2 F1 and SDAccel (Slides)
Parimal Patel (Xilinx) – San Carlos 3&4

High-Speed FPGA Packet Processing using the new P4 Programming Language (Slides - pptx) (Slides - pdf)
Gordon Brebner (1); Stephen Ibanez (2)
(1 Xilinx Labs; 2 Stanford University) – San Carlos 1&2


Doing Research on FPGAs in the Data Center
Organizers: Paul Chow (UToronto); Derek Chiou (UT Austin) – San Carlos 3&4


Lunch – Marriott Ferrantes Bay View (10th Floor)


Training of Quantized Neural Networks (Slides)

Thomas Preusser (Xilinx) - San Carlos 1-4


Optimizing Quantized Neural Networks on FPGAs (Slides)

Robert Green (ASIC Design Services) - San Carlos 1-4


Coffee Break

Special Session: Deep Learning
Session Chair: Andrew Ling, Intel – San Carlos 1-4


CausaLearn: Automated Scalable Framework for Streaming-based Causal Bayesian Learning using FPGAs (Slides)

Bita Darvish Rouhani; Mohammad Ghasemzadeh; Farinaz Koushanfar


C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs (Slides)

Shuo Wang (1); Zhe Li (2); Caiwen Ding (2); Bo Yuan (3); Qinru Qiu (2); Yanzhi Wang (2); Yun (Eric) Liang (1)
(1 Peking Univ; 2 Syracuse Univ; 3 CUNY)


Coffee Break


DeltaRNN: A Power-efficient Recurrent Neural Network Accelerator (Slides)

Chang Gao (1); Daniel Neil (2); Enea Ceolini (1); Shih-Chii Liu (1); and Tobi Delbruck (1)
(1 University of Zurich and ETH Zurich; 2 BenevolentAI)


A Lightweight YOLOv2: A Binarized CNN with A Parallel Support Vector Regression for an FPGA (Slides)

Hiroki Nakahara; Haruyoshi Yonekawa; Tomoya Fujii; and Shimpei Sato
(Tokyo Institute of Technology)




Reception – Marriott Ferrantes Bay View (10th Floor)

Monday February 26


Continental Breakfast – San Carlos Foyer


Opening Remarks – San Carlos 2-4

Jason Anderson (U Toronto); Kia Bazargan (UMN)

Session 1: Architecture
Session Chair: Sinan Kaptanoglu, Microsemi


Architecture and Circuit Design of An All-Spintronic FPGA Device (Slides)

Stephen Williams; Mingjie Lin
(Univ of Central Florida)


Liquid Silicon: A Data-Centric Reconfigurable Architecture enabled by RRAM Technology (Slides)

Yue Zha and Jing Li
(UW Madison)


Improving FPGA Performance with a S44 LUT Structure (Slides)(short paper)

Wenyi Feng (1); Jonathan Greene (1); Alan Mishchenko (2)
(1 Microsemi; 2 UC Berkeley)


Poster Session 1 and Break – San Carlos 1 & San Carlos Foyer

Session 2: CAD - San Carlos 2-4

Session Chair: Sabya Das, Xilinx


ParaDRo: A Parallel Deterministic Router Based on Spatial Partitioning and Scheduling (Slides)

Chin Hau Hoo (1); and Akash Kumar (2)
(1 National University of Singapore; 2 Technische Universitaet Dresden)


Routing Magic: Performing Computations Using Routing Networks and Voting Logic on Unary Encoded Data (Slides)

Soheil Mohajer; Zhiheng Wang; Kia Bazargan
(Univ of Minnesota)


A Full-System VM-HDL Co-Simulation Framework for Servers with PCIe-Connected FPGAs (Slides)

Shenghsun Cho; Mrunal Patel; Han Chen; Michael Ferdman; Peter Milder
(Stony Brook University)


Lunch – See ticket for location

Session 3: Deep Learning - San Carlos 2-4

Session Chair: Peter Cheung, Imperial College


Towards a Uniform Template-based Architecture for Accelerating 2D and 3D CNNs on FPGA (Slides)

Junzhong Shen; You Huang; Zelong Wang; Yuran Qiao; Mei Wen; Chunyuan Zhang
(National Univ of Defense Tech)


A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform - A Deep Learning Case Study (Slides)

Duncan Moss (1); Srivatsan Krishnan (2); Eriko Nurvitadhi (2); Piotr Ratuszniak (2); Chris Johnson (2); Jaewoong Sim (2); Asit Mishra (2); Debbie Marr (2); Suchit Subhaschandra (2); Philip Leong (1)
(1 The Univ of Sydney; 2 Intel)


A Framework for Generating High Throughput CNN Implementations on FPGAs (Slides)

(Best Paper Nominee)

Hanqing Zeng; Ren Chen; Chi Zhang; Viktor Prasanna


Poster Session 2 and Break – San Carlos 1 & San Carlos Foyer

Session 4: High Level Synthesis 1 - San Carlos 2-4

Session Chair: Stephen Neuendorffer, Xilinx


Dynamically Scheduled High-level Synthesis (Slides)

(Best Paper Nominee)

Lana Josipovic; Radhika Ghosal; Paolo Ienne


A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation (Slides)

(Best Paper Nominee)

Steve Dai; Gai Liu; Zhiru Zhang


P4-compatible High-level Synthesis of Low Latency 100 Gb/s Streaming Packet Parsers in FPGAs (Slides)(short paper)

Jeferson Santiago da Silva; François-Raymond Boyer; J.M. Pierre Langlois
(Polytechnique Montreal)


Banquet - San Carlos 2-4


Panel: The Computational Battle for Deep Learning


Debbie Marr (Intel), Jeff Johnson (Facebook), Kees Vissers (Xilinx), Eric Chung (Microsoft), Song Han (Stanford/MIT)

Tuesday February 27

Session 5: Applications 1 - San Carlos 2-4

Session Chair: John Lockwood, Algo-Logic Systems


Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL (Slides)

Hamid Reza Zohouri; Artur Podobas; Satoshi Matsuoka
(Tokyo Institute of Technology)


A HOG-based real-time and multi-scale Pedestrian Detector Demonstration System on FPGA (Slides)

Jan Dürre; Dario Paradzik; Holger Blume
(Leibniz Universität Hannover)


Scalable Window Generation for the Intel Broadwell+Arria 10 and High-Bandwidth FPGA Systems

Greg Stitt; Abhay Gupta; Madison Emas; David Wilson; Austin Baylis
(University of Florida)


High-performance QR Decomposition for FPGAs (Slides)(short paper)

Martin Langhammer; Bogdan Pasca


Poster Session 3 and Break – San Carlos 1 & San Carlos Foyer

Session 6: High Level Synthesis 2 - San Carlos 2-4

Session Chair: George Constantinides, Imperial College


ADAM: Automated Design Analysis and Merging for Speeding up FPGA Development (Slides)

Ho-Cheung Ng; Shuanglong Liu; Wayne Luk
(Imperial College London)


Graph-Theoretically Optimal Memory Banking for Stencil-Based Computing Kernels (Slides)

Juan Escobedo; Mingjie Lin
(University of Central Florida)


Architecture Exploration for HLS-Oriented FPGA Debug Overlays (Slides)

Al-Shahna Jamal (1); Jeffrey Goeders (2); Steve Wilton (1)
(1 UBC; 2 BYU)


Lunch – Marriott Ferrantes Bay View (10th Floor)

Session 7: Circuits and Computation Engines - San Carlos 2-4

Session Chair: Nachiket Kapre, University of Waterloo


Memory-Efficient Fast Fourier Transform on Streaming Data by Fusing Permutations (Slides)

François Serre; Markus Püschel
(ETH Zurich)


Degree-aware Hybrid Graph Traversal on FPGA-HMC Platform (Slides)

Jialiang Zhang; Jing Li


Accelerating Graph Analytics By Co-Optimizing Storage and Access on an FPGA-HMC Platform (Slides)

Soroosh Khoram; Jialiang Zhang; Maxwell Strange; Jing Li


Coffee Break - San Carlos Foyer

Session 8: Applications 2 - San Carlos 2-4

Session Chair: Lesley Shannon, Simon Fraser University


Configurable FPGA Packet Parser for Terabit Networks with Guaranteed Wire-Speed Throughput (Slides)

Jakub Cabal (1); Pavel Benáček (1); Lukáš Kekely (1); Michal Kekely (2); Viktor Puš (2); Jan Kořenek (3)
(1 CESNET a.l.e.; 2 Netcope Technologies; 3 Brno Univ of Tech)


FASTCF: FPGA-based Accelerator for Stochastic-Gradient-Descent-based Collaborative Filtering (Slides)

(Best Paper Award Recipient)

Shijie Zhou (1); Rajgopal Kannan (2); Yu Min (1); Viktor Prasanna (1)
(1 USC; 2 US Army Research Lab)


Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs (Slides)

Yuan Zhou (1); Udit Gupta (2); Steve Dai (1); Ritchie Zhao (1); Nitish Srivastava (1); Hanchen Jin (1); Joseph Featherston (1); Yi-Hsiang Lai (1); Gai Liu (1); Gustavo Angarita Velasquez (1); Wenping Wang (1); Zhiru Zhang (1)
(1 Cornell; 2 Harvard)


FPGA Fastfood - A High Speed Systolic Implementation of a Large Scale Online Kernel Method (Slides)(short paper)

Sean Fox; David Boland; Philip Leong
(The University of Sydney)


Closing Remarks, Best Paper Award

Poster Session 1 – San Carlos 1

Optimizations of Sequence Alignment on FPGA: A Case Study of Extended Sequence Alignment

Zheming Jin; Kazutomo Yoshii
(Argonne National Lab)

Automatic Optimising CNN with Depthwise Separable Convolution on FPGA

Ruizhe Zhao; Xinyu Niu; Wayne Luk
(Imperial College London)

Continuous Skyline Computation Accelerator with Parallelizing Dominance Relation Calculations

Kenichi Koizumi; Kei Hiraki; Mary Inaba
(The Univ of Tokyo)

Fast-Track: Exploiting Fast FPGA wiring for implementing NoC shortcuts

Nachiket Kapre (1); Tushar Krishna (2)
(1 Univ of Waterloo; 2 Georgia Tech)

An Optimal Microarchitecture for Stencil Computation with Data Reuse and Fine-Grained Parallelism

Yuze Chi; Peipei Zhou; Jason Cong

A FPGA friendly approximate computing framework with hybrid Neural networks

Haiyue Song; Xiang Song; Tianjian Li; Naifeng Jing; Xiaoyao Liang; Li Jiang
(Shanghai Jiao Tong Univ)

In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC

Eriko Nurvitadhi; Jeff Cook; Asit Mishra; Debbie Marr; Kevin Nealis; Philip Colangelo; Andrew Ling; Davor Capalija; Utku Aydonat; Sergey Shumarayev; Aravind Dasu

Evaluation of OpenCL Performance-oriented Optimizations for Streaming Kernels on the FPGA

Zheming Jin
(Argonne National Lab)

K-Flow: A Programming and Scheduling Framework to Optimize Dataflow Execution on CPU-FPGA Platforms

Jason Cong (1); Zhenman Fang (1,2); Yao Hu (3); Di Wu (3)
(1 UCLA; 2 Xilinx; 3 Falcon Computing Solutions, Inc.)

FPGA-based LSTM Acceleration for Real-Time EEG Signal Processing

Zhe Chen; Andrew Howe; Hugh T. Blair; Jason Cong

Understanding Performance Differences of FPGAs and GPUs

Jason Cong (1); Zhenman Fang (1,2); Michael Lo (1); Hanrui Wang (1,3); Jingxian Xu (1); Shaochong Zhang (1)
(1 UCLA; 2 Xilinx; 3 Fudan Univ)

Poster Session 2 – San Carlos 1

Software/Hardware co-design for multichannel scheduling in IEEE 802.11p MLME

Nan Ding (1); Wei Zhang (2); Yanhua Ma (1); Zhenguo Gao (1)
(1 Dalian Univ of Tech; 2 Huawei)

Solving Satisfiability Problem on Quantum Annealer: A Lesson from FPGA CAD Tools

Juexiao Su; Lei He

Domino: An Asynchronous and Energy-efficient Accelerator for Graph Processing

Chongchong Xu; Chao Wang; Yiwei Zhang; Lei Gong; Xi Li; Xuehai Zhou
(Univ of Science and Tech of China)

Towards Serial-Equivalent Parallel Routing for FPGAs

Minghua Shen (1); Wentai Zhang (2); Nong Xiao (1); Guojie Luo (2)
(1 Sun Yat-sen Univ; 2 Peking Univ)

Performance Comparison of Multiple Approaches of Status Register for Medium Density Memory Suitable for Implementation of a Lossless Compression Dictionary

Matěj Bartík (2); Tomáš Beneš (1); Sven Ubik (2); Pavel Kubalík (1)
(1 CTU FIT DDD; 2 CESNET a.l.e.)

BoxPlacer: Force Directed-Based Timing-Driven Placement for Large-Scale FPGAs

Minghua Shen (1); Jiaxi Zhang (2); Nong Xiao (1); Guojie Luo (2)
(1 Sun Yat-sen Univ; 2 Peking Univ)

DATuner: An Extensible Distributed Autotuning Framework for FPGA Design and Design Automation

Gai Liu (1); Ecenur Ustun (1); Shaojie Xiang (1); Chang Xu (2); Guojie Luo (3); Zhiru Zhang (1)
(1 Cornell; 2 IBM Research-China; 3 Peking Univ)

Mapping Large-Scale DNNs on Asymmetric FPGAs

Wentai Zhang (1); Jiaxi Zhang (1); Minghua Shen (2); Nong Xiao (2); Guojie Luo (1)
(1 Peking Univ; 2 Sun Yat-sen Univ)

Software-Defined FPGA-Based Accelerator for Deep Convolutional Neural Networks

Yankang Du; Qinrang Liu; Shuai Wei; Chen Gao
(National Digital Switching System Engineering & Technology Research Center)

Design of an MTJ-Based Nonvolatile LUT Circuit with a Data-Update Minimized Shift Operation for an Ultra-Low-Power FPGA

Daisuke Suzuki; Takahiro Hanyu
(Tohoku Univ)

High-Throughput Lossless Compression on Tightly Coupled CPU-FPGA Platforms

Weikang Qiao (1); Jieqiong Du (1); Zhenman Fang (1,2); Libo Wang (1); Michael Lo (1); Jason Cong (1); Mau-Chung Frank Chang (1)
(1 UCLA; 2 Xilinx)

Poster Session 3 – San Carlos 1

HexCell: a Hexagonal Cell for Evolvable Systolic Arrays on FPGAs

Fady Hussein; Luka Daoud; Nader Rafla
(Boise State Univ)

Label based Feature Analysis and Target Detection with Imager-driven Processing Mode for Ultrafast-Imager

Xiaoyu Yu (1); Dong Ye (2)
(1 Tencent; 2 Harbin Inst of Tech)

A Low-Power Deconvolutional Accelerator for Convolutional Neural Network Based Segmentation on FPGA

Shuanglong Liu (1); Xinyu Niu (2); Wayne Luk (1)
(1 Imperial College London; 2 Corerain Tech Ltd., China)

FPGAs in the Datacenters: the Case of Parallel Hybrid Super Scalar String Sample Sort (pHS^5)

Mikhail Asiatici (1); Damian Maiorano (2); Paolo Ienne (1)
(1 EPFL; 2 Politecnico di Torino)

SIFT keypoint Descriptor Matching Algorithm: A Fully Pipelined Accelerator on FPGA

Luka Daoud; Muhammad Kamran Latif; Nader Rafla
(Boise State University)

FGC: A Toolflow for Generating and Configuring Custom FPGAs

Oluseyi Ayorinde (1); He Qi (2); Benton Calhoun (2)
(1 US Army Research Lab; 2 University of Virginia)

Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs

Philip Colangelo (1); Nasibeh Nasiri (1); Eriko Nurvitadhi (1); Asit Mishra (1); Martin Margala (2); Kevin Nealis (1)
(1 Intel; 2 University of Massachusetts Lowell)

LEOSoC: An Open-Source Cross-Platform Embedded Linux Library for Managing Hardware Accelerators in Heterogeneous System-on-Chips

Andrea Guerrieri (1); Sahand Kashani-Akhavan (1); Mikhail Asiatici (1); Pasquale Lombardi (2); Bilel Belhadj (2); Paolo Ienne (1)
(1 EPFL; 2 Syderal SA)

A Self-adaptation Method of Fitting Convolutional Neural Network into FPGA

Ning Mao (1); Zhihong Huang (1); Xing Wei (1); He Zhao (1); Xinkai Di (1); Le Yu (2); Haigang Yang (1)
(1 Chinese Academy of Sciences (UCAS); 2 Beijing Tech and Business Univ)