Publications

(2023). MSD: Mixing Signed Digit Representations for Hardware-efficient DNN Acceleration on FPGA with Heterogeneous Resources. (To appear) 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM'23).

PDF Project

(2023). Model-Platform Optimized Deep Neural Network Accelerator Generation through Mixed-integer Geometric Programming. (To appear) 2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM'23).

PDF Project

(2023). DPACS: Hardware Accelerated Dynamic Neural Network Pruning through Algorithm-Architecture Co-design. The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems.

PDF Code Project DOI

(2023). Millisecond autofocusing microscopy using neuromorphic event sensing. Optics and Lasers in Engineering.

Project DOI

(2022). NITI: Training Integer Neural Networks Using Integer-only Arithmetic. IEEE Transactions on Parallel and Distributed Systems.

PDF Code Project DOI

(2022). Lens-free motion analysis via neuromorphic laser speckle imaging. Optics Express 30, 2206-2218 (2022).

Project DOI

(2022). REMOT: A Hardware-Software Architecture for Attention-Guided Multi-Object Tracking with Dynamic Vision Sensors on FPGAs. The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. (Best Paper Nominee).

PDF Code Project DOI

(2021). High-speed laser-scanning biological microscopy using FACED. Nature Protocols.

Project DOI

(2021). Event-based laser speckle correlation for micro motion estimation. Optics Letters.

Project DOI

(2021). HAO: Hardware-aware Neural Architecture Optimization for Efficient Inference. 2021 IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).

Project DOI

(2021). Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).

Project DOI

(2021). Low-Latency In Situ Image Analytics With FPGA-Based Quantized Convolutional Neural Network. IEEE Transactions on Neural Networks and Learning Systems.

Project DOI

(2020). Vision Guided Crop Detection in Field Robots using FPGA-Based Reconfigurable Computers. 2020 IEEE International Symposium on Circuits and Systems (ISCAS).

Project DOI

(2020). Deep-learning-assisted biophysical imaging cytometry at massive throughput delineates cell population heterogeneity. Lab on a Chip.

Project DOI

(2020). FTDL: A tailored FPGA-overlay for deep learning with high scalability. Proceedings of the 57th Annual Design Automation Conference 2020.

PDF Code Project

(2020). CSB-RNN: A faster-than-realtime RNN acceleration framework with compressed structured blocks. 2020 International Conference on Supercomputing.

PDF Project DOI

(2020). Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers. Eighth International Conference on Learning Representations.

PDF Code Project DOI

(2020). RedCap: residual encoder-decoder capsule network for holographic image reconstruction. Optics Express.

Project DOI

(2020). FTDL: An FPGA-tailored architecture for deep learning systems. Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays.

PDF Code Project Poster Slides DOI

(2020). A super real-time RNN framework with compressed structured block. 2020 Boston Area Architecture Workshop.

PDF Project Slides

(2020). PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells. Bioinformatics.

Project DOI

(2020). Exploiting Elasticity in Tensor Ranks for Compressing Neural Networks. 2020 25th International Conference on Pattern Recognition (ICPR).

Project DOI

(2019). GraVF-M: Graph Processing System Generation for Multi-FPGA Platforms. ACM Transactions on Reconfigurable Technology and Systems (TRETS), Volume 12, Issue 4.

PDF Code Project DOI

(2019). Fringe Pattern Improvement and Super-Resolution Using Deep Learning in Digital Holography. IEEE Transactions on Industrial Informatics.

Project DOI

(2019). E-LSTM: Efficient inference of sparse LSTM on embedded heterogeneous system. Proceedings of the 56th Annual Design Automation Conference 2019.

PDF Code Project Poster Slides DOI

(2019). PACoGen: A Hardware Posit Arithmetic Core Generator. IEEE Access.

Project DOI

(2019). A real-time coprime line scan super-resolution system for ultra-fast microscopy. IEEE Transactions on Biomedical Circuits and Systems.

PDF Project DOI

(2018). Large-scale multi-class image-based cell classification with deep learning. IEEE Journal of Biomedical and Health Informatics.

Project DOI

(2018). Performance-driven System Generation for Distributed Vertex-Centric Graph Processing on Multi-FPGA Systems. 2018 28th International Conference on Field Programmable Logic and Applications (FPL).

PDF Code Project Poster Slides DOI

(2018). Architecture Generator for Type-3 Unum Posit Adder/Subtractor. 2018 IEEE International Symposium on Circuits and Systems.

PDF Project DOI

(2018). Universal number posit arithmetic generator on FPGA. 2018 Design, Automation & Test in Europe Conference & Exhibition.

Project DOI

(2018). An Unified Architecture for Single, Double, Double-Extended, and Quadruple Precision Division. Circuits, Systems, and Signal Processing.

PDF Project DOI

(2017). Ultra-low latency continuous block-parallel stream windowing using FPGA on-chip memory. 2017 International Conference on Field Programmable Technology.

PDF Project DOI

(2017). NnCore: A parameterized non-linear function generator for machine learning applications in FPGAs. 2017 International Conference on Field Programmable Technology (ICFPT).

PDF Code Project DOI

(2017). Image super-resolution for ultrafast optical time-stretch imaging. 25th Congress of the International Commission for Optics.

PDF Project Slides

(2017). Towards Flexible Automatic Generation of Graph Processing Gateware. HEART2017: Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies.

PDF Code Project Slides DOI

(2017). A parameterizable activation function generator for FPGA-based neural network applications. 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).

PDF Project DOI

(2017). All-passive pixel super-resolution of time-stretch imaging. Scientific Reports.

Project DOI

(2017). High-throughput cellular imaging with high-speed asymmetric-detection time-stretch optical microscopy under FPGA platform. 2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig).

PDF Code Project DOI

(2017). Area-Efficient Architecture for Dual-Mode Double Precision Floating Point Division. IEEE Transactions on Circuits and Systems I: Regular Papers.

PDF Project DOI

(2016). High-throughput time-stretch imaging flow cytometry for multi-class classification of phytoplankton. Optics Express.

Project DOI

(2016). Real-time object detection and classification for high-speed asymmetric-detection time-stretch optical microscopy on FPGA. 2016 International Conference on Field Programmable Technology (FPT).

Project DOI

(2016). Towards FPGA-assisted Spark: An SVM training acceleration case study. 2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig).

PDF Code Project DOI

(2016). Computationally Efficient Hyperspectral Data Learning Based on the Doubly Stochastic Dirichlet Process. IEEE Transactions on Geoscience and Remote Sensing.

Project DOI

(2016). GraVF: A vertex-centric distributed graph processing framework on FPGAs. 2016 26th International Conference on Field Programmable Logic and Applications (FPL).

PDF Code Project Poster Slides DOI

(2016). A Soft Processor Overlay with Tightly-coupled FPGA Accelerator. 2nd International Workshop on Overlay Architectures for FPGAs (OLAF 2016).

PDF Code Project DOI

(2016). Vertex-centric Graph Processing on FPGA. 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).

PDF Project Poster DOI

(2015). QuickDough: a rapid FPGA loop accelerator design framework using soft CGRA overlay. 2015 International Conference on Field Programmable Technology (FPT).

PDF Code Project DOI

(2015). Automatic Nested Loop Acceleration on FPGAs Using Soft CGRA Overlay. The Second International Workshop on FPGAs for Software Programmers.

PDF Project DOI

(2015). Configurable Architectures for Multi-Mode Floating Point Adders. IEEE Transactions on Circuits and Systems I: Regular Papers.

PDF Project DOI

(2013). Direct Virtual Memory Access from FPGA for High-Productivity Heterogeneous Computing. 2013 International Conference on Field-Programmable Technology (FPT).

PDF Project DOI

(2009). Operation scheduling for FPGA-based reconfigurable computers. 2009 International Conference on Field Programmable Logic and Applications.

Project DOI