Although there are distinct power-performance advantages in customizing an accelerator for a specific combination of FPGA platform and neural network model, developing such highly customized accelerators is challenging due to the massive design space, which spans the range of network models to be accelerated, the target platform's compute capability, and its memory capacity and performance characteristics. To address this architectural customization problem, an automatic design space exploration (DSE) framework based on mixed-integer geometric programming (MIGP) is presented. Given the set of DNN models to be accelerated and a generic description of the target platform's compute and memory capabilities as input, the proposed framework automatically customizes an architectural template for the platform-model combination and produces the associated I/O schedule to maximize end-to-end performance. By formulating DNN inference as a multi-level loop tiling problem, the framework first customizes an accelerator template, consisting of a parameterizable array architecture with SIMD execution cores and a customizable memory hierarchy, using an MIGP that maximizes expected resource utilization. A second MIGP then schedules memory and compute operations as tiles to improve on-chip data reuse and memory bandwidth utilization. Experimental results from a wide range of neural network model and FPGA platform combinations show that the proposed scheme produces accelerators with performance comparable to the state of the art. The proposed DSE framework and the resulting hardware/software generator are available as an open-source package called AGNA, with the hope that it may facilitate vendor-agnostic DNN accelerator development by the research community.
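To make the multi-level loop tiling view of DNN inference concrete, the sketch below tiles the filter and output-row loops of a toy convolution so that each tile touches only a bounded working set; this per-tile footprint is the kind of quantity a capacity constraint in an MIGP would bound. This is an illustration under assumed sizes and tile factors, not the AGNA formulation itself: the function names, tile factors `tk`/`toh`, and the buffer-footprint comment are all assumptions.

```python
def conv2d(inp, weights):
    """Naive 2D convolution: inp[C][H][W], weights[K][C][R][S] -> out[K][OH][OW]."""
    K, C = len(weights), len(inp)
    R, S = len(weights[0][0]), len(weights[0][0][0])
    H, W = len(inp[0]), len(inp[0][0])
    OH, OW = H - R + 1, W - S + 1
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(K)]
    for k in range(K):
        for oh in range(OH):
            for ow in range(OW):
                acc = 0.0
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            acc += inp[c][oh + r][ow + s] * weights[k][c][r][s]
                out[k][oh][ow] = acc
    return out


def conv2d_tiled(inp, weights, tk=2, toh=2):
    """Same computation with the k and oh loops split into tile/intra-tile levels.

    One (k0, oh0) tile needs only tk*C*R*S weight words and C*(toh+R-1)*W input
    words on chip -- an illustrative footprint an MIGP capacity constraint
    could bound against the platform's buffer size.
    """
    K, C = len(weights), len(inp)
    R, S = len(weights[0][0]), len(weights[0][0][0])
    H, W = len(inp[0]), len(inp[0][0])
    OH, OW = H - R + 1, W - S + 1
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(K)]
    for k0 in range(0, K, tk):          # tile loop over filters
        for oh0 in range(0, OH, toh):   # tile loop over output rows
            for k in range(k0, min(k0 + tk, K)):
                for oh in range(oh0, min(oh0 + toh, OH)):
                    for ow in range(OW):
                        acc = 0.0
                        for c in range(C):
                            for r in range(R):
                                for s in range(S):
                                    acc += inp[c][oh + r][ow + s] * weights[k][c][r][s]
                        out[k][oh][ow] = acc
    return out


if __name__ == "__main__":
    import random
    random.seed(0)
    inp = [[[random.random() for _ in range(6)] for _ in range(6)] for _ in range(3)]
    w = [[[[random.random() for _ in range(3)] for _ in range(3)]
          for _ in range(3)] for _ in range(4)]
    # Tiling reorders iteration but computes each output element identically.
    assert conv2d(inp, w) == conv2d_tiled(inp, w)
```

Because each output element is accumulated in the same inner-loop order in both versions, the tiled variant is bit-identical to the naive one; only the traversal order, and hence the on-chip working set per tile, changes.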