Ashish Reddy Bommana, Susheel Ujwal Siddamshetty, Dhilleswararao Pudi, Arvind T. K. R, Srinivas Boppu, M Sabarimalai Manikandan, Linga Reddy Cenkeramaddi, “Design of Synthesis-time Vectorized Arithmetic Hardware for Tapered Floating-point Addition and Subtraction,” has been accepted for publication in ACM Transactions on Design Automation of Electronic Systems (2022).
Abstract: Energy efficiency has become the new performance criterion in this era of pervasive embedded computing; thus, accelerator-rich multi-processor system-on-chips are commonly used in embedded computing hardware. Computationally intensive machine learning applications have gained much traction and are now deployed in many application domains, owing to abundant and cheaply available computational capacity. In addition, there is a growing trend toward developing hardware accelerators for machine learning applications on embedded edge devices, where performance and energy efficiency are critical. Although these hardware accelerators frequently use floating-point operations for accuracy, reduced-width floating-point formats are also used to reduce hardware complexity, and thus power consumption, while maintaining accuracy. Vectorization concepts can also be used to improve performance, energy efficiency, and memory bandwidth. In this article, we propose the design of a vectorized floating-point adder/subtractor that supports arbitrary-length floating-point formats with varying exponent and mantissa widths. In comparison to existing designs in the literature, the proposed design is 2.57× more area-efficient and 1.56× more power-efficient, and it supports true vectorization with no restrictions on exponent and mantissa widths.
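To make "varying exponent and mantissa widths" concrete, the following is a minimal software sketch of how a bit pattern might be interpreted once the two widths are parameters. It is not the hardware design from the paper. The function name `decode_float` and the IEEE-754-style layout (one sign bit, a biased exponent, a fraction with an implicit leading one) are assumptions made for illustration, and infinities and NaNs are left out.

```python
def decode_float(bits: int, exp_width: int, man_width: int) -> float:
    """Decode a bit pattern with configurable exponent and mantissa widths.

    Assumed layout (illustrative, not taken from the paper):
    [1 sign bit][exp_width exponent bits][man_width fraction bits],
    with exponent bias 2**(exp_width - 1) - 1.
    An all-ones exponent is treated as an ordinary value (no inf/NaN).
    """
    sign = (bits >> (exp_width + man_width)) & 1
    exponent = (bits >> man_width) & ((1 << exp_width) - 1)
    fraction = bits & ((1 << man_width) - 1)
    bias = (1 << (exp_width - 1)) - 1

    if exponent == 0:
        # Subnormal range: no implicit leading one, fixed minimum exponent.
        value = (fraction / (1 << man_width)) * 2.0 ** (1 - bias)
    else:
        # Normal range: implicit leading one ahead of the fraction bits.
        value = (1 + fraction / (1 << man_width)) * 2.0 ** (exponent - bias)

    return -value if sign else value


# The same routine covers different width splits of a 16-bit word.
half = decode_float(0x3C00, exp_width=5, man_width=10)   # half-precision split
bf16 = decode_float(0x3FC0, exp_width=8, man_width=7)    # bfloat16-style split
```

Only the two width arguments change between the calls, which is the flexibility the abstract describes; a vectorized unit would additionally pack several such narrow values into one wide datapath word.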