A High-Performance Sum of Absolute Difference Implementation for Motion Estimation

DOI: 10.1109/TCSVT.2006.877150

Citations: 21
This paper presents a high-performance sum of absolute difference (SAD) architecture for motion estimation, which is the most time-consuming and compute-intensive part of video coding. The proposed architecture contains novel and efficient optimizations to overcome bottlenecks discovered in existing approaches. In addition, sophisticated control logic with multiple early termination mechanisms further enhances execution speed and makes the architecture suitable for general-purpose usage. Hence, the proposed architecture is not restricted to a single block-matching algorithm in motion estimation; it supports a wide range of algorithms. The proposed SAD architecture outperforms contemporary architectures in terms of execution speed and area efficiency. The proposed architecture with three pipeline stages, synthesized in a 0.18-µm CMOS technology, can attain a 770-MHz operating frequency at a cost of less than 5600 gates. Correspondingly, the performance metrics for the proposed low-latency two-stage architecture are 730 MHz and 7500 gates.
Journal: IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 16, no. 7, pp. 876-883, 2006
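
The abstract describes SAD computation with early termination as the core of block matching. The following is a minimal software sketch of that general idea, not the paper's hardware architecture or its control logic: the function names `sad_early_exit` and `full_search`, the row-by-row termination check, and the exhaustive search pattern are all assumptions chosen for illustration.

```python
def sad_early_exit(cur, ref, best_so_far):
    """Sum of absolute differences between two equally sized blocks.

    Assumed early-termination rule: after each row, give up as soon as the
    partial sum reaches best_so_far, because this candidate can no longer
    beat the best match found so far. Returns None when rejected.
    """
    total = 0
    for cur_row, ref_row in zip(cur, ref):
        total += sum(abs(c - r) for c, r in zip(cur_row, ref_row))
        if total >= best_so_far:
            return None  # candidate rejected before all rows were read
    return total


def full_search(cur, frame, top, left, radius):
    """Exhaustive block matching around (top, left) in a reference frame.

    Returns (best_sad, (dy, dx)). Full search is only one of many
    block-matching algorithms that could drive the SAD routine.
    """
    rows, cols = len(cur), len(cur[0])
    best, best_mv = float("inf"), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + rows > len(frame) or x + cols > len(frame[0]):
                continue
            candidate = [row[x:x + cols] for row in frame[y:y + rows]]
            sad = sad_early_exit(cur, candidate, best)
            if sad is not None:
                best, best_mv = sad, (dy, dx)
    return best, best_mv
```

In this sketch the search algorithm only supplies candidate positions and a running best value, so the same SAD routine could be reused with a different block-matching strategy, which loosely mirrors the generality the abstract claims for the hardware.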
    • ...Vanne et al. [4] proposed a new SAD processing unit and first included a comparison of various SAD processing units in terms of area and delay...
    • ...In Table III, we can note that our architectures, synthesized for maximum throughput, surpass those presented in [2], [4] and [6]...
    • ...The work in [4] shows a 16-bits-input SAD architecture (equivalent to our 2-input sample)...
    • ...We achieved a 496% throughput increase (16-input 6-stage pipeline) compared to [4]...
    • ...[2] [4] [6] This work CMOS Technology TSMC 0.18µm 0.18µm TSMC 0.18µm TSMC 0.18µm...

    Fabio L. Walter et al. Synthesis and comparison of low-power high-throughput architectures fo...

    • ...The kernel processing can be implemented with iterative parallel/serial ALU operations [25]...

    Yong-Kyu Jung. Hardware/Software Co-reconfigurable Instruction Decoder for Adaptive M...

    • ...The absolute difference module implements the absolute difference operation [27] expressed as...

    Gustavo A. Ruiz et al. An Efficient VLSI Architecture of Fractional Motion Estimation in H.26...

    • ...In the last few years several efficient circuits and specialized processors for Motion Estimation (ME) have been proposed [1-6]...
    • ...every clock cycle), memory organizations exploited in existing parallel SAD-based architectures [1, 2, 4-6] can be used...
    • ...For this reason, just as happens in the referenced counterparts [1, 2, 4-6], implementation details and comparisons provided in the following of the paper are referred to the portion that in Fig.2 is enclosed in the grey box...
    • ...The circuits recently proposed in [1, 2, 4-6] to perform SAD-based dissimilarity measures for VBSME were chosen as the reference counterparts...
    • ...Table I demonstrates that, at a parity of parallelism level, when the hybrid PE is used, the novel MFWSAD-based circuits can achieve performances comparable with the SAD-based counterparts described in [1, 2, 4-6], but required resources are significantly increased...

    Stefania Perri et al. VLSI Circuits for Accurate Motion Estimation

    • ...CSA trees are also optimized. As in [19], the absolute difference operation can be expressed as (4)...
    • ...In this way, the inverters in all PEs, which were adopted by [19], are also discarded. Because 32 SAD Trees are configured in the design, inverters could be saved...

    Zhenyu Liu et al. HDTV1080p H.264/AVC Encoder Chip Design and Performance Analysis
