A Novel Reconfigurable Data-Flow Architecture for Real Time Video Processing

  • (1. School of Microelectronics, Xidian University, Xi’an 710071, China; 2. School of Electronic Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China)

Online published: 2013-08-12

Abstract

This paper describes a dynamically reconfigurable data-flow hardware architecture optimized for image and video computation. It is a scalable, hierarchically organized parallel architecture that consists of data-flow clusters and finite-state machine (FSM) controllers. Each cluster contains various kinds of cells that are optimized for video processing. To facilitate the design process, we also provide a C-like language for design specification, together with associated design tools. Several video applications have been implemented on the architecture to demonstrate its applicability and flexibility. Experimental results show that the architecture, along with its video applications, can be used in many real-time video processing tasks.

Cite this article

LIU Zhen-tao1* (刘镇弢), LI Tao2 (李 涛), HAN Jun-gang2 (韩俊刚). A Novel Reconfigurable Data-Flow Architecture for Real Time Video Processing[J]. Journal of Shanghai Jiaotong University (Science), 2013, 18(3): 348-359. DOI: 10.1007/s12204-013-1405-2
