DCT Presentation1

Embed Size (px)

Citation preview

  • 7/22/2019 DCT Presentation1

    1/39

    A

    Presentation on

    Design & Verification of DCT

    Algorithm

    Guided by:- Submitted by:-

    Mr. Preet Jain Atush Jain(Asst. Prof. (0802EC09ME03)

    EC Department)

  • 7/22/2019 DCT Presentation1

    2/39

    Abstract

    DCT is abbreviated as DiscreteCosine Transform.

    It can be regarded as a discrete time version of the FourierCosine series.

    It is technique for converting a signal into elementaryfrequency components.

    It is very Common and well known algorithm, used for signaland image compression.

  • 7/22/2019 DCT Presentation1

    3/39

    Abstract Contd

    The DCT Core uses Direct Implementation of algorithm (i.e.

    as per the standard equation of DCT)

    The DCT Core is implemented using Verilog HDL.

    The output of core is then verified from the output of

    MATLAB.

  • 7/22/2019 DCT Presentation1

    4/39

    Discrete Cosine Transform

    A discrete cosine transform (DCT) expresses a sequence of

    finitely many data points in terms of a sum of cosine functions

    oscillating at different frequencies.

    The Discrete Cosine Transform (DCT) of a one dimensional

    sequence of length N is defined as

    where k = 0,1,2,..., N 1

    The original signal vector x(n) can be reconstructed back from

    the DCT coefficients Y[k] by using the Inverse DCT (IDCT)

    operation and it can be defined as

    1

    0

    }2/)12cos{()()()(N

    n

    NknnxkczY

  • 7/22/2019 DCT Presentation1

    5/39

    Discrete Cosine Transform

    where n = 0,1,2,..., N 1 In both the above equations c[k] is defined as

    1

    0

    }2/)12cos{][][)(N

    k

    NknkYkcnx

    1

    0

    }2/)12cos{][][)(N

    k

    NknkYkcnx

    1

    0

    }2/)12cos{][][)(N

    k

    NknkYkcnx

    1

    0

    }2/)12cos{][][)(N

    k

    NknkYkcnx

    1.........2,1,)/2(

    0,)/1(

    ][ NkN

    kN

    kc

  • 7/22/2019 DCT Presentation1

    6/39

    DCT vs DFT

    If we wish to find the frequency spectrum of a function thatwe have sampled, the continuous Fourier Transform is not souseful. For that, We need a discrete version like DFT.

    When the input data contains only real numbers, the sinecomponent of the DFT is 0, then DFT becomes a DiscreteCosine Transform(DCT).

    The Discrete Fourier Transform (DFT) and Discrete CosineTransform (DCT) perform similar functions i.e. they bothdecompose a finite-length discrete-time vector into a sum ofscaled-and-shifted basis functions.

  • 7/22/2019 DCT Presentation1

    7/39

    DCT vs DFT

    The difference between both the transforms is the type of basisfunction used; the DFT uses a set of complex exponentialfunctions, while the DCT uses only (real-valued) cosine

    functions.

    The DCT & DFT are used because Some tasks are much easierto handle in the frequency domain that in the time domain. Forexample: graphicequalizer. We want to boost the bass:

    1. Transform to frequency domain.2. Increase the magnitude of low frequency components.

    3. Transform back to time domain.

  • 7/22/2019 DCT Presentation1

    8/39

    Application of DCT

    For audio:

    Human ear has different dynamic range for differentfrequencies.

    It transform from time domain to frequency domain, and

    quantize different frequencies differently.

    For images and video:

    Human eye is less sensitive to fine detail.

    It transform from spatial domain to frequency domain, and

    quantize high frequencies more coarsely (or not at all)

    Has the effect of slightly blurring the image - may not be

    perceptable if done right.

  • 7/22/2019 DCT Presentation1

    9/39

    Literature Review

    Distributed Arithmetic [6,9,10]:-

    DCT have been implemented using distributed mechanism.

    Most often encountered form of its computation is sum of

    product. The product of a pair of matrices can be realized using the DA

    when one of the vectors is constant.

    (1)

  • 7/22/2019 DCT Presentation1

    10/39

    Distributed Arithmetic Contd..

    Where, Ak is constant

    Xk is the input data.

    If A1,.AL are all N bits signed 2s complement

    binary number, (1) can also be represented as:

    (2)

    (3)

  • 7/22/2019 DCT Presentation1

    11/39

    Distributed Arithmetic Contd..

    In eq. (2), matrix A is a adder matrix

    (4)

    but it only consists of two elements: 0and 1. It is easy to

    find that Y0, Y1, , YN-1 are the sum of some data fromX1,X2,XL, so the computation of Y only contains two

    operations: addition and shift.

  • 7/22/2019 DCT Presentation1

    12/39

    Distributed Arithmetic Contd..

    DA uses a look-up table and accumulators instead ofmultipliers.

    Each single bit from each single value of the two multipliedvariables contribute only once to the sum. Because {0, 1} are

    the values as discussed earlier, can be restricted to 2^n,therefore they can be pre-calculated and saved in a look-uptable to be retrieved later.

    The construction of look up table used by the Distributed

    Arithmetic method take large memory size

    . The shift operation is implemented by wirings, which costslittle delay and hardware resources.

  • 7/22/2019 DCT Presentation1

    13/39

    Fast DCT Algorithms

    To overcome the extensive computation of the DCT Chen et al

    [5, 15], proposed fast DCT (FDCT).

    Chen used the Fast Fourier Transform (FFT) method topropose more efficient algorithm involving only real operation

    for computing what he called the Fast Discrete Cosine

    Transform algorithm (FDCT).

    Let, The 8-point DCT can be written as a matrix transform.

    Y=AX

  • 7/22/2019 DCT Presentation1

    14/39

    Fast DCT Algorithms Contd

    Where,

    The Multiplier coefficients are given by

  • 7/22/2019 DCT Presentation1

    15/39

    Fast DCT Algorithms Contd

    Where,

  • 7/22/2019 DCT Presentation1

    16/39

    Fast DCT Algorithms Contd

    Due to the Symmetry of the (8 X 8) multiplication matrix, it

    can be replaced by two (4x4) x (4x4) matrices, which can be

    computed in parallel, as can the sums and differences forming

    the vectors below

  • 7/22/2019 DCT Presentation1

    17/39

    Fast DCT Algorithms Contd

    The matrices operation of the design was implemented in

    terms of a plot for the signal-flow.

  • 7/22/2019 DCT Presentation1

    18/39

    Fast DCT Algorithms Contd

    The Chen fast DCT signal-flow requires total of 18

    multiplications.

    Lee Algorithm[8, 15]: -

    Lee algorithm [8] is based on the matrix representation.

    In fact, the first step is nothing than a butterfly decompositionyielding to an even and an odd part.

  • 7/22/2019 DCT Presentation1

    19/39

    Fast DCT Algorithms Contd

    The even part will be just a 1-D DCT of order N/2. While, theodd part will be computed through a matrix multiplication.

    For 1-D DCT of order N=8, the number of operationnecessarily for these algorithm will be 32 multiplications and32 additions.

  • 7/22/2019 DCT Presentation1

    20/39

    Project Design Flow

    HDL Flow Matlab Flow

    Taking input matrix of size 8 x 1

    Apply DCT algorithm designed through HDL on it

    Check simulation results for DCT output

    A

    Taking input matrix of size 8 x 1

    Compute DCT through dct command on MATLAB

    Store result of above calculated DCT

    B

  • 7/22/2019 DCT Presentation1

    21/39

    Project Design Flow Contd

    Comparison b/w HDL & Matlab Results

    Compare results of A and B

    BA

  • 7/22/2019 DCT Presentation1

    22/39

    Design of DCT Controller

    The equation stated below is the standard equation for the 1d -

    DCT

    (1)

    Where,

    Y(u) = Coefficient value in transform domain

    X(i) = Coefficient value in pixel domain.u = Co-ordinates in transform domain

    i = Spatial co-ordinates in pixel domain

    7

    0

    ]16/)12cos[()(}2/)({)(

    i

    uiixucuy

    02/1 u

    02/1 u

    02/1 u

  • 7/22/2019 DCT Presentation1

    23/39

    Design of DCT Controller Contd...

    (2)

    Considering the eq. (1) following eight equations are inferred

    Y(0) = [X(0) + X(1) + X(2) + X(3) + X(4) + X(5) + X(6) + X(7)] P (3.1)

    Y(1) = [X(0) - X(7)]A + [X(1) - X(6)]B + [X(2) - X(5)]C + [X(3) - X(4)] D

    (3.2)

    Y(2) = [X(0) - X(3)X(4) + X(7)]M + [X(1) - X(2) - X(5) + X(6)] N (3.3)

    Y(3) = [X(0) - X(7)]B + [X(1) - X(6)] (-D) + [X(2) - X(5)] (-A) + [X(3) -

    X(4)] (-C) (3.4)

    0,1

    0,2/1)(

    u

    uuc

    0,1

    0,2/1)(

    u

    uuc

    0,1

    0,2/1)(

    u

    uuc

    0,10,2/1)(

    u

    uuc

  • 7/22/2019 DCT Presentation1

    24/39

    Design of DCT Controller Contd...

    Y(4) = [X(0) - X(1) - X(2) + X(3) + X(4) - X(5) - X(6) + X(7)] P (3.5)

    Y(5) = [X(0) - X(7)]C + [X(1) - X(6)](-A) + [X(2) - X(5)]D + [X(3) - X(4)]

    B (3.6)

    Y(6) = [X(0) - X(3)X(4) + X(7)]N + [X(1) - X(2) - X(5) + X(6)] (-M)

    (3.7)

    Y(7) = [X(0) - X(7)]D + [X(1) - X(6)] (-C) + [X(2) - X(5)] B + [X(3) -

    X(4)] (-A) (3.8)

  • 7/22/2019 DCT Presentation1

    25/39

    Design of DCT Controller Contd...

    Where,

    M = 0.5 * Cos(pi/8) = 0.5 * Cos (2*pi/16)

    N = 0.5 * Cos(3*pi/8) = 0.5 * Cos (6*pi/16)

    P = 0.5 * Cos(pi/4) = 0.5 * Cos (4*pi/16)A = 0.5 * Cos(pi/16)

    B = 0.5 * Cos(3*pi/16)

    C = 0.5 * Cos(5*pi/16)

    D = 0.5 * Cos(7*pi/16)

  • 7/22/2019 DCT Presentation1

    26/39

    Generalized Equation Implementer

    BlockInput

    (16 bit)

    Xin0 Xin7 Xin1 Xin6 Xin2 Xin5 Xin3 Xin4

    Add/Sub Block Add/Sub Block Add/Sub Block Add/Sub Block

    Multiplier Multiplier Multiplier Multiplier

    Adder

    Output

  • 7/22/2019 DCT Presentation1

    27/39

    Add/Sub Block

    Adder

    Substractor

    MuxInput 1

    Input 2Output

    Sel

  • 7/22/2019 DCT Presentation1

    28/39

    Multiplier Block

    Output of Add/Sub Block

    Multiplier

    Cos Coefficient

    Output

  • 7/22/2019 DCT Presentation1

    29/39

    DCT Controller Interface

  • 7/22/2019 DCT Presentation1

    30/39

    RTL View

  • 7/22/2019 DCT Presentation1

    31/39

    Results

    The DCT core is implemented in HDL. It is synthesized and simulated

    using Xilinx ISE 9.2i on Spartan 3 (xc3s4000-5fg900)

    Synthesis Report:-

    S.No. Logic Utilization Used Available Utilization1 Number of Slices 685 27648 2%

    2 Number of Slice Flip Flops 788 55296 1%

    3 Number of 4 input LUTs 1238 55296 2%

    4 Number of bonded IOBs 259 633 40%

  • 7/22/2019 DCT Presentation1

    32/39

    Advanced HDL Synthesis Report

    S.No. Component Used

    1 16x16-bit Multiplier 22

    2 16-bit Adder 13

    3 16-bit Subtractor 15

    4 16-bit Register 50

  • 7/22/2019 DCT Presentation1

    33/39

    Matlab Results

  • 7/22/2019 DCT Presentation1

    34/39

    VerificationOpen DCT Core Code with Xilinx ISE 9.2i

    Go to the process window and double click on synthesis Button

    After Successful Synthesis. Create Test Bench

    Apply Stimulus to Test Bench

    Select Behavioral Simulation from source window

    From Process Window run Xilinx ISE Simulator

    Simulation will start & generate output

  • 7/22/2019 DCT Presentation1

    35/39

    Simulation Results

  • 7/22/2019 DCT Presentation1

    36/39

    Comparison b/w HDL Simulation &

    Matlab Results

  • 7/22/2019 DCT Presentation1

    37/39

    Conclusion

    The 1DDCT algorithm code was written in the Verilog HDL. It is then,

    synthesized and simulated successfully through Xilinx ISE 9.2i.

    Eight 8 x 1 input samples are taken, and DCT is calculated through DCT

    core designed in Verilog HDL and the same inputs are used for calculatingDCT using MATLAB.

    The latency of implemented core is five clock cycles and through put is one

    clock cycle.

    Comparison is done between output of Matlab and HDL Simulation Result

    which shows that, the accuracy of implemented core is 93.75%.

  • 7/22/2019 DCT Presentation1

    38/39

    References

    1. R.C. Gonzalez, R.E. Woods, DigitalImage Processing,Pearson Education 3rdEdition 2008

    2. David Salomon, DataCompression,The Complete Reference, 2nd Edition Springer-Verlag 1998

    3. Y. C. Lim, J. B. Evans, and B. Liu, Decomposition of binary integers into signed power-of-two terms,IEEE Trans. Circuits Syst., vol. 38, no. 6, pp. 667-672,1991

    4. R. J. Clark, Relationbetween the Karhunen-Lobe and cosine transform,IEEE Proc., vol. 128, pt. F, no.6, pp. 359-360, Nov.1981.

    5. W. Chen, C.H.Smith, and S.C.Fralick,A fast computational algorithm for the Discrete Cosine transformIEEE,Trans.Commun.COMM-25, pp.100 1009, Sep.1977.

    6. Peng Chungan, Cao Xixin, Yu Dunshan, Zhang Xing, A250MHz optimized distributed architecture of 2D8x8 DCT,7th International Conference on ASIC, pp. 189192, Oct. 2007

    7. Bian Li Jian, Zeng Xuan, Tong Jia Rong, Liu Yue, AnEfficient VLSI Architecture for 2D-DCT UsingDirect Method

    8. B. G. Lee, Anew algorithm to compute the discrete cosine transform,IEEE Trans. Acoust., speech, SignalProcessing, vol. ASSP-32, pp.12431245, Dec.1984.

  • 7/22/2019 DCT Presentation1

    39/39

    References Contd

    9. Vijay Kumar Sharma, K. K. Mahapatra and Umesh C. Pati, AnEfficient Distributed Arithmetic based VLSIArchitecture for DCT,National Institute of Technology, Rourkela, India-769008

    10. Sungwook Yu DCT implementation with Distributed Arithmetic. IEEE Transactions on ComputersVolume 50, Issue 9 September 2001 Pages: 985991, year of Publication: 2001 ISSN: 0018-9340.

    11. Trevor W. Fox-2002 Rapid Prototyping of Field Programmable Gate Array- Based Discrete CosineTransform ApproximationsEURASIP JASP 2003, 543- 554.

    12. Anthony Edward Nelson Implementationof image processing algorithms on FPGA hardware thesis ofMaster of Science in Electrical Engineering, May 2000 Graduate School of Vanderbilt University.

    13. Latha Pillai.video compression using DCT,XILINX Application Note : Virtex- 11 series.XAPP610,v1.4,April 10,2008.

    14. K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages and Applications, Academic

    Press, Inc., 1990

    15. Hassan EL-Banna, Alaa A. EL-Fattah, Waleed Fakhr, AnEfficient Implementation of the 1D DCT usingFPGA Technology,ICM 2003, Dec. 9-1 1, Cairo, Egypt