/******************************************************************************* 
Copyright(c) 2000 - 2002 Analog Devices. All Rights Reserved. 
Developed by Joint Development Software Application Team, IPDC, Bangalore, India 
for Blackfin DSPs  ( Micro Signal Architecture 1.0 specification). 
 
By using this module you agree to the terms of the Analog Devices License 
Agreement for DSP Software.  
******************************************************************************** 
Module name     : isadct.asm 
Label name      : __isadct 
Version         : 1.3 
Change History  : 
 
                Version     Date        Author         Comments 
                1.3         11/18/2002  Swarnalatha    Tested with VDSP++ 3.0 
                                                       compiler 6.2.2 on  
                                                       ADSP-21535 Rev.0.2 
                1.2         11/13/2002  Swarnalatha    Tested with VDSP++ 3.0 
                                                       on ADSP-21535 Rev.0.2 
                1.1         03/10/2002  Manoj          Modified to match 
                                                       silicon cycle count 
                1.0         08/30/2001  Manoj          Original  
 
Description     : This program performs the inverse SADCT on an 8x8 block as
                  prescribed in the MPEG-4 standard. It takes the transformed
                  input data array X[], in short format,
                     X00,X01 ...X07; 
                     X10,X11 ....X17; 
                        ........ 
                     X70,X71.....X77; 
                  and the corresponding shape information in character
                  format (Alpha Map). Consider a 3x3 shape array (chosen
                  for ease of demonstration). The following sequence of
                  operations is performed.
         
                  a) Perform inverse SADCT of the appropriate length on the rows
                  of the input X[]. To do this, the shape array is transformed
                  as shown (after column alignment and row alignment; refer to
                  sadct.asm)
 
                  [255 255  0 ;       col+row          [255 255  255 ;   
                    0   0  255;       =======>          255  0    0;    
                    0  255  0 ]       align              0   0    0   ] 
           
                  Perform ISADCT(3) on row 1 of X and ISADCT(1) on row 2 of X to
                  get XC. Row 3 is skipped as it has no non-zero shape elements.
                    ISADCT(N) is the N x N kernel K * cos(j*(i+0.5)*(pi/N)),
                      where i,j in [0, N) and K = sqrt(1/N) for j = 0,
                                                = sqrt(2/N) otherwise.
                  N is the number of shape elements in the row on which ISADCT
                  is being performed. In this program, the ISADCT coefficients
                  are stored in an array and a direct matrix multiplication
                  method is used to implement the ISADCT. Note that for N > 6 a
                  special flowgraph implementation of ISADCT would be optimal
                  (considering the conditional branch and ISADCT complexity).
                  However, this has not been incorporated in this program.
                  For ISADCT(8), Chen's IDCT will suffice. If required, the
                  flowgraph of ISADCT(7) is to be integrated. However, since the
                  cycle count is highly dependent on the shape array, the user
                  has to decide, weighing code size against speed improvement,
                  whether the flowgraph approach is worthwhile in the
                  application. In the implementation provided, two ISADCT
                  outputs are computed simultaneously, at the cost of slightly
                  more memory for the coefficients.
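
                  The direct matrix multiplication of step a) can be sketched in
                  C as follows. This is a reference model only, not the routine
                  below: the row-major N x N Q15 table layout, the rounding, the
                  lack of saturation and the name isadct_n are all assumptions
                  made for illustration. The sqrt(1/N) scale is taken on the
                  zero-frequency term, as in the orthonormal DCT.

```c
#include <assert.h>

// Hypothetical reference model: x[i] = sum_j coeff[i*n + j] * X[j], where
// coeff[i*n + j] = K(j) * cos(j*(i+0.5)*pi/n) is stored in Q15 (assumed layout).
void isadct_n(const short *X, int n, const short *coeff, short *x)
{
    assert(n >= 1 && n <= 8);
    for (int i = 0; i < n; i++) {
        long long acc = 0;
        for (int j = 0; j < n; j++)
            acc += (long long)coeff[i * n + j] * X[j];
        // Round to nearest and drop the 15 fractional bits (assumed rounding;
        // saturation is not modelled).
        x[i] = (short)(acc >= 0 ? (acc + 16384) / 32768
                                : -((16384 - acc) / 32768));
    }
}
```

                  Under this assumed layout the N = 2 table would be
                  {23170, 23170, 23170, -23170}, since sqrt(1/2) = cos(pi/4),
                  which is approximately 23170/32768.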
 
                  b) Undo the row alignment as shown
         
                  [255 255  255 ;        row          [255 255  255 ; 
                   255   0   0  ;      =======>         0  255  0   ;   =====> 
                    0    0   0 ]       unalign          0   0   0   ]                   
 
                                                         XC=[XC00  XC01  XC02 ;
                                                                0   XC10   0  ;
                                                                0     0    0  ]
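
                  Step b) can be modelled in C as below. The sketch assumes that
                  count[c] holds the number of shape elements in column c, so
                  that after column alignment position (r, c) is occupied
                  exactly when count[c] > r. The function name is hypothetical.

```c
#include <assert.h>

// Hypothetical model: scatter the packed outputs of row r back to the
// columns that are occupied in row r after column alignment.
void row_unalign(const short *packed, const int count[8], int r,
                 short out_row[8])
{
    assert(r >= 0 && r < 8);
    int k = 0;                      // next packed value to place
    for (int c = 0; c < 8; c++)
        out_row[c] = (count[c] > r) ? packed[k++] : 0;
}
```

                  For the 3x3 example the column counts are 1, 2 and 1, so the
                  single output of row 2 lands in column 2, as in XC above.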
 
                  c) Perform inverse SADCT along the columns of XC using ISADCT
                  kernels of appropriate lengths, i.e. perform ISADCT(1) on
                  column 1 of XC, ISADCT(2) on column 2 of XC and ISADCT(1) on
                  column 3.
 
                  xc=[x00 x10 x20 ;  
                       0  x11  0  ; 
                       0   0   0  ] 
      
                  d) Undo the column alignment as shown 
         
                  [255 255  255 ;      column          [255  255   0  ;          
                    0  255   0  ;      =======>          0    0   255 ;   =====> 
                    0   0    0 ]       unalign           0   255   0  ]         
 
                                                          xrec=[x00 x10   0 ; 
                                                                0    0  x20 ; 
                                                                0   x11  0  ] 
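
                  Step d) can be modelled in C as below: the packed outputs of
                  one column are written back to the rows whose shape entry in
                  that column is non-zero. The row-major 8x8 indexing and the
                  function name are assumptions made for illustration.

```c
#include <assert.h>

// Hypothetical model: scatter the packed outputs of column c back to the
// rows where the original (unaligned) shape array is non-zero.
void col_unalign(const short *packed, const unsigned char shape[64], int c,
                 short out[64])
{
    assert(c >= 0 && c < 8);
    int k = 0;                      // next packed value to place
    for (int r = 0; r < 8; r++)
        out[r * 8 + c] = shape[r * 8 + c] ? packed[k++] : 0;
}
```

                  With the 3x3 example shape, the two outputs of column 2 land
                  in rows 1 and 3, matching xrec above.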
 
                  Instead of the conventional method of rearranging the shape
                  array a number of times, this program stores the per-column
                  counts of the shape array in temporary storage on the stack
                  and derives the ISADCT lengths, both row-wise and
                  column-wise, from them.
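
                  A minimal C sketch of this idea follows. It assumes a
                  row-major 8x8 shape array in which a pixel belongs to the
                  shape when bit 7 of its byte is set (the test the counting
                  loop below applies); the function names are hypothetical.

```c
#include <assert.h>

// count[c] = number of shape elements in column c. This is also the
// column ISADCT length for column c.
void column_counts(const unsigned char shape[64], int count[8])
{
    for (int c = 0; c < 8; c++)
        count[c] = 0;
    for (int r = 0; r < 8; r++)
        for (int c = 0; c < 8; c++)
            count[c] += shape[r * 8 + c] >> 7;   // 1 for 255, 0 for 0
}

// Row ISADCT length for row r: after column alignment, column c still has
// an element in row r exactly when count[c] > r.
int row_length(const int count[8], int r)
{
    assert(r >= 0 && r < 8);
    int n = 0;
    for (int c = 0; c < 8; c++)
        n += (count[c] > r);
    return n;
}
```

                  For the 3x3 example the counts are 1, 2 and 1, which gives row
                  lengths 3, 1 and 0, i.e. ISADCT(3), ISADCT(1) and a skipped
                  row, without rearranging the shape array.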
 
Prototype       : void isadct(short in[], unsigned char shape[], short out[],  
                              short coeff_tans[]); 
 
                     in          -> Address of the 8x8 sadct array 
                     shape       -> Address of the 8x8 shape array 
                     out         -> Address of the 8x8 output data array 
                     coeff_tans  -> Address of the coefficients 
 
Registers used  : A0, A1, R0-R7, I0-I3, B0-B3, M0, M2, L0-L3, P0-P5, LC0, LC1. 
 
Performance     : 
                Code Size   : 498 Bytes 
                Cycle count : 2284 Cycles for a lower triangular output matrix 
                                   (including the diagonal elements) 
*******************************************************************************/ 
.section L1_code; 
.global __isadct; 
.align 8; 
.extern _Coeff_offset; 
 
__isadct: 
                            //Initializations 
 
    B0 = R0;                //Address of the sadct array 
    P0 = R1;                //Address of the shape array 
    R1 = [SP+12];           //Address of the coefficient array 
    [--SP] = (R7:4,P5:3); 
    P5 = R2;                //Address of the output array 
    B3 = R2;                //Address of the output array 
    I2 = R1;                //Address of the coefficient array 
    B2 = R1;                //Address of the coefficient array 
    I3.L = _Coeff_offset;   //Address of the offset array 
    I3.H = _Coeff_offset;   //Address of the offset array 
    L2 = 0; 
    L3 = 0; 
    R4 = 1; 
    P3 = 8; 
     
    SP += -16;              //Temporary storage 
    R5 = SP; 
    SP += -16;              //To store the column length information in shape 
    I1 = SP; 
    B1 = SP; 
    L1 = 16; 
     
    R1 = 0; 
/*Clear length array in stack*/ 
    [I1++] = R1;[I1++] = R1; 
    [I1++] = R1; 
    R7 = R7-R7 (S) || [I1++] = R1; 
                            //Row loop counter  =  0  
     
//Determining the number of nonzero elements in the Columns of the shape array 
    P4 = 64; 
    R6 = R6-R6 (S) || R1.L = W[I1++] || R0 = B[P0++] (Z); 
                            //Fetch the length. Read shape  
    R3 = R0 >> 7; 
     
    LSETUP($1LP_ST,$1LP_END) LC0 = P4; 
$1LP_ST: 
        R3.L = R1.L+R3.L (S) || R1.L = W[I1--] || R0 = B[P0++] (Z); 
                            //Fetch the length. Read shape  
$1LP_END: 
        R3 = R0>>7 || W[I1] = R3.L || I1 += 4; 
                            //Update the length  
    P0 += -1; 
    P0 += -64;              //Restore the address of the shape array 
    B1 = R5; 
/****************************Inverse Row SADCT*********************************/ 
     
I_ROW_ST: 
    P1 = SP;                //P1 points to the count 
    R3 = R4<<3 || R1 = W[P1++](Z); 
                            //Count for Row_ISADCT. Read the column count  
    LSETUP($2LP_ST,$2LP_END) LC0 = P3>>1; 
$2LP_ST: 
        CC = R1 <= R7; 
        R2 = CC; 
        R3 = R3-R6 (S) || R1 = W[P1++](Z); 
                            //Count the number of non-zero values along each row 
        CC = R1 <= R7; 
        R6 = CC; 
$2LP_END: 
        R3 = R3-R2 (S) || R1 = W[P1++](Z); 
                            //Count the number of non-zero values along each row 
    CC = R3 == 0;           //Check if the row length is zero. If zero, row
                            //ISADCT is over
    IF CC JUMP I_ROW_OVER; 
     
/*Process the row from the input array I0 by adjusting I0, B0 and L0*/ 
     
    R1 = R3 << 1 || NOP;    //2*length 
    M0 = R1; 
    I0 = B0;                //Set I0 to B0 
    L0 = R1;                //Set I0 as a circular buffer of desired length 
    P2 = R3;                //Loop counter 
    R2 = R2-R2 (S) || I3 += M0; 
                            //Point to the right offset  
    R1 = R3+R4 (S) || R2.L = W[I3] || I3 -= M0; 
                            //Length+1, Fetch the offset. Restore I3  
    M2 = R2; 
    P4 = R1; 
    I1 = B1;   
    P1 = SP;                //Restore the address of the count 
/*Compute ISADCT*/ 
    I2 += M2; 
    A1 = A0 = 0 || R3.L = W[I0++] || R1 = [I2++]; 
 
    LSETUP($3LP_ST,$3LP_END) LC1 = P4>>1; 
                            //Set Loop for (L+1)>>1  
$3LP_ST: 
        LSETUP($4LP_ST,$4LP_ST) LC0 = P2; 
                            //Set Loop for L  
$4LP_ST: 
            R2.H = (A1 += R3.L*R1.H),R2.L = (A0 += R3.L*R1.L) || R3.L = W[I0++] 
            || R1 = [I2++]; 
                            //Fetch a data and 2 coeff.  
$3LP_END: 
        A1 = A0 = 0 || [I1++] = R2; 
     
/*Store the data in the right position in output array*/ 
     
    I1 = B1; 
    R5 = R4+|+R4,R6 = R4-|-R4 || R1 = W[P1++](Z); 
                            //Set R5  = 2 and clear R6  
    R0 = I1; 
    R3 = I1; 
     
    R3 = R3+R5 (S)|| R2.L = W[I1] ; 
                            //Read the stored data, read the count  
    CC = R1 <= R7; 
 
    LSETUP($5LP_ST,$5LP_END) LC0 = P3; 
$5LP_ST: 
        IF CC R3 = R0; 
        I1 = R3; 
        IF CC R2 = R6; 
        R0 = PACK(R3.H,R3.L) || R1 = W[P1++](Z); 
        CC = R1 <= R7; 
$5LP_END: 
        R3 = R3+R5 (S)|| R2.L = W[I1] || W[P5++] = R2; 
                            //Read the stored data, read the count  
/*Update pointers*/ 
    R0 = B0; 
    R0 += 16;               //Next input row (8 shorts). Data packing ensures
                            //that a zero row count means ISADCT_ROW is over
    L0 = 0;                 //Clear the circular buffering of I0
    B0 = R0; 
    R7 = R7+R4 (S); 
    I2 = B2;                //Restore pointer to the coeff. buffer 
    CC = R7  <=  7 (IU); 
    IF CC JUMP I_ROW_ST (BP); 
I_ROW_OVER:  
/****************************Inverse Column SADCT*****************************/ 
    P1 = SP;                //Column count 
    SP += -16;              //Temporary storage for output 
    R7 = 0;                 //Column loop counter 
    P3 = 16; 
     
I_COL_ST: 
    R1 = R7 << 1 || R3 = W[P1++](Z); 
                            //Column ISADCT count  
    CC = R3  <=  0; 
    IF CC JUMP I_COL_END; 
     
    R0 = B3; 
    R0 = R0+R1 (S); 
    P5 = R0;                //Address of the current column 
    P2 = R3;                //Column ISADCT count 
    I1 = B1; 
    R1 = R3 << 1;           //2*length 
    M0 = R1; 
     
/*Copy column into temporary buffer*/ 
    LSETUP($6LP_ST,$6LP_ST) LC0 = P2; 
    R1 = W[P5++P3](Z) || NOP; 
                            //First data from a column  
$6LP_ST: 
        R1 = W[P5++P3](Z) || W[I1++] = R1.L; 
    P5 = R0;                //Restore the output buffer pointer 
     
    L1 = M0; 
    I1 = B1; 
    R2 = R2-R2 (S) || I3 += M0; 
                            //Point to the right offset  
    R1 = R3+R4 (S) || R2.L = W[I3] || I3 -= M0; 
                            //Length+1, Fetch the offset. Restore I3  
    M2 = R2; 
    P4 = R1; 
    I0 = SP; 
    R6 = SP; 
    R5 = PACK(R6.H,R6.L) || I2 += M2; 
     
/*Compute ISADCT into the temporary buffer*/
    A1 = A0 = 0 || R0.L = W[I1++] || R1 = [I2++]; 
 
    LSETUP($8LP_ST,$8LP_END) LC1 = P4>>1; 
                            //Set Loop for (L+1)>>1  
$8LP_ST: 
        LSETUP($9LP_ST,$9LP_ST) LC0 = P2; 
                            //Set Loop for L  
$9LP_ST: 
            R2.H = (A1 += R0.L*R1.H),R2.L = (A0 += R0.L*R1.L) || R0.L = W[I1++] 
            || R1 = [I2++]; 
                            //Fetch a data and 2 coeff.  
$8LP_END: 
        A1 = A0 = 0 || [I0++] = R2; 
     
/*Store the data in the right position in output array*/ 
    R2 = R2-R2 (S) || R0 = B[P0] (Z); 
                            //Read the shape array  
    R3 = 2; 
    I0 = SP; 
     
    LSETUP($10LP_ST,$10LP_END) LC1 = P3>>1; 
                            //Set Loop for 8  
$10LP_ST: 
        P0 += 8; 
        R6 = R6+R3 (S) || R1.L = W[I0];     
        CC = R0 == 0; 
        IF CC R6 = R5; 
        I0 = R6; 
        IF !CC R2 = R1; 
        R5 = PACK(R6.H,R6.L) || W[P5++P3] = R2.L; 
$10LP_END: 
        R2 = R2-R2 (S) || R0 = B[P0] (Z); 
     
    I2 = B2; 
    P0 += -64;              //Set P0 to the next column 
    L1 = 0; 
I_COL_END:P0 += 1; 
    R7 += 1; 
    CC = R7  <=  7 (IU); 
    IF CC JUMP I_COL_ST (BP); 
    SP += 48; 
    (R7:4,P5:3) = [SP++]; 
    RTS; 
    NOP;                    //to avoid one stall if LINK or UNLINK happens to be 
                            //the next instruction after RTS in the memory. 
__isadct.end: