Chi-Tai, Research, Software and Music

Computer Science, Software and Afro Cosmic Music

Environs framework project page online



As of today, the Environs project page is online, well prepared for the EICS conference in Rome.
I developed Environs to support the research for my PhD thesis. It enables simple and efficient implementations of distributed applications and user interfaces for multi-display environments. Have a look at the project page: http://hcm-lab.de/environs

environs-scene-mediabrowser


OpenCL: kernel optimization of ARGB to YUV

As a side note to the recent post: the kernel presented there is already very fast, but guess what :-)

There are ways to make it even faster by optimizing memory access. Consider the memory access of the U and V planes. Both read the same source pixels at the same positions, so they can be consolidated into the same work item, which then reads each pixel from global memory only once (global memory is the slowest memory type to access).
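To illustrate the U/V consolidation, here is a minimal sketch in plain C that emulates a single fused work item on the CPU. It is not the optimized kernel itself. The names `PackedUV` and `argb_to_packed_uv` are made up for this illustration, the coefficients are taken from the kernel in the ARGB to YUV post, and the `+ 32768` bias is only a portable way of writing `(x >> 8) + 128` for negative `x` in C.

```c
#include <stdint.h>

/* Hypothetical result type: four packed U bytes and four packed V bytes. */
typedef struct {
    uint32_t u4;
    uint32_t v4;
} PackedUV;

/* Emulates one fused work item: every source pixel is loaded once and
 * feeds both the U and the V value. posSrc is the index of the first
 * pixel; as in the original kernel, every second pixel is sampled. */
static PackedUV argb_to_packed_uv(const uint32_t *src, uint32_t posSrc)
{
    PackedUV out = { 0u, 0u };

    for (int i = 0; i < 4; i++) {
        uint32_t rgb = src[posSrc + 2u * (uint32_t)i]; /* single load */
        int blue  = (int)(rgb & 0xffu);
        int green = (int)((rgb >> 8) & 0xffu);
        int red   = (int)((rgb >> 16) & 0xffu);

        /* 32768 == 128 << 8, so this equals (x >> 8) + 128 without
         * right-shifting a negative value. */
        int u = (-38 * red - 74 * green + 112 * blue + 32768) >> 8;
        int v = (112 * red - 94 * green - 18 * blue + 32768) >> 8;

        out.u4 |= (uint32_t)u << (i * 8);
        out.v4 |= (uint32_t)v << (i * 8);
    }
    return out;
}
```

In a real kernel the two packed words would be written to the U and V planes at the same offset, so one work item replaces one U item and one V item of the original version.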

Furthermore, “flattening” the work-group from 2D to 1D enables sequential memory access instead of the presented 2D access, which should benefit much more from prefetching and probably helps avoid bank conflicts…
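A rough sketch of what flattening means for the Y plane, again in plain C on the CPU rather than as a finished kernel. It assumes the source rows have no padding, i.e. `srcWidth == 4 * yuvStride`; the function names are hypothetical. Under that assumption the 2D index calculation collapses to a single multiplication, so neighbouring work items touch neighbouring memory.

```c
#include <stdint.h>

/* Source index as the 2D kernel computes it for the Y plane. */
static uint32_t src_index_2d(uint32_t posX, uint32_t posY, uint32_t srcWidth)
{
    return posY * srcWidth + posX * 4u;
}

/* Same index from a flat 1D id gid = posY * yuvStride + posX,
 * valid only if srcWidth == 4 * yuvStride (assumption). */
static uint32_t src_index_1d(uint32_t gid)
{
    return gid * 4u;
}

/* One flattened Y work item: reads 4 consecutive pixels and returns
 * the packed YYYY word that would go to destImage[gid]. */
static uint32_t pack_y4(const uint32_t *src, uint32_t gid)
{
    uint32_t value4 = 0u;
    uint32_t posSrc = src_index_1d(gid);

    for (int i = 0; i < 4; i++) {
        uint32_t rgb   = src[posSrc + (uint32_t)i];
        uint32_t blue  = rgb & 0xffu;
        uint32_t green = (rgb >> 8) & 0xffu;
        uint32_t red   = (rgb >> 16) & 0xffu;

        uint32_t y = ((66u * red + 129u * green + 25u * blue) >> 8) + 16u;
        value4 |= y << (i * 8);
    }
    return value4;
}
```

With a 1D range, the destination index is simply `gid` and the source index is `4 * gid`, so no per-item row/column arithmetic is left.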

So much for optimizations. If there is demand for such a kernel, drop me an email.


OpenCL: ARGB to YUV conversion

I just submitted a post on Stack Overflow that also fits nicely here, so here we go :-)

The question was how to convert ARGB to YUV using the GPU. Some time ago I developed and used the following OpenCL kernel to convert ARGB (the typical Windows bitmap pixel layout) to a planar layout with a full-sized Y plane followed by quarter-sized U and V planes, as input for libx264 encoding.

__kernel void ARGB2YUV (
                        __global unsigned int * sourceImage,
                        __global unsigned int * destImage,
                        unsigned int srcHeight,
                        unsigned int srcWidth,
                        unsigned int yuvStride // must be srcWidth/4 since we pack 4 pixels into 1 Y-unit (with 4 y-pixels)
                    )
{
    int i;
    unsigned int RGBs [ 4 ];
    unsigned int posSrc, RGB, Value4 = 0, Value, yuvStrideHalf, srcHeightHalf, yPlaneOffset, posOffset;
    unsigned char red, green, blue;
 
    unsigned int posX = get_global_id(0);
    unsigned int posY = get_global_id(1);
 
    if ( posX < yuvStride ) { 
        // Y plane - pack 4 y's within each work item
        if ( posY >= srcHeight )
            return;
 
        posSrc = (posY * srcWidth) + (posX * 4);
 
        RGBs [ 0 ] = sourceImage [ posSrc ];
        RGBs [ 1 ] = sourceImage [ posSrc + 1 ];
        RGBs [ 2 ] = sourceImage [ posSrc + 2 ];
        RGBs [ 3 ] = sourceImage [ posSrc + 3 ];
 
        for ( i=0; i<4; i++ ) { 
            RGB = RGBs [ i ]; blue = RGB & 0xff; green = (RGB >> 8) & 0xff; red = (RGB >> 16) & 0xff;
 
            Value = ( ( 66 * red + 129 * green + 25 * blue ) >> 8 ) + 16;
            Value4 |= (Value << (i * 8)); 
        } 
        destImage [ (posY * yuvStride) + posX ] = Value4; 
        return; 
    } 
    posX -= yuvStride; yuvStrideHalf = yuvStride >> 1; 
 
    // U and V planes - skip spare work items beyond the half-width
    if ( posX >= yuvStrideHalf )
        return;
 
    srcHeightHalf = srcHeight >> 1;
    if ( posY < srcHeightHalf ) {
        posSrc = ((posY * 2) * srcWidth) + (posX * 8);
 
        RGBs [ 0 ] = sourceImage [ posSrc ];
        RGBs [ 1 ] = sourceImage [ posSrc + 2 ];
        RGBs [ 2 ] = sourceImage [ posSrc + 4 ];
        RGBs [ 3 ] = sourceImage [ posSrc + 6 ];
 
        for ( i=0; i<4; i++ ) { 
            RGB = RGBs [ i ]; 
            blue = RGB & 0xff; green = (RGB >> 8) & 0xff; red = (RGB >> 16) & 0xff;
 
            Value = ( ( -38 * red + -74 * green + 112 * blue ) >> 8 ) + 128;
            Value4 |= (Value << (i * 8));
        } 
        yPlaneOffset = yuvStride * srcHeight; 
        posOffset = (posY * yuvStrideHalf) + posX; 
        destImage [ yPlaneOffset + posOffset ] = Value4; 
        return; 
    } 
    posY -= srcHeightHalf; 
    if ( posY >= srcHeightHalf )
        return;
 
    // V plane - pack 4 v's within each work item
    posSrc = ((posY * 2) * srcWidth) + (posX * 8);
 
    RGBs [ 0 ] = sourceImage [ posSrc ];
    RGBs [ 1 ] = sourceImage [ posSrc + 2 ];
    RGBs [ 2 ] = sourceImage [ posSrc + 4 ];
    RGBs [ 3 ] = sourceImage [ posSrc + 6 ];
 
    for ( i=0; i<4; i++ ) { 
        RGB = RGBs [ i ]; 
        blue = RGB & 0xff; green = (RGB >> 8) & 0xff; red = (RGB >> 16) & 0xff;
 
        Value = ( ( 112 * red + -94 * green + -18 * blue ) >> 8 ) + 128;
        Value4 |= (Value << (i * 8));
    }
 
    yPlaneOffset = yuvStride * srcHeight;
    posOffset = (posY * yuvStrideHalf) + posX;
 
    destImage [ yPlaneOffset + (yPlaneOffset >> 2) + posOffset ] = Value4;
    return;
}

This code performs only global 32-bit memory access while 8-bit processing happens within each work item.

Oh, and the code to invoke the kernel:

unsigned int width = 1024;
unsigned int height = 768;
 
unsigned int frameSize = width * height;
const unsigned int argbSize = frameSize * 4; // ARGB pixels
 
const unsigned int yuvSize = frameSize + (frameSize >> 1); // Y,U,V planes
 
const unsigned int yuvStride = width >> 2; // since we pack 4 RGBs into "one" YYYY
 
// Allocates ARGB buffer
ocl_rgb_buffer = clCreateBuffer ( context, CL_MEM_READ_WRITE, argbSize, 0, &error );
// ... error handling ...
 
// Allocates YUV buffer
ocl_yuv_buffer = clCreateBuffer ( context, CL_MEM_READ_WRITE, yuvSize, 0, &error );
// ... error handling ...
 
error = clSetKernelArg ( kernel, 0, sizeof(cl_mem), &ocl_rgb_buffer );
error |= clSetKernelArg ( kernel, 1, sizeof(cl_mem), &ocl_yuv_buffer );
 
error |= clSetKernelArg ( kernel, 2, sizeof(unsigned int), &height);
error |= clSetKernelArg ( kernel, 3, sizeof(unsigned int), &width);
 
error |= clSetKernelArg ( kernel, 4, sizeof(unsigned int), &yuvStride);
// ... error handling ...
 
const size_t local_ws[] = { 16, 32 };
const size_t global_ws[] = { yuvStride + (yuvStride>>1), height };
 
error = clEnqueueNDRangeKernel ( queue, kernel, 2, NULL, global_ws, local_ws, 0, NULL, NULL );
// ... error handling ...

Note: have a look at the work-item calculations. The global work sizes must be multiples of the local work sizes, so for frame sizes that do not divide evenly some additional code is needed (e.g. using mod to add sufficient spare items).
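One possible way to add those spare items is a small round-up helper like the following sketch. The name `round_up` is hypothetical and not part of the original host code; it assumes `local` is non-zero and relies on the kernel's bounds checks to make the spare items return early.

```c
#include <stddef.h>

/* Rounds global up to the next multiple of local (local must be > 0). */
static size_t round_up(size_t global, size_t local)
{
    size_t rem = global % local;
    return rem == 0 ? global : global + (local - rem);
}
```

It could then be applied per dimension, for example `round_up(yuvStride + (yuvStride >> 1), local_ws[0])` and `round_up(height, local_ws[1])`. For the 1024x768 frame above nothing changes (384 and 768 are already multiples of 16 and 32), while a height of 1080 would be padded to 1088.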


Canada – ITS 2009 Conference



I’ve been to Banff, Canada for a week to attend the International Conference on Interactive Tabletops and Surfaces 2009. Apart from the high quality of the conference, there were a lot of social events with the other researchers, e.g. having interesting and productive talks while enjoying Canadian beer in some of the beautiful pubs and restaurants in Banff. The banquet with banquet speaker Chia Shen was great, too. I haven’t had such a delicious dinner before and I’m pretty sure that it was quite expensive. Thank goodness it was paid for by the conference hehe

Oh, and the view of Toronto at night (from the airplane) is so beautiful.

On Tuesday I presented results of my research at the University of Augsburg, and of course I was quite nervous hehe. Here are some pictures of my presentation and the talk afterwards.
CIMG1245

CIMG1244

This is an elk in the gray of dawn. Unfortunately there wasn’t enough light for a perfect picture…
CIMG1248

A great diner burger :)
CIMG1258

The conference hotel – the Fairmont Banff Springs!
CIMG1226

Banff at night…
CIMG1240

A view from the bridge in Banff..
CIMG1195

Some mountains…
CIMG1198
CIMG1217


PhD Trip to Israel

I was lucky enough to attend the “German-Israeli Minerva School for Ubiquitous Display Environments” starting at the end of August, and it was a great experience to learn from the lectures and from the different cultures. It was also interesting to see how other researchers approach possible solutions within the workshop and collaboratively create and work on ideas. Many thanks to all the staff responsible there. Oh, and the food was really delicious…
And of course (a little late) here are some impressions I took with my camera hehe..

The Minerva crew:
Minerva Team

Some impressions from Israel, Haifa, Jerusalem, …:
Israel1032
Israel0985
Israel0986
Israel1027
Israel1028

Israel1035

Israel1036
Israel1037
Israel1038
Israel1039
Israel1043
Israel1047
Israel1096
Israel1104
Israel1115
Israel1130
Israel1133
