12/12/2013

OpenCL fractals

I just finished an assignment yesterday for my C++ programming class (ECE 2036). One of the homework assignments earlier in the semester was a simple complex number class, and now the class is covering threading. We're using the standard POSIX threads library, but the class is kind of lame and they wrote a wrapper for us to use; the class taught how to use the wrapper class correctly. This homework was to generate a mandelbrot fractal, the assignment came with some prewritten code that uses QT to open a window and write some pixels. We were to plug our complex number class in there and write just a few lines.

Well, being dissatisfied with that, and wanting to play with OpenCL to see what the commotion was about, I learned enough to write a simple kernel.

#define MAX_ITERATION_COUNT 2000

inline float2 complexMultiply(float2 a, float2 b) {
    return (float2)(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x);
}

inline float complexMagnitude(float2 a) {
    return sqrt(a.x * a.x + a.y * a.y);
}

inline float complexMagnitudeSquared(float2 a) {
    return (a.x * a.x) + (a.y * a.y);
}

kernel void mandlebrot(global int2 *screenSize, global float2 *location, global float *domainSize, global int *iterationCountOut, global float2 *coordsMemOut) {
    size_t a = get_global_id(0);

    int xPixel = a % screenSize[0].x;
    int yPixel = floor(((float)a / (float)screenSize[0].x));

    float xCoord = location[0].x + ((float)xPixel * domainSize[0] / (float)screenSize[0].x);
    float yCoord = location[0].y + ((float)yPixel * domainSize[0] / (float)screenSize[0].x);

    float2 imagCoordinate = (float2)(xCoord, yCoord);

    coordsMemOut[a] = imagCoordinate;

    float2 Z = imagCoordinate;

    int iterationCount = 0;
    for(iterationCount; iterationCount <= MAX_ITERATION_COUNT; iterationCount++) {
        if(complexMagnitudeSquared(Z) > 4) {
            break;
        } else{
            Z = complexMultiply(Z, Z) + imagCoordinate;
        }
    }

    iterationCountOut[a] = iterationCount;

}

The idea is that I pass in a series of vector coordinates which are just individual pixels for now. I think later I can use this to space the pixels out and then do a color smoothing algorithm to connect the points. What's neat is that I can also super-sample and then rescale to do anti-aliasing. What's passed out from the kernel is the number of iterations that it took for the magnitude to become "unbounded" which in this case is a magnitude of 4 or greater.

Anyway, I suppose it goes without saying that this is way faster than the threaded version. I'm parallelizing individual pixels and not just sending one half of the plane to one core and the other half to the other. What happens now is I just get caught up looking at the pretty pictures and stop making the code better. Oops. My coloring function kinda sucks, and the bottleneck is now I'm using OpenGL to literally paint individual pixels. I need to figure out how to dump the output to an image. I should also add a way to pan around the thing and zoom in. Right now I'm just using coordinates off websites that show cool areas. Here's what I've got so far though:

image

image