April 21, 2012

Towards a richer GPGPU Mandelbrot Sets and OpenCL example.

Nearly 3 years ago I started playing with OpenCL on the GPGPU front and posted about my experiences which excited some interest. Since then I've worked on a number of different technologies but I do keep coming back to OpenCL.

I've decided to work on enhancing my Mandelbrot generator and thought I'd share some of my experience doing so. My intention is to create a richer and more capable Mandelbrot Set generator that properly leverages the capabilities of OpenCL and allows someone to more interactively explore the set.

I'm having to dust off my Swing skills to do this as I want to create a richer UI experience. So first off lets talk about current features in my Mandelbrot Generator:

Current Features

  • Single, Double and Fixed Precision (128bit) calculation modes.
  • Simple zooming using the mouse left button to zoom in and the mouse right button to zoom out.
  • Re-sizable window.
  • Simple Benchmarking application using the different calculation modes to demonstrate GPU performance using Single, Double and Integer mathematics.
  • Automatic detection of Work Group size and appropriate buffering.

Future Features

  • Configurable colour palettes.
  • Julia Set generation - for the current location in the Mandelbrot set..
  • UI components to be able to configure the calculation mode and a few other options.
  • Further, richer Fixed Precision modes (256 and 512 bit seem a natural fit for uint8 and uint16)
  • Interactive Benchmarking
  • Ability to save specific 'views' and 'journeys' through the Mandelbrot Set and interact with them.
  • Rendering of ultra high resolution images, either directly rendered in one pass on the GPU or rendered as tiles and stitched together.
  • Render to OpenGL contexts and real-time fly throughs of the set.
Today I'm going to talk about some key components of my Mandelbrot Generator:
  • JavaCL
  • MandelbrotView Interface
  • Fixed Precision algorithm

JavaCL

I've been looking at the various Java to OpenCL bindings and having played with a number of them I keep on coming back to Olivier Chafik's JavaCL for the cleanest and best experience. I have yet to use the Maven bindings generator but even without it, the code is very clean.

My MandelbrotGenerator class which handles all the generation modes looks like:
package co.uk.fvdl.mandelbrot;

import co.uk.fvdl.mathematics.FixedPrecision128;
import co.uk.fvdl.util.TwoTuple;
import com.nativelibs4java.opencl.*;

import java.io.IOException;
import java.io.InputStreamReader;

/**
 * Generic class for generating Mandelbrot set using OpenCL. May be able to use as the basis for generating other
 * fractals such as Julia sets by making it possible to choose different OpenCL programs as parameters.
 */
public class MandelbrotGenerator {

    private static final int ESCAPE_THRESHOLD = 4;
    private final CLKernel mandelbrotFloatKernel;
    private final CLKernel mandelbrotDoubleKernel;
    private final CLKernel mandelbrotFixedPrecision128Kernel;
    private final CLQueue queue;
    private final CLContext context;
    private int workgroupWidth;
    private int workgroupHeight;
    private boolean profiling;


    public MandelbrotGenerator(CLContext context, boolean profiling) throws IOException {
        this(context, Utility.detectLargest2DWorkGroupSize(context), profiling);
    }

    public MandelbrotGenerator(CLContext context, TwoTuple<integer, integer> workgroupSize, boolean profiling) throws IOException {
        this.profiling = profiling;
        System.out.println("Workgroup size: " + workgroupSize);
        this.context = context;
        queue = context.createDefaultQueue();
        queue.setProperty(CLDevice.QueueProperties.ProfilingEnable, profiling);

        //Will always support integers...
        String fixedPrecision128Source = MandelbrotViewer.readFully(
                new InputStreamReader(this.getClass().getResourceAsStream("opencl/MandelbrotFixedPrecision128.cl")),
                new StringBuilder()
        ).toString();
        String fixedPoint128LibrarySource = MandelbrotViewer.readFully(
                new InputStreamReader(this.getClass().getResourceAsStream("opencl/FixedPrecision128.cl")),
                new StringBuilder()
        ).toString();
        CLProgram fixedPrecision128Program = context.createProgram(fixedPoint128LibrarySource, fixedPrecision128Source);
        mandelbrotFixedPrecision128Kernel = fixedPrecision128Program.createKernel("MandelbrotFixedPrecision128");

        //Will always support floats...
        String floatSource = MandelbrotViewer.readFully(
                new InputStreamReader(this.getClass().getResourceAsStream("opencl/MandelbrotFloat.cl")),
                new StringBuilder()
        ).toString();
        CLProgram floatProgram = context.createProgram(floatSource);
        mandelbrotFloatKernel = floatProgram.createKernel("MandelbrotFloat");

        //May not always support doubles...
        if(context.isDoubleSupported()){
            //Read the source file.
            String doubleSource = MandelbrotViewer.readFully(
                    new InputStreamReader(this.getClass().getResourceAsStream("opencl/MandelbrotDouble.cl")),
                    new StringBuilder()
            ).toString();
            CLProgram doubleProgram = context.createProgram(doubleSource);
            mandelbrotDoubleKernel = doubleProgram.createKernel("MandelbrotDouble");
        } else {
            mandelbrotDoubleKernel = null;
        }

        this.workgroupWidth = workgroupSize.getFirstElement();
        this.workgroupHeight = workgroupSize.getSecondElement();
    }

    public GenerationResult generateMandelbrot(MandelbrotView mandelbrotView) throws IOException {

        Padded2DBuffer padded2DBuffer = new Padded2DBuffer(mandelbrotView.getRealAxisPixelCount(), mandelbrotView.getImaginaryAxisPixelCount(), workgroupWidth, workgroupHeight);
        CLBuffer<integer> outputBuffer =  context.createBuffer(CLMem.Usage.Output, padded2DBuffer.getPointer());

        CLEvent event;
        CLKernel mandelbrotKernel = null;
        switch (mandelbrotView.getMandelbrotGenerationStrategy()){
            case DOUBLE:
                if(context.isDoubleSupported()){
                    mandelbrotKernel = mandelbrotDoubleKernel;
                    mandelbrotKernel.setArgs(
                            new double[]{mandelbrotView.getRealAxisPixelSize().doubleValue(), mandelbrotView.getImaginaryAxisPixelSize().doubleValue()},
                            new double[]{mandelbrotView.getRealAxisMinimum().doubleValue(), mandelbrotView.getImaginaryAxisMinimum().doubleValue()},
                            mandelbrotView.getMaxIter(), ESCAPE_THRESHOLD, padded2DBuffer.getActualWidth(), outputBuffer);
                    break;
                }
            case SINGLE:
                mandelbrotKernel = mandelbrotFloatKernel;
                mandelbrotKernel.setArgs(
                        new float[]{mandelbrotView.getRealAxisPixelSize().floatValue(), mandelbrotView.getImaginaryAxisPixelSize().floatValue()},
                        new float[]{mandelbrotView.getRealAxisMinimum().floatValue(), mandelbrotView.getImaginaryAxisMinimum().floatValue()},
                        mandelbrotView.getMaxIter(), ESCAPE_THRESHOLD, padded2DBuffer.getActualWidth(), outputBuffer);
                break;
            case FP128:
                mandelbrotKernel = mandelbrotFixedPrecision128Kernel;
                mandelbrotKernel.setArgs(
                        FixedPrecision128.convertToFixedPrecision128(mandelbrotView.getRealAxisPixelSize()), FixedPrecision128.convertToFixedPrecision128(mandelbrotView.getImaginaryAxisPixelSize()),
                        FixedPrecision128.convertToFixedPrecision128(mandelbrotView.getRealAxisMinimum()), FixedPrecision128.convertToFixedPrecision128(mandelbrotView.getImaginaryAxisMinimum()),
                        mandelbrotView.getMaxIter(), ESCAPE_THRESHOLD, padded2DBuffer.getActualWidth(), outputBuffer);
                break;
        }
        event = mandelbrotKernel.enqueueNDRange(queue, new int[]{padded2DBuffer.getActualWidth(), padded2DBuffer.getActualHeight()}, new int[]{workgroupWidth, workgroupHeight});

        outputBuffer.read(queue, padded2DBuffer.getPointer(), true, event);

        if(profiling){
            return new GenerationResult(padded2DBuffer, event.getProfilingCommandEnd() - event.getProfilingCommandStart());
        } else {
            return new GenerationResult(padded2DBuffer, Long.MIN_VALUE);
        }
    }

    public boolean isDoubleAvailable() {
        return context.isDoubleSupported();
    }
}
I've pushed all the JavaCL code into this class and am able to reuse it in both my interactive and benchmarking applications.

MandelbrotView

To provide a coherent view of the set to my Java code I've created a standard interface that defines the place in the Mandelbrot Set that we are looking at and how we are looking at it:
package co.uk.fvdl.mandelbrot;

import java.math.BigDecimal;

/**
 * Classes implementing this interface represent views onto the Mandelbrot set coordinate space. It is fundamentally
 * represented in terms of the centre of the view on the set and in terms of the zoom level represented by the span of a
 * pixel in that coordinate space. This representation of the view makes it easier to handle re-sizing of the containing
 * window, treating it as a viewport on to the Mandelbrot set.
 */
public interface MandelbrotView {

    /**
     * Get the size of pixels on the Real Axis.
     * @return the size of pixels on the Real Axis.
     */
    BigDecimal getRealAxisPixelSize();

    /**
     * Get the size of pixels on the Imaginary Axis.
     * @return the size of pixels on the Imaginary Axis.
     */
    BigDecimal getImaginaryAxisPixelSize();

    /**
     * Get the number of pixels currently used in the view on the real axis.
     * @return the number of pixels currently used in the view on the real axis.
     */
    int getRealAxisPixelCount();

    /**
     * Get the number of pixels currently used in the view on the imaginary axis.
     * @return the number of pixels currently used in the view on the imaginary axis.
     */
    int getImaginaryAxisPixelCount();

    /**
     * Set the number of pixels to be used in the view on the real axis.
     * @param realAxisPixelCount the number of pixels currently used in the view on the real axis.
     */
    void setRealAxisPixelCount(int realAxisPixelCount);

    /**
     * Set the number of pixels to be used in the view on the imaginary axis.
     * @param imaginaryAxisPixelCount the number of pixels currently used in the view on the imaginary axis.
     */
    void setImaginaryAxisPixelCount(int imaginaryAxisPixelCount);

    /**
     * Calculates the minimum value that the current view has on the real axis.
     * @return the minimum value that the current view has on the real axis.
     */
    BigDecimal getRealAxisMinimum();

    /**
     * Calculates the minimum value that the current view has on the imaginary axis.
     * @return the minimum value that the current view has on the imaginary axis.
     */
    BigDecimal getImaginaryAxisMinimum();

    /**
     * Create a new MandelbrotView for a sub-rectangle of the current view. It will preserve the same pixel dimensions
     * as the original and so will choose the scaling axis to ensure that the whole of the selected rectangle is
     * visible within the new MandelbrotViewImpl instance.
     * @param realAxisPosition The pixel position on the real axis within the current view of the top left corner of the
     *                         rectangle.
     * @param imaginaryAxisPosition  The pixel position on the imaginary axis within the current view of the top left
     *                               corner of the rectangle.
     * @param realAxisPixels The pixel width on the real acis of the rectangle
     * @param imaginaryAxisPixels The pixel width on the imaginary axis of the rectangle
     * @return a new MandelbrotViewImpl representing the sub-rectangle of the current view.
     */
    MandelbrotView selectSubRectangle(
            int realAxisPosition, int imaginaryAxisPosition,
            int realAxisPixels, int imaginaryAxisPixels);

    /**
     * Create a new MandelbrotView for a super-rectangle of the current view where the current view will be contained
     * within the selected rectangle. It will preserve the same pixel dimensions as the original.
     * @param realAxisPosition The pixel position on the real axis within the new view of the top left corner of the
     *                         rectangle containing the current view.
     * @param imaginaryAxisPosition  The pixel position on the imaginary axis within the new view of the top left
     *                               corner of the rectangle containing the current view.
     * @param realAxisPixels The pixel width on the real axis of the rectangle
     * @param imaginaryAxisPixels The pixel width on the imaginary axis of the rectangle
     * @return a new MandelbrotViewImpl representing the super-rectangle of the current view.
     */
    MandelbrotView selectSuperRectangle(
            int realAxisPosition, int imaginaryAxisPosition,
            int realAxisPixels, int imaginaryAxisPixels);

    /**
     * Get the preferred generation strategy.
     * @return the preferred generation strategy.
     */
    MandelbrotGenerationStrategy getMandelbrotGenerationStrategy();

    /**
     * The maximum number if iterations to use before considering that we have escaped the set.
     * @return the maximum number of iterations to use.
     */
    int getMaxIter();

    /**
     * Set the maximum number if iterations to use before considering that we have escaped the set.
     * @param maxIter the maximum number of iterations to use.
     */
    void setMaxIter(int maxIter);
}

While the view interface talks about the 'minimums' and the number of pixels which maps well to what we need to render, the underlying implementation defines the view in terms of the centre of the view and the pixel size. This means that when we resize the window containing the view the view remains centred on what we intend:
package co.uk.fvdl.mandelbrot;

import co.uk.fvdl.mathematics.BigDecimalArithmetic;

import java.math.BigDecimal;

import static co.uk.fvdl.util.Preconditions.checkNotNull;

/**
 * This class represents a view onto the Mandelbrot set coordinate space. It is fundamentally represented in terms of
 * the centre of the view on the set and in terms of the zoom level represented by the span of a pixel in that coordinate
 * space. This representation of the view makes it easier to handle re-sizing of the containing window, treating it as a
 * viewport on to the Mandelbrot set.
 */
public class MandelbrotViewImpl implements MandelbrotView {
    private final BigDecimal realAxisCentre;
    private final BigDecimal imaginaryAxisCentre;
    private final BigDecimal realAxisPixelSize;
    private final BigDecimal imaginaryAxisPixelSize;
    private final BigDecimalArithmetic numberArithmetic;
    private final MandelbrotGenerationStrategy mandelbrotGenerationStrategy;

    private int realAxisPixelCount;
    private int imaginaryAxisPixelCount;
    private int maxIter;

    /**
     * Construct an instance of the Mandelbrot view for the following parameters.
     * @param realAxisCentre the centre of the view on the Real Axis.
     * @param imaginaryAxisCentre the centre of the view on the Imaginary Axis.
     * @param realAxisPixelSize the size of pixels on the Real Axis.
     * @param imaginaryAxisPixelSize the size of pixels on the Imaginary Axis.
     * @param realAxisPixelCount the number of pixels to be used in the view on the real axis.
     * @param imaginaryAxisPixelCount the number of pixels currently used in the view on the imaginary axis.
     * @param mandelbrotGenerationStrategy the prefered strategy to use when generating the Mandelbrot set.
     * @param maxIter The maximum iteration to use for generating the set.
     */
    @SuppressWarnings("unchecked")
    public MandelbrotViewImpl(BigDecimal realAxisCentre, BigDecimal imaginaryAxisCentre,
                              BigDecimal realAxisPixelSize, BigDecimal imaginaryAxisPixelSize,
                              int realAxisPixelCount, int imaginaryAxisPixelCount, MandelbrotGenerationStrategy mandelbrotGenerationStrategy, int maxIter) {
        this.mandelbrotGenerationStrategy = mandelbrotGenerationStrategy;
        this.maxIter = maxIter;
        this.numberArithmetic = new BigDecimalArithmetic();
        this.realAxisCentre = checkNotNull(realAxisCentre, "Real Axis centre is null.");
        this.imaginaryAxisCentre = checkNotNull(imaginaryAxisCentre, "Imaginary Axis centre is null.");
        this.realAxisPixelSize = checkNotNull(realAxisPixelSize, "Real Axis pixel size is null.");
        this.imaginaryAxisPixelSize = checkNotNull(imaginaryAxisPixelSize, "Imaginary Axis pixel size is null.");
        this.realAxisPixelCount = realAxisPixelCount;
        this.imaginaryAxisPixelCount = imaginaryAxisPixelCount;
    }

    /**
     * Get the centre of the view on the Real Axis.
     * @return the centre of the view on the Real Axis.
     */
    public BigDecimal getRealAxisCentre() {
        return realAxisCentre;
    }

    /**
     * Get the centre of the view on the Imaginary Axis.
     * @return the centre of the view on the Imaginary Axis.
     */
    public BigDecimal getImaginaryAxisCentre() {
        return imaginaryAxisCentre;
    }

    @Override
    public BigDecimal getRealAxisPixelSize() {
        return realAxisPixelSize;
    }

    @Override
    public BigDecimal getImaginaryAxisPixelSize() {
        return imaginaryAxisPixelSize;
    }

    @Override
    public int getRealAxisPixelCount() {
        return realAxisPixelCount;
    }

    @Override
    public int getImaginaryAxisPixelCount() {
        return imaginaryAxisPixelCount;
    }

    @Override
    public void setRealAxisPixelCount(int realAxisPixelCount) {
        this.realAxisPixelCount = realAxisPixelCount;
    }

    @Override
    public void setImaginaryAxisPixelCount(int imaginaryAxisPixelCount) {
        this.imaginaryAxisPixelCount = imaginaryAxisPixelCount;
    }

    @Override
    public BigDecimal getRealAxisMinimum() {
        return numberArithmetic.subtract(
                realAxisCentre, 
                numberArithmetic.divide(
                        numberArithmetic.multiply(
                                realAxisPixelSize, 
                                numberArithmetic.fromInteger(realAxisPixelCount)
                        ),
                        numberArithmetic.fromInteger(2)));
    }

    @Override
    public BigDecimal getImaginaryAxisMinimum() {
        return numberArithmetic.subtract(
                imaginaryAxisCentre, 
                numberArithmetic.divide( 
                        numberArithmetic.multiply(
                                imaginaryAxisPixelSize, 
                                numberArithmetic.fromInteger(imaginaryAxisPixelCount)
                        ),numberArithmetic.fromInteger(2)));
    }

    @Override
    public MandelbrotViewImpl selectSubRectangle(
            int realAxisPosition, int imaginaryAxisPosition,
            int realAxisPixels, int imaginaryAxisPixels) {
        BigDecimal scale;
        if((realAxisPixelCount / imaginaryAxisPixelCount) > (realAxisPixels/imaginaryAxisPixels)){ //use the imaginary axis to provide the scaling
            scale = numberArithmetic.divide(
                    numberArithmetic.fromInteger(imaginaryAxisPixels),
                    numberArithmetic.fromInteger(imaginaryAxisPixelCount)
            );
        } else { //use the real axis to provide scaling.
            scale = numberArithmetic.divide(
                    numberArithmetic.fromInteger(realAxisPixels),
                    numberArithmetic.fromInteger(realAxisPixelCount)
            );
        }

        float newRealPixelCentre = (float)realAxisPosition + (float)realAxisPixels / 2.0F;
        float newImaginaryPixelCentre = (float)imaginaryAxisPosition + (float)imaginaryAxisPixels / 2.0F;
        float currentRealPixelCentre = (float)realAxisPixelCount / 2.0F;
        Float currentImaginaryPixelCentre = (float)imaginaryAxisPixelCount / 2.0F;

        BigDecimal deltaRealPixels = numberArithmetic.fromFloat(newRealPixelCentre - currentRealPixelCentre);
        BigDecimal deltaImaginaryPixels = numberArithmetic.fromFloat(newImaginaryPixelCentre - currentImaginaryPixelCentre);
        
        BigDecimal newRealAxisCentre = numberArithmetic.add(
                realAxisCentre,
                numberArithmetic.multiply(
                        deltaRealPixels,
                        realAxisPixelSize));
        
        BigDecimal newImaginaryAxisCentre = numberArithmetic.add(
                imaginaryAxisCentre,
                numberArithmetic.multiply(
                        deltaImaginaryPixels,
                        imaginaryAxisPixelSize));

        BigDecimal newRealAxisPixelSize = numberArithmetic.multiply(realAxisPixelSize, scale);
        BigDecimal newImaginaryAxisPixelSize = numberArithmetic.multiply(imaginaryAxisPixelSize, scale);

        return new MandelbrotViewImpl(
                newRealAxisCentre, newImaginaryAxisCentre,
                newRealAxisPixelSize, newImaginaryAxisPixelSize,
                realAxisPixelCount, imaginaryAxisPixelCount, mandelbrotGenerationStrategy, maxIter);
    }

    @Override
    public MandelbrotViewImpl selectSuperRectangle(
            int realAxisPosition, int imaginaryAxisPosition,
            int realAxisPixels, int imaginaryAxisPixels) {
        BigDecimal scale;
        if((realAxisPixelCount / imaginaryAxisPixelCount) > (realAxisPixels/imaginaryAxisPixels)){ //use the imaginary axis to provide the scaling
            scale = numberArithmetic.divide(
                    numberArithmetic.fromInteger(imaginaryAxisPixelCount),
                    numberArithmetic.fromInteger(imaginaryAxisPixels)
            );
        } else { //use the real axis to provide scaling.
            scale = numberArithmetic.divide(
                    numberArithmetic.fromInteger(realAxisPixelCount),
                    numberArithmetic.fromInteger(realAxisPixels)
            );
        }

        int newRealPixelCentre = realAxisPixelCount / 2;
        int newImaginaryPixelCentre = imaginaryAxisPixelCount / 2;
        int currentRealPixelCentre = realAxisPosition + realAxisPixels / 2;
        int currentImaginaryPixelCentre = imaginaryAxisPosition + imaginaryAxisPixels / 2;

        BigDecimal deltaRealPixels = numberArithmetic.fromInteger(newRealPixelCentre - currentRealPixelCentre);
        BigDecimal deltaImaginaryPixels = numberArithmetic.fromInteger(newImaginaryPixelCentre - currentImaginaryPixelCentre);

        BigDecimal newRealAxisPixelSize = numberArithmetic.multiply(realAxisPixelSize, scale);
        BigDecimal newImaginaryAxisPixelSize = numberArithmetic.multiply(imaginaryAxisPixelSize, scale);

        BigDecimal newRealAxisCentre = numberArithmetic.add(
                realAxisCentre,
                numberArithmetic.multiply(
                        deltaRealPixels,
                        newRealAxisPixelSize));

        BigDecimal newImaginaryAxisCentre = numberArithmetic.add(
                imaginaryAxisCentre,
                numberArithmetic.multiply(
                        deltaImaginaryPixels,
                        newImaginaryAxisPixelSize));


        return new MandelbrotViewImpl(
                newRealAxisCentre, newImaginaryAxisCentre,
                newRealAxisPixelSize, newImaginaryAxisPixelSize,
                realAxisPixelCount, imaginaryAxisPixelCount, mandelbrotGenerationStrategy, maxIter);

    }

    @Override
    public String toString() {
        final StringBuffer sb = new StringBuffer();
        sb.append("MandelbrotViewImpl");
        sb.append("{maxIter=").append(maxIter);
        sb.append(", realAxisCentre=").append(realAxisCentre);
        sb.append(", imaginaryAxisCentre=").append(imaginaryAxisCentre);
        sb.append(", realAxisPixelCount=").append(realAxisPixelCount);
        sb.append(", imaginaryAxisPixelCount=").append(imaginaryAxisPixelCount);
        sb.append(", realAxisPixelSize=").append(realAxisPixelSize);
        sb.append(", imaginaryAxisPixelSize=").append(imaginaryAxisPixelSize);
        sb.append(", mandelbrotGenerationStrategy=").append(mandelbrotGenerationStrategy);
        sb.append('}');
        return sb.toString();
    }

    @Override
    public MandelbrotGenerationStrategy getMandelbrotGenerationStrategy() {
        return mandelbrotGenerationStrategy;
    }

    @Override
    public int getMaxIter() {
        return maxIter;
    }

    @Override
    public void setMaxIter(int maxIter) {
        this.maxIter = maxIter;
    }
}
The final component of the MandelbrotView is the MandelbrotViewHolder which acts as a delegate for the multiple MandelbrotView instances that we progress through as we zoom. The intent is that eventually the MandelbrotViewHolder will morph into something that allows us to trace the history of a session exploring the MandelbrotSet.
package co.uk.fvdl.mandelbrot;

import java.math.BigDecimal;

import static co.uk.fvdl.util.Preconditions.checkNotNull;

/**
 * This class provides a level of indirection that allows us to work with immutable implementations of the
 * MandelbrotView interface without having to deal with a large amount of reference passing in UI components as the
 * view is modified.
 */
public class MandelbrotViewHolder implements MandelbrotView {

    private MandelbrotView wrappedInstance;

    /**
     * Construct and instance using the wrapped instance.
     * @param wrappedInstance the instance of MandelbrotView to wrap.
     */
    public MandelbrotViewHolder(MandelbrotView wrappedInstance) {
        this.wrappedInstance = checkNotNull(wrappedInstance, "The MandelbrotView instance to be wrapped may not be null");
    }

    @Override
    public BigDecimal getRealAxisPixelSize() {
        return wrappedInstance.getRealAxisPixelSize();
    }

    @Override
    public BigDecimal getImaginaryAxisPixelSize() {
        return wrappedInstance.getImaginaryAxisPixelSize();
    }

    @Override
    public int getRealAxisPixelCount() {
        return wrappedInstance.getRealAxisPixelCount();
    }

    @Override
    public int getImaginaryAxisPixelCount() {
        return wrappedInstance.getImaginaryAxisPixelCount();
    }

    @Override
    public void setRealAxisPixelCount(int realAxisPixelCount) {
        wrappedInstance.setRealAxisPixelCount(realAxisPixelCount);
    }

    @Override
    public void setImaginaryAxisPixelCount(int imaginaryAxisPixelCount) {
        wrappedInstance.setImaginaryAxisPixelCount(imaginaryAxisPixelCount);
    }

    @Override
    public BigDecimal getRealAxisMinimum() {
        return wrappedInstance.getRealAxisMinimum();
    }

    @Override
    public BigDecimal getImaginaryAxisMinimum() {
        return wrappedInstance.getImaginaryAxisMinimum();
    }

    /**
     * Calls the underlying wrapped instance to create a new view on the Mandelbrot set for the sub-rectangle, replacing
     * it with the new one.
     * @param realAxisPosition The pixel position on the real axis within the current view of the top left corner of the
     *                         rectangle.
     * @param imaginaryAxisPosition  The pixel position on the imaginary axis within the current view of the top left
     *                               corner of the rectangle.
     * @param realAxisPixels The pixel width on the real acis of the rectangle
     * @param imaginaryAxisPixels The pixel width on the imaginary axis of the rectangle
     * @return this instance of the MandelbrotViewHolder which now wraps the new instance of the view.
     */
    @Override
    public MandelbrotView selectSubRectangle(int realAxisPosition, int imaginaryAxisPosition,
                                                int realAxisPixels, int imaginaryAxisPixels) {
        wrappedInstance = wrappedInstance.selectSubRectangle(
                realAxisPosition, imaginaryAxisPosition,
                realAxisPixels, imaginaryAxisPixels);
        return this;
    }

    /**
     * Calls the underlying wrapped instance to create a new view on the Mandelbrot set for the super-rectangle, replacing
     * it with the new one.
     * @param realAxisPosition The pixel position on the real axis within the new view of the top left corner of the
     *                         rectangle containing the current view.
     * @param imaginaryAxisPosition  The pixel position on the imaginary axis within the new view of the top left
     *                               corner of the rectangle containing the current view.
     * @param realAxisPixels The pixel width on the real axis of the rectangle
     * @param imaginaryAxisPixels The pixel width on the imaginary axis of the rectangle
     * @return a new MandelbrotViewImpl representing the super-rectangle of the current view.
     */
    @Override
    public MandelbrotView selectSuperRectangle(int realAxisPosition, int imaginaryAxisPosition, int realAxisPixels, int imaginaryAxisPixels) {
        wrappedInstance = wrappedInstance.selectSuperRectangle(
                realAxisPosition, imaginaryAxisPosition,
                realAxisPixels, imaginaryAxisPixels);
        return this;
    }

    @Override
    public MandelbrotGenerationStrategy getMandelbrotGenerationStrategy() {
        return wrappedInstance.getMandelbrotGenerationStrategy();
    }

    @Override
    public int getMaxIter() {
        return wrappedInstance.getMaxIter();
    }

    @Override
    public void setMaxIter(int maxIter) {
        wrappedInstance.setMaxIter(maxIter);
    }
}

Fixed Precision Mathematics

Having delved deep into the Mandelbrot set using both single and double precision implementations I discovered that with the raw performance of OpenCL on a decent graphic card it is entirely too easy to run into the limits of precision of the numeric representation. Having read around the problem I came upon the idea of Fixed Precision Mathematics to go deeper and came across an excellent piece of work by a gentleman called Eric Bainville at this site.

Using his work as a jumping off point I started seeing if I could integrate Fixed Precision calculations into my Mandelbrot Generator. The only problem I had was that I didn't fully understand the code that I was using and wanted to be able to extend it to greater precisions so I decided to write my own implementation from first principles. I did reuse a useful hack that he used replacing all the insignificant digit carry calculations with a '3'.

In the end I created a stripped down library to service my own Fixed Precision Mandelbrot generation.

It was interesting to compare his and my Fixed Precision libraries to see the similarities and differences - I deliberately didn't take some of the optimisation paths (e.g. dedicated squaring logic) - but I did end up with a mixture of similar and different methods. The largest difference was in the Positive / Positive multiplication where I decided to try to take advantage of the vector methods:
#define MAX_UINT4 (uint4)(0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF)
#define MOST_SIGNIFICANT_BIT_MASK 0x80000000U
uint4 addFP128(uint4 firstAddend, uint4 secondAddend)
{
    uint4 naiveAddition = firstAddend + secondAddend;
    uint4 propagatableVector = convert_uint4(naiveAddition == MAX_UINT4);
    uint4 overflowCarries = (convert_uint4(naiveAddition < firstAddend) & (uint4)(0,1,1,1)).yzwx;
    uint4 propagatedCarries = (uint4)(
                    ( (overflowCarries.z & propagatableVector.z) | overflowCarries.y ) & propagatableVector.y,
                    overflowCarries.z & propagatableVector.z,
                    0,
                    0);

    return naiveAddition + overflowCarries + propagatedCarries;
}

uint4 doubleFP128(uint4 operand)
{
    uint4 propagated = (operand & (uint4)(0,MOST_SIGNIFICANT_BIT_MASK, MOST_SIGNIFICANT_BIT_MASK, MOST_SIGNIFICANT_BIT_MASK)).yzwx  >>(uint4)(31U);
    return (operand<<(uint4)(1U)) + propagated;
}

uint4 incrementFP128(uint4 operand)
{
    uint4 propagatableVector = convert_uint4(operand == MAX_UINT4); // Determine if any of the operand elements are equivalent to 0xFFFFFFFF
    uint4 carriedIncrement = (uint4)(
                    propagatableVector.y & propagatableVector.z & propagatableVector.w & 1, //Carry up to x
                    propagatableVector.z & propagatableVector.w & 1, //Carry up to y
                    propagatableVector.w&1, //Carry up to z
                    1); //increment to be added to w
    return operand + carriedIncrement;
}

uint4 negateFP128(uint4 operand)
{
    uint4 complement = operand ^ MAX_UINT4;
    return incrementFP128(complement);
}

uint4 absoluteFP128(uint4 value) {
    return ((value.x & MOST_SIGNIFICANT_BIT_MASK)?negateFP128(value):value);
}

uint4 multiplyFP128PositivePositive(uint4 firstMultiplicand, uint4 secondMultiplicand)
{
    uint4 wXYZW = firstMultiplicand * (uint4)(secondMultiplicand.w);
    uint4 wXYZWHigh = mul_hi(firstMultiplicand, (uint4)(secondMultiplicand.w));
    uint4 zXYZW = firstMultiplicand * (uint4)(secondMultiplicand.z);
    uint4 zXYZWHigh = mul_hi(firstMultiplicand, (uint4)(secondMultiplicand.z));
    uint4 yXYZW = firstMultiplicand * (uint4)(secondMultiplicand.y);
    uint4 yXYZWHigh = mul_hi(firstMultiplicand, (uint4)(secondMultiplicand.y));
    uint4 xXYZW = firstMultiplicand * (uint4)(secondMultiplicand.x);
    uint4 xXYZWHigh = mul_hi(firstMultiplicand, (uint4)(secondMultiplicand.x));

    uint4 resultPart1 = addFP128((uint4)(yXYZWHigh.x, zXYZWHigh.x, wXYZWHigh.x, 3), (uint4)(xXYZWHigh.y, yXYZWHigh.y, zXYZWHigh.y, wXYZWHigh.y));
    uint4 resultPart2 = addFP128((uint4)(0, xXYZWHigh.z, yXYZWHigh.z, zXYZWHigh.z),   (uint4)(0, 0, xXYZWHigh.w, yXYZWHigh.w));
    uint4 resultPart3 = addFP128((uint4)(0, 0, 0, wXYZW.x), (uint4)(0, 0, zXYZW.x, zXYZW.y));
    uint4 resultPart4 = addFP128((uint4)(0, yXYZW.x, yXYZW.y, yXYZW.z), (uint4)(xXYZW.x, xXYZW.y, xXYZW.z, xXYZW.w));
    return addFP128(addFP128(resultPart1, resultPart2), addFP128(resultPart3, resultPart4));
}

//TODO consider if a dedicated squaring function would be a good idea.

uint4 multiplyFP128AnyAny(uint4 firstMultiplicand, uint4 secondMultiplicand)
{
      // Get the absolute product;
      uint4 p = multiplyFP128PositivePositive(absoluteFP128(firstMultiplicand), absoluteFP128(secondMultiplicand));
      //Return it with the appropriate sign
      return ((firstMultiplicand.x & MOST_SIGNIFICANT_BIT_MASK)^(secondMultiplicand.x & MOST_SIGNIFICANT_BIT_MASK))?negateFP128(p):p;
}
I implemented my Fixed Precision 128 Mandelbrot Kernel using it looking like:
__kernel void MandelbrotFixedPrecision128(
    const uint4 realDelta,
    const uint4 imaginaryDelta,
    const uint4 realMinimum,
    const uint4 imaginaryMinimum,
 const unsigned int maxIter,
 const unsigned int escapeNumber,
 const unsigned int hRes,
 __global int* outputi
)
{
 int realId = get_global_id(0);
 int imaginaryId = get_global_id(1);

    uint4 realPosition = addFP128(realMinimum, multiplyFP128AnyAny(realDelta, realId));
    uint4 squaredRealValue = multiplyFP128AnyAny(realPosition, realPosition);
    uint4 realValue = realPosition;

    uint4 imaginaryPosition = addFP128(imaginaryMinimum, multiplyFP128AnyAny(imaginaryDelta, imaginaryId));
    uint4 squaredImaginaryValue = multiplyFP128AnyAny(imaginaryPosition, imaginaryPosition);
    uint4 imaginaryValue = imaginaryPosition;

 int iter = 0;
 while ( (iter < maxIter) && (addFP128(squaredRealValue,squaredImaginaryValue).x < escapeNumber) )
 {
     imaginaryValue = addFP128(doubleFP128(multiplyFP128AnyAny(realValue, imaginaryValue)), imaginaryPosition);
     realValue = addFP128(addFP128(squaredRealValue, negateFP128(squaredImaginaryValue)), realPosition);

        squaredRealValue = multiplyFP128AnyAny(realValue, realValue);
        squaredImaginaryValue = multiplyFP128AnyAny(imaginaryValue, imaginaryValue);

  iter++;
 }
 if(iter >= maxIter)
  iter = maxIter + 1;

 outputi[imaginaryId * hRes + realId] = iter;
}
This new implementation gives me the ability to delve much deeper into the Mandelbrot Set but at the cost of significantly reduced performance.
On my AMD 6970 Radeon graphics card:
Workgroup size: TwoTuple{firstElement=16, secondElement=16}
10 runs of MandelbrotViewImpl{maxIter=512, realAxisCentre=-0.75, imaginaryAxisCentre=0, realAxisPixelCount=1280, imaginaryAxisPixelCount=1280, realAxisPixelSize=0.00234375, imaginaryAxisPixelSize=0.00234375, mandelbrotGenerationStrategy=FP128} took on average 157.7323556 milliseconds
10 runs of MandelbrotViewImpl{maxIter=512, realAxisCentre=-0.75, imaginaryAxisCentre=0, realAxisPixelCount=1280, imaginaryAxisPixelCount=1280, realAxisPixelSize=0.00234375, imaginaryAxisPixelSize=0.00234375, mandelbrotGenerationStrategy=DOUBLE} took on average 7.4757887 milliseconds
10 runs of MandelbrotViewImpl{maxIter=512, realAxisCentre=-0.75, imaginaryAxisCentre=0, realAxisPixelCount=1280, imaginaryAxisPixelCount=1280, realAxisPixelSize=0.00234375, imaginaryAxisPixelSize=0.00234375, mandelbrotGenerationStrategy=SINGLE} took on average 6.429966899999999 milliseconds

1 comment:

Anonymous said...

Hi Bob,

Great job! thanks for sharing. Can I have full source code. I would to run on my computer to see performance. Did you compare the performance with java thread pool.
Thanks!
Sara