matt egan » experiments in 3D rendering

12/25/2015

experiments in 3D rendering

I've always kind of wondered how 3D renderers do their magic. A few years ago I found a book in a relative's library from the 1990's on some basic 3D computer graphics techniques, and since then I've been itching to write a simple software rendering engine. The book is 3D Computer Graphics by Alan Watt (2nd Edition), there's a 3rd edition out now that you might be able to get your hands on. The book was first printed in 1989, and thus doesn't include anything about hardware rendering or giant 3D graphics libraries, which I found really nice. I've done some stuff with OpenGL before, but since most of the tutorials and guides I've read are simply describing how OpenGL works, they don't usually spend the time to explain most of the the mathematics that are involved.

When I got the book for the first time, I made this little experiment, simply trying to figure out where the eight points in a cube end up after a projection from 3D space into screen space, but something wasn't quite right (if you can notice in the video below), and I kind of gave up on it then and hadn't tried again until now.

This entry is not really intended to be a tutorial of sorts, since there are plenty of them, but more of just a log of some of the fun I had in learning this stuff. If you're interested in a tutorial, there's a great series here by David Rousset that has a lot of the fundamentals.

matlab experiementation

Because my previous experiment had some issues I could never track down and I was never sure where in the pipeline the issues were, I wanted to make sure I had some of the main concepts down before writing a lot of my own matrix math code. I figured Matlab has a nice line function built in, and it has all of the matrix operations built into the langugage, so I figured it'd be a good starting point (even though a lot of my peers tend to bemoan Matlab's existance). I suppose I could have done it in any other langugage, but I didn't want to muck around with matrix libraries and such and was itching to get my feet wet.

I first wrote some functions to give back some standard transformation matricies for quick use. These are ugly, but I was really just looking to get something working to prove the math behind it all.

function [matrix] = perspectiveMatrix(r, l, t, b, f, n)
    matrix = [
        2/(r-l) 0 0 -(r+l)/(r-l) ; 
        0 2/(t-b) 0 -(t+b)/(t-b) ;
        0 0 1/(f-n) -n/(f-n) ;
        0 0 0 1
        ];
end

function [matrix] = xrotationMatrix(deg)
    rad = deg * pi / 180;
    matrix = [
        1 0 0 0 ;
        0 cos(rad) sin(rad) 0 ;
        0 -sin(rad) cos(rad) 0 ;
        0 0 0 1
        ];
end

function [matrix] = yrotationMatrix(deg)
    rad = deg * pi / 180;
    matrix = [
        cos(rad) 0 -sin(rad) 0 ;
        0 1 0 0 ;
        sin(rad) 0 cos(rad) 0 ;
        0 0 0 1
        ];
end

function [matrix] = zrotationMatrix(deg)
    rad = deg * pi / 180;
    matrix = [
        cos(rad) sin(rad) 0 0 ;
        -sin(rad) cos(rad) 0 0 ;
        0 0 1 0 ;
        0 0 0 1
        ];
end

function [matrix] = translationMatrix(x, y, z)
    matrix = [
        1 0 0 0 ;
        0 1 0 0 ;
        0 0 1 0 ;
        x y z 1
        ];
end

The meat of the code is below.

function a = draw(h, event)
    ClearLinesFromAxes()
    lines = [
        [[-50 -50 -100], [-50 50 -50]],       
        [[-50 50 -50], [50 50 -50]],
        [[50 50 -50], [50 -50 -50]],
        [[50 -50 -50], [-50 -50 -50]],
        %
        % more line pairs here, but removed for conciseness
        %

    ];

    lines = reshape(lines, [length(lines), 3, 2]);
    length(lines);
    t = perspectiveMatrix(-100, 100, -100, 100, -100, 100);
    data = guidata(gcf);
    rx = yrotationMatrix(data(1).Value);
    ry = xrotationMatrix(data(2).Value);
    rz = zrotationMatrix(data(3).Value);
    move = translationMatrix(data(4).Value, data(5).Value, data(6).Value);
    for m = [1: length(lines)]
        pointa = [lines(m, :, 1), 1]';
        pointb = [lines(m, :, 2), 1]';
        if (length(lines) - m) <= 44
            transforma = (pointa' * (rz * rx * ry * t))';
            transformb = (pointb' * (rz * rx * ry * t))';
        else
            transforma = ((pointa' * move) * (rz * rx * ry) * t)';
            transformb = ((pointb' * move) * (rz * rx * ry) * t)';
        end
        xya = transforma(1:2);
        xyb = transformb(1:2);
        x = [xya(1) xyb(1)];
        y = [xya(2) xyb(2)];
        l = line(x, y);
        i = length(lines) - m;
        if i == 0
            set(l, 'Color', 'red');
        elseif i == 1
            set(l, 'Color', 'magenta');
        elseif i == 2
            set(l, 'Color', 'green');
        elseif i <= 44
            set(l, 'Color', [.9, .9, .9]);
        end
    end

    xlim([-2, 2])
    ylim([-2, 2])
end

There's some GUI setup boilerplate not shown. There are six sliders shown on top of the plot. Three shift all of the objects around in space, and the other three adjust the transformation matrix to "move" the camera around. The camera tranformation is summed up into the translationMatrix function. The rest of the code is just going through all of the sets of points, transforming them into 2D screen space, and then using the Matlab line(x, y) function to draw then onto a plot. The draw function is called every time a slider value changes. The only weird thing is coloring the first three lines in the lines list different colors. This let me have three different colored axes to make sure everything made sense.

Here are two demos of this script. The first is a simple pyramid shape, and the second is after I added a few more structures and the axes to the output.

writing a matrix in swift (eg. getting sidetracked)

I've been wanting to get deeper into Swift too. So I figured this would be a good candidate project. I looked into a few Swift matrix libaries and really didn't see anyting I liked, so I rolled my own, and what started an "okay, I just want to be able to create and multiply some matricies" project turned into "this thing is like 500 lines now" undertaking.

I don't want to get too detailed on it here, so you can check Github for the full code listing (it's just one file) and the project itself for some more information. Any source code posted here will probably have little repetitive tidbits removed.

All elements in the matrix are doubles, I decided not to go with a generic implementation to simplify matters a bit, and are stored column-major ordered inside a Swift array. For the most part, I copied a lot of the nice functions that Matlab has built in, and tried to maximize the amount of code reuse in the library itself. For example, instead of having two funcitons ones and zeros (which return a prefilled MxN matrix with ones and zeros respectively) that both handle creating a new matrix and prefilling it, I wrote one function prefilled which returns a matrix with MxN elements that are all equal to one number, which is called by ones and zeros. Elementary I know, but I was proud of how clean some of the code came out to be.

//  returns a rows*cols matrix with all elements equal to 'prefill'
class func prefilled(rows: Int, _ cols: Int, pad: Double) -> Matrix {
    let elements = [Double](count: rows * cols, repeatedValue: pad)
    return Matrix.init(nr: rows, nc: cols, elements: elements)
}

//  returns a square prefilled matrix
class func prefilled(size: Int, pad: Double) -> Matrix {
    return Matrix.prefilled(size, size, pad: pad)
}

//  returns a rows*cols matrix filled with zeros
class func zeros(rows: Int, _ cols: Int) -> Matrix {
    return Matrix.prefilled(rows, cols, pad: 0.0)
}

//  returns a square zeros matrix
class func zeros(size: Int) -> Matrix {
    return Matrix.zeros(size, size)
}

//  returns a rows*cols matrix filled with ones
class func ones(rows: Int, _ cols: Int) -> Matrix {
    return Matrix.prefilled(rows, cols, pad: 1.0)
}

//  returns a square ones matrix
class func ones(size: Int) -> Matrix {
    return Matrix.ones(size, size)
}

Here are three more methods that I wrote in a similar way, one top-level method diagonal allows the creation of a diagonally filled matrix with the non-diagonal elements set to any arbitray number. it is then overridden to allow a default value of 0. The identity matrix function then uses the square-matrix diagonal method.

//  returns a fill.count*fill.count matrix with the diagonal elements equal to the elements in
//  fill, from top-left to bottom-right, and all other elements equal to 'pad'
class func diagonal(fill: [Double], pad: Double) -> Matrix {
    let result = Matrix.prefilled(fill.count, fill.count, pad: pad)
    for (row, element) in fill.enumerate() {
        result.elements[(row * fill.count) + row] = element
    }
    return result
}

//  same as diagonal(fill:, pad:), but uses a default pad of zero
class func diagonal(fill: [Double]) -> Matrix {
    return Matrix.diagonal(fill, pad: 0.0)
}

//  returns an size*size sized identity matrix
class func I(size: Int) -> Matrix {
    let ones = [Double](count: size, repeatedValue: 1.0)
    return Matrix.diagonal(ones)
}

Again, in the same vein, instead of writing four functions to perform the four elementary operations on an element-wise basis, I wrote one function that can "combine" two equally sized matricies element-by-element, that takes any function that accepts two doubles and returns a double. Then, the four elementary operations use this base function instead. Similarly, I wrote a map function that does the same, except it maps one matrix to a new matrix through a uinary operation.

//  apply a function to every element
func map(function: (Double) -> Double) -> Matrix {
    let result = Matrix.zeros(nr, nc)
    for (index, element) in elements.enumerate() {
        result.elements[index] = function(element)
    }
    return result
}

//  combine two equally sized matrices into one using a function
func combine(matrix: Matrix, operation: (Double, Double) -> Double) -> Matrix {
    assert(Matrix.dimensionsEqual(self, matrix), "Dimension Mismatch: Dimensions Must Equal")
    let result = Matrix.zeros(nr, nc)
    for (index, element) in elements.enumerate() {
        result.elements[index] = operation(element, matrix.elements[index])
    }
    return result
}

//  element-wise addition, subtraction, multiplication and division
func add(matrix: Matrix) -> Matrix {
    return self.combine(matrix, operation: +)
}

//  add, subtract, multiply and divide by a scalar
func add(scalar: Double) -> Matrix {
    return self.map{$0 + scalar}
}

Using Swift's operator overloading features was really nice as well. I defined the * infix operator on two matricies as matrix multiplication (much like Matlab) and then defined ∆* and ∆/ as element-by-element multiplication and division. However, I have not yet implemented matrix "division", because I have not implemented matrix inversion.

//  add two equivalently sized matrices, element by element
func + (left: Matrix, right: Matrix) -> Matrix {
    return left.add(right)
}

//  element-wise multiplication
infix operator ∆* { associativity left precedence 150 }
func ∆* (left: Matrix, right: Matrix) -> Matrix {
    return left.multiplyElements(right)
}

//  scalar addition
func + (m: Matrix, s: Double) -> Matrix {
    return m.add(s)
}

I also overloaded (and this is probably the coolest bit) the subscript accessor methods. I wanted to be able to index the matricies by any permutation of numbers, arrays, ranges and StrideThroughs. This allows for getting (and setting portions of) new matricies by writing mat(1..5, [1, 2, 9]) or mat(5, stride(from: 1, to: 9, by: 3)) and such. However, I didn't want to rewrite all of these things a billion time for every combination of the two datatypes, so instead every overloaded instance of the subscript accessors simply cast whatever arguments they have into arrays, and then call one main function.

// this array subscript accessor is the basis for the rest of the other accessors, it allows for
// getting and setting groupings of rows and columns in any order, when setting, the fill matrix
// must either be equal to the size of the submatrix that is being set, or a singular element
// that is to be filled in all spaces defined in 'rows' and 'cols'
subscript(rows: Array<Int>, cols: Array<Int>) -> Matrix {
    get {
        let rStart = rows.minElement()!
        let rEnd = rows.maxElement()!
        let cStart = cols.minElement()!
        let cEnd = rows.maxElement()!
        assert(rangeInBounds(rStart...rEnd, cStart...cEnd), "Getting [Rows, Cols] Out Of Bounds")
        let result = Matrix.zeros(rows.count, cols.count)
        for (placeRow, getRow) in rows.enumerate() {
            for (placeCol, getCol) in cols.enumerate() {
                result.elements[(placeRow * cols.count) + placeCol] = elements[(getRow * nc) + getCol]
            }
        }
        return result
    }
    set (fill) {
        let rStart = rows.minElement()!
        let rEnd = rows.maxElement()!
        let cStart = cols.minElement()!
        let cEnd = rows.maxElement()!
        assert(rangeInBounds(rStart...rEnd, cStart...cEnd), "Setting [Rows, Cols] Out Of Bounds")
        let singular = fill.elements.count == 1
        if !singular {
            assert(fill.rangeInBounds(0...(rows.count - 1), 0...(cols.count - 1)), "Supplied Matrix Is Insufficiently Sized")
        }
        for (getRow, placeRow) in rows.enumerate() {
            for (getCol, placeCol) in cols.enumerate() {
                elements[(placeRow * nc) + placeCol] = singular ? fill.elements[0] : fill.elements[(getRow * fill.nc) + getCol]
            }
        }
    }
}

//  rows and cols can be indexed by ranges or by integer strides
subscript(rows: Range<Int>, cols: Range<Int>) -> Matrix {
    get {
        return self[[Int](rows), [Int](cols)]
    }
    set(fill) {
        self[[Int](rows), [Int](cols)] = fill
    }
}

I think the coolest "functional" code I wrote though was in the matrix multiplication method. I wrote a sum method and a transpose method first, and then wrote this. Note, allRows and allCols are methods that return a range that specifies the minimum and maximum row or column indicies for the matrix. The "summing" operation is one line which uses the subscript accessors to get the correct LHS column matrix, and then multiplies its elements by the transposed (turns a row matrix into a column matrix) version of the correct RHS row, and then takes the sum of the result. I thought it was quite beautiful, at least compared to an even deeper nested loop.

// multiply two matrices
func multiply(matrix: Matrix) -> Matrix {
    assert(nc == matrix.nr, "Multiplication Dimension Mismatch")
    let result = Matrix.zeros(nr, matrix.nc)
    for row in result.allRows {
        for col in result.allCols {
            let value = self[row, allCols].multiplyElements(matrix[matrix.allRows, col].transpose()).sum
            result[row, col] = value
        }
    }
    return result
}

actually rendering some stuff now

I really hate mucking with CoreGraphics and NSView's drawRect(), so I wrote a quick little "render buffer" which has the ability to set individual pixel values to colors, and buffer updates (so all pixels get pushed at once to the screen). I wanted this to be cross platform (well, to the iPhone at least) so I wrote a little interface that could potentially be implemented in UIKit as well. This is hopefully the only UIKit/Cocoa specific code in the renderer, so we'll see how it goes and if I end up porting to iOS in the future.

///////////////////////////////////////////////////////////////////////////////////////////////////
//  MARK: RenderSurfaceProtocol
///////////////////////////////////////////////////////////////////////////////////////////////////
//  In order for the renderer to be cross platform, this protocol is used to represent something
//  that can render pixels to a display, it should allow for one state to be saved and
//  continuously drawn, while new pixel data is given to it, when draw() is called, the new pixel
//  data should be shown on the display, but the buffer should not be cleared until clear is called

protocol RenderSurfaceProtocol {
    //  returns the size of the buffer in pixels
    var size: CGSize { get }
    
    init(bufferSize: CGSize)

    //  sets a pixel in the buffer to an RGB color
    func setPixel(x: Int, _ y: Int, _ r: Int, _ g: Int, _ b: Int)

    //  clears all pixels to black
    func clear()

    //  clears all pixels to a given color
    func clearToColor(r: Int, _ g: Int, _ b: Int)

    //  puts the current buffer on the screen, does not clear the buffer
    func draw()
}

I don't think I'll bore you with the actual implementation. I did have some weird issues though when doing this. Honestly I can't remember what it was that caused only some of the screen to update, but I'm sure it was something dumb. Here's a video so you can laugh at the hilarity. I was drawing random circles in the buffer continuously, and for some reason it only updates in a specific region. However, when the window is being resized, it seemed to draw in the right place, until the mouse was released and it went back to acting strange. I fixed it though, so it's all good.

From here, I kind of just went about copying what I had done in Matlab. I did implement a very awful version of Bresenham's line algorithm because I only had a buffer that would draw individual pixels. Note the hardcoding of the viewport bounds.

func drawLine(x1: Int, _ y1: Int, _ x2: Int, _ y2: Int, _ r: Int, _ g: Int, _ b: Int) {
    if(x2 < x1) {
        drawLine(x2, y2, x1, y1, r, g, b)
    } else {
        let dx = x2 - x1
        let dy = y2 - y1
        if dx > 0 {
            for x in x1...x2 {
                let slope: Double = Double(dy) / Double(dx)
                let y: Double = (slope * Double(x - x1)) + Double(y1)
                if(x > 640 || x < 0 || y > 480 || y < 0) {
                    
                } else {
                    renderSurface.setPixel(x, Int(y), r, g, b)
                }
            }
        } else {
            if y2 > y1 {
                for y in y1...y2 {
                    renderSurface.setPixel(x1, y, r, g, b)
                }
            } else {
                for y in y2...y1 {
                    renderSurface.setPixel(x1, y, r, g, b)
                }
            }
        }

    }
}

To make things a little more extensible, I created a Renderable interface, which simply described something that had a collection of triangular faces. In order for it to be dynamic, I added a construct() method which would have the item fill its list of faces.

//  Describes an object that is renderable, renderable objects should have a list of verticies and
//  faces, a boolean which represents if the verticies and faces lists have been filled 
//  ('constructed') and a construct function which calculates the verticies and faces, filling 
//  the respective lists

protocol Renderable {
    
    //  the verticies that comprise the object, should be row vectors (in an array), with
    //  three items, x, y, and z respectively
    var vertices: [[Double]] { get }
    
    //  the faces of the object, should be a list of 3-item integer lists, refering to indicies
    //  in the verticies list
    var faces: [[Int]] { get }
    
    //  the origin of the object as a row matrix [[x, y, z]]
    var origin: Matrix { get }
    
    //  describes if the object has had it's verticis and faces lists filled, if an object is not
    //  constructed at init, this should be false until construct is called, and falsified if
    //  a property is modified that would modify the verticies or faces
    var constructed: Bool { get }
    
    //  fills the verticies and faces matricies
    func construct()
    
}

Here's an example of a pyramid object that implements the interface. In this case it's hand coded and really simple.

class Pyramid: BaseRenderableObject {
    
    override func construct() {
        construct(50.0)
    }
    
    func construct(size: Double) {
        let d: Double = size / 2.0
        //fill the verticies array
        vertices = [
            [-d,  0, -d],
            [-d,  0,  d],
            [ d,  0,  d],
            [ d,  0, -d],
            [ 0,  d,  0],
        ]
        
        faces = [
            //  bottom
            [0, 2, 1],
            [0, 2, 3],
            
            //  sides
            [0, 4, 1],
            [1, 4, 2],
            [2, 4, 3],
            [3, 4, 0]
        ]
        
        constructed = true
    }
    
}

Here's an example of a plane (like the flat surface, not an airplane) object that has an adjustable subdivision property, meaning it can either consist of just two triangles, or subdivide it's size into smaller "grid" subsections for more detail. Note I was kind of naughty here and didn't implement the construct() method and instead opted to call a different construct method manually. I should fix this later, but it works for now.

class Plane: BaseRenderableObject {
   
   override func construct() {
       
   }
   
   func construct(size: Double, subdivisions: Int, primaryAxis: Int, secondaryAxis: Int) {
       let d: Double = size / 2.0
       //fill the verticies array
       for primaryIndex in 0...subdivisions {
           let i: Double = -d + ((size/Double(subdivisions)) * Double(primaryIndex))
           for secondaryIndex in 0...subdivisions {
               let j: Double = -d + ((size/Double(subdivisions)) * Double(secondaryIndex))
               var vertex: [Double] = [0.0, 0.0, 0.0]
               vertex[primaryAxis] = i
               vertex[secondaryAxis] = j
               vertices.append(vertex)
               if primaryIndex != subdivisions && secondaryIndex != subdivisions {
                   let tl = (primaryIndex * (subdivisions + 1)) + secondaryIndex
                   let bl = ((primaryIndex + 1) * (subdivisions + 1)) + secondaryIndex
                   print([tl + 1, bl, tl])
                   print([tl + 1, bl, bl + 1])
                   faces.append([tl + 1, bl, tl])
                   faces.append([tl + 1, bl, bl + 1])
               }
           }
       }   
       constructed = true
   }    
}

The main rendering code is largely the same as the Matlab code, just even uglier. It caries out the same functions. I may make it public later after cleaning it up a bit.

Here's some videos from the process. Hilariously, the first result I got in Swift was the same thing I got when I tried doing this three years ago, where my cube would mysteriously "squish" down into a square. I tried forever thinking about how you could look at a cube and have it appear as just a square, and I think it's imposisble. I'm not 100% in the video the dots are even visible with the compression.

Just for grins I decided to connect these incorrectly located dots with lines, for humor's sake.

I reviewed the code some more, and fixed the bug.

This is when I added the planes. I also added some sliders to mess with the camera location.

Finally I added perspective and some more sliders. This took me a bit of time to wrap my head around how it works, but I finally got it (except for the bugs).

So this is it for now. I'd like to actually fill the triangles now to make the wireframe views into something more substantial, but I think to get there I'm going to have to do some major refactoring and optimization. It'd be fun to implement a z-buffer and then some backface culling. I'd also like to implement some basic lighting and a fly around camera. If nothing else, this has been a really fun introduction into 3D computer graphics (and some more advanced Swift) and now I think I'd feel comfortable applying some of this math in OpenGL shaders.

If you'd like the code send me an email, it's not really polished and really in no state to be put on Github.