Saturday, January 23, 2010

Prime numbers

Erik Meijer has done this excellent series in functional programming which is available at Channel9.msdn.com
In one of the lectures he gives this example of generating an infinite list of prime numbers using lazy evaluation. His example was in Haskell. Here is the equivalent in F#:


let rec sieve nums = seq {
let head = Seq.head nums
yield head
let tail = Seq.skip 1 nums
let notMultipleOf n x = x % n <> 0
yield! sieve (Seq.filter (notMultipleOf head) tail)
}

let rec numbersAbove n = seq { yield n
yield! numbersAbove (n+1) }

let primes = sieve (numbersAbove 2)

Tuesday, January 12, 2010

Creating an interactive game using F#


One of the interesting things about functional programming is the notion of immutable types. Functional programmers avoid shared mutable state like the plague. In most functional languages, all values are immutable by default. A value cannot change once it is created. If you want to change a value in struct with many fields, you need to create a new struct, copy all the values and provide a new value for the field that needs to change. My initially thoughts were that this seemed incredibly inefficient. What it buys us though is that we can now modify values without worrying about another thread trying to access the same value. This all sounds great in theory but how do you write a game without having mutable state? The following example is kindof silly, but it serves to illustrate the point. How do you do the equivalent of:

void update() =
int rocketHeight = 0;
while(true)
{
drawRocketAt(rocketHeight);
rocketHeight++;
}

One of the ways of maintaining state is to pass it around in a recursive function. The following snippet has no mutable state, yet the rocket is drawn at a new position each frame.

let rec update(rocketHeight) =
drawRocketAt(rocketHeight)
update(rocketHeight + 1)
update(0)


This is all well and good, but surely you cannot pass all the state of every object in the same loop.
Agents to the rescue!
An agent or Actor is a lightweight object that runs in its own thread. Communication between actors happens through message passing. In F#, the MailboxProcessor implements this functionality and is essentially an actor/agent.

let mailbox = MailboxProcessor.Start(fun inbox ->
let rec update(state) =
async {
let! msg = inbox.Receive()
match msg with
| ActionA ->
let newState = OnActionA()
return! update(newState)
| ActionB ->
let newState = OnActionB()
return! update(newState)
| Stop -> return ()
}
update(initialState) )
mailbox.Post(Action)


Agents allows us to separate concerns and avoid having to pass unrelated game state in our main recursive loop. Of course agents are also multi-threaded so chances are that some work done by agents will actually automatically happen in parallel. The mechanism for passing messages are also thread safe. Only one request can be handled at a time.

In the past week I have written the classic game Pong using these ideas. It is hopelessly over engineered for a game as simple, but it helps to see how this might fit into a game with a bigger scope. In this game there are 3 agents. The main loop, Score and PlayerController. Since there are two player controllers, this is actually 4 different threads. The main loop is implemented as an agent that post an update message to itself for each frame. This allows the Application thread to send a Stop message to end update loop.
The Score agent listens for collision messages. Whenever the ball hits any of the oponents' zones, the score is updated.

type ScoreMessage =
| Collision of Surface
| Stop

type Score(physics:Physics) =

let mailbox = MailboxProcessor.Start(fun inbox ->
let rec update(green, red) =
async {
let! msg = inbox.Receive()
match msg with
| Collision(surface) ->
let score =
match surface with
| GreenZone -> (green, red+1)
| RedZone -> (green+1, red)
| Wall -> (green, red)
Debug.Trace "Score: Green-%d Red-%d" (fst score) (snd score)
return! update(score)
| Stop -> return ()
}
update(0,0) )

let OnCollision(player) =
mailbox.Post(player)

do physics.Event.AddHandler(fun player -> mailbox.Post(Collision(player)))

For the moment it only prints a value to the debug output. (It turns out SlimDX Direct2D is not completely implemented yet. You can not draw to the same surface as a Direct3D renderTarget.)
The PlayerController listens for events from the keyboard and updates the paddle positions accordingly.
The above example used the normal way of handling events. F# also allows you to manipulate the event stream using the Observable class. One can create an event stream that listents for events, reformat them and publish this modified event stream. One can even filter out events that you are not interested in. In the Input class I map from SlimDX KeyInputEventArgs to my own KeyEvent. I only have a subset of all the available keys because I would like them to fit into a 64bit integer bitfield. I therefor need to first map it to my representation and then filter out those that I don't have a mapping for.

module Keyboard =
...
do Device.RegisterDevice(UsagePage.Generic, UsageId.Keyboard, DeviceFlags.None)
// Listen only for key events for which we have a mapping defined
let Input = Device.KeyboardInput
|> Observable.map ( fun e ->
match keyToFlag.TryFind(e.Key) with
| Some(key) ->
{Key = key; IsPressed = e.State = KeyState.Pressed}
| None -> KeyEvent.None )
|> Observable.filter
(fun keyEvent -> keyEvent.Key <> KeyFlags.None)

Clients of the Keyboard module can then subscribe to this event stream. The PlayerController listens for these events and sends the appropriate message to its message queue.

type PlayerController =
...
let keyboardInput = Keyboard.Input.Subscribe(fun (keyEvent:KeyEvent) ->
if keyEvent.Pressed(left) then mailbox.Post(Left)
elif keyEvent.Pressed(right) then mailbox.Post(Right) )

Monday, January 4, 2010

Tessellation




DirectX 11 provides three new stages to the graphics pipeline that allows us to increase the complexity of objects by generating additional polygons. For this example we will look at tesselating the triangles of an icosahedron to generate a sphere. Our vertex shader for this example simply passes the vertex position to the hull shader.

HSInOut VertShader(VSInput input )
{
HSInOut Out;
Out.Position = input.Position;
return Out;
}

Like the geometry shader, the hull shader takes a draw primitive (in this case a triangle) as input is called once for each control point(vertex). The hull shader has two components. A patch constant function and the shader function. The patch constant function is responsible for calculating the tesselation factors for the edges and inside of the triangle before passing it along to the tesselation unit. The main shader can optionally perform additional computations. In this example, it only passes each vertex to the tesselator.

HSConst PatchConstantsHS( InputPatch< HSInOut, 3 > p )
{
HSConst output = (HSConst)0;
output.Edges[0] = output.Edges[1] = output.Edges[2] = 4;
output.Inside = 4;
return output;
}

[domain("tri")]
[partitioning("integer")]
[outputtopology("triangle_cw")]
[patchconstantfunc("PatchConstantsHS")]
[outputcontrolpoints(3)]
HSInOut PassThroughHS(
InputPatch< HSInOut, 3 > inputPatch,
uint uCPID : SV_OutputControlPointID )
{
HSInOut output = (HSInOut)0;
output.Position = inputPatch[uCPID].Position;
return output;
}

The real action happens in the Domain shader. Every additional vertex that is output from the tesselator is passed to the domain shader. It provides the original primitive that was tesselated, as well as the coordinates of the newly generated vertex in relation to the original. For a triangle patch, this is the barycentric coordinates of the vertex. While this sounds really special, it is really just the weights with respect to the original triangle vertices. It can easily be transformed by multiplying each weight with the original vertex coordinate as seen in the snippet below. The newly generated vertices are still in the same plane of the original triangle. In order to provide a better looking sphere the domain shader will also push out these new vertices so that they lie on the surface of our sphere before doing the transformation to screen space.

[domain("tri")]
PSInput DS(
HSConst input, const OutputPatch< HSInOut, 3 > TrianglePatch,
float3 BarycentricCoordinates : SV_DomainLocation )
{
PSInput output = (PSInput)0;

// Interpolate local space position with barycentric coordinates
float3 position = BarycentricCoordinates.x * TrianglePatch[0].Position
+ BarycentricCoordinates.y * TrianglePatch[1].Position
+ BarycentricCoordinates.z * TrianglePatch[2].Position;
position = normalize(position);
output.Position = mul( float4(position, 1.0), WorldViewProjection );
output.Normal = mul( position, (float3x3)World );
return output;
}

When creating the hull and domain shader it is important to specify shader model 5.0.

type Geometry(device, shape) =
...
let hullShader =
let byteCode = ShaderBytecode.CompileFromFile(
"Simple.hlsl",
"PassThroughHS",
"hs_5_0",
ShaderFlags.None,
EffectFlags.None)
new HullShader(device, byteCode)

let domainShader =
let byteCode = ShaderBytecode.CompileFromFile(
"Simple.hlsl",
"DS",
"ds_5_0",
ShaderFlags.None,
EffectFlags.None)
new DomainShader(device, byteCode)

Something extremely important. The primitive topology should be set to a patch list with 3 control points. Failing to do this will result in several hours of pulling out your hair to figure out why things don't work.

type Geometry(device, shape) =
...
member this.Prepare(transforms) =
do deviceContext.InputAssembler.PrimitiveTopology <-
PrimitiveTopology.PatchListWith3ControlPoint
do deviceContext.HullShader.Set(hullShader)
do deviceContext.DomainShader.Set(domainShader)
do deviceContext.DomainShader.SetConstantBuffer(
vsConstBuffer.Update(transforms),0)
...

Note also that the DomainShader now requires access to the transforms shader constants.

Saturday, January 2, 2010

Geometry shaders for Dummies



DirectX 11 offers 2 new programmable shader stages in the graphics pipeline, the Hull Shader and the Domain Shader. Before I investigate those, I wanted to first look at Geometry shaders, which was actually introduced in DirectX 10 already. Geometry shaders allows you to treat your data at a primitive level rather than the vertex level. Where in a Vertex shader you have access to only a single vertex at a time, the geometry shaders gives you the three vertices of a triangle as input. The shader can then modify the vertices, add more triangles or even discard it entirely. For my example I will tessellate an Icosahedron. This is a convenient shape to subdivide since it results in a sphere with evenly distributed vertices. Each triangle of the original icosahedron is recursively subdivided into 4 triangles. The newly created vertices are then normalized so that the lie on the surface of a unit sphere.
The vertex shader which would usually transform the vertices form local space to screen space, will now just pass the vertices along to the pipeline without any transformation.

struct GSInput
{
float4 Position : POSITION;
};
GSInput VertShader(VSInput input )
{
GSInput Out;
Out.Position = input.Position;
return Out;
}

Note that the original normal of the vertices are not really of interest since for a unit sphere, the normal is the same as the position. We only pass on the vertex position in local space.
SphereVertex is a helper function used by the Geometry shader that now does the vertex transformation.

PSInput SphereVertex(float4 position)
{
PSInput output;
output.Position = mul( position, WorldViewProjection );
output.Normal = mul(position.xyz, (float3x3)World);
return output;
}

Below is the algorithm for a single subdevision of an icosahedron. Here I ran into a limitation of HLSL. It turns out that recursive functions are not supported... One could write the algorithm iteratively but it is a bit more messy, and this exercise was really just a stab creating a geometry shader that does something useful. DirectX 11 actually has a hardware tesselator for this sort of thing. In the next tutorial I will create Hull and Domain shaders to do tessellation with arbitrary complexity.

[maxvertexcount(12)]
void GeomShader( triangle GSInput input[3], inout TriangleStream OutputStream )
{
float4 va = input[0].Position;
float4 vb = input[1].Position;
float4 vc = input[2].Position;
float4 v1 = float4(normalize(va.xyz + vb.xyz),1);
float4 v2 = float4(normalize(va.xyz + vc.xyz),1);
float4 v3 = float4(normalize(vb.xyz + vc.xyz),1);
OutputStream.Append( SphereVertex(vc) );
OutputStream.Append( SphereVertex(v2) );
OutputStream.Append( SphereVertex(v3) );
OutputStream.Append( SphereVertex(v1) );
OutputStream.Append( SphereVertex(vb) );
OutputStream.RestartStrip();
OutputStream.Append( SphereVertex(va) );
OutputStream.Append( SphereVertex(v1) );
OutputStream.Append( SphereVertex(v2) );
OutputStream.RestartStrip();
}

When setting up the geometry, one creates a geometry shader in very much the same way as a vertex or pixel shader.

type Geometry(device, shape) =
...
let gsByteCode = ShaderBytecode.CompileFromFile(
"Simple.hlsl",
"GeomShader",
"gs_4_0",
ShaderFlags.None,
EffectFlags.None)

let geomShader = new GeometryShader(device, gsByteCode)
....
member this.Prepare(transforms) =
....
do deviceContext.GeometryShader.Set(geomShader)
do deviceContext.GeometryShader.SetConstantBuffer(
vsConstBuffer.Update(transforms),0)



Note that the transforms constants (World/WorldViewProjection) that were previously used by the vertex shader must now be provided to the geometry shader instead.


For interest sake, here is what the subdivision algorithm in F# looks like if it were to be done on the CPU instead.



static member Sphere subdivisions =
let t = (1.0f + sqrt(5.0f))/2.0f;
let s = sqrt(1.0f + t*t);
let icosahedronVertices = [|
Vector3( t, 1.0f, 0.0f)/s;
Vector3( -t, 1.0f, 0.0f)/s;
Vector3( t,-1.0f, 0.0f)/s;
Vector3( -t,-1.0f, 0.0f)/s;
Vector3( 1.0f, 0.0f, t)/s;
Vector3( 1.0f, 0.0f, -t)/s;
Vector3(-1.0f, 0.0f, t)/s;
Vector3(-1.0f, 0.0f, -t)/s;
Vector3( 0.0f, t, 1.0f)/s;
Vector3( 0.0f, -t, 1.0f)/s;
Vector3( 0.0f, t,-1.0f)/s;
Vector3( 0.0f, -t,-1.0f)/s;
|]

let icosahedronIndices = [|
0; 8; 4;
1;10; 7;
2; 9;11;
7; 3; 1;
0; 5;10;
3; 9; 6;
3;11; 9;
8; 6; 4;
2; 4; 9;
3; 7;11;
4; 2; 0;
9; 4; 6;
2;11; 5;
0;10; 8;
5; 0; 2;
10; 5; 7;
1; 6; 8;
1; 8;10;
6; 1; 3;
11; 7; 5;
|]

let rec subdevide(vertices:array, indices, level) =
if level < subdivisions then
let subdivideHelper(accVertices:array,
accIndices,
triangle:list) index =
match triangle with
| ib::ia::[] ->
let ic = index
let va, vb, vc = vertices.[ia], vertices.[ib], vertices.[ic]
let v1, v2, v3 =
Vector3.Normalize(va + vb),
Vector3.Normalize(va+vc),
Vector3.Normalize(vb + vc)
let i1 = accVertices.Length
let i2 = i1+1
let i3 = i2+1
(Array.append accVertices [|v1;v2;v3|],
Array.append accIndices [|i1;i3;i2;
ia;i1;i2;
ic;i2;i3;
ib;i3;i1|],
[])
| _ -> (accVertices,accIndices, index::triangle)

let vertices, indices, _ =
Array.fold subdivideHelper (vertices,[||],[]) indices
subdevide(vertices, indices, level+1)
else
vertices, indices

let vertices, indices = subdevide(icosahedronVertices, icosahedronIndices, 0)
{ Vertices = Array.map (fun vec -> Vertex(vec, vec)) vertices; Indices = indices }