
Unnamed Project


This project is currently in the very early stages of development. It will be set in space in the far-flung future. The details are a mystery, even to us.

Development Log

Below is a log of our progress so far. Enjoy.

Log 2 - Blazingly Fast™ Glyph Rendering

24 June, 2023

And we're back. Most of us graduated from university in the past month or so, and I moved, so progress slowed down for a bit. That's not to say there hasn't been any; it's just that more blog posts weren't our priority, given their audience is, generously, about 5 people. Alex has been working on a physics system, and Yigit's been working on an asset management/serialization system for all of the data we're gonna need[1]. This week though, I'm going to be talking about my pet project for the past few weeks: working on a UI and text rendering solution.

As mentioned in Log 0, text rasterization is something that I sorted out pretty early on. However, all the existing code could do was rasterize predefined text to a texture, and then render that texture to a quad. This would be fine for static text or buttons, but obviously not for any dynamic content, such as rendering debug info, which happens to be something we really need as we begin to develop tools for working on the game. Generally, my approach when faced with something I don't know much about[2] is to begin by trying to get it working in the simplest way I can think of. In this case, that meant using my existing text rendering code, which renders set text to a bitmap, to create a new bitmap[3] and upload it as a texture to the GPU each frame. This worked fine, except that it was abysmally slow.

Essentially, what I had created was a software rasterizer. In other words, each frame, I was going over each pixel on the screen (the texture I was writing to was the size of the screen) and determining its color. While I wasn't doing anything fancy, it's still a really bad idea to do this in software when the GPU has a whole unit of hardware dedicated to the purpose. For reference, my extremely basic text rendering was taking ~5ms, which is nearly a third of our frame budget[4], running at 1280x720. Not good.

Scrapping my initial approach, I did some research on text rendering, and found a few common methods. The more traditional approach is to use what's called a "glyph atlas," where each glyph gets rasterized to a texture at startup, and then the appropriate region of that texture is drawn to quads as needed. A more modern approach uses a "signed distance function" (SDF), defined for each glyph, to provide properly antialiased glyphs independent of resolution. While the SDF technique is more flexible, it's significantly more involved than using a glyph atlas, so I opted for the latter. My initial implementation was very basic, and wrote the geometry for each quad to a vertex buffer, with the position and texture coordinates defined per vertex. This implementation provided a significant performance increase, but still had some room for improvement.
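
In case it's helpful, here's roughly the shape of that first version (the names, like TextVertex and appendGlyphQuad, are mine for illustration, not the actual engine code):

```cpp
// Rough sketch of the first glyph-atlas version: one quad per glyph written
// into CPU-side vertex/index arrays, with position and atlas UVs per vertex.
#include <cstdint>
#include <vector>

struct TextVertex {
    float pos[2]; // screen-space position
    float uv[2];  // texture coordinates into the glyph atlas
};

struct GlyphRect {
    float x, y, w, h;     // where the quad goes on screen
    float u0, v0, u1, v1; // which region of the atlas holds this glyph
};

static void appendGlyphQuad(std::vector<TextVertex> &verts,
                            std::vector<uint32_t> &indices, const GlyphRect &g) {
    uint32_t base = (uint32_t)verts.size();
    verts.push_back({{g.x,       g.y      }, {g.u0, g.v0}});
    verts.push_back({{g.x + g.w, g.y      }, {g.u1, g.v0}});
    verts.push_back({{g.x + g.w, g.y + g.h}, {g.u1, g.v1}});
    verts.push_back({{g.x,       g.y + g.h}, {g.u0, g.v1}});
    // Two triangles per quad.
    const uint32_t idx[6] = {0, 1, 2, 2, 3, 0};
    for (uint32_t i : idx) indices.push_back(base + i);
}
```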

For one, texture atlases are a little old school, and there are a few problems generally associated with them. One is the simple fact that you can't have textures of different sizes or formats contained within them, since they're all part of the same actual texture. Another is mipmap[5] bleeding, where at lower mipmap levels, the different sub-textures of the atlas begin to affect one another. Instead, I decided to use a texture array to store the textures, which means that each glyph gets its own texture. Then, instead of looking up the texture coordinates of a glyph within the atlas, the index of the glyph within the array is used to grab the appropriate texture when rendering. Because they're separate textures, we can size each one to only be as large as the glyph in question needs. Additionally, there's no way for mipmap bleeding to occur anymore.
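
Conceptually, the per-glyph bookkeeping then shrinks to something like this (the field names are my guess at what such a struct needs, not our actual layout):

```cpp
// With a texture array, a glyph no longer needs atlas UVs; it just needs an
// index into the array plus its own metrics.
#include <cstdint>

struct GlyphInfo {
    uint32_t textureIndex;       // which slot in the texture array holds this glyph
    int      width, height;      // bitmap size, in pixels
    int      bearingX, bearingY; // offset from the pen position to the bitmap
    int      advance;            // how far to move the pen after this glyph
};
```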

I was also initially drawing the quads the same way I draw all of the 3D geometry: using a vertex buffer and an index buffer. While this makes sense for 3D meshes, it leaves a lot of redundant data when rendering quads. For a given quad, we have 4 vertices, each containing its position, texture coordinate, and character, all of which can be derived just from the position and character of the quad itself. On top of this redundancy, the vertex and index buffers need to be updated every frame. Eating up the bus's bandwidth on redundant data is no good, so I revised my approach.

Instead of using a vertex and index buffer to hold the quad geometry, I got rid of them completely and just tell the GPU to draw 6 vertices[6] for each quad, without providing any data per vertex. Then, in the vertex shader, I use the built-in gl_VertexIndex to calculate which quad is being drawn, and which vertex within that quad. However, the data for each quad still has to exist somewhere on the GPU to be accessible to the vertex shader. To accomplish this, I use a "shader storage buffer object," or SSBO. An SSBO is really just a buffer that shaders can read from and write to, and it can contain pretty much anything we need it to. It has the advantages of being able to be much larger than a uniform buffer and being writeable from the GPU, at the cost of some speed[7]. The SSBO contains the data for each quad that needs to be drawn, and is persistently host mapped, meaning it can always be written to from the CPU.
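
Put together, the host side of this looks roughly like the sketch below. The struct layout and names are illustrative rather than the engine's actual code, but the idea is the same: copy the quad data into the mapped SSBO, then issue a draw with 6 × quadCount vertices and let the vertex shader reconstruct everything from gl_VertexIndex.

```cpp
// Sketch of the "no vertex buffer" path: per-quad data lives in a
// persistently mapped SSBO, and the draw call just asks for 6 vertices per quad.
#include <vulkan/vulkan.h>
#include <cstdint>
#include <cstring>

struct GlyphQuad {
    float    x, y;         // screen-space position of the quad
    float    w, h;         // quad size (matches the glyph bitmap)
    uint32_t textureIndex; // which texture in the array holds this glyph
    uint32_t _pad[3];      // padding so the C++ and shader layouts agree
};

// quadsMapped is the persistently host-mapped SSBO memory.
static void drawText(VkCommandBuffer cmd, GlyphQuad *quadsMapped,
                     const GlyphQuad *quads, uint32_t quadCount) {
    std::memcpy(quadsMapped, quads, quadCount * sizeof(GlyphQuad));
    // No vertex/index buffers bound: the vertex shader reconstructs each
    // corner from gl_VertexIndex, e.g.
    //   quad   = gl_VertexIndex / 6;
    //   corner = gl_VertexIndex % 6;   // 0..5 -> the two triangles of a quad
    // and reads quads[quad] out of the SSBO.
    vkCmdDraw(cmd, 6 * quadCount, 1, 0, 0);
}
```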

That's more or less it for the text rendering. It's a shockingly deep topic when you start really drilling down into it, but I've found this quads-and-texture-arrays approach good enough for what we need it for, at least for the time being. You can see a small UI rendered using this new method below:

UI Rendering

I've still gotta make an RSS feed for these...

- Oliver

Notes

1. They should have posts about these in the coming weeks if I bother them enough.

2. Most things, but in this case text rendering.

3. On the CPU, each frame. I know, I know.

4. Or 1/6 if your name starts with "B" and ends in "ethesda".

5. Mipmaps are downscaled copies of a texture that the GPU keeps on hand (if enabled) to use when the texture is viewed from farther away.

6. Six because, now that we've gotten rid of the indices, we need separate vertices for each triangle, and each quad is made up of two triangles.

7. Allegedly.

Log 1 - Renderer Plumbing

28 April, 2023

Hi again. If you're reading this, that means I've put out a second blog post and haven't immediately given up, so I'm going to call that a win. My friend Chin wants me to create an RSS feed for this blog, so look out for that. I'm also thinking about making an email list so I can send out alerts to people who really need to know exactly when the new Northhead Games™ Incorporated LLC © is up. Anyways, this week I want to talk about our renderer, as that was the primary focus.

Because we're writing this system more or less from the ground up in C[1], we also need to write a renderer ourselves. I don't want to go down a rathole trying to write a general-purpose renderer, so my goal is to set up something that works for what we're trying to do right now, and worry about what we might need later when we actually need it. For now, the goal was to be able to load arbitrary .obj models with texture and normal maps, transform them around world space, and have some basic Phong shading[2] with a directional light. As I showed in the previous post, our renderer was already capable of loading a model and transforming it, but not much else. Before adding new features, however, I took some time to revise the renderer's interface to the game code.

Previously, each action[3] the renderer could perform was called via a function in the game code. While this was fine for testing that the game code could in fact call the renderer, and that the renderer was working, it presents a central problem: the game code must wait on the renderer any time it calls down to it. A secondary problem is that this approach does not lend itself to being parallelized later, as the renderer code would become tightly intertwined with the game logic. Instead, we've switched to using a command queue system to pass messages to and from the renderer.

The command queue is super straightforward. During the game code's execution, it can make a call to the platform layer (Plover) to push a render command into the queue. The queue itself is just a fixed-size circular buffer, with an assert to tell me if messages start getting overwritten. The commands themselves are a tagged union, as that seemed like the most natural way to store the data, as opposed to some sort of template craziness. The outgoing message queue that the renderer uses to send messages back to the game code is implemented in exactly the same way. Currently, these queues are not thread safe, but adding a simple mutex lock[4] when rendering gets spun out onto a separate thread would be trivial. A rough sketch of the queue is below; after that, onto the new renderer features that got added this week.
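
The command types and fields here are made up for illustration, but the tagged union and ring buffer are the real shape of it:

```cpp
// Minimal sketch of the render command queue: a tagged union for the
// commands and a fixed-size ring buffer with an overwrite assert.
#include <cassert>
#include <cstdint>

enum RenderCommandType { CMD_LOAD_MODEL, CMD_SET_TRANSFORM, CMD_CREATE_MATERIAL };

struct RenderCommand {
    RenderCommandType type; // tag
    union {
        struct { const char *path; }            loadModel;
        struct { uint32_t id; float mat[16]; }  setTransform;
        struct { uint32_t albedo, normal; }     createMaterial;
    };
};

struct CommandQueue {
    static const uint32_t CAPACITY = 256; // must be a power of two
    RenderCommand commands[CAPACITY];
    uint32_t head = 0; // next slot to read
    uint32_t tail = 0; // next slot to write

    void push(const RenderCommand &cmd) {
        // Yell if the consumer has fallen so far behind that we'd overwrite
        // unprocessed commands.
        assert(tail - head < CAPACITY && "command queue overflow");
        commands[tail % CAPACITY] = cmd;
        tail++;
    }

    bool pop(RenderCommand *out) {
        if (head == tail) return false; // queue is empty
        *out = commands[head % CAPACITY];
        head++;
        return true;
    }
};
```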

The next step after rejiggering the renderer API was implementing a basic lighting model. We have plans for fancier stuff, but for testing and getting things up and running quickly, I've done a really basic Phong-type shader. I'm not going to go over the specifics, because this sort of lighting has been done to death, but if you're not familiar, the tutorials at learnopengl.com are excellent. I did encounter a few interesting issues worth talking about, though, and some worth showing because they have an appealing aesthetic, like this cool but totally screwed-up shader test:

Glitchy render

While most of the implementation of the current lighting model went off without a hitch, I did spend an inordinate amount of time trying to figure out why the hell my normal maps were getting clobbered in the shader, like in this screenshot of my precious artefact with a stone texture on it:

Messed up normals on artefact

At first blush it might look okay, but if you look a little more carefully you'll see really strange discontinuities in the lighting. The problem ended up being that the normal map, which looks like this:

Normal map texture

was being interpreted as sRGB. sRGB, or standard RGB, is a colorspace that maps the color values stored in our input textures to values that better match how our eyes perceive brightness. The curve of this mapping is as follows:

Linear vs SRGB curve
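
For reference, the curve above is the standard piecewise sRGB transfer function; this is the textbook definition rather than engine code:

```cpp
// The sRGB transfer function and its inverse (the curve plotted above).
#include <cmath>

float srgbToLinear(float s) {
    return (s <= 0.04045f) ? s / 12.92f
                           : std::pow((s + 0.055f) / 1.055f, 2.4f);
}

float linearToSrgb(float l) {
    return (l <= 0.0031308f) ? l * 12.92f
                             : 1.055f * std::pow(l, 1.0f / 2.4f) - 0.055f;
}
```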

While this mapping is great for our color (or albedo) textures[5], it skews the normal vectors defined in the normal map, creating inaccurate normals in the final render of the object, as shown in this diagram:

Diagram of normal skew

With that issue resolved, we can now render our models with normal and specular maps, which looks pretty good in my opinion:

Anyways, that's all for this week. The plan is to have Yigit write an article about the asset streaming system we're working on, so keep an eye out for that. Also I made a Twitter, so feel free to go follow that.

- Oliver

Notes

1. It's technically written in C++, but we use C++ features very sparingly. We're not religious about it, but in general we use: function/operator overloading, method syntax[a], and references[b]. We stay away from heavy usage of templates and also don't really use virtual methods/OOP stuff. As much fun as arcane debates about programming language features are, our main focus is writing code that executes instructions on the CPU/GPU :).

a. When functionality is really tightly coupled with data

b. Mostly because I find ptr->member syntax pretty ugly and needing to change the pointer is pretty rare.

2. Phong shading is a shading method that uses an ambient, diffuse, and specular component to figure out the intensity of the lighting of each pixel. It's a really old technique, but it's still very capable of creating nicely shaded models, particularly because we'll be going for a more stylized look and won't need the physical accuracy that more modern approaches provide.
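
For the curious, the combined intensity boils down to something like this (a generic scalar sketch with placeholder ambient strength and shininess, not our actual shader):

```cpp
// Combine the three Phong terms into a single light intensity, given the
// usual dot products between the normal, light, view, and reflection vectors.
#include <cmath>

float phongIntensity(float nDotL, float rDotV, float shininess = 32.0f) {
    float ambient  = 0.1f;                                   // constant base light
    float diffuse  = std::fmax(nDotL, 0.0f);                 // facing the light?
    float specular = std::pow(std::fmax(rDotV, 0.0f), shininess); // shiny highlight
    return ambient + diffuse + specular;
}
```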

3. Load a model, transform a model, create a material, etc.

4. Or potentially a more efficient concurrency mechanism, like a fine-grained lock or a non-blocking structure, if it were warranted. As I'm writing this I should actually be studying for my concurrency exam tomorrow, so I could really overcomplicate things here if I wanted to.

5. Colors tend to look washed out and lack depth with a linear mapping.

Log 0 - Getting Started

20 April, 2023

Hi reader, welcome to the development log. This "unnamed project"[1] is going to be a sandbox factory-building game, set in space. Currently, the team consists of 3 programmers. All of us are finishing up our last year of university, with the plan being that we'll be able to put significant hours into the project in the coming months. Our hope is to have something playable by August, but beyond that it doesn't seem appropriate to make predictions about the timeline.

There'll be a lot of technical details to talk about as this project progresses, so I'm just going to jump right in. Initially, we were developing a small prototype of the game in Unity. However, we were constantly running into problems with the garbage collector and generally feeling hampered by an interface we don't have control over. So, inspired partially by Casey Muratori's wonderful Handmade Hero series, we're going to create our own engine. Oh, the hubris! As anyone on the net will tell you, making your own engine is a horrible, terrible idea that will always end in disaster. Maybe they're right, but the truth is it's just way more fun. We're trying to minimize our use of external libraries and maintain ownership over as much of the code as possible. The code for the library itself will be public[2], so if you want to take a look at it, head over to its GitHub. It's called Plover because we like shore birds and are tired of programming projects called things like "Neutronium."[3]

Because the game will be 3D, we needed to pick a graphics API. I personally had some familiarity with OpenGL, but wanted to use something with a more modern API. Our options were DX12, Metal, and Vulkan. We ended up going with Vulkan, because we want to target Windows, Mac, and Linux, and don't want to have to port all of our graphics code between systems. Initially we were a bit scared of Vulkan, as it has a reputation for being very complex, but frankly, with some basic graphics programming experience it's not too hard to get to grips with; it's just significantly more verbose than OpenGL. To get the renderer up and running, I went through the Vulkan tutorial, which is really high quality. We can now render this beautiful viking room test model thing:

Viking room render

I have no doubt that you're exceedingly impressed by it. Initially, Yigit and I tried implementing our own memory allocator[4]. Unfortunately, that ended up looking something like this:

Allocator messing up

This was less than desirable, as you might imagine, so we've switched to using the header library Vulkan Memory Allocator, which works fine for the time being. Memory allocation on the GPU may be something we come back to when doing optimization later on.
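
For anyone curious what using VMA looks like, buffer creation is roughly this (a minimal sketch rather than our actual wrapper; it assumes a VmaAllocator was created with vmaCreateAllocator at startup):

```cpp
// Allocate a GPU buffer through Vulkan Memory Allocator instead of picking
// a memory type and calling vkAllocateMemory ourselves.
#include <vk_mem_alloc.h>

static VkBuffer createVertexBuffer(VmaAllocator allocator, VkDeviceSize size,
                                   VmaAllocation *outAllocation) {
    VkBufferCreateInfo bufferInfo = {};
    bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
    bufferInfo.size  = size;
    bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT |
                       VK_BUFFER_USAGE_TRANSFER_DST_BIT;

    VmaAllocationCreateInfo allocInfo = {};
    allocInfo.usage = VMA_MEMORY_USAGE_AUTO; // let VMA pick the memory type

    VkBuffer buffer = VK_NULL_HANDLE;
    vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer,
                    outAllocation, nullptr);
    return buffer;
}
```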

Once this was finished, I went ahead and implemented font rendering, using the FreeType library as a base. Once we get around to writing an asset loader, we will probably pre-rasterize the fonts when building the asset file(s?), but for now the glyphs get written into a bitmap texture on the CPU and then uploaded to the GPU for rendering. This required adding a subpass to our render pass for rendering UI, which was my first real step outside of the comfort of the tutorials.
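
The FreeType side of that is pretty compact; roughly, each glyph gets rasterized like the sketch below (error handling and the actual copy into our bitmap omitted, and the face is assumed to already be loaded):

```cpp
// Rasterize one glyph into FreeType's 8-bit coverage bitmap, ready to be
// copied into our CPU-side text bitmap at the current pen position.
#include <ft2build.h>
#include FT_FREETYPE_H

static FT_Bitmap *rasterizeGlyph(FT_Face face, unsigned long codepoint) {
    // Load the glyph for this codepoint and render it to a grayscale bitmap.
    FT_Load_Char(face, codepoint, FT_LOAD_RENDER);

    // face->glyph->bitmap now holds rows * pitch bytes of coverage data;
    // face->glyph->bitmap_left / bitmap_top give the offset from the pen.
    return &face->glyph->bitmap;
}
```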

The main snag with text rendering was getting kerning working. Kerning is how typographers define how individual pairs of glyphs should be spaced, as shown in this image I stole:

Kerning demo

FreeType supports kerning, but unfortunately only via the TrueType kern table. The font I wanted to use doesn't have this table; instead it has the OpenType GPOS table. To read it, the renderer now parses the binary data and extracts the kerning information for each pair of glyphs. While the GPOS table supports a huge number of features, I've only implemented horizontal kerning, as I'm only rendering Latin text. The implementation borrows pretty heavily from the stb_truetype library, which also implements kerning using the GPOS table. The result of all this work can be seen below, with nicely kerned glyphs:

Kerning demo
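
As for how the kerning actually gets used, layout just nudges the pen by the pair adjustment before placing each glyph. A sketch, with getKerning standing in for the lookup into our parsed GPOS data:

```cpp
// Apply pair kerning while laying out a run of glyphs.
#include <cstdint>

struct PenState { float x = 0.0f, y = 0.0f; };

// Stand-in for the lookup into the kerning data parsed from the GPOS table,
// returning the horizontal adjustment for a glyph pair (already scaled to
// the same units as the advance). Stubbed here.
static float getKerning(uint32_t leftGlyph, uint32_t rightGlyph) { return 0.0f; }

static void layoutGlyph(PenState &pen, uint32_t prevGlyph, uint32_t glyph,
                        float advance) {
    if (prevGlyph != 0) {
        pen.x += getKerning(prevGlyph, glyph); // pair adjustment, often negative
    }
    // ...emit the quad for `glyph` at the pen position here...
    pen.x += advance; // move the pen to the next glyph's position
}
```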

The last interesting bit of the engine to discuss[5] is the way we're going to handle memory. Most modern C/C++ game engines use a ton of new/delete calls to allocate memory for whatever they need. Instead, taking after Handmade Hero, we're going to have the engine provide the game with a flat memory partition that it can do whatever it wants in, with no dynamic memory allocation in the game code. This has a few benefits. The first is that we can guarantee that if a given machine has enough memory to allocate the flat partition, our game will run on it, eliminating a whole class of errors. The second is that we won't be taking the performance hit of calling into the operating system every frame to get new memory. Finally, we can also hot swap the game code[6] trivially. While this is a pretty unorthodox approach to memory nowadays, I think it'll be interesting to see how it pays off.
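
The game-side allocation then amounts to bumping a pointer through that partition. A minimal sketch of the idea, with illustrative names rather than our actual API:

```cpp
// The platform layer hands the game one big block up front; the game
// linearly allocates out of it instead of calling new/delete.
#include <cassert>
#include <cstddef>
#include <cstdint>

struct GameMemory {
    uint8_t *base;     // start of the partition handed over by the engine
    size_t   size;     // total size, fixed for the lifetime of the game
    size_t   used = 0; // bump pointer
};

static void *pushSize(GameMemory &mem, size_t bytes, size_t align = 16) {
    size_t offset = (mem.used + align - 1) & ~(align - 1); // align up (power of two)
    assert(offset + bytes <= mem.size && "out of game memory");
    mem.used = offset + bytes;
    return mem.base + offset;
}

// e.g. Entity *entities = (Entity *)pushSize(memory, 1024 * sizeof(Entity));
```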

This log is rather long, as it encompasses a couple months of off-and-on work. Sorry about that. Our hope is that logs will be written weekly, but we'll see if that actually happens :).

- Oliver

Notes

1. We call it ProjectG internally.

2. I haven't decided on the license, but probably something permissive like MIT.

3. Sorry if that's the name of your project.

4. For fun. Obviously.

5. So far, we hope.

6. Which is just a shared library with an exported function that runs each frame of the game loop.
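
On Linux, that reload boils down to something like the sketch below (the symbol and function names are placeholders, and the Windows path would use LoadLibrary/GetProcAddress instead):

```cpp
// Drop the old copy of the game's shared library, load the freshly rebuilt
// one, and re-fetch the per-frame entry point.
#include <dlfcn.h>

typedef void (*GameUpdateFn)(void *gameMemory, float dt);

static GameUpdateFn reloadGameCode(void **handle, const char *libPath) {
    if (*handle) dlclose(*handle);       // release the previous build
    *handle = dlopen(libPath, RTLD_NOW); // load the rebuilt library
    if (!*handle) return nullptr;
    return (GameUpdateFn)dlsym(*handle, "gameUpdateAndRender");
}
```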