<![CDATA[programming - ruby0x1.notes]]>https://notes.underscorediscovery.com/ (Thu, 11 Feb 2021 13:11:29 GMT)

<![CDATA[UE4 - list blueprint assets by type]]>

https://notes.underscorediscovery.com/ue4-list-blueprint-assets-by-type/ (Tue, 27 Mar 2018 13:22:25 GMT)

I've been busy working on a project in Unreal lately, and wanted to enumerate the blueprint assets in a folder. This post has a simple solution.

I looked around and found this post which covers finding UClass from a type, native or from assets, but it didn't feel right for my needs. I then found out about UObjectLibrary and, after figuring out some of its API, found this post which also helped. UObjectLibrary has been working well and has lots of neat little helpers in the API for what I wanted, and the code is simple.

The code is at the end of the post.

important notes

Make sure the path you use exists in the cooked data in packaged builds. The project has settings for forcing folders to be cooked; add your folders there if your code assumes they exist.

All Content/<folder> paths are referenced as /Game/<folder> when using this. Here we have /Game/Trees and /Game/Cards.

[image]

The point

Let's say I was doing some procedural generation, and I wanted to spawn some trees (or even whole chunks of a level). These trees can be made into blueprint actors, and spawned dynamically. But how do I get a list of them to spawn from in the first place? That's what we want here, and this is what we'll get.

[image]

Type filtering
You'll notice the Type Class filter. This is useful for being specific about the subclass, so that you can cast knowing each result is of the right type.

For example, in my case I have a c++ class called EventCard that all my "card" blueprints inherit from, so I can select it here. Then I can cast to the specific class and use it as that type if I want to. (This also means you can mix types in the same path/folder, and query only the ones that you want.)

This example also shows how you can take the type and spawn an actor from it. Since the class is just a type of tree, we are now creating one actual tree from it. We can do that several times for the same class, making many from one.

[image]

C++ spawning/usage
This code example assumes all my blueprints are inheriting from the c++ class AEventCard, which they are. (see unreal wiki for GetWorld() alternative)

TArray<UClass*> list;  
helper::GetBlueprintsOf(AEventCard::StaticClass(), TEXT("/Game/Trees"), list);  
UClass* cardClass = list[0]; //assume it found one, use the first one  
AEventCard* card = GetWorld()->SpawnActor<AEventCard>(cardClass);  

Code setup

In my case I had two goals:

  • I want to use this from c++
  • And I want to use it from blueprints

The general idea is we want one c++ function, and then one blueprint-facing wrapper function to expose that. The Unreal Wiki has some great examples of exposing stuff to blueprints so I won't get too specific, but I will provide a hopefully clear example of how it'll be set up.

I already have a general purpose blueprint function library in c++ called "TypesAndGlobals", and this is where I'll expose it. If you don't have one (and want to use this and don't mind c++ in your project), you can create one from File -> New C++ class and choose Blueprint Function Library as the parent class.

Once the function is exposed then we get this:

[image]

The code

<![CDATA[game engines: using ALL the languages]]>

https://notes.underscorediscovery.com/game-engines-using-all-the-languages/ (Mon, 20 Nov 2017 07:44:26 GMT)

In the 1.0/long term version of my engine, I designed the core runtime to expose its API in such a way that it would be agnostic of the scripting language. This is a short post on how that looks, when you connect the dots.

This post owes thanks to my friend Jeff Ward, who helped me with the dart embedding and cleared up some of my misconceptions about embedding mono.

I've chosen wren for several reasons as the default language in my engine. That post includes a lot of rationale, but near the end points out that you'll be able to use any language later. I was validating some assumptions earlier today and figured hey, why not bind <x> - it'll be quick? And a few hours later.... x was not alone.

Turns out I could bind rust, c# (mono), c++, js, lua, dart, python and swift in less than a day. It's not the entire API, but the foundations are there (the rest can largely be generated). This also comes back to the data oriented API: because everything is static functions and primitives, it's a bunch easier.

Here's how the log looks when they're all loaded up:

[image]

Here's how the dependencies look after:

[image]

sometimes simple

I was really surprised how simple it was in some languages. For example, c# with mono can handle a c function pointer directly. This is great, because that's how the API is defined. The same is true of rust and swift, which both have direct access to the endpoints. lua and js (via duktape.org), on the other hand, have generic function callbacks, where you unpack the args and forward them along.
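
To make the two binding styles concrete, here is a minimal C++ sketch. Everything in it is hypothetical: engine_add stands in for an engine endpoint, and ScriptValue and add_callback imitate the kind of generic callback a VM like lua or duktape expects. It is not the engine's actual API, only an illustration of a direct function pointer versus unpack-and-forward.

```cpp
#include <vector>

// Hypothetical engine endpoint: a static C function taking only primitives.
extern "C" int engine_add(int a, int b) { return a + b; }

// Direct style: hosts that understand C function pointers can be handed
// the endpoint as-is and call it with no glue code.
using add_fn = int (*)(int, int);

// Generic callback style: the VM passes its arguments as a list of values.
// ScriptValue is a stand-in for the VM's own value type (an assumption).
struct ScriptValue { int i; };

// Unpack the arguments, forward them to the endpoint, and store the result.
// Returns 0 on success, -1 when the argument count is wrong.
inline int add_callback(const std::vector<ScriptValue>& args, ScriptValue& result) {
    if (args.size() != 2) return -1;
    result.i = engine_add(args[0].i, args[1].i);
    return 0;
}
```

Both paths end up at the same endpoint. The only difference is how much glue sits in between, which is why an API made of static functions and primitives is quick to bind.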

This is the C# connection to the C API.

[image]

And then on the c# side:

[image]

sometimes frustrating

The dart sdk pulled down around 5.5gb of... stuff. It then built for a long while and the folder was around 6.5gb before I saw any usable files. I don't know why dart implies it's designed to be embedded: a lot of the header files in the embedding api reference internal header files, use unportable functions (_strdup) and more. It's wild.

Python has dependencies on a lot of implicit binaries. With cffi and cpython or pypy it has to reach the python runtime, the cffi runtime and who knows what else (in terms of standard lib?). Even the cffi docs complain the entire thing is a mess... I hate dependencies, and wouldn't want the end user to have to install python or anything. Maybe those can somehow be put alongside the binary all together, but I have no idea atm.

user plugin api

On the user end, the workflow should ideally be the same. They get a consistent interface to the engine API (idiomatic to the language to some extent) and they all feel similar.

Here's how it works in rust, c#, python (but you can imagine the rest, they're all very similar by design).

[image]

[image]

[image]

That's it for today

I'll go into more details in the future, and if you have questions feel free to send them my way.

<![CDATA[OBJ parser: easy parse time triangulation]]>

https://notes.underscorediscovery.com/obj-parser-easy-parse-time-triangulation/ (Thu, 05 Jan 2017 01:50:01 GMT)

For some reason whenever I wrote a quick obj parser, I never bothered to think about the case where the faces in the file had more than 3 points. It was easier in the moment to just triangulate the meshes before loading them in the 3D app. In practice on a bigger project, with bigger scenes and lightmaps and all sorts, this became a bit of a pain.

Since I was writing a quick obj parser again, I decided to think about it for more than a few minutes. I didn't find many results on doing this that were easy to come by, so I figured I'd share what I implemented for future reference. The only thing you have to tweak is how you parse the f lines.

This solution may be obvious to some and that's fine!

Note that this assumes you want to parse the mesh details as triangle primitives, like GL_TRIANGLES or similar. This does NOT require using triangle fan primitives.

The solution

Parse the face points as if they were a triangle fan!

The details

All that means is they share a common vertex.
We can use the first vertex (0) of the face, and connect each of the points after that back to it.

[image]

This is the basis for our easy triangulation. The points are almost always coplanar and should generally be convex, but do be aware of what happens when they're not. This solution of course also applies to any other mesh format that stores faces and follows similar coplanar rules.

If we imagine a simple quad, in our OBJ file we might have this line:

f 1/1/1 4/2/1 3/3/1 2/4/1

Using the above logic we can parse the face by hand/on paper/in mind as:

triangle one:  
[1, 1, 1],    [4, 2, 1],    [3, 3, 1]
triangle two:  
[1, 1, 1],    [3, 3, 1],    [2, 4, 1]

So the logic is for each point: connect itself & the next point back to the first point. We can do this in a very simple pseudoish for loop:

var points = split line by ' ', discard the first item ('f')
var triangle_indices = []
//the end of the range is exclusive, so i + 1 is always a valid point
for(i in 1 ... points.count-1) {
    tri_corner0 = points[0]
    tri_corner1 = points[i]
    tri_corner2 = points[i + 1]
    triangle_indices.push([tri_corner0, tri_corner1, tri_corner2])
}

This logic applies the same if there are just 3 points, since it will loop from 1 to 2, adding a single triangle with the correct points!
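
The same loop as a small C++ sketch. It keeps each face corner as the raw token from the file (for example 4/2/1), since turning the indices into vertices is a separate step. The function name and types are made up for this example, and it assumes the corners are separated by whitespace.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One face corner exactly as written in the file, e.g. "4/2/1".
using Corner = std::string;
using Triangle = std::array<Corner, 3>;

// Split an "f ..." line into triangles by fanning around the first corner.
inline std::vector<Triangle> triangulate_face_line(const std::string& line) {
    std::istringstream in(line);
    std::string token;
    in >> token; // discard the leading "f"

    std::vector<Corner> points;
    while (in >> token) points.push_back(token);

    std::vector<Triangle> triangles;
    // Connect each point and the one after it back to the first point.
    // The condition keeps i + 1 in range; fewer than 3 points adds nothing.
    for (std::size_t i = 1; i + 1 < points.size(); ++i) {
        Triangle tri = {{ points[0], points[i], points[i + 1] }};
        triangles.push_back(tri);
    }
    return triangles;
}
```

Feeding it the quad line from above gives the same two triangles that were worked out by hand, and a face with 3 points comes back as a single triangle.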


That's all there is to it, now you can throw most meshes into the loader without any trouble, and the second part of obj loading (converting the indices into vertices) doesn't change.

Yay for better workflow!

<![CDATA[c++11 constexpr fnv1a compile time hash]]>

https://notes.underscorediscovery.com/constexpr-fnv1a/ (Wed, 19 Oct 2016 10:14:55 GMT)

I've been working with some c++ lately and had previously been using static hash values for strings (similar to this from the Bitsquid blog) but the one thing that bothered me was the lack of ability to use them in switch statements, where I often want to use them. With constexpr in c++11, I can.

I've been talking a bit with Alan Wolfe (@Atrix256), and he has been posting a good series exploring code vs data and compile time evaluation on his blog. You should read it if that interests you.

I had been meaning to convert my fnv1a hash functions into constexpr but hadn't had enough reason to yet. The recent posts from Alan spurred me on to just do it, and it was surprisingly straightforward. I found a few really old examples of people trying it, but for some reason they were all a mess and wrapped in classes...

Since I originally had to go digging to find a nice simple version of fnv1a (especially a 64 bit one), the implementations of the hash function I am using are below, along with the constexpr version. The license is public domain or an equivalent (a link would be nice so others may find it, but not necessary).

See the assembly output on Compiler Explorer.
Gist on Github

If you have any thoughts, feel free to send them my way:

runtime version

constexpr version
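
A minimal sketch of both versions, assuming the standard published FNV-1a offset basis and prime constants. The function names are placeholders for this example and not necessarily the ones used in the gist. The constexpr versions recurse because c++11 only allows a single return statement in a constexpr function.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Runtime FNV-1a, 32 bit: xor each byte into the hash, then multiply by the prime.
inline std::uint32_t fnv1a_32(const void* key, std::size_t len) {
    const unsigned char* data = static_cast<const unsigned char*>(key);
    std::uint32_t hash = 0x811c9dc5u;   // 32 bit offset basis
    for (std::size_t i = 0; i < len; ++i) {
        hash ^= data[i];
        hash *= 0x01000193u;            // 32 bit prime
    }
    return hash;
}

// Compile time FNV-1a, 32 bit, over a null terminated string.
constexpr std::uint32_t fnv1a_32_const(const char* str, std::uint32_t value = 0x811c9dc5u) {
    return *str == '\0'
        ? value
        : fnv1a_32_const(str + 1, (value ^ static_cast<unsigned char>(*str)) * 0x01000193u);
}

// Compile time FNV-1a, 64 bit.
constexpr std::uint64_t fnv1a_64_const(const char* str, std::uint64_t value = 0xcbf29ce484222325ull) {
    return *str == '\0'
        ? value
        : fnv1a_64_const(str + 1, (value ^ static_cast<unsigned char>(*str)) * 0x100000001b3ull);
}

// Checked by the compiler against the published FNV-1a results for "a".
static_assert(fnv1a_32_const("a") == 0xe40c292cu, "32 bit FNV-1a mismatch");
static_assert(fnv1a_64_const("a") == 0xaf63dc4c8601ec8cull, "64 bit FNV-1a mismatch");

// Because the hash is a constant expression, it can be used as a case label.
// The command names here are invented for the example.
inline int command_id(const char* name) {
    switch (fnv1a_32(name, std::strlen(name))) {
        case fnv1a_32_const("jump"): return 1;
        case fnv1a_32_const("fire"): return 2;
        default:                     return 0;
    }
}
```

The switch is the part that static hash values couldn't do: the case labels are hashed by the compiler, and only the incoming string is hashed at runtime.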

other uses

I shared this recently as another use case of hashes at compile time:

<![CDATA[haxe: compile time macros]]>

https://notes.underscorediscovery.com/haxe-compile-time-macros/ (Tue, 04 Nov 2014 11:05:45 GMT)

Haxe is a really great language for me. It does cross platform in a sensible way - by compiling and generating code to a target language. The best part is that it's not just converting, it's properly compiling the code - so all errors are caught by the compiler itself long before the generated code even gets there.

One of its most powerful features is the macro system which allows you to run haxe code at compile time, to augment and empower your existing haxe code. It sounds crazy - so let's dig in.


In this post, the example output code will be shown using javascript for simplicity - just one of the many language targets that it supports - c++, c#, php, java, python, js.

Haxe in Haxe at compile time

Haxe manages to get macros right: it uses its own language at compile time to alter the compilation state. This means you can inject expressions and code, remove code, throw errors and generally make code do things not usually possible, that are specific to your code base or target. Even better, during this compilation phase you have the full power of the language behind you to do so.

In luxe for example, there is a Component/Entity system available. Sometimes, a user would accidentally try to use the entity parent property in the constructor of the Component, long before it was assigned. This wasn't their fault, it's just the nature of the way the system works, and it was something that would have to be learnt. But not with macros around!

One of the first macros I decided to write was based on this problem. I made an @:autoBuild macro run on every descendant of Component which, at compile time, has a look at the expressions within the constructor of the given component. If it finds you touching the field named entity, it throws a neat and clearly marked error message to warn you. This saves oodles of time on things being null and obscure crashes, and gives a massive boost to usability when you can design for that explicitly.

The exact code is actually not complete right now - but the ability to do this type of thing is far more helpful than it first seems.

complex code rejection

Because you can alter the expressions in the macros at compile time, you can reject code from ever existing in the output. This is already possible in Haxe through #if loglevel > 1 - but what if the condition was far more complex? What if the condition was based on where the code is being built - like on a continuous integration server? What about environment variables? Or git commit revisions? Basically, any condition you can program, a macro can check. Since a macro is just haxe code, it has the full capability of the Haxe language and compiler to do its bidding at compile time.

log code rejection

One simple example is logging code, using log levels to define what level of logging is present in a build. I like really dense detailed logs because I can write a parser for them and visualize them in ways that aid debugging complex systems quickly. This can add a large toll on a code base if the log code ends up in the output, because every logged string has to be stored and allocated and adds to the final build size output and sometimes runtime cost.

The macro rejecting the expression means the final code does not include the logging at all. Haxe already has a concept like this built in as a build flag --no-traces, which removes trace() calls - the built in debugging print command - but the concept applies not only to logging but more expensive and intricate systems like profiling and instrumentation.

profiling and instrumentation

Haxe macros let me add instrumentation code to my heart's content without it ever affecting runtime release builds, something I have wanted an elegant solution to for quite some time. The next section is an even better option - what about deep profiling all functions automatically? Or each block, or each expression of each block?

complex code injection

Since you can emit code expressions from a macro, you can inject code as well. You can construct entire classes and types dynamically - at compile time.

Let's take the profiling example one step further and devise a conceptual macro for automatically profiling every block expression within a given class. Notice below I have tagged my class for a "build" macro - I want this class to be handled by my Profiling macro apply function at compile time. Since I only care about the update function right now in this example - let's tag that code for profiling only using custom metadata @:profiling. Note that @:build is from haxe, the custom one is ours.

Also take note that I separated logic into blocks { } of expressions - because I can use this to my advantage in the macros at compile time.

@:build(Profiling.apply())
class Player {
    @:profiling function update(dt:Float) {
        { /* update ai */ }
    } //update
} //Player

automatic injection

Now I have everything I need. My macro will run at compile time on the class I am interested in measuring, and check all methods in the class for @:profiling. If it finds the tag, it will look for each root block { } expression and automatically insert a start and end measurement around it, so the final code at runtime would, in pseudo code, look like:

profiler.start('update_ai') {

    //update ai

} profiler.end('update_ai').

For now - I won't be posting the code (this system is not even finished being coded heh) but the important thing is to understand the potential from macros and their ability to empower the code base for the development process to be quicker, more friendly, more streamlined in the output and more expressive.


This of course has downsides: code is being executed at compile time. While the haxe compiler is incredibly fast, you can slow it to a crawl with a single compile time macro. If your macro introduces network latency by pinging a server or something, you will be waiting for that too.

The other thing to consider is that macros are quite complex and are the most advanced feature in haxe - so it often appears unapproachably difficult. Often this is not the case, and patience and examples will get you using them in no time. Haxe 3 made massive strides in simplifying their usage - they still have some things that are fairly difficult to wrap your head around that WILL take time to get used to.

This is not something you can fast track - the easiest way I have found is to learn by doing. That's why I am making this post, to hopefully inspire you to think of really simple, really easy macros that help you get your feet wet.

Simple concrete example

Most times a unique build id is useful in determining which version or specific build is being executed on a test machine or a user's machine for debugging purposes. To that end, our simple example will generate a unique-ish static string value for a build id. Since this code happens at compile time, a far more complex algorithm can be used to ensure uniqueness if required, but for the most part this code will do fine.

// MIT License
//https://github.com/underscorediscovery/haxe-macro-examples | notes.underscorediscovery.com

import haxe.macro.Expr;  
import haxe.macro.Context;

import haxe.crypto.Md5;  
import haxe.Timer.stamp;  
import Math.random;

class BuildID {

        /** Generate a unique enough string */
    public static function unique_id() : String {
        return Md5.encode(Std.string( stamp()*random() ));
    } //unique_id

        /** Generates a unique string id at compile time only */
    macro public static function get() {
        return macro $v{ unique_id() };
    } //get

} //BuildID

Take note of the functions here - one is for generating a string ID at runtime - a regular public static function. You can use this any time from your program. Then, there is a macro function, these are compile time functions and can use the macro context to spit out expressions. I won't dig too much into the specifics of the expressions themselves - but $v{ } generates an expression from a value if it's a primitive type. Our case is a string but this is covered in the Haxe manual if you wanted more insight.

Let's look at what the using code would look like, and the resulting output target javascript code. This class is stand alone and can be used with the Haxe compiler to have a look at the results yourself, using the older documentation here as the new manual is still working on these introductions.

Basically to use this example at home, run haxe build.hxml from the compile_time_buildid/ folder of the repo.

// MIT License
//https://github.com/underscorediscovery/haxe-macro-examples | notes.underscorediscovery.com

class TestID {

    public static var build_id : String = BuildID.get();

    public function new() {
        trace( build_id );
        trace( 'running build ${build_id}' );
    } //new

        //called automatically as the entry point
    static function main() {
        new TestID();
    } //main

} //TestID


resulting output

There - now we have a unique value for the ID. The ... represents some haxe specifics that aren't useful in this example but notice how the build id is hardcoded into the output source file. This value will change with every build you run.

(function () { "use strict";
var TestID = function() {
    console.log(TestID.build_id);
    console.log("running build " + TestID.build_id);
};
TestID.main = function() {
    new TestID();
};
TestID.build_id = "cf30a1a97db5628b91535dfd3a972ea6";
TestID.main();
})();

even more hardcoded

Notice console.log(TestID.build_id); and the line below it? This is printing the value of a variable called build_id. It's a fixed value because it's hardcoded into the file, but then why do we even need the variable access when we could replace every mention of TestID.build_id with the exact id string? Haxe allows this too, using inline static access.

Let's change:
public static var build_id : String = BuildID.get();
to:
public inline static var build_id : String = BuildID.get();

For strings this is not that great, since it will generate a lot more strings, but for numbers, constants and the like it can really cut out a lot of code and even optimize the output significantly by doing away with superfluous values at compile time.

Now that we have changed it to inline, this is the output - every mention of the variable build_id is gone and is now hardcoded into the file directly.

(function () { "use strict";
var TestID = function() {
    console.log("10e594bd858844cb16a1577c61309b49");
    console.log("running build " + "10e594bd858844cb16a1577c61309b49");
};
TestID.main = function() {
    new TestID();
};
TestID.main();
})();

code example

The complete code for the above can be found here for convenience :

Github Repository

Links and tutorials

The official Haxe manual
Getting better all the time, this is the definitive guide though quite meaty and requires a good couple of passes before things make sense. Learn from simple examples and practicing, use the manual as a reference.

Andy Li: Everything in Haxe is an expression
This guide is really helpful for understanding that everything is an expression in haxe. This made macros make a lot more sense for me.

Mark Knol: custom autocompletion with macros
An example of using macros to populate code for the compiler so it can code complete things that it wouldn't be able to otherwise.

Mark Weber: lots of simple macro snippet examples
A simple and useful reference for getting some ideas and introduction to the concepts behind the macro code, the different macro contexts and their uses.

Dan Korostelev writes about macros for tolerant JSON code
A look at "Using haxe macros as syntax-tolerant, position-aware json parser" with example code.

Lots of these blogs and links include many great posts about haxe, and there are many more online if you search.

Good luck - and I hope to post more about haxe macros specifically in the near future as well.

<![CDATA[Shaders : second stage]]>

https://notes.underscorediscovery.com/shaders-second-stage/ (Thu, 11 Sep 2014 12:17:00 GMT)

The second part in a series on understanding shaders, covering how data gets sent between shaders and your app, how shaders are created and more.

Other parts:
- here is part one
- you are viewing part two

I wrote a post about shaders recently - it was a primer, a "What the heck are shaders?" type of introduction. You should read it if you haven't, as this post is a continuation of the series. This article is a little deeper down the rabbit hole, a bit more technical but also a high level overview of how shaders are generally made, fit together, and communicated with.

As before, this post will reference WebGL/OpenGL specific shaders but this article is by no means specific to OpenGL - the concepts apply to many rendering APIs.

I was overwhelmed by the positive response and continued sharing of the article, and I want you to know that I appreciate it.

brief “second stage” overview

This article will cover the following topics:

  • The road from text source code to active GPU program
  • Communication between each stage, and to and from your application

How shaders are created

Most rendering APIs share a common pattern when it comes to programming the GPU. The pattern consists of the following :

  • Compile a vertex shader from source code
  • Compile a fragment shader from source code*
  • Link them together, this is your shader program
  • Use this program ID to enable the program

*Intentionally keeping it simple, there are other stages etc. This series is for those learning and that is ok.

For a simple example, have a look at how WebGL would do it. I am not going to get TOO specific about it, just show the process in a real world use case.

There are some implied variables here: vertex_stage_source and fragment_stage_source are assumed to contain the shader code itself.

1 - Create the stages first

var vertex_stage = gl.createShader(gl.VERTEX_SHADER);  
var fragment_stage = gl.createShader(gl.FRAGMENT_SHADER);  

2 - Give the source code to each stage

gl.shaderSource(vertex_stage, vertex_stage_source);  
gl.shaderSource(fragment_stage, fragment_stage_source);  

3 - Compile the shader code, this checks for syntax errors and such.

gl.compileShader(vertex_stage);
gl.compileShader(fragment_stage);

Now we have the stages compiled, we link them together to create a single program that can be used to render with.

   //this is your actual program you use to render
var the_shader_program = gl.createProgram();

   //It's empty though, so we attach the stages we just compiled
gl.attachShader(the_shader_program, vertex_stage);  
gl.attachShader(the_shader_program, fragment_stage);

   //Then, link the program. This will also check for errors!
gl.linkProgram(the_shader_program);

Finally, when you are ready to use the program you created, you simply use it :

gl.useProgram(the_shader_program);

Simple complexity

This seems like a lot of code for something so fundamental, and it can be a lot of boilerplate but remember that programming is built around the concept of repeating tasks. Make a function to generate your shader objects and your boilerplate goes away, you only need to do it once. As long as you understand how it fits together, you are in full control of how much boilerplate you have to write.

Pipeline communications

As discussed in part one - the pipeline for a GPU program consists of a number of stages that are executed in order, feeding information from one stage to the next and returning information along the way.

The next most frequent question I come across when dealing with shaders, is how information travels between your application and between the different stages.

The way it works is a little confusing at first, it's very much a black box. This confusion is also amplified by "built in" values that magically exist. It's even more confusing because there are deprecated values that should never be used - in every second article. So when someone shows you "the most basic shader" it's basically 100% unknowns at first.

Aside from these things though, like the rest of the shading pipeline, a lot of it is very simple in concept and likely something you will grasp pretty quickly.

Let's start with the built in values, because these are the easiest.

Built in functions

All shader languages have built in language features to complement programming on the graphics hardware. For example, GLSL has a function called mix, which is a linear interpolation function (often called lerp) and is very useful in programming on the GPU. What I mean is that you should look these up. Depending on your platform/shader language, there are many functions that may be new concepts to you, as they don't really occur by default in other disciplines.

Another important note about the built in functions - these functions are often handled by the graphics hardware intrinsically, meaning that they are optimized and streamlined for use. Barring any wild driver bugs or hardware issues, these are often faster than rolling your own code for the functions they offer - so you should familiarize yourself with them before hand-writing small maths functions and the like.
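
For reference, the linear interpolation that mix performs can be written out by hand. This is a small C++ sketch of the formula GLSL documents for mix(x, y, a), shown for scalars only and purely for illustration; in a shader you would call the built in.

```cpp
// Linear interpolation, following the documented GLSL formula for mix(x, y, a):
// a = 0 returns x, a = 1 returns y, and values in between blend the two.
inline float mix(float x, float y, float a) {
    return x * (1.0f - a) + y * a;
}
```

For example, mix(0.0f, 10.0f, 0.25f) is a quarter of the way from 0 to 10.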

Built in variables

Built in variables are different to the functions, they store values from the state of the program/rendering pipeline, rather than operating on values. A simple example would be when you are creating a pixel shader, gl_FragCoord exists, and contains the window-relative coordinates of the current fragment. As with the function list, they are often documented and there are many to learn, so don't worry if there seem to be a lot. You learn about them and use them only when you need to in practice. Every shader programmer I know remembers a subset by heart and has a reference on hand at all times.

These values are implicit connections between the pipeline and code you write.

staying on track

To avoid the “traps” of deprecated functions, as with any API for any programming language, you just have to read the documentation. It's the same principle as targeting bleeding edge features - you check the status in the API level you want to support, you make sure your requirements are met, and you avoid things that are clearly marked as deprecated for that API level and above. It's irrelevant that they were changed and swapped before that - focus only on what you need, and forget its history.

Edit: Since posting these, an amazing resource has come up for figuring out the availability and usage of OpenGL features http://docs.gl/

Most APIs provide really comprehensive "quick reference" sheets, jam packed with every little detail you would need to know, including version, deprecation, and signatures. Below are some examples from the OpenGL 4.4 quick reference card.

[image: OpenGL 4.4 built in variables]

[image: OpenGL 4.4 built in functions]

Information between stages

Also mentioned in part one, the stages can and do send information to the next stage.

Over the years, newer versions of the APIs were released that drastically improved on the initial, confusing names. This means that, across major versions of APIs, you will come across multiple approaches for the same thing.

Remember : The important thing here is the concepts and principles. The naming/descriptions may be specific but it's simply to ground the concept in an existing API. These concepts apply in other APIs, and differ only in use, rather than concept.

stage outputs

vertex stage

In OpenGL, the concept of the vertex shader sending information to the fragment shader was named varying. To use it, you would:

  • create a named varying variable inside of the vertex shader
  • create the same named varying variable inside of the fragment shader

This allowed OpenGL to know that you meant "make this value available in the next stage, please". In other shader languages the same concept applies, where explicit connections are created, by you, to signify outputs from the vertex shader.

An implicit connection exists for gl_Position which you return to the pipeline for the vertex position.

In newer OpenGL versions, these were renamed to out:

out vec2 texcoord;  
out vec4 vertcolor;  

fragment stage

We already saw that the fragment shader uses gl_FragColor as an output to return the color. This is an implicit connection. In newer GL versions, out is used in place of gl_FragColor:

out vec4 final_color;  

It can also be noted that there are other built in variables (like gl_FragColor) that are outputs. These feed back into the pipeline. One example is the depth value, it can be written to from the fragment shader.

stage inputs

Also in OpenGL, you would "reference" the variable value from the previous stage using varying or, in newer APIs, in. This is an explicit connection, as you are architecting the shader.

in vec2 texcoord;  
in vec4 vertcolor;  

The second type of explicit input connections are between your code and the rendering pipeline. These are set through API functions in the application code, and submit data through them, for use in the shaders.

In the OpenGL API, these were named uniform, attribute and sampler among others. attribute is vertex specific, and sampler is most commonly used in the fragment stage. In newer OpenGL versions these can take on the form of more expressive structures, but for the purpose of concept, we will only look at the principle :

vertex stage

Attributes are handed into the shader from your code, into the first stage :

attribute vec4 vertex_position;  
attribute vec2 vertex_tcoord;  
attribute vec4 vertex_color;  

This stage can forward that information to the fragments, modified, or as is.

The vertex stage can take uniforms as well. The difference is in how they work: an attribute has its own value for each vertex, while a uniform holds the same value for every vertex in the draw.

fragment stage

uniform vec4 tint_color;  
uniform float radius;  
uniform vec2 screen_position;  

Notice that these variables are whatever I want them to be. I am making explicit connections from my code, like a game, into the shader. The above example could be for a simple lantern effect, lighting up a radius area, with a specific color, at a specific point on screen.

That is application domain information, submitted to the shader, by me.
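To make the lantern idea a little more concrete, here is a rough sketch of what the fragment stage could do with those three uniforms. It is plain JavaScript running on the CPU, purely to show the logic; on the GPU this would be GLSL, and `fragCoord` stands in for gl_FragCoord. The function name `lanternFragment` and the linear falloff are assumptions made for the example, not part of any API.

```javascript
// CPU-side illustration only. On the GPU, this logic would live in the
// fragment shader and run once for every fragment.
// `fragCoord` stands in for gl_FragCoord; `uniforms` mirrors the three
// uniform declarations above. lanternFragment is a hypothetical name.
function lanternFragment(fragCoord, uniforms) {
    var dx = fragCoord.x - uniforms.screen_position.x;
    var dy = fragCoord.y - uniforms.screen_position.y;
    var distance = Math.sqrt(dx * dx + dy * dy);

    // assumption: light fades linearly, from 1 at the center
    // to 0 at the edge of the radius (and stays 0 beyond it)
    var strength = Math.max(0, 1 - distance / uniforms.radius);

    // scale every channel of the tint by the light strength
    return uniforms.tint_color.map(function (channel) {
        return channel * strength;
    });
}

// the same values for every fragment: that is what makes them uniforms
var lantern = {
    tint_color: [1.0, 0.5, 0.0, 1.0],
    radius: 100.0,
    screen_position: { x: 200.0, y: 200.0 }
};

var atCenter = lanternFragment({ x: 200, y: 200 }, lantern);
var halfWay  = lanternFragment({ x: 230, y: 240 }, lantern);
var outside  = lanternFragment({ x: 500, y: 200 }, lantern);
```

Only `fragCoord` changes from fragment to fragment here. Everything else comes from the application and stays fixed for the whole draw.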

Another explicit type of connection is a sampler. Images on the graphics card are sampled and can be read inside of the fragment shader. Take note that the value passed in is not the texture ID, it's not the texture pointer, it is the active texture slot. Texturing is usually a state, like use this shader, then use this texture and then draw. The texture slot, allows multiple textures to co-exist, and be used by the shaders.

  • set active slot 0
  • bind texture A
  • set active slot 1
  • bind texture B

The sampler value tells the shader which slot to read from, and the shader will always use whichever texture is bound to that slot!
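In WebGL terms, the texture slot steps could look roughly like the sketch below. The gl calls are standard WebGL, but the helper name `bindTexturesToSlots` and the shape of the `textures` list are assumptions made for the example. The location lookup is done inline here only to keep the sketch short.

```javascript
// Sketch only: `gl` is a WebGLRenderingContext, `program` a linked program.
// `textures` is a hypothetical list of { texture, samplerName } entries,
// where the position in the list is used as the slot number.
function bindTexturesToSlots(gl, program, textures) {
    // uniform values are set on the program that is currently in use
    gl.useProgram(program);

    textures.forEach(function (entry, slot) {
        // select the slot first, then bind: bindTexture always affects
        // whichever slot is active at the time
        gl.activeTexture(gl.TEXTURE0 + slot);
        gl.bindTexture(gl.TEXTURE_2D, entry.texture);

        // the sampler uniform receives the slot number,
        // not the texture object itself
        var location = gl.getUniformLocation(program, entry.samplerName);
        gl.uniform1i(location, slot);
    });
}
```

A call like `bindTexturesToSlots(gl, program, [{ texture: textureA, samplerName: "tex0" }, { texture: textureB, samplerName: "tex1" }])` would leave texture A readable through `tex0` and texture B through `tex1`.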

fundamental shaders

The most basic shaders you will come across simply take information, and use it to present the information as it is. Below, we can look at how "default shaders" would fit together, based on the knowledge we now have.

This will be using WebGL shaders again, for reference only. These concepts are described above, so they should hopefully make sense now.

As you recall - geometry is a set of vertices. Vertices hold (in this example) :

  • a color
  • a position
  • a texture coordinate

These are per vertex values that describe the geometry, so they will go into vertex attributes and be sent to the vertex stage.

The texture itself, is color information. It will be applied in the fragment shader, so we pass the active texture slot we want to use, as a shader uniform.

The other information in the shader below, is for camera transforms, these are sent as uniforms because they are not vertex specific data. They are just data that I want to use to apply a camera.

You can ignore the projection code for now, as this is simply about moving data around from your app, into the shader, between shaders, and back again.

Basic Vertex shader

//vertex specific attributes, for THIS vertex

attribute vec3 vertexPosition;  
attribute vec2 vertexTCoord;  
attribute vec4 vertexColor;

//generic data = uniforms, the same between each vertex!
//this is why the term uniform is used, it's "fixed" between
//each fragment, and each vertex that it runs across. It's 
//uniform across the whole program.

uniform mat4 projectionMatrix;  
uniform mat4 modelViewMatrix;

//outputs, these are sent the next stage.
//they vary from vertex to vertex, hence the name.

varying vec2 tcoord;  
varying vec4 color;

void main(void) {

        //work out the position of the vertex, 
        //based on its local position, affected by the camera

    gl_Position = projectionMatrix * 
                  modelViewMatrix * 
                  vec4(vertexPosition, 1.0);

        //make sure the fragment shader is handed the values for this vertex

    tcoord = vertexTCoord;
    color = vertexColor;

}


Basic fragment shader

If we have no textures, only vertices, like a rectangle that only has a color, this is really simple :


//WebGL fragment shaders must declare a default float precision

precision mediump float;

//make sure we accept the values we passed from the previous stage

varying vec2 tcoord;  
varying vec4 color;

void main() {

        //return the color of this fragment based on the vertex 
        //information that was handed into the varying value!

        // in other words, this color can vary per vertex/fragment

    gl_FragColor = color;

}



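If we do have a texture, we read its color through the sampler and combine it with the vertex color :

   //WebGL fragment shaders must declare a default float precision
precision mediump float;

   //from the vertex shader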
varying vec2 tcoord;  
varying vec4 color;

   //sampler == texture slot
   //these are named anything, as explained later

uniform sampler2D tex0;

void main() {

        //use the texture coordinate from the vertex, 
        //passed in from the vertex shader,
        //and read from the texture sampler, 
        //what the color would be at this texel
        //in the texture map

    vec4 texcolor = texture2D(tex0, tcoord);

        //crude colorization using modulation,
        //use the color of the vertex, and the color 
        //of the texture to determine the fragment color

    gl_FragColor = color * texcolor;

}


Binding data to the inputs

Now that we know how inputs are sent and stored, we can look at how they get connected from your code. This pattern is very similar again, across all major APIs.

finding the location of the inputs

There are two ways :

  1. Set the attribute name to a specific location OR
  2. Fetch the attribute/uniform/sampler location by name

This location is a shader program specific value, assigned by the compiler. You have control over the assignments by name, or, by forcing a name to be assigned at a specific location.

Put in simpler terms :

“radius”, I want you to be at location 0.
Compiler, where have you placed “radius”?

If you use the second way, requesting the location, you should cache this value. You can request all the locations once you have linked your program successfully, and reuse them when assigning values to the inputs.
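Both ways could be sketched in WebGL as below. The gl calls are standard WebGL, while the helper names `forceAttribLocation` and `cacheLocations` are hypothetical. One thing worth knowing: in WebGL 1 only attributes can be placed at a location of your choosing, uniform locations are always requested from the linked program.

```javascript
// Sketch only: `gl` is a WebGLRenderingContext and `program` is a program
// object with both shaders attached. The helper names are hypothetical.

// Way 1: tell the linker where to put an attribute. This has to be called
// before gl.linkProgram(program), because it only takes effect at link time.
function forceAttribLocation(gl, program, name, location) {
    gl.bindAttribLocation(program, location, name);
}

// Way 2: ask where things ended up, once, after a successful link,
// and keep the answers around for later.
function cacheLocations(gl, program, attribNames, uniformNames) {
    var locations = { attribs: {}, uniforms: {} };

    attribNames.forEach(function (name) {
        locations.attribs[name] = gl.getAttribLocation(program, name);
    });

    uniformNames.forEach(function (name) {
        locations.uniforms[name] = gl.getUniformLocation(program, name);
    });

    return locations;
}
```

After linking, something like `var locations = cacheLocations(gl, program, ["vertexPosition"], ["radius"])` means every later draw can use `locations.uniforms.radius` without asking the program again.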

Assigning a value to the inputs

This is often application language specific, but again the principle is universal : The API will offer a means to set a value of an input from code.

vertex attributes

Let's use WebGL as an example again, and let's use a single attribute, for the vertex position, to locate and set the position.

var vertex_pos_loc = gl.getAttribLocation(the_shader_program, "vertexPosition");  

Notice the name? I am asking the compiler where it assigned the named variable I declared in the shader. Now we can use that location to give it some array of vertex position data.

First, because we are going to use attribute arrays, we want to enable them. If you read this code in simple terms, it says "enable a vertex attribute array for location", where location refers to "vertexPosition".

gl.enableVertexAttribArray(vertex_pos_loc);

To focus on what we are talking about here, some variables are implied :

   //this simply sets the vertex buffer (list of vertices) 
   //as active, so subsequent commands use this buffer

gl.bindBuffer( gl.ARRAY_BUFFER, rectangle_vertices_buffer );

   //and this line points the buffer to the location, or "vertexPosition"
   //3 floats are read per vertex, because vertexPosition is a vec3

gl.vertexAttribPointer(vertex_pos_loc, 3, gl.FLOAT, false, 0, 0);  

There, now we have:

  • taken a list of vertex positions, stored them in a vertex buffer
  • located the vertexPosition variable location in the shader
  • enabled attribute arrays, because we are using arrays
  • we set the buffer as active,
  • and finally pointed our location to this buffer.

What happens now, is the vertexPosition value in the shader, is associated with the list of vertices from the application code. Details on vertex buffers are well covered online, so we will continue with shader specifics here.
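The last two arguments of gl.vertexAttribPointer are the stride and the offset, both in bytes. They start to matter once a single buffer holds more than one attribute per vertex. As a rough sketch, assuming the position, texture coordinate and color of each vertex are stored next to each other (interleaved) in one buffer, the setup could look like this. The `layout` description and both helper names are assumptions made for the example.

```javascript
var FLOAT_BYTES = 4; // one gl.FLOAT takes up 4 bytes

// hypothetical layout description: the name in the shader,
// and how many floats it uses per vertex
var layout = [
    { name: "vertexPosition", size: 3 },
    { name: "vertexTCoord",   size: 2 },
    { name: "vertexColor",    size: 4 }
];

// work out the byte size of one whole vertex (the stride), and where
// inside the vertex each attribute starts (the offset)
function describeLayout(layout) {
    var offset = 0;
    var attribs = layout.map(function (attrib) {
        var described = { name: attrib.name, size: attrib.size, offset: offset };
        offset += attrib.size * FLOAT_BYTES;
        return described;
    });
    return { stride: offset, attribs: attribs };
}

// point every attribute at its own slice of the interleaved buffer
function bindInterleaved(gl, program, buffer, layout) {
    var info = describeLayout(layout);
    gl.bindBuffer(gl.ARRAY_BUFFER, buffer);

    info.attribs.forEach(function (attrib) {
        var location = gl.getAttribLocation(program, attrib.name);
        // -1 means the shader has no active attribute with this name
        if (location === -1) { return; }

        gl.enableVertexAttribArray(location);
        gl.vertexAttribPointer(location, attrib.size, gl.FLOAT, false,
                               info.stride, attrib.offset);
    });
}
```

With this layout each vertex takes 36 bytes, the texture coordinate starts 12 bytes in, and the color starts 20 bytes in.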

uniform values

As with attributes, we need to know the location.

var radius_loc = gl.getUniformLocation(the_shader_program, "radius");  

As this is a simple float value, we use gl.uniform1f. This varies by API in syntax, but the concept is the same across the APIs.

gl.uniform1f(radius_loc, 4.0);  

This tells OpenGL that the uniform value for "radius" is 4.0, and we can call this multiple times to update it each render.
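Putting this together for the lantern uniforms from earlier (tint_color, radius and screen_position), a per frame update could be sketched as follows. The gl calls are standard WebGL. The helper name `setLanternUniforms`, the `locations` object of cached locations and the shape of the `lantern` data are assumptions made for the example.

```javascript
// Sketch only: `locations` holds uniform locations that were requested once
// after linking, `lantern` is hypothetical application data.
function setLanternUniforms(gl, program, locations, lantern) {
    // uniform values are set on the program that is currently in use,
    // so make sure it is the right one before assigning anything
    gl.useProgram(program);

    // the function name matches the type in the shader:
    // float -> uniform1f, vec2 -> uniform2f, vec4 -> uniform4f
    gl.uniform1f(locations.radius, lantern.radius);
    gl.uniform2f(locations.screen_position, lantern.x, lantern.y);
    gl.uniform4f(locations.tint_color,
                 lantern.tint[0], lantern.tint[1],
                 lantern.tint[2], lantern.tint[3]);
}
```

Calling this once per render with fresh values, say the mouse position for `lantern.x` and `lantern.y`, is enough to move the effect around the screen.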


As this article was already getting quite long, I will continue further in the next part.

As much of this is still understanding the theory, it can seem like a lot to get around before digging into programming actual shaders, but remember there are many places to have a look at real shaders, and try to understand how they fit together :

Playing around with shaders : recap
Here are some links to some sandbox sites where you can see examples, and create your own shaders with minimal effort directly in your browser.


An important factor here is understanding what your framework is doing to give you access to the shaders, which allows you to interact with the framework in more powerful ways. Like drawing a million particles in your browser - passing information through textures, encoding values into color information and vertex attributes.

The delay between post one and two was way too long, as I have been busy, but the next two posts are hot on the heels of this one.

Tentative topics for the next posts :

shaders stage three

  • A brief discussion on architectural implications of shaders, or "How do I fit this into a rendering framework" and "How to do more complex materials".
  • Understanding and integrating new shaders into existing shader pipelines
  • Shader generation tools and their output

shaders stage four

  • Deconstructing a shader with a live example
  • Constructing a basic shader on your own
  • A look at a few frameworks' shader approaches
  • series conclusion

Follow ups

If you would like to suggest a specific topic to cover, or know when the next installment is ready, you can subscribe to this blog (top right of post), or follow me on twitter, as I will tweet about the articles there as I write them. You can find the rest of my contact info on my home page.

I welcome asking questions, sending feedback and suggesting topics.
As before, I hope this helps someone in their journey, and look forward to seeing what you create.

Primer : Shaders

https://notes.underscorediscovery.com/shaders-a-primer/
Thu, 03 Apr 2014 22:13:35 GMT

A common theme I run into when talking to some developers is that they wish they could wrap their head around shaders. Shaders always seem to solve a lot of problems, and are often referenced as the solution to the task at hand.

But just as often they are seen as a sort of enigma or black box - one that is so shrouded in complexity that it makes learning them from “basic” examples near impossible.

Hopefully, this primer will help those that aren't well versed and help transition into using shaders, where applicable.

Other parts:
- you are viewing part one
- here is part two

What are shaders?

When you draw something on screen, it is generally submitted as some “geometry”. Like, a polygon or a group of triangles. Even drawing a sprite, is drawing some geometry with an image applied.

Geometry is a set of points (vertices) describing the layout which is sent to the graphics card for drawing. A sprite, like a player or a platform is usually a “quad”, and is often sent as two triangles arranged in a rectangle shape.

When you send geometry to the graphics card to be drawn, you can tell the graphics card to use custom shaders that will be applied to the geometry, before it shows up on the render.

There are two kinds of shaders to understand for now - vertex and fragment shaders. You can think of a shader as a small function that is run over each vertex, and every fragment (a fragment is like a pixel) when rendering. If you look at the code for a shader, it would resemble a regular function :

void main() {  
   //this code runs on each fragment, or vertex.
}

It should be noted as well that the examples below reference the OpenGL Shading Language, referred to as GLSL, but the concepts apply to the programmable pipeline in general and are not for any specific rendering API. This information applies to almost any platform or API.

The vertex shader

As mentioned, there are vertices sent to the hardware to draw a sprite. Two triangles - and each triangle has 3 vertices, making a total of 6 vertices sent to be drawn.

When these 6 vertices reach the rendering pipeline in the hardware, there is a small program (a shader) that can run on each and every vertex. Remember the graphics hardware is built for this, so it does many of these at once in parallel so it is really fast.

That program really only cares about one thing : the position that the vertex will end up at (there is a footnote in the conclusion). This means that we can manipulate (or calculate) the correct position for the vertex. Very often this includes camera calculations, and it determines how and where the vertex ends up before being drawn.

Let's visualise this below, by shifting the sprite 10 units to the left :

vertex shader

If you wanted to, you could apply sine waves, or random noise or any number of calculations on a per vertex level to manipulate the geometry.
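To see the "small function run over each vertex" idea in code, here is a rough simulation in plain JavaScript. It runs on the CPU and is only meant to show the shape of the idea; a real vertex shader is written in a shader language and the hardware runs it on many vertices in parallel. All of the names, and the wave values, are made up for the example.

```javascript
// CPU-side illustration only: on the GPU the "vertex shader" functions
// below would be shader code, run in parallel by the hardware.

// a sprite quad as two triangles, 6 vertices in total
var quad = [
    { x: 0, y: 0 }, { x: 100, y: 0 },   { x: 100, y: 100 },
    { x: 0, y: 0 }, { x: 100, y: 100 }, { x: 0,   y: 100 }
];

// the vertex stage: run the same small function over each vertex
function runVertexStage(vertices, vertexShader) {
    return vertices.map(function (vertex) {
        return vertexShader(vertex);
    });
}

// shift every vertex 10 units to the left
function shiftLeft(vertex) {
    return { x: vertex.x - 10, y: vertex.y };
}

// push vertices up and down along a sine wave
// (the 0.05 frequency and 5 unit height are arbitrary values)
function makeWave(time) {
    return function (vertex) {
        return {
            x: vertex.x,
            y: vertex.y + Math.sin(vertex.x * 0.05 + time) * 5
        };
    };
}

var shifted = runVertexStage(quad, shiftLeft);
var waved   = runVertexStage(quad, makeWave(0.0));
```

Note that each vertex is handled on its own, with no knowledge of its neighbours. That independence is what lets the hardware process so many of them at once.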

Practical example
This can be used to generate waves that work to move vertices according to patterns that look like lava or water. All of the following examples were provided by Tyler Glaiel from Bombernauts

The lava (purple) area geometry, bunches of vertices!

Lava Area

How it looks when a vertex shader moves it around (notice how the vertices are pushed up and down and around like water, this is the vertex shader at work)

You can have a look at how it looks when it ripples on the blog post here, at the Bombernauts development blog.

The fragment shader

After the vertices are done moving about, they are sent to the next stage of the pipeline to be "rasterized", which means converted into fragments that end up as pixels on screen.

During rasterization the geometry is broken up into fragments, and each fragment is given to the fragment shader. These are also sometimes referred to as pixel shaders, because some people associate the fragments with pixels on screen, but there is a difference.

Here is a gif from an excellent presentation on Acko.net which usefully demonstrates how sampling works, which is part of the rasterization process. It should help understand how the vector geometry becomes pixels in the end.


Now, the fragment shader, much like the vertex shader, is run on every single fragment. Again, it is good at doing this really quickly, but it is important to understand that a single line of code in a shader can cause drastic performance cost due to the sheer number of times the code will be run! (See the note at the end of this section for some interesting numbers).

The fragment shader mainly cares about what the resulting color of the fragment becomes. The values it receives are also interpolated (blended) from each vertex, based on the fragment's location between them. Let's visualize this below :

fragment shader

When I say interpolated, here is what I mean : Given a rectangle with 4 corners (arranged as 2 triangles) and the corner vertices colors set to red, green, blue and white - the result is a rectangle that is blended between the colors automatically.

Interpolated colors sourced from open.gl
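For the curious, the blending itself can be sketched in a few lines. This is plain JavaScript on the CPU and only an illustration: the hardware does this for you between the vertex and fragment stages, and real hardware also corrects for perspective, which is ignored here. The function names are made up for the example.

```javascript
// CPU-side illustration only, perspective correction is ignored.

// how much each corner (a, b, c) contributes at point p, as three weights
// that add up to 1 (these are known as barycentric coordinates)
function cornerWeights(p, a, b, c) {
    var det = (b.y - c.y) * (a.x - c.x) + (c.x - b.x) * (a.y - c.y);
    var wa = ((b.y - c.y) * (p.x - c.x) + (c.x - b.x) * (p.y - c.y)) / det;
    var wb = ((c.y - a.y) * (p.x - c.x) + (a.x - c.x) * (p.y - c.y)) / det;
    return [wa, wb, 1 - wa - wb];
}

// blend the corner colors of one triangle at point p
function interpolateColor(p, triangle) {
    var w = cornerWeights(p, triangle[0].pos, triangle[1].pos, triangle[2].pos);
    return [0, 1, 2].map(function (channel) {
        return w[0] * triangle[0].color[channel] +
               w[1] * triangle[1].color[channel] +
               w[2] * triangle[2].color[channel];
    });
}

var triangle = [
    { pos: { x: 0, y: 0 }, color: [1, 0, 0] }, // red
    { pos: { x: 1, y: 0 }, color: [0, 1, 0] }, // green
    { pos: { x: 0, y: 1 }, color: [0, 0, 1] }  // blue
];

// exactly on the red corner: fully red
var atRedCorner = interpolateColor({ x: 0, y: 0 }, triangle);

// half way along the edge from red to green: an even mix of both
var redToGreen = interpolateColor({ x: 0.5, y: 0 }, triangle);
```

The closer a fragment is to a corner, the more of that corner's value it gets. The same blending applies to anything handed from the vertex stage to the fragment stage, not just colors.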

Practical example
A fragment shader can be used to blur some or all of the screen before drawing it, like in this example, some blur was applied to the map screen below the UI to obtain a tilt shift effect. This is from a game I was working on for a while, and the tilt shift shader came from Martin Jonasson.
For the curious, here is the source for the tilt shift shader along with some notes about separating the x and y passes for a blur, since that has come up a bunch.

tilt shift

An important note on numbers

A game rendered at 1080p, a resolution of 1920x1080 pixels, would be 1920 * 1080 = 2,073,600 pixels.

That is per frame - usually games run at 30 or 60 frames per second. That means (1920 x 1080) x 60 for one second of time, which is a total of 124,416,000 pixels each second. And that is for a single frame buffer - games usually have multiple buffers as well, for special effects and all kinds of rendering needs.

This is important because you can do a lot with fragment shaders, especially because the hardware is exceptionally good at it. But when you are chasing performance problems, it can often come down to how quickly the hardware can process the fragments, and shaders can easily become a bottleneck if you aren't paying attention.
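The numbers above are easy to play with in code. This small sketch repeats the same arithmetic; the function names and the "three buffers" figure are just assumptions for the example.

```javascript
// pixels in one full screen buffer
function pixelsPerFrame(width, height) {
    return width * height;
}

// how many pixels get shaded each second, if every pixel of every
// buffer is shaded exactly once per frame
function pixelsPerSecond(width, height, framesPerSecond, buffers) {
    return pixelsPerFrame(width, height) * framesPerSecond * buffers;
}

var oneFrame  = pixelsPerFrame(1920, 1080);          // 2,073,600
var oneSecond = pixelsPerSecond(1920, 1080, 60, 1);  // 124,416,000

// assumption: 3 full screen buffers, to show how quickly it adds up
var withEffects = pixelsPerSecond(1920, 1080, 60, 3); // 373,248,000
```

Treat these as a lower bound. Overlapping geometry means the same pixel can be shaded more than once in a frame.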

Playing around with shaders

Playing with shaders can be fun, here are some links to some sandbox sites where you can see examples, and create your own shaders with minimal effort directly in your browser.



Recap : Shaders are applied in a program that consists of parts, and apply when enabled to geometry, when submitted to be drawn.

Vertex shaders : first, applied to every vertex when enabled, each render, and mainly care about the end position of the vertex.

Fragment shaders : second, applied to every fragment when enabled, each render, and mainly care about the resulting color of the fragment.

Because shaders are so versatile, there are many many things that you can do with them. From complex 3D lighting algorithms down to simple image distortion or coloring, you can do a huge range of things with the rendering pipeline.

Hopefully this post has helped you better understand shaders, and lets you explore the possibilities without being completely confused by what they are and how they work going into it.

It should be said there is more that you can do with vertex shaders, like vertex colors and uv coordinates, and there is a lot more you can do with fragment shaders as well but to keep this post a primer, that is for a future post.

Notes on the term “Shaders”
The term “Shader” is often called out as a bit of a misnomer (but only sort of), so be aware of the differences. This post is really about the “programmable pipeline”, as mentioned in bold really early on. The pipeline has stages, and for certain stages you can run some code of your own. A GPU program is made up of code from each programmable stage (vertex, fragment, etc), compiled into a single unit and then run over the entire pipeline while geometry is submitted for drawing, if that program is enabled.

Each stage does a little communicating between the stages (like the vertex stage hands the vertex color to the fragment stage), and the vertex and fragment stages are the most important to understand first.
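As a rough idea of what "compiled into a single unit" looks like in practice, here is a sketch using WebGL as the example API. The gl calls are standard WebGL; the helper names `compileStage` and `buildProgram` are made up for the example, and other APIs will differ in the details.

```javascript
// Sketch only: `gl` is a WebGLRenderingContext, and the two sources
// are strings containing the shader code for each stage.

// compile the code for one programmable stage
function compileStage(gl, type, source) {
    var shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);

    if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
        var log = gl.getShaderInfoLog(shader);
        gl.deleteShader(shader);
        throw new Error("shader compile failed: " + log);
    }

    return shader;
}

// combine both stages into a single program, the "unit"
function buildProgram(gl, vertexSource, fragmentSource) {
    var program = gl.createProgram();
    gl.attachShader(program, compileStage(gl, gl.VERTEX_SHADER, vertexSource));
    gl.attachShader(program, compileStage(gl, gl.FRAGMENT_SHADER, fragmentSource));

    // linking is where the stages get connected to each other
    gl.linkProgram(program);

    if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
        throw new Error("program link failed: " + gl.getProgramInfoLog(program));
    }

    return program;
}
```

The program only takes effect once it is enabled with `gl.useProgram(program)`, after which it applies to whatever geometry is submitted for drawing.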

I personally feel like the term shader comes from the fact that 99.9% of the time you spend working with the programmable pipeline will be spent on shading things, while the vertex and other stages are often a fraction of the day to day use in your average application or game.
