Purpose
Randomness plays a key roll in many graphics and simulation techniques. Variations in behaviour can make simulations feel more alive and randomised worlds can keep exploration alive in games.
Often a simple call to rand for these pertubations will suffice; perhaps modified by a specific distribution. But what if we wanted a smooth transition between generated values? A keyed function for n dimensions? Infinite and tilable but nonrepeating patterns?
These properties can be satisfied by a coherent noise function which maps an ndimensional vector to a single value with the restriction that small changes changes in inputs have small changes in the output. Conversely large changes in inputs may result in random changes in the output.
Noise functions are a very common component in procedural generation of graphics assets; most commonly textures and simple terrain generation. For example:

natural texture generators for clouds, wood, marble and others are generally a 2D fractal gradient noise

terrain generation in various voxel games like Minecraft, albeit in a modified form.

famously, the terraforming scenes in ‘Star Trek II’ (amongst countless other films)
The typical evaluation procedure is as follows:

map some input point to values through a basis function

combine values from ‘nearby’ integer points


sum a number of successive iterations of the basis function

modify the input frequency and output scaling each iteration


postprocess an output region for consistency/aethestics
I will cover the 3 basis parameters we have implemented for our game ‘Outpost’:

basis functions

interpolation functions

fractals
I am currently evaluating a collection of these algorithms for the procedural terrain components of ‘Outpost’. The remainder of this post may include some level of bias towards heightfield generation.
All examples will be in 2 dimensions for clarity, but can be generalised to arbitrary dimensions with relative ease. But beware of runtime performance in higher dimensions. 
Basis Functions
At the core of a noise algorithm is the basis function. It provides a mapping between the input point and the output value. This mapping must be deterministic, but as with ‘rand` and friends, a `seed’ value may be supplied so we can vary the outputs when needed.
There are two major categories of noise function in popular use: lattice:: Values are generated at points at regular intervals and the nearest values to the input point are interpolated, weight by distance. ‘value’ noise is a simplistic combined By generating values at regular intervals Lattice noise functions include ‘value’ noise, the gradient noise functions ‘Perlin’, ‘simplex’, and ‘wavelet’ noise.
Each basis function comes with efficiency and quality tradeoffs.
I won’t be covering simplex noise due to patent concerns, or wavelet noise for brevity’s sake. Both have better higher dimensional performance, and feature less aliasing, than Perlin noise. Though the difference may not be appreciable for less intensive applications. 
Value Noise
The simplest basis function is to assign a random value to each integral coordinate (the red points). Any deterministic random function can be used.
Evaluation at a fractional coordinate (the blue point) is simple interpolation across each dimension of the surrounding region weighted by the distance to each lattice point.
1: constexpr float lerp (float a, float b, float t); 2: 3: float 4: interpolate (const float values[2][2], vector2f frac) 5: { 6: auto y0 = lerp (v[0][0], v[0][1], frac.x); 7: auto y1 = lerp (v[1][0], v[1][1], frac.x); 8: return lerp (y0, y1, frac.y); 9: }
Acquiring Randomness
A tried and true approach is to use a prepopulated array of ‘random’ values with a known distribution and a second pertubation array of offsets. The loworder bits of the lattice coordinate is used to index into the pertubation array, which is used to index into the random values.
I utilise the integer mixing function of a keyed hash. Given a seed, and the lattice coordinates, I can divide the output by the maximum to get a uniform real.
If using the mixing function approach pay particular attention to any emergence of directional artefacts arising from your choice and usage of mixing function. 
Evaluation
By its nature, this does not allow for the more gradual transitions that coherent noise is generally selected for.
However this approach has high performance, allows straightfward control over the value distribution, and can often be good enough for many usecases.
Perlin Noise
Rather than assigning scalar values to lattice points, as with value noise, Perlin noise (a variant of gradient noise) assigns ndimensional vectors (the red vectors).
Evaluation at any coordinate is by computing the nd distance to each corner, then the dotproduct of each distance with the corner’s gradient. Interpolation of the n values is performed as per value noise.
Randomness can be generated in much the same way as value noise. Simple extensions include: * listing nd values in the value array * repeated perbutation indexing for each dimension * repeated integer mixing for each dimension
The algorithm gives a far more gradual change than value noise, at the cost of increased computational complexity.
The cost of scaling Perlin noise to higher dimensions becomes prohibitive quite rapidly, with complexity of O(pow(2,n)). Simplex noise was developed partially to counter this issue.
Worley/Voronoi Noise
Stepping away from lattice noise, Worley noise (also known as Voronoi or cellular noise) is derived from a distribution of points in ndimensional space.
The value for each query is the distance from the query to the nearest point.
Extensions
Evaluation is not restricted to the closest point. We can generalise the formula as a linear combination of distances to the nearest n points, commonly written as Fn (eg, F2 for second distance, F1F2 for first minus second distances).
The distance metric is not restricted to Euclidean metrics. Manhattan and Chebyshev distances are popular alternatives, giving linear edges tending towards axis aligned.
Interpolation
Value and Perlin noise make extensive use of interpolation across the pow(2,n) points.
Most commonly cubic and quintic polynomials are used (given their specification in the two Perlin papers).
Simple linear interpolation is sufficient for some purposes and gives some amount of evaluation speedup, but tends to display obvious axis aligned artefacts. The extreme of truncated interpolation is all but useless because of it’s obvious grid structure.
Cosine interpolation allows a smooth transition and has interesting gradient properties, but can incur a far higher runtime cost.
 linear

 cubic

 quintic

 cosine

While interpolation functions are an interesting (and excessive) parameter, they don’t appear to increase performance greatly, while potentially introducing significant grid artefacts.
Fractals
The basis functions above could be useful sucessfully by themselves for many simple operations, but their power is greatly increased by combining success evaluations as a fractal. The concept is to provide both high frequency (detail) and low frequency (structure) by summing successive iterations with increasing frequency and decreasing scaling.
 octaves

iteration count
 lacunarity

peroctave position scaling
 gain

peroctave position scaling
I will also reference parameters named ‘frequency’ (the initial lacunarity) and ‘amplitude’ (the initial gain) for implementation flexibility.
Fractional Brownian Motion
Fractional Brownian Motion (abbreviated as FBM) is the simplest and most well known fractal. It is often conflated with the Perlin (or other) basis functions given its near universal promotion.
 1

 2

 3

 4

 5

 6

 7

 8

As with many fractals, FBM evaluates the basis function for a set number of octaves. Each iteration has an increasing multiplier for the evaluated point position, and a decreasing multiplier on the effect of each octave. Both are applied for each evaluation of the basis function, whose results are summed.
1: float 2: fbm::eval (point2f p) const 3: { 4: float total = 0; 5: float scale = m_amplitude; 6: 7: p *= m_frequency; 8: 9: for (unsigned i = 0; i < m_octaves; ++i) { 10: total += m_basis (p) * scale; 11: 12: p *= m_lacunarity; 13: scale *= m_gain; 14: } 15: 16: return total; 17: }
One gotcha in the above implementation is amplification of results near the origin. Given successive evaluations with one input coordinate very close to zero, multiplicative frequency scaling (through lacunarity) will leave the input coordinate very close to it’s original value. Simple fixes include octave dependant seeds, and a small constant input coordinate offset per octave. 
TODO: Add single octave images.
Evaluation
 1

 2

 3

 4

 5

 6

 7

 8

A well defined grid from the initial octave is quite obvious for the first 4 octaves. After which the higher frequency layers soften, but never quite eliminate, the grid artefacts.
This isn’t necessarily a negative for some applications, but can be exceedingly difficult to remove in postprocessing.
 1

 2

 3

 4

 5

 6

 7

 8

The base layer is much smoother than value noise, and the structure does not alter significantly after perhaps 2 octaves. Indeed, Perline noise can appear quite homogenous when compared to value noise at similar scales.
 1

 2

 3

 4

 5

 6

 7

 8

Worley noise appears vastly different to other lattice noise.
Additional visual detail is greatly diminshed after 4 octaves. I tend to select at least 2 octaves less Worley noise than Value or Perlin.
Evaluating Worley noise can be quite expensive, depending on your implementation. Particularly if using higher arity variants (eg, f2f1, f4).
Musgrave’s Models
There are a number of common fractal noise algorithms, specifically designed for terrain, written by Ken Musgrave.
Ridged Multifractal, Hybrid Multifractal, and Heterogenous terrain are the typical examples that can found scattered around the internet. These are all described in detail in the book ‘Texture and Modelling: A Procedural Approach’.
Turblence
Offset terbulence is a simple extension to any noise function where the queried position is randomly perturbed; commonly by a secondary coherent noise function.
float turblence::eval (point2f p) const { vector2f offset { m_perturb[0] (p), m_perturb[1] (p) }; return m_data.eval (p + offset * m_scale); }
The simplest method is to instantiate an additional basis function for each dimension and offset the query using these values.
 0%

 5%

 10%

 15%

 20%

 25%

 400%

 1600%

Experimenting with parameters can prove interesting, with values well above 100% displaying an unexpected effect.
Postprocess
Coherent noise routines don’t stop here. Many applications, such as terrain generation, greatly benefit from postprocessing.
Normalising the output to unit values can simplify display and further processing.

normalisation

guaranteed bounds, but probabilistic.

not amenable to infinite planes


ocean offset

erosion models

gradient colours
Conclusions
Implementing your own coherent noise routines can be an interesting exercise.
While it is as important as ever to carefully evaluate the problem you are attempting to solve, a simple Perlin FBM implementation will get you a long way.