
The Noise Programming Language

A small, expression-based, stochastic language: write random variables like ordinary variables. JIT-compiled, massively parallelizable, blazing fast.

Abstract. In Noise, every value is a probability distribution: a number is the degenerate case, a point mass. Operators lift over random variables and a query such as P(event) estimates a probability by simulation. The result is a small language in which propagating uncertainty and running Monte-Carlo experiments reads like ordinary mathematics while running on a highly optimized engine.

Figure 1. The noise field this language is named for: a fractal-noise surface drawn as ink contours. Move your cursor to disturb it.

Keep scrolling

The best way to learn a language is to watch it unfold in front of you — so keep scrolling. Six small programs, each one idea past the last, animate as you read them, then run for real on the compiled engine in your browser. It has a touch of magic to it: every variable here is a whole wave of possibility, and a query is what collapses it to a single number.

Every value is a distribution

A = 5
D1 ~ rand::unif_int(1, 6)
D2 ~ rand::unif_int(1, 6)
S  = A + D1 + D2
E(S)

Figure 1. One histogram, four ideas. A number is a spike; the tilde spreads it into a die; a constant and two dice joined make a bell; a query reads the result.

A number is a distribution

A = 5 looks like a plain constant — and it is, but Noise sees it as a probability distribution whose every draw lands on 5. A single infinitely-thin spike: the Dirac delta, the degenerate case every other value generalizes.

The tilde spreads it out

D1 ~ rand::unif_int(1, 6) binds a fair die — a discrete uniform spread, all six faces equally likely. The spike fans out into six flat bars. D1 is now genuinely uncertain, yet you write it like any other variable.

Join them — a bell appears

Now add the constant and two independent dice together: S = A + D1 + D2 — five, plus a roll, plus a roll. The flat shapes convolve into a triangle peaked at 12 — already curving toward a bell. Join a few more and it sharpens into a true Gaussian: the central limit theorem, for free. (Noise also ships normal, poisson, bernoulli and more for the same ~ slot.)

A query collapses millions of draws

Nothing actually runs until you ask. E(S) — the expected value — fires a Monte-Carlo pass: millions of rolls of both dice, averaged into one honest number. By the law of large numbers the running mean settles on 12. Two lines of math; an expert kernel underneath.
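As a cross-check outside Noise, the same expectation can be reproduced in plain Python. This is only a sketch of what the query estimates, not of how the Noise engine computes it; the function names `exact_mean` and `simulated_mean` are made up for illustration.

```python
import random
from itertools import product


def exact_mean(constant=5, faces=6):
    """Exact E(S) for S = constant + D1 + D2, by enumerating every pair of faces."""
    totals = [constant + d1 + d2
              for d1, d2 in product(range(1, faces + 1), repeat=2)]
    return sum(totals) / len(totals)


def simulated_mean(draws=200_000, seed=0, constant=5, faces=6):
    """Monte-Carlo estimate of the same mean: average many independent rolls."""
    rng = random.Random(seed)
    total = 0
    for _ in range(draws):
        total += constant + rng.randint(1, faces) + rng.randint(1, faces)
    return total / draws


if __name__ == "__main__":
    print("exact    :", exact_mean())
    print("simulated:", simulated_mean())
```

The enumeration gives the closed-form answer (5 plus 3.5 plus 3.5), and the simulated mean settles near it as the number of draws grows.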

Estimating π with Monte Carlo

X ~ rand::unif(-1, 1)
Y ~ rand::unif(-1, 1)

C = X**2 + Y**2 < 1

4 * P(C)

Figure 2. Monte-Carlo π. A probability becomes a number by sampling.

Two random draws

Same ~ as before, now used twice. X ~ unif(-1, 1) and Y ~ unif(-1, 1) each draw a number anywhere in [-1, 1] — together, a random point in the square.

Ask one yes/no question

C = X**2 + Y**2 < 1 is true exactly when the point falls inside the unit circle. In Noise that comparison is itself a random variable — a Bernoulli distribution, landing true or false on each draw.

Now throw the darts

Each dart is one draw of (X, Y); teal if inside the circle, grey if not. They scatter evenly — no point is special.

Four times the fraction is π

The circle covers π/4 of the square, so the teal share hovers around π/4 — and 4 * P(C) is π. More darts, sharper estimate.
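For readers who want to see the dart-throwing spelled out, here is a minimal plain-Python sketch of the same experiment. It is a hand-written loop, not what Noise generates; `estimate_pi` is a hypothetical name.

```python
import random


def estimate_pi(darts=200_000, seed=0):
    """Throw `darts` uniform points into [-1, 1] x [-1, 1] and return 4 * (fraction inside the unit circle)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(darts):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:   # the yes/no question C
            inside += 1
    return 4.0 * inside / darts


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, estimate_pi(darts=n))
```

The error shrinks roughly like one over the square root of the number of darts, so each extra digit costs about a hundred times more samples.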

The birthday paradox

days ~[23] rand::unif_int(1, 365)

match = vec::has_duplicates(days)

P(match)

Figure 3. The birthday paradox: coincidence is far likelier than it feels.

A room full of birthdays

Every ~ so far drew a single number. ~[23] hands 23 people one random day each — and that whole batch is still a single draw. The strip up top is one such room scattered across the 365 days of the year.

Ask: does any pair collide?

has_duplicates checks all 253 pairwise comparisons at once and is true the moment two days land on top of each other; the matches flash maroon above.

Add people, watch it climb

Each extra person adds more pairs that could collide, so P(match) rises far faster than intuition expects — not a gentle ramp but a steep wall.

23 beats a coin flip

The curve crosses the dashed 50% line at just 23 people — fewer than a classroom. By 50 it's a near-certainty. Our intuition compares ourselves to others; the math counts every pair.
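The crossing at 23 can be checked against the standard closed form, one minus the probability that all birthdays are distinct. The sketch below is plain Python, assuming 365 equally likely days; the function names are illustrative and not part of Noise.

```python
import random


def p_shared_exact(people, days=365):
    """Closed form: 1 - (365/365) * (364/365) * ... for `people` factors."""
    p_distinct = 1.0
    for i in range(people):
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct


def p_shared_simulated(people, rooms=20_000, days=365, seed=0):
    """Monte-Carlo estimate: fill many rooms and count those with a repeated day."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(rooms):
        room = [rng.randint(1, days) for _ in range(people)]
        if len(set(room)) < people:   # a duplicate collapses the set
            hits += 1
    return hits / rooms


if __name__ == "__main__":
    for n in (10, 22, 23, 50):
        print(n, round(p_shared_exact(n), 4), p_shared_simulated(n))
```

With 22 people the closed form is still below one half; at 23 it tips just over, and by 50 it is above 97%.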

The bell curve, two ways

Legend: sampled sum vs. Normal(N/2, N/12)

# 1 — a Galton board: 12 left/right bounces
rights = vec::count(~[12] rand::bernoulli(0.5))
P(rights == 6)        # centre bin -> 0.226

# 2 — the same bell, from a sum of uniforms
sum = vec::sum(~[12] rand::unif(0, 1)) - 6
P(sum > 1)            # -> 0.159, Normal(0,1)

Figure 4. The bell curve, two machines: a Galton board and a sum of uniforms, one law.

Each peg is a coin flip

Two dice already hinted at a bell; twelve coin flips sharpen it. A ball meets twelve pegs; at each it goes left or right with equal chance. ~[12] bernoulli(0.5) is those twelve flips, and count collapses the batch back to one number — how many went right.

Where does it land?

The number of rights is the bin. P(rights == 6) asks how often a ball ends dead centre — twelve fair steps landing exactly halfway.

Drop the balls

Watch them cascade. Any single ball's path is unpredictable, yet the bins fill into a strikingly regular shape.

A bell, from coin flips

The stacks trace the Binomial — a sum of twelve Bernoulli steps. But is the bell really about coins? Let's swap them for something with no bell in it at all.
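The centre-bin figure quoted for the Galton board follows directly from the Binomial formula. A minimal Python check, using only the standard library (`binomial_pmf` is an illustrative helper, not a Noise function):

```python
from math import comb


def binomial_pmf(k, n=12, p=0.5):
    """Probability of exactly k 'rights' out of n independent left/right bounces."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    for k in range(13):
        print(k, round(binomial_pmf(k), 4))
```

For twelve fair bounces the centre bin is 924 of the 4096 equally likely paths, about 0.226, which is the value the simulated query should hover around.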

Start over with flat noise

unif(0, 1) is equally likely anywhere in [0, 1] — a flat slab, nothing bell-shaped. We'll sum N of them and standardize, starting with just one.

Two make a triangle

Sum two independent uniforms and middling totals beat the extremes — a triangle. The parts are still flat; the sum is what bends.

A few, and it rounds

By four terms the corners are already softening toward a curve — the very same shape the falling balls drew, with no pegs in sight.

Same bell, different machine

Twelve uniforms hug the true Normal (red). That's the central limit theorem: sums drift to a Normal regardless of the parts — coins or flat noise, the bell is the same law underneath.
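The same comparison can be sketched in plain Python: sum twelve uniforms, centre the total, and compare an upper-tail frequency with the standard Normal tail. This is an illustration with made-up names, not Noise code. A uniform on [0, 1] has variance 1/12, so twelve of them sum to variance 1 and no further scaling is needed.

```python
import random
from statistics import NormalDist


def tail_of_uniform_sum(threshold=1.0, terms=12, draws=100_000, seed=0):
    """Fraction of draws where (sum of `terms` uniforms) - terms/2 exceeds `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        centred = sum(rng.random() for _ in range(terms)) - terms / 2
        if centred > threshold:
            hits += 1
    return hits / draws


def normal_tail(threshold=1.0):
    """Upper tail of the standard Normal, for comparison."""
    return 1.0 - NormalDist(0.0, 1.0).cdf(threshold)


if __name__ == "__main__":
    print("sampled sum:", tail_of_uniform_sum())
    print("Normal(0,1):", round(normal_tail(), 4))
```

At twelve terms the two numbers agree closely but not exactly; the remaining gap is the finite-N error that the central limit theorem says vanishes as more terms are added.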

AM vs FM

msg = signal::sample(0.3 * signal::sine(3), 64);

am = [1 + msg, 0 * msg];
fm = [math::cos(3 * msg), math::sin(3 * msg)];

static = signal::noise_white(0.3);
rx_am = am + static;
rx_fm = fm + static;

rec_am = (rx_am[0]**2 + rx_am[1]**2)**0.5 - 1;
rec_fm = math::atan(rx_fm[1] / rx_fm[0]) / 3;

Print("AM error", E(vec::mse(rec_am, msg), 40000));
Print("FM error", E(vec::mse(rec_fm, msg), 40000));


Figure 5. AM vs FM, end to end. The carrier is a phasor; AM rides its length, FM its angle. The same static corrupts the length but spares the angle — which is why FM shrugs off noise. Numbers from the real engine.

The message

A batch can carry a shape, not just a count — here it's a signal. A gentle tone: 64 samples of a slow sine. This is what we want to send through a noisy channel and get back intact.

AM — message in the amplitude

Write the carrier as a phasor [I, Q] — a spinning arrow. AM puts the message in its length: the tip slides in and out, crossing the unit circle as the message rises and falls.

FM — message in the angle

FM instead puts the message in the angle: the tip rides the unit circle while its length stays fixed. Same information, encoded in rotation rather than reach.

Add the same static

Identical noise hits both tips. It smears AM along the radius it reads from — but only nudges FM around the circle, barely changing the angle. Watch the clouds.

Recover & compare

Read the length back for AM, the angle for FM, and compare to the original. FM returns far cleaner for the very same static — the payoff of spending bandwidth on the angle.
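A rough plain-Python rendering of the same experiment is sketched below. Several details are assumptions, since the page does not define the signal library: the message is taken to be three sine cycles across the 64 samples, and the static is taken to be independent Gaussian noise with standard deviation 0.3 on each of I and Q. The function name `am_fm_errors` is hypothetical.

```python
import math
import random


def am_fm_errors(trials=2_000, samples=64, depth=0.3, cycles=3,
                 sigma=0.3, k=3.0, seed=0):
    """Return (AM mean squared error, FM mean squared error) under the same static."""
    rng = random.Random(seed)
    # Assumed message shape: `cycles` sine periods over the window, amplitude `depth`.
    msg = [depth * math.sin(2 * math.pi * cycles * t / samples)
           for t in range(samples)]
    am_sq = fm_sq = 0.0
    for _ in range(trials):
        for m in msg:
            # Assumed static: independent Gaussian noise on I and Q, shared by AM and FM.
            ni, nq = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
            # AM: message rides the length of the phasor [1 + m, 0].
            rec_am = math.hypot(1.0 + m + ni, nq) - 1.0
            # FM: message rides the angle of the phasor [cos(k m), sin(k m)].
            i, q = math.cos(k * m) + ni, math.sin(k * m) + nq
            rec_fm = math.atan(q / i) / k
            am_sq += (rec_am - m) ** 2
            fm_sq += (rec_fm - m) ** 2
    n = trials * samples
    return am_sq / n, fm_sq / n


if __name__ == "__main__":
    am, fm = am_fm_errors()
    print("AM error", am)
    print("FM error", fm)
```

Under these assumptions the FM error comes out well below the AM error: the angle is divided by the modulation factor of 3 on the way back, which shrinks the noise along with it.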

100 boxes, 100 prisoners

prisoners.noise
n = 100; opens = 50;
boxes ~ rand::permutation(n);   # box k holds slip boxes[k]

all_win = true;
for prisoner in 0..n {
  box = prisoner;               # open your own box first
  found = false;
  for hop in 0..opens {
    box = boxes[box];           # follow the chain
    found = found || (box == prisoner);
  };
  all_win = all_win && found;
};
P(all_win)               # ~ 0.3118

Figure 6. The 100 prisoners riddle: following your own loop wins ~31%, exactly when no cycle exceeds half the boxes.

A riddle that sounds impossible

A draw can be a whole arrangement, too — a random permutation. 100 numbered prisoners, 100 numbered boxes, each box holding one prisoner's number in random order. Each may open 50 boxes hunting for their own number. If everyone finds it, all go free — otherwise all are lost. No signalling.

Guessing is hopeless

Open 50 of the 100 boxes at random and you find your own number half the time. Fine for one prisoner, but all 100 must succeed at once, and with independent guesses the odds are (½)¹⁰⁰ ≈ 8×10⁻³¹. For all practical purposes, zero.

The trick: open your own box first

Now the clever rule. Prisoner k opens box k. Inside is some number — so go open that box next. The slip there points to the next box, and so on. You are quietly walking a loop.

Follow the chain home

Because the boxes are a permutation, the chain you follow must eventually loop back to box k — and the slip that closes the loop is your own number. You find it on the very last hop of your cycle… if that cycle is at most 50 long.

Every prisoner walks one loop

The whole arrangement splits into a handful of separate loops, and each prisoner is born onto exactly one of them. Everyone on a given loop succeeds or fails together, depending only on that loop's length.

One rule decides everything: no loop over 50

So all 100 go free precisely when the longest loop is ≤ 50. Watch fresh shuffles roll by — a single long loop (red) dooms everyone on it; keep every loop short (green) and the whole room walks free.

How often? About 31%

Over thousands of shuffles, plot the longest loop. A permutation can have at most one loop longer than 50, and the chance it has none works out to 1 − (1/51 + … + 1/100) ≈ 0.3118. The green mass left of the line is the answer.
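Both the closed form and the longest-loop criterion are easy to check in plain Python. The sketch below measures cycle lengths directly instead of walking each prisoner's 50 hops; the two are equivalent by the argument above. The function names are illustrative only.

```python
import random


def p_no_long_cycle(n=100):
    """Closed form: 1 - (1/(n/2 + 1) + ... + 1/n)."""
    return 1.0 - sum(1.0 / k for k in range(n // 2 + 1, n + 1))


def longest_cycle(perm):
    """Length of the longest cycle of a permutation given as a list of indices."""
    seen = [False] * len(perm)
    longest = 0
    for start in range(len(perm)):
        length = 0
        box = start
        while not seen[box]:      # walk the loop until it closes
            seen[box] = True
            box = perm[box]
            length += 1
        longest = max(longest, length)
    return longest


def p_all_free(n=100, shuffles=20_000, seed=0):
    """Fraction of random shuffles whose longest loop is at most n/2."""
    rng = random.Random(seed)
    boxes = list(range(n))
    wins = 0
    for _ in range(shuffles):
        rng.shuffle(boxes)
        if longest_cycle(boxes) <= n // 2:
            wins += 1
    return wins / shuffles


if __name__ == "__main__":
    print("closed form:", round(p_no_long_cycle(), 4))
    print("simulated  :", p_all_free())
    print("random guessing:", 0.5 ** 100)
```

The last line prints the random-guessing baseline for contrast: the loop strategy turns odds of about 8×10⁻³¹ into roughly 31%.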

Examples

A catalogue of short programs, each a Monte-Carlo experiment with a known closed form, so the printed answer can be checked. Open any one in the playground to run and edit it — the real Noise compiler, built to WebAssembly and running in your browser. Each program gets its own shareable link.

How it works

Noise is small by design. Everything above is built from a handful of ideas.

Everything is a distribution

A number is just a distribution with all its weight on a single point — a Dirac delta. Operators lift over random variables automatically — so X below is random, and Y is random too, with no special syntax. Propagating uncertainty reads exactly like ordinary arithmetic.

X ~ unif(-1, 1)
Y = 2 * X + 3   # Y is a distribution too

The tilde draws; equals transforms

A name bound with ~ is one fixed random draw that every mention reuses — so X − X is exactly 0, never "two samples." Independence comes from separate ~ bindings, exactly like writing X₁, X₂ on paper. No hidden re-draws, no surprises.

A ~ unif_int(1, 6)
B ~ unif_int(1, 6)   # two independent dice
A + B                # a genuine 2d6 distribution
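One way to picture the rule is to treat each ~ binding as a column of samples that every later mention shares. The Python sketch below is only an analogy for the semantics described here, not a description of how Noise is implemented; `draw_die` is a made-up helper.

```python
import random


def draw_die(n, rng):
    """One binding: a column of n independent die rolls."""
    return [rng.randint(1, 6) for _ in range(n)]


rng = random.Random(0)
n = 10_000
a = draw_die(n, rng)   # plays the role of: A ~ unif_int(1, 6)
b = draw_die(n, rng)   # plays the role of: B ~ unif_int(1, 6), a separate binding

a_minus_a = [x - x for x in a]             # same column used twice: always 0
a_minus_b = [x - y for x, y in zip(a, b)]  # two columns: a genuine spread
a_plus_b = [x + y for x, y in zip(a, b)]   # the 2d6 distribution

if __name__ == "__main__":
    print("A - A distinct values:", sorted(set(a_minus_a)))
    print("A - B distinct values:", sorted(set(a_minus_b)))
    print("mean of A + B:", sum(a_plus_b) / n)
```

Reusing a name reuses its column row by row, which is why A minus A collapses to zero while A minus B does not.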

Queries: P, E, Var, Q

Nothing is sampled until you ask. A query runs a fast columnar Monte-Carlo pass and reports an honest estimate — the printed digits reflect the standard error, and that error propagates through arithmetic, so 4·P(C) rounds itself correctly.

C = X**2 + Y**2 < 1
4 * P(C)   # ≈ 3.14
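To get a feel for how many digits such an estimate can honestly show, the textbook standard error of a sampled probability is sqrt(p(1 − p)/n), and multiplying the estimate by a constant multiplies the error by the same constant. The sketch below applies that formula; it is the standard binomial result, and whether Noise uses exactly this computation internally is not stated on this page.

```python
import math


def se_of_probability(p, n):
    """Standard error of a probability estimated from n independent yes/no draws."""
    return math.sqrt(p * (1.0 - p) / n)


def se_scaled(c, p, n):
    """Standard error of c * P: scaling an estimate scales its error by |c|."""
    return abs(c) * se_of_probability(p, n)


if __name__ == "__main__":
    p_inside = math.pi / 4            # true share of darts inside the circle
    for n in (10_000, 1_000_000, 100_000_000):
        print(n, se_scaled(4, p_inside, n))
```

With a million draws the error on 4·P(C) is about 0.0016, so roughly two decimal places of π are trustworthy; a hundred times more draws buys one more digit.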

Independence is a shape

Put a shape on the tilde to draw a whole batch at once: ~[n] is an iid vector, ~[n, m] a matrix. A reducer collapses it back to one number — so the birthday paradox over 23 people, all 253 pairwise comparisons, is a single expression.

days ~[23] unif_int(1, 365)
P(has_duplicates(days))   # ≈ 0.51

if is a value, not a branch

When the condition is random, if c { a } else { b } does not take a path — it builds a new random variable, choosing a or b per sample. That single rule hands you max, min, abs, clamps and payoffs over distributions for free.

higher = if A > B { A } else { B }   # the larger of two dice
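The per-sample reading of if can be mimicked in Python with an element-wise select. The sketch enumerates all 36 equally likely pairs of faces, so the resulting mean is exact; `select` is a hypothetical helper used only to illustrate the idea.

```python
from fractions import Fraction
from itertools import product


def select(cond, a, b):
    """Per-sample choice: take a[i] where cond[i] is true, else b[i]."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


pairs = list(product(range(1, 7), repeat=2))   # every (A, B) outcome once
a = [p[0] for p in pairs]
b = [p[1] for p in pairs]

higher = select([x > y for x, y in zip(a, b)], a, b)   # the larger of two dice
expected_higher = Fraction(sum(higher), len(higher))

if __name__ == "__main__":
    print("E(higher) =", expected_higher, "=", float(expected_higher))
```

No branch is taken for the variable as a whole: each row picks its own side, and the result is just another column, here with mean 161/36, about 4.47.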

Performance

Almost every Noise program ends in “evaluate this expression over a few million random draws.” That loop is compiled, not interpreted. ~ and the distribution constructors build a graph IR that lowers three ways — a portable columnar interpreter, a native JIT via Cranelift, and a WebAssembly emitter for the browser — all sharing one cost model, so the backend only ever changes speed, never results (bit-identical across core counts).

You write a one-line P(...) and get an expert kernel for free. It is built from a stack of techniques, each with its own measured win:

  • Kernel fusion — the codegen backends emit one loop that draws its sources, computes the whole expression in registers, and stores only the result, erasing the interpreter's intermediate memory traffic.
  • Graph simplification — constant folding, finite-safe algebraic identities, and common-subexpression elimination shrink the DAG before any code is generated (so X + X is one draw, not two).
  • Inlined xoshiro256++ PRNG — the generator is emitted straight into the kernel as a handful of shifts/xors/rotates, with zero call overhead on native and in WASM alike.
  • Inlined transcendentals — ln/sin/cos (the heart of normal, exp, and signals) become straight-line polynomial approximations (~1e-9 vs libm), roughly doubling the speed of transcendental-bound kernels and skipping a per-draw crossing of the JS boundary in the browser.
  • Multi-stream RNG — four independent xoshiro streams run at once to hide the generator's serial-dependency latency (the scalar form of SIMD), switched on only where the graph is latency-bound.
  • Columnar batches — the interpreter runs 1024 lanes through one instruction at a time: a tight, cache-friendly, auto-vectorizing pass over contiguous f64s.
  • Vectorized power-sum reduction — moments accumulate as raw power sums across eight unrolled lanes with no per-element divide: ~9.5× faster than a streaming Welford update, turning the reduction from the ceiling into a rounding error.
  • Deterministic multicore — sampling fans out with a work-stealing loop whose per-chunk accumulators merge as an exactly-associative monoid, so the answer is bit-identical regardless of thread count, and reproducible from a seed.
  • Profitability gate — a cost model emits a fused kernel only where it beats the vectorized interpreter, so codegen can change the speed but never lose.
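To make the power-sum and mergeable-accumulator ideas from the list above concrete, here is a minimal Python sketch. It is not the Noise kernel: the names are invented, and it uses integer samples so that the sums are exact, which sidesteps the fact that floating-point addition is not associative in general. How the engine obtains bit-identical merges on real-valued data is not described on this page.

```python
from statistics import pvariance


def power_sums(chunk):
    """Accumulate (count, sum of x, sum of x^2) with no per-element divide."""
    n = s1 = s2 = 0
    for x in chunk:
        n += 1
        s1 += x
        s2 += x * x
    return (n, s1, s2)


def merge(a, b):
    """Combine two accumulators; component-wise addition makes the order irrelevant."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_variance(acc):
    """Recover the mean and population variance from the raw power sums."""
    n, s1, s2 = acc
    mean = s1 / n
    return mean, s2 / n - mean * mean


def reduce_in_chunks(data, chunks):
    """Split data into `chunks` strided pieces, accumulate each, then merge."""
    acc = (0, 0, 0)
    for i in range(chunks):
        acc = merge(acc, power_sums(data[i::chunks]))
    return acc


if __name__ == "__main__":
    data = list(range(1, 1001))
    for chunks in (1, 4, 14):
        print(chunks, mean_and_variance(reduce_in_chunks(data, chunks)))
```

One caveat of the raw power-sum form is worth keeping in mind: when the mean is large compared with the spread, the final subtraction can lose precision, which is the usual argument for Welford-style updates.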

The payoff, measured on a 14-core M4 Pro:

  • ~5.8 billion samples/sec (π Monte Carlo, generate + reduce, all cores), scaling ~9.6× from one core to all of them.
  • Within ~1.15× of hand-written, LLVM-compiled Rust per core — and faster end to end, because the one-liner fuses and fans out across every core with no flags or annotations.
  • In the browser the emitted WASM kernel runs the same fused loop at ~0.5–0.75× of native codegen — hundreds of millions of samples/sec, client-side.

The full write-up, with the benchmark tables behind each number, is in PERF.md.

About the creator

Manu Martínez-Almeida

Manu Mtz.-Almeida. Creator of Gin, core contributor to Ionic, Stencil and Qwik. Principal engineer at Builder.io, working on compilers, high-performance systems, and AI agents.

I started Noise nine years ago and never quite finished it. The idea grew out of my telecommunications degree — a world of signals, noise, and probability — where I kept wishing for a language that could express uncertainty as naturally as it expresses arithmetic. This is that wish, picked up again all these years later.