Simple PWA Piano Hero
I have an acoustic piano and a phone. I’d like a game that scrolls notes towards a line, the way Guitar Hero does, and tells me whether I actually played them: not by tapping a button, but by listening to the piano through the phone’s microphone and checking that the right pitches sounded at the right moment.
A piano is polyphonic: I play several notes at once. General “what am I playing?” transcription of overlapping notes is a research problem. But a game already knows the answer — the chart says which notes are due now. So the app never has to transcribe the open-ended sound; it only has to verify that the notes the chart expects are present in what the microphone hears. Verification against a known target is tractable with plain signal processing; transcription is not. That reframing is what makes this buildable without a machine-learning model.
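The difference between verifying and transcribing can be sketched in a few lines. This is an illustration only, not the detector the later chapters build: expectedPresent, midiToFreq and the one-bin tolerance are hypothetical names and choices. The sketch also ignores harmonics (the second harmonic of C3 lands on the fundamental of C4), which a real check would have to handle.

```javascript
// Illustrative sketch, not the app's detector: the chart supplies the
// expected MIDI notes, so the code only looks where those notes should be.
function midiToFreq(midi) {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// bins: magnitudes in dB, one per FFT bin; hzPerBin: sampleRate / fftSize.
function expectedPresent(bins, hzPerBin, expectedMidis, floorDb) {
  const at = i => (i >= 0 && i < bins.length ? bins[i] : -Infinity);
  return expectedMidis.map(midi => {
    const bin = Math.round(midiToFreq(midi) / hzPerBin);
    // Assumed tolerance: accept the neighbouring bins too.
    const level = Math.max(at(bin - 1), at(bin), at(bin + 1));
    return { midi, present: level >= floorDb };
  });
}

// A fake spectrum holding C4 and E4 but no G4.
const hzPerBin = 44100 / 8192;
const bins = new Float32Array(4096).fill(-120);
bins[Math.round(midiToFreq(60) / hzPerBin)] = -30;
bins[Math.round(midiToFreq(64) / hzPerBin)] = -35;
const result = expectedPresent(bins, hzPerBin, [60, 64, 67], -80);
console.log(result.map(r => r.present)); // [ true, true, false ]
```

The function never asks "which notes are sounding?"; it only asks whether each expected note is there, which is why no search over all pitches is needed.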
Live analysis only — nothing is recorded or stored.
Live: https://sam.konubinix.eu/piano/.
Choice of technology
The organiser drove a Lit render layer from a single TinyBase store so that one source of truth fed both the renderer and cross-device sync. This app has nothing to sync — no inventory, no relations, no second device that would want to merge. The charts it plays are part of the build, and a game in progress is meaningless to persist. So TinyBase goes, and with it the widget kit: there are no dialogs or menus here, only a canvas and a couple of buttons.
For the chrome around the game — the heading, the buttons, a live readout — I reach for VanJS, the same kilobyte-sized reactive layer the karate beep trainer settled on: a reactive state cell is the source of truth, and the DOM is plain functions that read it. No build step, one ES module from a CDN.
But the heart of the app is neither chrome nor a store: it is a stream of numbers arriving sixty times a second — the spectrum of the sound in the room — and a stream of notes scrolling towards a line. Neither belongs in a reactive DOM: re-rendering elements at audio rate would thrash. Both live on a <canvas> painted from a requestAnimationFrame loop, and the sound comes from the Web Audio API — the microphone feeding an AnalyserNode whose spectrum the loop reads each frame. VanJS owns the chrome; the loop owns the game.
That microphone is also what the document cannot honestly test: a suite
can’t play a piano. So the analysis is split from its source. The pipeline
that turns a spectrum into “which notes are sounding” is pure and fully
testable; what feeds it a spectrum is swappable. In the app the source is
the microphone; under test it is a set of oscillators synthesised in the
page at the pitches the test names, the faithful analogue of the organiser’s
make_png, which handed a file input known bytes rather than a real photo.
The seam is a ?synth query parameter, the same shape as the trainer’s
?demo hook (Hear a note).
Setting up the app
One thing on this screen changes reactively for now: the note the app currently hears. That is a VanJS state cell — read it inside a render function and VanJS re-runs that function when it changes. It starts empty: nothing has sounded yet.
const heard = van.state(null);
VanJS builds DOM from functions named after tags, so the setup line is the import plus the handful of tag builders the views use.
import van from 'vanjs-core';
const { header, main, h1, div } = van.tags;
The whole app mounts into one element.
<div id="app"></div>
Boot mounts the view into #app, raises the data-app-ready flag the tests
wait on, and — only when the page carries a ?synth parameter — starts
listening to the synthesised source that parameter describes (Hear a
note). The microphone path waits for a later chapter and a user gesture; the
synth path needs neither, so it can start on load.
van.add(document.getElementById('app'), App());
document.body.setAttribute('data-app-ready', '1');
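The two boot lines do not show the conditional start the paragraph describes. A minimal sketch of how the parameter could be read, assuming the standard URL API; synthSpecFromUrl is a hypothetical helper, and the commented call refers to the startListening function defined under Hear a note.

```javascript
// Hypothetical helper: returns the ?synth value, or null when the
// parameter is absent. An empty string ("?synth=") is a valid, silent spec.
function synthSpecFromUrl(href) {
  return new URL(href).searchParams.get('synth');
}

console.log(synthSpecFromUrl('https://example.test/piano/?synth=69'));    // 69
console.log(synthSpecFromUrl('https://example.test/piano/?synth=60,64')); // 60,64
console.log(synthSpecFromUrl('https://example.test/piano/'));             // null

// In the page, boot could then do something like:
//   const spec = synthSpecFromUrl(location.href);
//   if (spec !== null) startListening(spec);
```

Testing against null rather than truthiness matters here: an empty ?synth= must still start listening, because the silence test relies on it.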
Visual basics
The same dark palette as the organiser and the trainer, pinned to CSS custom
properties so the feature chapters reach for var(--accent) by name. The
live readout is large and centred — at a glance from the piano bench I can
see what the app thinks I just played.
:root{
--bg:#1b1d2e; --card:#262a40; --fg:#e8e8f0; --muted:#8a8ea5;
--accent:#f9a826;
}
*{box-sizing:border-box}
body{margin:0;background:var(--bg);color:var(--fg);
font-family:system-ui,sans-serif;line-height:1.4}
.app-bar{padding:12px 16px;border-bottom:1px solid #2a2d44}
.app-bar h1{font-size:1.1rem;margin:0}
.screen{padding:24px 16px}
.readout{font-size:4rem;font-weight:700;text-align:center;
color:var(--accent);min-height:1.2em}
Hear a note
Before any game, I want proof the app hears me correctly: a live readout of the note it currently detects, like a tuner. It is the foundation every later chapter builds on, and the first thing the suite can pin down — feed a known pitch in, read the right note name out.
The test feeds a synthesised A above middle C — MIDI note 69, 440 Hz —
through the ?synth seam and waits for the readout to show A4. No
microphone, no permission, no real sound: the oscillator the page builds is
the test’s “played note”, the same way make_png was the organiser’s
“uploaded photo”.
@testcase
def test_hears_single_note(page):
    """A synthesised A4 (MIDI 69) makes the live readout show A4."""
    page.goto(BASE_URL + "?synth=69")
    page.wait_for_selector("[data-app-ready]")
    page.get_by_text("A4", exact=True).wait_for()
    print(" PASS: hears single note")
The readout is a single reactive element: it shows the note name in heard,
or an em dash while the app has heard nothing.
function Readout(){
  return div({ class: 'readout' }, () => heard.val ?? '—');
}
function App(){
  return [
    header({ class: 'app-bar' }, h1('Piano hero')),
    main({ class: 'screen' }, Readout(), Targets()),
  ];
}
A note name is just a MIDI number dressed up: twelve names repeating every
octave, the octave number sliding every twelve semitones, with MIDI 60
landing on middle C (C4). And a frequency maps to the nearest MIDI number
through the standard equal-temperament formula, anchored at A4 = 440 Hz.
const NOTE_NAMES = ['C','C#','D','D#','E','F','F#','G','G#','A','A#','B'];
function noteName(midi){
  return NOTE_NAMES[((midi % 12) + 12) % 12] + (Math.floor(midi / 12) - 1);
}
function freqToMidi(freq){
  return Math.round(69 + 12 * Math.log2(freq / 440));
}
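A few worked values make the two mappings concrete. The block repeats noteName and freqToMidi so it runs on its own; midiToFreq is a hypothetical inverse added only for the round trip.

```javascript
const NOTE_NAMES = ['C','C#','D','D#','E','F','F#','G','G#','A','A#','B'];

function noteName(midi) {
  return NOTE_NAMES[((midi % 12) + 12) % 12] + (Math.floor(midi / 12) - 1);
}

function freqToMidi(freq) {
  return Math.round(69 + 12 * Math.log2(freq / 440));
}

// Hypothetical inverse of freqToMidi (equal temperament, A4 = 440 Hz).
function midiToFreq(midi) {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

console.log(noteName(60));       // C4  (middle C)
console.log(noteName(69));       // A4
console.log(noteName(21));       // A0  (lowest key of an 88-key piano)
console.log(noteName(108));      // C8  (highest key)
console.log(freqToMidi(261.63)); // 60
console.log(freqToMidi(450));    // 69  (a sharp A4 still rounds to A4)
console.log(freqToMidi(midiToFreq(72))); // 72
```

The rounding in freqToMidi is what lets a slightly out-of-tune string still read as its nearest note.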
The sound reaches the analyser through whatever source the boot picked. The context is created lazily and resumed in case the browser handed it back suspended.
let audioCtx = null;
function ensureAudio(){
  if(!audioCtx){
    const Ctx = window.AudioContext || window.webkitAudioContext;
    audioCtx = new Ctx();
  }
  if(audioCtx.state === 'suspended') audioCtx.resume();
  return audioCtx;
}
The synthesised source is the test’s instrument: for each MIDI note in the
?synth list it starts an oscillator at that note’s frequency and mixes
them into one output. A single note is the degenerate case of the list, so
the same builder will serve chords unchanged in a later chapter.
function synthSource(ctx, spec){
  const midis = spec.split(',').filter(s => s !== '').map(Number);
  const mix = ctx.createGain();
  for(const m of midis){
    const osc = ctx.createOscillator();
    osc.frequency.value = 440 * Math.pow(2, (m - 69) / 12);
    osc.connect(mix);
    osc.start();
  }
  return mix;
}
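The parsing half of synthSource can be checked without an audio context. The sketch below mirrors its split, filter and frequency formula in a pure function; specToFrequencies is a hypothetical name, not part of the app.

```javascript
// Hypothetical pure mirror of the parsing inside synthSource:
// "60,64,67" -> oscillator frequencies in Hz.
function specToFrequencies(spec) {
  return spec.split(',')
    .filter(s => s !== '')   // "".split(',') yields [""], so drop empties
    .map(Number)
    .map(m => 440 * Math.pow(2, (m - 69) / 12));
}

console.log(specToFrequencies('69'));                              // [ 440 ]
console.log(specToFrequencies('60,64,67').map(f => f.toFixed(2))); // [ '261.63', '329.63', '392.00' ]
console.log(specToFrequencies(''));                                // []
```

The empty-string filter is what makes ?synth= a silent source: without it, Number('') would be 0 and an oscillator would start at the frequency of MIDI note 0.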
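Listening wires that source into an AnalyserNode and starts the read loop. The analyser also drives a muted gain into the speakers. The Web Audio specification allows an analyser’s output to stay unconnected, so this is a defensive choice: a zero gain pulls the graph toward a destination, which keeps it processing on every engine without making a sound. The window size is 8192 samples, which gives bins about 5.4 Hz wide at a 44.1 kHz sample rate (about 5.9 Hz at 48 kHz). That is finer than the gap between adjacent semitones from roughly the second octave upwards, the range a piano is usually played in.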
const FFT_SIZE = 8192;
function startListening(synthSpec){
  const ctx = ensureAudio();
  const analyser = ctx.createAnalyser();
  analyser.fftSize = FFT_SIZE;
  synthSource(ctx, synthSpec).connect(analyser);
  const mute = ctx.createGain();
  mute.gain.value = 0;
  analyser.connect(mute);
  mute.connect(ctx.destination);
  listen(analyser);
}
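The resolution claim can be made concrete with a little arithmetic. The sketch below uses a deliberately crude criterion, namely that two adjacent semitones are "separable" once their gap in Hz is at least one bin wide; the helper names and the 44.1 kHz and 48 kHz sample rates are assumptions, and the analyser’s windowing widens each peak, so real separation is somewhat worse than this bound.

```javascript
// Width of one FFT bin in Hz.
function binWidthHz(sampleRate, fftSize) {
  return sampleRate / fftSize;
}

// Gap in Hz between a MIDI note and the semitone above it.
function semitoneGapHz(midi) {
  const f = 440 * Math.pow(2, (midi - 69) / 12);
  return f * (Math.pow(2, 1 / 12) - 1);
}

// Lowest piano key (MIDI 21..108) whose gap to the next semitone
// is at least one bin wide, or null if none qualifies.
function lowestSeparableMidi(sampleRate, fftSize) {
  const width = binWidthHz(sampleRate, fftSize);
  for (let midi = 21; midi <= 108; midi++) {
    if (semitoneGapHz(midi) >= width) return midi;
  }
  return null;
}

console.log(binWidthHz(44100, 8192).toFixed(2)); // 5.38
console.log(lowestSeparableMidi(44100, 8192));   // 42  (F#2)
console.log(lowestSeparableMidi(48000, 8192));   // 44  (G#2)
```

Below that point neighbouring semitones share a bin, so the bottom octave and a half of the keyboard would need a larger window or a different method.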
Each frame, the loop reads the spectrum and names the loudest pitch in it. A single clean tone is one tall peak, so for now “the note” is simply the highest-magnitude bin, converted from its bin frequency to a name. A floor in decibels keeps silence from being read as a phantom note. Polyphony and harmonic disambiguation arrive with the game; the tuner only owes a single clear pitch.
const FLOOR_DB = -80;
function detect(analyser){
  const bins = new Float32Array(analyser.frequencyBinCount);
  analyser.getFloatFrequencyData(bins);
  const hzPerBin = audioCtx.sampleRate / analyser.fftSize;
  let peak = -Infinity, peakBin = -1;
  for(let i = 1; i < bins.length; i++){
    if(bins[i] > peak){ peak = bins[i]; peakBin = i; }
  }
  if(peak < FLOOR_DB) return null;
  return noteName(freqToMidi(peakBin * hzPerBin));
}
function listen(analyser){
  (function tick(){
    heard.val = detect(analyser);
    requestAnimationFrame(tick);
  })();
}
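The peak-picking itself needs no browser. As a sketch, the same loop can be run over a hand-built spectrum; detectFromBins is a hypothetical pure variant of detect that takes the bins and bin width as arguments, and the 44.1 kHz sample rate is an assumption.

```javascript
const NOTE_NAMES = ['C','C#','D','D#','E','F','F#','G','G#','A','A#','B'];
function noteName(midi) {
  return NOTE_NAMES[((midi % 12) + 12) % 12] + (Math.floor(midi / 12) - 1);
}
function freqToMidi(freq) {
  return Math.round(69 + 12 * Math.log2(freq / 440));
}

// Hypothetical pure variant of detect: same loop, no AnalyserNode.
function detectFromBins(bins, hzPerBin, floorDb) {
  let peak = -Infinity, peakBin = -1;
  for (let i = 1; i < bins.length; i++) {   // bin 0 (DC) is skipped
    if (bins[i] > peak) { peak = bins[i]; peakBin = i; }
  }
  if (peak < floorDb) return null;          // nothing above the floor
  return noteName(freqToMidi(peakBin * hzPerBin));
}

const hzPerBin = 44100 / 8192;
const silent = new Float32Array(4096).fill(-120);
const tone = silent.slice();
tone[Math.round(440 / hzPerBin)] = -20;     // bin 82, about 441.4 Hz

console.log(detectFromBins(tone, hzPerBin, -80));   // A4
console.log(detectFromBins(silent, hzPerBin, -80)); // null
```

The peak bin sits at about 441.4 Hz rather than exactly 440 Hz; the rounding in freqToMidi absorbs that quantisation error.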
The converse guards the readout against an empty room: with no oscillator on
the seam — an empty ?synth list — nothing rises above the floor, so
detect returns nothing and the readout holds its em dash rather than chase
noise into a phantom pitch.
@testcase
def test_silence_reads_as_nothing(page):
    """An empty ?synth list feeds no tone, so the readout stays an em dash."""
    page.goto(BASE_URL + "?synth=")
    page.wait_for_selector("[data-app-ready]")
    page.wait_for_timeout(300)
    assert page.get_by_text("—", exact=True).is_visible(), \
        "a silent room registered a phantom note"
    print(" PASS: silence reads as nothing")
PWA shell
I want to add the app to my phone’s home screen and open it straight at the piano, no browser chrome. Three small things turn the web app into a PWA, the same three the organiser used.
The first is a manifest: the metadata the OS reads when I add the app to my home screen — name, theme colour, and an icon drawn inline as an SVG data-URL so there is no separate asset to ship.
{
"name": "Piano hero",
"short_name": "Piano",
"description": "Listen to an acoustic piano and score the notes you play.",
"start_url": ".",
"display": "standalone",
"background_color": "#1b1d2e",
"theme_color": "#1b1d2e",
"icons": [
{
"src": "data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 512 512'><rect width='512' height='512' rx='96' fill='%231b1d2e'/><text x='256' y='350' font-size='280' text-anchor='middle' fill='%23f9a826'>P</text></svg>",
"sizes": "512x512",
"type": "image/svg+xml",
"purpose": "any maskable"
}
]
}
The second is a service worker. The cache-first-with-write-through logic is the same as in every other PWA here, so it lives in a shared block; what is app-specific is which files to cache and the cache name, keyed on the build hash so a deploy drops the old cache on the next activation.
const CACHES = [
{ name: 'piano-nil' },
];
const ASSETS = ['./', './index.html', './app.js', './manifest.json'];
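The shared block itself is not reproduced here. As a rough sketch of what "cache-first with write-through" and "drop the old cache on activation" mean, the two helpers below express the logic with the cache and the network injected; cacheFirst, staleCaches and the cache names in the usage lines are hypothetical. In a real service worker the cache would be a Cache object and the stored value a cloned Response.

```javascript
// Hypothetical helper: serve from cache, otherwise fetch and store.
// `cache` needs async match(key) and put(key, value); `fetchFn` is the network.
async function cacheFirst(key, cache, fetchFn) {
  const hit = await cache.match(key);
  if (hit !== undefined) return hit;   // cache hit: no network round trip
  const fresh = await fetchFn(key);    // miss: go to the network
  await cache.put(key, fresh);         // write through for the next request
  return fresh;
}

// Hypothetical helper: cache names to delete when a new build activates.
function staleCaches(existing, keep) {
  return existing.filter(name => !keep.includes(name));
}

// Usage with an in-memory stand-in for the Cache API.
const store = new Map();
const cache = {
  match: async key => store.get(key),
  put: async (key, value) => { store.set(key, value); },
};
let networkCalls = 0;
const fetchFn = async key => { networkCalls++; return 'body of ' + key; };

cacheFirst('./app.js', cache, fetchFn)
  .then(() => cacheFirst('./app.js', cache, fetchFn))
  .then(body => console.log(body, networkCalls)); // body of ./app.js 1

console.log(staleCaches(['piano-old', 'piano-new'], ['piano-new'])); // [ 'piano-old' ]
```

Keying the cache name on the build hash is what makes staleCaches useful: after a deploy the new name is the only one kept, so every older cache is deleted in one pass.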
The third is a loading ring — a spinner covering the page until the boot
raises data-app-ready, so the moment between tapping the icon and the app
appearing doesn’t read as broken.
<div id="loading"></div>
#loading{position:fixed;inset:0;background:var(--bg);z-index:9999}
body[data-app-ready] #loading{display:none}
And a tiny build tag in the corner, to confirm at a glance which build a device actually picked up after a deploy.
<span class="build-tag" title="Build">nil</span>
.build-tag{position:fixed;top:8px;left:8px;font-size:.65rem;color:var(--muted);
font-family:monospace;pointer-events:none;z-index:50}
Playwright tests
The per-feature tests need a runner, shaped by the same two rules the organiser and the trainer settled on: assert on the user-facing surface (visible text, roles, labels) over DOM internals, and let each test register itself with a decorator so there is no central list to forget to update.
TESTS = []
def testcase(fn):
    """Registration decorator: each =@testcase= appends to =TESTS= in source order."""
    TESTS.append(fn)
    return fn
The imports block does a little setup beyond importing. The Nix shell doesn’t
put the Playwright browsers on a path the loader knows, so it points
PLAYWRIGHT_BROWSERS_PATH at the playwright-browsers entry in buildInputs
unless already set. It pins a portrait phone viewport, since the app is
phone-first, and names the dev slot’s served URL as the default target.
import os, sys, time
if "PLAYWRIGHT_BROWSERS_PATH" not in os.environ:
    for p in os.environ.get("buildInputs", "").split():
        if "playwright-browsers" in p:
            os.environ["PLAYWRIGHT_BROWSERS_PATH"] = p
            break
from playwright.sync_api import sync_playwright
BASE_URL = os.environ.get("PIANO_URL", "http://localhost:9682/debug/piano/")
PHONE_VIEWPORT = {"width": 400, "height": 800}
The runner mirrors the trainer’s — whole suite by default, name-substring
filter from positional args, -x to stop on first failure, --headed to
watch — with one addition. Web Audio won’t run a context until the user has
interacted with the page, which would freeze the synthesised source before
it ever sounded; launching Chromium with --autoplay-policy=no-user-gesture-required
lifts that gate so the ?synth seam can start on load.
def main():
    argv = sys.argv[1:]
    headed = "--headed" in argv
    stop_on_fail = "-x" in argv or "--exitfirst" in argv
    patterns = [a for a in argv if a not in ("--headed", "-x", "--exitfirst")]
    selected = [t for t in TESTS
                if not patterns or any(p in t.__name__ for p in patterns)]
    if not selected:
        print(f"no test matches {patterns}")
        sys.exit(2)
    with sync_playwright() as pw:
        browser = pw.chromium.launch(
            headless=not headed,
            args=["--autoplay-policy=no-user-gesture-required"],
        )
        ctx = browser.new_context(viewport=PHONE_VIEWPORT)
        page = ctx.new_page()
        page.set_default_timeout(5000)
        passed = failed = 0
        for t in selected:
            try:
                t(page); passed += 1
            except Exception as e:
                print(f" FAIL: {t.__name__}: {e}"); failed += 1
                if stop_on_fail: break
        browser.close()
    print(f"\n{passed} passed, {failed} failed out of {len(selected)}")
    sys.exit(1 if failed else 0)
if __name__ == "__main__":
    main()