Following my initial progress on a backend-backed idea, I ran into trouble managing the complexities of API usage and LM output, so I switched to another idea that I briefly mentioned at the start of my last documentation.
Core Concept
For my midterm project, I wanted to explore how a fundamental change in user input could completely transform a familiar experience. I landed on the idea of taking Flappy Bird, a game known for its simple tap-based mechanics, and re-imagining it with voice control. Instead of tapping a button, the player controls the bird’s height by changing the pitch of their voice. Singing a high note makes the bird fly high, and a low note brings it down.
The goal was to create something both intuitive and novel. Using your voice as a controller is a personal and expressive form of interaction. I hoped this would turn the game from a test of reflexes into a more playful and, honestly, sillier challenge.
How It Works (and What I’m Proud Of)
At its core, the project uses the p5.js library for all the visuals and game logic, combined with the ml5.js library to handle pitch detection. When the game starts, the browser’s microphone listens for the player’s voice. The ml5.js pitchDetection model (surprisingly lightweight) analyzes the audio stream in real time and outputs a frequency value in Hertz. My code then takes that frequency and maps it to a vertical position on the game canvas. Because the canvas Y-axis grows downward, a higher frequency means a lower Y-coordinate, sending the bird soaring upwards.
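As a rough sketch of that frequency-to-position mapping (not the project's exact code), the idea could look like this. The function name `frequencyToY` and its parameters are illustrative assumptions; `minFreq` and `maxFreq` stand in for the player's calibrated range.

```javascript
// Hypothetical helper: maps a detected frequency (Hz) to a canvas Y position.
// minFreq / maxFreq are assumed to come from the calibration step.
function frequencyToY(freq, minFreq, maxFreq, canvasHeight) {
  // Clamp so notes outside the calibrated range pin the bird to an edge.
  const clamped = Math.min(Math.max(freq, minFreq), maxFreq);
  // Normalize: 0 at the lowest note, 1 at the highest.
  const t = (clamped - minFreq) / (maxFreq - minFreq);
  // Canvas Y grows downward, so invert: high notes give small Y values.
  return (1 - t) * canvasHeight;
}
```

Inside a p5.js sketch, the built-in `map(freq, minFreq, maxFreq, height, 0, true)` should give the same result in one call.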
I’m particularly proud of a few key decisions I made that really improved the game feel.
First was the dynamic calibration. Before the game starts, it asks you to be quiet for a moment to measure the ambient background noise, which it uses to set a volume threshold, so the game doesn’t react to the hum of a fan or distant chatter. Then, it has you sing your lowest and highest comfortable notes. This personalizes the control scheme for every player by adapting to their unique vocal range, which I think is an important design choice for a voice-controlled game.
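The calibration logic described above might be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the function names, the `margin` multiplier, and the choice to average the sung samples are all hypothetical, and each input array is assumed to be non-empty.

```javascript
// Hypothetical: derive a volume threshold from levels sampled during silence.
// `margin` is an assumed safety factor above the loudest background reading.
function noiseThreshold(quietLevels, margin = 1.5) {
  const peak = Math.max(...quietLevels);
  return peak * margin;
}

// Hypothetical: build the player's range from frequencies (Hz) captured
// while they sing their lowest and highest comfortable notes.
function vocalRange(lowFreqs, highFreqs) {
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const low = mean(lowFreqs);
  const high = mean(highFreqs);
  // Guard against the two notes being sung in the wrong order.
  return { minFreq: Math.min(low, high), maxFreq: Math.max(low, high) };
}

// Only treat input as singing when it is louder than the background.
function isSinging(level, threshold) {
  return level > threshold;
}
```

Readings below the threshold would simply be ignored, so a fan or distant chatter never moves the bird.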
Another technical decision I’m happy with was smoothing the pitch input. Early on, the bird was incredibly jittery because the pitch detection is so sensitive. To fix this, I stored the last five frequency readings in an array and used their average to position the bird. This filtered out the noise and made the bird’s movement feel much more fluid and intentional. Finally, instead of making the bird fall like a rock when you stop singing, I gave it a gentle downward drift. This “breath break” mechanic makes the bird feel light and acknowledges the physical reality of needing to breathe, which was a small but important game design tweak.
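Here is a minimal sketch of both ideas: a five-reading moving average and a slow drift when no voice is detected. The names `createSmoother` and `nextBirdY`, and the `driftSpeed` value, are assumptions made for illustration.

```javascript
// Keeps the most recent `size` readings and returns their average.
function createSmoother(size = 5) {
  const readings = [];
  return function smooth(value) {
    readings.push(value);
    // Drop the oldest reading once the window is full.
    if (readings.length > size) readings.shift();
    return readings.reduce((a, b) => a + b, 0) / readings.length;
  };
}

// Hypothetical "breath break": while the player is silent, the bird sinks
// by a small constant amount per frame instead of falling quickly.
function nextBirdY(currentY, targetY, singing, canvasHeight, driftSpeed = 1.5) {
  if (singing) return targetY;
  // Y grows downward, so drifting down means adding to Y (capped at the floor).
  return Math.min(currentY + driftSpeed, canvasHeight);
}
```

A larger window gives smoother motion but makes the bird respond more slowly to pitch changes, so the window size is a trade-off between stability and responsiveness.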
Challenges and Future
My biggest technical obstacle was a recurring bug where the game would crash on replay. It took a lot of console logging and head-scratching, but it ultimately turned out that stopping and restarting the microphone doesn’t work the way I’d thought. The audio stream becomes invalid after the microphone stops, and I couldn’t reuse it. The solution was to completely discard the old microphone object and create a brand new one every time a new game starts.
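The fix can be sketched as a small wrapper that never reuses a stopped microphone. This is an illustration, not the project's code: `createMicManager` and the `createMic` factory are hypothetical names, and the only assumption about the microphone object is that it has `start()` and `stop()` methods, as `p5.AudioIn` does.

```javascript
// `createMic` is a hypothetical factory; in a p5.js sketch it might be
// () => new p5.AudioIn().
function createMicManager(createMic) {
  let mic = null;
  return {
    startNewGame() {
      // Stop and discard any previous microphone instead of restarting it.
      if (mic) mic.stop();
      mic = createMic();
      mic.start();
      return mic;
    },
    endGame() {
      if (mic) {
        mic.stop();
        mic = null; // Drop the reference so the old stream is never reused.
      }
    },
  };
}
```

Anything built on top of the old stream, such as the pitch detector, would presumably need to be recreated alongside the new microphone.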
There are also areas I’d love to improve. The calibration process, while functional, is still driven by setTimeout, which makes it rigid: every player gets the same fixed amount of time for each step. A more interactive approach, where the player clicks to confirm their high and low notes, would be a much better user experience. The game also currently responds only to pitch. It might be fascinating to incorporate volume as another control dimension, perhaps making the bird dash forward or shrink to fit through tight gaps if you sing louder.
A more ambitious improvement would be to design the game in a way that encourages the player to sing without thinking about it. Right now, you’re very aware that you’re just “controlling” the bird in another way. But what if the game’s pipe gaps prompted you to sing a simple melody? The pipes could be timed to appear at moments that correspond to the melody’s high and low notes. This might subtly prompt the player to hum along with the music, and in doing so, they would be controlling the bird without even thinking.
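One way the click-to-confirm calibration could work is a small step machine that only advances when the player confirms. This is purely a sketch of the idea; the step names and the `createCalibration` function are hypothetical.

```javascript
// Hypothetical calibration flow that advances on player confirmation
// rather than on a timer.
function createCalibration() {
  const steps = ["silence", "low", "high", "done"];
  let index = 0;
  const result = {};
  return {
    currentStep() {
      return steps[index];
    },
    // Record the value measured for the current step, then move on.
    confirm(value) {
      if (steps[index] === "done") return result;
      result[steps[index]] = value;
      index += 1;
      return result;
    },
  };
}
```

In a p5.js sketch, `confirm` could be called from `mousePressed()` with the latest volume or frequency reading, so each player takes as long as they need on each step.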