Midterm – Pitchy Bird

Following my initial progress on a backend-backed idea, I faced challenges managing the complexities of API usage and LLM output, so I switched to another idea briefly mentioned at the start of my last documentation.

Core Concept

For my midterm project, I wanted to explore how a fundamental change in user input could completely transform a familiar experience. I landed on the idea of taking a game—Flappy Bird, known for its simple tap-based mechanics—and re-imagining it with voice control. Instead of tapping a button, the player controls the bird’s height by changing the pitch of their voice. Singing a high note makes the bird fly high, and a low note brings it down.

The goal was to create something both intuitive and novel. Using your voice as a controller is a personal and expressive form of interaction. I hoped this would turn the game from a test of reflexes into a more playful and, honestly, sillier challenge.

How It Works (and What I’m Proud Of)

At its core, the project uses the p5.js library for all the visuals and game logic, combined with the ml5.js library to handle pitch detection. When the game starts, the browser’s microphone listens for my voice. The ml5.js pitchDetection model (surprisingly lightweight) analyzes the audio stream in real time and spits out a frequency value in Hertz. My code then takes that frequency and maps it to a vertical position on the game canvas. A higher frequency means a lower Y-coordinate, sending the bird soaring upwards.
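
A minimal sketch of that pipeline, assuming p5.sound for the microphone and the ml5 CREPE model files served from a ./model/ directory (the path and the 100–400 Hz range are placeholders, not the game’s actual values):

let mic, pitch;
let birdY = 200;

function setup() {
  createCanvas(400, 600);
  mic = new p5.AudioIn();
  mic.start(() => {
    // ml5's pitchDetection wraps the CREPE model; it needs the audio context and stream
    pitch = ml5.pitchDetection('./model/', getAudioContext(), mic.stream, getPitch);
  });
}

function getPitch() {
  pitch.getPitch((err, frequency) => {
    if (frequency) {
      // Higher frequency -> smaller Y, so the bird soars upwards
      birdY = map(constrain(frequency, 100, 400), 100, 400, height, 0);
    }
    getPitch(); // poll again as soon as a result arrives
  });
}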

Click here to access the game.

I’m particularly proud of a few key decisions I made that really improved the game feel.

First was the dynamic calibration. Before the game starts, it asks you to stay quiet for a moment so it can measure the ambient background noise, which is used to set a volume threshold; that way the game doesn’t react to the hum of a fan or distant chatter. Then it has you sing your lowest and highest comfortable notes. This personalizes the control scheme for every player, adapting to their unique vocal range, which I think is an important design choice for any voice-controlled game.
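
In sketch form, the two calibration phases might look like this (the helper names and thresholds are illustrative, not the exact game code):

// Phase 1: while the player stays quiet, record the loudest ambient level
let noiseFloor = 0;
function sampleNoise() {
  noiseFloor = max(noiseFloor, mic.getLevel());
}

// Phase 2: while the player sings, track their lowest and highest notes
let lowHz = Infinity, highHz = 0;
function sampleRange(frequency) {
  lowHz = min(lowHz, frequency);
  highHz = max(highHz, frequency);
}

// In play: gate out background noise, then map the player's own vocal
// range onto the full height of the canvas
function pitchToY(frequency) {
  if (mic.getLevel() < noiseFloor * 1.5 + 0.01) return null; // too quiet to count
  return map(frequency, lowHz, highHz, height * 0.9, height * 0.1, true);
}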

Another technical decision I’m happy with was implementing a smoothing algorithm for the pitch input. Early on, the bird was incredibly jittery because the pitch detection is so sensitive. To fix this, I stored the last five frequency readings in an array and used their average to position the bird. This filtered out the noise and made the bird’s movement feel much more fluid and intentional. Finally, instead of making the bird fall like a rock when you stop singing, I gave it a gentle downward drift. This “breath break” mechanic keeps the game fair and acknowledges the physical reality of needing to breathe, which was a small but important game design tweak.
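
The smoothing is a plain moving average, and the drift is a small constant fall; a sketch building on the pitchToY() helper above (bird is an assumed object, and the drift value is illustrative):

const readings = []; // the last few detected frequencies

function smoothPitch(frequency) {
  readings.push(frequency);
  if (readings.length > 5) readings.shift(); // keep only the last five readings
  return readings.reduce((sum, f) => sum + f, 0) / readings.length;
}

function updateBird(frequency) {
  const y = frequency ? pitchToY(smoothPitch(frequency)) : null;
  if (y !== null) {
    bird.y = y;
  } else {
    bird.y += 1.5; // "breath break": drift down gently instead of plummeting
  }
}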

Challenges and Future

My biggest technical obstacle was a recurring bug where the game would crash on replay. It took a lot of console-logging and head-scratching, but it ultimately turned out that stopping and restarting the microphone doesn’t work the way I’d thought. The audio stream becomes invalid after the microphone stops, and it can’t be reused. The solution was to completely discard the old microphone object and create a brand new one every time a new game starts.
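
The pattern that finally worked, roughly (reusing the getPitch() loop from the sketch above; resetGameState() is a hypothetical helper standing in for the pipe, score, and bird resets):

function restartGame() {
  // The old stream is dead after mic.stop(), so discard both objects
  // and rebuild them from scratch on every new game
  if (mic) mic.stop();
  mic = new p5.AudioIn();
  mic.start(() => {
    pitch = ml5.pitchDetection('./model/', getAudioContext(), mic.stream, getPitch);
  });
  resetGameState(); // hypothetical: reset pipes, score, and bird position
}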

In addition, there are definitely areas I’d love to improve. The calibration process, while functional, is still based on setTimeout, which makes it rigid. A more interactive approach, where the player clicks to confirm their high and low notes, would be a much better user experience. Additionally, the game currently only responds to pitch. It might be fascinating to incorporate volume as another control dimension, perhaps making the bird dash forward or shrink to fit through tight gaps when you sing louder.

A more ambitious improvement would be to design the game in a way that encourages the player to sing unconsciously. Right now, you’re very aware that you’re just “controlling” the bird in another way. But what if the game’s pipe gaps prompted you to hum a simple melody? The pipes could be timed to appear at moments that correspond to the melody’s high and low notes. This might subtly prompt the player to hum along with the music, and in doing so, they would be controlling the bird without even thinking.

Week 5 – Midterm Progress

After three days of painstaking brainstorming for my midterm, I came up with two directions: one was a game-like networking tool to help people start conversations, and the other was a version of Flappy Bird controlled by the pitch of your voice.

I was undoubtedly fascinated by both, but as I thought more about it, I wanted to explore generative AI further. Therefore, I combined the personal, identity-driven aspect of the networking tool with a novel technical element.

The Concept

“Synthcestry” is a short, narrative experience that explores the idea of heritage. The user starts by inputting a few key details about themselves: a region of origin, their gender, and their age. Then, they take a photo of themselves with their webcam.

From there, through a series of text prompts, the user is guided through a visual transformation. Their own face slowly and smoothly transitions into a composite, AI-generated face that represents the “archetype” of their chosen heritage.

Designing the Interaction and Code

The user’s journey is the core of the interaction design. Having already come across game-state design in class, I broke the experience down into distinct states, which became the foundation of my code structure:

  1. Start: A simple, clean title screen to set the mood.
  2. Input: The user provides their details. I decided against complex UI elements and opted for simple, custom-drawn text boxes and buttons for a more cohesive aesthetic. The user can type their region and gender, and select an age from a few options.
  3. Capture: The webcam feed is activated, allowing the user to frame their face and capture a still image with a click.
  4. Journey: This is the main event. The user presses the spacebar to advance through 5 steps. The first step shows their own photo, and each subsequent press transitions the image further towards the final archetype, accompanied by a line of narrative text.
  5. End: The final archetype image is displayed, offering a moment of finality before the user can choose to start again.

My code is built around a gameState variable, which controls which drawing function is called in the main draw() loop. This keeps everything clean and organized. I have separate functions like drawInputScreen() and drawJourneyScreen(), and event handlers like mousePressed() and keyPressed() that behave differently depending on the current gameState. This state-machine approach is crucial for managing the flow of the experience.
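
Condensed, the structure looks something like this (the screen functions beyond the two named above are my shorthand for the remaining states):

let gameState = 'start'; // 'start' | 'input' | 'capture' | 'journey' | 'end'

function draw() {
  if (gameState === 'start') drawStartScreen();
  else if (gameState === 'input') drawInputScreen();
  else if (gameState === 'capture') drawCaptureScreen();
  else if (gameState === 'journey') drawJourneyScreen();
  else if (gameState === 'end') drawEndScreen();
}

function keyPressed() {
  // The same key can mean different things depending on the state
  if (gameState === 'journey' && key === ' ') {
    advanceJourneyStep(); // one step closer to the archetype
  }
}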

The Most Frightening Part

The biggest uncertainty in this project was the visual transition itself. How could I create a smooth, believable transformation from any user’s face to a generic archetype?

To minimize the risk, I engineered a detailed prompt that instructs the AI to create a 4-frame “sprite sheet.” This sheet shows a single face transitioning from a neutral, mixed-ethnicity starting point to a final, distinct archetype representing a specific region, gender, and age.

To test this critical step, I wrote the startGeneration() and cropFrames() functions in my sketch. startGeneration() builds the asset key and uses loadImage() to fetch the correct file. The callback then triggers cropFrames(), which uses p5.Image.get() to slice the sprite sheet into an array of individual frame images. The program isn’t fully functional yet, but you can see the functions in the code base.
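
A sketch of the pair as described (the asset-key format is a placeholder for whatever naming scheme the asset library ends up using):

let frames = []; // individual transition frames, filled by cropFrames()

function startGeneration() {
  // e.g. 'east-asia_female_40.png'; region, gender, and age come from the input screen
  const assetKey = `${region}_${gender}_${age}.png`;
  loadImage(`assets/${assetKey}`, (sheet) => cropFrames(sheet));
}

function cropFrames(sheet) {
  // The sheet holds 4 equal-width frames side by side; slice them out with get()
  const fw = sheet.width / 4;
  for (let i = 0; i < 4; i++) {
    frames.push(sheet.get(i * fw, 0, fw, sheet.height));
  }
}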

As for the image assets, I had two choices: make a live AI API generation call, or build a pre-generated asset library. The latter would be easier and less error-prone, I admit; but given the abundance of nationalities on campus, I have little choice but to use a live API call. I’ll figure this out next week.

 

Week 5 – Reading Response

After reading Golan Levin’s “Computer Vision for Artists and Designers,” I’m left with a deep appreciation for the creativity that arose from confronting technical limitations. The article pulls back the curtain on interactive art, revealing that its magic often lies in a clever and resourceful dialogue between the physical and digital worlds, not in lines of complex code. As it turns out, the most effective way to help a computer “see” is often to change the environment, not just the algorithm.

Levin shows that simple, elegant techniques like frame differencing or brightness thresholding can be the building blocks for powerful experiences, in contrast to my earlier assumption that a powerful CV system was required. The LimboTime game, conceived and built in a single afternoon by novice programmers who found a large white sheet of Foamcore, sealed this change in my perspective. They didn’t need a sophisticated algorithm; they just needed a high-contrast background. It suggests that creativity in this field is as much about physical problem-solving as it is about writing code. It’s a reminder that we don’t live in a purely digital world, and that the most compelling art often emerges from the messy, inventive bridge between the two.
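
Frame differencing really is that simple; a minimal p5.js sketch of the idea (my own illustration, not code from the article):

let video, prev;

function setup() {
  createCanvas(320, 240);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  prev = createImage(width, height);
}

function draw() {
  video.loadPixels();
  prev.loadPixels();
  let motion = 0;
  for (let i = 0; i < video.pixels.length; i += 4) {
    // Sum how much each pixel's red channel changed since the last frame
    motion += abs(video.pixels[i] - prev.pixels[i]);
  }
  prev.copy(video, 0, 0, width, height, 0, 0, width, height);
  // More movement in front of the camera -> brighter screen
  background(min(255, motion / (video.pixels.length / 4) * 10));
}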

The article also forced me to reflect on the dual nature of this technology. On one hand, computer vision allows for the kind of playful, unencumbered interaction that Myron Krueger pioneered with Videoplace back in the 1970s. His work was a call to use our entire bodies to interact with machines, breaking free from the keyboard and mouse. Then as now, there is real joy in having our physical presence draw, play, and connect with a digital space in an intuitive way.

On the other hand, the article doesn’t shy away from the darker implications of a machine that watches. The very act of “tracking” is a form of surveillance. Artists like David Rokeby and Rafael Lozano-Hemmer confront this directly. Lozano-Hemmer’s Standards and Double Standards, in particular, creates an “absent crowd” of robotic belts that watch the viewer, leaving a potent impression that I would not have expected from visual technology in the early 2000s.

Ultimately, this reading has shifted my perspective. I see now that computer vision in art isn’t just a technical tool for creating interactive effects. It is a medium for exploring what it means to see, to be seen, and to be categorized. The most profound works discussed don’t just use the technology; they actively raise questions about the technology. They leverage its ability to create connection while simultaneously critiquing its capacity for control. I further believe that true innovation often comes from embracing constraints, and that the most important conversations about technology could best be articulated through art.

Week 4 – Reading Response

One thing that drives me crazy is the overcomplicated interface of most modern microwaves. Even as someone who’s pretty tech-savvy, I groan every time I use a new one. There are 10+ buttons for “defrost poultry,” “bake potato,” “reheat pizza,” and more, when all I usually need is to heat up leftovers. Half the time with an unfamiliar microwave, I end up pressing random buttons, wasting as much time as the basic 2-minute timer I was trying to set. It feels like designers prioritize “showing off” features over usability, exactly as Norman warns. They cram in functions to make the microwave seem advanced, but they forget the core user need: simplicity.

This frustration ties directly to Norman’s principles. These microwaves lack good signifiers, as there’s no clear visual cue (like a large, labeled “Quick Heat” button) to guide basic use. The mapping is muddled, too: why is “Popcorn” next to “Sensor Cook” when most users reach for quick heating first? These are all matters of “affordances”—I don’t like this word, as we could simply call it design “friendliness”—which, the author argues, sometimes conflict with designers’ desire to “show off.”

I agree to a certain extent. On the other hand, however, a designer IS capable of prioritizing human-centered design as their end goal—think of Steve Jobs or any Apple designer—and excelling at design friendliness is itself a process of refinement. Picture this simple, intuitive microwave you might have seen: a large digital dial for time (natural mapping) and one “Start” button, plus a small “More Functions” menu for niche uses. This keeps discoverability high, because even a first-time user would know to twist the dial, and it reserves extra features for those who need them, without cluttering the experience.

Design doesn’t have to choose between “impressive” and “usable.” Apple proves this by making complex tech feel intuitive, and microwaves could do the same if designers focused on what users actually do instead of what they think looks good.

Week 4 – Torrent of Transience

For this week’s assignment, I wanted to take the NYC Leading Causes of Death data and turn it into generative text art in p5.js. Initially, I was torn between a “Data Rain” concept, where statistics fall like a cascade, and an “Unraveling Text” idea, where words literally dissolve, mirroring entropy.

After some back-and-forth, I settled on combining the two into what I’m calling “Torrent of Transience”. The core idea is a continuous stream of disease names falling from the top of the screen. But it’s not just falling. Each word is actively dissolving, blurring, and fading as it descends, vanishing before it even reaches the bottom. It’s meant to evoke a waterfall of ink words, where the ink itself is dissolving as it flows.

The challenge was mapping the data in a way that felt intuitive and impactful. I decided that the Deaths count for each cause would determine the textSize – larger words for more fatalities, making their presence felt more strongly. The Age Adjusted Death Rate also became useful, as it controls both how fast the word falls and how quickly it dissolves. So, a cause with a high death rate will rush down the screen and disappear rapidly, a stark visual metaphor for its devastating impact.

I also made sure to clean up the data. Those ICD-10 codes in the Leading Cause column were a mouthful, so I’m stripping them out, leaving just the disease name for clarity. And I’m filtering for valid Deaths and Death Rate entries, because a null isn’t going to map to anything meaningful.
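
Concretely, the cleaning and mapping look roughly like this, assuming each row is a plain object parsed from the CSV (the output ranges are tuned by eye):

function makeParticle(row) {
  const deaths = Number(row.Deaths);
  const rate = Number(row['Age Adjusted Death Rate']);
  if (!deaths || !rate) return null; // skip nulls and malformed rows

  // Strip trailing ICD-10 codes like "(I00-I09, I11, ...)" from the cause name
  const name = row['Leading Cause'].replace(/\s*\(.*\)\s*$/, '');

  return {
    name,
    size: map(deaths, 0, 20000, 12, 64, true), // more deaths -> larger text
    speed: map(rate, 0, 300, 0.5, 4, true),    // higher rate -> faster fall
    fade: map(rate, 0, 300, 0.5, 3, true),     // ...and quicker dissolve
    x: random(width),
    y: -20,
    alpha: 255,
  };
}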

For the “unraveling” effect, I knew textToPoints() on every single particle would crash the sketch. My solution was a bit of a cheat, but effective: I draw each word a few times, with slight random offsets, and increase that offset as the word fades. This creates a ghosting, blurring effect that visually implies dissolution. Coupled with a semi-transparent background, it gives the impression of words literally melting into the ether.
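
In code, the ghosting amounts to drawing each word a few times with a jitter that widens as the word fades; a sketch using the particle fields from above:

function drawWord(p) {
  textSize(p.size);
  // The offset grows as alpha drops, so dissolving words blur apart
  const spread = map(p.alpha, 255, 0, 0, 8);
  for (let i = 0; i < 4; i++) {
    fill(0, p.alpha / 4); // each ghost copy carries a quarter of the opacity
    text(p.name, p.x + random(-spread, spread), p.y + random(-spread, spread));
  }
  p.y += p.speed;    // descend...
  p.alpha -= p.fade; // ...while dissolving
}

// In draw(), a semi-transparent background leaves the short-lived trails:
// background(255, 40);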

Right now, the dataSample is a curated list of causes to get the demo running smoothly. If this were a full-blown project, I’d implement a way to dynamically load and parse the entire CSV, allowing the user to select a year and see a completely different torrent. That’s a future enhancement, but for now, the sample gives a good impression of the dynamic effect.

Week 3 – Exquisite Candidate

Inspiration

I found myself thinking about the current state of political discourse—how it often feels chaotic, random, and almost nonsensical. Arguments and personas can seem totally random, as if different parts had been stitched together to form a strange new whole.

This immediately brought to mind ancient myths, like the centaur (人头马身), a creature with the head and torso of a human and the body of a horse. This became my core visual metaphor: what if I could create political “centaurs”? I could randomly pair the heads of recognizable political figures with symbolic, abstract bodies to represent the absurdity and randomness of political rhetoric.

The project needed a name that captured this idea. I was inspired by the Surrealist parlor game, “Exquisite Corpse,” where artists collaboratively draw a figure without seeing the other sections. My program does something similar, but with political figures, or “candidates.” The name clicked almost instantly: Exquisite Candidate.

Description

Exquisite Candidate is an interactive artwork that explores the chaotic nature of political identity. By clicking the mouse, the viewer generates a new “candidate”—a hybrid figure composed of a randomly selected head and a randomly selected body.

The heads are abstract but recognizable vector drawings of political figures. The bodies are symbolic and thematic, representing concepts like power (“suit”), vulnerability (“stripped_down”), foolishness (“sheep”), or emotional immaturity (“baby with tears”). The resulting combinations are surprisingly humorous or poignant (at least to me, the creator), creating a visual commentary on the fragmented and performative nature of public personas. To bring these abstract figures to life, Gemini helped me generate some of the many vector-based drawing functions for the assets.

Code

The program is built on an Object-Oriented structure with three main classes: Head, Body, and Creature. This keeps the code clean, organized, and easy to expand.

A challenge I encountered was with the “baby with tears” body. My initial design was simple: the Body object would draw itself, and the Head object would draw itself. But the tears needed to be drawn on the face, which is part of the Head object. How could the Body object know where the head was going to be drawn? Unfortunately, as of submission, I haven’t figured out how to implement this successfully.
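
One direction I may try after submission: since Creature owns both parts, it could hand the head’s coordinates to the body when drawing (a hedged sketch, not working code from the project; type and drawTears() are hypothetical names):

class Creature {
  constructor(head, body) {
    this.head = head;
    this.body = body;
  }
  draw() {
    this.body.draw();
    this.head.draw();
    // Creature knows where the head sits, so face overlays like tears
    // can be drawn by the body using the head's coordinates
    if (this.body.type === 'baby_with_tears') {
      this.body.drawTears(this.head.x, this.head.y); // hypothetical method
    }
  }
}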

Week 3 – Reading Response

The author differentiates “tools” for utility, such as the hypothetical Nintendo fridge, from things that are “fun” and “interactive.” He raises the question “Is interactivity utterly subjective?” only to discuss the process of interactivity as a subjective flow. In particular, he argues that the thinking that spurs creativity—and a certain level of randomness, as we discussed last week—is a crucial element of interactivity.

I agree that the thinking behind creative responses is the most important part. In the past, and even now, some low- and mid-level interactive creations, as the author would categorize them, depend solely on a set of rules that merely attempt to be generative. Their output carries no meaning in itself; it only reflects part of a bigger scene defined by the rule-setter. Ideally, however, every output should prompt further thinking in the receiver of the response, the originator of the conversation. That is fairly difficult to achieve, and was especially so in the past.

The advent of generative AI could bring some change, especially since it seems largely untouched in the interactive visual art sphere. What if, I say what if, some code were written in real time, following a certain broader set of rules? What if, in addition to a fixed set of rules, new impromptu visual rules were created in real time?

Week 2 – Looped

This week we’re asked to create an artwork that incorporates the “loop” concept in code. I had seen dynamic squares before, and I wanted to create a grid that gently “breathes” and drifts. Each square’s size and brightness are driven by layered sine waves using a shared time phase, so the whole field feels organic and connected, like a low-key pixel ocean.

Below is the code for core motion + styling logic (the vibe engine).

// Inside the nested grid loops: x and y are cell indices, px and py the
// cell's pixel position, cell the cell size, and phase advances each frame.
// Assumes colorMode(HSB, 360, 100, 100, 1).
const w = sin(phase + (x * 0.35) + (y * 0.45));  // wave seed in [-1, 1]
const s = map(w, -1, 1, cell * 0.25, cell * 0.85);  // size pulse
const dx = sin(phase * 0.7 + x * 0.3) * 6;  // horizontal drift
const dy = cos(phase * 0.7 + y * 0.3) * 6;  // vertical drift
const hueVal = (x * 8 + y * 6 + frameCount * 0.4) % 360;  // slowly cycling hue
const bri = map(w, -1, 1, 35, 90);  // brightness follows the wave
fill(hueVal, 60, bri, 0.9);
rect(px + dx, py + dy, s, s, 6);  // drift applied so the squares actually wander
  • What works: simple math, no arrays or heavy state, so it scales nicely with window size. The motion feels smooth and unified.
  • Limitations: all squares animate uniformly, and interaction is missing. The colors follow a fixed formula, so longer viewing gets predictable.

To be frank, this implementation still lacks the smooth “sea wave” vibe that I was looking for. In particular, I would have liked the edges to transform into non-linear lines like waves. But I would call this a first prototype as a p5.js beginner.

However, I tried a smaller square size, and I was surprised that such a minor change created something perceptually different. Finally, I implemented a super cool mouse-click effect, which in a way achieved another level of dynamic aesthetics.

Week 2 – Reading Reflection

In the video, Casey Reas starts with the age-old tension between order and chaos. He explains, “chaos is what existed before creation, and order is brought about by god or gods into the world.” For centuries, creation was a divine act of imposing regularity on a chaotic world. As humans, we sought to display our own “god-likeness” through patterns and symmetry.

By contrast, the early 20th-century Dadaists inverted this relationship. Against a “new” world era confined by scientific laws and societal logic (which had arguably helped lead to the chaos of war), they embraced chance as fundamentally human, taking apart what they saw as “the reasonable frauds of men.” The whole point of their “chance operations” was to set up artworks in which chance creates beauty out of chaos. Artists like Jean Arp and Marcel Duchamp used chance operations not to create chaos, but to rebel against a rigid order they no longer trusted and to escape the confines of their own preconceptions, creating something truly unexpected.

Yet this embrace of randomness, of what is unexpected to human eyes, is not a complete surrender to chaos. Rules, much like the physical laws of nature, flow secretly underneath. As Reas’s own work demonstrates, his generative systems show that a small amount of “noise” is essential to prevent static homogeneity. More importantly, why do simple, inorganic rules create such a sophisticated spectacle? I explored this dynamic, emergent complexity, the assembly of the crowd, in the course Robota Psyche.

My presentation, “Power of the Mass”, discussed how simple, inorganic rules governing a crowd can produce an incredibly sophisticated and life-like spectacle. The boundary of rules allows for randomness, but it is the assembly of the crowd that breathes life into the system. It raises the question of whether true creativity lies not in meticulous control, but in designing elegant systems that balance intention with unpredictability.

I would like to end my reflection with a Gerhard Richter quote.

“Above all, it’s never blind chance: it’s a chance that is always planned, but also always surprising. And I need it in order to carry on, in order to eradicate my mistakes, to destroy what I’ve worked out wrong, to introduce something different and disruptive. I’m often astonished to find how much better chance is than I am.”

Week 1 – Settling a Tree

My idea was to create a portrait that represents growth, experience, and the things that shape us. Instead of a face, I wanted to make my self-portrait a fluid, continuous scene, through which we see the diorama of a life being lived.

  • The Tree: Me. It starts as nothing and grows over time, with its branches reaching out in unique, unpredictable directions. Its placement on the field is random, symbolizing the random circumstances we’re all born into.
  • Yellow Lights: These are the fleeting, positive moments. They could be ideas, bursts of inspiration, happy memories, or moments of creativity. They fall gently, glow for a while, and then fade away, leaving a subtle impression and, more lastingly, imprints on the tree leaves that catch them.
  • Grey Stones: These represent the heavier, more permanent things in life. They could be foundational beliefs, significant life lessons, or more often than not, burdens and responsibilities. They fall with more weight, and once they hit the ground, they settle and become part of the landscape permanently. A sufficient number of stones would pave a road through the screen, from left to right.

The entire process is automated. I press play, and the code “paints” the portrait for a set amount of time before freezing, leaving a final, static image that is unique every time it’s run.

The choice of background was a key point of hesitation during the creative process. I first tried a pure black canvas, but the branches and their few leaves seemed too sparse and lonely. My next step was a semi-transparent black background, which created lovely trails but didn’t feel quite right visually. I finally settled on a semi-transparent dark grey, as it softened the high contrast while preserving the beautiful “ghosting” effect.

Below is the first version of the background.

One of my favorite tiny inventions for this project was a simple interactive feature that lets you control the flow of time: by holding down the mouse button, the entire animation slows to a crawl, creating a quiet, reflective moment as the scene unfolds. It doesn’t alter the final portrait, but it changes how you experience its creation.

// Mouse press animation, simple but I found quite effective
if (mouseIsPressed === true) {
  frameRate(10); // Slow down the animation
} else {
  frameRate(60); // Resume normal speed
}