Week 5 Reading Response

Computer vision isn’t really “vision” in the way humans experience it; it’s more like a giant calculator crunching patterns in pixels. Where we see a friend’s smile and immediately read context, emotion, and memory, the computer just sees light values and tries to match them against models. It’s fast and can process far more images than a person ever could, but it lacks our built-in common sense. That’s why artists and developers often need to guide it with techniques like face detection, pose estimation, background subtraction, or optical flow to help the machine focus on what’s actually interesting. Tools like MediaPipe, which can map out your skeleton for gesture-based games, or AR apps that segment your hand so you can draw in mid-air, can help bridge the gap between human intuition and machine literalism.

But once you start tracking people, you’re also borrowing from the world of surveillance. That’s a double-edged sword in interactive art. On one hand, it opens up playful experiences. On the other, the same tech is what powers CCTV, facial recognition in airports, and crowd analytics in malls. Some artists lean into this tension: projects that exaggerate the red boxes of face detection, or that deliberately misclassify people to reveal bias, remind us that the machine’s gaze is never neutral. Others flip it around, letting you “disappear” by wearing adversarial patterns or moving in ways the system can’t follow. So computer vision in art isn’t just about making the computer “see”; it’s also about exposing how that seeing works, what it misses, and how being watched changes the way we move.

You can also invert the logic of surveillance: instead of people being watched, what if the artwork itself is under surveillance by the audience? The camera tracks not you but the painting, and when you “stare” at it too long, the work twitches as if uncomfortable. Suddenly, the power dynamics are reversed.

Week 5 – Midterm Project Progress

For my midterm project, I decided to make a little balloon-saving game. The basic idea is simple: the balloon flies up to the sky and faces obstacles on its way that the player needs to avoid.

Concept & Production

Instead of just popping balloons, I wanted to make the balloon itself the main character. The player controls it as it floats upward, while obstacles move across the screen. The main production steps I’ve worked on so far include:

  • Making the balloon move upwards continuously.
  • Adding obstacles that shift across the screen.
  • Writing collision detection so that the balloon “fails” if it hits something.
  • Bringing back the buttons and menu look from the beginning, so the game starts cleanly.

It’s been fun turning the balloon from a simple object into something the player actually interacts with.

The Most Difficult Part
By far, the trickiest part has been getting the balloon to pop without errors. Sometimes collisions were detected when they shouldn’t have been, which gave me a bunch of false pops. Fixing that took way more trial and error than I expected, but I think I finally have it working in a way that feels consistent (I used help from AI and YouTube).
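
To make that concrete, here is a minimal sketch of the kind of circle-versus-rectangle check that avoids false pops from loose bounding-box tests. The names (balloon, obstacle, gameState) and the p5.js setup are my own placeholders, not the actual project code.

// Hypothetical p5.js helper: a circle-vs-rectangle collision check.
// Instead of comparing bounding boxes (which can trigger "false pops"
// near corners), it finds the point on the rectangle closest to the
// balloon's center and only reports a hit if that point is inside the balloon.
function hitsObstacle(balloon, obstacle) {
  const closestX = constrain(balloon.x, obstacle.x, obstacle.x + obstacle.w);
  const closestY = constrain(balloon.y, obstacle.y, obstacle.y + obstacle.h);
  return dist(balloon.x, balloon.y, closestX, closestY) < balloon.r;
}

// Possible usage inside draw():
// for (const obs of obstacles) {
//   if (hitsObstacle(balloon, obs)) gameState = "gameOver";
// }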

Risks / Issues
The main risk right now is that the game sometimes lags. Most of the time, it works fine, but once in a while, the balloon pops out of nowhere in the very beginning. I’m not sure if it’s about how I’m handling the objects or just the browser being picky. I’ll need to look into optimizing things as I add more features.

Next Steps
From here, I want to polish the interactions more, add sound effects, and make sure the game is fun to play for longer than a few seconds and is visually appealing. But overall, I feel good that the “scariest” part (getting rid of the balloon-popping errors) is mostly handled.

Week 5 – Reading Response – Shahram Chaudhry

One thing that really stood out to me from this week’s reading is how different computer vision is from human vision. We take it for granted that we can look at a scene and instantly make sense of it. We can tell if it’s day or night, if there’s someone in the frame, if they’re walking or just waving – all without thinking. But to a computer, a video is just a bunch of colored pixels with no meaning. It doesn’t “know” what a person or object is unless we explicitly program it to. There are several techniques to help computers track what we’re interested in. For example, frame differencing, which compares two frames and highlights motion, could be helpful in detecting someone walking across a room, while background subtraction can reveal new people or objects that appear. These sound simple, but they’re super powerful in interactive media.
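
As a rough illustration of how simple frame differencing can be, here is a minimal p5.js sketch (my own example, not code from the reading): it sums how much each pixel changed between consecutive webcam frames and treats a large total as motion. The threshold value is an arbitrary placeholder that would need tuning.

// Minimal frame differencing in p5.js: the motion score rises whenever
// pixels change between the current and previous webcam frames.
let video, prevFrame;

function setup() {
  createCanvas(320, 240);
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
  prevFrame = createImage(320, 240);
}

function draw() {
  image(video, 0, 0);
  video.loadPixels();
  prevFrame.loadPixels();

  let motion = 0;
  for (let i = 0; i < video.pixels.length; i += 4) {
    // Compare the red channel across frames as a cheap brightness proxy.
    motion += abs(video.pixels[i] - prevFrame.pixels[i]);
  }

  // Keep the current frame for the next comparison.
  prevFrame.copy(video, 0, 0, 320, 240, 0, 0, 320, 240);

  // A high total suggests someone just moved through the frame.
  fill(motion > 500000 ? 'red' : 'green');
  circle(20, 20, 20);
}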

What makes this especially interesting is how computer vision’s ability to track things brings up both playful and serious possibilities. On one hand, it’s fun: you can build games that react to your body like a mirror or let users move objects just by waving. But on the other hand, it opens doors to surveillance and profiling. Installations like The Sorting Daemon use computer vision not just to interact, but to critique how technology can be used for control. Or take the Suicide Box, which supposedly tracked suicides at the Golden Gate Bridge. It made me wonder: did it actually alert authorities when that happened, or was it just silently recording? That blurred line between passive tracking and ethical responsibility is something artists can explore in powerful ways.

Also, while humans can interpret scenes holistically and adapt to new contexts or poor lighting, computer vision systems tend to be fragile. If the lighting is off, or the background is too similar to a person’s clothes, the system might fail. No algorithm is general enough to work in all cases; it has to be trained for specific tasks. We process thousands of images and scenes every day without even trying. For a machine to do the same, I am assuming it would need countless hours (or even years) of training. Nevertheless, clever engineering and artistic intuition mean that we can still make good interactive art with the current state of computer vision.



Week 5 – Midterm Progress

For my midterm, I knew I wanted to incorporate a machine learning library, specifically for gesture recognition. I initially explored building a touchless checkout interface where users could add items to a cart using hand gestures. However, I realized the idea lacked creativity and emotional depth.

I’ve since pivoted to a more expressive concept: a Mind Palace Experience (not quite a game), where symbolic “memories” float around the screen – some good, some bad. The user interacts with these memories using gestures: revealing, moving, or discarding them. The experience lets users metaphorically navigate someone’s inner world and discard unwanted memories, ideally the painful ones. Here’s a basic canvas sketch of what the UI could look like.

At this stage, I’ve focused on building and testing the gesture recognition system using Handsfree.js. The core gestures (index finger point, pinch, open palm, and thumbs down) are working and will be mapped to interaction logic as I build out the UI and narrative elements next.

The code for the different gestures:

// Landmark indices follow the MediaPipe Hands convention used by Handsfree.js:
// 0 = wrist, 4 = thumb tip, 8 = index tip, 12 = middle tip, 16 = ring tip, 20 = pinky tip.
// Coordinates are normalized to 0–1, with y increasing downward.

// Pinch: thumb tip and index tip are close together.
function isPinching(landmarks) {
  const thumbTip = landmarks[4];
  const indexTip = landmarks[8];
  const d = dist(thumbTip.x, thumbTip.y, indexTip.x, indexTip.y);
  return d < 0.05;
}

// Thumbs down: thumb tip is below the wrist and all other fingers are curled.
function isThumbsDown(landmarks) {
  const thumbTip = landmarks[4];
  const wrist = landmarks[0];
  return (
    thumbTip.y > wrist.y &&
    !isFingerUp(landmarks, 8) &&
    !isFingerUp(landmarks, 12) &&
    !isFingerUp(landmarks, 16) &&
    !isFingerUp(landmarks, 20)
  );
}

// Open palm: index, middle, ring, and pinky are all extended.
function isOpenPalm(landmarks) {
  return (
    isFingerUp(landmarks, 8) &&
    isFingerUp(landmarks, 12) &&
    isFingerUp(landmarks, 16) &&
    isFingerUp(landmarks, 20)
  );
}

// A finger counts as "up" if its tip sits noticeably above the joint two landmarks below it.
function isFingerUp(landmarks, tipIndex) {
  const midIndex = tipIndex - 2;
  return (landmarks[midIndex].y - landmarks[tipIndex].y) > 0.05;
}

The sketch link:

https://editor.p5js.org/sc9425/full/n6d_9QDTg

Week 5 – Midterm Assignment Progress

Concept

For my midterm project, I’m building an interactive Hogwarts experience. The player starts by answering sorting questions that place them into one of the four houses. Then they get to choose a wand and receive visual feedback to see which wand truly belongs to them. After that, the player will enter their house’s common room and either explore various components in the room or play a minigame to earn points for their house.

The main idea is to capture the spirit and philosophy of each Hogwarts house and reflect it in the minigames, so the experience feels meaningful and immersive. Instead of just random games, each minigame will be inspired by the core traits of Gryffindor, Hufflepuff, Ravenclaw, or Slytherin.

Design

I want the project to feel smooth and interactive, with a focus on simple controls mostly through mouse clicks. Each stage (from sorting, to wand choosing, to the common room minigames) will have clear visual cues and feedback so the player always knows what to do next.

For the minigames, I’m aiming for gameplay that’s easy to pick up but still fun, and thematically tied to the house’s values. The design will mostly use basic shapes and animations in p5.js to keep things manageable and visually clean.

Challenging Aspect

The part I’m still figuring out and find the most challenging is designing minigames that really match each house’s philosophy but are also simple enough for me to implement within the project timeline. It’s tricky to balance meaningful gameplay with code complexity, especially because I already have a lot of different systems working together.

Risk Prevention

To manage this risk, I’ve been brainstorming minigames that are easy to build, like simple clicking games for Gryffindor’s bravery or memory games for Ravenclaw, while still feeling connected to the houses’ themes. I’m focusing on minimal input and straightforward visuals so I can finish them reliably without overwhelming the code.

Reading Reflection – Week 5

As Levin noted in the article, there is a wide range of opportunities to use computer vision for interactive projects in the real world. On the surface, human vision and computer vision seem similar, but at their core, the differences between them are striking. Human sight is based on context and shaped by years of lived experience, whereas computer vision starts from nothing but raw pixel data. Computer vision also depends on how well the image matches what the system was built to handle. If we give it an image of a person in different lighting or from a new angle, the processing can produce unexpected results, even though our human vision can easily identify that it’s the same person.

To help computers track what we’re interested in, I think it comes down to building a contrast between the object we wish to scan and its immediate surroundings. The author mentioned several techniques for doing this, such as frame differencing, which compares changes between video frames; background subtraction, which identifies what is new compared to a static scene; and brightness thresholding, which isolates figures using light and dark contrasts. What I found most interesting was the use of differences in movement in the Suicide Box project, where the odd vertical motion of a person was the contrasting event in the image, and what the computer consequently identified as the target.
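
To make the brightness thresholding idea concrete, here is a small p5.js sketch of my own (not code from the article) that isolates a bright, backlit figure by keeping only pixels above a threshold; the threshold value is a placeholder that depends entirely on the lighting.

// Brightness thresholding in p5.js: pixels brighter than the threshold
// become white, everything else black, isolating a backlit figure.
let video;
const threshold = 120; // placeholder; tune for the actual lighting

function setup() {
  createCanvas(320, 240);
  pixelDensity(1); // keep canvas pixels aligned with the video pixels
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
}

function draw() {
  video.loadPixels();
  loadPixels();
  for (let i = 0; i < video.pixels.length; i += 4) {
    const brightnessValue =
      (video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2]) / 3;
    const v = brightnessValue > threshold ? 255 : 0;
    pixels[i] = pixels[i + 1] = pixels[i + 2] = v;
    pixels[i + 3] = 255;
  }
  updatePixels();
}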

That said, computer vision’s capacity for tracking and surveillance makes its use in interactive art complicated. On one hand, it can make artworks feel so much more alive, and on the other, like in the Suicide Box project, it leads to significant controversy and even disbelief that the recordings could be real. It’s also interesting to think that what computer vision did in the Suicide Box project, human vision could never do, at least without causing the observer lifelong trauma. So computer vision does not just enable interactive art, but helps raise questions about privacy and control, and reflects cultural unease with the idea of being watched. 

I would also like to add how cool I find it that I’m now learning about these technologies in detail, when as a child I would go to art and science museums to see artworks that used this technology and leave feeling like I had just witnessed magic; a similar feeling to when I got my Xbox One and all the sports games would detect my movements as the characters’.

Week 5 – Reading Reflection

What I enjoyed most in this piece is how it drags computer vision down from the pedestal of labs and military contracts into something artists and students can actually play with. The examples, from Krueger’s Videoplace to Levin’s own Messa di Voce, remind me that vision doesn’t have to mean surveillance or soulless AI pipelines. It can also mean goofy games, poetic visuals, or even awkward belt installations that literally stare back at you. I like this take; it makes technology feel less like a monolith and more like clay you can mold.

That said, I found the constant optimism about “anyone can code this with simple techniques” a little misleading. Sure, frame differencing and thresholding sound easy enough, but anyone who’s actually tried live video input knows it’s messy. Lighting ruins everything, lag creeps in, and suddenly the elegant vision algorithm thinks a chair is a person. The text does mention physical optimization tricks (infrared, backlighting, costumes), but it still downplays just how finicky the practice is. In other words, the dream of democratizing vision is exciting, but the reality is still a lot of duct tape and swearing at webcams.

What I take away is the sense that computer vision isn’t really about teaching machines to “see.” It’s about choosing what we want them to notice and what we conveniently ignore. A suicide detection box on the Golden Gate Bridge makes one statement; a silly limbo game makes another. Both rely on the same basic tools, but the meaning comes from what artists decide to track and why. For me, that’s the critical point: computer vision is less about pixels and algorithms and more about the values baked into what we make visible.

Week 5 – Reading Reflection

The moment I started reading the article I immediately recognized Myron Krueger’s Videoplace from my first week in Understanding IM; I remember it because Professor Shiloh explained that Krueger was actually manually adjusting the project in the background but making it appear to audiences like an automatic feedback loop. At the time, only computer specialists and engineers had access to complex computer vision technologies; this story is a reminder to me that the development tools that we now take for granted have only become accessible to the majority of the population in the past decade.

How does computer vision differ from human vision?
In the simplest sense, I believe computer vision lacks perspective and has an innate lack of context. What humans lack in raw processing speed, they make up for with an innately flexible perception of what is in front of them. They ask questions or make comparisons, even to things that aren’t necessarily the objectively closest match.

When it comes to perspective in AI: artificial intelligence didn’t grow up with an innate curiosity about the world, no matter how many “Hello, World!”s it says. A human can look at a boy and a girl who always hang out together and assume romantic context, but an AI wouldn’t know that innately; that’s probably why the trope of AI learning human emotions from watching our movies and media is such a common one in our fiction pieces.

Techniques to help the computer see / track what we’re interested in?
I believe the article mentions using bright lighting or at least high contrast backgrounds. However, I’m sure that image training is also very important in today’s computer vision.

Effect of tracking & surveillance in interactive art
I remember when I got my Xbox 360 as a kid and got the Kinect system bundled alongside it. It was such a revolutionary technology back then and now we can recreate the same thing on the software side with just a webcam on p5js! That is incredibly impressive to me.

I never even considered computer vision in surveillance until I read the piece on the Suicide Box, which recorded real tragedies of people taking their lives at the Golden Gate Bridge. What surprised me is how port authorities counted thirteen in the initial hundred days of deployment, whereas the Suicide Box with its computer vision recorded seventeen. That’s four human lives that were tragically lost and possibly forgotten.

Week 5 – Midterm Progress (VERY) rough draft

(VERY ROUGH) draft of my game

For my midterm project I am designing an interactive memory game called Garden of Sequence. The idea is inspired by the concept of a magical garden where flowers “light up” in a sequence, and the player must repeat the pattern. Each round, the sequence grows longer and playback gets faster, which challenges the player’s short-term memory and focus. The interaction is simple but engaging: the player begins at a menu and presses Enter to start the game. During the playback phase, the game shows a sequence of flowers highlighted one by one with a circle (which I will later change to a glow or shine). Once playback ends, the player’s turn begins, and they must click the flowers in the same order. If they are correct, the game advances to the next round with a longer sequence. If they are incorrect, the game ends and a restart option appears. At any time, pressing “R” resets the game to the menu so a new session can begin.

Right now, I’m starting off with the bare bones of the game and keeping things simple. I’m not too focused on visuals or polish yet because I want to make sure the core concept, gameplay mechanics, and basic UI are working first. The prototype is built with a very clear structure: the flow of the game is controlled by four states, MENU, PLAYBACK, INPUT, and GAMEOVER. Each state decides what gets drawn on the screen and how the player can interact at that moment. I also created a Flower class to represent each clickable flower, which stores its position, size, color, and index. The class has a draw() method to show the flower and a contains() method to check if the player clicked inside it. The flowers are just circles for now, as placeholders. Other functions like startGame(), restartGame(), and prepareNextRound() handle moving from one round to the next, while makeSequenceForRound() creates a random sequence with the correct length for each round. The updatePlayback() function is what plays the sequence back to the player: it shows which flower is active by drawing a simple white outline circle around it (which I will later replace with a glow or other visual effect). Interaction is kept basic: the Enter key starts the game, the R key restarts it, and clicking on the flowers lets the player try to repeat the sequence.
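
To show how the playback and input phases stay separate, here is a stripped-down sketch of the state logic (my own simplification, not the full prototype; names like playbackIndex, playerIndex, and stepDuration are placeholders): clicks only register in the INPUT state, so playback and input can never overlap.

// Simplified state split: updatePlayback() only runs during PLAYBACK,
// and mousePressed() ignores clicks until the state becomes INPUT.
let state = "MENU";          // MENU, PLAYBACK, INPUT, GAMEOVER
let sequence = [];
let playbackIndex = 0;
let playerIndex = 0;
let lastStepTime = 0;
const stepDuration = 600;    // ms each flower stays highlighted

function setup() {
  createCanvas(400, 400);
}

function draw() {
  background(30);
  if (state === "PLAYBACK") updatePlayback();
  // drawing the flowers and menus for each state would go here
}

function updatePlayback() {
  if (millis() - lastStepTime > stepDuration) {
    playbackIndex++;
    lastStepTime = millis();
    if (playbackIndex >= sequence.length) {
      state = "INPUT";       // playback finished, hand control to the player
      playerIndex = 0;
    }
  }
}

function mousePressed() {
  if (state !== "INPUT") return;   // clicks during playback do nothing
  // ...compare the clicked flower against sequence[playerIndex] here
}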

NOT YET IN PROTOTYPE BUT PLANNED FOR THE ACTUAL GAME: When designing the visual elements for Garden of Sequence, I wanted to blend AI-generated assets using ChatGPT with my own creative touch. I used AI tools to quickly generate base images such as the background, which gave me a solid starting point and saved time on initial drafts. From there, I created a logo and customized it in Procreate, adding hand-drawn details, adjusting colors, and layering text with the flowers so they felt more personal and unique to the game. For the flowers I used images from Google that I liked, removed their backgrounds to make them PNGs, and tweaked minor details so they looked like what I want for my actual game. This mix of AI efficiency and manual drawing allowed me to create visuals that are polished but still carry my own artistic style. It’s important to note that these elements are not yet in the prototype but will be added to the actual game later on.

Background:

Flowers:

Game Logo for cover page:

The most intimidating part of this project was figuring out how to handle the playback of the sequence and the checking of user input without overlap. The challenge was not only to generate a random sequence but also to display it one flower at a time, with pauses in between, and then smoothly transition to the input phase. If playback and input overlapped, the game would feel broken. To minimize this risk, I stripped the game down to its simplest form. Instead of complex glowing graphics, I used a basic white circle to indicate the active flower. I tested different sequence speeds and lengths until the loop felt reliable. By reducing the visuals and focusing on the sequence logic, I was able to confirm that the core mechanic works before moving on to more complex features such as the sound and design. I’m excited to mess around with the sounds; I feel like they will add a lot of depth to my game, especially if I add a positive sound when users click the right flower in the sequence and an error sound when they get it wrong.

This prototype demonstrates the essential gameplay loop and shows that the memory challenge mechanic actually works in p5.js. With the hardest logic already tested, I now feel confident adding more polished elements such as custom flower drawings, glow animations, sparkles, and ambient sound. The prototype also sets up room for future features like score tracking, or maybe even weather events that could make gameplay more dynamic. Starting small and addressing the most uncertain part first gave me a working structure to build on, along with a clear plan for how to transform this into a polished final project.

Week 5 – Reading Response (Computer Vision for Artists and Designers)

Reading Computer Vision for Artists and Designers made me realize how differently machines interpret the visual world compared to humans. Where my eyes and brain can immediately recognize faces, objects, and contexts, a computer sees only streams of pixel data without inherent meaning. That difference kinda amazes me: what feels intuitive for me (like noticing the mood on a friend’s face) must be translated into measurable rules for the computer, such as brightness thresholds or background subtraction. This gap forces me to think about vision not as a natural act but as a series of constructed processes, something that both reveals the limits of human assumptions and opens new artistic possibilities.

The text also showed me that helping computers “see” isn’t only about coding better algorithms but also about designing the physical environment to be legible to the machine. Techniques like backlighting, infrared illumination, or retroreflective markers are surprisingly simple but effective. I found this point significant because it shifts responsibility back onto the artist or designer: we’re not just programming systems but curating conditions where vision becomes possible.

What I can’t ignore, though, is how these same techniques can easily blur into surveillance. Works like Lozano-Hemmer’s Standards and Double Standards or Jeremijenko’s Suicide Box made me uncomfortable precisely because they expose how tracking technologies, even when playful or artistic, carry power dynamics. If a belt can silently follow me or a camera can count unacknowledged tragedies, then computer vision isn’t neutral, it’s political. This makes me question: when I use vision algorithms in interactive art, am I creating a playful experience, or am I rehearsing systems of control?

For me, the text ultimately sharpened a tension: computer vision is at once liberating, because it expands interaction beyond a keyboard and mouse, and troubling, because it normalizes being watched. As a student studying Interactive Media, I feel I must navigate this duality carefully. A question that stuck with me is how to design works that use computer vision responsibly, acknowledging its history in surveillance, while still exploring its potential for creativity and embodiment.