Week 5 – Reading

  • What are some of the ways that computer vision differs from human vision?

As humans, we are able to look at something and classify it no matter the angle or lighting. A computer, on the other hand, just sees pixels of certain colours; we are the ones who see the result as a reflection of real life. This is where machine learning gets involved: through hundreds of labelled images, with colours and patterns identified, the computer learns to tell what it is looking at. That is exactly how the ML model in my midterm is able to detect which hand is which and which fingers are which.

As humans, we are told what is what by our environment, and we see this come up with blind people in particular. What we see as green, someone else may not see the same way. So in that sense, we are similar to computers.

  • What are some techniques we can use to help the computer see / track what we’re interested in?

Frame differencing – detects motion by comparing each pixel in one video frame with the corresponding pixel in the next. The difference in brightness indicates movement; this technique requires stable lighting and a stationary camera.
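
A minimal version of this in p5.js might look like the sketch below (my own illustration, not code from the reading); it sums the per-pixel brightness change between the current webcam frame and the previous one:

let video;
let prevFrame;

function setup() {
  createCanvas(320, 240);
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
  prevFrame = createImage(320, 240); // stores the previous frame
}

function draw() {
  video.loadPixels();
  prevFrame.loadPixels();
  let totalMotion = 0;
  for (let i = 0; i < video.pixels.length; i += 4) {
    // red channel as a cheap brightness proxy; sum the change per pixel
    totalMotion += abs(video.pixels[i] - prevFrame.pixels[i]);
  }
  // brighter background = more motion in the scene
  background(min(255, totalMotion / (video.pixels.length / 4)));
  prevFrame.copy(video, 0, 0, 320, 240, 0, 0, 320, 240);
}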

Background subtraction – detects presence by comparing the current frame against a stored image of the empty scene. Areas that differ significantly are likely to be objects of interest, but the technique is sensitive to lighting changes.
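
Background subtraction reuses the same pixel loop as frame differencing; the only structural change is that the reference image is captured once, while the scene is empty, instead of every frame. A sketch of just that difference (it assumes the same video setup as above):

let bgFrame; // stored image of the empty scene

function keyPressed() {
  // press any key while the scene is empty to store the background
  bgFrame = createImage(320, 240);
  bgFrame.copy(video, 0, 0, 320, 240, 0, 0, 320, 240);
}

// then, in draw(), compare video.pixels[i] against bgFrame.pixels[i]
// exactly as before, but never overwrite bgFrame afterwards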

Brightness thresholding – distinguishes objects based purely on luminosity, comparing each pixel’s brightness to a threshold value. This works well when you can control illumination, for instance through backlighting.
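
And a minimal thresholding sketch (the threshold value here is arbitrary and would need tuning to the actual lighting): every pixel brighter than the cutoff is painted white, everything else black, so a backlit body reads as a clean silhouette.

let video;
const THRESHOLD = 128; // arbitrary cutoff; tune to the lighting

function setup() {
  createCanvas(320, 240);
  pixelDensity(1); // keeps the canvas pixel array aligned with the video's
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
}

function draw() {
  video.loadPixels();
  loadPixels();
  for (let i = 0; i < video.pixels.length; i += 4) {
    const bright = (video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2]) / 3;
    const v = bright > THRESHOLD ? 255 : 0; // binary decision per pixel
    pixels[i] = pixels[i + 1] = pixels[i + 2] = v;
    pixels[i + 3] = 255;
  }
  updatePixels();
}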

By combining these techniques, we can create more complex artistic interactions such as contact interactions (triggering events when a silhouette touches a graphic object), overlap interactions (measuring shared pixels between the silhouette and virtual elements), or reflection interactions (computing angles when objects strike the silhouette). Warren’s research shows that once you’ve identified body pixels, implementing sophisticated interactions requires “little more than counting pixels”, making computer vision accessible for creating responsive installations, games, and performance systems where participants interact with virtual creatures or control visual elements through gesture and movement.

  • How do you think computer vision’s capacity for tracking and surveillance affects its use in interactive art?

The surveillance aspect is unavoidable: computer vision in art exists in the same technological ecosystem as security systems and facial recognition.

Different artists engage with this differently. Krueger’s Videoplace uses vision technology playfully, in a setting where people willingly participate. But Lozano-Hemmer’s Standards and Double Standards explicitly creates “a condition of pure surveillance” using symbols of authority, and that visibility of surveillance is the point. Rokeby’s Sorting Daemon confronts automated profiling by making visible the disturbing implications of computer vision used for racial categorisation, using surveillance tools to critique surveillance itself. Jeremijenko’s Suicide Box is honestly very creepy to me; to record deaths and have them shown really raises questions about who has the right to see those sorts of moments.

This topic raises questions about consent and about where users’ data is stored. If I were to interact with a piece of art like this, should I assume that it won’t store any input I feed it?

Week 5: Midterm progress

My Midterm Project Concept

Last week, after a long and tiring day, I decided to take a short break and treat myself to a simple dinner. I made a fresh salad, seasoned it well, and added a generous scoop of hummus. I thought that a good meal would help me feel better. However, halfway through eating, I noticed a fly lying right in my food. The sight instantly ruined my appetite and left me feeling uneasy, worried I might end up with a stomach ache. I couldn’t help but think how much better the evening would have been if that fly hadn’t landed in my meal.

Interestingly, a friend later shared a similar unpleasant experience of finding a worm in their food. That conversation sparked an unusual but fun idea for a game: Worm Against Sanity. In this game, the player goes around the campus, covering spots like the library, D1, D2, the marketplace, and the Palms, eliminating worms before they ruin the food.

One of the most challenging parts of building Worm Against Sanity was making the game seamlessly switch between multiple screens while also animating the girl and worm sprites so that they moved realistically across the canvas. I wanted the opening screen, the play area, and the menu to feel like distinct spaces, but still connect smoothly when the player clicked a button. To achieve this, I keep track of a screen variable that updates whenever a mouse click falls within certain button coordinates. In the draw() function, I check the current value of screen and display the correct background and elements for that state.

At the same time, I focused on fluid character and enemy movement. For the girl, I downloaded a running GIF and converted it into a sprite sheet, then wrote logic to cycle through the sprite frames every time an arrow key is pressed, flipping the image when she moves left. The worm uses a similar sprite-sheet approach, but it continuously advances across the screen on its own, updating its frame at regular time intervals and reducing the player’s life if it escapes. Coordinating these mechanics (screen transitions, sprite-sheet animation, and frame-by-frame movement) took careful planning and debugging, but it created a smooth and lively gameplay experience once everything clicked together.
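
Stripped of the game’s details, the pattern looks roughly like this (the variable and asset names here are my own placeholders, not necessarily the project’s actual identifiers):

let screen = "start"; // "start", "play", or "menu"
let girlSheet;        // sprite sheet of running frames
let girlFrame = 0;
let girlX = 100;
let facingLeft = false;
const FRAME_W = 64, FRAME_H = 64, NUM_FRAMES = 6; // assumed sheet layout

function preload() {
  girlSheet = loadImage("girl-run.png"); // hypothetical asset
}

function setup() {
  createCanvas(640, 480);
}

function draw() {
  if (screen === "start") {
    background(200);
    text("Click to play", width / 2, height / 2);
  } else if (screen === "play") {
    background(255);
    // advance the animation only while an arrow key is held
    if (keyIsDown(LEFT_ARROW))  { girlX -= 3; facingLeft = true;  girlFrame++; }
    if (keyIsDown(RIGHT_ARROW)) { girlX += 3; facingLeft = false; girlFrame++; }
    const sx = (floor(girlFrame / 4) % NUM_FRAMES) * FRAME_W; // slow the cycle down
    push();
    translate(girlX, height - FRAME_H);
    if (facingLeft) scale(-1, 1); // mirror the sprite when she runs left
    image(girlSheet, 0, 0, FRAME_W, FRAME_H, sx, 0, FRAME_W, FRAME_H);
    pop();
  }
}

function mousePressed() {
  // the real game checks button coordinates here; simplified for this sketch
  if (screen === "start") screen = "play";
}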

I also experimented with adding interactive features, such as having the character jump on a worm when I move my hand or make a fist. Although I haven’t fully figured out how to implement motion-based controls yet, I’m actively exploring solutions and refining the concept.

In terms of visuals, I wanted the game to feel lively and unique, so I used AI tools to generate a cartoony illustration of the NYUAD campus to serve as the background for the different screens. This gives the game a playful, campus-specific atmosphere and saves time that would have gone into manual drawing.

 

My Work so Far

 

Week 5 – midterm progress

So for my midterm, I want to create some form of art using machine learning. I want to have a visualisation of biology: showing a flower and zooming in repeatedly, down to the atomic level. I want to use the ML model to detect a pinching motion, which would trigger the page change.

index.html -> leaf.html -> cell.html -> atom.html

Firstly, I wanted to focus on the ML model and on having the motion detected. I used the ‘Hand Pose Detection with ml5.js’ video from The Coding Train as a foundation. I changed the parameters to detect just the right hand’s index finger and thumb.
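
The pinch check itself can stay very small. Here is a minimal version, assuming the ml5 v1 handPose API used in that video (the keypoint names come from the underlying MediaPipe model; the 30-pixel threshold is my own guess):

let handPose;
let video;
let hands = [];

function preload() {
  handPose = ml5.handPose();
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(640, 480);
  video.hide();
  handPose.detectStart(video, (results) => (hands = results));
}

function draw() {
  image(video, 0, 0);
  for (const hand of hands) {
    if (hand.handedness !== "Right") continue; // right hand only
    const thumb = hand.keypoints.find((k) => k.name === "thumb_tip");
    const index = hand.keypoints.find((k) => k.name === "index_finger_tip");
    if (thumb && index && dist(thumb.x, thumb.y, index.x, index.y) < 30) {
      // pinch detected: this is where the page change would be triggered
      fill(255, 0, 0);
      circle((thumb.x + index.x) / 2, (thumb.y + index.y) / 2, 20);
    }
  }
}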

Currently, I have incredibly basic images for the four pages, and I will work on making them more polished. The last page applies OOP principles, with several atoms and their spinning electrons.

I also want to add some sort of noise to the first three images to represent the environment in which you could find them. I am also thinking of making the transition between pages represent some sort of medium between the two images.

 

class Atom {
  constructor(x, y, rotationSpeed = 0.02, innerOrbitRadius = 40, outerOrbitRadius = 60) {
    this.x = x;
    this.y = y;
    this.rotationSpeed = rotationSpeed;
    this.innerOrbitRadius = innerOrbitRadius;
    this.outerOrbitRadius = outerOrbitRadius;
    this.rotation = 0; // current orbit angle, advanced every frame
    this.nucleusSize = 20;
    this.electronSize = 8;
    this.outerElectronSize = 6;
  }

  // display() sketched here for completeness (the original snippet showed
  // only the constructor): nucleus plus two counter-rotating electrons.
  display() {
    this.rotation += this.rotationSpeed;
    circle(this.x, this.y, this.nucleusSize); // nucleus
    circle(this.x + cos(this.rotation) * this.innerOrbitRadius,
           this.y + sin(this.rotation) * this.innerOrbitRadius, this.electronSize);
    circle(this.x - cos(this.rotation) * this.outerOrbitRadius,
           this.y - sin(this.rotation) * this.outerOrbitRadius, this.outerElectronSize);
  }
}
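
For context, a hypothetical usage of the class: an array of atoms, each redrawn every frame so the electrons appear to orbit.

let atoms = [];

function setup() {
  createCanvas(400, 400);
  for (let x = 100; x <= 300; x += 100) {
    for (let y = 100; y <= 300; y += 100) {
      atoms.push(new Atom(x, y, random(0.01, 0.04))); // varied spin speeds
    }
  }
}

function draw() {
  background(0);
  fill(255);
  for (const atom of atoms) atom.display();
}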

 

Week 5 Midterm Progress

Concept

For my midterm project, I came up with this dining hall idea at the last minute. I had originally been inspired by music interactivity in p5.js and considered continuing with my earlier idea of a meditation game. But while eating lunch, I came up with a new idea that felt both playful and relevant to my experience here at NYUAD. So this week I mostly worked on replanning my idea and preparing assets.

As a visiting student from the New York campus, I was used to the dining hall’s pre-made meals. But at NYUAD, the on-demand menus were at first a little overwhelming. Without pictures, I often had no idea what I had ordered (especially with Arabic dishes I wasn’t familiar with), and I even found myself pulling out a calculator to check how much I had ordered and how much I had left in my meal plan. Counters like All Day Breakfast felt especially confusing.

So my concept is to digitalize the experience of eating at NYUAD’s D2 All Day Breakfast counter. The project will let users visualize the ordering process, making it more interactive and hopefully reducing the friction that comes with navigating the real-life menu.

User Interaction

Planned Scenes (prototype):

1. Entering the A LA BRASA counter and tapping into the menu

2. Picking up the clamp to get food from the grill to the plate

3. Scanning food on the plate at the cashier’s scanner

4. Paying with coins in the cashier tray (display receipt?)

5. Eating!!

6. Burping to finish the meal

 

Assets:

Audio:

Dining hall ambient background

Cashier scanner beep

Cash register “kaching”

Burp sound

Yumyum sound

 

Pixelated images:

A LA BRASA counter background

All Day Breakfast menu

Grill plate

Clamp

Plate

Cashier scanner

Cashier with coins tray

Coins (D5, D3, D2, D1, D0.5, D0.25)

Fork

 

Pixel art food items:

Avocado fried egg toast

Avocado toast

French toast

Fried egg

Scrambled egg

Plain omelet

Cheese omelet

Mixed vegetable omelet

Tofu omelet

Hash brown

Chicken sausage

Beef bacon

Turkey bacon

Classic pancake

Coconut banana pancake

Small bowl salad

 

The Most Frightening Part & How I’m Managing It

The biggest challenge I anticipate is gathering and aligning all these assets into a coherent game within the midterm timeframe. Real-life food images can be messy and hard to unify visually. To reduce this risk, I’ve decided to make everything in pixel art style. Not only does this match the “breakfast game” aesthetic, but it also makes it much easier to align items consistently.

Since Professor Mang mentioned we can use AI to help generate assets, I’ve been experimenting with transforming photos of my own plates and my friends’ meals into pixelated versions. This approach makes asset creation more manageable and ensures I’ll be able to integrate everything smoothly into the game.

 

Week 5: Reading Response

The part that stopped me was Suicide Box (1996), a camera pointed at the Golden Gate Bridge, quietly counting every time someone jumped. It sounds blunt, almost cold, yet I like the idea behind it. The artists (Natalie Jeremijenko and Kate Rich) flipped surveillance on its head: instead of policing people, the camera bore witness to a tragedy that the official numbers under-reported. I even looked it up afterward and found more debate and follow-up writing. Some people doubt the footage, others question the ethics of recording suicides. That tension actually makes the piece stronger for me; it shows how art can force uncomfortable truths into view.

 

What struck me next was how the essay treats technology as something physical and playful. Levin keeps pointing out that success often comes from the scene you build, not just the code you write: light a wall evenly, add reflective tape, adjust the lens. I like that attitude. It feels more like setting a stage than crunching math, and it makes computer vision sound approachable, even fun, for artists and students. The student project LimboTime, for example, came together in one afternoon with a webcam and a bright background. That shows how the simplest setups can spark creative interaction.

 

Overall, reading this made me want to experiment myself. The mix of raw data, social urgency, and poetic framing in Suicide Box shows how art and code can meet to notice what society tries not to see and maybe, slowly, help change it.

Week 5 – Midterm Progress

After three days of painstaking brainstorming for my midterm, I came up with two directions: one was a game-like networking tool to help people start conversations, and the other was a version of Flappy Bird controlled by the pitch of your voice.

I was undoubtedly fascinated by both, but as I thought about it more, I wanted to explore generative AI further. Therefore, I combined the personal, identity-driven aspect of the networking tool with a novel technical element.

The Concept

“Synthcestry” is a short, narrative experience that explores the idea of heritage. The user starts by inputting a few key details about themselves: a region of origin, their gender, and their age. Then, they take a photo of themselves with their webcam.

From there, through a series of text prompts, the user is guided through a visual transformation. Their own face slowly and smoothly transitions into a composite, AI-generated face that represents the “archetype” of their chosen heritage.

Designing the Interaction and Code

The user’s journey is the core of the interaction design, since I had already come across game-state design in class. I broke the experience down into distinct states, which became the foundation of my code structure:

  1. Start: A simple, clean title screen to set the mood.
  2. Input: The user provides their details. I decided against complex UI elements and opted for simple, custom-drawn text boxes and buttons for a more cohesive aesthetic. The user can type their region and gender, and select an age from a few options.
  3. Capture: The webcam feed is activated, allowing the user to frame their face and capture a still image with a click.
  4. Journey: This is the main event. The user presses the spacebar to advance through 5 steps. The first step shows their own photo, and each subsequent press transitions the image further towards the final archetype, accompanied by a line of narrative text.
  5. End: The final archetype image is displayed, offering a moment of finality before the user can choose to start again.

My code is built around a gameState variable, which controls which drawing function is called in the main draw() loop. This keeps everything clean and organized. I have separate functions like drawInputScreen() and drawJourneyScreen(), and event handlers like mousePressed() and keyPressed() that behave differently depending on the current gameState. This state-machine approach is crucial for managing the flow of the experience.
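
In skeleton form (the drawing-function names follow the ones mentioned above; the bodies here are placeholders of my own):

let gameState = "start";

function setup() {
  createCanvas(640, 480);
  textAlign(CENTER, CENTER);
}

function draw() {
  // one drawing function per state keeps draw() itself tiny
  if (gameState === "start") drawStartScreen();
  else if (gameState === "journey") drawJourneyScreen();
  // ...input, capture, and end are handled the same way
}

function drawStartScreen() {
  background(0);
  fill(255);
  text("Synthcestry: click to begin", width / 2, height / 2);
}

function drawJourneyScreen() {
  background(30);
  fill(255);
  text("press space to continue the transformation", width / 2, height / 2);
}

function mousePressed() {
  if (gameState === "start") gameState = "journey"; // shortcut for this demo
}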

The Most Frightening Part

The biggest uncertainty in this project was the visual transition itself. How could I create a smooth, believable transformation from any user’s face to a generic archetype?

To minimize the risk, I engineered a detailed prompt that instructs the AI to create a 4-frame “sprite sheet.” This sheet shows a single face transitioning from a neutral, mixed-ethnicity starting point to a final, distinct archetype representing a specific region, gender, and age.

To test this critical algorithm, I wrote the startGeneration() and cropFrames() functions in my sketch. startGeneration() builds the asset key and uses loadImage() to fetch the correct file. The callback function then triggers cropFrames(), which uses p5.Image.get() to slice the sprite sheet into an array of individual frame images. The program isn’t fully functional yet, but you can see the functions in the code base.
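
Under my assumptions (a horizontal sheet of four equal frames, and a file-naming scheme of my own invention), the slicing step could look like this:

let frames = [];

function startGeneration(region, gender, age) {
  const assetKey = region + "-" + gender + "-" + age + ".png"; // hypothetical naming scheme
  loadImage(assetKey, cropFrames); // the callback fires once the sheet has loaded
}

function cropFrames(sheet) {
  const w = sheet.width / 4; // four frames laid out side by side
  frames = [];
  for (let i = 0; i < 4; i++) {
    frames.push(sheet.get(i * w, 0, w, sheet.height)); // p5.Image.get() slice
  }
}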

As for the use of image assets, I had two choices. One is to use a live AI API generation call; the other is to have a pre-built asset library. The latter would be easier and less prone to errors, I agree; but given the abundance of nationalities on campus, I would have no choice but to use a live API call. It is to be figured out next week.

 

Week 5 – Reading Response

After reading Golan Levin’s “Computer Vision for Artists and Designers,” I’m left with a deep appreciation for the creativity that arose from confronting technical limitations. The article pulls back the curtain on interactive art, revealing that its magic often lies in a clever and resourceful dialogue between the physical and digital worlds, not in lines of complex code. Apparently, the most effective way to help a computer “see” is often to change the environment, not just the algorithm.

Levin shows that simple, elegant techniques like frame differencing or brightness thresholding can be the building blocks for powerful experiences, in contrast to my preexisting thought for a powerful CV system. The LimboTime game, conceived and built in a single afternoon by novice programmers who found a large white sheet of Foamcore, pushed the change in my perspective. They didn’t need a sophisticated algorithm; they just needed a high-contrast background. It suggests that creativity in this field is as much about physical problem-solving as it is about writing code. It’s a reminder that we don’t live in a purely digital world, and that the most compelling art often emerges from the messy, inventive bridge between the two.

The article also forced me to reflect on the dual nature of this technology. On one hand, computer vision allows for the kind of playful, unencumbered interaction that Myron Krueger pioneered with Videoplace back in the 1970s. His work was a call to use our entire bodies to interact with machines, breaking free from the keyboard and mouse. In the past or now, it is always joyful that our physical presence can draw, play, and connect with a digital space in an intuitive way.

On the other hand, the article doesn’t shy away from the darker implications of a machine that watches. The very act of “tracking” is a form of surveillance. Artists like David Rokeby and Rafael Lozano-Hemmer confront this directly. Lozano-Hemmer’s Standards and Double Standards, in particular, creates an “absent crowd” of robotic belts that watch the viewer, leaving a potent impression that I would not have expected from visual technology in the early 2000s.

Ultimately, this reading has shifted my perspective. I see now that computer vision in art isn’t just a technical tool for creating interactive effects. It is a medium for exploring what it means to see, to be seen, and to be categorized. The most profound works discussed don’t just use the technology; they actively raise questions about the technology. They leverage its ability to create connection while simultaneously critiquing its capacity for control. I further believe that true innovation often comes from embracing constraints, and that the most important conversations about technology could best be articulated through art.

Week 5: Pong² (Midterm Progress Report)

Concept:

Finding the concept for this project was rather difficult. I initially thought about creating a rhythm game that would use the WASD keys and the arrow keys, similar to a 2D Beat Saber where the direction you cut matters as much as your timing. I was inspired by Mandy’s project from Fall 2024, where she made a two-player dance game in which you have to input a specific pattern of keys in combination with your partner. I thought her project had a lot of charm to it; however, I imagined syncing the rhythm to the onscreen inputs would prove challenging, so I scrapped that idea early on.

Then I revisited the project I made for week 3, where we had to use loops. Back then I just wanted to follow Professor Aya’s advice (when she visited on Monday of week 3) and use object-oriented programming to add manually controlled collision physics to the project I made for week 2. That’s how I accidentally made Pong, but this week I seriously considered turning it into a fully functioning project.

I realized that my code was going to be very messy and a long scroll from top to bottom if I kept it all in one .js file, so I looked up how to split up the files. After I created the new .js files, I realized that setup() and draw() can only run in the main sketch.js file, so I had to work around that for the gameLogic.js file, which is just my imported week 3 sketch. I made new functions named initiateGame() and initiateMenu() and called them in the main sketch.js setup() function. I also had to add two lines to the HTML file so the project can access these new files:

<script src="menu.js"></script>
<script src="gameLogic.js"></script>
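
Because all the loaded scripts share one global scope, sketch.js can call functions defined in the other files. A minimal sketch of the arrangement (the function bodies here are placeholders of my own):

// sketch.js: the only file that defines setup() and draw()
function setup() {
  createCanvas(600, 600);
  initiateMenu(); // defined in menu.js
  initiateGame(); // defined in gameLogic.js
}

// menu.js
function initiateMenu() {
  // build the buttons and load menu assets
}

// gameLogic.js
function initiateGame() {
  // reset the paddles, ball, and scores
}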

Updating the “PING” Game

Besides the obvious requirements to add a menu screen and a restart button, there were plenty of personal touches I wanted to add to what I made two weeks ago.

The first was to implement a more visually appealing score-tracking system. Last time I had the game repeat continuously, so I made a very minimalistic, number-based scoring system to fit the rest of the minimalist aesthetic. Since I was now adding a set number of rounds, I wanted a more interesting way of representing each point: a rectangle in the middle that flicks slightly upwards or downwards based on which side took the point (kind of like a light switch).

The next idea was to add an options menu for changing the in-game volume mixer and maybe even keybinds.

Since the game now needs to reset without restarting the whole sketch, I added an ESC function that backs out of the game to the menu; it also resets the scores, so a new game starts the next time the user clicks the button to enter the game.

//RESET SCORE
function resetGame(){ //resets the score
  roundNumber = 0;
  topScore = 0;
  topCounter = [];
  bottomScore = 0;
  bottomCounter = [];
}
...
function menuMousePressed(){
  twoPlayerButton.handleClick(mouseX, mouseY);
  resetGame();
}

I also built each button from its own class, passing functions as parameters to call when clicked. I have only tested one button so far (the twoPlayerButton that leads into the normal mode), but it works great and I’ve “scaffolded” an area for more buttons to be added the same way.

allMenuButtons.push(
  twoPlayerButton = new button(
    tempButton, //image 
    width/2 + 120, //xPos
    height/2, //yPos
    tempButton.width, //sets to uploaded file dimensions
    tempButton.height, 
    switchToGame //which function to call if clicked?
  )
);
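
For reference, one way the button class could be written to match that call site (my own reconstruction; the real class may differ):

class button {
  constructor(img, x, y, w, h, onClick) {
    this.img = img;
    this.x = x; // centre coordinates
    this.y = y;
    this.w = w;
    this.h = h;
    this.onClick = onClick; // the function passed in as a parameter
  }

  display() {
    imageMode(CENTER);
    image(this.img, this.x, this.y, this.w, this.h);
  }

  isHovered(mx, my) {
    return abs(mx - this.x) < this.w / 2 && abs(my - this.y) < this.h / 2;
  }

  handleClick(mx, my) {
    if (this.isHovered(mx, my)) this.onClick(); // fire the stored callback
  }
}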

User Interface Planning

To plan out the UI, I quickly made a Canva file that matched the dimensions of my project and sketched what I wanted the menu to look like. I’m going for a rather minimalist look.

This is also how I came up with a better name for the game: Pong²

It’s not quite Pong 2, but since the paddles are no longer restricted to one dimension along each edge and now have true 2D movement, I wanted to call it something representative of that.

For the font, I chose Gotham for its geometric feel. Since p5.js doesn’t ship with Gotham, I downloaded an .otf file online and placed it in my sketch folder.

For the final version of the sketch,  I want to add a faint bouncing ball in the background to make the menu screen feel more dynamic and alive too.

A detail I was quite proud of was setting up a hover checker that turns my cursor into a hand whenever it hovers over something clickable.

let hovering = false; // reset each frame before checking the buttons

for (let btn of allMenuButtons) { 
  btn.display(); //displays the button
  
  if (btn.isHovered(mouseX, mouseY)) {
    hovering = true; //check if hovering any buttons
  }
}

if (hovering) {
  cursor(HAND);
} else {
  cursor(ARROW);
}
Identify the Most Frightening Aspect: 

The most frightening part of this project is most certainly the one-player mode I want to make, where you can play against an AI. Theoretically it’s just some simple math to have it sort of predict the ball’s trajectory, but I imagine it would take a considerable amount of effort to figure out how that would work.

I might drop this aspect altogether if it’s too challenging to make before the deadline but I really would like to make it work.
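
If I do attempt it, the core of that “simple math” might look something like this: project where the ball will cross the AI paddle’s row, mirror the position for each wall bounce, and ease the paddle toward that point (all names here are hypothetical, not from my actual code):

function predictBallX(ball, paddleY) {
  const t = (paddleY - ball.y) / ball.vy; // frames until the ball arrives
  if (t < 0) return width / 2;            // ball moving away: drift back home
  let x = ball.x + ball.vx * t;           // position if there were no walls
  const period = 2 * width;               // fold wall bounces back into the canvas
  x = ((x % period) + period) % period;
  return x > width ? period - x : x;
}

function updateAIPaddle(paddle, ball) {
  const target = predictBallX(ball, paddle.y);
  paddle.x = lerp(paddle.x, target, 0.08); // easing keeps the AI beatable
}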

 

 

Week 5 – Midterm Project Progress

Project Concept & Design

For my midterm project, I wanted to make something a little different. Looking at past projects, I noticed that all the sketches used a ton of colors and soft shapes that created an aesthetically pleasing and welcoming environment for the user. So I thought, why not make the user uncomfortable this time? Of course, I can’t know all the different things that make various people uncomfortable, so this interactive artwork + game is based on things that make me uncomfortable.

Basically, it’s a series of challenges that the user must get through in order to win. But each challenge is supposed to evoke some sort of uncomfortable emotion/feeling (hopefully) within the user. The design is such that there are 8 doors, behind each of which is a challenge that the user must complete. Once the user has clicked on a door, they are trapped in that room until they successfully complete the challenge. Only then are they given the option to return to the hallway and go through another door. The only colors I aim to use across this game are black, white, and red, as I think this palette throws people off and adds to the overall mood I’m going for.

Production & The most frightening part

There were two things I was most worried about. First was making the hallway. This is because I couldn’t find the right kind of doors online, and had to design my own using Canva, and then go through the complicated process of cutting out backgrounds, resizing, and figuring out the coordinates on the canvas for each door so that they align in a perfect one-point perspective. That said, I’m pretty happy with how the end result turned out.

Second, I had imagined that it would take a lot of work to implement the general algorithm for the game: structuring it into different rooms, coding a timed mini-game for each room (while keeping everything in theme), and keeping the user trapped in a room until they solved it. To minimize this risk, I decided to divide the code into different game states, so that whenever I need to add the code for a new door, say door 2, I can just add the required code, set the game state to room 2, and test it out. For elements like the popups and the timer, which get used for every door, I made reusable functions that I can call for each door with different arguments (popup messages) depending on what I want. Hopefully, for every door I can keep the format I have implemented now, an initRoom__ and drawRoom__ function for each door. Then it’s pretty much just following the same pattern 8 times.
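
In skeleton form, the pattern for one door looks something like this (room names, timings, and messages are placeholders of my own):

let gameState = "hallway";
let roomTimer = 0;

function setup() {
  createCanvas(600, 400);
}

function draw() {
  if (gameState === "hallway") drawHallway();
  else if (gameState === "room1") drawRoom1();
  // ...initRoom2()/drawRoom2() and so on follow the same pattern
}

function drawHallway() {
  background(255);
  fill(0);
  text("click to enter door 1", 20, 40);
}

function initRoom1() {
  roomTimer = 60 * 10; // the reusable timer: 10 seconds at 60 fps
  gameState = "room1";
}

function drawRoom1() {
  background(0);
  fill(255, 0, 0);
  text("escape in " + ceil(roomTimer / 60) + "s", 20, 40);
  if (--roomTimer <= 0) gameState = "hallway"; // placeholder win/lose rule
}

function mousePressed() {
  if (gameState === "hallway") initRoom1();
}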

Risks/Issues

My biggest problem right now is that I may have overestimated my abilities by choosing to do 8 rooms. I’m going to have to come up with 8 different mini-game ideas that make people uncomfortable and implement them in the given time frame.

Next Steps

Currently I’ve only implemented 2 rooms. I need to implement the other 6, as well as decide on what to do for the end page of the game when the user finishes all the challenges. I also need to decide on a background soundtrack that will run for the entire game. I also haven’t been able to think of a name for this project yet, and that’s pretty important.

It’s much better to view the sketch in full-screen: https://editor.p5js.org/siyonagoel/full/eEPhmUoLw

Week 5 – Reading discussion

When I think about computer vision, what interests me most is how strange it feels to give a machine the ability to “see.” Human vision is so automatic and seamless that we don’t really think about it, but when you translate it into algorithms, you realize how fragile and mechanical that process is. I find it fascinating that a computer can pick up tiny details that our eyes might not notice, yet at the same time, it can completely miss the “big picture.” That makes me wonder whether computer vision is really about replicating human vision at all, or if it’s creating an entirely different way of perceiving the world.

What I find both exciting and unsettling is how computer vision plays with control. On one hand, it can feel magical when an artwork follows your movements, responds to your gestures, or acknowledges your presence (like in TeamLab). There’s an intimacy there, like the piece is aware of you in a way that a static painting could never be. On the other hand, I can’t help but think about surveillance every time I see a camera in an installation. Am I part of the artwork, or am I being monitored? That ambiguity is powerful, but it also puts a lot of responsibility on the artist to think about how they’re using the technology.

For me, the most interesting potential of computer vision in interactive art isn’t just the novelty of tracking people, but the chance to reflect on our relationship with being watched. In a world where surveillance cameras are everywhere, an artwork that uses computer vision almost automatically becomes a commentary on power and visibility, whether or not the artist intends it. I think that’s what makes the medium so rich: it’s not just about making art “see,” it’s about making us more aware of how we are seen.