Week 5 – Reading Reflection

The main difference between human vision and computer vision lies in computer vision's limitations. The text mentions that “no computer vision algorithm is completely ‘general’.” This means that none of them can perform reliably given any possible video input. Each algorithm comes with specific assumptions about what the scene will look like, and if those assumptions aren’t met, the results can be poor, ambiguous, or completely broken. This is obviously very different from human vision, which is significantly more adaptable. We are able to recognize almost anything in almost any environment.

However, one advantage of computer vision is its strength as a surveillance tool. Unlike human eyes, which can only see in normal light, computer vision systems can be paired with infrared or thermal cameras that work in complete darkness or detect body heat. This gives them a significant advantage as surveillance tools as they aren’t held back by the same biological limitations we are.

The techniques for helping the computer see better are mostly about manipulating the real world to suit the algorithm’s assumptions. Examples the text gives include using backlighting or retroreflective materials to create contrast, using infrared illumination in low-light conditions, choosing the right camera and lens for the situation, or even dressing subjects in specific colors. The idea is that good physical design and good code need to be developed together, not separately.

Computer vision’s limitations mean that to incorporate it into interactive art, you need careful planning and knowledge of where the art will be shown, so you can account for physical or environmental constraints. For instance, if you are creating a computer vision piece for an exhibition, you will need to analyze the venue and its environmental conditions to ensure you use the right technique to properly analyze the subject(s) being surveilled.

Interestingly, the irony is that the very limitations of computer vision mean that in an art context, the surveilled person often has to cooperate with the conditions for the system to work at all. That’s quite different from CCTV, where you’re tracked without consent or awareness. So, interactive art using computer vision tends to occupy this strange middle ground where surveillance becomes participation, which raises its own questions about what it means to be watched by a system you’re also performing for. This becomes a crucial moral question when it comes to projects such as the Suicide Box mentioned in the text and David Rokeby’s Sorting Daemon, where people are participants in the art installations without their consent, especially during vulnerable moments, as in the case of the Suicide Box.

Week 5: Reading Response

During the pandemic, I was really amazed by the process people had to go through before entering a space. Many places installed thermal face recognition systems at their entrances, and I remember lining up outside a mall, feeling confused about how it actually worked. While reading the article, that memory came back to me and resolved the confusion I had back then. This experience made me realize the differences between computer vision and human vision. Instead of relying on perception, judgment, and context like a human would, computer vision processes visual information through algorithms that detect specific patterns, such as facial features and temperature readings. The system does not interpret situations the way humans do; it reads measurable data and produces a result based on programmed criteria. People had to stop, face the camera, and stand at the right distance so the system could read them accurately. In this case, the environment and people’s behavior were adjusted to be more legible to the algorithm, showing that while human vision is flexible and adaptable to different situations, computer vision relies on structured data and optimized conditions to function efficiently and consistently.

Computer vision’s ability to track and monitor people also changes how interactive art functions. Because the technology can detect movement, faces, or body position, it allows artworks to respond directly to the audience’s presence. However, as people differ from each other, the system produces various responses, creating multiple forms of interaction and monitoring. We can help the computer see or track what we are interested in by providing more labeled data so it can learn the patterns we want it to detect, improving visibility with good lighting and clear angles, using visual markers or cues, and controlling the environment to reduce background clutter and maintain proper distance.

Week 5-Reading Response

Reading this made me think a lot about how different computer vision is from how I see the world. The author mentions that a computer is “unable to answer even the most elementary questions about whether a video stream contains a person or object,” and that honestly surprised me. It made me realize how much meaning humans naturally add when we look at something. We don’t just see pixels; we understand context, emotions, and objects instantly. Computers don’t do that; they see raw data that needs to be processed. That really changed how I think about interactive art, because I realized it’s not just about being creative; it’s also about setting up the right conditions so the computer can actually see what we want it to. I also noticed the author is very positive about computer vision. I don’t think he’s wrong, but I do feel he focuses more on the benefits than the risks, which makes me think he might be a little biased toward celebrating the technology.

The part about tracking and surveillance raised the biggest questions for me. In class, we saw that piece where the visuals changed based on how loud someone spoke, and that example helped me understand how the viewer becomes part of the artwork. The system watches you, follows your movement, and reacts right away. It’s cool, but it also feels like being watched. Even if the goal is interaction, it still brings up the idea of surveillance. It made me wonder where the line is between participation and being monitored. And how does this relate to the way technology watches us in everyday life? The reading didn’t fully answer those questions for me, but it definitely made me more aware of them.

Reading Reflection – Week 5

Reading this honestly made me laugh a little at the Marvin Minsky anecdote: the idea that “the problem of computer vision” could be assigned as a summer project feels almost delusional now, and I think the article uses that story perfectly to show how much we underestimate what vision actually means and what it really involves. What really stayed with me is the description of digital video as computationally “opaque,” because that word completely shifts how I think about it now. We all know text carries structure and meaning, whereas video is just, as stated in the text, rectangular pixel buffers with no built-in meaning. Humans attach meaning almost instantly, whereas computers need instructions just to separate foreground from background.

I also found it interesting that many of the techniques mentioned in the reading, like frame differencing and brightness thresholding, sound simple but are actually incredibly dependent on the physical conditions of the place. The article kept emphasizing that no algorithm is completely “general,” and that honesty stood out to me, because it means computer vision only really works smoothly and successfully when the environment is carefully prepared for it, which is actually crazy if you think about it, because it feels like everything you once knew about how computers see was a lie. The workshop example with the white Foamcore made that very clear, since the students basically redesigned their physical space to make brightness thresholding easier. That detail made me realize that computer vision is not just about writing more complex and smarter code, but also about kind of staging reality so the system can read it, which feels less like artificial intelligence and more like controlled intelligence.

The surveillance-themed works from the reading added another layer that I couldn’t ignore. When Rokeby describes his system as “looking for moving things that might be people,” the phrasing feels sort of purposefully detached, and that detachment made me feel a little unsettled. The same foundational techniques that allowed Videoplace to create playful full-body interactions are also what made Suicide Box possible, quietly recording real tragedies, which is just so scary to think about. I think that tension is what makes computer vision in interactive art powerful and complicated at the same time, because it forces us to confront how easily bodies can be tracked and reorganized into data. For me personally, the most compelling idea I got from this reading is that computer vision does not really just detect what is there, but kind of reflects what we choose to prioritize and make visible to the computer. Overall, this was an extremely fascinating reading and truly opened my eyes to the “true” meaning and reality behind computer vision.

Week 5 – Reading Reflection

It’s easy to forget that computers don’t actually see anything. When we look at a video feed, we instantly recognize a person walking across a room. A computer just registers a grid of numbers where pixel values shift over time. Because of this, computer vision is incredibly fragile. Every tracking algorithm relies on strict assumptions about the real world. If the lighting in a room changes, a tracking algorithm might completely break. The computer doesn’t see a “general” picture with context, since it only knows the math it was programmed to look for.

Basic Tracking Techniques

To work around this blindness, developers use a few techniques to track and react to the things they are interested in.

    • Frame differencing: comparing the current video frame to the previous one. If the pixels changed, the software assumes motion happened in that exact spot.

    • Background subtraction: memorizing an image of an empty room. When a person walks in, it subtracts the “empty” image from the live feed to isolate whatever is new.

    • Brightness thresholding: tracking a glowing object in a dark room by telling the software to ignore everything except the brightest pixels.

    • Simple object tracking: looking at the color or pixel arrangement of a specific object and searching for those same values as they move across the screen.
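Two of these techniques are simple enough to sketch in a few lines. The following is a minimal illustration in plain JavaScript, operating on made-up grayscale pixel arrays rather than a real video feed (in a p5.js sketch the values would come from something like `video.pixels`); the frame data and tolerance values here are hypothetical.

```javascript
// Frame differencing: mark a pixel as "motion" if it changed by more
// than a tolerance between the previous frame and the current one.
function frameDifference(prevFrame, currFrame, tolerance) {
  return currFrame.map((value, i) =>
    Math.abs(value - prevFrame[i]) > tolerance ? 1 : 0
  );
}

// Brightness thresholding: keep only pixels brighter than a cutoff,
// e.g. to track a glowing object in a dark room.
function brightnessThreshold(frame, cutoff) {
  return frame.map((value) => (value > cutoff ? 1 : 0));
}

// A hypothetical 4-pixel "frame" where only the last pixel changed
// meaningfully and only the first pixel is very bright.
const prev = [200, 30, 30, 30];
const curr = [210, 30, 30, 120];

console.log(frameDifference(prev, curr, 20)); // [0, 0, 0, 1]
console.log(brightnessThreshold(curr, 180)); // [1, 0, 0, 0]
```

Notice how both functions depend on magic numbers (the tolerance, the cutoff) that only make sense for one specific physical setup, which is exactly the fragility described above.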

Surveillance in Art

I find it very interesting that people use technology made for surveillance and the military to create art. Using technology built for control to make art flips our understanding of that technology, or at least makes it double-sided. The interactivity that comes with such tracking technology is hugely varied, and sometimes feels magical and deeply emotional, yet it all comes from the computer tracking, analyzing, and reacting to every move of the person in front of it. Such art takes the invisible, unsettling surveillance we experience every day and makes it extremely present.

Honestly, this military baggage explains a lot of computer vision’s blind spots. If you’re designing a system just to monitor crowds or track moving targets, you don’t need it to understand the whole scene and all details. You just need fast analysis of tiny differences, like a shift in pixels.

However, I feel that in interactive media details are very important; art runs on them. Since computer vision has not yet reached the point where it can analyze everything at once, artists have to come up with algorithms that approximate it instead.

Week 4 Assignment-Data Visualization

The Concept:

I decided for this assignment to recreate the five-star rating system used for film ratings. So my plan was for there to be five stars, and once you click on any of the five, it will show you the films with an average rating of that particular star.

The Process:

I first loaded all my images, fonts, and CSV file.

I thought this would be easier than it actually was; this was much, much harder than I expected. I first started by building my stars in a separate class to make it neater and easier for me to use. I followed this YouTube tutorial to make them since there is no star shape on p5, and trying to create the stars using lines that I would have to manually connect was too complex.

After creating the stars, I put them on my screen using a for loop to present 5 stars on the canvas. The first challenge I faced was getting the stars to light up or get filled in once my mouse was inside them. I tried using a particular if statement within a for loop, stating that if we were at index 0 to fill, but that did not work. I knew I had to do something with the distance between the mouse and each star’s outer radius. So, I decided to create an if statement using the distance function, with mouseX, mouseY, and the outer radius as my parameters. Unfortunately, that did not work, so I had to use ChatGPT to fix it for me, and it instead included the x and y positions of the stars. After that, I created an if statement where if the distance is less than the outer radius, the star would fill and “light up.” That’s when I faced another issue where, at some point, two stars would light up at the same time. I decided to call my sister, who is a computer science major, to help me solve the problem. We tried manipulating the x and y positions of the stars, but the problem persisted. My sister then suggested that the issue was with the if statement comparing the mouse distance to the outer radius, since there will always be an overlap between the stars’ outer radii. She suggested I do some addition/subtraction to the mouseX and mouseY positions within the if statement. Once I did that, the stars lit up successfully.
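To show the shape of that final hover logic, here is a minimal sketch in plain JavaScript. This is not the project's actual code: the star positions, spacing, and radius adjustment are made-up values, and dist() is defined locally so the snippet runs outside p5.js (which provides its own dist()).

```javascript
// Plain stand-in for p5.js's dist(): Euclidean distance between two points.
function dist(x1, y1, x2, y2) {
  return Math.hypot(x2 - x1, y2 - y1);
}

// Five hypothetical stars laid out in a row; each knows its center
// and outer radius.
const stars = [];
for (let i = 0; i < 5; i++) {
  stars.push({ x: 100 + i * 120, y: 200, outerRadius: 50 });
}

// Return the index of the star the mouse is inside, or -1 if none.
// Comparing against a slightly shrunken radius (outerRadius - 10)
// keeps neighboring stars from lighting up at the same time, which
// is the overlap problem described above.
function hoveredStar(mouseX, mouseY) {
  for (let i = 0; i < stars.length; i++) {
    const s = stars[i];
    if (dist(mouseX, mouseY, s.x, s.y) < s.outerRadius - 10) {
      return i;
    }
  }
  return -1;
}

console.log(hoveredStar(100, 200)); // 0 (dead center of the first star)
console.log(hoveredStar(160, 200)); // -1 (in the gap between stars)
```

In a real sketch, draw() would call hoveredStar(mouseX, mouseY) each frame and fill only the returned star.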

The real challenge was extracting the ratings of the data onto the canvas. I tried many different things such as a triple nested loop like this:

for (let i = 0; i < numRows; i++) {
  for (let j = 0; j <= Film_title; j++) {
    for (let k = 0; k < ratingStar.length; k++) {
      if (mouseIsPressed && d < ratingStar[i].outerRadius && int(Average_rating[i]) < 1.5 && ratingStar[0]) {
        background(255);
        text(Film_title[j], 200, 200);
      }
    }
  }
}

but it did not work. I knew I wouldn’t be able to work out the code on my own, so I decided to go to the peer tutors to figure something out. Although we couldn’t completely figure out what to do, she suggested I use ranges.

My friend then explained to me that I could create a minimum and a maximum and assign them to an array of ratings we created. While his method worked, it was difficult to understand for someone with minimal coding experience.

I asked for help from the professor at this stage, and she gave me some starting code that helped me understand what I needed to implement. After I applied it, the code ran successfully.

Code I’m Proud of:

Even though I wrote out the code with the help of the professor, I still feel particularly proud of this chunk because it made me realize the logic behind what I wanted to do, and it felt like a moment of realization and understanding. Also, this was the hardest part of what I wanted to achieve, so finally being able to do it was very relieving.

    } else if (status == 1) {
      for (let i = 0; i < TwoRatingFilms.length; i++) {
        text(TwoRatingFilms[i], 260, 240 + i * 30);
      }
    } else if (status == 2) {
      for (let i = 0; i < ThreeRatingFilms.length; i++) {
        text(ThreeRatingFilms[i], 420, 240 + i * 30);
      }
    } else if (status == 3) {
      for (let i = 0; i < FourRatingFilms.length; i++) {
        text(FourRatingFilms[i], 590, 240 + i * 30);
      }
    } else {
      for (let i = 0; i < FiveRatingFilms.length; i++) {
        text(FiveRatingFilms[i], 540, 420 + i * 30);
      }
    }
  }

Future Reflection:

Honestly, in the future I would aim for something only a little outside my comfort zone. I would also not underestimate the task the way I did in this project.

References:

https://youtu.be/rSp5iSTXwAY?si=RaaxtuAu8XivtpAF

 

Week 5 Assignment

Midterm Project Progress 1: Polyglot Galaxy (Week 5)

For my midterm project, I decided to develop an interactive generative artwork called Polyglot Galaxy. The concept is to create a multilingual visual space where users can click to “stamp” greetings from different languages onto a galaxy background. Each click generates a unique phrase using randomness, along with visual glow effects and sound feedback. My goal is to combine text, image, sound, and object-oriented programming into one interactive experience that reflects my interest in languages, combined with aesthetics like sound and animation.

In terms of user interaction design, the program starts with a start screen and transitions into the play state after the first click. Once the user enters the play mode, clicking on the canvas generates a new greeting text at the mouse position. The phrases are randomly selected from a JSON file and styled with different sizes, colors, and blinking alpha effects using sine functions.

The code that I am proud of is:

// sound on click
if (mouseX <= 300) {
  clickSound.play();
} else {
  clickSound1.play();
}

I implemented two different sounds depending on the click position: the canvas is split down the middle, so clicking toward the left side of the frame plays the 0.mp3 file (more of a beep sound), while clicking toward the right side plays the 1.mp3 file (a deeper sound). I also added a restart function using the “R” key to reset the session without reloading the page. I think this structure gives a good step-by-step approach.

From a coding perspective, I have begun designing the project using functions, classes, and interactivity as required. I created a GreetingText class to manage each stamped phrase as an object, including its position, color, size, glow shape, and blinking animation. The generatePhrase() function handles generative text creation using randomness from language data, punctuation, and decorative elements. Moreover, I added a state system (“start” and “play”) to control the interface flow.
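To make that structure concrete, here is a minimal, self-contained sketch of how a GreetingText class and generatePhrase() function like the ones described might fit together. This is not the project's actual code: the greeting list, punctuation set, and styling numbers are hypothetical stand-ins for the JSON data and p5.js drawing calls.

```javascript
// Hypothetical stand-in for the JSON language data.
const greetings = [
  { text: "Hello", lang: "English" },
  { text: "Hola", lang: "Spanish" },
  { text: "Salam", lang: "Arabic" },
];
const punctuation = ["!", "~", "..."];

// Generative text: pick a random greeting and decorate it with
// random punctuation.
function generatePhrase() {
  const g = greetings[Math.floor(Math.random() * greetings.length)];
  const p = punctuation[Math.floor(Math.random() * punctuation.length)];
  return `${g.text}${p}`;
}

// Each click stamps one of these objects at the mouse position.
class GreetingText {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    this.phrase = generatePhrase();
    this.size = 16 + Math.random() * 24; // random text size
    this.birth = 0; // frame count when the stamp was created
  }

  // Blinking alpha driven by a sine wave, oscillating between
  // roughly 1 and 255 as frames advance.
  alphaAt(frame) {
    return 128 + 127 * Math.sin((frame - this.birth) * 0.1);
  }
}

const stamp = new GreetingText(150, 80);
console.log(stamp.phrase);
console.log(stamp.alphaAt(0)); // 128 (mid-brightness at birth)
```

In draw(), each stored stamp would be rendered with fill(..., stamp.alphaAt(frameCount)) to get the blinking effect.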

The challenging part was integrating multiple media elements: sound playback, generative text from JSON, and object-oriented animation all in the same system. To reduce this risk, I tested each component separately, trying sound playback on mouse click, image backgrounds for different states, and a prototype class for animated text objects. I also added a limit to the number of stamped texts to ensure the sketch runs smoothly.

I think I can improve by adding music to the galaxy start menu to attract people, e.g. something like the opening music from Guardians of the Galaxy or other universal starting music.

https://p5js.org/reference/p5/textAlign/

Reading Reflection-Week #5

The reading made me think back on how invisible software work used to be in the past and how easily important contributions can be overlooked nowadays, especially when they do not fit dominant expectations of who a “technical innovator” should actually be. The article highlights that software wasn’t even considered important in the early Apollo mission planning, which aligns with how many modern technological systems still undervalue behind-the-scenes digital labor. From my own experience studying technology and creative coding, I see a similar pattern which is that people often praise visible outputs (design, hardware, final product) while ignoring the programming logic that makes everything function. This actually supports the author’s point that Hamilton’s work was revolutionary not only technically but conceptually, because she helped establish software as a legitimate engineering discipline. But at the same time, the reading also challenges my previous assumption that space exploration was mainly about hardware and astronauts; it made me reconsider how much critical decision-making and problem-solving actually happens in code and systems design.

However, the author might show some bias by strongly framing Hamilton as a singular heroic figure, which risks simplifying the collaborative nature of large-scale scientific projects. While the article acknowledges teams and engineers, it still centers a narrative of individual genius, which is common in technical journalism and can actually overlook collective labor and institutional structures. This raises questions for me about how history chooses which contributors to highlight and which to marginalize. I also wonder whether the article’s emphasis on gender barriers, while important, might shape the story to fit a modern narrative about women in tech rather than fully exploring the technical debates and engineering processes of the time. The reading ultimately makes me question how innovation is actually thought of. Do we celebrate people based on their actual impact, or based on how well their story fits contemporary social values and narratives about progress and inclusion?

Reading Reflection – Week 5

I used to assume computer vision worked like human vision, just less advanced, but I realized the difference is definitely bigger. Human vision automatically understands meaning, like someone’s face, while digital video is “computationally opaque”. It basically shows that a camera image is just pixel buffers with no meaning unless an algorithm like frame differencing, background subtraction, or brightness thresholding interprets it. I was surprised that simple techniques like object tracking can detect motion just by comparing the pixels. Vision systems do not have to be complex; even basic detection can be powerful if the physical environment and the code are designed well together.

One example that stuck with me was Myron Krueger’s Videoplace. I found it really interesting that early interactive art already used vision tracking to let people draw with their bodies. It made me realize how computer vision can expand the way we interact with technology. At the same time, Rafael Lozano-Hemmer’s work shows a more critical side. His belt-tracking piece turns surveillance into art, which made me wonder whether interactive work that tracks viewers is also training us to accept being watched.

The reading left me with a question: if computer vision works best when environments are designed specifically so that the computer can easily detect things, does that mean future spaces will be designed more for machines, made accessible for machines, than for us humans? Like, will there now be more controlled lighting, infrared illumination, and retroreflective materials? I think this text definitely shifted my perspective from seeing computer vision just as a technical tool to also seeing it as a cultural force that affects art and even social power.

Reading Reflection Week 5: The visionary difference between a Computer and a Human

I found it quite interesting to see how computer vision actually differs from human vision. Initially, I assumed that computer vision, being full of the knowledge we provide from the AI side, would at the very least be able to analyze what an image is. However, I was surprised to find out that computers only really see grids of pixels and are fully reliant on mathematical algorithms to get a cleaner picture of what is on screen. Whereas we humans are able to distinguish an object from its background under different lighting, computers have a hard time even telling that a shadow is passing across a room.

However, with regards to the use of tracking and surveillance, I would say it honestly opens up a world of possibilities: body tracking can serve as a controller for many games and loads of interactive media artworks. The coolest one I’ve personally seen so far is Just Dance. It utilizes a camera for motion tracking so that it’s able to give an accurate assessment of whether your dance moves match up with the computer’s example. Its main concept isn’t just a gimmick, but the crux of the game’s functionality, and the implementation, which assesses whether you follow the dance moves and gives you instant feedback through sound effects, is what makes it so useful. With regards to interactive media, this will allow people to interact with our art in a deeper way so that they can genuinely feel immersed in it.