Week 5: Midterm progress

For my midterm project, I want to make a nail salon game in p5.js. The idea is that customers come in, and the player has to design nails that match how they feel. I want it to be more than just picking random colors, so the customer's mood is what drives the creative choices. I also want to include a mood-guessing game at the beginning: the customer says a short line, and the player guesses their mood before starting the nail design. Then the player designs the nails based on that mood, and the game gives feedback at the end.

I want the design of the project to look cute, simple, and easy to understand, like a cozy nail studio. I plan to use soft colors and clear buttons so the user can move through the experience without getting confused. The project will start with an instructions screen, then move to the main nail salon screen where the customer says their line and the player guesses their mood, then the player designs the nails, and finally sees a result screen with feedback and a restart option.

For the visuals, I will use shapes for the nails, buttons, and decorations, and images for things like the background or customer. I also plan to include sound and text for the customer’s reaction/line so the project feels more interactive. This is a sketch of what I’m planning for my game.

I think the most challenging part will be organizing the project and the code, and making sure everything appears at the right time. Since the project has multiple stages, I need to keep the flow clear and make sure the user knows what to do next. Another challenge is the nail design interaction itself. I still need to decide the simplest and best way for the player to apply colors and decorations; I want it to be easy to use, but still feel fun and creative. I also still have not figured out how the game should decide, at the end, whether the final design matches the customer's mood.
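
One direction I am considering (nothing is decided yet) is mapping each mood to a palette and scoring how many of the player's chosen colors belong to it. A minimal sketch; the moodPalettes object, the moodMatchScore helper, and the palettes themselves are all hypothetical placeholders:

    // Hypothetical: each mood maps to a small palette of hex colors.
    const moodPalettes = {
      happy: ["#ffd166", "#ef476f", "#06d6a0"],
      calm:  ["#a8dadc", "#cdb4db", "#f1faee"],
      sad:   ["#457b9d", "#1d3557", "#6c757d"],
    };

    // Score = share of the player's chosen colors that fit the mood's palette.
    function moodMatchScore(mood, chosenColors) {
      if (chosenColors.length === 0) return 0;
      const palette = moodPalettes[mood];
      const hits = chosenColors.filter((c) => palette.includes(c)).length;
      return hits / chosenColors.length; // 0 = no match, 1 = perfect match
    }

    // e.g. moodMatchScore("calm", ["#a8dadc", "#1d3557"]) returns 0.5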

To reduce that risk, I will first make a simple version of the project with only the screen flow and text/colors. This will help me test if the structure works before I spend time on the final visuals. I will also make a reusable button class early so I can use it for mood choices, color choices, and the restart button. After that, I will test the nail design interaction with basic shapes first, like clicking a button to change the nail color, and I may also try a brush-like stroke animation.
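
Since the button class and the screen flow are the first things I will build, here is a minimal sketch of how they might look; all the names (Button, state, the screen labels) are placeholders for the real project:

    // A reusable button: the same class for mood choices, color swatches, and restart.
    class Button {
      constructor(label, x, y, w, h, onClick) {
        this.label = label;
        this.x = x; this.y = y; this.w = w; this.h = h;
        this.onClick = onClick; // callback fired when this button is clicked
      }
      isHovered() {
        return mouseX > this.x && mouseX < this.x + this.w &&
               mouseY > this.y && mouseY < this.y + this.h;
      }
      display() {
        fill(this.isHovered() ? 230 : 255); // soft highlight on hover
        rect(this.x, this.y, this.w, this.h, 8);
        fill(0);
        textAlign(CENTER, CENTER);
        text(this.label, this.x + this.w / 2, this.y + this.h / 2);
      }
      handleClick() {
        if (this.isHovered()) this.onClick();
      }
    }

    // The screen flow as one state variable the whole sketch checks.
    let state = "instructions"; // "instructions" -> "guess" -> "design" -> "result"
    const restartButton = new Button("Restart", 20, 20, 120, 40, () => {
      state = "instructions";
    });

    function setup() { createCanvas(600, 400); }
    function draw() {
      background(250);
      if (state === "result") restartButton.display();
    }
    function mousePressed() {
      if (state === "result") restartButton.handleClick();
    }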

Week 5 Reading – Computer Vision

Computer vision has always been something I've been interested in: I used it in my third assignment, and I am currently using it in my midterm project. The article answered questions I have had while working with computer vision.

So far I have really only worked with hands, and it got me curious: how does the model decide what is a hand and what isn't, to the point that it can assign so many keypoints to a single hand, knowing where each fingertip is, the middle joint, the base, and so on? I know this article doesn't fully answer that, but it gave me an idea of what exactly computer vision is. To a computer with no inherent context, anything it “sees” is just a bunch of pixels with no relation to each other whatsoever. It relies on mathematical calculations to build its own context for what is happening and what is what. But that is just an abstract definition; honestly, the techniques presented seem to work only in really specific cases, and the author says there is no computer vision algorithm that is “completely” general.

I am going to have to disagree with that, on the basis that the claim is not specific enough. Hand detection algorithms seem to work in almost any environment: they can detect when a hand is on screen or not, even multiple hands. Now, if we take a hand algorithm and ask whether it will detect some other object in any environment, of course it won't. When we say “general,” we need some context for what general means! A lot of hand detection algorithms can be considered general at detecting hands no matter the environment, for example.

There is a tracking technique that I had to learn to improve my hand detection in the midterm project, called Kalman filtering. To briefly describe it: the algorithm tries to predict the location of what it is tracking in the next frame, then compares that prediction to where the tracker actually reports it, and depending on how much we tell the algorithm to trust each source, the visualized tracking point follows a blend of our prediction and the camera's measurement. I found it quite intuitive in how it works, and I noticed a considerable difference in my hand tracking after implementing it.
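
For reference, here is a textbook one-dimensional Kalman filter written as a minimal sketch; this is not my exact midterm code, and the noise values and the onHandResults hookup are placeholders:

    // Minimal 1D Kalman filter for smoothing one tracked coordinate.
    class Kalman1D {
      constructor(processNoise = 0.01, measurementNoise = 4) {
        this.q = processNoise;     // how much the true value may drift per frame
        this.r = measurementNoise; // how noisy the tracker's measurement is
        this.x = 0;                // current estimate
        this.p = 1;                // uncertainty of the estimate
        this.ready = false;
      }
      update(measurement) {
        if (!this.ready) { this.x = measurement; this.ready = true; return this.x; }
        this.p += this.q;                     // predict: uncertainty grows each frame
        const k = this.p / (this.p + this.r); // Kalman gain: how much to trust the measurement
        this.x += k * (measurement - this.x); // blend prediction and measurement
        this.p *= 1 - k;
        return this.x;
      }
    }

    // Usage: smooth a fingertip keypoint coming from a hand-tracking model.
    const fx = new Kalman1D(), fy = new Kalman1D();
    function onHandResults(tip) { // tip = { x, y } from the tracker (placeholder name)
      const sx = fx.update(tip.x);
      const sy = fy.update(tip.y);
      // draw at (sx, sy) instead of the raw, jittery (tip.x, tip.y)
    }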

Honestly, computer vision's potential in interactive art is extremely untapped. I do not see many people implementing it besides a very few, and considering how accessible it is to implement now, that is such a shame. We can have true interaction with our artwork if we have the computer make decisions based on what it sees, giving us a new piece not just every time the program is run, but every time the background changes or the person does something.

Assignment 5 – Midterm Progress

Concept
For my midterm project, I decided to create an interactive courtroom experience where the player becomes the judge and has to decide whether a defendant is guilty or not guilty based on testimony and evidence. The original element of my project is that the player must interpret conflicting evidence and testimony rather than relying on obvious clues, making each decision feel uncertain and investigative. I chose this idea because I was interested in how people interpret information differently and how evidence can sometimes be misleading depending on how it's presented. I wanted to design something that makes the user think critically rather than just react quickly. Also, I simply love law and the legal process in general.

The experience begins with a cover, then an instruction screen that explains what the user has to do. From there, they move through the trial, evidence review, verdict decision, and final results. Each case is randomly generated from a set of scenarios, so the experience feels different every time someone plays.

Design
So far, I have focused on designing both the concept and the structure of the project. I planned out the different screens first (cover, instructions, trial, evidence, verdict, result) so I could understand the flow before building anything. That helped me feel less overwhelmed because I could work on one part at a time instead of the whole game at once. I went ahead and made the backgrounds using Canva and some generative AI pieces with the text (I will implement on-screen text for the testimonies); here are some:

Right now, I have the main structure of the game working: the interaction controls, screen transitions, and the characters. I separated everything into classes and functions, and made some interactive buttons and keys to move through the different stages. I already have an idea of how I want this game to work, so now I'm just trying to put it all together. I also started designing case scenarios, and I came up with 20. Now I just need to think about designing the visual evidence icons, because I will have 5 pieces of evidence displayed for each case.
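
Since each case is randomly chosen, the selection itself can stay very small. A minimal sketch of how a case might be picked, assuming a hypothetical structure for the 20 scenarios:

    // Hypothetical case structure; the real scenarios hold testimony and 5 evidence items.
    const cases = [
      { title: "Case 1", testimony: "…", evidence: [], guilty: true },
      // …19 more scenarios
    ];

    let currentCase;

    function startTrial() {
      currentCase = random(cases); // p5's random() picks a random element from an array
    }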

Visually, I plan to keep the characters stylized and minimal, created using p5 shapes instead of detailed illustrations. I want the courtroom environment to feel cohesive but not overly complex, so the focus stays on interaction and decision-making. The characters (defendant, lawyer, and witness) are drawn from mid-chest up using OOP, so I can easily place them anywhere on the screen.

Challenging Aspects
The most frustrating and uncertain part so far has been positioning the elements on the screen, especially when switching to full-screen mode. My characters kept moving to different places, which made it hard to design the layout. Another difficult aspect is managing multiple interactive elements at once, like the hover detection, clickable areas, and screen transitions, because they all require precise logic to work smoothly together. I'm also worried about making all of the pieces of evidence using shapes, but I am thinking about building them in separate sketches, finding inspiration, and then combining them into my final sketch.

Risk Prevention
To reduce the layout issues, I switched from fixed pixel positioning to relative positioning based on the canvas width and height. This allows the objects to scale and stay in the correct space even when the screen size changes. I also used a coordinate display tool that shows my mouse position on screen while designing, to help me put everything precisely instead of guessing.
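
A minimal sketch of that relative-positioning idea (drawCharacter here is a placeholder stand-in for my character classes):

    function setup() {
      createCanvas(windowWidth, windowHeight);
    }

    function draw() {
      background(240);
      // defendant at 25% across, witness at 75% across, both 60% down
      drawCharacter(width * 0.25, height * 0.6);
      drawCharacter(width * 0.75, height * 0.6);
    }

    function windowResized() {
      resizeCanvas(windowWidth, windowHeight); // positions recompute from width/height
    }

    function drawCharacter(x, y) { // placeholder: head and torso from mid-chest up
      ellipse(x, y - 60, 60);
      rect(x - 40, y - 30, 80, 100);
    }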

Also, to manage the interaction complexity, I tested individual features separately before combining them. For example, I built and tested hover detection for the characters before integrating it into the full scene. I also focused on building the basic system early, so I could confirm that the scene transitions worked before adding the detailed images. To me, breaking the project into smaller testable parts made the process feel more manageable.
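
For example, the isolated hover test can be as small as this (a sketch, not the final scene code):

    function setup() {
      createCanvas(400, 400);
    }

    // Circular hit test around a character's anchor point.
    function isHovering(cx, cy, radius) {
      return dist(mouseX, mouseY, cx, cy) < radius;
    }

    function draw() {
      background(240);
      const cx = width / 2, cy = height / 2;
      fill(isHovering(cx, cy, 50) ? "gold" : "gray"); // visual confirmation while testing
      ellipse(cx, cy, 100);
    }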

Moving forward, I want to focus on refining the visuals, adding the testimony and evidence pop-up that appears when you click on a character, and adding the sounds.

Week 5 – Reading Reflection

When I think about computer vision, I usually think of it as the way computers “see,” mostly in apps, websites, or phone features. After learning more about it, I realized that computers do not see anything the way humans do. Humans take in a whole scene at once, but computers break everything into tiny pieces like pixels, brightness, and movement. They notice small details that we might miss, but they also miss the bigger picture that humans understand naturally.

My own experience with things like Face ID and Snapchat filters shaped how I reacted to the topic. Unlocking my phone with my face feels normal and easy now, and Touch ID on a Mac makes things even faster. At the same time, I do not trust every technology that tracks people. I feel fine when big companies use it, because I honestly have the mentality of: why would they want to use my data out of the billions of people using their platforms? However, if it is a random app or something that could be hacked, then I wouldn't want it to easily track me. That made me understand why computer vision based artworks can feel both creative and unsettling at the same time.

I think surveillance in public spaces is important for safety, but using it in art is kind of weird. It can be meaningful, but it can also feel invasive depending on how people are being watched. The idea of a machine constantly observing people makes me a little uncomfortable and weirded out, honestly, but also curious about how far this technology will go. I do not think computers will ever fully understand human behavior the way humans do. Emotions, intentions, and intuition are probably never going to be experienced by computers.

If I were to design an artwork with computer vision, I would focus on tracking gestures or movement instead of faces. That feels less personal and more playful and fun to experience. I also think artists should have limits when using real people as data, especially when people do not know they are being recorded. Overall, learning about computer vision made me think more about how much we rely on it and how it affects both everyday life and creative work.

Week 5 – Reading Reflection

The main difference between human vision and computer vision is the limitations of computer vision. The text mentions that “no computer vision algorithm is completely ‘general’.” This means that none of them can perform reliably given any possible video input. Each algorithm comes with specific assumptions about what the scene will look like, and if those assumptions aren't met, the results can be poor, ambiguous, or completely broken. This is obviously very different from human vision, which is significantly more adaptable: we are able to recognize almost anything in any environment.

However, one advantage of computer vision is its strength as a surveillance tool. Unlike human eyes, which can only see in normal light, computer vision systems can be paired with infrared or thermal cameras that work in complete darkness or detect body heat. This gives them a significant advantage as surveillance tools as they aren’t held back by the same biological limitations we are.

The techniques for helping the computer see better are mostly about manipulating the real world to suit the algorithm’s assumptions. Examples the text gives include using backlighting or retroreflective materials to create contrast, using infrared illumination in low-light conditions, choosing the right camera and lens for the situation, or even dressing subjects in specific colors. The idea is that good physical design and good code need to be developed together, not separately.

Computer vision’s limitations for surveillance mean that to incorporate them in interactive art, you need careful planning and knowledge on where this art will be, to appropriately plan for physical or environmental limitations. For instance, if you are creating a computer vision interactive art project for some exhibition, you will need to analyze the venue and its environmental conditions to ensure you use the right technique to properly analyze the subject(s) being surveilled.

Interestingly, the irony is that the very limitations of computer vision mean that in an art context, the surveilled person often has to cooperate with the conditions for the system to work at all. That's quite different from CCTV, where you're tracked without consent or awareness. So, interactive art using computer vision tends to occupy this strange middle ground where surveillance becomes participation, which raises its own questions about what it means to be watched by a system you're also performing for. This becomes a crucial moral question when it comes to projects such as the suicide box mentioned in the text and David Rokeby's Sorting Daemon, where people are participants in the art installations without their consent, especially during vulnerable moments, as with the suicide box.

Week 5: Reading Response

During the pandemic, I was really struck by the process people had to go through before entering a space. Many places installed thermal face recognition systems at their entrances, and I remember lining up outside a mall, feeling confused about how it actually worked. While reading the article, that memory came back to me, and the article answered the confusion I had back then. This experience made me realize the differences between computer vision and human vision. Instead of relying on perception, judgment, and context like a human would, computer vision processes visual information through algorithms that detect specific patterns, such as facial features and temperature readings. The system does not interpret situations the way humans do; it reads measurable data and produces a result based on programmed criteria. People had to stop, face the camera, and stand at the right distance so the system could read them accurately. In this case, the environment and people's behavior were adjusted to be more legible to the algorithm, showing that while human vision is flexible and adaptable to different situations, computer vision relies on structured data and optimized conditions to function efficiently and consistently.

Computer vision’s ability to track and monitor people also changes how interactive art functions. Because the technology can detect movement, faces, or body position, it allows artworks to respond directly to the audience’s presence. However, as people differ from each other, the system produces various responses, creating multiple forms of interaction and monitoring. We can help the computer see or track what we are interested in by providing more labeled data so it can learn the patterns we want it to detect, improving visibility with good lighting and clear angles, using visual markers or cues, and controlling the environment to reduce background clutter and maintain proper distance.

Week 5-Reading Response

Reading this made me think a lot about how different computer vision is from how I see the world. The author mentions that a computer is “unable to answer even the most elementary questions about whether a video stream contains a person or object,” and that honestly surprised me. It made me realize how much meaning humans naturally add when we look at something. We don't just see pixels; we understand context, emotions, and objects instantly. Computers don't do that: they see raw data that needs to be processed. That really changed how I think about interactive art, because I realized it's not just about being creative; it's also about setting up the right conditions so the computer can actually see what we want it to. I also noticed the author is very positive about computer vision. I don't think he's wrong, but I do feel he focuses more on the benefits than the risks, which makes me think he might be a little biased toward celebrating the technology.

The part about tracking and surveillance raised the biggest questions for me. In class, we saw that piece where the visuals changed based on how loud someone spoke, and that example helped me understand how the viewer becomes part of the artwork. The system watches you, follows your movement, and reacts right away. It’s cool, but it also feels like being watched. Even if the goal is interaction, it still brings up the idea of surveillance. It made me wonder where the line is between participation and being monitored. And how does this relate to the way technology watches us in everyday life? The reading didn’t fully answer those questions for me, but it definitely made me more aware of them.

Reading Reflection – Week 5

Reading this honestly made me laugh a little at the Marvin Minsky anecdote; the idea that “the problem of computer vision” could be assigned as a summer project feels almost delusional now, and I think the article uses that story perfectly to show how much we underestimate what vision actually involves. What really stayed with me is the description of digital video as computationally “opaque,” because that word completely shifts how I think about it now. Text carries structure and meaning, whereas video is just, as stated in the text, rectangular pixel buffers with no built-in meaning. Humans attach meaning almost instantly, whereas computers need instructions just to separate foreground from background.

I also found it interesting that many of the techniques mentioned in the reading, like frame differencing and brightness thresholding, sound simple but are actually incredibly dependent on the physical conditions of the place. The article kept emphasizing that no algorithm is completely “general,” and that honesty stood out to me, because it means computer vision only really works smoothly when the environment is carefully prepared for it, which is actually crazy if you think about it, because it feels like everything you once knew about how computers see was a lie. The workshop example with the white Foamcore made that very clear, since the students basically redesigned their physical space to make brightness thresholding easier. That detail made me realize that computer vision is not just about writing smarter, more complex code, but also about staging reality so the system can read it, which feels less like artificial intelligence and more like controlled intelligence.
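
To see how simple these techniques really are, here is a minimal p5.js frame-differencing sketch of the kind the article describes; the threshold of 30 is an arbitrary value that would have to be tuned to the room's lighting:

    // Frame differencing: bright output pixels mark motion between frames.
    let video, prevFrame;

    function setup() {
      createCanvas(320, 240);
      pixelDensity(1); // keep canvas pixels aligned 1:1 with the video
      video = createCapture(VIDEO);
      video.size(320, 240);
      video.hide();
      prevFrame = createImage(320, 240);
    }

    function draw() {
      video.loadPixels();
      prevFrame.loadPixels();
      loadPixels();
      for (let i = 0; i < video.pixels.length; i += 4) {
        // brightness difference between this frame and the previous one (red channel)
        const diff = abs(video.pixels[i] - prevFrame.pixels[i]);
        const v = diff > 30 ? 255 : 0; // threshold: tune to the conditions
        pixels[i] = pixels[i + 1] = pixels[i + 2] = v;
        pixels[i + 3] = 255;
      }
      updatePixels();
      prevFrame.copy(video, 0, 0, 320, 240, 0, 0, 320, 240); // remember this frame
    }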

The surveillance-themed works from the reading added another layer that I couldn't ignore. When Rokeby describes his system as “looking for moving things that might be people,” the phrasing feels purposefully detached, and that detachment made me feel a little unsettled. The same foundational techniques that allowed Videoplace to create playful full-body interactions are also what made Suicide Box possible, quietly recording real tragedies, which is so scary to think about. I think that tension is what makes computer vision in interactive art powerful and complicated at the same time, because it forces us to confront how easily bodies can be tracked and reorganized into data. For me personally, the most compelling idea I got from this reading is that computer vision does not really just detect what is there, but reflects what we choose to prioritize and make visible to the computer. Overall, this was an extremely fascinating reading and truly opened my eyes to the “true” meaning and reality behind computer vision.

Week 4 Assignment-Data Visualization

The Concept:

I decided for this assignment to recreate the five-star rating system used for film ratings. So my plan was for there to be five stars, and once you click on any of the five, it will show you the films with an average rating of that particular star.

The Process:

I first loaded all my images, fonts, and csv file.

I thought this would be easier than it actually was; it turned out much, much harder than I expected. I first started by building my stars in a separate class to make the code neater and easier to reuse. I followed this YouTube tutorial to make them, since there is no star shape in p5.js, and trying to create the stars using lines that I would have to manually connect was too complex.

After creating the stars, I placed them on the canvas using a for loop to draw 5 stars. The first challenge I faced was getting the stars to light up, or get filled in, once my mouse was inside them. I tried using an if statement within a for loop, stating that if we were at index = 0, fill, but that did not work. I knew I had to do something with the distance between the mouse and each star's outer radius. So I wrote an if statement using the distance function, with mouseX, mouseY, and the outer radius as my parameters. Unfortunately, that did not work either, so I used ChatGPT to fix it, and it instead included the x and y positions of the stars. After that, I created an if statement where, if the distance is less than the outer radius, the star fills and “lights up.” That's when I faced another issue: at some point, two stars would light up at the same time. I called my sister, who is a computer science major, to help me solve the problem. We tried manipulating the x and y positions of the stars, but the problem persisted. My sister then suggested that the issue was with the if statement comparing the mouse distance to the outer radius, since there will always be an overlap between the stars' outer radii. She suggested I add/subtract offsets to the mouseX and mouseY positions within the if statement. Once I did that, the stars lit up successfully.
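
For reference, another common fix for overlapping hit areas (not the offset approach we used) is to light up only the star whose center is closest to the mouse. A small sketch, assuming each ratingStar object stores its x, y, and outerRadius:

    // Inside draw(): pick at most one hovered star, the nearest one in range.
    let hovered = -1;
    let best = Infinity;
    for (let i = 0; i < ratingStar.length; i++) {
      const d = dist(mouseX, mouseY, ratingStar[i].x, ratingStar[i].y);
      if (d < ratingStar[i].outerRadius && d < best) {
        best = d;
        hovered = i; // only the closest star within its radius lights up
      }
    }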

The real challenge was extracting the ratings from the data onto the canvas. I tried many different things, such as a triple nested loop like this:

for (let i = 0; i < numRows; i++) {
  for (let j = 0; j <= Film_title; j++) {
    for (let k = 0; k < ratingStar.length; k++) {
      if (mouseIsPressed && d < ratingStar[i].outerRadius && int(Average_rating[i]) < 1.5 && ratingStar[0]) {
        background(255);
        text(Film_title[j], 200, 200);
      }
    }
  }
}

but it did not work. I knew I wouldn't be able to work out the code on my own, so I decided to go to the peer tutors to figure something out. Although we couldn't completely figure out what to do, the tutor suggested I use ranges.

My friend then explained to me that I could create a minimum and a maximum and assign them to an array of ratings we created. While his method worked, it was difficult to understand for someone with minimal coding experience.

I asked the professor for help at this stage, and she gave me starting code that helped me understand what I needed to implement. After I applied it, the code ran successfully.
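
The range idea itself is small once you see it. Here is a sketch of how the films might be bucketed; the thresholds and the bucketFilms helper are my own reconstruction, not the professor's exact code (the array names match the ones used below):

    let TwoRatingFilms = [], ThreeRatingFilms = [], FourRatingFilms = [], FiveRatingFilms = [];

    // Assign each film title to a bucket based on its average rating.
    function bucketFilms(titles, ratings) {
      for (let i = 0; i < titles.length; i++) {
        const r = float(ratings[i]); // p5's float() parses the CSV string
        if (r < 2.5) TwoRatingFilms.push(titles[i]);
        else if (r < 3.5) ThreeRatingFilms.push(titles[i]);
        else if (r < 4.5) FourRatingFilms.push(titles[i]);
        else FiveRatingFilms.push(titles[i]);
      }
    }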

Code I’m Proud of:

Even though I wrote out the code with the help of the professor, I still felt particularly proud of this chunk, because it made me realize the logic behind what I wanted to do, and it felt like a moment of realization and understanding. This was also the hardest part of what I wanted to achieve, so finally being able to do it was a relief.

    } else if (status == 1) {
      // second star clicked: list the two-star films in a column
      for (let i = 0; i < TwoRatingFilms.length; i++) {
        text(TwoRatingFilms[i], 260, 240 + i * 30); // each title 30px below the last
      }
    } else if (status == 2) {
      for (let i = 0; i < ThreeRatingFilms.length; i++) {
        text(ThreeRatingFilms[i], 420, 240 + i * 30);
      }
    } else if (status == 3) {
      for (let i = 0; i < FourRatingFilms.length; i++) {
        text(FourRatingFilms[i], 590, 240 + i * 30);
      }
    } else {
      // remaining case: list the five-star films
      for (let i = 0; i < FiveRatingFilms.length; i++) {
        text(FiveRatingFilms[i], 540, 420 + i * 30);
      }
    }
  }

Future Reflection:

Honestly, for the future I would aim for something only a little outside my comfort zone. I would also not underestimate the work involved, like I did in this project.

References:

https://youtu.be/rSp5iSTXwAY?si=RaaxtuAu8XivtpAF

Week 5 Midterm Project

Midterm Project Progress 1: Polyglot Galaxy (Week 5)

For my midterm project, I decided to develop an interactive generative artwork called Polyglot Galaxy. The concept is to create a multilingual visual space where users can click to “stamp” greetings from different languages onto a galaxy background. Each click generates a unique phrase using randomness, along with visual glow effects and sound feedback. My goal is to combine text, image, sound, and object-oriented programming into one interactive experience that reflects my interest in languages, combining aesthetics like sound and animation.

In terms of user interaction design, the program starts with a start screen and transitions into the play state after the first click. Once the user enters the play mode, clicking on the canvas generates a new greeting text at the mouse position. The phrases are randomly selected from a JSON file and styled with different sizes, colors, and blinking alpha effects using sine functions.

The code that I am proud of would be:

// sound on click: the canvas is split down the middle
if (mouseX <= 300) {
  clickSound.play();  // left half plays 0.mp3, more of a peep sound
} else {
  clickSound1.play(); // right half plays 1.mp3, a deeper sound
}

I implemented two different sounds depending on the click position: the frame is split down the middle, so clicking toward the left side plays the 0.mp3 file (more of a peep sound), and clicking toward the right side plays 1.mp3 (a deeper sound). I also implemented a restart function using the “R” key to reset the session without reloading the page. I think this structure gives a good step-by-step approach.
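
The restart itself fits in a few lines of keyPressed(); a minimal sketch, where greetings and state are my working names for the stamped-text array and the interface state:

    function keyPressed() {
      if (key === "r" || key === "R") {
        greetings = [];  // clear all stamped GreetingText objects
        state = "start"; // back to the start screen without reloading the page
      }
    }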

From a coding perspective, I have begun designing the project using functions, classes, and interactivity as required. I created a GreetingText class to manage each stamped phrase as an object, including its position, color, size, glow shape, and blinking animation. The generatePhrase() function handles generative text creation using randomness from language data, punctuation, and decorative elements. Moreover, I added a state system (“start” and “play”) to control the interface flow.
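
Here is a minimal sketch of the GreetingText idea, with the blinking alpha driven by sin(); the field names and the cap of 50 are placeholders rather than my final values:

    class GreetingText {
      constructor(phrase, x, y) {
        this.phrase = phrase;
        this.x = x; this.y = y;
        this.size = random(18, 42);
        this.col = color(random(255), random(255), random(255));
        this.phase = random(TWO_PI); // so the texts don't all blink in sync
      }
      display() {
        const alpha = map(sin(frameCount * 0.05 + this.phase), -1, 1, 80, 255);
        this.col.setAlpha(alpha); // blinking effect via a sine wave
        fill(this.col);
        textSize(this.size);
        text(this.phrase, this.x, this.y);
      }
    }

    let greetings = [];
    function mousePressed() {
      greetings.push(new GreetingText(generatePhrase(), mouseX, mouseY));
      if (greetings.length > 50) greetings.shift(); // cap keeps the sketch smooth
    }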

The challenging part was integrating multiple media elements in the same system: sound playback, generative text from JSON, and object-oriented animation. To reduce this risk, I tested each component separately: sound playback on mouse click, image backgrounds for the different states, and a prototype class for animated text objects. I also added a limit to the number of stamped texts to ensure the sketch runs smoothly.

I think I can improve the start menu by adding some music to the galaxy scene to attract people, for example something like the opening music of Guardians of the Galaxy or the Universal intro theme.

https://p5js.org/reference/p5/textAlign/