Week 5 – Midterm Progress

Concept

Every time I walk into Madame Tussauds, I feel this strange mix of excitement and superficiality from the figures I encounter. You’re standing next to someone you’ve only ever seen on a screen, except they’re not really there, and yet it still feels like you “met” them. It’s staged and curated, but somehow still memorable. That exact feeling is what I want to recreate for my midterm, but in a digital format.

I don’t want to build another game where you’re trying to score points or beat something. I want to build an experience you move through. My idea is to create a wax-museum inspired digital space where you browse through celebrities, pick one and take a photo with them in a photo booth setup.

The whole concept revolves around that illusion of artificial closeness. You’re not actually meeting anyone, but you still walk away with a cute memory. I want users to feel like they stepped into a staged exhibit for a few minutes and left with a souvenir!

Design

Visually, I don’t want this to look like a bright, cartoonish app. I want it to feel dramatic, with a dark background and clean, polished-looking framed celebrity cards.

I actually like that wax museums feel a little staged and exaggerated. I want the digital version to embrace that instead of hiding it.

The experience will start with a dramatic opening screen with soft background music. Nothing moves until the user presses start. That intentional pause kind of mimics standing outside an exhibit before stepping in.

After that, there will be a short instruction screen, and then the gallery. The gallery will show multiple celebrity cards. When you click a card, you’ll move into the photo booth scene.

In the photo booth, your webcam will appear next to the celebrity you chose. There will be a snap button with a camera shutter sound, and then a final screen showing your “souvenir.” From there, you’ll be able to restart without refreshing everything, because I don’t want the experience to feel like it just cuts off.
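Since the flow is basically a handful of scenes plus a restart, a simple state variable can hold it together. This is just a minimal sketch of how the structure could look, not final code: the scene names and the state shape are placeholders I made up for illustration.

```javascript
// Minimal scene-flow sketch: start → instructions → gallery → booth → souvenir.
// restart() resets state instead of reloading the page, so music and images
// stay loaded and the experience doesn't just cut off.
const SCENES = ["start", "instructions", "gallery", "booth", "souvenir"];

let state = { scene: "start", chosenCelebrity: null };

function goTo(scene) {
  if (!SCENES.includes(scene)) throw new Error("unknown scene: " + scene);
  state.scene = scene;
}

function restart() {
  // Back to the gallery with nothing selected; loaded assets are untouched.
  state = { scene: "gallery", chosenCelebrity: null };
}
```

In p5.js, `draw()` would then switch on `state.scene` to decide what to render, and the snap button would eventually call `goTo("souvenir")`.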

Sound is also very important for my interactive experience. The gallery will have background music, and I might let it shift slightly depending on the celebrity chosen. Small things like that will make it feel less flat and more alive.

Frightening Part

The webcam honestly scares me the most. The whole idea depends on that photo moment. If the camera doesn’t work, the entire concept kind of collapses. Browsers can be weird about permission sometimes, and I don’t want to build this whole dramatic museum and then realize the main interaction fails.

Reducing Risk

Instead of leaving the webcam for later, I’m going to test it early. I want to make sure the camera actually works and shows up inside the canvas, and that I can capture an image from it, before I build everything else around it.

Testing the technical parts like the webcam early will make the rest feel less stressful, because once I know the main interaction works, I can focus on the atmosphere of my experience, which is honestly the part I care about the most.

Week 5: reading reflection

The reading mentioned that computer vision works best when the artist helps the computer by controlling things like lighting, background, and contrast. Before reading this, I mostly thought of computer vision as a coding problem, but the reading made it feel more like a design problem too, which I liked because it connects the technical side to the creative side of interactive art. It also made me think about my own project, because even simple interactions can fail if the setup is not clear enough for the user or the system.

The reading also made me think more critically about how easily tracking can become part of an experience without users really thinking about it. In interactive art, tracking can make the work feel more immersive, but it can also feel invasive depending on how it is used. I do not necessarily think computer vision is automatically bad in art, but I do think it raises questions about consent and comfort. It made me wonder how artists can make interactive work engaging while still being clear and respectful about what is being tracked.

Assignment 5 – Midterm Progress

Concept Demo Below!

Concept:

The name of the game right now is Cyberpunk Breach (tentative), and as you can see in the demo above, I am going for a cyberpunk-themed game!

The gameplay is currently a work in progress; I have started on it, but there is no character sprite implementation as of yet. The concept is as such:

I took inspiration from a game called Magic Touch, which is on the App Store. The gist of the game is: you are a wizard, and you need to stop the robots from attacking you. The way you do that is by popping the balloons the robots are using. These balloons have specific glyphs that you need to draw, and if you draw one correctly, the balloon containing that glyph will pop.

Now I have added my own twist to this. I am making it cyberpunk themed, with drones instead, and the biggest change in functionality is that this entire game does not use your keyboard or mouse. It is entirely based on hand tracking, where you use your hand to navigate the menus and play the game.

Now, there are multiple issues with this that I have tackled or am in the process of tackling.

The problem with hand tracking in browsers is that it is really, really, REALLY latent and jittery. Latency would be a hard problem to fix since this is a browser issue, but the jitter I can fix. This is where Kalman filtering comes into play.

To explain the core concept:

The filtering has 3 steps:

– Predict

– Update

– Estimate

The Kalman filter works in a simple loop. First, it predicts what the system should look like next based on what it already knows. Then, it checks that prediction against a new (noisy) measurement and corrects itself.

Because of this, the Kalman filter has two main steps. The prediction step moves the current estimate forward in time and guesses how uncertain that estimate is. The correction step takes in a new measurement and uses it to adjust the prediction, giving a more accurate final estimate.

And finally, using a threshold, we can choose between the estimated path and the camera path.

Using this we can have pretty smooth hand tracking.
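To make the loop above concrete, here is a stripped-down one-dimensional sketch of the idea (one filter per coordinate of a hand landmark). The noise values and the threshold are made-up placeholders rather than tuned numbers, and a fuller implementation would usually track velocity as part of the state too.

```javascript
// 1D Kalman filter sketch: predict, then correct against a noisy measurement.
class Kalman1D {
  constructor(processNoise = 0.01, measurementNoise = 4) {
    this.q = processNoise;     // uncertainty added each frame by the predict step
    this.r = measurementNoise; // how noisy we assume the camera measurement is
    this.x = null;             // current estimate
    this.p = 1;                // current estimate uncertainty
  }
  update(measurement) {
    if (this.x === null) { this.x = measurement; return this.x; } // first frame
    this.p += this.q;                        // predict: uncertainty grows
    const k = this.p / (this.p + this.r);    // Kalman gain: how much to trust the camera
    this.x += k * (measurement - this.x);    // correct toward the measurement
    this.p *= 1 - k;                         // uncertainty shrinks after correcting
    return this.x;
  }
}

// Threshold gate: follow the smooth estimate for small jitter, but snap to the
// raw camera value on big jumps so fast hand movement doesn't lag behind.
function smoothed(filter, measurement, threshold = 40) {
  const estimate = filter.update(measurement);
  return Math.abs(measurement - estimate) > threshold ? measurement : estimate;
}
```

In the sketch, each tracked landmark would get its own pair of filters (x and y), fed once per frame with the raw tracking output.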

Next up is the issue of recognizing gestures, and even adding my own custom gestures, which I am handling using a library called the $1 Unistroke Recognizer.

Alternate sketch just to test out the library:

The library has inbuilt strokes, so for example, if we try to draw a triangle, the algorithm guesses what you are drawing along with how confident it is:

You can also add your own custom gestures:

The tracking and the gesture recognition are what I was worried about before I got started on this project.

For the final stages of the game:
I will need to work on the game-play itself and the process of implementing this into the game.

Week 5: Midterm progress

For my midterm project, I want to make a nail salon game in p5.js. The idea is that customers come in, and the player has to design nails that match how they feel. I want it to be more than just picking random colors, so their mood is what makes the game more creative. I also want to include a mood-guessing game at the beginning. The customer will say a short line, and the player will guess their mood before starting the nail design. Then the player designs the nails based on that mood, and the game gives feedback at the end.

I want the design of the project to look cute, simple, and easy to understand, like a cozy nail studio. I plan to use soft colors and clear buttons so the user can move through the experience without getting confused. The project will start with an instructions screen, then move to the main nail salon screen where the customer says their line and the player guesses their mood, then the player designs the nails, and finally sees a result screen with feedback and a restart option.

For the visuals, I will use shapes for the nails, buttons, and decorations, and images for things like the background or customer. I also plan to include sound and text for the customer’s reaction/line so the project feels more interactive. This is a sketch of what I’m planning for my game.

I think the most challenging part will be organizing the project and the code, and making sure everything appears at the right time. Since the project has multiple stages, I need to keep the flow clear and make sure the user knows what to do next. Another challenge is the nail design interaction itself. I still need to decide the simplest and best way for the player to apply colors and decorations. I want it to be easy to use, but still feel fun and creative. I also still have not figured out how to decide whether the final design matches the customer’s mood at the end or not.

To reduce that risk, I will first make a simple version of the project with only the screen flow and text/colors. This will help me test if the structure works before I spend time on the final visuals. I will also make a reusable button class early so I can use it for mood choices, color choices, and the restart button. After that, I will test the nail design interaction with basic shapes first, like clicking a button to change the nail color, and I may also try a brush-like stroke animation.
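Since the reusable button class is the first risk-reduction step, here is a rough sketch of what I mean. The field names and callback style are placeholder choices, and in p5.js the class would also get a draw() method for rendering.

```javascript
// Reusable button sketch: one class for mood choices, color choices, restart.
class Button {
  constructor(x, y, w, h, label, onClick) {
    this.x = x; this.y = y; this.w = w; this.h = h;
    this.label = label;
    this.onClick = onClick; // callback to run when the button is clicked
  }
  contains(mx, my) { // point-in-rectangle hit test on mouse coordinates
    return mx >= this.x && mx <= this.x + this.w &&
           my >= this.y && my <= this.y + this.h;
  }
  handleClick(mx, my) { // returns true if this button consumed the click
    if (!this.contains(mx, my)) return false;
    this.onClick();
    return true;
  }
}
```

In p5.js, `mousePressed()` would loop over the buttons for the current screen and call `handleClick(mouseX, mouseY)` on each.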

Week 5 Reading – Computer Vision

Computer vision has always been something I’ve been interested in; I used it in my third assignment, and I am currently using it in my midterm project. The article has given me answers to questions I have had while working with computer vision.

So far I have really only worked with hands, and it got me really curious: how does the AI model what is a hand and what isn’t, to the point that it can assign so many key points to a single hand, knowing where each fingertip is, the middle, the base, and so on? I know this article doesn’t fully answer that, but it gave me an idea of what exactly computer vision is. To a computer with no inherent context, anything it “sees” is just a bunch of pixels with absolutely no relation whatsoever. It relies on mathematical calculations to make its own context for what is happening and what is what. But that is just an abstract definition; honestly, the techniques provided seem to only work in really specific cases, and the author says there is no computer vision algorithm which is “completely” general.

I am going to have to disagree with that, on the basis that it is not specific enough. Hand detection algorithms seem to work in almost any environment. They are able to detect when a hand is on screen or not, even multiple hands. Now, if we take a hand algorithm and ask whether it will detect some other object in any environment? Of course it won’t. When we say “general,” we need some sort of context for what general means! A lot of hand detection algorithms can be considered general in detecting hands no matter the environment, for example.

There is a detection technique that I had to learn to improve my hand detection in the midterm project, called Kalman filtering. To briefly describe it: the algorithm tries to predict the location of what it is tracking in the next frame and compares it to what the location actually is, and depending on a threshold we give it, the visualization of the tracking will either follow our predicted calculation or the camera’s calculation. This is an algorithm I found to be quite intuitive in how it works, and I have noticed a considerable difference in my hand tracking after implementing it.

Honestly, computer vision’s potential in interactive art is extremely untapped. I do not see many people implementing it besides a very few, and considering how accessible it is now, that is such a shame. We can have true interaction with our artwork if we have the computer make decisions based on what it sees, giving us a new piece not just every time the program is run, but every time the background or the person does something.

Assignment 5 – Midterm Progress

Concept
For my midterm project, I decided to create an interactive courtroom experience where the player becomes the judge and has to decide whether a defendant is guilty or not guilty based on testimony and evidence. The original element of my project is that the player must interpret conflicting evidence and testimony rather than relying on obvious clues, making each decision feel uncertain and investigative. I chose this idea because I was interested in how people interpret information differently and how evidence can sometimes be misleading depending on how it’s presented. I wanted to design something that makes the user think critically rather than just react quickly. Also, because I love law and the process in general.

The experience begins with a cover, then an instruction screen that explains what the user has to do. From there, they move through the trial, evidence review, verdict decision, and final results. Each case is randomly generated from a set of scenarios, so the experience feels different every time someone plays.
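Picking the random case can be as simple as indexing into the scenario list with Math.random(). The case fields below are placeholders I made up for illustration, not the actual scenarios.

```javascript
// Each playthrough draws one case at random from the pool (placeholder data).
const cases = [
  { id: 1, summary: "Placeholder case A", guilty: true },
  { id: 2, summary: "Placeholder case B", guilty: false },
  { id: 3, summary: "Placeholder case C", guilty: true },
];

function drawCase(pool) {
  return pool[Math.floor(Math.random() * pool.length)];
}
```

In a p5.js sketch, `random(cases)` does the same thing in one call.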

Design
So far, I have focused on designing both the concept and the structure of the project. I planned out the different screens first (cover, instructions, trial, evidence, verdict, result) so I could understand the flow before building anything. That helped me feel less overwhelmed because I could work on one part at a time instead of the whole game at once. I went ahead and made the backgrounds using Canva and some generative AI pieces with the text (I will implement on-screen text for the testimonies), here are some:

Right now, I have the main structure of the game working, like the interaction controls, screen transitions, and the characters. I separated everything into classes and functions, and made some interactive buttons and keys to move through the different stages. I already have an idea in mind on how I want this game to work, so now I’m just trying to put it all together. I also started designing case scenarios, and I came up with 20. Now I just need to think about designing the visual evidence icons because I will have 5 pieces of evidence displayed for each case.

Visually, I plan to keep the characters stylized and minimal, created using p5 shapes instead of detailed illustrations. I want the courtroom environment to feel cohesive but not overly complex, so the focus stays on interaction and decision-making. The characters (defendant, lawyer, and witness) are drawn from mid-chest up using OOP, so I can easily place them anywhere on the screen.

Challenging Aspects
The most frustrating and uncertain part so far has been positioning the elements on the screen, especially when switching to full-screen mode. My characters kept moving to different places, which made it hard to design the layout. Another difficult aspect is managing multiple interactive elements at once, like the hover detection, clickable areas, and screen transitions, because they all require precise logic to work smoothly together. I’m also worried about making all of the pieces of evidence using shapes, but I am thinking about doing them on different sketches, finding inspiration, and then combining them into my final sketch.

Risk Prevention
To reduce the layout issues, I switched from fixed pixel positioning to relative positioning based on the canvas width and height. This allows the objects to scale and stay in the correct space even when the screen size changes. I also used a coordinate display tool that shows my mouse position on screen while designing, to help me put everything precisely instead of guessing.
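Concretely, the switch looks something like this: store each character’s spot as a fraction of the canvas, and convert to pixels every frame. The fraction values below are placeholders, not the real layout.

```javascript
// Layout stored as fractions of the canvas (placeholder positions).
const layout = {
  defendant: { fx: 0.25, fy: 0.6 },
  lawyer:    { fx: 0.5,  fy: 0.6 },
  witness:   { fx: 0.75, fy: 0.6 },
};

// Convert a fractional spot to pixel coordinates for the current canvas size,
// so entering full-screen just changes width/height, not the layout.
function toPixels(spot, canvasWidth, canvasHeight) {
  return { x: spot.fx * canvasWidth, y: spot.fy * canvasHeight };
}
```

In p5.js, `draw()` would call `toPixels(layout.defendant, width, height)` each frame, so a resize or fullscreen toggle is handled automatically.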

Also, to manage the interaction complexity, I tested individual features separately before combining them. For example, I built and tested hover detection for the characters before integrating it into the full scene. I also focused on building the basic system early, so I could confirm that the scene transitions worked before adding the detailed images. To me, breaking the project into smaller testable parts made the process feel more manageable.

Moving forward, I want to focus on refining the visuals, adding the pop-up for the testimony and evidence, which you get when you click on the character, and the sounds.

Week 5 – Reading Reflection

When I think about computer vision, I usually think of it as the way computers “see,” mostly in apps, websites, or phone features. After learning more about it, I realized that computers do not see anything the way humans do. Humans take in a whole scene at once, but computers break everything into tiny pieces like pixels, brightness, and movement. They notice small details that we might miss, but they also miss the bigger picture that humans understand naturally.

My own experience with things like Face ID and Snapchat filters shaped how I reacted to the topic. Unlocking my phone with my face feels normal and easy now, and Touch ID on a Mac makes things even faster. At the same time, I do not trust every technology that tracks people. I feel fine when big companies use it, because I honestly have the mentality of: why would they want my data out of the billions of people using their platforms? However, if it is a random app or something that could be hacked, then I wouldn’t want it to easily track me. That made me understand why computer vision based artworks can feel both creative and unsettling at the same time.

I think surveillance in public spaces is important for safety, but using it in art is kind of weird. It can be meaningful, but it can also feel invasive depending on how people are being watched. The idea of a machine constantly observing people makes me a little uncomfortable and weirded out, honestly, but also curious about how far this technology will go. I do not think computers will ever fully understand human behavior the way humans do. Emotions, intentions, and intuition are probably never going to be experienced by computers.

If I were to design an artwork with computer vision, I would focus on tracking gestures or movement instead of faces. That feels less personal and more playful and fun to experience. I also think artists should have limits when using real people as data, especially when people do not know they are being recorded. Overall, learning about computer vision made me think more about how much we rely on it and how it affects both everyday life and creative work.

Week 5 – Reading Reflection

The main difference between human vision and computer vision is the limitations of computer vision. The text mentions that “no computer vision algorithm is completely ‘general’.” This means that none of them can perform reliably given any possible video input. Each algorithm comes with specific assumptions about what the scene will look like, and if those assumptions aren’t met, the results can be poor, ambiguous, or completely broken. This is obviously very different from human vision which is significantly more adaptable. We are able to recognize almost anything in any environment.

However, one advantage of computer vision is its strength as a surveillance tool. Unlike human eyes, which can only see in normal light, computer vision systems can be paired with infrared or thermal cameras that work in complete darkness or detect body heat. This gives them a significant advantage as surveillance tools as they aren’t held back by the same biological limitations we are.

The techniques for helping the computer see better are mostly about manipulating the real world to suit the algorithm’s assumptions. Examples the text gives include using backlighting or retroreflective materials to create contrast, using infrared illumination in low-light conditions, choosing the right camera and lens for the situation, or even dressing subjects in specific colors. The idea is that good physical design and good code need to be developed together, not separately.

Computer vision’s limitations for surveillance mean that to incorporate them in interactive art, you need careful planning and knowledge on where this art will be, to appropriately plan for physical or environmental limitations. For instance, if you are creating a computer vision interactive art project for some exhibition, you will need to analyze the venue and its environmental conditions to ensure you use the right technique to properly analyze the subject(s) being surveilled.

Interestingly, the irony is that the very limitations of computer vision mean that in an art context, the surveilled person often has to cooperate with the conditions for the system to work at all. That’s quite different from CCTV, where you’re tracked without consent or awareness. So, interactive art using computer vision tends to occupy this strange middle ground where surveillance becomes participation, which raises its own questions about what it means to be watched by a system you’re also performing for. This becomes a crucial moral question when it comes to projects such as the Suicide Box mentioned in the text and David Rokeby’s Sorting Daemon, where people are participants in the art installations without their consent, especially during vulnerable moments, as in the case of the Suicide Box.

Week 5: Reading Response

During the pandemic, I was really amazed by the process people had to go through before entering a space. Many places installed thermal face recognition systems at their entrances, and I remember lining up outside a mall, feeling confused about how it actually worked. While reading the article, that memory came back to me and helped answer the confusion I had back then. This experience made me realize the differences between computer vision and human vision. Instead of relying on perception, judgment, and context like a human would, computer vision processes visual information through algorithms that detect specific patterns, such as facial features and temperature readings. The system does not interpret situations the way humans do; it reads measurable data and produces a result based on programmed criteria. People had to stop, face the camera, and stand at the right distance so the system could read them accurately. In this case, the environment and people’s behavior were adjusted to be more legible to the algorithm, showing that while human vision is flexible and adaptable to different situations, computer vision relies on structured data and optimized conditions to function efficiently and consistently.

Computer vision’s ability to track and monitor people also changes how interactive art functions. Because the technology can detect movement, faces, or body position, it allows artworks to respond directly to the audience’s presence. However, as people differ from each other, the system produces various responses, creating multiple forms of interaction and monitoring. We can help the computer see or track what we are interested in by providing more labeled data so it can learn the patterns we want it to detect, improving visibility with good lighting and clear angles, using visual markers or cues, and controlling the environment to reduce background clutter and maintain proper distance.

Week 5-Reading Response

Reading this made me think a lot about how different computer vision is from how I see the world. The author mentions that a computer is “unable to answer even the most elementary questions about whether a video stream contains a person or object,” and that honestly surprised me. It made me realize how much meaning humans naturally add when we look at something. We don’t just see pixels; we understand context, emotions, and objects instantly. Computers don’t do that; they see raw data that needs to be processed. That really changed how I think about interactive art, because I realized it’s not just about being creative; it’s also about setting up the right conditions so the computer can actually see what we want it to. I also noticed the author is very positive about computer vision. I don’t think he’s wrong, but I do feel he focuses more on the benefits than the risks, which makes me think he might be a little biased toward celebrating the technology.

The part about tracking and surveillance raised the biggest questions for me. In class, we saw that piece where the visuals changed based on how loud someone spoke, and that example helped me understand how the viewer becomes part of the artwork. The system watches you, follows your movement, and reacts right away. It’s cool, but it also feels like being watched. Even if the goal is interaction, it still brings up the idea of surveillance. It made me wonder where the line is between participation and being monitored. And how does this relate to the way technology watches us in everyday life? The reading didn’t fully answer those questions for me, but it definitely made me more aware of them.