Week 5: Creative Reading

Reading Levin’s “Computer Vision for Artists and Designers” made me realize how much I take my own sight for granted and got me questioning what it means for a computer to “see.” I had always imagined it as something similar to human vision, just more technical, with lots of code and numbers. This reading made it clear that computer vision is not actually vision in the way we experience it. While we humans automatically recognize faces, objects, details, and context without consciously thinking about it, a computer just receives pixel values without understanding their meaning, since it only processes numbers and the differences between frames. That really stood out to me because it shows how much interpretation we naturally do as humans without noticing it.

Another important difference is that human vision is flexible while computer vision depends heavily on certain conditions: if the lighting or background changes even a little, a computer vision system can fail completely. The reading emphasized how algorithms like frame differencing and background subtraction rely on stable environments and clear contrasts, which made me realize that computer vision is not just about writing code but also about carefully designing the physical environment so that the system can succeed. Using strong contrast, controlled lighting, or retroreflective materials can all improve tracking. This makes the artist responsible for preparing the place or area so the computer can see clearly.
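The frame differencing idea can be sketched in a few lines of JavaScript. This is a simplified illustration on grayscale pixel arrays, not real p5.js capture code, and the function names and threshold are my own assumptions:

```javascript
// Frame differencing: sum the absolute per-pixel differences between two
// consecutive grayscale frames; "motion" is whatever clears a threshold.
function frameDifference(prevFrame, currFrame) {
  let total = 0;
  for (let i = 0; i < currFrame.length; i++) {
    total += Math.abs(currFrame[i] - prevFrame[i]);
  }
  return total;
}

// A stable, well-lit scene keeps the difference near zero; a background or
// lighting shift makes it spike, which is exactly the fragility the
// reading describes.
function motionDetected(prevFrame, currFrame, threshold) {
  return frameDifference(prevFrame, currFrame) > threshold;
}
```

Seeing it this small makes the point clear: nothing here knows what a face or an object is, it is only arithmetic on numbers.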

Overall, this reading made me see computer vision through a more structured system that has its own limitations. It also made me think about how these tracking systems can feel like surveillance, which gives the artist power to make the audience feel observed or even part of the piece in a more intense way. As someone working in interactive media, this makes me more aware that designing with computer vision is not just about making something work, but about understanding what the system can actually perceive and what it cannot.

Week 5: Midterm Progress

Concept:
Instead of making a competitive game with winning or losing, I wanted to create something calm and creative while still connecting to me personally. I decided to create “Morocco’s Door Studio” for my midterm project. I was inspired by traditional Moroccan doors: the detailed arches, colorful zellige tiles, and decorative handles that make each door unique. Whenever I come across these doors in Morocco, even when the patterns and colors are similar, the final piece is always different. The experience is similar to a digital sticker book: the user starts with a blank Moroccan door frame and builds their own entrance by selecting decorative elements such as tiles, arches, and knockers, while adding their own colors. The goal is for the user to experiment freely and design something that feels visually pleasing to them. Traditional Gnawa music will play in the background to enhance the atmosphere and make the experience feel immersive and culturally rooted.

Design:

The design will focus on Moroccan tones like gold, blue, red, and nude colors. The door frame will remain centered on the canvas as the main focus, while decorative options will appear as selectable buttons on the side of the screen (for example “Tiles,” “Knockers,” “Arches”). When the user selects a category and clicks on the door area, a decoration will appear at the mouse position. Each decoration will be represented as an object and added to an array so multiple elements can be placed on the screen. The experience will begin with an instruction screen that explains how to play, and the user must press a key or click a button to begin designing. After decorating, a “Reset” or “Start Again” button will clear the canvas so users can experiment again. Additionally, to make the interaction more engaging, a soft sound will play when a decoration is placed.
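A minimal sketch of this object-and-array idea, with hypothetical names (in the real p5.js sketch, placeDecoration would be called from mousePressed() and the array looped over in draw()):

```javascript
// Each placed element remembers its category and position so the draw
// loop can re-render the whole array every frame.
class Decoration {
  constructor(category, x, y) {
    this.category = category;
    this.x = x;
    this.y = y;
  }
}

const decorations = [];

// Called once per click inside the door area.
function placeDecoration(category, x, y) {
  decorations.push(new Decoration(category, x, y));
  return decorations.length; // how many decorations exist now
}
```

A “Reset” button then only has to empty the array (e.g. `decorations.length = 0`) and the door is blank again.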

Images: (inspo/planning for my game/experience)

Challenging Aspects:

I think one of the most challenging aspects of this project will be managing the different decoration objects and making sure they are stored and displayed properly. Since every click creates a new object, I need to make sure the objects are correctly added to an array and drawn continuously in the draw() function. Another challenge is organizing the different design categories: when the user selects “Tiles,” only tile decorations should be placed, not arches or knockers. Finding transparent PNG images for the decorations might also take a while, but I can try creating them on my own or maybe using AI for this part.

Risk Prevention:

Although I am still working on this project and won’t know what issues I will run into, to reduce the risk of the objects becoming confusing, I will first create a simple test version where clicking the screen generates basic colored shapes using a Decoration class I recently learned about. This will help me confirm that the objects are properly created and stored in the array.

Before adding detailed images, I will test everything using simple shapes like rectangles and circles. Once the system works, I will replace those shapes with the actual decorative images I have saved.

For the category selection, I will use a variable, maybe named currentCategory, that updates whenever a button is clicked. This way the program always knows which type of decoration to place on the screen.
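A tiny sketch of that idea, with assumed names (each side-panel button would call selectCategory, and the click handler reads currentCategory):

```javascript
// The single source of truth for which decoration type gets placed next.
let currentCategory = 'tiles';

// Each category button calls this with its own name.
function selectCategory(name) {
  currentCategory = name;
}

// On a click in the door area, the new decoration takes the active category.
function decorationForClick(x, y) {
  return { type: currentCategory, x: x, y: y };
}
```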

Week 5 – Reading Response – Kamila Dautkhan

I think one of the ways computer vision differs from human vision is that while we see “meaning,” a computer sees only the math behind it. Human vision is very semantic: when we look at a video, we instantly see a person, a mood, or a story. For a computer, that same video is just a massive dump of pixel data and color coordinates, and it has zero clue whether it’s looking at a human or something else until we write an algorithm to figure that out. To help the computer see what we’re interested in, we can use tricks like background subtraction, where the computer compares a live shot to a blank photo of the room to spot what’s new, or frame differencing, which tracks motion by subtracting one frame from the next.

The fact that computer vision is rooted in military surveillance really colors how it’s used in interactive art. Because computer vision was born in military and law enforcement labs, these systems are designed to track and monitor, which is why they bring a sense of control to interactive art. In works like Myron Krueger’s Videoplace, though, the tracking is used for play: it turns your body into a paintbrush, giving you a role in the digital world. Projects like Suicide Box show that surveillance can also be used to track tragic social phenomena that the government might ignore.



Week 5 – Midterm Intro – Kamila Dautkhan

The Concept 

I’ve decided to create an interactive game called “Star Catcher.” The core idea is a fast-paced arcade game where the player controls a collector at the bottom of the screen to catch falling stars. Since I wanted the interaction to feel smooth, I decided to make it mouse-driven. The user starts at a menu screen and has to click to enter the game. My goal is to eventually add more layers, but for now I’m focused on the core loop: fall, catch, score, repeat.

Designing the Architecture

I’ve already started laying the foundation of the code using object-oriented programming.

  • The Player class, which handles the paddle at the bottom.
  • The Star class, which manages the falling behavior, the random speeds, and resetting once a star is caught or hits the ground.
  • State management, where a gameState variable handles the transition from the start screen to the active game and finally to the game-over screen.
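That state management boils down to a tiny state machine; here is a sketch where the exact state names are my assumption from the description:

```javascript
// gameState moves one way: 'start' → 'playing' → 'gameover' → 'start'.
let gameState = 'start';

// A mouse click either enters the game or restarts after a game over.
function handleClick() {
  if (gameState === 'start') gameState = 'playing';
  else if (gameState === 'gameover') gameState = 'start';
}

// Called when the lose condition is met during play.
function triggerGameOver() {
  if (gameState === 'playing') gameState = 'gameover';
}
```

In the p5.js draw() loop, a simple if/else on gameState then decides which screen to render each frame.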

The “Frightening” Part 

The scariest part for me was asset management and collision logic. I was really worried about the game crashing because of broken image links or the sound not playing.

To minimize the risk, I wrote a “fail-safe” version of the code. Instead of relying on external .png or .mp3 files that might not load, I used createGraphics() to generate my own star “images” directly in the sketch and p5.Oscillator to produce a synthesized beep instead of an audio file. This let me test the collision detection algorithm (using the dist() function) and get immediate audio feedback without worrying about file paths.
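The dist()-based catch check works like this as a standalone sketch (p5.js provides dist() built in; the function names and radii here are my own assumptions):

```javascript
// Same math as p5's dist(): straight-line distance between two points.
function dist(x1, y1, x2, y2) {
  return Math.hypot(x2 - x1, y2 - y1);
}

// A star counts as caught when its center is closer to the paddle's
// center than the sum of their radii.
function isCaught(starX, starY, starR, paddleX, paddleY, paddleR) {
  return dist(starX, starY, paddleX, paddleY) < starR + paddleR;
}
```

Treating the paddle as a circle keeps the check to one comparison; a rectangle-vs-circle check would be slightly more accurate but is not needed for the core loop.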

 

 

Midterm Progress

Midterm Idea:

I was inspired by Portal 2 by Valve in this endeavour, with my own twist on the gameplay and concept.

While I haven’t settled on a fitting name thus far for the project, the concept entails a “multi-faceted” reality, with a ray-gun that fires a bullet. When this bullet is incident on specific “panels” in the game, it may reflect off the surface of the panel. The objective is to reflect the bullet such that it reaches a designated mechanism to progress to the next level or finish the game.

The Twist:

This gets interesting in that there are two realities that this special ray-gun, bullet, and the panels engage with. That is, the ray-gun may fire bullets in either Dimension ‘A’ or Dimension ‘B’. If a panel is of the same ‘dimension’ as the bullet, the bullet will hit the panel and shatter it (this may unlock new areas or cause a loss if the wrong panel is destroyed). Alternatively, if the panel and the bullet differ in their ‘dimension’, the bullet reflects off the panel and can then be further reflected off other panels, as the concept graphics below show:
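The dimension rule, plus the standard vector reflection it relies on, can be sketched like this (function names are mine; the reflection formula is the usual v' = v − 2(v·n)n for a unit normal n):

```javascript
// Same dimension: the bullet shatters the panel.
// Different dimension: the bullet reflects off it.
function resolveHit(bulletDim, panelDim) {
  return bulletDim === panelDim ? 'shatter' : 'reflect';
}

// Reflect a 2D velocity (vx, vy) off a panel with unit normal (nx, ny):
// v' = v - 2 (v . n) n
function reflect(vx, vy, nx, ny) {
  const d = vx * nx + vy * ny;
  return [vx - 2 * d * nx, vy - 2 * d * ny];
}
```

The same formula extends to 3D by adding a z component, which is what the engine would eventually need.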

 

The game will feature inspiration from very simple brutalist architecture, with 1K PBR textures, because I like a degree of realism in my games. Maps will generally be dimly lit, and there will be cool snippets of lore that players can find as they explore different parts of the map or progress through the game.

User Interaction:

Other than the aforementioned ray-gun, there will be WASD keys for forward-backward and left-right movement. Additionally, to look around them, users can drag their mouse towards the sides of the screen to pan their camera. Users can walk to all accessible parts of the map, which will be made sufficiently obvious to avoid confusion, as I am intending on using a darker theme for this game.

Users may use E to fire a bullet, and Q to switch the dimension setting of the ray-gun, id est the dimension of the bullet it fires.
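The Q-key behavior reduces to a simple toggle; a sketch with assumed names:

```javascript
// Q toggles which dimension the ray-gun fires into.
let gunDimension = 'A';

function toggleDimension() {
  gunDimension = gunDimension === 'A' ? 'B' : 'A';
  return gunDimension;
}
```

In p5.js this would sit inside keyPressed(), with E reading gunDimension when spawning the bullet.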

Progression:

There are a total of two levels I plan to implement, and users are limited to three shots from their ray-gun before they have to restart the ENTIRE game. I want to convey a feeling of greater loss, rather than simply having them respawn at their current level.

Development & Implementation:

I have thus far coded in a very simple game engine to handle all objects. I am going to implement physics and collisions soon, along with more textures and the map.

There are many classes in this, but the two I shall focus on here are ‘BasePart’, which handles a simple 3D box or part of the map, and ‘Workspace’, a class that contains all BasePart and other extended instances in the game, as well as the camera that displays the map to the user.

Major Concerns and What I’m Dreading:

This project may seem overly ambitious, but the initial game engine setup has come through nicely. I am not looking forward to collision handling, since that is always a nightmare.

Code and Credits:

Find here a link to my code: https://editor.p5js.org/rk5260/sketches/b6B9BVyoE

All code is written by myself (exception below), with many elements I first learnt how to use from the p5.js documentation, after which I implemented them myself.

Exception: the code for the PBR shaders was assembled and modified from multiple existing shaders on GitHub and from the p5.js documentation itself.

All PBR texture sources are credited in a credits.txt file in the PBR textures directory. (Thus far it’s only GravelConcrete.)

 

Week 5 – Midterm Progress Zere

Concept of the project: I decided to create an interactive piece called “Day in the Life of a Cat,” where the user gets to experience the normal stuff cats do throughout their day.

Why did I choose this concept? Well, I have a hairless sphynx cat named Luna, and I love her VERY MUCH. I miss her a lot since she lives with my grandparents, and I decided to remind myself of her by creating this interactive experience.

What and How: The user gets to experience morning, day, and night as a cat, clicking on different items and discovering what they can do with them. I decided to keep the interface very simple, since overcomplicating things would make it hard for me personally to produce a good interactive experience. Here is my sketch for now:
I think one of the most important parts of the sketch is the “meow” sound that plays when users click on the paw. That is why I created a test program for sound playback. It may be simple and flawed for now, but I think it gives me an easy solution to the problem if it arises. Here is the link for the program: LINK.

Midterm Progress

Concept & Interaction

What I love even more than horror games is psychological, story-based visual novels: games that hold you in place, extremely focused and afraid to even blink in case you miss something important (or something that will get you in trouble). I also really love when innocent, soft, childlike things are framed in a way that makes you uncomfortable, creating a two-sided feeling of nostalgia and comfort mixed with unsettling disturbance.

More than that, for a very long time I have wanted to experiment with computer vision and body capturing, so I decided to combine these two things in my midterm.

What I want to make is a game controlled by the user’s video input. In Russia, we play a clapping game called “Ладушки” (ladushki; I believe in English it’s called Patty Cake), where you need to match the rhythm of the other person clapping, as well as their hands (right to right, left to left, two hands to two hands). I want the user to play this game with the computer. There will be a girl in a room who welcomes the player to play this game with her. Her clapping will be sequential, and the player just has to match her hand states with their hands.

The twist is that if the player fails to match the girl’s rhythm and hand states, she will get angry, and as the user makes more mistakes, she will get angrier. As the anger level increases, the whole picture and game will become distorted: video glitching (in later phases, disappearing), the rhythm becoming unstable and/or much faster, unfair hand detection, intentional mistakes in detecting the state of the hands, distorted sound, the girl’s post-mistake phrases turning aggressive, and her appearance shifting as well. If the user makes it to anger level 100, there will be a jumpscare with their own video distorted (I figured that out of all possible jumpscares, this will make the most impact).

Some details about the concept can be found as comments in my code, and the general outline is planned to be like in the picture below. To create a creepy atmosphere, I plan to use really soft colors and a cute art style that won’t match the gameplay and plot.

Code Outline

Right now I have decided to focus on the technical side of the project and making the algorithm work, so that after this I will have the core mechanic. Then I will focus on visual and sound design: drawing sprites, finding suitable sounds, creating glitching effects, scales, text, etc.

This is my current plan for what the code should include:

  • class Girl with one object instantiated and the following methods:
    • talking method (reaction when user fails to match the girl’s tempo)
    • changing hands states method
    • comparing Girl’s to user’s states method
    • drawing girl method (sprites needed)
    • anger level scale draw method
  • Functions in the general code block:
    • detecting user’s handpose function
    • displaying user’s video
    • video distortion function
    • sound implementation + sound distortion methods
    • final video screamer function
  • Assembled game in setup() and draw() with restart option (maybe pause + exit buttons ?)
Code Made

Sketch requires camera access!

This is my sketch so far and the code I made. As I said, now I’m focusing on the technical part. Now, the code can:

  • detect the user’s hand pose
  • display the user’s video in the corner
  • react when the user fails to match the girl’s tempo (Girl class)
  • change the girl’s hand states (Girl class)
  • compare the girl’s states to the user’s (Girl class)

Instead of drawing, the code currently just outputs the anger level and the state of the girl’s hands. The code compares the video input and the user’s hand positions with the girl’s hand state. When the user makes a mistake, the anger level increases by 10 and a text phrase is displayed on the screen (3 phrases for each sector, with 4 sectors depending on the anger level). However, the text doesn’t stay on screen yet (to be fixed). Also, the game loop stops once the anger level reaches 100.
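The anger mechanic described above boils down to a counter and a sector lookup; here is a sketch where the sector boundaries are my assumption:

```javascript
let anger = 0;

// Each mistake adds 10, capped at 100, where the game loop stops.
function registerMistake() {
  anger = Math.min(anger + 10, 100);
}

// 4 sectors (0-24, 25-49, 50-74, 75-100), each with its own 3 phrases.
function angerSector() {
  return Math.min(Math.floor(anger / 25), 3);
}

function isGameOver() {
  return anger >= 100;
}
```

The sector index then selects which group of phrases (and which level of visual distortion) to use.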

The base code for the hand detection function is adapted from the ml5 HandPose reference page. I also used The Coding Train’s video about HandPose.

Complex Part

I believe the most difficult part of my midterm is working with video input and hand detection. It’s a pretty new concept for me, and it’s hard to use it not just as a small interactive component but as the core concept around which the game is built. The risk of improper pose detection, poor video input, and glitching is quite high. However, I tried to build this part first, and it turned out not to be too difficult. After testing my code for some time, I defined three poses the computer should recognize: two hands open, left hand open, right hand open. Ideally, to fit my concept and the Ladushki gameplay, I would also need a pose for a clap, but the problem is that when hands are clapped edge-on to the camera, hand detection disappears. Since this could break the game and unfairly register a user mistake when there isn’t one, I decided to ignore this state entirely and check only for claps facing the camera, when the palm is toward the computer.

Also, to avoid random poses being detected, I added a confidence level: if the confidence in the hand detection is lower than this set level, the computer won’t register it as a pose. This helps a lot in not identifying some random pose or stray movement as a user mistake.
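The confidence gate is essentially a filter; this sketch assumes each detection carries a confidence score, loosely following the shape of ml5 HandPose results (the threshold value is an assumption):

```javascript
const CONFIDENCE_THRESHOLD = 0.8; // assumed cut-off

// Only detections that clear the threshold count as real poses;
// everything below it is ignored instead of being scored as a mistake.
function acceptedPoses(detections, threshold = CONFIDENCE_THRESHOLD) {
  return detections.filter(d => d.confidence >= threshold);
}
```

Ignoring low-confidence frames rather than scoring them keeps the "unfair mistake" failure mode out of the fair phases of the game.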

Now, the most challenging part for me will be the visual design. I don’t have much experience in this type of creative work, unlike in coding, so creating sprites and building an environment that serves the goal of the game and suits its atmosphere, while arranging everything properly and not overloading the screen, will be a bit hard for me. To check my progress and track the aesthetic impact, I will ask my friends for feedback and consult AI for some basic design rules about how things should be arranged in the final outcome.

I believe that this project is really fun and much easier to make than I expected, since the hardest part was mostly completed already!

Some of my inspirations for design, concept and aesthetic are: Needy Streamer Overload and DDLC

Week 5 – Reading Response

What are some of the ways that computer vision differs from human vision?

Previously, I always kind of linked computer vision with machine learning. I assumed there was some use of machine learning to identify the different objects in a given video and to really understand the movements and interactions within it. After reading this article, though, I feel I’ve gained a much clearer understanding of how computer vision actually works, as well as of the limitations of the technology available. While both computers and humans can probably identify where a person is in a video and track their movements, humans are also usually able to predict their next movements. Humans are familiar with how people interact with objects, while computers really depend on data, which can miss anomalous cases or outliers. An example that may seem a bit far-fetched: someone who only has four fingers. Human vision can obviously comprehend that, while I assume a computer vision system may not be able to tell that something is missing in the image, since it is only programmed to work with the norm.

In terms of computer vision’s capacity for tracking and surveillance and its effect on its uses in interactive art, I think one of the examples from the article, Suicide Box, combines those two ideas nicely. The tracking and surveillance aspect of computer vision has been used to create an art piece (kind of) about suicide and to emphasize irregularities in data. An issue that immediately comes up for me with computer vision is privacy. A tool once so heavily used for tracking and surveillance, now used in interactive art, may seem suspicious to viewers, who might be paranoid that these art pieces are collecting data about them. However, I’m not sure this is a common concern, considering most art pieces we’ve looked at that use computer vision have been well received.

Week 5 – Reading Reflection

It was interesting to learn about how computers actually see, and what stood out for me was the variety of methods a computer can employ to see and make decisions or create art. The selection of a computer vision technique adds complexity to interactive works and alters how one can interact with them. The right technique must also be selected to minimize errors and ensure consistency, as some techniques are known not to perform well in certain conditions.

One possible application of this is that an interactive artwork involving computer vision can be placed strategically in an arts exhibition to accentuate or improve the system’s vision. Carefully selected pieces can be placed around the work to generate the needed contrast, brightness, or effects for the computer vision, just like how the white Foamcore was used for the LimboTime game.

The use of surveillance to generate art was also something worth a closer look. Are there any privacy restrictions or laws protecting the identities of the people in these forms of art, and how are their privacies protected? The work Suicide Box by the Bureau of Inverse Technology makes me question whether artists actually have the right to use data or information like this to create a piece of work. It gives me the impression that they are making a spectacle of tragedy. I am also left with the question: how do they respect the dignity of those who jumped off the bridge?

Week 5 – Reading Response | COMPUTER VISION FOR ARTISTS AND DESIGNERS

When I think of Computer Vision, the first thing that comes to my head is this coder called the Poet Engineer on social media who uses computer vision to create the most insane visuals purely from the camera capturing their hand movements. They have the coolest programs ever. I also love it when artists make videos of them creating cool things with their hands purely through code, and one of my favourite examples of using code to create art is Imogen Heap’s MiMu gloves. And, also, the monkey meme face recognizer I keep seeing everywhere (photo attached). It still baffles me that we can use our hands and our expressions to control things on a device that usually interacts with touch! So, this reading was one of my favourite readings so far, because it discussed one of the main concepts that hooked me into interactive media in the first place. 

From what I understood of the text, the primary difference between computer and human vision is that while a human observer can understand symbols, people or environmental context like whether it’s day or night, a computer (unless programmed otherwise) perceives video simply as pixels. Computer vision uses algorithms now to make assertions about raw pixels, and even then, designers need to optimize the physical environment to make it “legible” to the software, such as using backlighting to create silhouettes or using high-contrast and retroreflective materials. Despite these limitations, is it still not insane that we’ve evolved so much that we can make computers identify specific things now, despite it being a computer? The fact that now computers can have hardware that goes beyond our own capabilities, such as infrared illumination, polarizing filters and more is almost scary to think about. I’d also say that computer vision is much more objective than human vision. Is it possible for computers to suffer from inattentional blindness as much as we do? For example, when we enter a room and fail to see something and then we come back and the object is right there and it never moved, is a computer capable of the same thing?

I liked that this reading laid out the different techniques used in computer vision, because when I first encountered CV, I was overwhelmed by the number of things it could sense. I understood these techniques (and I’m listing them down so I can refer to them later as well):

  1. Frame Differencing / Detecting Motion: Detects motion by comparing each pixel in a video frame to the corresponding pixel in the next frame.
  2. Background Subtraction / Detecting Presence: Detects the presence of objects by comparing the current video frame to a stored image of an empty background.
  3. Brightness Thresholding: Isolates objects based on luminosity by comparing brightness to a set threshold. (I did an ASCII project a few years ago, where it would capture your image, figure out the contrast and brightness, and then replicate the live video input as letters, numbers, and symbols. I would like to replicate that project with this concept now!)
  4. Simple Object Tracking: Program computer to find the brightest or darkest pixel in a frame to track a single point. 
  5. Feature Recognition: Once an object is located, the computer can compute specific characteristics like area or center of mass (this is CRAZY). 
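Brightness thresholding (technique 3 above) is simple enough to sketch directly; this hypothetical version maps each grayscale pixel to pure black or white:

```javascript
// Pixels at or above the threshold become white (255), the rest black (0).
function thresholdImage(pixels, threshold) {
  return pixels.map(p => (p >= threshold ? 255 : 0));
}
```

An ASCII version like my old project would just swap the 255/0 for characters picked by brightness band.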

There are definitely more techniques out there, but I’ll start off with the basics, since I’m a complete beginner at this. I do want to try using feature recognition paired with simple object tracking, something I noticed is used in hand tracking (and the monkey video, LOL).

I mentioned the objectivity of CV earlier, but what happens if the datasets these systems are trained on are biased? What if the creator behind the program implements their own biases into it? I like how Sorting Daemon (2003) mentioned looking at the social and racial environment, because I was wondering about situations where CV could be programmed to unintentionally (or intentionally) discriminate against certain traits such as race, gender, or disability. Surveillance is a scary concept to me too, because what happens to the question of consent? While computer vision can be used to reveal hidden data in environments that are often overlooked, create programs that help people without a human needing to be present (e.g. Cheese), and so many other cool things, it can also be used negatively. I need to make sure that any programs I create with CV are inclusive and not used for ill intent.