WEEK 5 READING RESPONSE

Computer vision and human vision are very different in how they process information. While people can quickly understand what they see, computers need specific algorithms to detect motion, brightness changes, and object differences. We help computers “see” by adjusting lighting and using techniques like background subtraction and motion detection to improve tracking accuracy.

In interactive art, computer vision allows viewers to engage with the artwork in real-time. By tracking movements and gestures, it creates an immersive experience where the audience becomes an active participant, enhancing their interaction with the art.

However, this ability to track people also raises concerns about privacy, especially in public spaces. While it makes art more interactive and responsive, the same technology can be used for surveillance, which can feel invasive. Artists and technologists must strike a balance between creating innovative interactive art and respecting individual privacy, ensuring the technology is used responsibly and ethically.

 

MIDTERM PROGRESS

Superman Saves (Enhanced Game)

Concept of the Project

“Superman Saves” is an interactive game inspired by my previous project, which focused on simple character movement and rescue mechanics. In this version, I aimed to elevate the project by adding dynamic challenges, such as time limits, obstacles, and a progressive difficulty system. The objective is to control Superman as he navigates the sky, rescuing individuals while avoiding clouds and birds. The game becomes increasingly difficult with each successful rescue, introducing faster obstacles and reducing the time available to complete the rescue.

The concept is rooted in creating an engaging, responsive game environment that tests the player’s reflexes and strategic thinking. By introducing new features like lives, levels, and a timer, I’ve created a more immersive experience compared to the original version, which was relatively straightforward in terms of gameplay.

 

How the Project Works

The game begins with Superman stationed at the bottom of the screen, and a person randomly placed near the bottom as well, awaiting rescue. Using arrow keys, the player can move Superman to navigate the sky, avoid clouds and birds, and reach the person. Upon reaching the person, Superman flies upwards, carrying them to the top of the screen to complete the rescue.

A notable feature of the game is its dynamic difficulty adjustment. Each successful rescue increases the game’s difficulty by speeding up the clouds and bird movements, which adds a sense of progression. Additionally, the inclusion of a timer introduces a layer of urgency, forcing players to make quick decisions. I’m particularly proud of how the game manages the timer, lives system, and level progression seamlessly, as these were complex components to implement but significantly enhanced the overall experience.

The code uses object-oriented programming principles to manage the background stars, obstacles, and gameplay mechanics. I took advantage of arrays to efficiently handle the stars’ animations and the positioning of various game elements.
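As a rough illustration of this structure (placeholder names, not the project's actual code), the star array and the difficulty scaling could look something like this:

```javascript
// Sketch of the pattern described above; Star, obstacleSpeed, and onRescue()
// are hypothetical names used only for illustration.
class Star {
  constructor() {
    this.x = random(width);
    this.y = random(height);
    this.brightness = 200;
    this.twinkleSpeed = random(1, 3);
  }
  update() {
    // oscillate brightness for a simple twinkle animation
    this.brightness = 150 + 100 * sin(frameCount * 0.02 * this.twinkleSpeed);
  }
  show() {
    noStroke();
    fill(this.brightness);
    circle(this.x, this.y, 2);
  }
}

let stars = [];
let obstacleSpeed = 2; // sped up after every successful rescue
let rescues = 0;

function setup() {
  createCanvas(400, 600);
  for (let i = 0; i < 100; i++) stars.push(new Star());
}

function draw() {
  background(10, 10, 40);
  for (let s of stars) { // an array keeps the animation loop simple
    s.update();
    s.show();
  }
}

function onRescue() {
  rescues++;
  obstacleSpeed *= 1.15; // progressive difficulty: obstacles move faster each level
}
```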

 

Areas for Improvement and Challenges

One area that could be improved is the game’s overall visual design. Although the current visual elements (e.g., clouds, birds, and Superman) are functional, they could benefit from more detailed and polished artwork. Additionally, I would like to enhance the sound effects in the future, adding background music and sound cues for when Superman successfully completes a rescue or collides with an obstacle.

I encountered a few challenges during development, particularly in managing the game’s timer and ensuring that collisions between Superman and obstacles felt fair and consistent. I resolved this by tweaking the collision detection algorithm and adjusting the movement speeds of the obstacles as the difficulty increases. Another issue was ensuring that the game feels balanced at higher levels, where the speed increase can quickly overwhelm players. However, after adjusting the difficulty curve, the gameplay experience became smoother.
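For example, one way to make collisions feel fairer (a sketch of the idea, not the game's exact code) is to treat Superman and each obstacle as circles and shrink the combined radius slightly, so only clear overlaps register as hits:

```javascript
// Hypothetical helper: superman and obstacle are assumed to have x, y and a radius r.
function hitsObstacle(superman, obstacle) {
  const margin = 0.8; // 20% forgiveness so near-misses don't count
  let d = dist(superman.x, superman.y, obstacle.x, obstacle.y);
  return d < (superman.r + obstacle.r) * margin;
}
```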

 

EMBEDDED SKETCH

 

LINK TO FULL SCREEN

https://editor.p5js.org/b_Buernortey_b/full/p0Rs9Tzbk

Week 5: Reading Response

Computer vision differs from human vision in several ways. Humans can focus on basic features of objects and identify them even under different conditions, such as low light or slight changes in color and shape. In contrast, computer vision focuses on details rather than basic features, relying on a set of rules to detect objects. This can lead to mistakes when slight environmental or object modifications occur, such as changes in lighting. Another key difference is the ability to recognize objects in three dimensions. Humans can perceive depth, while computer vision typically operates in two dimensions, meaning that slight tilting of objects can cause confusion.

Various techniques can help computers see and track objects more effectively, similar to how humans do. One such technique is frame differencing, which is useful for detecting motion. It works by comparing consecutive frames, where differences in pixel color indicate movement. Another technique is background subtraction, where the computer is given a reference image of the background; when an object enters the scene, the computer detects the pixels that differ from the background and identifies the object. A third method is comparing pixels against a threshold value, which is especially useful when there is a significant difference in brightness between the background and the object. Object tracking can also be achieved by following the brightest pixel in a video frame: each pixel's brightness is compared to the brightest encountered so far, and its location is stored. This technique can be adapted to track the darkest pixel or multiple objects of different colors.
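As a small illustration, frame differencing can be sketched in p5.js like this (the threshold and canvas size are arbitrary example values):

```javascript
// Compare each webcam frame with the previous one and count how many pixels
// changed beyond a threshold: lots of changed pixels means motion.
let video, prevFrame;
const threshold = 40;

function setup() {
  createCanvas(320, 240);
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
  prevFrame = createImage(320, 240);
}

function draw() {
  image(video, 0, 0);
  video.loadPixels();
  prevFrame.loadPixels();

  let movedPixels = 0;
  for (let i = 0; i < video.pixels.length; i += 4) {
    // average the RGB channels to get a rough brightness for this pixel
    let curr = (video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2]) / 3;
    let prev = (prevFrame.pixels[i] + prevFrame.pixels[i + 1] + prevFrame.pixels[i + 2]) / 3;
    if (abs(curr - prev) > threshold) movedPixels++;
  }

  prevFrame.copy(video, 0, 0, 320, 240, 0, 0, 320, 240); // remember this frame
  fill(255, 0, 0);
  text('moving pixels: ' + movedPixels, 10, 20);
}
```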

In interactive art, the complexity of implementing certain ideas limits artistic expression, as only a few people have the expertise to implement such designs. However, with ongoing advancements making computer vision techniques more accessible and easier to use, art will increasingly benefit from these technologies.

Week 5 – Midterm Progress

Concept

Whenever I went to the arcade, the dance machine was always the most fascinating and fun for me. This inspired me to create a dancing game, but with a twist. The story takes place on a planet about to be invaded by aliens, where a cat is granted a superpower to defeat them. The catch is that his superpower is activated through dance moves.

An arcade dance machine (image via Odin Events)

Game design

The core mechanic will be moving the mouse to dodge falling asteroids/aliens and pressing the UP, DOWN, LEFT, and RIGHT arrow keys to destroy them. After destroying a number of enemies, extra moves will be unlocked that can destroy multiple aliens at the same time.

Implementation design

Structure of code (a rough skeleton is sketched after this list):

  • class for cat: display, move, unlock moves, variables: x, y, dance (current key pressed).
  • class for enemies: display, move (fall), variables: x, y, arrow, velocity.
  • starting screens: including story screens that tell the story of the cat.
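A rough skeleton of these classes could look like the following (placeholder names and visuals, not the final code):

```javascript
class Cat {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    this.dance = null;       // current key pressed (UP/DOWN/LEFT/RIGHT)
    this.unlockedMoves = []; // extra moves unlocked as enemies are destroyed
  }
  move() {
    this.x = mouseX;         // the cat follows the mouse to dodge enemies
  }
  display() {
    circle(this.x, this.y, 40); // placeholder visual until sprites are ready
  }
}

class Enemy {
  constructor(x, arrow, velocity) {
    this.x = x;
    this.y = 0;
    this.arrow = arrow;      // which arrow key destroys this enemy
    this.velocity = velocity;
  }
  move() {
    this.y += this.velocity; // fall toward the bottom of the screen
  }
  display() {
    square(this.x, this.y, 30);
  }
}
```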


Starting screens and tutorial


Game screens

Progress

Currently, I have implemented the basics of the cat class (move, display) and the arrow class (move, display). I also implemented randomly generated arrows and the mechanism that removes them from the screen when the user presses the matching key.

Struggle

For this project, my main concern is how to synchronize the music with the gameplay. Typically, dancing games generate arrows in time with the beat. In the current version, the arrows in my game are randomly generated with random timing and velocity. Therefore, in the next step, I plan to improve this by generating arrows based on the music’s beats.
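One possible way to do this (a sketch that assumes the song is already loaded with p5.sound and reuses the hypothetical Enemy class from the skeleton above) is to run p5.sound's peak detection on the low frequencies and spawn an arrow whenever a beat is detected:

```javascript
// Call setupBeatSync() once after the song starts and updateBeatSync() every
// frame from draw(). The frequency range and threshold need tuning per track.
let fft, peakDetect;
let enemies = [];

function setupBeatSync() {
  fft = new p5.FFT();
  peakDetect = new p5.PeakDetect(20, 120, 0.25); // listen to the kick-drum range
  peakDetect.onPeak(spawnArrow);                 // fires on every detected beat
}

function updateBeatSync() {
  fft.analyze();          // must run each frame before checking for peaks
  peakDetect.update(fft);
}

function spawnArrow() {
  const arrows = ['UP', 'DOWN', 'LEFT', 'RIGHT'];
  enemies.push(new Enemy(random(width), random(arrows), 4));
}
```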

Another struggle is finding sprites for the game. There are not many resources online for dancing-game sprites. I tried converting a GIF into a sprite sheet, but the file size was too large to import into p5.js. In the next step, I will try to create my own sprites and design the visuals for the game.

Next steps

  • Implement extra moves
  • Add collision when arrows fall on the cat
  • Apply visuals
  • Create starting scenes
  • Fix initialization of falling arrows

 

Week 5: Midterm Progress – “Animal Sounds Trivia”

Concept

My idea for the midterm is to create a simple game that I will call “Animal Sounds Trivia.” I will create a simple sprite sheet of a human character walking and running, with the character's movement controlled by the arrow keys.
As the user presses an arrow key, the background will scroll so that the character appears to move in that direction. As the character moves, I will place animal animations at different intervals depending on how far the character has travelled (determined by how long the arrow key was held). When an animal appears, I will play a sound related to that animal, then freeze the background and show a pop-up window asking the user to name the animal. If the user gives the correct name, they earn points; otherwise they can choose to listen once more and guess again, or reveal the animal and receive zero points for it.

The game will run through around 10 animals and end with a window that displays the user's total score and gives them the option to replay the game or quit.

My front screen should look something like this one.

Implementation

I hope to implement my game using four classes. The first is the Player class, which will manage the sprite sheet for the player's walking and standing animations as well as the player's position relative to the starting point. The second is the Animal class, which will manage the animal sounds and images/animations. The third is the Message class, which will manage the pop-up window, compare the user's answers with the correct ones, and update the player's score. The last is the Game class, which will contain instances of the other three classes and manage the game flow, allowing restarts and quitting.
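A rough outline of these four classes (placeholder names and fields, not final code) might look like this:

```javascript
class Player {
  constructor(spriteSheet) {
    this.spriteSheet = spriteSheet; // walking/standing frames
    this.totalFrames = 8;           // assumed frames per walk cycle
    this.frame = 0;
    this.distance = 0;              // how far the player has moved from the start
  }
  update(dir) {
    this.distance += dir;           // dir is -1, 0 or 1 from the arrow keys
    this.frame = (this.frame + 1) % this.totalFrames;
  }
}

class Animal {
  constructor(name, img, sound, position) {
    this.name = name;
    this.img = img;
    this.sound = sound;
    this.position = position;       // distance at which this animal appears
  }
}

class Message {
  constructor() {
    this.visible = false;
  }
  ask(animal) { /* show the pop-up, read the guess, compare with animal.name */ }
}

class Game {
  constructor(player, animals, message) {
    this.player = player;
    this.animals = animals;         // around 10 Animal instances
    this.message = message;
    this.score = 0;
  }
  update() { /* scroll the background, trigger animals, show the final score */ }
}
```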

The starting window will briefly detail instructions on movement and which keys to use. With every interaction, users will know what to do, since the pop-up window will guide them.

Challenges

My hope is that I can create some good animations, matching the photo I attached or at least coming close to it. I have seen that I can achieve this using Unity, which I have started to explore and hopefully will work out.

To minimise the risk given the time frame, I have also considered a simpler fallback design that uses basic animations built from sprite sheets available online.

Midterm Project Progress: Eye of the Sound

Concept

“Eye of the Sound” is a project which transforms sound into mesmerizing visual art using the principles of physics. Many others before me have used music as a way of generating artwork.

However, there was one artwork on Vimeo that was quite interesting to see: a circular spectrogram. My project would use this visualization as its core and expand on it.

This spectrogram appears to be based mostly on the loudness of the sound, and though it is an intriguing and cool artwork, it has no user interaction, no generative features, and so on. That's where my project, “Eye of the Sound,” comes in. It is based on the sensory experiences of sound and sight; the circular flow is inspired by the flow of life, and the final result resembles the iris of an eye, a symbol of life.

Implementation

I have a basic idea of what must be done in the project. The central part is FFT (Fast Fourier Transform) analysis, used to distinguish the different frequencies in the song, with a linear visualizer to display them. The visualizer's displacement from the mean position is stored in an array and used to draw the results onto a rotating graphics buffer layer. The layer rotates in one direction while the visualizer rotates in the opposite direction, cancelling out the rotation so the visualizer “appears” stationary.

The color of the imprints on the layer is based on the loudness, while their intensity is determined by the FFT analysis. The subtitles are displayed below the visualization.
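A simplified sketch of this core idea is shown below. It assumes p5.sound is available and that a hypothetical file song.mp3 is loaded; instead of counter-rotating the visualizer against the buffer, it simply draws each frame's spectrum at an angle that advances over time, which produces the same circular-spectrogram effect:

```javascript
let song, fft, amp, layer;
let angle = 0;

function preload() {
  song = loadSound('song.mp3'); // hypothetical file name
}

function setup() {
  createCanvas(600, 600);
  layer = createGraphics(600, 600);             // buffer the imprints accumulate on
  layer.background(0);
  layer.colorMode(HSB, 360, 100, 100, 100);
  fft = new p5.FFT(0.8, 256);                   // frequency analysis
  amp = new p5.Amplitude();                     // overall loudness
}

function mousePressed() {
  if (!song.isPlaying()) song.play(); // browsers need a user gesture to start audio
}

function draw() {
  let spectrum = fft.analyze(); // 256 frequency bins, values 0-255
  let level = amp.getLevel();   // 0..1 loudness, drives the hue

  layer.push();
  layer.translate(width / 2, height / 2);
  layer.rotate(angle);
  for (let i = 0; i < spectrum.length; i++) {
    let r = map(i, 0, spectrum.length, 20, width / 2); // bin index -> radius
    let alpha = map(spectrum[i], 0, 255, 0, 100);      // bin energy -> intensity
    layer.stroke(map(level, 0, 0.3, 160, 360) % 360, 80, 100, alpha);
    layer.point(r, 0);
  }
  layer.pop();

  angle += 0.01; // slow sweep around the circle
  image(layer, 0, 0);
}
```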

The user also has another option: they can sing along with the music being played, and their voice imprints are calculated and displayed in the same manner, but in a different color scheme.

This means providing a menu screen with instructions and separate buttons that lead the user to the mode they want.

Improvements which can be made

  • Adding a song-selection screen, where the user can choose from several songs in both modes
  • Adding an animated progress bar showing how much of the song is left
  • Adding some generative effects on the subtitles, or some sparkle effects on the “highs” of a song
  • Adding a performance score in the second mode to see how well the user has done with the song
  • A “Save Canvas” option for the user to store their experiences.

Week 5 reading

This reading was instrumental in my understanding of how computer vision techniques can be harnessed in the realm of interactive art and design.

One of the most enlightening aspects of the article was its clear explanation of the fundamental differences between computer and human vision. Understanding these distinctions helped me grasp why certain approaches are necessary when implementing computer vision in artistic contexts. The emphasis on the limitations of computer vision systems, such as their struggle with environmental variability, underscored the importance of thoughtful design in both the physical and digital realms.

The article’s discussion of various techniques to optimize computer vision for artistic applications was particularly valuable. Levin’s explanations of methods like controlled lighting and detection algorithms provided me with a toolkit of practical approaches. This knowledge feels empowering, as it opens up new possibilities for creating interactive artworks that can reliably detect and respond to elements in a scene.

The ethical considerations raised in the article regarding tracking and surveillance capabilities of computer vision were thought-provoking. Levin’s examples of artists like David Rokeby and the Bureau of Inverse Technology, who have used these technologies to comment on surveillance culture and social issues, inspired me to think about how I might incorporate similar critical perspectives in my own work.

Furthermore, the range of artistic applications presented in the article, from full-body interactions to facial expression analysis, expanded my understanding of what’s possible with computer vision in art. These examples serve as a springboard for imagining new interactive experiences and installations.

In conclusion, this reading has significantly enhanced my understanding of computer vision in the context of interactive art. It has equipped me with technical knowledge, practical approaches, and critical perspectives that I’m eager to apply in my own creative practice.

Midterm Project Progress – Week 5

Concept and Introduction

Like a movie teaser, this would serve as the very first visual to be seen for this project. I designed it using Adobe Photoshop. Additional images are from pixabay.com

For my midterm project, I wanted to explore something I deeply love—a concept that excites me to bring to life. This project combines elements from my Week 1 and Week 3 assignments into an interactive piece that I’m thrilled to work on. My love for sci-fi began with watching Dragon Ball Z and was later fueled by shows like Naruto, Marvel and DC animated and live-action films, Star Wars, and many more. From this inspiration, I created a universe that’s too vast to fully explain here, but I can say that this project represents a small piece of something much larger. ‘The Flame Boy’ is a character I’ll be exploring through interactive storytelling, which excites me because it allows me to experiment with a different medium than the filmmaking I’m most accustomed to.

In short, ‘The Flame Boy’ is about a young boy who lives with his grandfather. He was abandoned by the royal family (the reasons are explained in the project) and left on the side of the planet where the sun never rises. He meets someone special who inspires him to trace his roots, as he never felt he truly belonged to this side of the world. The interactive story allows the user to explore this world and learn more about this character. Eventually, he discovers the truth about his family, specifically the Robinsons.

The concept of the interactive artwork offers a choice: you can either explore The Flame Boy’s world or uncover the truth (think ‘red pill, blue pill,’ if you will). Choosing to explore his home lets the user interact with his room and discover his personality. On the other hand, choosing to know the truth allows the user to experience the story through interactive storytelling, which will unfold as you continue the journey.

 

User Interaction

  1. The interactive artwork begins with an opening splash screen. On this screen, there are visual cues guiding the user to enter and continue. This is the starting point every time the user explores this world. A soundtrack plays in the background whenever this screen appears. The following images are rough sketches of how I envisioned the splash screen before moving into Photoshop:
Cover screen idea 1
Cover screen idea 2

  2. Once the user presses any button, they are transported to a menu screen. This screen presents them with the option to either explore The Flame Boy’s world/home or learn about his story.

If they choose to explore his home, the screen transitions to a scene resembling his house. Users will be able to interact with various objects within his space, allowing them to learn more about him through this interaction. This will be created using a combination of shapes in p5.js, along with a few images, music, and sounds. The experience will be simple and intuitive.

 

If the user chooses to learn about his story, they are transported into a movie/book-like environment. Here, a narrator introduces the protagonist, explaining how he was born, how he received his name and powers, and why he is where he is now. The user can advance to the next page by clicking the screen. As the story progresses, they meet a magician who guides the protagonist in discovering his identity.

The user is then presented with another choice: either ‘shoot the stars’ using The Flame Boy’s fire powers to earn 100 star coins, or navigate their way through a dark maze using The Flame Boy’s fire as a light source. The maze changes each time the user selects this option, creating an unpredictable and ‘random’ experience. Once the user completes these mini-games, they witness The Flame Boy meeting his parents for the first time. The experience then concludes, prompting the user to start over.

The following image is a simple node sketch I made in photoshop which depicts the flow of the program in its entirety:

This was the 3rd version of this visual. A more sophisticated version exists. The nodes represent the structure explained above.

The following is a brief progress report on the program as of the time of writing. The music was made in Suno A.I:

 

Most frightening Part and its Solution

Problem: The main challenge of my interactive artwork lies in implementing the two mini-games within the story segment. This project feels like three projects combined into one, which makes me concerned about whether users will find it as rewarding as I hope. Specifically, I felt apprehensive about how to implement the maze game and the shooting stars feature.

Solution: At the time of writing this report, I am researching ways to integrate these features using predefined algorithms available on GitHub. I will discuss these algorithms in my final update, whether they work or if there’s a need to change the creative and technical approach. For now, this is my progress update.
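For reference, one common algorithm that satisfies the "maze changes every time" requirement is a depth-first (recursive backtracker) generator. The following is a generic p5.js sketch of that idea, not necessarily the implementation I will end up using:

```javascript
const cols = 15, rows = 15, cell = 30;
let grid = [];

function setup() {
  createCanvas(cols * cell, rows * cell);
  // each cell stores walls as [top, right, bottom, left]
  for (let y = 0; y < rows; y++)
    for (let x = 0; x < cols; x++)
      grid.push({ x, y, walls: [true, true, true, true], visited: false });

  // carve passages with a depth-first search from the top-left cell
  let stack = [grid[0]];
  grid[0].visited = true;
  while (stack.length > 0) {
    let current = stack[stack.length - 1];
    let next = randomUnvisitedNeighbor(current);
    if (next) {
      removeWallBetween(current, next);
      next.visited = true;
      stack.push(next);
    } else {
      stack.pop(); // dead end: backtrack
    }
  }
  noLoop();
}

function draw() {
  background(0);
  stroke(255);
  for (let c of grid) {
    let px = c.x * cell, py = c.y * cell;
    if (c.walls[0]) line(px, py, px + cell, py);
    if (c.walls[1]) line(px + cell, py, px + cell, py + cell);
    if (c.walls[2]) line(px, py + cell, px + cell, py + cell);
    if (c.walls[3]) line(px, py, px, py + cell);
  }
}

function cellAt(x, y) {
  if (x < 0 || y < 0 || x >= cols || y >= rows) return undefined;
  return grid[y * cols + x];
}

function randomUnvisitedNeighbor(c) {
  let options = [cellAt(c.x, c.y - 1), cellAt(c.x + 1, c.y),
                 cellAt(c.x, c.y + 1), cellAt(c.x - 1, c.y)]
                .filter(n => n && !n.visited);
  return options.length ? random(options) : undefined;
}

function removeWallBetween(a, b) {
  if (b.x - a.x === 1)       { a.walls[1] = false; b.walls[3] = false; } // b is right of a
  else if (b.x - a.x === -1) { a.walls[3] = false; b.walls[1] = false; } // b is left of a
  else if (b.y - a.y === 1)  { a.walls[2] = false; b.walls[0] = false; } // b is below a
  else                       { a.walls[0] = false; b.walls[2] = false; } // b is above a
}
```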

The following is an image of the splash screen as a thank you for reading this report in its entirety (and because I’m excited to share this image I made in Photoshop!).

Image made in photoshop for the interactive piece titled: The Flame Boy: Becoming a Robinson. Further developments will be made. Images were downloaded from Pixabay and Meta A.I

Week 5: Midterm Progress


# Introduction

Hey everyone! 👋

Like probably everyone else on the entire planet, I wasn’t really sure what to do for my Intro to IM Midterm Project at NYUAD (ok fine, maybe not everyone else) (fun fact, I’m still not certain 😂😭). I was deliberating whether to stick with traditional input devices (mouse and keyboard), or go for something interesting and new (like the classic, and probably cliché, face or body tracking). Unfortunately, I don’t own a guitar like Pi Ko, so I guess I’ll have to stick with something traditional 😅 (now that I think about it, I could use a piano, wait a moment…)

 

# Piano Game Concept

Temp Piano Game Concept, with keys controlling the height of the bars, and a ball on it
A piano game concept!…
Gosh… why do I waste so much time, on a joke concept…

So the keys of the piano (but optionally also the keyboard, allowing everyone to play it) control the height of the bars, and you have to get a ball that drops out of the start area into the goal, potentially avoiding obstacles and overcoming bad physics, all the while automatically producing a (probably ear-piercing) random garble of sounds. That’s a win-win-win!

Ok, this was a joke. Now back to the actual concept I wanted to talk about 😅 (though to be honest, this seems more original and fun ._.)

 

# Actual Game Concept

You know those outlines they have of people (like at crime scenes, but don’t think about that)?

Body outline

Or those ones where a cartoon character goes through a wall?

Human shaped hole in wall

 

Well, what if you had to make the correct pose to fit through a series of holes and avoid hitting the wall? That’s the essence of my idea. I thought about it while thinking on how I could utilise face/body tracking (which probably shows 😅), which is exactly the wrong approach (you’re supposed to first have an issue/idea, then think about how to solve it, not try to find a use case for a certain technology, that’s like a solution in search of a problem). Also, this idea is simple and obvious enough that while I haven’t seen it yet, it very well might already be a real thing. Still, I find the idea quite charming, especially as I envision it on a large screen, with people frantically and hilariously jumping around positions. I will also include a “laptop mode”, where the program only shows upper body cutouts, in order to make it accessible and playable on laptops too (where it would be hard to get the distance required for the camera to see the full body, while still allowing you to comfortably see the screen).

It should not come as a surprise then, that I plan to use body tracking to be able to control the motion of the character. Of course, it would be a gargantuan task to implement a decent body tracking solution from scratch (something more for a final year’s or even PhD’s research project than an Intro to IM midterm), so I will use a pre-existing library to handle the detection for me, mainly MoveNet (probably through ml5.js), a fast and accurate pose detection model by Google.

As I was thinking about this a bit, I began thinking that maybe this could be about a ninja’s training. You enter the secret facility (the interface is hopefully interactable through hand gestures, similar to the Xbox with kinect, if I have enough time), and then have to undergo the “lightning quick attack & defense poses” (or something), as part of your training.

 

## Complex part & Risk Reduction

As part of our midterm progress blog post, we have to identify the most complex/frightening part, and do something to tackle and reduce that risk. For this project, it is obviously the body tracking, so I created a new sketch to test out the library, and coded together a simple barebones concept to ensure the tracking works and to check whether I can reliably assess if a person is in the correct pose. In addition to the pose detection, it also displays a skeleton of the person (from the detected points), and I made it faintly show the webcam’s video feed (both to help the player adjust their position). Also, it shows a (faint) guide to the target pose.
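A stripped-down version of that test sketch is shown below. It assumes ml5.js v1's bodyPose API for MoveNet (so the exact signatures may differ in other versions), and the keypoint names, confidence threshold, and "target pose" check are example values only:

```javascript
let video, bodyPose, poses = [];

function preload() {
  bodyPose = ml5.bodyPose('MoveNet');
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(640, 480);
  video.hide();
  bodyPose.detectStart(video, results => { poses = results; });
}

function draw() {
  background(0);
  tint(255, 60); // faint webcam feed so players can adjust their position
  image(video, 0, 0);
  noTint();

  if (poses.length > 0) {
    fill(0, 255, 0);
    noStroke();
    for (let kp of poses[0].keypoints) {
      if (kp.confidence > 0.3) circle(kp.x, kp.y, 10); // detected skeleton points
    }
    // crude pose check: is the left wrist raised above the left shoulder?
    let wrist = poses[0].keypoints.find(k => k.name === 'left_wrist');
    let shoulder = poses[0].keypoints.find(k => k.name === 'left_shoulder');
    if (wrist && shoulder && wrist.y < shoulder.y - 20) {
      text('Pose matched!', 20, 30);
    }
  }
}
```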

Mini rant:

I first made an image for the target pose, but I’m still somehow unable to upload anything to p5, despite trying with different browsers, at different times, and even different accounts, since the start of the semester (yep, I haven’t been able to upload anything at all). Luckily in this case, it was just a simple image, so I drew it manually with p5.js (which is better in fact, since the image automatically updates if I change the target pose), but still, this is extremely annoying and limiting. If you know any solutions, please let me know.

 

Try and see if you can match the pose!

 

(Hint: It tests for the shoulders, elbows, and wrists. Also, try moving back and forth if the frame doesn’t fit, and try changing the lighting if it isn’t detecting you properly (ignore the flickering, that’s a known issue))

It works! Something I don’t like is the constant flickering, but I think I might be able to mostly solve that (at the expense of slower update times, by using a moving/sliding average), so I would consider this a success!
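The sliding-average idea is small enough to sketch here (the history length is just an assumed value):

```javascript
const HISTORY = 5;  // frames to average over: bigger = smoother but laggier
let history = [];

function smoothed(point) { // point = {x, y} for one keypoint from the detector
  history.push(point);
  if (history.length > HISTORY) history.shift();
  let avg = { x: 0, y: 0 };
  for (let p of history) { avg.x += p.x; avg.y += p.y; }
  avg.x /= history.length;
  avg.y /= history.length;
  return avg;
}
```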

Week 5: Reading Reflection of “Computer Vision for Artists and Designers”

Hey there! 👋

I was reading this great article about Computer Vision for Artists and Designers, which mainly talks a bit about the history of computer vision, interactive media using computer vision, and some basic computer vision techniques.

It made me think about an interesting comparison: the difference between our vision and that of a computer. While the structural and architectural differences are obvious (biological vs technological), a more interesting comparison point lies in the processing. Video, as the article mentions, doesn’t contain any inherent, easily understood meaning. Instead, in both cases, it is up to the processing to extract key information and figure out the required details. While such a task is usually easy for our brain, trivial in fact, it is an incredibly difficult problem to overcome, especially from scratch. It might seem easy to us because our brain is one of the most complex things in the universe, and also had a huge head start 😂, allowing it to develop specialised pathways and processes capable of carrying out this operation in a breeze, which is understandable as vision is an extremely important tool for our survival. Computer vision, meanwhile, is starting this gauntlet of a task nearly from scratch. While we do not have to evolve biological structures or compete for our survival, there is still a very real challenge, though thankfully, we can progress on this frontier much quicker than evolution or nature could (that big brain of ours coming in handy again. Wait, if you think about it, we are using our eyes and brain, to replace, our eyes and brain… and our entire body and self, in fact).

There are several methods of implementing different aspects of computer vision. Some basic ones, mentioned in the article, include detecting motion (by comparing the difference between the current frame and the previous one), detecting presence (by comparing the difference between the current frame and one of the background), brightness thresholding (just checking if a pixel is lighter or darker than a certain threshold), and rudimentary object tracking (usually by finding the brightest pixel in the frame), though many other basic techniques also exist (e.g. recognising what a picture is by comparing it against a stored collection of images). These basic implementations however cannot withstand much alteration in their expected environment, and even something relatively simple (like the background changing colour, perhaps by a shadow being cast) would render them ineffective. Also, trying to implement a task as vast as computer vision by precise, human-defined algorithms is extremely hard (I know all algorithms are technically precisely defined, so what I mean by this is that we have specifically coded in behaviours and processes).
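To make one of these concrete, here is roughly what brightest-pixel tracking looks like in p5.js:

```javascript
let video;

function setup() {
  createCanvas(320, 240);
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
}

function draw() {
  image(video, 0, 0);
  video.loadPixels();

  let brightest = -1, bx = 0, by = 0;
  for (let y = 0; y < video.height; y++) {
    for (let x = 0; x < video.width; x++) {
      let i = (y * video.width + x) * 4;
      let b = video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2];
      if (b > brightest) { brightest = b; bx = x; by = y; }
    }
  }

  noFill();
  stroke(255, 0, 0);
  circle(bx, by, 20); // mark the brightest point, e.g. a flashlight held by the user
}
```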

A far more successful approach (like in many other fields, from aerodynamics to energy efficiency) has been to try and somewhat copy what nature does (though on a simpler scale). “Recent” approaches like neural networks, and other machine learning techniques, have begun far outperforming anything a precisely defined algorithm could do on most real world tasks. The simplest structure of a neural network is to have neurones, or nodes, connected to other nodes, with the value simply modified as it travels from node to node (vastly oversimplifying), mimicking the model of the neurons in a brain (though in a much more basic representation). The beauty of such approaches is that they leave the specifics undefined, allowing the model and training process to improve itself, automatically choosing a somewhat good selection of processes to extract meaningful information (when done right). Of course, this doesn’t mean that this is easy, or that once a neural network is made, all the problems can be solved by simply feeding it enough data – innovation in the architecture of the neural networks themselves (e.g. the introduction of U-nets, LSTMs, transformers, etc) and the surrounding ecosystem must also happen – but it does allow a different way of doing things, which has so far yielded fruitful results (check out any of the vast number of computer vision libraries, including PoseNet, YOLO, and probably most importantly, OpenCV).

Though the advancements in computer vision are coming ever faster, and many people around the globe dream about being able to control things with just their body and vision (probably in no small part due to movies showing cool futuristic tech), the use of computer vision, in anything, is still a heated discussion. On one hand, widely implementing it could provide alternative pathways to accomplish tasks for many people who might otherwise not be able to do them, increasing accessibility, and it also enables several use cases which before were never even thought to be remotely possible (and also, it’s just cool!). On the other hand however, as with any technology, we have to acknowledge that the vast capabilities of such a system could also be used for nefarious purposes. In fact, governments and corporations are already misusing these capabilities to carry out dystopian practices on an unprecedented scale, most visibly in mass surveillance, targeted tracking, selling user info, avoiding responsibilities (e.g. car manufacturers on warranty claims), and so much more, all for power and profit (you might think I’m exaggerating a little, but no, look it up. The rabbit hole goes extremely deep (and I’d be happy to chat about it 🙂 )). As a certain spidey’s uncle once said, “With great power comes great responsibility”, and currently, we can’t be sure that those with that power won’t abuse it, especially as they already routinely have.

With such a conundrum of implications and possibilities, computer vision’s use in interactive art is no doubt a little dampened, with it predominantly being featured in places where it is easily and visibly stopped/escaped, such as galleries, public (but not too public) and private installations, and apps and websites users trust, rather than throughout every aspect of our lives (though it is undoubtedly hard to fully trust an app/website, and that trust can be easily broken, though of course, not everyone is as wary of these technologies).