Reading Response – Computer Vision for Artists and Designers

Reading afterthoughts:

In “Computer Vision for Artists and Designers”, multiple interesting concepts and projects are presented that strongly piqued my interest.

For starters, one term that instantly caught my attention was “computer vision techniques”. I did not expect the text to dive into this topic; going in, I assumed it would focus on the uses and applications of computer vision rather than a proper explanation of how it works. As a Computer Science major, I was immediately shocked to discover that “there is no widely agreed-upon standard for representing the content of video”. Having worked with video myself, it never crossed my mind that there are multiple standards for handling video information, which in many cases can be almost as bad as having no standard at all. Why? The problem lies in the level of understanding computers have. They have no knowledge of the content itself; they only know how to display it, whether blurred, in black and white, or however you want, but they cannot tell whether a person or a dog is in the frame.

Instead of diving into AI and model training, I wanted to mention how interesting it is that older projects were able to work with computer vision before artificial intelligence was even a consideration. We do not need to teach the computer to understand a scene; we can start by simply telling it what to recognize. The simplest idea is detecting motion. By doing something as trivial as checking whether consecutive frames changed, we can reliably detect motion. Even better, we can check how the pixels changed: if pixel A was at position X, and that same pixel and its neighbors are later found at position Y, then we can confidently guess that whatever pixel A and its neighbors portrayed moved from X to Y.
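To illustrate the idea, here is a minimal frame-differencing sketch in p5.js; the threshold value is arbitrary, and it assumes a webcam is available:

// Minimal frame-differencing sketch: compare each webcam frame to the
// previous one and mark pixels that changed beyond a threshold.
let video;
let prevFrame;
const threshold = 40; // arbitrary sensitivity value

function setup() {
  createCanvas(320, 240);
  pixelDensity(1);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  prevFrame = createImage(width, height);
}

function draw() {
  background(255);
  video.loadPixels();
  prevFrame.loadPixels();
  loadPixels();
  for (let i = 0; i < video.pixels.length; i += 4) {
    // compare brightness of the current and previous frame at this pixel
    const curr = (video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2]) / 3;
    const prev = (prevFrame.pixels[i] + prevFrame.pixels[i + 1] + prevFrame.pixels[i + 2]) / 3;
    const changed = abs(curr - prev) > threshold;
    // paint changed pixels black, unchanged ones white
    pixels[i] = pixels[i + 1] = pixels[i + 2] = changed ? 0 : 255;
    pixels[i + 3] = 255;
  }
  updatePixels();
  // remember this frame for the next comparison
  prevFrame.copy(video, 0, 0, video.width, video.height, 0, 0, width, height);
}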

Overall, this reading was very enjoyable from the point of view of someone who loves coding and working with computer vision. It was interesting to see concepts I could recognize, and at the same time exciting to learn new information about those exact topics.

Week 5 – Reading Reflection

The study assigned as our reading covers how computer vision technology started and how it evolved through its use by different artists and fields. I think most readers already sensed that computer vision’s utility is nearly limitless: we’ve already seen so many works that use computer vision, which suggests plenty of potential for further development and extension.

The reading was a good reminder that, as much as computer vision can lead to fascinating works, it carries many limitations and needs careful consideration in order to maintain good accuracy. The fact that how precisely the technology works depends on our own decisions is, in my opinion, another of its charms.

I also like how the work Suicide Box sparks questions. I understand that there can be different views (especially ethical) on the approach, and I wouldn’t say one side is more ‘correct’ than the other. However, I do want to say that the sole fact that it sparked questions and discussions about an issue people tended to walk away from and ignore is significant in its own right.

Midterm Progress 2

For now, I still have not started the coding part (I am getting to it), but I finally found an idea for my midterm.

I saw a project of a spinning record and really liked the idea. It is going to have sound and animation, and I will add more to it, from buttons to music, and possibly abstract art for the background.

The spinning record will play an instrumental mp3 file that I composed in Logic. At the bottom I will add a sprite sheet of a character dancing to the uploaded music.
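As a rough sketch of how these pieces might fit together in p5.js (the file names record.png, dancer_sheet.png, and beat.mp3, as well as the sprite-sheet layout, are placeholders rather than my final assets):

// Rough sketch: spin a record image while an mp3 plays, and cycle
// through sprite-sheet frames underneath. File names are placeholders.
let record, sheet, beat;
let angle = 0;
const FRAMES = 8, FRAME_W = 64, FRAME_H = 64; // assumed sprite-sheet layout

function preload() {
  record = loadImage("record.png");
  sheet = loadImage("dancer_sheet.png");
  beat = loadSound("beat.mp3"); // requires the p5.sound library
}

function setup() {
  createCanvas(400, 400);
  imageMode(CENTER);
}

function mousePressed() {
  if (!beat.isPlaying()) beat.loop(); // browsers need a click before audio starts
}

function draw() {
  background(20);
  // spinning record
  push();
  translate(width / 2, height / 3);
  rotate(angle);
  image(record, 0, 0, 200, 200);
  pop();
  angle += 0.05;
  // sprite-sheet character: pick the current frame from one row of the sheet
  const frame = floor(frameCount / 6) % FRAMES;
  image(sheet, width / 2, height - 80, FRAME_W, FRAME_H,
        frame * FRAME_W, 0, FRAME_W, FRAME_H);
}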

Sprite sheet:

Spinning record example:

Logic instrumental:

MIDTERM PROJECT UPDATE 2

AURA PART 2 UPDATE

If you can remember, I was basing my original concept on this painting that I made a couple of weeks ago.

Here is an update of how things are going.

PROGRESS:

So for this iteration of the project, I was mainly focused on letting the user type in the initial of their name and having the particles in the sketch change colors accordingly, via a CSV file I made that connects each letter of the alphabet to a different set of colors. FUN, I know!

I was able to get this part of the project going and start mixing and matching the colors so that the user can actually see multiple colors being displayed in the sketch. 
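For context, the general lookup approach could look something like this minimal sketch, assuming a hypothetical colors.csv with columns letter, color1, color2, and color3; it is not my exact project code:

// Sketch of the CSV lookup idea: find the row for the typed initial and
// pull out its three colors. Column names are assumptions.
let table;
let colorsForSample = [];

function preload() {
  table = loadTable("colors.csv", "csv", "header");
}

function setup() {
  createCanvas(400, 400);
  const sample = "A"; // the typed initial
  for (let r = 0; r < table.getRowCount(); r++) {
    if (table.getString(r, "letter") === sample) {
      colorsForSample = [
        color(table.getString(r, "color1")),
        color(table.getString(r, "color2")),
        color(table.getString(r, "color3")),
      ];
    }
  }
}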

COOL CODE:

  // TYPE IN INITIAL
  let sample = "A";

  // find the row index of the sample initial and store it in idx
  for (let i = 0; i < alphabet.length; i++) {
    if (alphabet[i] == sample) {
      idx = i;
    }
  }
  // get the corresponding colors for that letter
  colors_for_sample = [color1[idx], color2[idx], color3[idx]];

  // scatter the particles randomly across the canvas
  for (let i = 0; i < num; i++) {
    particles.push(createVector(random(width), random(height)));
  }
}

function draw() {
  // no background() call here, so the squares accumulate across frames
  for (let i = 0; i < num; i++) {
    let p = particles[i];

    let c1 = color(colors_for_sample[0]);
    let c2 = color(colors_for_sample[1]);

    // lerpColor() interpolates between two colors to find a third color
    // between them; noise() decides where along that blend each particle sits
    fill(lerpColor(c1, c2, noise(p.x * noiseScale, p.y * noiseScale)));
    noStroke();
    square(p.x, p.y, 2);
  }
}

In the next step of my project I hope to add a background image behind my sketch, as well as some sound to make things a bit more interesting.


week5.reading – Computer Vision for Artists and Designers

In his article Computer Vision for Artists and Designers, Golan Levin writes about the progression of computer vision and how it has played a crucial role in shaping what we perceive to be interactive tech, art or not. It is interesting to note that before people began experimenting with computer vision for artistic endeavors, the “application development for computer vision technologies, perhaps constrained by conventional structures for research funding, has generally been limited to military and law-enforcement purposes” (Levin). Nevertheless, in our fast-paced and exponentially growing society, it is striking how vastly our computer vision capabilities expand with each decade.

In his article, Levin demonstrates multiple examples of computer vision meeting artistic and interactive ideas, created over the past few decades. Levin also focuses on the different techniques used to compute over visual files, mainly through pixel analysis, and this led me to reflect on how, throughout my childhood, we took these technologies for granted. When I was 10, like any other young aspiring boy who liked to play video games, I dreamt of creating the perfect set-up to record myself playing my favorite games. Green screens were extremely popular at that time among creators, allowing them to capture only the subject of the video and project it onto a different layer. This effect was ultimately used to create a more immersive experience for the viewers; however, it is only now that I realize how these applications function and what algorithms and processes are involved in creating the seamless effect of being able to change your background. And with each month, we see more implementations of these techniques; for instance, Zoom allows people to change their backgrounds, even without a proper green screen.
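As a toy illustration of the underlying pixel analysis, a crude chroma-key pass in p5.js might look like the sketch below (the “green enough” test is an arbitrary threshold, the webcam setup is assumed, and real keyers are far more sophisticated):

// Toy chroma-key sketch: replace "green enough" webcam pixels with a
// solid background color.
let video;

function setup() {
  createCanvas(320, 240);
  pixelDensity(1);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
}

function draw() {
  video.loadPixels();
  loadPixels();
  for (let i = 0; i < video.pixels.length; i += 4) {
    const r = video.pixels[i], g = video.pixels[i + 1], b = video.pixels[i + 2];
    const isGreen = g > 100 && g > r * 1.4 && g > b * 1.4; // crude threshold
    if (isGreen) {
      // swap a new background in behind the subject
      pixels[i] = 30; pixels[i + 1] = 30; pixels[i + 2] = 120;
    } else {
      pixels[i] = r; pixels[i + 1] = g; pixels[i + 2] = b;
    }
    pixels[i + 3] = 255;
  }
  updatePixels();
}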

In conclusion, I believe this is a fascinating topic for many to explore, and Levin’s article brings the complexities behind computer vision algorithms into a much simpler context.

Midterm Progress

Midterm – Pride Dragon Generator

Inspired by the simple yet compelling cartoons of @dinosaurcouch, and the fun and customizability of avatar designing websites such as Picrew, I set out to make a “Pride Dragon Generator” in which users can select an LGBTQ+ identity and receive a dragon with spike colors of the corresponding pride flag. I wanted to incorporate an educational element as well, and what came to mind was using the generator to teach people LGBTQ+ terms in Chinese. When the user hovers their mouse over each of the buttons, an audio clip will play, pronouncing the Chinese term. When they click on a button, they will get a picture of a dragon with, for example, lesbian flag spike colors. They can then save this image to their device.

Dinosaurcouch comic, featuring dinosaurs with lesbian and bi flag color spikes

One of the many customizable avatars in Picrew image maker

Most Frightening Part & Risk Reduction

The most frightening parts of this midterm are 1) the underlying logic of the “hover over button” interactions and how users will move forward and back to the homepage, 2) the sound-playing and image-saving functionalities, and 3) the complexities of drawing a dragon.

To address risk #1, I first tried to make interactive buttons on my own, and then went to IM Lab hours. With help from Coding Train and Zion, the IM lab assistant, I now have the basic logic of my program, and buttons that enlarge when you hover over them. The next steps are adding the sound and customization options.
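Roughly, a hover-enlarge button of this kind can be done like the sketch below (the size, position, and label are placeholders, not my final layout):

// Sketch of a button that enlarges when hovered.
let btn = { x: 200, y: 200, w: 120, h: 50, label: "lesbian" };

function setup() {
  createCanvas(400, 400);
  rectMode(CENTER);
  textAlign(CENTER, CENTER);
}

function draw() {
  background(240);
  const hovering =
    mouseX > btn.x - btn.w / 2 && mouseX < btn.x + btn.w / 2 &&
    mouseY > btn.y - btn.h / 2 && mouseY < btn.y + btn.h / 2;
  const scaleFactor = hovering ? 1.2 : 1.0; // grow slightly on hover
  fill(255);
  rect(btn.x, btn.y, btn.w * scaleFactor, btn.h * scaleFactor, 10);
  fill(0);
  text(btn.label, btn.x, btn.y);
}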

To address risk #3, I went to this website to try to better understand bezierVertex, and played around with it a bit to get the hang of which numbers control which aspects of the shape.
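Here is a small experiment along those lines; the two control points bend the curve, and the final pair is the anchor the curve ends on (the numbers are arbitrary):

// Small bezierVertex() experiment: one curved "spike" shape.
function setup() {
  createCanvas(400, 400);
  background(255);
  noFill();
  beginShape();
  vertex(100, 300);                            // starting anchor
  bezierVertex(130, 100, 270, 100, 300, 300);  // control, control, end anchor
  endShape();
}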

week5.assignment – Midterm Project Progress

Concept Design

As we were looking at previous midterm projects in class, one of them reminded me of a type of game I used to really enjoy. The project that reminded me of this was the Coffee shop experience. A few years back, I greatly enjoyed playing mobile 2D puzzle / escape room games, which had a very similar structure to how the Coffee shop experience functioned. You could zoom into certain parts of the walls where objects were placed and solve interactive puzzles, which would eventually lead you to figure out a final code to open the door and finish the game. Thus, I decided to attempt to create a virtual 2D escape room game of my own. I am still debating whether I should create the images myself or find references online. I began by sketching out two versions of how I would want the game to function.


I am still not fully decided on the overall theme of the experience/puzzle game; however, I will shortly begin sketching possible ideas for the visuals and take it from there.

Code Design

In order to piece all of the walls and “zoom-in puzzles” together, I am sure I will need some kind of layer system and a way to switch between layers based on state indicators.

As of now, I think the best approach for this would be to create a class that would help differentiate between the different scenes. Additionally, I need to consider where I will include each of these elements:

  1. At least one shape – Perhaps the door; I may also create shapes as objects underlying the images to detect when they are selected.
  2. At least one image – Images for puzzles, keys, characters, backgrounds, etc.
  3. At least one sound – A theme song playing in the background.
  4. At least one on-screen text – One of the puzzles will be a riddle, which will include on-screen text.
  5. Object-Oriented Programming – I will create a “Layer Manager” class to help me switch between different layers, such as the overall wall view or a zoom into a specific puzzle (a rough sketch of this idea follows the list).
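Here is a rough sketch of the “Layer Manager” idea, not final code: each scene is an object with its own display() and mousePressed(), and the manager decides which one is active (the scene names are placeholders).

// Rough sketch of a scene/layer manager for switching views.
class LayerManager {
  constructor() {
    this.scenes = {};    // name -> scene object
    this.current = null; // name of the active scene
  }
  add(name, scene) {
    this.scenes[name] = scene;
  }
  switchTo(name) {
    this.current = name;
  }
  display() {
    if (this.current) this.scenes[this.current].display();
  }
  mousePressed() {
    if (this.current) this.scenes[this.current].mousePressed();
  }
}

let manager;

function setup() {
  createCanvas(400, 400);
  manager = new LayerManager();
  manager.add("wallView", {
    display: () => { background(200); fill(0); text("wall view", 20, 20); },
    mousePressed: () => manager.switchTo("puzzleZoom"),
  });
  manager.add("puzzleZoom", {
    display: () => { background(100); fill(0); text("zoomed-in puzzle", 20, 20); },
    mousePressed: () => manager.switchTo("wallView"),
  });
  manager.switchTo("wallView");
}

function draw() {
  manager.display();
}

function mousePressed() {
  manager.mousePressed();
}
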
Frightening Concepts

Since I have not yet tried creating clickable objects, I believe this aspect, along with switching views, will be the most challenging for me. To overcome this, I will research ways to implement it. I have a few ideas about how to create clickable objects, and I will build a tester p5 sketch where I try out all of the concepts that are complex to me. Once I am sure they work well, I can confidently add them to my midterm project.
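As a starting point for that tester sketch, a clickable object can simply know its own bounds and report whether a mouse press landed inside it; this is a minimal example, not my final implementation:

// Tester sketch for clickable objects: a shape reports whether the
// mouse press landed inside it.
class Clickable {
  constructor(x, y, w, h) {
    this.x = x; this.y = y; this.w = w; this.h = h;
    this.selected = false;
  }
  contains(px, py) {
    return px > this.x && px < this.x + this.w &&
           py > this.y && py < this.y + this.h;
  }
  display() {
    fill(this.selected ? "gold" : "white");
    rect(this.x, this.y, this.w, this.h);
  }
}

let door;

function setup() {
  createCanvas(400, 400);
  door = new Clickable(150, 100, 100, 200);
}

function draw() {
  background(220);
  door.display();
}

function mousePressed() {
  if (door.contains(mouseX, mouseY)) {
    door.selected = !door.selected; // later: switch to the zoomed-in view
  }
}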

Midterm Progress Report 2 – ArtfulMotion: A Digital Canvas of Creativity

Initially, my plan was to create an engaging game by incorporating various elements, including sprites, background parallax scrolling, object-oriented programming (OOP), and TensorFlow.js integration for character control through either speech recognition or body motion detection.

However, for several reasons, I have changed my midterm project idea. The game I was initially going to create would have likely been a remake of an existing game, and it didn’t sound very authentic. My goal for taking this class is to challenge myself creatively, and I gained valuable insights during Thursday’s class, which greatly influenced the ideas I’m going to implement in my project. The part I was missing was probably deciding which machine learning model to use. After observing Professor Aya’s demonstration of the poseNet model in class, my project’s direction became clearer. I have transitioned from creating a game to crafting a digital art piece.

As I write this report on October 7 at 4:29 PM, I have been experimenting with the handpose model from the ml5 library. Handpose is a machine-learning model that enables palm detection and hand-skeleton finger tracking in the browser. It can detect one hand at a time, providing 21 3D hand key points that describe important palm and finger locations.

I took a systematic approach, first checking the results when a hand is present in the frame and when it isn’t.
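For reference, the model loading follows the standard ml5 handpose pattern, roughly like this (a sketch, not my exact code): the predictions array is empty when no hand is in the frame and holds one prediction with 21 landmarks when a hand is detected.

// Standard ml5 handpose setup on a webcam feed.
let video, handpose;
let predictions = [];

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  handpose = ml5.handpose(video, () => console.log("handpose model ready"));
  handpose.on("predict", (results) => {
    predictions = results; // [] when no hand is visible
  });
}

function draw() {
  background(0);
  image(video, 0, 0, width, height);
  fill(255);
  text(`hands detected: ${predictions.length}`, 10, 20);
}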


My next step was to verify the accuracy of the points obtained from the model. I drew green ellipses using the model’s points to ensure they corresponded to the correct locations on the hand. I noticed that the points were mirrored, which was a result of my attempt to mirror the webcam feed.

I resolved this issue by placing the drawing function for the points between the push() and pop() functions I used to mirror the feed.
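Roughly, the fix looks like the sketch below; it reuses the video and predictions variables from the handpose setup above and is a simplified version rather than my exact code:

// The mirroring fix: draw the video and the keypoints inside the same
// push()/pop() block so the flip applies to both.
function draw() {
  background(0);
  push();
  translate(width, 0);
  scale(-1, 1); // mirror the feed horizontally
  image(video, 0, 0, width, height);
  drawKeypoints(); // drawn inside the mirrored coordinate system
  pop();
}

function drawKeypoints() {
  for (const prediction of predictions) {
    for (const [x, y] of prediction.landmarks) {
      fill(0, 255, 0);
      noStroke();
      ellipse(x, y, 8, 8);
    }
  }
}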

I also discovered that the object returned from the prediction included a bounding box for the hand. I drew out the box to observe how it was affected by the hand’s movements. I plan to use the values returned in topLeft and bottomRight to control the volume of the soundtrack I intend to use in the application.
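One possible way to do that mapping, sketched under the assumption that predictions comes from the handpose model above and soundtrack is a p5.SoundFile loaded in preload() (the distance range is arbitrary):

// Idea for volume control: use the bounding box width as a rough proxy
// for how close the hand is, and map that to the soundtrack volume.
function updateVolume() {
  if (predictions.length === 0) return; // keep the current volume if no hand
  const box = predictions[0].boundingBox;
  const boxWidth = box.bottomRight[0] - box.topLeft[0];
  // assumed range: a far hand ~50 px wide, a near hand ~400 px wide
  const vol = constrain(map(boxWidth, 50, 400, 0, 1), 0, 1);
  soundtrack.setVolume(vol);
}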

I have also spent time brainstorming how to utilize the information from the model to create the piece. The relevant information I receive from the model includes landmarks, bounding box, and handInViewConfidence. I am contemplating whether to compute a single point from the model’s points or to utilize all the points to create the piece. To make a decision, I have decided to test both approaches to determine which produces the best result.

In light of this, I created a new sketch to plan how to utilize the information from the model. In my first attempt, I created a Point class that takes x, y, and z coordinates, along with the handInViewConfidence. The x, y, and z coordinates are mapped to values between 0 and 255, while the handInViewConfidence is mapped to a value between 90 and 200 (these values are arbitrary). All this information is used to create two colors, which are linearly interpolated to generate a final color.
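The class looks approximately like this (the exact mapping ranges in my working code may differ; this is a simplified reconstruction):

// Approximate Point class: the landmark position and confidence are
// mapped into color channels, and two candidate colors are blended.
class Point {
  constructor(x, y, z, confidence) {
    this.x = x;
    this.y = y;
    // map raw values into usable ranges (ranges are arbitrary)
    const r = map(x, 0, width, 0, 255);
    const g = map(y, 0, height, 0, 255);
    const b = map(z, -50, 50, 0, 255);
    const c = map(confidence, 0, 1, 90, 200);
    // two candidate colors, linearly interpolated into the final one
    const colorA = color(r, g, b);
    const colorB = color(c, c, c);
    this.col = lerpColor(colorA, colorB, 0.5);
  }
  draw() {
    stroke(this.col);
    strokeWeight(4);
    point(this.x, this.y);
  }
}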

After creating the sketch for the Point class, I incorporated it into the existing sketch for drawing the landmarks on the hand. I adjusted the drawKeyPoint() function to create points that were added to an array of points. The point objects were then drawn on the canvas from the array.

// A function to create points for the detected keypoints
function loadKeyPoints() {
  for (let i = 0; i < hand.length; i += 1) {
    const prediction = hand[i];
    for (let j = 0; j < prediction.landmarks.length; j += 1) {
      const keypoint = prediction.landmarks[j];
      points.push(new Point(keypoint[0], keypoint[1],
                            keypoint[2], prediction.handInViewConfidence));
    }
  }
}



I also worked on creating different versions of the sketch. For the second version I created, I used curveVertex() instead of point() in the draw function of the Point class to see how the piece would turn out. I liked the outcome, so I decided to include it as a different mode in the program.


In my efforts to make the sketch more interactive, I have also been attempting to utilize the SoundClassification model from the ml5 library. I tried working with the “SpeechCommands18w” model and my own custom pre-trained speech commands. However, neither model I have tried is accurate; I have had to repeat the same command numerous times because the model fails to recognize it. I am exploring alternative solutions and ways to potentially improve the model’s accuracy.
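For reference, the wiring follows the standard ml5 soundClassifier pattern, roughly like this (a sketch, not my exact code):

// Standard ml5 soundClassifier usage with the SpeechCommands18w model.
let classifier;
let lastLabel = "waiting...";

function preload() {
  classifier = ml5.soundClassifier("SpeechCommands18w", { probabilityThreshold: 0.7 });
}

function setup() {
  createCanvas(400, 200);
  // classify() keeps listening and calls gotResult on every detection
  classifier.classify(gotResult);
}

function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  lastLabel = results[0].label; // highest-confidence command, e.g. "up", "down"
}

function draw() {
  background(0);
  fill(255);
  textSize(24);
  text(lastLabel, 20, 100);
}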

Although I am still working on the core of my project, I have begun designing the landing page and other sub-interfaces of my program. Below are sketches for some of the pages.


Summary

The progress I’ve made so far involves shifting my initial plan from creating a game to crafting a digital art piece. This change came after attending a class that provided valuable insights, particularly in selecting a machine learning model. I’ve been working with the handpose model, addressing issues like mirroring points and exploring the use of bounding box data for sound control.

I’m also brainstorming ways to utilize landmarks and handInViewConfidence from the model to create the art piece, testing various approaches to mapping data to colors. Additionally, I’ve been experimenting with the SoundClassification model from the ml5 library, though I’ve encountered accuracy challenges.

While the core of my project is still in progress, I’ve started designing the program’s landing page and sub-interfaces. Overall, I’ve made progress in refining my project idea and addressing technical aspects while exploring creative possibilities.

Below is a screenshot of the rough work I’ve been doing.

Midterm Project Progress

Concept and Design

For the midterm project, I’d like to design a Peking Opera experience. In this experience, the user will be able to look at the traditional setting of a Peking Opera theater, listen to a clip of a famous Peking Opera, and interact with some of the objects within the scene. This idea comes from the cafe experience the professor showed us in class, and I decided to make a similar experience that is practical but also related to my own culture.

The general style is cartoonish, and the general interaction will be: 1. on the start page, the user clicks somewhere on the canvas to enter the theater; 2. within the theater, the user can click on several objects, and the clicked object zooms in so the user can have a closer look at it.

The most frightening part

I think the most frightening part of this project is sketching the entire setting. I was worried about how I could build the desired setting for a traditional Peking Opera theater and its characters. Technically, I was also concerned about using the transformation functions of p5.js.

To tackle the setting of the theater, I first looked up some pictures online to decide the color and design of a Peking Opera theater. And I found something like this:

Peking opera stage hi-res stock photography and images - Alamy

In this picture, I identified some theme colors, including Chinese red, yellow, and brown, and some patterns such as the Chinese dragon and phoenix. Therefore, I decided to make a simplified version using these colors and patterns, with a character downloaded online that has a similar appearance.

For the technical difficulty, I looked at some online tutorials on how transformation, specifically scale(), works and started experimenting with some simple images and shapes to understand the function. With scale(), I am able to make the character turn around when she’s moving backward.
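A minimal demo of that scale() trick, using a placeholder character.png, could look like this: a negative x-scale mirrors whatever is drawn after it, so the same image can face either way.

// Flip a character image with scale(-1, 1).
let characterImg;
let facingLeft = false;

function preload() {
  characterImg = loadImage("character.png"); // placeholder file name
}

function setup() {
  createCanvas(400, 300);
  imageMode(CENTER);
}

function draw() {
  background(230);
  facingLeft = mouseX < width / 2; // flip based on which side the mouse is on
  push();
  translate(width / 2, height / 2);
  if (facingLeft) scale(-1, 1); // mirror horizontally
  image(characterImg, 0, 0);
  pop();
}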

Next steps

The next steps will be adding the start page, instructions for the user, and the ability to return to the previous scene, as well as refining the entire setting.

Midterm Progress

Concept

Initially, I had planned on making a project with gravity manipulation as a core mechanic. However, I did not particularly like the ideas that I came up with. One particular idea was to create a voxel-based game where players could create objects that would fall to the ground. Upon impact, the land would be destroyed based on the momentum of the impact. However, this proved to be difficult. I might attempt it for a future project, but the idea I settled on has its roots in this idea of a voxel-based landscape, where a voxel is a 3-dimensional pixel.

My idea was to implement a landscape constructed with voxels that the player could play around with. Additionally, I wanted to give players the ability to change the view from 3D to 2D and vice versa. What I have so far is the project below:

I really enjoy pixel art, which is why I wanted my landscape to be pixel-based instead of being a continuous plane. Some of my previous projects have had the same style, so I wanted to stick to something that I knew design-wise.

I particularly like the way I transition from the 2-D view to the 3-D view. The 2-D plane rotating as it grows and morphs into a 3-D landscape gives a sleek look to the experience.
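A toy version of that morph (not my actual project code) could interpolate both the voxel heights and the tilt of the plane with a single parameter t:

// Toy 2-D to 3-D morph: a grid of voxels whose heights and tilt
// interpolate as t goes from 0 (flat) to 1 (full landscape).
let t = 0; // 0 = flat 2-D view, 1 = full 3-D landscape

function setup() {
  createCanvas(400, 400, WEBGL);
  noStroke();
}

function draw() {
  background(20);
  t = constrain(t + (mouseIsPressed ? 0.01 : -0.01), 0, 1); // hold the mouse to morph
  rotateX(lerp(0, PI / 3, t)); // tilt the plane as the landscape rises
  const size = 20;
  for (let x = -10; x < 10; x++) {
    for (let y = -10; y < 10; y++) {
      const h = lerp(1, noise(x * 0.15, y * 0.15) * 80, t); // voxel height
      push();
      fill(80, 160 + h, 80);
      translate(x * size, y * size, h / 2);
      box(size, size, h);
      pop();
    }
  }
}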