Week 5 – Reading Response

What are some of the ways that computer vision differs from human vision?

Previously, I always linked computer vision with machine learning. I assumed machine learning was used to identify the different objects in a given video and to understand the movements and interactions within it. After reading this article, though, I feel I’ve gained a much clearer understanding of how computer vision actually works, as well as of the limitations of the technology available. While both computers and humans can identify where a person is in a video and track their movements, humans are usually also able to predict their next movements. Humans are familiar with how people interact with objects, while computers depend on data, which can miss anomalous cases or outliers. An example that may seem a bit far-fetched: someone who has only four fingers. Human vision can comprehend that immediately, while I assume computer vision may not be able to tell that something is missing from the image, since it is only programmed to work with the norm.

In terms of computer vision’s capacity for tracking and surveillance and its effect on its uses in interactive art, I think one of the examples from the article, Suicide Box, combines those two ideas nicely. The tracking and surveillance aspects of computer vision were used to create an art piece (of sorts) about suicide and to emphasize irregularities in data. An issue that immediately comes up for me with computer vision is privacy. A tool once so heavily used for tracking and surveillance, now repurposed for interactive art, may make viewers suspicious. Viewers may be paranoid that these art pieces are collecting data about them; however, I’m not sure how common this concern is, considering most art pieces we’ve looked at that use computer vision have been well received.

Week 5 – Reading Reflection

It was interesting to learn how computers actually see, and what stood out for me was the variety of methods a computer can employ to see and then make decisions or create art. The choice of computer vision technique adds complexity to interactive works and alters how one can interact with them. The right technique must also be selected to minimize errors and ensure consistency in the art, as some techniques are known not to perform well in certain conditions.

One possible application of this is that an interactive artwork involving computer vision can be placed strategically in an arts exhibition to accentuate or improve the vision of the work. Carefully selected pieces of art can be placed around the work to generate the needed contrast, brightness, or effects for the computer vision, just like how the white Foamcore was used for the LimboTime game.

The use of surveillance to generate art was also something worth taking a look at. Are there any privacy restrictions or laws protecting the identities of the people in these forms of art, and how are their privacies protected? The work Suicide Box by the Bureau of Inverse Technology makes me question whether artists actually have the right to use data or information like this to create a piece of work. It gives me the impression that they are making entertainment out of tragedy. I am also left with the question: how do they respect the dignity of those who jumped off the bridge?

Week 5 – Reading Response | COMPUTER VISION FOR ARTISTS AND DESIGNERS

When I think of Computer Vision, the first thing that comes to my head is this coder called the Poet Engineer on social media who uses computer vision to create the most insane visuals purely from the camera capturing their hand movements. They have the coolest programs ever. I also love it when artists make videos of them creating cool things with their hands purely through code, and one of my favourite examples of using code to create art is Imogen Heap’s MiMu gloves. And, also, the monkey meme face recognizer I keep seeing everywhere (photo attached). It still baffles me that we can use our hands and our expressions to control things on a device that usually interacts with touch! So, this reading was one of my favourite readings so far, because it discussed one of the main concepts that hooked me into interactive media in the first place. 

From what I understood of the text, the primary difference between computer and human vision is that while a human observer can understand symbols, people or environmental context like whether it’s day or night, a computer (unless programmed otherwise) perceives video simply as pixels. Computer vision uses algorithms now to make assertions about raw pixels, and even then, designers need to optimize the physical environment to make it “legible” to the software, such as using backlighting to create silhouettes or using high-contrast and retroreflective materials. Despite these limitations, is it still not insane that we’ve evolved so much that we can make computers identify specific things now, despite it being a computer? The fact that now computers can have hardware that goes beyond our own capabilities, such as infrared illumination, polarizing filters and more is almost scary to think about. I’d also say that computer vision is much more objective than human vision. Is it possible for computers to suffer from inattentional blindness as much as we do? For example, when we enter a room and fail to see something and then we come back and the object is right there and it never moved, is a computer capable of the same thing?

I liked that this reading laid out the different techniques used in computer vision, because when I first learned about CV, I was overwhelmed by the number of things it could sense. I understood these techniques (and I’m listing them down so I can refer to them later as well):

  1. Frame Differencing / Detecting Motion: Detects motion by comparing each pixel in a video frame to the corresponding pixel in the next frame.
  2. Background Subtraction / Detecting Presence: Detects the presence of objects by comparing the current video frame to a stored image of an empty background.
  3. Brightness Thresholding: Isolates objects based on luminosity, by comparing brightness to a set threshold. (I did an ascii project a few years ago, where it would capture your image, figure out the contrast and brightness and then replicate the live video input as letters, numbers and symbols. I would like to replicate that project with this concept now!)
  4. Simple Object Tracking: Program computer to find the brightest or darkest pixel in a frame to track a single point. 
  5. Feature Recognition: Once an object is located, the computer can compute specific characteristics like area or center of mass (this is CRAZY). 
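To make the first technique concrete, here is a minimal frame-differencing sketch over plain grayscale pixel arrays; the frames and threshold below are toy values I made up, not real camera data:

```javascript
// Frame differencing: compare each pixel in one grayscale frame to the
// corresponding pixel in the next frame, and count how many changed
// by more than a threshold. A large count suggests motion.
function frameDifference(prevFrame, currFrame, threshold) {
  let changedPixels = 0;
  for (let i = 0; i < currFrame.length; i++) {
    // Count a pixel as "moved" only if its brightness changed enough,
    // which filters out small sensor noise.
    if (Math.abs(currFrame[i] - prevFrame[i]) > threshold) {
      changedPixels++;
    }
  }
  return changedPixels;
}

// Two tiny 2x2 "frames": only the last pixel changes noticeably.
const prev = [10, 10, 10, 10];
const curr = [12, 10, 10, 200];
console.log(frameDifference(prev, curr, 30)); // → 1
```

In a real p5.js sketch the arrays would come from the webcam’s pixel buffer, but the comparison itself is exactly this loop.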

There are definitely more techniques that are out there, but I’ll start off with the basics, since I’m a complete beginner at this. I did want to try using feature recognition paired with simple object tracking, something I noticed is used in hand tracking (and the monkey video. LOL).

I mentioned the objectivity of CV earlier, but what happens if the datasets that they are trained on are biased? What if the creator behind the program has their own biases that they implement into the program? I like how Sorting Daemon (2003) mentioned looking at the social and racial environment, because I was wondering about situations where CV could be programmed to unintentionally (or intentionally) discriminate against certain traits such as race, gender, or disabilities. Surveillance is a scary concept to me too, because what happens to the question of consent?  While computer vision could be used to reveal hidden data in environments that are often overlooked, create programs that can help people without the need for a human to be present (e.g. Cheese), and so many other cool things, it could also be used in a negative way. I need to make sure to find a way that any programs I create with CV are inclusive and not used for ill intent.

Midterm Progress Report

Concept:

Throughout the assignments, I really fell in love with Assignment 3, where I made a mesmerizing colorful display. Even while developing that piece, I saw that there was more to be made, and playing around with some of the variables inspired me to make it the core focus of my midterm project. If time allows, I really want to create a magnificent interactive display, one that will connect closely with the viewer.

The main concept is customization of the colored canvas. I plan to add options so that the user can interact with key parts of the project, such as sliders for the direction of the balls on screen (in both the X and Y directions). There will also be an option for the user to change the RGB values in order to get the desired color they wish. The main thing I want to incorporate, though, is the text from Assignment 4, surrounded by the colorful balls. I could also have the mouse interrupt the flow of the balls, similar to how the mouse interrupts the text in Assignment 4.

Design

The design process mainly extends and adds more features to the colorful concoction project. First, there’s going to be an intro screen in which the user is guided through what exactly the project is and given an overview of what’s to come. There will also be instructions for how the user can interact further with the project.

Then, when the user is ready, it will switch to the generative artwork. There are going to be sliders, or possibly text boxes, where the user enters a value that changes something in the artwork. This includes the range of colors, the direction and speed of the balls, and a text box so custom text can be displayed on screen. Finally, there will be a button so the user can take a picture of their final artwork.

Challenging Aspects:

I think the biggest challenge is implementing the text and getting it to act as a blockade for the balls so that they surround it. In a sense, the balls need to recognise the letters as a wall, so that they not only surround the text but also bounce off it when they collide. It’ll be a case of playing around with direction vectors.

Another challenging aspect is the sliders, as I do not have any experience making sliders that dynamically change different parts of an artwork.

Mitigating Risk:

In terms of implementing the text, I plan to experiment and see how it is affected by other objects. As a starting place, I could take the code I used to keep the balls from going outside the walls and try to apply it to the letters. From there, I can manipulate the variables to get the desired effect.
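One possible starting point for the "letters as walls" idea is a circle-rectangle overlap test against each letter's bounding box. This is a hypothetical sketch in plain JavaScript, not the assignment's actual code; the ball and box values are made up:

```javascript
// Treat a letter's bounding box as a wall: if the ball overlaps it,
// flip the ball's velocity along the axis of deepest overlap.
// ball = {x, y, r, vx, vy}; box = {x, y, w, h}
function bounceOffBox(ball, box) {
  // Find the closest point on the box to the ball's center.
  const cx = Math.max(box.x, Math.min(ball.x, box.x + box.w));
  const cy = Math.max(box.y, Math.min(ball.y, box.y + box.h));
  const dx = ball.x - cx;
  const dy = ball.y - cy;
  // Overlap when the center-to-closest-point distance is under the radius.
  if (dx * dx + dy * dy < ball.r * ball.r) {
    if (Math.abs(dx) > Math.abs(dy)) ball.vx *= -1;
    else ball.vy *= -1;
    return true; // collision happened
  }
  return false;
}

// A ball approaching a letter box from the left:
const ball = { x: 48, y: 20, r: 5, vx: 2, vy: 0 };
const letterBox = { x: 50, y: 0, w: 30, h: 40 };
bounceOffBox(ball, letterBox); // ball.vx becomes -2
```

Bounding boxes are a rough approximation of letter shapes, but they are an easy first step before testing against the actual outlines.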

For the sliders, I will read up on how they’re implemented. Most likely our friends at the Coding Train have made a video about how to use sliders, so that will be a great starting point. From there, I can extend them so the sliders can manipulate variables such as the color or direction of the balls.
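For reference, p5.js creates a slider with createSlider(min, max, start) and provides a built-in map() for rescaling its value into a parameter's range. The rescaling step can be sketched in plain JavaScript; the slider and speed ranges below are made-up examples:

```javascript
// Linear remap, equivalent to p5.js map(value, a, b, c, d):
// rescales value from the range [a, b] into the range [c, d].
function remap(value, a, b, c, d) {
  return c + ((value - a) / (b - a)) * (d - c);
}

// E.g. a slider returning 0..255 driving a ball speed of 0..10:
const sliderValue = 51;
const ballSpeed = remap(sliderValue, 0, 255, 0, 10);
console.log(ballSpeed); // → 2
```

In a sketch, this call would sit in draw(), reading the slider's current value every frame so the artwork updates live.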

 

Week 5 – Reading Reflection

It’s easy to forget that computers don’t actually see anything. When we look at a video feed, we instantly recognize a person walking across a room. A computer just registers a grid of numbers where pixel values shift over time. Because of this, computer vision is incredibly fragile. Every tracking algorithm relies on strict assumptions about the real world. If the lighting in a room changes, a tracking algorithm might completely break. The computer doesn’t see the “general” picture with context, since it only knows the math it was programmed to look for.

Basic Tracking Techniques

To work around this blindness, developers use a handful of techniques to track and react to the things they are interested in.

    • Frame differencing: comparing the current video frame to the previous one. If the pixels changed, the software assumes motion happened in that exact spot.

    • Background subtraction: memorizing an image of an empty room. When a person walks in, it subtracts the “empty” image from the live feed to isolate whatever is new.

    • Brightness thresholding: tracking a glowing object in a dark room by telling the software to ignore everything except the brightest pixels.

    • Simple object tracking: This involves looking at the color or pixel arrangement of a specific object and looking for those same values as they move across the screen.
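A minimal illustration of the simple-object-tracking idea is finding the brightest pixel in a frame; the frame values below are toy numbers, not real camera data:

```javascript
// Simple object tracking via the brightest pixel: scan a grayscale
// frame (flattened row by row, width pixels per row) and return the
// coordinates of the brightest value.
function brightestPixel(gray, width) {
  let best = 0;
  for (let i = 1; i < gray.length; i++) {
    if (gray[i] > gray[best]) best = i;
  }
  // Convert the flat index back to (x, y) coordinates.
  return { x: best % width, y: Math.floor(best / width) };
}

// A 3x2 frame; the brightest value (250) sits at column 2, row 1.
console.log(brightestPixel([10, 20, 30, 40, 50, 250], 3)); // → { x: 2, y: 1 }
```

Running this every frame on a dark scene with one glowing object is enough to track that object's position over time.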

Surveillance in Art

I find it very interesting that people use technology made for surveillance and the military to create art. Using technology built for control to create art is truly impressive: it flips the understanding of this technology, or even makes it double-sided. The interactivity that comes with such tracking technology is hugely varied, and sometimes feels magical and extremely emotional, yet it comes from the computer tracking, analyzing, and reacting to every move of the person in front of it. Such art turns the invisible, unsettling surveillance we experience every day into a work of art that makes it extremely present.

Honestly, this military baggage explains a lot of computer vision’s blind spots. If you’re designing a system just to monitor crowds or track moving targets, you don’t need it to understand the whole scene and all details. You just need fast analysis of tiny differences, like a shift in pixels.

However, I feel that in interactive media details are very important, and that art runs on them. So, while computer vision has not yet reached the point where it can analyze everything at once, artists have to come up with algorithms that try to do it instead.

Reading Reflection Week 5: The visionary difference between a Computer and a Human

I found it quite interesting to see how computer vision actually differs from human vision. Initially, I assumed that computer vision, being chock full of the knowledge we provide from the AI side, would be able to at least analyze what an image is. However, I was surprised to find out that computers only really see grids of pixels and are fully reliant on mathematical algorithms to get a cleaner picture of what is on screen. Whereas we humans can distinguish an object from a background under different lighting, computers have a hard time telling that a shadow is just passing across a room.

With regards to the use of tracking and surveillance, I would say it honestly opens up a world of possibilities for using body tracking as a controller for games and interactive media artworks. The coolest one I’ve personally seen so far is Just Dance. It utilizes a camera for motion tracking so that it’s able to give an accurate assessment of whether your dance moves match the computer’s example. Its main concept isn’t just a gimmick, but the crux of the game’s functionality. And it’s the implementation, where you get an accurate assessment of whether you follow the dance moves and instant feedback through sound effects, that is very useful. With regards to interactive media, this will allow people to interact with our art in a deeper way, so that they can genuinely feel immersed in the art in question.

Week 4 – Generative Text

For this assignment I created a kinematic typography sketch using the word “MADINA.” I wanted the word to feel like it is in motion. My main inspiration was Patt Vira’s kinetic typography work, where letters shift in rhythm. I liked how those examples use simple motion to give a word a stronger presence, so I focused on one word and explored movement across time.

I used p5.js together with opentype.js and geomerative. First I loaded the font “BebasNeue-Regular.ttf” and converted the word “MA D I NA” into a vector path. Then I resampled the outlines into many points. In draw, I repeated those points multiple times in vertical layers. I applied a sine function to the x position and a gradual offset to the y position, so each layer moves like a wave. I kept the color palette minimal with a dark blue background, white strokes, and semi transparent blue fills. Patt Vira’s kinetic typography guided my decisions about rhythm and repetition.

I wrote the sketch in p5.js, using geomerative to work with vector text. In setup, I created the canvas, set angle mode to degrees, and loaded the font file “BebasNeue-Regular.ttf” with opentype.load. After the font loaded, I called font.getPath on the string “MA D I NA” with a large font size, then wrapped the commands in a geomerative Path object. I resampled this path by length so the letters turned into a dense list of points. I looped through the commands and, whenever I encountered a move command “M,” I started a new sub array in points. For each drawing command that was not “Z,” I pushed the x and y coordinates into the current sub array as p5 vectors.

In draw, I cleared the background to a dark blue color, set stroke weight and stroke color, and translated the origin so the word appears centered on the canvas. I used a nested loop. The outer loop moves through the number of layers, from num down to zero. The inner loop moves through each group of points for each letter. For some letter indices I used noFill to keep only outlines, and for others I used a semi transparent blue fill. Inside beginShape and endShape, I looped over the points and applied a sine based offset to the x coordinate with r * sin(angle + k * 20), and a vertical offset of k * 10 to the y coordinate. This creates layered copies of the word that shift in x and y as angle increases. At the end of draw, I incremented angle by 3 so the sine function changes over time and the typography keeps moving.

let font;
let msg = "MA D I NA"; let fontSize = 200; 
let fontPath; let path; let points = [];

let num = 20; let r = 30; let angle = 0;

function setup() {
  createCanvas(700, 400);
  angleMode(DEGREES);
  opentype.load("BebasNeue-Regular.ttf", function(err, f){
    if (err) {
      console.log(err);
      return; // bail out: the font never loaded, so there is nothing to trace
    }
    font = f;

    // Convert the text into a vector path, then resample it into dense points
    fontPath = font.getPath(msg, 0, 0, fontSize);
    path = new g.Path(fontPath.commands);
    path = g.resampleByLength(path, 1);

    for (let i = 0; i < path.commands.length; i++) {
      // "M" (move) starts a new letter contour
      if (path.commands[i].type == "M") {
        points.push([]);
      }
      // Keep every point except "Z" (close path), which has no coordinates
      if (path.commands[i].type != "Z") {
        points[points.length - 1].push(createVector(path.commands[i].x, path.commands[i].y));
      }
    }
  });
  
}

function draw() {
  background(0, 0, 139);
  strokeWeight(3);
  stroke(255);
  translate(40, 170);

  for (let k = num; k > 0; k--) {
    for (let i = 0; i < points.length; i++) {
      // Letter groups 1 and 3 stay as outlines; the rest get a translucent fill
      if (i == 1 || i == 3) {
        noFill();
      } else {
        fill(0, 0, 255, 100);
      }
      beginShape();
      for (let j = 0; j < points[i].length; j++) {
        // Sine-based horizontal wave plus a vertical offset per layer
        vertex(points[i][j].x + r * sin(angle + k * 20), points[i][j].y + k * 10);
      }
      endShape(CLOSE);
    }
  }
  angle += 3;
}

 

Week 4 – Reading Reflection

One thing that always confuses me is the variety of modes on some household items. When using an iron, I see that spinning the circle increases the steam production, and for people who have no idea which level is needed for which clothes, they write the names of the materials on the same circle respectively. What drives me mad is that washing machines and dryers are NEVER intuitive. What’s the difference between Cupboard Dry and Cupboard Dry+ if they take the same time and operate at the same temperature? What is the difference between Gentle and Hygiene, and why is the time difference there 3 hours? And to actually figure out the difference, you have to find the name of the machine (which will never match its actual name), look it up in some 2008 PDF file on the very last Google page, and it still won’t answer the question. I always use Mixed washing and Cupboard Dry just because it works, and I have no idea how the other regimes work. And as Norman says, it’s not me being stupid, but the design allowing for these mistakes.

“The same technology that simplifies life by providing more functions in each device also complicates life by making the device harder to learn, harder to use”

I think my example perfectly supports this idea: the bad design of all these items, with no signifiers, no clear affordances, and no clear conceptual model formed either through life experience or through using the item, just creates more confusion and makes the user always choose one method instead of the huge variety of (probably) useful and functional ones.

I think one way to fix it is to provide some sort of manual, even a tiny table on the edge of the machine would help so much to at least understand which method does what and what the difference between them is. Another way is to display something on the small screen that almost every machine has, like all the characteristics and statistics that are unique to each method, or some short warnings/instructions. Another way to solve this problem is to at least make small illustrations near each method that actually depict what the method does. Genuinely, it would help unleash the potential of these machines and help people use them.

Talking about interactive media, I think the principles Norman talks about are really applicable and foundational.

Sometimes great art pieces with very interesting and complex interactions can be overlooked just because people can’t figure out how to interact with them. I believe that it is very important to design the piece in a very intuitive or guiding way, a way that encourages the user to make the interaction that the author created. As Norman says, humans are really predictable, and in this way, some silent guiding design (not notes, not manuals, but the design itself) should trigger the interaction that is meant to be done in order to experience the art.

Week 4 – Reading Response

Reading Norman’s chapter made me realize how often I get frustrated with specific designs, especially ones that lack efficiency in everyday objects. Norman emphasizes that good design should communicate clearly, prevent errors, and provide feedback. I see this principle in some interactive media, where the design makes it easy to use without much explanation—anyone can figure it out quickly. When something is designed well, you don’t even notice it because everything feels natural and intuitive. Unlike the examples the author mentioned, such as the sink that requires pushing down on it or the door that needs a sign to explain that it is a sliding door, good design should not require instructions. If a user has to stop and think about how to use something basic, then the design has already failed.

Something that drives me crazy is the access doors on campus. I walk around carrying two access cards—one specifically for my suite and room, and another for the rest of the campus. It feels unnecessary and inefficient. On top of that, the glass doors are extremely heavy, and the sensors do not work most of the time. Instead of making entry smooth and accessible, the design creates frustration. According to Norman’s ideas, better mapping, clearer feedback, and fewer constraints could significantly improve this experience.

Week 4 – Global Mood (Data Visualization)

Concept:
My concept is based on showing the current global mood and the world’s current situation. Whenever I would google “news,” most of what came up would evoke a negative emotion in me. So, I decided to visualize the news and categorize it into a few different emotions or feelings.

How I created the code:

I used Guardian and NYT API keys to get access to live articles, although there are some restrictions, like limits on page requests. Therefore, I added some delay in order to access a larger number of pages and news article headlines. I also used world.json for the country borders.

I then created different arrays: one for the emotional bubbles, one for the country borders, one for the CNN breaking news ticker, and one for tracking articles so they are not shown twice. I also added a timer that updates every 60 seconds and adjusted the speed and position of the news ticker.

Then I added geographical points for a list of countries. I created bubbles for different emotions, with each emotion represented by a color. There is also a map key showing which color represents which emotion. The bubbles have visual effects like glowing and shrinking over time to make the map feel dynamic. Emotions are detected using keywords in article titles to classify sadness, anger, hope, or joy.
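The keyword-based classification can be sketched like this; the keyword lists below are placeholders I made up to illustrate the idea, not the project's actual lists:

```javascript
// Hypothetical keyword classifier mirroring the approach described above:
// the first emotion whose keyword appears in the headline wins.
const EMOTION_KEYWORDS = {
  sadness: ["dies", "mourning", "tragedy"],
  anger: ["protest", "outrage", "clash"],
  hope: ["recovery", "breakthrough", "peace"],
  joy: ["celebrates", "wins", "festival"],
};

function classifyHeadline(title) {
  const lower = title.toLowerCase();
  for (const [emotion, words] of Object.entries(EMOTION_KEYWORDS)) {
    if (words.some((w) => lower.includes(w))) return emotion;
  }
  return "neutral"; // no keyword matched
}

console.log(classifyHeadline("City celebrates marathon record")); // → "joy"
console.log(classifyHeadline("Markets steady ahead of report")); // → "neutral"
```

A simple substring match like this is fast but crude, which is exactly why NLP-based classification is listed as a future improvement.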

It initially gets the last 48 hours of news, then it is updated with live breaking news. I also added fallbacks: if the world map fails to load, a simple grid is shown, and if the API fails, a CORS proxy is used to make sure the news still comes through.

The code:
// Fetch 48 hours of historical news from The Guardian
function fetchHistoricalNews() {
  let twoDaysAgo = new Date();
  twoDaysAgo.setDate(twoDaysAgo.getDate() - 2);
  let fromDate = twoDaysAgo.toISOString().split("T")[0]; // Format: YYYY-MM-DD
  console.log("📅 Fetching Guardian news from " + fromDate + " to today...");

  let totalArticles = [];
  let pagesToFetch = 10; // Get 10 pages of results
  let pagesLoaded = 0;
  let failedPages = 0;

  // Fetch pages sequentially with delay to avoid rate limiting
  for (let pageNumber = 1; pageNumber <= pagesToFetch; pageNumber++) {
    setTimeout(() => {
      let apiURL =
        "https://content.guardianapis.com/search?section=world&show-tags=keyword&from-date=" +
        fromDate +
        "&page-size=30&page=" +
        pageNumber +
        "&show-fields=webPublicationDate&api-key=" +
        GUARDIAN_API_KEY;
      console.log("🔄 Requesting Guardian page " + pageNumber + "...");
      fetch(apiURL)
        .then((response) => {
          console.log("📡 Guardian page " + pageNumber + " response status: " + response.status);
          if (!response.ok) throw new Error("HTTP " + response.status);
          return response.json();
        })
        .then((data) => {
          if (data && data.response && data.response.results) {
            totalArticles = totalArticles.concat(data.response.results);
            pagesLoaded++;
            console.log("✅ Page " + pageNumber + " loaded: " + data.response.results.length + " articles");
            if (pagesLoaded + failedPages === pagesToFetch) {
              if (totalArticles.length > 0) {
                console.log("📊 Total Guardian historical: " + totalArticles.length + " (" + pagesLoaded + "/" + pagesToFetch + " pages successful)");
                isShowingHistorical = true;
                sourceStatus.guardian.active = true;
                sourceStatus.guardian.articleCount = totalArticles.length;
                processArticles(totalArticles, true, "guardian"); // true = historical
              } else {
                console.error("❌ All Guardian pages failed");
                sourceStatus.guardian.active = false;
              }
            }
          } else {
            console.warn("⚠️ Guardian page " + pageNumber + " returned empty results");
            failedPages++;
          }
        })
        .catch((error) => {
          console.error("❌ Guardian page " + pageNumber + " failed:", error.message);
          failedPages++;
          if (pagesLoaded + failedPages === pagesToFetch) {
            if (totalArticles.length > 0) {
              console.log("📊 Total Guardian historical: " + totalArticles.length + " (" + pagesLoaded + "/" + pagesToFetch + " pages successful)");
              isShowingHistorical = true;
              sourceStatus.guardian.active = true;
              sourceStatus.guardian.articleCount = totalArticles.length;
              processArticles(totalArticles, true, "guardian");
            } else {
              console.error("❌ All Guardian pages failed");
              sourceStatus.guardian.active = false;
            }
          }
        });
    }, pageNumber * PAGE_REQUEST_DELAY); // Use delay variable
  }
}

// Fetch the latest breaking news from The Guardian
function fetchGuardianNews() {
  console.log("📰 [" + getCurrentTime() + "] Fetching Guardian news...");
  let apiURL =
    "https://content.guardianapis.com/search?section=world&show-tags=keyword&page-size=25&show-fields=webPublicationDate&api-key=" +
    GUARDIAN_API_KEY;
  fetch(apiURL)
    .then((response) => {
      if (!response.ok) throw new Error("HTTP " + response.status);
      return response.json();
    })
    .then((data) => {
      if (data && data.response && data.response.results) {
        console.log("✅ [" + getCurrentTime() + "] Guardian: " + data.response.results.length + " articles");
        sourceStatus.guardian.active = true;
        sourceStatus.guardian.lastUpdate = new Date();
        sourceStatus.guardian.articleCount = data.response.results.length;
        isShowingHistorical = false; // We're showing breaking news now
        processArticles(data.response.results, false, "guardian"); // false = breaking news
      }
    })
    .catch((error) => {
      console.log("⚠️ Guardian direct failed, trying CORS proxy...");
      tryGuardianWithProxy();
    });
}

// Backup method: Try Guardian API through CORS proxy
function tryGuardianWithProxy() {
  let apiURL =
    "https://content.guardianapis.com/search?section=world&show-tags=keyword&page-size=25&show-fields=webPublicationDate&api-key=" +
    GUARDIAN_API_KEY;
  let proxiedURL = "https://api.allorigins.win/raw?url=" + encodeURIComponent(apiURL);
  fetch(proxiedURL)
    .then((response) => {
      if (!response.ok) throw new Error("HTTP " + response.status);
      return response.json();
    })
    .then((data) => {
      if (data && data.response && data.response.results) {
        console.log("✅ [" + getCurrentTime() + "] Guardian via proxy: " + data.response.results.length + " articles");
        sourceStatus.guardian.active = true;
        sourceStatus.guardian.lastUpdate = new Date();
        sourceStatus.guardian.articleCount = data.response.results.length;
        isShowingHistorical = false;
        processArticles(data.response.results, false, "guardian");
      }
    })
    .catch((error) => {
      console.error("❌ [" + getCurrentTime() + "] Guardian completely failed:", error.message);
      sourceStatus.guardian.active = false;
    });
}

 

Reflection and ideas for future work or improvements:

Reflection:

Global Mood taught me a lot about combining live data, visualization, and emotion analysis. Seeing emotions vary across regions in real time was fascinating, and effects like glowing and shrinking bubbles made the map feel dynamic. It also taught me how to use APIs and JSON files in p5.js.

Future Work and Improvements:

I would love to present it as an installation to show people the current global situation. For future improvements, I would incorporate Natural Language Processing to classify emotions more accurately, rather than relying solely on specific keywords. I also wish I had greater access to open-source news APIs to expand the dataset.