Week 5 – Reading Response

What are some of the ways that computer vision differs from human vision?

Previously, I always linked computer vision with machine learning. I assumed machine learning was used to identify the different objects in a given video and to understand the movements and interactions within it. After reading this article, though, I feel I’ve gained a much clearer understanding of how computer vision actually works, as well as of the limitations of the technology available. While both computers and humans can identify where a person is in a video and track their movements, humans are usually also able to predict their next movements. Humans are familiar with how people interact with objects, while computers depend on data, which can miss anomalous cases or outliers. An example that may seem a bit far-fetched: someone who has only four fingers. Human vision can comprehend that immediately, while I assume computer vision may not be able to tell that something is missing from the image, since it is only programmed to work with the norm.

In terms of computer vision’s capacity for tracking and surveillance and its effect on its uses in interactive art, I think one of the examples from the article, Suicide Box, combines those two ideas nicely. The tracking and surveillance aspects of computer vision were used to create an art piece (of sorts) about suicide and to emphasize irregularities in data. An issue that immediately comes up for me with computer vision is privacy. A tool once so heavily used for tracking and surveillance, now repurposed for interactive art, may make viewers suspicious. Viewers may be paranoid that these art pieces are collecting data about them; however, I’m not sure how common this concern is, considering most art pieces we’ve looked at that use computer vision have been well received.

Week 5 – Reading Reflection

It was interesting to learn how computers actually see, and what stood out for me was the variety of methods a computer can employ to see and then make decisions or create art. The choice of computer vision technique adds complexity to interactive works and alters how one can interact with them. The right technique must also be selected to minimize errors and ensure consistency in the art, as some techniques are known not to perform well in certain conditions.

One possible application of this is that an interactive artwork involving computer vision can be placed strategically in an arts exhibition to accentuate or improve the vision of the work. Carefully selected pieces of art can be placed around the work to generate the needed contrast, brightness, or effects for the computer vision, just like how the white Foamcore was used for the LimboTime game.

The use of surveillance to generate art was also something worth taking a look at. Are there any privacy restrictions or laws protecting the identities of the people in these forms of art, and how are their privacies protected? The work Suicide Box by the Bureau of Inverse Technology makes me question whether artists actually have the right to use data or information like this to create a piece of work. It gives me the impression that they are making entertainment out of tragedy. I am also left with the question: how do they respect the dignity of those who jumped off the bridge?

Week 5 – Reading Response | COMPUTER VISION FOR ARTISTS AND DESIGNERS

When I think of Computer Vision, the first thing that comes to my head is this coder called the Poet Engineer on social media who uses computer vision to create the most insane visuals purely from the camera capturing their hand movements. They have the coolest programs ever. I also love it when artists make videos of them creating cool things with their hands purely through code, and one of my favourite examples of using code to create art is Imogen Heap’s MiMu gloves. And, also, the monkey meme face recognizer I keep seeing everywhere (photo attached). It still baffles me that we can use our hands and our expressions to control things on a device that usually interacts with touch! So, this reading was one of my favourite readings so far, because it discussed one of the main concepts that hooked me into interactive media in the first place. 

From what I understood of the text, the primary difference between computer and human vision is that while a human observer can understand symbols, people or environmental context like whether it’s day or night, a computer (unless programmed otherwise) perceives video simply as pixels. Computer vision uses algorithms now to make assertions about raw pixels, and even then, designers need to optimize the physical environment to make it “legible” to the software, such as using backlighting to create silhouettes or using high-contrast and retroreflective materials. Despite these limitations, is it still not insane that we’ve evolved so much that we can make computers identify specific things now, despite it being a computer? The fact that now computers can have hardware that goes beyond our own capabilities, such as infrared illumination, polarizing filters and more is almost scary to think about. I’d also say that computer vision is much more objective than human vision. Is it possible for computers to suffer from inattentional blindness as much as we do? For example, when we enter a room and fail to see something and then we come back and the object is right there and it never moved, is a computer capable of the same thing?

I liked that this reading laid out the different techniques used in computer vision, because when I first learned about CV, I was overwhelmed by the number of things it could sense. I understood these techniques (and I’m listing them down so I can refer to them later as well):

  1. Frame Differencing / Detecting Motion: Detects motion by comparing each pixel in a video frame to the corresponding pixel in the next frame.
  2. Background Subtraction / Detecting Presence: Detects the presence of objects by comparing the current video frame to a stored image of an empty background.
  3. Brightness Thresholding: Isolates objects based on luminosity, by comparing brightness to a set threshold. (I did an ascii project a few years ago, where it would capture your image, figure out the contrast and brightness and then replicate the live video input as letters, numbers and symbols. I would like to replicate that project with this concept now!)
  4. Simple Object Tracking: Program computer to find the brightest or darkest pixel in a frame to track a single point. 
  5. Feature Recognition: Once an object is located, the computer can compute specific characteristics like area or center of mass (this is CRAZY). 
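To make the first technique concrete, here is a minimal frame-differencing sketch over plain grayscale pixel arrays; the frames and threshold below are toy values I made up, not real camera data:

```javascript
// Frame differencing: compare each pixel in one grayscale frame to the
// corresponding pixel in the next frame, and count how many changed
// by more than a threshold. A large count suggests motion.
function frameDifference(prevFrame, currFrame, threshold) {
  let changedPixels = 0;
  for (let i = 0; i < currFrame.length; i++) {
    // Count a pixel as "moved" only if its brightness changed enough,
    // which filters out small sensor noise.
    if (Math.abs(currFrame[i] - prevFrame[i]) > threshold) {
      changedPixels++;
    }
  }
  return changedPixels;
}

// Two tiny 2x2 "frames": only the last pixel changes noticeably.
const prev = [10, 10, 10, 10];
const curr = [12, 10, 10, 200];
console.log(frameDifference(prev, curr, 30)); // → 1
```

In a real p5.js sketch the arrays would come from the webcam’s pixel buffer, but the comparison itself is exactly this loop.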

There are definitely more techniques that are out there, but I’ll start off with the basics, since I’m a complete beginner at this. I did want to try using feature recognition paired with simple object tracking, something I noticed is used in hand tracking (and the monkey video. LOL).

I mentioned the objectivity of CV earlier, but what happens if the datasets that they are trained on are biased? What if the creator behind the program has their own biases that they implement into the program? I like how Sorting Daemon (2003) mentioned looking at the social and racial environment, because I was wondering about situations where CV could be programmed to unintentionally (or intentionally) discriminate against certain traits such as race, gender, or disabilities. Surveillance is a scary concept to me too, because what happens to the question of consent?  While computer vision could be used to reveal hidden data in environments that are often overlooked, create programs that can help people without the need for a human to be present (e.g. Cheese), and so many other cool things, it could also be used in a negative way. I need to make sure to find a way that any programs I create with CV are inclusive and not used for ill intent.

Midterm Progress Report

Concept:

Throughout the assignments, I really fell in love with Assignment 3, where I made a mesmerizing colorful display. Even while developing that piece, I saw that there was more to be made, and playing around with some of the variables inspired me to make it the core focus of my midterm project. If time allows, I really want to create a magnificent interactive display, one that will connect closely with the viewer.

The main concept is customization of the colored canvas. I plan to add options so that the user can interact with key parts of the project, such as sliders for the direction of the balls on screen (in both the X and Y directions). There will also be an option for the user to change the RGB values in order to get the desired color they wish. The main thing I want to incorporate, though, is the text from Assignment 4, surrounded by the colorful balls. I could also have the mouse interrupt the flow of the balls, similar to how the mouse interrupts the text in Assignment 4.

Design

The design process mainly extends and adds more features to the colorful concoction project. First, there’s going to be an intro screen in which the user is guided through what exactly the project is and given an overview of what’s to come. There will also be instructions for how the user can interact further with the project.

Then, when the user is ready, it will switch to the generative artwork. There are going to be sliders, or possibly text boxes, where the user enters a value that changes something in the artwork. This includes the range of colors, the direction and speed of the balls, and a text box so custom text can be displayed on screen. Finally, there will be a button so the user can take a picture of their final artwork.

Challenging Aspects:

I think the biggest challenge is implementing the text and getting it to act as a blockade for the balls so that they surround it. In a sense, the balls need to recognise the letters as a wall, so that they not only surround the text but also bounce off it when they collide. It’ll be a case of playing around with direction vectors.

Another challenging aspect is the sliders, as I do not have any experience making sliders that dynamically change different parts of an artwork.

Mitigating Risk:

In terms of implementing the text, I plan to experiment and see how it is affected by other objects. As a starting place, I could take the code I used to keep the balls from going outside the walls and try to apply it to the letters. From there, I can manipulate the variables to get the desired effect.
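One possible starting point for the "letters as walls" idea is a circle-rectangle overlap test against each letter's bounding box. This is a hypothetical sketch in plain JavaScript, not the assignment's actual code; the ball and box values are made up:

```javascript
// Treat a letter's bounding box as a wall: if the ball overlaps it,
// flip the ball's velocity along the axis of deepest overlap.
// ball = {x, y, r, vx, vy}; box = {x, y, w, h}
function bounceOffBox(ball, box) {
  // Find the closest point on the box to the ball's center.
  const cx = Math.max(box.x, Math.min(ball.x, box.x + box.w));
  const cy = Math.max(box.y, Math.min(ball.y, box.y + box.h));
  const dx = ball.x - cx;
  const dy = ball.y - cy;
  // Overlap when the center-to-closest-point distance is under the radius.
  if (dx * dx + dy * dy < ball.r * ball.r) {
    if (Math.abs(dx) > Math.abs(dy)) ball.vx *= -1;
    else ball.vy *= -1;
    return true; // collision happened
  }
  return false;
}

// A ball approaching a letter box from the left:
const ball = { x: 48, y: 20, r: 5, vx: 2, vy: 0 };
const letterBox = { x: 50, y: 0, w: 30, h: 40 };
bounceOffBox(ball, letterBox); // ball.vx becomes -2
```

Bounding boxes are a rough approximation of letter shapes, but they are an easy first step before testing against the actual outlines.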

For the sliders, I will read up on how they’re implemented. Most likely our friends at the Coding Train have made a video about how to use sliders, so that will be a great starting point. From there, I can extend them so the sliders can manipulate variables such as the color or direction of the balls.
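For reference, p5.js creates a slider with createSlider(min, max, start) and provides a built-in map() for rescaling its value into a parameter's range. The rescaling step can be sketched in plain JavaScript; the slider and speed ranges below are made-up examples:

```javascript
// Linear remap, equivalent to p5.js map(value, a, b, c, d):
// rescales value from the range [a, b] into the range [c, d].
function remap(value, a, b, c, d) {
  return c + ((value - a) / (b - a)) * (d - c);
}

// E.g. a slider returning 0..255 driving a ball speed of 0..10:
const sliderValue = 51;
const ballSpeed = remap(sliderValue, 0, 255, 0, 10);
console.log(ballSpeed); // → 2
```

In a sketch, this call would sit in draw(), reading the slider's current value every frame so the artwork updates live.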

 

Week 5 – Reading Reflection

It’s easy to forget that computers don’t actually see anything. When we look at a video feed, we instantly recognize a person walking across a room. A computer just registers a grid of numbers where pixel values shift over time. Because of this, computer vision is incredibly fragile. Every tracking algorithm relies on strict assumptions about the real world. If the lighting in a room changes, a tracking algorithm might completely break. The computer doesn’t see the “general” picture with context, since it only knows the math it was programmed to look for.

Basic Tracking Techniques

To work around this blindness, developers use a handful of techniques to track and react to the things they are interested in.

    • Frame differencing: comparing the current video frame to the previous one. If the pixels changed, the software assumes motion happened in that exact spot.

    • Background subtraction: memorizing an image of an empty room. When a person walks in, it subtracts the “empty” image from the live feed to isolate whatever is new.

    • Brightness thresholding: tracking a glowing object in a dark room by telling the software to ignore everything except the brightest pixels.

    • Simple object tracking: This involves looking at the color or pixel arrangement of a specific object and looking for those same values as they move across the screen.
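A minimal illustration of the simple-object-tracking idea is finding the brightest pixel in a frame; the frame values below are toy numbers, not real camera data:

```javascript
// Simple object tracking via the brightest pixel: scan a grayscale
// frame (flattened row by row, width pixels per row) and return the
// coordinates of the brightest value.
function brightestPixel(gray, width) {
  let best = 0;
  for (let i = 1; i < gray.length; i++) {
    if (gray[i] > gray[best]) best = i;
  }
  // Convert the flat index back to (x, y) coordinates.
  return { x: best % width, y: Math.floor(best / width) };
}

// A 3x2 frame; the brightest value (250) sits at column 2, row 1.
console.log(brightestPixel([10, 20, 30, 40, 50, 250], 3)); // → { x: 2, y: 1 }
```

Running this every frame on a dark scene with one glowing object is enough to track that object's position over time.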

Surveillance in Art

I find it very interesting that people use technology made for surveillance and the military to create art. Using technology built for control to create art is truly impressive: it flips the understanding of this technology, or even makes it double-sided. The interactivity that comes with such tracking technology is hugely varied, and sometimes feels magical and extremely emotional, yet it comes from the computer tracking, analyzing, and reacting to every move of the person in front of it. Such art turns the invisible, unsettling surveillance we experience every day into a work of art that makes it extremely present.

Honestly, this military baggage explains a lot of computer vision’s blind spots. If you’re designing a system just to monitor crowds or track moving targets, you don’t need it to understand the whole scene and all details. You just need fast analysis of tiny differences, like a shift in pixels.

However, I feel that in interactive media details are very important, and that art runs on them. So, while computer vision has not yet reached the point where it can analyze everything at once, artists have to come up with algorithms that try to do it instead.

Reading Reflection Week 5: The visionary difference between a Computer and a Human

I found it quite interesting to see how computer vision actually differs from human vision. Initially, I assumed that computer vision, being chock full of the knowledge we provide from the AI side, would be able to at least analyze what an image is. However, I was surprised to find out that computers only really see grids of pixels and are fully reliant on mathematical algorithms to get a cleaner picture of what is on screen. Whereas we humans can distinguish an object from a background under different lighting, computers have a hard time telling that a shadow is just passing across a room.

With regards to the use of tracking and surveillance, I would say it honestly opens up a world of possibilities for using body tracking as a controller for games and interactive media artworks. The coolest one I’ve personally seen so far is Just Dance. It utilizes a camera for motion tracking so that it’s able to give an accurate assessment of whether your dance moves match the computer’s example. Its main concept isn’t just a gimmick, but the crux of the game’s functionality. And it’s the implementation, where you get an accurate assessment of whether you follow the dance moves and instant feedback through sound effects, that is very useful. With regards to interactive media, this will allow people to interact with our art in a deeper way, so that they can genuinely feel immersed in the art in question.

Week 4 – Generative Text

For this assignment I created a kinematic typography sketch using the word “MADINA.” I wanted the word to feel like it is in motion. My main inspiration was Patt Vira’s kinetic typography work, where letters shift in rhythm. I liked how those examples use simple motion to give a word a stronger presence, so I focused on one word and explored movement across time.

I used p5.js together with opentype.js and geomerative. First I loaded the font “BebasNeue-Regular.ttf” and converted the word “MA D I NA” into a vector path. Then I resampled the outlines into many points. In draw, I repeated those points multiple times in vertical layers. I applied a sine function to the x position and a gradual offset to the y position, so each layer moves like a wave. I kept the color palette minimal with a dark blue background, white strokes, and semi transparent blue fills. Patt Vira’s kinetic typography guided my decisions about rhythm and repetition.

I wrote the sketch in p5.js, using geomerative to work with vector text. In setup, I created the canvas, set angle mode to degrees, and loaded the font file “BebasNeue-Regular.ttf” with opentype.load. After the font loaded, I called font.getPath on the string “MA D I NA” with a large font size, then wrapped the commands in a geomerative Path object. I resampled this path by length so the letters turned into a dense list of points. I looped through the commands and, whenever I encountered a move command “M,” I started a new sub array in points. For each drawing command that was not “Z,” I pushed the x and y coordinates into the current sub array as p5 vectors.

In draw, I cleared the background to a dark blue color, set stroke weight and stroke color, and translated the origin so the word appears centered on the canvas. I used a nested loop. The outer loop moves through the number of layers, from num down to zero. The inner loop moves through each group of points for each letter. For some letter indices I used noFill to keep only outlines, and for others I used a semi transparent blue fill. Inside beginShape and endShape, I looped over the points and applied a sine based offset to the x coordinate with r * sin(angle + k * 20), and a vertical offset of k * 10 to the y coordinate. This creates layered copies of the word that shift in x and y as angle increases. At the end of draw, I incremented angle by 3 so the sine function changes over time and the typography keeps moving.

let font;
let msg = "MA D I NA"; let fontSize = 200; 
let fontPath; let path; let points = [];

let num = 20; let r = 30; let angle = 0;

function setup() {
  createCanvas(700, 400);
  angleMode(DEGREES);
  opentype.load("BebasNeue-Regular.ttf", function(err, f){
    if (err) {
      console.log(err);
      return; // bail out: the font never loaded, so there is nothing to trace
    }
    font = f;

    // Convert the text into a vector path, then resample it into dense points
    fontPath = font.getPath(msg, 0, 0, fontSize);
    path = new g.Path(fontPath.commands);
    path = g.resampleByLength(path, 1);

    for (let i = 0; i < path.commands.length; i++) {
      // "M" (move) starts a new letter contour
      if (path.commands[i].type == "M") {
        points.push([]);
      }
      // Keep every point except "Z" (close path), which has no coordinates
      if (path.commands[i].type != "Z") {
        points[points.length - 1].push(createVector(path.commands[i].x, path.commands[i].y));
      }
    }
  });
  
}

function draw() {
  background(0, 0, 139);
  strokeWeight(3);
  stroke(255);
  translate(40, 170);

  for (let k = num; k > 0; k--) {
    for (let i = 0; i < points.length; i++) {
      // Letter groups 1 and 3 stay as outlines; the rest get a translucent fill
      if (i == 1 || i == 3) {
        noFill();
      } else {
        fill(0, 0, 255, 100);
      }
      beginShape();
      for (let j = 0; j < points[i].length; j++) {
        // Sine-based horizontal wave plus a vertical offset per layer
        vertex(points[i][j].x + r * sin(angle + k * 20), points[i][j].y + k * 10);
      }
      endShape(CLOSE);
    }
  }
  angle += 3;
}

 

Week 4 – Reading Reflection

One thing that always confuses me is the variety of modes on some household items. When using an iron, I see that spinning the circle increases the steam production, and for people who have no idea which level is needed for which clothes, they write the names of the materials on the same circle respectively. What drives me mad is that washing machines and dryers are NEVER intuitive. What’s the difference between Cupboard Dry and Cupboard Dry+ if they take the same time and operate at the same temperature? What is the difference between Gentle and Hygiene, and why is the time difference there 3 hours? And to actually figure out the difference, you have to find the name of the machine (which will never match its actual name), look it up in some 2008 PDF file on the very last Google page, and it still won’t answer the question. I always use Mixed washing and Cupboard Dry just because it works, and I have no idea how the other regimes work. And as Norman says, it’s not me being stupid, but the design allowing for these mistakes.

“The same technology that simplifies life by providing more functions in each device also complicates life by making the device harder to learn, harder to use”

I think my example perfectly supports this idea: the bad design of all these items, with no signifiers, no clear affordances, and no clear conceptual model formed either through life experience or through using the item, just creates more confusion and makes the user always choose one method instead of the huge variety of (probably) useful and functional ones.

I think one way to fix it is to provide some sort of manual, even a tiny table on the edge of the machine would help so much to at least understand which method does what and what the difference between them is. Another way is to display something on the small screen that almost every machine has, like all the characteristics and statistics that are unique to each method, or some short warnings/instructions. Another way to solve this problem is to at least make small illustrations near each method that actually depict what the method does. Genuinely, it would help unleash the potential of these machines and help people use them.

Talking about interactive media, I think the principles Norman talks about are really applicable and foundational.

Sometimes great art pieces with very interesting and complex interactions can be overlooked just because people can’t figure out how to interact with them. I believe that it is very important to design the piece in a very intuitive or guiding way, a way that encourages the user to make the interaction that the author created. As Norman says, humans are really predictable, and in this way, some silent guiding design (not notes, not manuals, but the design itself) should trigger the interaction that is meant to be done in order to experience the art.

Week 4 – Reading Response

Reading Norman’s chapter made me realize how often I get frustrated with specific designs, especially ones that lack efficiency in everyday objects. Norman emphasizes that good design should communicate clearly, prevent errors, and provide feedback. I see this principle in some interactive media, where the design makes it easy to use without much explanation—anyone can figure it out quickly. When something is designed well, you don’t even notice it because everything feels natural and intuitive. Unlike the examples the author mentioned, such as the sink that requires pushing down on it or the door that needs a sign to explain that it is a sliding door, good design should not require instructions. If a user has to stop and think about how to use something basic, then the design has already failed.

Something that drives me crazy is the access doors on campus. I walk around carrying two access cards—one specifically for my suite and room, and another for the rest of the campus. It feels unnecessary and inefficient. On top of that, the glass doors are extremely heavy, and the sensors do not work most of the time. Instead of making entry smooth and accessible, the design creates frustration. According to Norman’s ideas, better mapping, clearer feedback, and fewer constraints could significantly improve this experience.

Week 4 – Global Mood (Data Visualization)

Concept:
My concept is based on showing the current global mood and the world’s current situation. Whenever I would google “news,” most of what came up would evoke a negative emotion in me. So, I decided to visualize the news and categorize it into a few different emotions or feelings.

How I created the code:

I used Guardian and NYT API keys to get access to live articles, although there are some restrictions, like limits on page requests. Therefore, I added some delay in order to access a larger number of pages and news article headlines. I also used world.json for the country borders.

I then created different arrays: one for the emotional bubbles, one for the country borders, one for the CNN breaking news ticker, and one for tracking articles so they are not shown twice. I also added a timer that updates every 60 seconds and adjusted the speed and position of the news ticker.

Then I added geographical points for a list of countries. I created bubbles for different emotions, with each emotion represented by a color. There is also a map key showing which color represents which emotion. The bubbles have visual effects like glowing and shrinking over time to make the map feel dynamic. Emotions are detected using keywords in article titles to classify sadness, anger, hope, or joy.
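The keyword-based classification can be sketched like this; the keyword lists below are placeholders I made up to illustrate the idea, not the project's actual lists:

```javascript
// Hypothetical keyword classifier mirroring the approach described above:
// the first emotion whose keyword appears in the headline wins.
const EMOTION_KEYWORDS = {
  sadness: ["dies", "mourning", "tragedy"],
  anger: ["protest", "outrage", "clash"],
  hope: ["recovery", "breakthrough", "peace"],
  joy: ["celebrates", "wins", "festival"],
};

function classifyHeadline(title) {
  const lower = title.toLowerCase();
  for (const [emotion, words] of Object.entries(EMOTION_KEYWORDS)) {
    if (words.some((w) => lower.includes(w))) return emotion;
  }
  return "neutral"; // no keyword matched
}

console.log(classifyHeadline("City celebrates marathon record")); // → "joy"
console.log(classifyHeadline("Markets steady ahead of report")); // → "neutral"
```

A simple substring match like this is fast but crude, which is exactly why NLP-based classification is listed as a future improvement.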

It initially gets the last 48 hours of news, then it is updated with live breaking news. I also added fallbacks: if the world map fails to load, a simple grid is shown, and if the API fails, a CORS proxy is used to make sure the news still comes through.

The code:
// Fetch 48 hours of historical news from The Guardian
function fetchHistoricalNews() {
  let twoDaysAgo = new Date();
  twoDaysAgo.setDate(twoDaysAgo.getDate() - 2);
  let fromDate = twoDaysAgo.toISOString().split("T")[0]; // Format: YYYY-MM-DD
  console.log("📅 Fetching Guardian news from " + fromDate + " to today...");

  let totalArticles = [];
  let pagesToFetch = 10; // Get 10 pages of results
  let pagesLoaded = 0;
  let failedPages = 0;

  // Fetch pages sequentially with delay to avoid rate limiting
  for (let pageNumber = 1; pageNumber <= pagesToFetch; pageNumber++) {
    setTimeout(() => {
      let apiURL =
        "https://content.guardianapis.com/search?section=world&show-tags=keyword&from-date=" +
        fromDate +
        "&page-size=30&page=" +
        pageNumber +
        "&show-fields=webPublicationDate&api-key=" +
        GUARDIAN_API_KEY;
      console.log("🔄 Requesting Guardian page " + pageNumber + "...");
      fetch(apiURL)
        .then((response) => {
          console.log("📡 Guardian page " + pageNumber + " response status: " + response.status);
          if (!response.ok) throw new Error("HTTP " + response.status);
          return response.json();
        })
        .then((data) => {
          if (data && data.response && data.response.results) {
            totalArticles = totalArticles.concat(data.response.results);
            pagesLoaded++;
            console.log("✅ Page " + pageNumber + " loaded: " + data.response.results.length + " articles");
            if (pagesLoaded + failedPages === pagesToFetch) {
              if (totalArticles.length > 0) {
                console.log("📊 Total Guardian historical: " + totalArticles.length + " (" + pagesLoaded + "/" + pagesToFetch + " pages successful)");
                isShowingHistorical = true;
                sourceStatus.guardian.active = true;
                sourceStatus.guardian.articleCount = totalArticles.length;
                processArticles(totalArticles, true, "guardian"); // true = historical
              } else {
                console.error("❌ All Guardian pages failed");
                sourceStatus.guardian.active = false;
              }
            }
          } else {
            console.warn("⚠️ Guardian page " + pageNumber + " returned empty results");
            failedPages++;
          }
        })
        .catch((error) => {
          console.error("❌ Guardian page " + pageNumber + " failed:", error.message);
          failedPages++;
          if (pagesLoaded + failedPages === pagesToFetch) {
            if (totalArticles.length > 0) {
              console.log("📊 Total Guardian historical: " + totalArticles.length + " (" + pagesLoaded + "/" + pagesToFetch + " pages successful)");
              isShowingHistorical = true;
              sourceStatus.guardian.active = true;
              sourceStatus.guardian.articleCount = totalArticles.length;
              processArticles(totalArticles, true, "guardian");
            } else {
              console.error("❌ All Guardian pages failed");
              sourceStatus.guardian.active = false;
            }
          }
        });
    }, pageNumber * PAGE_REQUEST_DELAY); // Use delay variable
  }
}

// Fetch the latest breaking news from The Guardian
function fetchGuardianNews() {
  console.log("📰 [" + getCurrentTime() + "] Fetching Guardian news...");
  let apiURL =
    "https://content.guardianapis.com/search?section=world&show-tags=keyword&page-size=25&show-fields=webPublicationDate&api-key=" +
    GUARDIAN_API_KEY;
  fetch(apiURL)
    .then((response) => {
      if (!response.ok) throw new Error("HTTP " + response.status);
      return response.json();
    })
    .then((data) => {
      if (data && data.response && data.response.results) {
        console.log("✅ [" + getCurrentTime() + "] Guardian: " + data.response.results.length + " articles");
        sourceStatus.guardian.active = true;
        sourceStatus.guardian.lastUpdate = new Date();
        sourceStatus.guardian.articleCount = data.response.results.length;
        isShowingHistorical = false; // We're showing breaking news now
        processArticles(data.response.results, false, "guardian"); // false = breaking news
      }
    })
    .catch((error) => {
      console.log("⚠️ Guardian direct failed, trying CORS proxy...");
      tryGuardianWithProxy();
    });
}

// Backup method: Try Guardian API through CORS proxy
function tryGuardianWithProxy() {
  let apiURL =
    "https://content.guardianapis.com/search?section=world&show-tags=keyword&page-size=25&show-fields=webPublicationDate&api-key=" +
    GUARDIAN_API_KEY;
  let proxiedURL = "https://api.allorigins.win/raw?url=" + encodeURIComponent(apiURL);
  fetch(proxiedURL)
    .then((response) => {
      if (!response.ok) throw new Error("HTTP " + response.status);
      return response.json();
    })
    .then((data) => {
      if (data && data.response && data.response.results) {
        console.log("✅ [" + getCurrentTime() + "] Guardian via proxy: " + data.response.results.length + " articles");
        sourceStatus.guardian.active = true;
        sourceStatus.guardian.lastUpdate = new Date();
        sourceStatus.guardian.articleCount = data.response.results.length;
        isShowingHistorical = false;
        processArticles(data.response.results, false, "guardian");
      }
    })
    .catch((error) => {
      console.error("❌ [" + getCurrentTime() + "] Guardian completely failed:", error.message);
      sourceStatus.guardian.active = false;
    });
}

 

Reflection and ideas for future work or improvements:

Reflection:

Global Mood taught me a lot about combining live data, visualization, and emotion analysis. Seeing emotions vary across regions in real time was fascinating, and effects like glowing and shrinking bubbles made the map feel dynamic. It also taught me how to use APIs and JSON files in p5.js.

Future Work and Improvements:

I would love to present it as an installation to show people the current global situation. For future improvements, I would incorporate Natural Language Processing to classify emotions more accurately, rather than relying solely on specific keywords. I also wish I had greater access to open-source news APIs to expand the dataset.