Midterm Project Plan

Scope of Work
For my midterm project, I am designing a digital version of the classic Whack-a-Mole game, inspired by the attached references. The goal is to create an engaging and interactive game where players use their mouse to “whack” moles as they pop out of holes. The game should challenge players’ reflexes and introduce risk elements like bombs that add complexity.

The game will start with a Start Screen featuring the game title and a simple “Start” button. I also plan to add access to settings, like toggling sound on or off. Once the game begins, moles will randomly pop up from a grid of holes, and the player must click or tap on them to score points. Not every hole will be safe. Occasionally, a bomb will pop up, and hitting it will result in losing points or lives. Players will have a limited time (for example, 60 seconds) to score as many points as possible before the timer runs out. As the game progresses, the difficulty will increase, with moles appearing and disappearing faster, making the game more challenging.

Each successful hit of a mole adds 10 points, while mistakenly hitting a bomb will deduct 20 points or reduce a life. The game will display a score counter and a countdown timer to keep players aware of their progress and remaining time. At the end of the game, an End Screen will appear, showing the final score and offering options to “Play Again” or “Quit.”

I want the grid to contain between 9 and 16 holes, depending on the level of complexity I decide to implement. Moles and bombs will randomly pop up in these holes at varying intervals; this randomness is crucial so players cannot predict where the next mole or bomb will appear. To add to the challenge, the moles will pop up faster as time progresses, requiring quicker reflexes from the player.

Code Structure

For the game’s development, I plan to use an object-oriented approach. The game will be structured around a few core classes:

  • Game Class: Manages the overall game loop, score tracking, and time countdown
  • Mole Class: Controls the mole’s behavior (when it pops up, how long it stays, and how it reacts to player interaction)
  • Bomb Class: Functions similarly to the mole but triggers penalties when a bomb is clicked
  • Hole Class: Represents each position on the grid, randomly spawning moles or bombs
  • UI Class: Manages elements like the start screen, score display, timer, and end screen

The core gameplay loop will rely on these functions (a rough sketch follows the list):

  • startGame(): Initializes the game and resets scores and timers
  • spawnMole(): Randomly selects holes for moles to appear
  • spawnBomb(): Introduces bombs with a set probability
  • whackMole(): Detects player clicks and updates the score
  • hitBomb(): Triggers penalties for clicking bombs
  • updateTimer(): Counts down and ends the game when time runs out
  • increaseDifficulty(): Speeds up mole appearances as the game progresses
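
A rough sketch of how a few of these functions might fit together in p5.js is below. It is only a sketch under assumed values (a 60-second round, frame-based spawning, and a hypothetical Hole class with isEmpty(), spawn(), and display() methods), not final code:

let holes = [];          // Hole objects; each may hold a mole or a bomb
let score = 0;
let timeLeft = 60;       // seconds remaining
let spawnEvery = 60;     // frames between spawns (~1 second at 60 fps)

function setup() {
  createCanvas(400, 400);
  startGame();
}

function startGame() {
  score = 0;
  timeLeft = 60;
  spawnEvery = 60;
}

function draw() {
  background(40);
  if (frameCount % spawnEvery === 0) spawnMole();  // periodic spawns
  if (frameCount % 60 === 0) updateTimer();        // once per second
  for (let h of holes) h.display();
}

function spawnMole() {
  let hole = random(holes);                        // pick a random hole
  if (hole && hole.isEmpty()) {
    // roughly 15% of spawns are bombs, the rest are moles
    hole.spawn(random() < 0.15 ? 'bomb' : 'mole');
  }
}

function updateTimer() {
  timeLeft = max(0, timeLeft - 1);
  increaseDifficulty();
}

function increaseDifficulty() {
  // spawn faster as time passes, but never faster than every 20 frames
  spawnEvery = max(20, spawnEvery - 5);
}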

Challenges and Risks

I expect that one of the most complex parts of this project will be accurate collision detection: making sure the program properly registers when a player clicks on a mole or a bomb.

Timing is also a big concern. Moles need to appear and disappear at unpredictable but balanced intervals so that the pace feels challenging without becoming frustrating.

To tackle these challenges, I plan to build a test script focused on collision detection, using simple shapes before applying it to the actual mole and bomb sprites. This will help me adjust the hitboxes and make sure user interactions feel responsive. I might also test randomization algorithms to ensure that mole and bomb appearances are unpredictable yet fair.
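
As a starting point for that test script, a circular hitbox check with dist() might look something like the sketch below (placeholder position and radius, not the final hit detection):

// Minimal click-detection test: one circular "mole" hitbox at a fixed spot.
let moleX = 200, moleY = 200, moleRadius = 40;
let hits = 0;

function setup() {
  createCanvas(400, 400);
}

function draw() {
  background(220);
  ellipse(moleX, moleY, moleRadius * 2);   // stand-in shape for the mole sprite
  text(`hits: ${hits}`, 10, 20);
}

function mousePressed() {
  // a click counts as a hit only if it lands inside the circle
  if (dist(mouseX, mouseY, moleX, moleY) < moleRadius) {
    hits++;
  }
}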

Week 5 – Reading Response

What are some of the ways that computer vision differs from human vision?

While humans tend to rely on context, experience, and intuition to recognize objects and interpret the scenes in front of them, computers process raw pixel data and require explicit algorithms to make sense of whatever visual input they are “seeing.” In addition, human vision naturally adapts to different lighting conditions and lighting changes, whereas computer vision can struggle with color perception under different illumination conditions. Similarly, motion recognition in humans is intuitive and predictive, whereas computers depend on techniques such as frame differencing or object tracking to detect movement.

What are some techniques we can use to help the computer see / track what we’re interested in?

According to the paper, to help computers see and track objects we’re interested in, frame differencing can be used. When using this technique, frames are compared, and if pixels change between these frames, the computer sees it as movement. Another technique is brightness thresholding, which separates objects from the background based on their brightness levels. In simple terms, the process involves setting a specific brightness value (aka the threshold), and any pixel brighter or darker than that value is considered part of the object or background. For example, in an image, if the threshold is set to a certain brightness level, pixels brighter than that will be identified as the object, and those darker will be treated as the background.
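
As a concrete illustration of brightness thresholding, a bare-bones p5.js version over a webcam feed might look like the sketch below (the threshold value is arbitrary and would need tuning for real lighting):

// Pixels brighter than the threshold are treated as "object", the rest as background.
let video;
const THRESHOLD = 128;   // arbitrary brightness cutoff (0–255)

function setup() {
  createCanvas(320, 240);
  pixelDensity(1);
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
}

function draw() {
  video.loadPixels();
  loadPixels();
  for (let i = 0; i < video.pixels.length; i += 4) {
    // rough brightness: average of the R, G, and B channels
    let bright = (video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2]) / 3;
    let v = bright > THRESHOLD ? 255 : 0;   // white = object, black = background
    pixels[i] = pixels[i + 1] = pixels[i + 2] = v;
    pixels[i + 3] = 255;
  }
  updatePixels();
}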

How do you think computer vision’s capacity for tracking and surveillance affects its use in interactive art?

I think computer vision’s capacity for tracking and surveillance has really expanded the possibilities of interactive art by allowing artists to create work that supports real-time audience engagement and far more personalized experiences. Installations can now respond dynamically to movement and different gestures, creating immersive environments that evolve based on the viewer’s presence. By using computer vision, interactive art becomes more fluid and responsive, transforming the traditionally passive viewing of art into something more active and engaging. As a whole, I think this technology not only enhances the storytelling and emotional impact of art, but also opens new doors for large-scale public art and immersive installations that blur the line between the digital and physical worlds.

Midterm Progress

I want to create a personalized DJ experience that allows users to choose different music genres, with the environment adapting accordingly. The idea is to present an interactive space where visuals, lighting, and animations react dynamically to the music and make it feel like a real party.

When the experience starts, a button launches the animation, and clicking anywhere switches songs while updating the environment to match the new song. The visuals rely on p5.Amplitude() to analyze the music’s intensity and adjust the movement of butterfly-like shapes accordingly (I reused my previous code to draw the butterflies).

One of the biggest challenges was managing these transitions without them feeling too sudden or chaotic. Initially, switching between songs resulted in jarring color and lighting changes, breaking the immersion. To fix this, I used lerpColor() to gradually shift the background and object colors rather than having them change instantly. Another issue was synchronizing the visuals with the audio in a meaningful way: at first, the amplitude mapping was too sensitive, making the animations look erratic. This still needs improvement; I may try adjusting the amplitude scaling.
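
The two fixes I have in mind could look roughly like the sketch below (variable names are placeholders, and it assumes p5.sound is loaded): the background eases toward a target color with lerpColor(), and the raw amplitude is smoothed before it drives the visuals.

let amplitude;           // p5.Amplitude analyser
let currentBg, targetBg; // colors eased with lerpColor()
let smoothedLevel = 0;

function setup() {
  createCanvas(600, 400);
  amplitude = new p5.Amplitude();
  currentBg = color(20, 20, 60);
  targetBg = color(20, 20, 60);
}

function draw() {
  // move only a small step toward the target color each frame
  currentBg = lerpColor(currentBg, targetBg, 0.05);
  background(currentBg);

  // smooth the raw level so the butterflies don't jitter
  let level = amplitude.getLevel();            // roughly 0.0 – 1.0
  smoothedLevel = lerp(smoothedLevel, level, 0.1);
  let wingSpread = map(smoothedLevel, 0, 0.3, 10, 120, true);
  // ... use wingSpread to drive the butterfly shapes ...
}

function mousePressed() {
  // when the song switches, only the target changes; lerpColor eases the rest
  targetBg = color(random(255), random(255), random(255));
}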

Moving forward, I plan to expand the genre selection with more styles and refine how users interact with the interface. I want each environment to reflect the music’s vibe.

Assignment 5: Midterm Project Update

I developed “Dragon Ball Z: Power Level Training,” an engaging and nostalgic game that captures the essence of the iconic anime series. This interactive experience allows players to step into the shoes of a Dragon Ball Z warrior, focusing on the thrilling power-up sequences that made the show so memorable. Players start with a low power level and, through rapid clicking, increase their strength while watching their character’s energy aura grow. The game features familiar visual and audio elements from the series, including character sprites, power level displays, and the unmistakable sound of powering up. As players progress, they encounter milestones that pay homage to famous moments from the show, culminating in a final power-level goal that, when reached, declares the player a true warrior.

📋Assignment Brief

  • Make an interactive artwork or game using everything you have learned so far
  • Can have one or more users
  • At least one shape
  • At least one image
  • At least one sound
  • At least one on-screen text
  • Object Oriented Programming
  • The experience must start with a screen giving instructions and wait for user input (button / key / mouse / etc.) before starting
  • After the experience is completed, there must be a way to start a new session (without restarting the sketch)

💭Conceptualisation

The idea for “Dragon Ball Z: Power Level Training” was born from a deep appreciation for the iconic anime series and a desire to recreate its most thrilling moments in an interactive format. As a long-time fan of Dragon Ball Z, I’ve always been captivated by the intense power-up sequences that often served as turning points in epic battles. The image of characters like Goku, surrounded by a growing aura of energy as they pushed their limits, has become a defining element of the series.

This project concept emerged while rewatching classic Dragon Ball Z episodes, particularly those featuring transformations and power level increases. I was struck by how these moments, despite their simplicity, generated immense excitement and anticipation among viewers. I wanted to capture this essence and allow players to experience the rush of powering up firsthand. The idea evolved to focus on the visual and auditory aspects of powering up, combining the growing energy aura, rising power level numbers, and the distinctive sounds associated with these transformations.

By digitalizing this experience, I aimed to create an interactive homage to Dragon Ball Z that would resonate with fans and newcomers alike. The game’s design intentionally incorporates key visual elements from the series, such as the character sprites and power level displays, to evoke nostalgia while offering a fresh, interactive twist on the power-up concept. This project not only serves as a tribute to the series but also as an exploration of how iconic pop culture moments can be transformed into engaging interactive experiences.

💻Process

I practiced making classes for certain elements, as that is what I struggle with most. I created classes for the characters and for the auras around them. Through this I solidified my grasp of classes (I feel like a pro now) and can use them for even more features.

class Character {
  constructor(name, x, y) {
    this.name = name;
    this.x = x;
    this.y = y;
    this.powerLevel = 100; // Starting power level
    this.sprite = null; // Will hold the character's image
    this.aura = new Aura(this); // Create an aura for this character
    this.powerUpSound = null; // Will hold the power-up sound
  }

  // Load character sprite and power-up sound
  loadAssets(spritePath, soundPath) {
    // Load the sprite image
    loadImage(spritePath, img => {
      this.sprite = img;
    });
    // Load the power-up sound
    this.powerUpSound = loadSound(soundPath);
  }

  // Increase power level and grow aura
  powerUp() {
    this.powerLevel += 50;
    this.aura.grow();
    // Play power-up sound if loaded
    if (this.powerUpSound && this.powerUpSound.isLoaded()) {
      this.powerUpSound.play();
    }
  }

  // Display the character, aura, and power level
  display() {
    push(); // Save current drawing style
    this.aura.display(); // Display aura first (behind character)
    if (this.sprite) {
      imageMode(CENTER);
      image(this.sprite, this.x, this.y);
    }
    // Display character name and power level
    textAlign(CENTER);
    textSize(16);
    fill(255);
    text(`${this.name}: ${this.powerLevel}`, this.x, this.y + 60);
    pop(); // Restore previous drawing style
  }

  update() {
    // Add any character-specific update logic here
    // This could include animation updates, state changes, etc.
  }
}

class Aura {
  constructor(character) {
    this.character = character; // Reference to the character this aura belongs to
    this.baseSize = 100; // Initial size of the aura
    this.currentSize = this.baseSize;
    this.maxSize = 300; // Maximum size the aura can grow to
    this.color = color(255, 255, 0, 100); // Yellow, semi-transparent
    this.particles = []; // Array to hold aura particles
  }

  // Increase aura size and add particles
  grow() {
    this.currentSize = min(this.currentSize + 10, this.maxSize);
    this.addParticles();
  }

  // Add new particles to the aura
  addParticles() {
    for (let i = 0; i < 5; i++) {
      this.particles.push(new AuraParticle(this.character.x, this.character.y));
    }
  }

  // Display the aura and its particles
  display() {
    push(); // Save current drawing style
    noStroke();
    fill(this.color);
    // Draw main aura
    ellipse(this.character.x, this.character.y, this.currentSize, this.currentSize);
    
    // Update and display particles
    for (let i = this.particles.length - 1; i >= 0; i--) {
      this.particles[i].update();
      this.particles[i].display();
      // Remove dead particles
      if (this.particles[i].isDead()) {
        this.particles.splice(i, 1);
      }
    }
    pop(); // Restore previous drawing style
  }
}

I would like to clarify that I did use ChatGPT to help me understand classes further, and it guided me as I edited this code. However, the bulk of the work is mine.

🚩Predicted Challenges

One of the most intricate tasks will be implementing a particle system to create a dynamic, flowing energy aura around the character. This will require crafting a Particle class with properties like position, velocity, and lifespan, as well as methods for updating and displaying particles. Managing the creation and removal of particles based on the character’s power level will add another layer of complexity to this feature.
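
Since the Aura class above already creates and calls AuraParticle objects, a first pass at that class might look like the sketch below (a hedged draft with placeholder values, not the final particle system):

// Rough sketch of an aura particle: position, velocity, lifespan,
// plus update(), display(), and isDead(), as called by the Aura class.
class AuraParticle {
  constructor(x, y) {
    this.x = x + random(-40, 40);   // start somewhere near the character
    this.y = y + random(-40, 40);
    this.vx = random(-1, 1);        // slight horizontal drift
    this.vy = random(-3, -1);       // energy rises upward
    this.lifespan = 255;            // doubles as the alpha value
  }

  update() {
    this.x += this.vx;
    this.y += this.vy;
    this.lifespan -= 5;             // fade out over roughly 50 frames
  }

  display() {
    noStroke();
    fill(255, 255, 0, this.lifespan);   // yellow, fading with lifespan
    ellipse(this.x, this.y, 8, 8);
  }

  isDead() {
    return this.lifespan <= 0;
  }
}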

Customizing sounds for each character, particularly matching their iconic screams and power-up vocalizations, presents a unique challenge in this project. Dragon Ball Z is known for its distinctive character voices, and replicating this authenticity in the game will require careful sound editing and implementation. Finding high-quality audio clips that capture the essence of each character’s voice, while also ensuring they fit seamlessly into the game’s audio landscape, will be a time-consuming process.

The use of character sprites will be another difficult process, especially given that extracting character models from sprite sheets is a relatively new technique for me. Sprite sheets are efficient for storing multiple animation frames in a single image, but working with them requires a solid understanding of image slicing and animation timing. Learning how to properly extract individual frames, create smooth animations, and manage different character states (idle, powering up, transformed) will likely involve a steep learning curve. This process may involve trial and error, as well as research into best practices for sprite animation in p5.js.
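
As a rough idea of the slicing step, p5.js can cut individual frames out of a sheet with get() once the frame size is known. The sketch below assumes a hypothetical single-row sheet with four equally sized frames and a placeholder filename:

let sheet;
let frames = [];

function preload() {
  sheet = loadImage('goku_powerup_sheet.png');   // placeholder asset name
}

function setup() {
  createCanvas(400, 400);
  let frameW = sheet.width / 4;   // 4 columns assumed
  let frameH = sheet.height;      // single row assumed
  for (let i = 0; i < 4; i++) {
    frames.push(sheet.get(i * frameW, 0, frameW, frameH));
  }
}

function draw() {
  background(0);
  // advance to the next frame every 8 draw calls for a simple animation
  let current = floor(frameCount / 8) % frames.length;
  imageMode(CENTER);
  image(frames[current], width / 2, height / 2);
}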

📶Minimum Deliverables and Extras

Minimum:

  • Start screen with instructions and a start button
  • Main game screen with: Character sprite, Power level display, Energy aura (shape) around the character, Power-up button
  • Basic power-up mechanics (increase power level on button click)
  • Growing energy aura as power level increases
  • At least one sound effect (e.g., power-up sound)
  • Victory screen when final goal is reached
  • Option to restart the game after completion
  • Object-Oriented Programming implementation (Character, PowerUpButton, and EnergyAura classes)

Extras:

  • Multiple playable characters (e.g., Goku, Vegeta, Piccolo)
  • Animated character sprites that change with power level increases
  • Dynamic background that changes based on power level
  • More varied and engaging sound effects (e.g., different sounds for different power levels)
  • Power-up animations (e.g., lightning effects, screen shake)
  • Unlockable content (e.g., new characters, backgrounds) based on achievements
  • Adaptive music that intensifies as power level increases
  • Voice clips from the show playing at certain milestones
  • Mini-games or challenges to break up the clicking (e.g., timed button mashing, rhythm game)

Reading Reflection – Week 5

Computer vision differs from human vision in several key ways, primarily in its struggle with environmental variability, lack of semantic understanding, and limited field of view. While humans can easily adapt to changes in lighting, perspective, and context, computer vision systems process images as raw pixel data without inherent meaning. This fundamental difference presents both challenges and opportunities for artists and designers working with computer vision technologies.

To help computers see and track objects of interest, several techniques have been developed. These include controlled lighting to create consistent illumination, background subtraction to identify moving objects, brightness thresholding to detect significant differences, frame differencing to identify motion, and object tracking to maintain focus on specific elements. These methods, as highlighted in Golan Levin’s article, provide a toolkit for novice programmers and artists to incorporate computer vision into their work, enabling the creation of interactive experiences that respond to movement, gestures, and objects in real time.

I find it interesting how artists navigate ethical considerations regarding privacy and surveillance while also leveraging these technologies to create immersive and responsive installations. Some artists use computer vision as a medium for critical commentary on surveillance culture and social issues, turning the technology’s capabilities into a subject for artistic exploration. This dual nature of computer vision in art- as both a tool and a topic- encourages artists to deeply consider the societal impact of their work.

As computer vision tools become more accessible, there’s a growing tension between the democratisation of technology and the depth of understanding required to use it effectively. While user-friendly interfaces and AI-powered tools (like DALL-E and SORA) make it easier for artists to incorporate computer vision into their work, there’s a risk of oversimplification and a potential loss of the underlying principles that drive these technologies. This evolution in the artistic landscape offers exciting new avenues for creativity but also raises questions about the role of human ingenuity and technical literacy in art creation. As the field continues to advance rapidly, artists are challenged to balance the use of cutting-edge tools with a thoughtful approach to their application, ensuring that technology enhances rather than replaces human creativity.

Week 5 response

Computer vision differs from human vision in that humans perceive the world more holistically and understand visual cues based on experience and context, whereas computers use quantitative forms of image representation. Instead of recognizing things based on mental processes, machines use algorithmic and pattern recognition techniques based on pixel-based image representation.

Thus, compared to humans, computers also have difficulty identifying objects with different illuminations and directions, unless they are highly trained with varied databases. Just as humans estimate depth and motion based on vision and general knowledge, computer programs need specific methods such as optical flow detection, edge detection or machine learning algorithms to deduce similar information.

The power of computer vision to capture motion and analyze visual information has a profound effect on interactive art. Artists can take advantage of these technologies and use them to create installations that respond dynamically to the viewer’s movements, gestures, or even facial expressions and create immersive, interactive experiences. However, these technologies can also raise ethical issues, related to privacy and surveillance if we talk about the use of facial recognition and motion detection in interactive artworks. Consequently, artists working with computer vision must carefully weigh their creative possibilities with the ethical implications linked to surveillance culture.

Mid Term Project

Concept

“Stock Picker Fun” is a fast-paced, simplified stock market simulation game. The player’s goal is to quickly decide whether to buy or sell stocks based on their recent price trends. The game features:

  • Simplified Stocks: Three fictional stocks (AAPL, GOOGL, MSFT) with fluctuating prices.
  • Quick Decisions: Players must make rapid buy/sell decisions based on visual cues.
  • Visual History: Mini-graphs display each stock’s recent price history, aiding in decision-making.
  • Clear UI: A clean and intuitive user interface with color-coded indicators.
  • Progressive Difficulty: The speed of stock price changes increases over time, adding challenge.
  • Profit/Loss Tracking: A simple display of the player’s money and score.

A Highlight of Some Code That You’re Particularly Proud Of

I’m particularly proud of the drawGraph() function:

function drawGraph(data, x, y) {
    stroke('#fff');
    noFill();
    beginShape();
    for (let i = 0; i < data.length; i++) {
        // plot each price relative to the first value in the history,
        // scaled down so the mini-graph fits in a small panel
        vertex(x + i * 5, y + 40 - (data[i] - data[0]) * 0.5);
    }
    endShape();
    noStroke();
}
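
For context, the data array passed into drawGraph() could come from a price update like the one sketched below (hypothetical variable names and numbers, not the game’s actual code): each stock follows a small random walk, its history feeds the mini-graph, and the tick rate shrinks over time for the progressive difficulty.

let stocks = [
  { symbol: 'AAPL',  price: 150, history: [] },
  { symbol: 'GOOGL', price: 140, history: [] },
  { symbol: 'MSFT',  price: 300, history: [] },
];
let framesPerTick = 60;   // start with one price change per second at 60 fps

function setup() {
  createCanvas(600, 400);
}

function updatePrices() {
  for (let s of stocks) {
    s.price = max(1, s.price + random(-5, 5));    // small random move, never below 1
    s.history.push(s.price);
    if (s.history.length > 20) s.history.shift(); // keep the mini-graph short
  }
}

function draw() {
  background(30);
  if (frameCount % framesPerTick === 0) {
    updatePrices();
    framesPerTick = max(15, framesPerTick - 1);   // progressive difficulty
  }
  // draw one mini-graph per stock using the drawGraph() above
  for (let i = 0; i < stocks.length; i++) {
    drawGraph(stocks[i].history, 20, 30 + i * 60);
  }
}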

Embedded Sketch

Reflection and Ideas for Future Work or Improvements

Reflection:

This game successfully simplifies the stock market experience, making it accessible and engaging for a wide audience. The visual history and clear UI provide valuable feedback, allowing players to quickly grasp the mechanics and make informed decisions. The progressive speed adds a layer of challenge, keeping the gameplay dynamic.

Ideas for Future Work or Improvements:

  1. More Data Visualization:
    • Add candlestick charts or other advanced visualizations to provide more detailed stock information.
    • Implement real-time data streaming from an API to simulate live market conditions.
  2. Advanced Trading Features:
    • Introduce different order types (limit orders, stop-loss orders).
    • Add the ability to short stocks (bet on price declines).
    • Include options trading.
  3. Dynamic News Events:
    • Generate random news events that impact stock prices, adding an element of unpredictability.
    • Use visual cues or animations to indicate the impact of news.
  4. User Profiles and Persistence:
    • Implement user profiles to save game progress and track performance over time.
    • Use local storage or a database to persist data.
  5. Sound Effects and Animations:
    • Add sound effects for buy/sell actions, price changes, and game events.
    • Incorporate more animations to enhance the visual feedback and create a more immersive experience.
  6. More Stock types:
    • Add more stock types that have different volatilities.
  7. Game over conditions:
    • Add game over conditions, such as running out of money.
  8. Add a pause feature:
    • Add a pause feature to the game.
  9. Mobile optimization:
    • Optimize the game for mobile devices, using touch controls and responsive design.

By implementing these improvements, the game can be transformed into a more comprehensive and engaging stock market simulation.

Week #5 Reading – Computer Vision

Introduction

Computer Vision is the amalgamation of various mathematical formulae and computational algorithms, accompanied by the computational tools capable of carrying out the procedure. What was once deemed too expensive and high-level (limited to experts in AI and signal processing), computer vision has now become readily available. Various software libraries and suites provide student programmers with the ability to run and execute the algorithms required for object detection to work. The cherry on top is that, with mass refinement and the wider availability of computer hardware at a fraction of what it would have cost in the early 1990s, anyone, and by anyone I mean practically any institution, can access it and tinker with it.

Difference between computer and human vision:

Computer vision works over a designated perimeter, scanning an array of pixels vertically and horizontally. Upon detecting a change in pixel shade, it infers a detection. Using complex algorithmic processing in the back end, it can analyze and detect movement among various other traits, such as character recognition. Various techniques like “detection through brightness thresholding” are implemented. Human vision works along somewhat similar lines: our retinas capture the light reflecting from various surfaces, and our brain translates the upside-down projection into something comprehensible. Our brain is trained to interpret objects, while computer vision requires algorithmic understanding and the aid of artificial intelligence. With AI, training on a data set is done, whether supervised or not, to teach the computer how to react to a certain matrix of pixels, i.e. a scanned image.

Ways to make computer vision efficient:

As mentioned in the reading and the paper, one of the things that I love is “background subtraction”: the ability to isolate the desired object. In my opinion, tracking several objects using this technique, together with variety in the trained data set, helps with more accurate and precise judgment, especially if many objects are present at the same time. Other techniques such as “frame differencing” and “brightness thresholding” exist as well. From other readings, the larger the data set and the longer the training time, the better the accuracy. However, acquiring image data comes with ethical dilemmas and added cost.
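
A stripped-down illustration of background subtraction in p5.js is sketched below (an assumption-laden example, not taken from the reading): one reference frame is stored on a key press, and any pixel that differs from it by more than a threshold is flagged as foreground.

let video;
let backgroundFrame = null;
const DIFF_THRESHOLD = 40;   // arbitrary; depends on lighting and camera noise

function setup() {
  createCanvas(320, 240);
  pixelDensity(1);
  video = createCapture(VIDEO);
  video.size(320, 240);
  video.hide();
}

function keyPressed() {
  // press any key while the scene is empty to memorise the background
  backgroundFrame = video.get();
  backgroundFrame.loadPixels();
}

function draw() {
  video.loadPixels();
  if (!backgroundFrame) {
    image(video, 0, 0);      // no reference yet: just show the live feed
    return;
  }
  loadPixels();
  for (let i = 0; i < video.pixels.length; i += 4) {
    // compare the brightness of the live frame with the stored background
    let live = (video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2]) / 3;
    let ref = (backgroundFrame.pixels[i] + backgroundFrame.pixels[i + 1] + backgroundFrame.pixels[i + 2]) / 3;
    let v = abs(live - ref) > DIFF_THRESHOLD ? 255 : 0;   // white = foreground
    pixels[i] = pixels[i + 1] = pixels[i + 2] = v;
    pixels[i + 3] = 255;
  }
  updatePixels();
}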

Computer Vision’s surveillance and tracking capability, and its implementation in interactive Media:

Works like Videoplace and Messa di Voce are examples of earlier demonstrations of the combination of interactive media and computer vision. Installations can track and respond to human input, and this “feedback loop” triggers a sense of immersion and responsiveness. In my humble opinion, the use of computer vision takes the user away from traditional input techniques and gives them freedom to act as they will. Though it is also true that the computer makes sense of the input against its trained data set, and a totally random input might lead the system to fail. This is where the idea of a “degree of control” comes into play. Personally, I believe that as long as we have a combination of interactive components, the user will never get tired of running inside the same tiring maze, and the use of computer vision definitely makes the experience seem less tangible and more user-centered. Hence, I decided to use it for my midterm project as well!

Week #5 – Midterm Progress

1. Concept

I was inspired by the ‘coffee shop expo’ project, where the user can explore a place and click on different buttons freely; but I wished for more room for exploration and especially more freedom, in the sense that the user can control a sprite with keys to move around. Then, if this sprite lands on a specific place, a new place of discovery opens up.

I spent a considerable amount of time developing the concept: a casual 2D RPG adventure game in a pixelated world with a medieval setting, with story snapshots (as clear, not-so-pixelated images) appearing from time to time as a quest is completed. The user takes on the role of a character (represented with a sprite) who has migrated to a new city and wants a new job to earn some money. There is a task board in a tavern where villagers come to post various miscellaneous tasks. The user can pick a job from it; some examples are below, though my goal is to incorporate at least two of the following:

    • fetch water from the well,
    • harvest a number of carrots,
    • doctor’s mission – fetch herbs for medicinal purposes,
    • help me find my dog,
    • blacksmithing job,
    • help deliver letter.

I was wondering whether it would be strange to have both pixelated and non-pixelated graphics. After I explained my concept idea, my friend thought of an existing game like that: “Omori.” Omori has pixelated sprites but clear CG scene snapshots, as well as iconic music – things to get inspired by – as well as a fan wiki for sprites which I could try to make good use of.

Interactive elements are to be present, such as a mouse click for opening a door and revealing a character’s speech. Audio is also heard for different story scenes, chosen appropriately based on the atmosphere – e.g. music with a sense of urgency at the task board, relaxing music in the village, music with a sense of triumph and excitement when the quest is completed, etc. On-screen text could be used to beckon and inform the user with narration, instructions and dialogue.

After the experience is completed, there must be a way to restart the experience again (without restarting the sketch).

2. Code Highlights

I think the main challenge would be in the OOP, specifically making the character classes. I found a useful resource for collecting sprites: Spriter’s Resource. In particular, I would like to use village sprites from Professor Layton and the Curious Village. Here are the Character Profiles. I selected the following from the various characters in Professor Layton and the Curious Village:

  • Luke (user’s character)
  • Franco (farmer who needs help with harvesting carrots),
  • Ingrid  (neighbour grandma who needs help with delivering a letter),
  • Dahlia (noblewoman with a lost cat),
  • Claudia (Dahlia’s cat),
  • Lucy (young girl who needs help with getting herbs for her sick mother),
  • Flora (mysterious herb seller).

Since the challenge lies in OOP, I would like to practise making an object, namely the user’s character “Luke.”

In the development of the code, I found it challenging to adapt the walk animation code (discussed in class with Professor Mang) in two ways: (1) into an OOP format and (2) to my spritesheet, which does not have different frames of walking and does not show the sprite facing different directions. With (1):

  • I decided to have variables from the walk animation placed into the constructor as the class’s attributes.
  • Instead of keyPressed() as in the walk animation, I used move() and display(), since keyPressed() cannot live inside a class: it has to be a global p5.js event function rather than a method local to the object. (A possible sketch of move() follows the class code below.)
class User_Luke {
  constructor() {
    this.sprites = [];
    this.direction = 1;  // 1 = right, -1 = left
    this.step = 0;
    this.x = width/20;
    this.y = height/15;
    this.walkSpeed = 3;
    this.scaleFactor = 0.2; // Scaling factor
    
    // Slice the spritesheet into 6 columns and 2 rows of frames

    let w = int(luke_spritesheet.width / 6);
    let h = int(luke_spritesheet.height / 2);

    for (let y = 0; y < 2; y++) {
      this.sprites[y] = [];
      for (let x = 0; x < 6; x++) {
        this.sprites[y][x] =
          luke_spritesheet.get(x * w, y * h, w, h);
      } // iterate over rows
    } // iterate over columns

    this.x = width / 2;
    this.y = height / 2;

    imageMode(CENTER);

    // Display first sprite
    image(this.sprites[0][0], this.x, this.y);
  }
  move() {
    ...
  }
  display() {
    ...
  }
}
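
Since move() and display() are elided above, here is a possible sketch of what move() might look like, assuming arrow-key control with keyIsDown(); this is only an illustration of where the keyPressed() logic went, not my final implementation:

class User_Luke {
  // ... constructor and display() as above ...
  move() {
    // poll the keyboard every frame instead of relying on the global keyPressed()
    if (keyIsDown(LEFT_ARROW)) {
      this.direction = -1;
      this.x -= this.walkSpeed;
    } else if (keyIsDown(RIGHT_ARROW)) {
      this.direction = 1;
      this.x += this.walkSpeed;
    }
    if (keyIsDown(UP_ARROW)) this.y -= this.walkSpeed;
    if (keyIsDown(DOWN_ARROW)) this.y += this.walkSpeed;
    // keep Luke inside the canvas
    this.x = constrain(this.x, 0, width);
    this.y = constrain(this.y, 0, height);
  }
}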

 

With (2), I set conditions if direction is 1 (representing right direction) or -1 (representing left direction). Since my spritesheet only shows the sprites in one direction, I used an image transformation:

class User_Luke {
  constructor() {
    ...
  }
  move() {
    ...
  }
  display() {
    let spriteWidth = this.sprites[0][0].width * this.scaleFactor;
    let spriteHeight = this.sprites[0][0].height * this.scaleFactor;
    
    // Finally draw the sprite
    // The transparent areas in the png are not
    // drawn over the background
    if(this.direction === -1) {
      image(this.sprites[0][0],this.x,this.y, spriteWidth, spriteHeight)
    } 
    else if(this.direction === 1) {
      // We will use the scale() transformation to reverse the x-axis.
      // The push and pop functions save and reset the previous transformation.
      push();
      // Scale -1, 1 means reverse the x axis, keep y the same.
      scale(-1, 1);
      // Because the x-axis is reversed, we need to draw at different x position.
      image(this.sprites[0][0], -this.x, this.y, spriteWidth, spriteHeight);
      pop();
    }
  }
}

I also noticed that my sprite appeared at a huge size; to deal with this, I applied a scale factor to spriteWidth and spriteHeight (already shown in the code above).

3. Embedded Sketch

4. Reflection and Next Steps

I experienced multiple challenges along the way, but I gained valuable experience with OOP. I feel that making the next sprites won’t be so challenging, since I can use Luke’s code as a reference and adapt it. I think it will be important to plan deadlines for myself, since there are big subtasks for this midterm project, including:

  • Finding background(s)
  • Finding snapshots
  • Coding all the sprites
  • Interactive elements – door open animation, ‘choose a job’ buttons

Reading Reflection – Week#5

  • What are some of the ways that computer vision differs from human vision?

No computer vision algorithm is universally able to perform its intended function (e.g. distinguishing humans from the background) on any kind of input video, unlike the human eye and brain, which can generally work together to do so. Instead, a detection or tracking algorithm relies crucially on distinctive assumptions about the real-world scene it is meant to analyze. If the algorithm’s assumptions are not met, it can perform poorly, produce results of little value, or fail at its function entirely.

Take frame differencing as a first example: it detects objects by detecting their movement, comparing corresponding pixels of two frames and finding the difference in color and/or brightness between them. The frame differencing algorithm therefore performs accurately only with “relatively stable environmental lighting” and “a stationary camera (unless it is the motion of the camera which is being measured).” Hence, videos with a lot of active movement, like NBA games, would be much more suitable input than videos of people quietly focused in an office. Beyond frame differencing, background subtraction and brightness thresholding are further examples where certain presumptions matter. Background subtraction “locates visitor pixels according to their difference from a known background scene,” while brightness thresholding uses “hoped-for differences in luminosity between foreground people and their background environment.” Considerable contrast in color or luminosity between foreground and background is therefore important for accurate recognition; otherwise, in nighttime scenes for instance, the algorithm may incorrectly classify objects in the scene as background.

On the other hand, I personally feel that the human eye remarkably uses a combination of these three algorithms, and perhaps more, to detect objects, which allows it to perform extraordinarily well compared to current computer vision.

 

  • What are some techniques we can use to help the computer see / track what we’re interested in?

It is of great importance to design a physical environment with conditions best suited to the computer vision algorithm and, conversely, to select software techniques that work best with the physical conditions at hand. Several examples stood out to me for enhancing the suitability and quality of the video input provided to the algorithm. I believe that infrared illumination (as used in night vision goggles) can complement conventional black-and-white security cameras, massively boosting the signal-to-noise ratio of video taken in low-light conditions. Polarizing filters are useful for handling glare from reflective surfaces, especially in celebrity shows. Of course, there are also many cameras to consider, optimized for “conditions like high-resolution capture, high-frame-rate capture, short exposure times, dim light, ultraviolet light, or thermal imaging.”

 

  • How do you think computer vision’s capacity for tracking and surveillance affects its use in interactive art?

Computer vision’s capacity for tracking and surveillance opens doors for interactivity between the computer and the human body, gestures, facial expressions, and dialogue. Some sophisticated algorithms can already correctly identify facial expressions, which could be used to gauge someone’s emotional state and support mental health initiatives for people suffering emotionally. This recalls Cheese, an installation by Christian Möller. Additionally, as in Videoplace, participants could create shapes using gestures, and their silhouettes in different postures can form different compositions. If computer vision were combined with audio and language, such systems could become even more interactive as the affordances increase.