“Vision of the Future” videos have been a thing since the early 2000s: the amazing hologram screens, or people waving their hands in the air to move digital windows around (we have augmented reality for that now). It all looks futuristic, sophisticated, and innovative, but from an interaction perspective, it is actually incredibly timid.
I agree with what Bret is saying: a tool is supposed to amplify human capabilities, converting what we can do into what we want to do. If that entire principle is gone, then it isn’t just a bad tool, it is not a tool at all.
Most modern devices ignore the two things hands do best, feeling and manipulating, so how can we call these ideas revolutionary when they are actually going backwards?
Bret’s point about “finger-blindness” is actually terrifying to think about: what we take for granted, future generations may struggle with. If we do not use our hands to feel texture, weight, and pliability, we lose the ability to understand the “inner meaning” of objects. We are building a world where we can spend our entire lives immobile, staring at a “hokey visual facade” that has no physical connection to the work we are doing.
If the future of interaction does not let us see, feel, and manipulate space simultaneously, then it is not a future worth building or investing in.
We came up with a piano that uses your keyboard presses and a buzzer; the keys A through L let you play 9 different notes. We added an LCD that displays the note of each key and the frequency of that note.
Implementation:
Schematic:
The components used are pretty simple: just an LCD and a buzzer. We wrote 2 files of code for this, a Python file and a C++ file. Since typing a letter into the serial monitor every time you wanted to play a note would be tedious, we wrote a Python script that listens for your key presses, and if you press a key between A and L, it sends that key press to the Arduino, which then plays the note that corresponds to it.
Here we first connect to the Arduino using the port and the baud rate shown in the Arduino IDE; then, until the program stops, we check for any key presses, and if one matches our conditional statements, we write that letter to the Arduino.
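Stripped of the keyboard and serial I/O, the Python side reduces to a small key filter. This is a sketch, not the actual script: I am assuming the real version uses libraries like pynput and pyserial for the I/O (hinted at only in comments, with a placeholder port name), so only the filtering logic is shown executable here.

```python
# Sketch of the Python-side key filter. Assumption: the real script uses
# pynput to listen for key presses and pyserial to talk to the Arduino;
# only the filtering logic is shown here.
VALID_KEYS = set("ASDFGHJKL")

def char_to_send(key):
    """Return the character to write to the serial port, or None
    if the pressed key is not one of the nine piano keys."""
    k = key.upper()
    return k if k in VALID_KEYS else None

# The real script would then do roughly (port name is a placeholder):
#   ser = serial.Serial("COM3", 9600)  # same port/baud as in the Arduino IDE
#   ser.write(char_to_send(k).encode())  # whenever char_to_send(k) is not None
```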
switch (key) {
  case 'A': frequency = 262; noteName = "C4"; break;
  case 'S': frequency = 294; noteName = "D4"; break;
  case 'D': frequency = 330; noteName = "E4"; break;
  case 'F': frequency = 349; noteName = "F4"; break;
  case 'G': frequency = 392; noteName = "G4"; break;
  case 'H': frequency = 440; noteName = "A4"; break;
  case 'J': frequency = 494; noteName = "B4"; break;
  case 'K': frequency = 523; noteName = "C5"; break;
  case 'L': frequency = 587; noteName = "D5"; break;
  default: return; // Ignore any other keys
}
Here we have a switch statement that checks whether we got a matching letter and sets the respective frequency and note. We took the frequencies for each note from https://en.wikipedia.org/wiki/Piano_key_frequencies, as we wanted it to sound as close to a piano as possible. The LCD shows the note and its frequency as you play it, and a potentiometer is used to control the contrast of the LCD!
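As a side note, those frequencies are not arbitrary: in equal temperament every note is a factor of 2^(1/12) away from its neighbor, anchored at A4 = 440 Hz. A quick sanity check (my own derivation, not project code) reproduces the whole table from MIDI note numbers:

```python
# Derive piano frequencies from equal temperament: f = 440 * 2^((n - 69) / 12),
# where n is the MIDI note number and 69 corresponds to A4.
def note_freq(midi_note):
    return 440 * 2 ** ((midi_note - 69) / 12)

# MIDI numbers for the nine keys C4..D5 used in the project
notes = {"C4": 60, "D4": 62, "E4": 64, "F4": 65, "G4": 67,
         "A4": 69, "B4": 71, "C5": 72, "D5": 74}
table = {name: round(note_freq(n)) for name, n in notes.items()}
# table == {"C4": 262, "D4": 294, "E4": 330, "F4": 349, "G4": 392,
#           "A4": 440, "B4": 494, "C5": 523, "D5": 587}
```

Rounding to whole hertz gives exactly the values used in the switch statement.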
Currently this is a single-press piano, meaning you can’t play multiple notes at once, so one improvement would be finding a way to play several notes simultaneously. Otherwise this works perfectly, and is simple and accessible to anyone!
The idea of this is that you “charge up” a light show by shining a really bright light on the photoresistor; once you have charged it for a good 10 seconds, a light show happens! A yellow RGB light slowly brightens as the percentage climbs to 100 while charging. I implemented a physical button that resets everything, stopping the light show so that you are able to charge it again.
Implementation:
Schematic:
The components used are RGB LEDs, a photoresistor, a 2×16 LCD and a potentiometer.
#include <LiquidCrystal.h>
// RS, E, D4, D5, D6, D7
LiquidCrystal lcd(12, 11, 5, 4, 3, 2);
const int ldrPin = A0;
const int btnPin = 7; // Button
const int ledAnalog = 9; // The fading LED
const int ledA = 13; // Light show LED
const int ledB = 10; // Light show LED
const int ledC = 8; // Light show LED
int threshold = 600;
unsigned long startMs = 0;
bool charging = false;
bool done = false;
Here we initialize everything. Since the LCD has its own library and example code, I took the initialization syntax from there. We assign the pins, set the threshold, and set the state flags to false since the program has just begun.
// Reset logic
if (digitalRead(btnPin) == LOW) {
  // Set variables to original values
  startMs = 0;
  charging = false;
  done = false;
  // Turn off everything
  digitalWrite(ledA, LOW);
  digitalWrite(ledB, LOW);
  digitalWrite(ledC, LOW);
  analogWrite(ledAnalog, 0);
  lcd.clear();
  lcd.print("System Reset");
  delay(500);
}
This is the reset code for when the button is pressed: it turns everything off and sets charging and done to false, returning the system to the state it originally started in.
if (!done) {
  // When light above threshold start counting
  if (light > threshold) {
    if (!charging) {
      startMs = millis();
      charging = true;
    } ...
  }
}
We have nested if statements here: first we check whether charging is done, and if it isn’t, we check whether the light crosses the threshold; if we aren’t already charging, we start. If the light drops back below the threshold, the charge resets to 0 and we start over.
When it has charged for 10 seconds or more, we set done to true, which triggers the else branch and starts the light show until it is stopped by the button or simply turned off.
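The whole charge cycle described above can be condensed into a tiny state machine. This is an illustrative Python sketch with made-up names, not the Arduino code itself:

```python
# Illustrative sketch of the charging state machine (names and structure are
# hypothetical; the real sketch uses millis() and analogWrite on an Arduino).
CHARGE_MS = 10_000   # 10 seconds of bright light to fully charge
THRESHOLD = 600      # LDR reading that counts as "bright"

def update(state, light, now_ms):
    """Advance the charger. state = (charging, start_ms, done); returns
    the new state plus the current charge percentage (0-100)."""
    charging, start_ms, done = state
    if done:
        return state, 100                      # light show running, stays at 100%
    if light > THRESHOLD:
        if not charging:
            charging, start_ms = True, now_ms  # light crossed threshold: start counting
        pct = min((now_ms - start_ms) * 100 // CHARGE_MS, 100)
        if pct >= 100:
            done = True                        # fully charged: start the light show
        return (charging, start_ms, done), pct
    return (False, 0, False), 0                # light dropped: charge resets to 0

state = (False, 0, False)
state, pct = update(state, light=700, now_ms=0)      # starts charging, pct 0
state, pct = update(state, light=700, now_ms=5000)   # pct 50
state, pct = update(state, light=700, now_ms=10000)  # pct 100, done becomes True
```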
One way I thought of improving this is having a proper choreography of the lights that goes along with some music, either doing it manually or writing an algorithm that works for any song; either way I think that would improve this project. It was fun learning about the LCD, and I am glad that I am able to use it now.
Reflection 1: On “Physical Computing’s Greatest Hits (and Misses)”
This reading really challenged how I think about originality in my projects. It is easy to feel discouraged when you realize a project you are excited about has already been done thousands of times. This is something I have experienced in my past assignments too, where it often felt like I was just making a copy of something else. However, as Tigoe points out, these classics are repeated because they tap into fundamental human expressions like movement and gesture.
This made me rethink my own approach to class projects. I realized that a project does not have to be a world-first to be successful. The value lies in the twist you add, how you refine the design, change the interaction, or make it feel unique to your own creative voice. I feel like this is something we all do unintentionally because we each have different styles and tastes.
Reflection 2: Making Interactive Art
The second reading shifted my perspective on the role of the designer in interactive art. I used to think that as a creator, I needed to explain exactly what my project was supposed to do so that people would not get confused. However, Tigoe suggests the opposite: build the environment, provide the context, and then shut up.
The idea that the audience completes the work through their own actions is powerful, but I think it is situational. Some projects are meant to do exactly what the creator intended, and there is not really anything else for the user to add. While I agree with Tigoe’s take for certain artistic pieces, I do not think it is universally true for every single project. Sometimes a clear, guided experience is exactly what is needed.
Reading Don Norman’s “Emotion & Design: Attractive things work better” shifted my perspective on what makes an interactive project successful. I used to think usability was strictly about clear logic and zero errors. However, Norman argues that positive affect actually broadens our thought processes, making us more creative and more tolerant of minor design faults.
This applies to everything; one example I can bring up is Satisfactory, an automation-based game, so you already know it is going to involve a lot of problem solving to automate materials efficiently. However, I strive to make my factories and farms aesthetic: I spend more time beautifying the factory than I would just building a bare-bones one. The scenario I’m going to provide applies to practically anything.
When you look at a factory that is aesthetic and well-organized, your affective system sends positive signals. This puts you in a “breadth-first” state of mind, which Norman says is perfect for the creative, “out-of-the-box” thinking needed to solve complex automation loops.
The “bare-bones” factory focuses only on the math and the belts. While functional, if a problem arises and the environment is ugly or stressful, you might fall into “depth-first” processing. This leads to tunnel vision where you can’t see the simple solution because you are too focused on the “danger” of the inefficiency.
It isn’t just about looks: an attractive design creates a positive mental state, which can lead to better problem solving and to things being more functional and working better than if you had gone for pure functionality.
Her Code Got Humans on the Moon:
The arrogance of the 1960s tech world was the belief that perfect engineering could override human nature. Hamilton’s higher-ups dismissed her error-checking code because they were blinded by the myth of the perfect astronaut who would never make a mistake. This was not just a technical disagreement. It was a fundamental misunderstanding of how people interact with systems. By allowing her daughter to play with the simulators, Hamilton gained a perspective her colleagues lacked: if a crash is physically possible, it is eventually inevitable.
What actually saved the Apollo 11 mission was Hamilton’s move toward asynchronous processing. In an era where computers were expected to just execute a linear list of tasks, she designed a system that could decide what was important. When the hardware was being overwhelmed by a documentation error during the moon landing, the software did not just freeze. It prioritized the landing maneuvers and ignored the rest. She essentially invented the concept of a fail safe for software, shifting the industry from just making things work to making things survive the people using them.
She paved the way for the mindset of never settling for the bare minimum, of going above and beyond so that no mistake could possibly happen, and of covering every case there is, whether it is one you know about or one you don’t.
I have an extra Arduino I got years ago, so I decided to scour through the parts to see if I could find something to use, which I did! The switch here is more functional than unusual, but it’s hands-free nonetheless. The 2 main players here are an RFID reader and a dot matrix module. The RFID reader reads the card in your pocket as you walk through the door. If the card matches the accepted card(s), a green LED flashes and a smiley face is shown on the dot matrix module. However, if a person with the wrong card passes through the door, the red LED flashes and the dot matrix shows a huge X.
We begin our loop code by checking whether the RFID module can read a card at all; if it can’t, the rest of the code won’t run.
// Long green flash when correct card and show smile.
if (match) {
  Serial.println("ACCESS GRANTED");
  digitalWrite(GREEN_LED_PIN, HIGH);
  showSmile();
  delay(3000);
  digitalWrite(GREEN_LED_PIN, LOW);
}
Here, if the scanned card matches the card we grant access to, we turn on the green LED and show the smile on the dot matrix module; this lasts for 3 seconds before turning things off.
else {
  Serial.println("ACCESS DENIED - ALARM");
  showX();
  // Repeated red flashing
  for (int i = 0; i < 5; i++) {
    digitalWrite(RED_LED_PIN, HIGH);
    delay(100);
    digitalWrite(RED_LED_PIN, LOW);
    delay(100);
  }
}
If the card that is read does not match the card we want, we show the X on the dot matrix and flash the red LED 5 times.
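Underneath both branches, the match itself is nothing more than a byte-for-byte comparison of the scanned UID against a whitelist. A conceptual sketch (the UID below is a made-up example, not my actual card):

```python
# Conceptual sketch of the RFID access check: compare the scanned UID
# against a whitelist of accepted card(s). The UID here is a made-up example.
ALLOWED_UIDS = [
    [0xDE, 0xAD, 0xBE, 0xEF],
]

def access_granted(scanned_uid):
    # True only if the scanned bytes exactly match an allowed card's bytes
    return any(scanned_uid == uid for uid in ALLOWED_UIDS)
```

Supporting more cards is just a matter of appending their UIDs to the whitelist.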
This project is based on a childhood game of mine called Magic Touch. The core concept of that game is that you are a wizard and must stop robots from attacking your castle; the robots fall slowly, carried by balloons containing symbols. You must draw the symbol on a balloon to pop it, and when all the balloons on a robot are popped, it falls to its death.
In my case, I made my game practically completely camera based, with no use of the keyboard at all and a single use of the mouse just to toggle full screen. It is cyberpunk themed: you are being attacked by drones, and you must draw the symbols rotating around the drones with your hand to eradicate them before they breach the system.
Implementation:
The code hierarchy consists of 2 folders, one for assets, and one for all the scripts.
The assets folder is self-explanatory: it contains all my music/sound effects, images and fonts.
The scripts folder consists of 12 JavaScript files (excluding sketch.js, which sits outside the folder). I will summarize what each file does, providing more technical context when needed.
CyberButton.js: This file contains a class called CyberButton, which takes in the position, the width and height, and the label for the button (the text inside it).
However, most of the code is designing the button itself: it has an outer blue outline with a transparent inside and a “filled” cyan color, as well as 2 purple trapezoids coming out of diagonally opposite corners of the button.
HandTracking.js: This is where the magic happens. This entire file contains the code for all the hand tracking and its optimization. It includes a class used to store the Kalman filter settings for each hand shown on screen. I will quote my midterm progress post to explain what a Kalman filter is.
To explain the core concept:
The filtering has 3 steps:
– Predict
– Update
– Estimate
The Kalman filter works in a simple loop. First, it predicts what the system should look like next based on what it already knows. Then, it checks that prediction against a new (noisy) measurement and corrects itself.
Because of this, the Kalman filter has two main steps. The prediction step moves the current estimate forward in time and guesses how uncertain that estimate is. The correction step takes in a new measurement and uses it to adjust the prediction, giving a more accurate final estimate.
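A minimal one-dimensional version of that predict/correct loop looks like the following. This is a generic constant-position Kalman filter with illustrative noise values, a sketch rather than the exact implementation in HandTracking.js:

```python
# Minimal 1-D Kalman filter (constant-position model). The q and r values
# are illustrative, not the project's actual tuning.
class Kalman1D:
    def __init__(self, q=0.01, r=4.0):
        self.x = None  # current estimate (seeded by the first measurement)
        self.p = 1.0   # estimate uncertainty
        self.q = q     # process noise: how much we expect the hand to move
        self.r = r     # measurement noise: how jittery the camera readings are

    def update(self, z):
        if self.x is None:
            self.x = z
            return z
        # Predict: the state stays put, but uncertainty grows.
        self.p += self.q
        # Correct: blend prediction and measurement by the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= (1 - k)
        return self.x

kf = Kalman1D()
for z in [100, 102, 98, 101, 99, 100]:  # noisy x-positions of a fingertip
    smooth = kf.update(z)
# smooth stays near 100 while damping the frame-to-frame jitter
```

Raising r (or lowering q) trusts the camera less, producing smoother but laggier tracking, which is exactly the latency vs. smoothness tradeoff.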
This file also calculates the distance between your thumb and index to determine when you are pinching and when you are not.
The way the pinching logic works is kind of overcomplicated for the gameplay. I am sure there is probably a better way, but this is the way I figured out, and if it works, it works.
Now, when drawing with your hand, the detector itself is very sensitive, and sometimes your drawings just stop midway, which ruins the gameplay. Pinching becomes true when the thumb-to-index distance drops below 30. However, it ONLY becomes false again once the distance exceeds 60 (this can be changed in options). This allows for leeway and basically gives you some grace: you need your index and thumb really close to start a pinch, but to end it you would have to move them really far apart (60, double the threshold to pinch).
if (pinchd < 30) {
  isPinching = true;
}
---------------------------------
let isActuallyPinching = pinchd < pinchThreshold;
// Gives the user a 30 pixel buffer for when drawing to reduce the probability of accidentally stopping drawing.
// When we are drawing, we push the point of our cursor to the current path
if (isActuallyPinching) {....}
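This two-threshold pattern is essentially a Schmitt trigger (hysteresis). Reduced to its core, with the default 30/60 values from above, it can be sketched as:

```python
# Hysteresis ("grace") for pinch detection: engage below 30 px,
# release only above 60 px. Values match the defaults described above.
PINCH_ON = 30
PINCH_OFF = 60

def update_pinch(is_pinching, dist):
    if dist < PINCH_ON:
        return True
    if dist > PINCH_OFF:
        return False
    return is_pinching   # inside the dead zone: keep the previous state

pinching = False
for d in [80, 25, 45, 59, 61, 40]:
    pinching = update_pinch(pinching, d)
# 80 -> False, 25 -> True, 45 and 59 stay True (grace zone),
# 61 -> False, 40 -> stays False
```

The dead zone between the two thresholds is what absorbs the detector’s jitter: small fluctuations around a single cutoff can no longer flicker the pinch on and off.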
OnBoarding.js: This contains all the information the user needs before starting the game, so how to play, how to navigate the menu, and how to make sure your user experience is as good as it can be.
drones.js: This file contains a class called Drone. There are 3 types of drones that spawn during gameplay: a normal drone, a miniboss drone, and a boss drone. What differentiates them is the number of symbols you need to draw to eradicate them: a normal drone gives you 1-2 symbols, a miniboss 5-8 symbols, and a boss 15 symbols. There are 5 different symbols to draw, so symbols will repeat. For the drones, I am using a sprite with an idle animation for the falling and a death animation. The miniboss drone is tinted purple and slightly bigger, while the boss drone is tinted red and is very large.
global.js: This was mostly just to clean everything up; it contains all the global variables used in the project.
// Path of the drawing
let currentPath = [];
// The variable that will hold the stroke recognizer class.
let recognizer;
// Keep track of the current state of the game
let state = "menu";
// Hand model, will become true when it is initialized and ready
let modelReady = false;
// Variable for the camera feed
let video;
// Split sprite sheets into animations
let animations = {};
// Raw data of the sprite sheets
let sheets = {};
// Background photo of the menu
let menubg;
// Master volume default at 50%
let masterVolume = 50;
// Threshold
let pinchThreshold = 60;
// Distance between thumb and index
let pinchd = 0;
// CyberPunk font
let cyberFont;
// Store the buttons
let btns = [];
// Store the hands
let hands = [];
// miniboss timer
let minibossTimer = 0;
// For ml5js, contains hand data
let handPose;
// Holds the value of the estimated x position from the Kalman filter
let smoothX = 0;
// Same as above but for y
let smoothY = 0;
// Kalman filter ratio
let kf;
// Timer before user can go menu
let gameOverTimer = 0;
// Sync level (0-100)
let syncLevel = 0;
// Last boss spawn
let lastBossMilestone = 0;
// Duration of the onboarding screen
let duration = 8000;
// Array to hold the drones
let drones = [];
// Timer to keep track of when to spawn drones
let spawnTimer = 0;
// Keep track when the boss is on screen
let bossMode = false;
// Variables to store music & sound effects
let syncmusic;
let game1music;
let game2music;
let onboardingmusic;
let breachedmusic;
let mainmenumusic;
// Holds all gameplay music to loop it
let gameplaymusic = [];
// Tracks which song in the gameplaymusic array is up next
let currentTrackIndex = 0;
// Keep track of how long the onboard screen has been going on for.
let onboardingStartTime = 0;
// Score of the current run
let score = 0;
// Store in browser memory or 0 if first time
let highscore = localStorage.getItem("breachHighscore") || 0;
// Draw cursor
function drawCursor(x, y) {
push();
fill(0, 255, 255);
noStroke();
ellipse(x, y, 20);
fill(255);
ellipse(x, y, 8);
pop();
}
Menu.js: This file draws the menu, putting our background image, and our 3 buttons (play, options, quit).
Option.js: This file draws the options page, which can be accessed by clicking the options button. There are 3 things you can change in options: the pinch threshold we talked about earlier, the Kalman filter smoothing (a latency vs. smoothness tradeoff), and finally the master volume of the game.
Play.js: This file contains the play page, where the background is drawn, the score is handled, and the spawning of the drones is done. The neat thing about the score system is that the saved high score persists across sessions: even if you close the game, re-open it, or even close your browser, your high score from any previous session will remain as long as you don’t clear your cookies and site data. This works because the score is stored in the browser’s localStorage, which keeps the data locally and permanently until it is deleted manually.
A normal drone spawns every 9 seconds, a mini boss drone will spawn every 20 seconds, and a boss drone will spawn every 1500 points.
This is all monitored by the function handleSpawning:
function handleSpawning() {
  if (!bossMode) {
    // Pause normal spawns once we are within 100 points of the next boss
    // milestone, so existing drones can clear before the boss arrives
    let nextThreshold = lastBossMilestone + 1500;
    if (score < nextThreshold - 100) {
      // Warning: Red pulse if Miniboss is 3 seconds away
      let nextMinibossTime = minibossTimer + 20000;
      if (millis() > 5000 && nextMinibossTime - millis() < 3000) {
        drawWarning("MINIBOSS INBOUND");
      }
      // Check for Miniboss spawn every 20 seconds, avoiding start of game
      if (millis() > 20000 && millis() - minibossTimer > 20000) {
        drones.push(new Drone("miniboss"));
        minibossTimer = millis();
      }
      // Spawn a drone when the game starts, then a normal drone every 9 seconds.
      if (spawnTimer === 0 || millis() - spawnTimer > 9000) {
        drones.push(new Drone("normal"));
        spawnTimer = millis();
      }
    }
    // Warning: Final Boss warning when close to the boss milestone
    if (score >= nextThreshold - 300 && score < nextThreshold) {
      drawWarning("CRITICAL SYSTEM BREACH DETECTED");
    }
    // Check for Final Boss trigger at the milestone
    // Ensure the screen is actually clear of other drones before spawning
    if (score >= nextThreshold && drones.length === 0) {
      bossMode = true;
      lastBossMilestone = nextThreshold;
      let finalBoss = new Drone("boss");
      finalBoss.x = width / 2; // Spawn at center
      drones.push(finalBoss);
    }
  }
}
When a mini boss or a boss is about to appear, red flashing lines will appear on the screen to warn the user of them being inbound:
Recognizer.js: This is open-source code I adapted which allows for symbol detection, as well as drawing and adding your own custom symbols. I edited the code slightly to delete every symbol I won’t be using, so that the detector doesn’t waste time claiming a drawn symbol is something that isn’t in the game, and I added 2 custom symbols, “W” and “S”.
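For context, the heart of the $1 recognizer is surprisingly small: resample each stroke to a fixed number of points, normalize scale and position, then score templates by average point-to-point distance. Here is a stripped-down sketch of that pipeline (my own simplification; it omits the rotation search and golden-section optimization the full algorithm performs):

```python
import math

def path_length(pts):
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def resample(pts, n=32):
    """Resample a stroke to n evenly spaced points along its length."""
    interval = path_length(pts) / (n - 1)
    pts = list(pts)
    out, acc = [pts[0]], 0.0
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if acc + d >= interval:
            # Interpolate a new point exactly one interval along the path
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)   # continue measuring from the new point
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:        # guard against floating-point shortfall
        out.append(pts[-1])
    return out[:n]

def normalize(pts, size=250):
    """Scale the stroke into a size x size box and center it on the origin."""
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    w = (max(xs) - min(xs)) or 1
    h = (max(ys) - min(ys)) or 1
    pts = [(p[0] * size / w, p[1] * size / h) for p in pts]
    cx = sum(p[0] for p in pts) / len(pts)
    cy = sum(p[1] for p in pts) / len(pts)
    return [(p[0] - cx, p[1] - cy) for p in pts]

def score(candidate, template):
    """Average point-to-point distance; lower means a closer match."""
    a = normalize(resample(candidate))
    b = normalize(resample(template))
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)
```

A lower score means a closer match, which the real library then converts into the confidence value it reports.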
Score.js: This screen pops up after you die and shows your final score, plus what to do to get back to the menu so that you can play again.
Splash.js: This is where the game begins; it allows for the initialization of everything. The game asks you to raise your hand and keep it raised while it “syncs” before moving to the onboarding screen.
Sprite.js: This file contains the code to handle each sprite sheet, split it up, and animate it so it can be used properly during gameplay.
// Slices a sheet into an array of images
function extractFrames(sheet, cols, rows) {
  let frames = [];
  let w = sheet.width / cols;
  let h = sheet.height / rows;
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      let img = sheet.get(x * w, y * h, w, h);
      frames.push(img);
    }
  }
  return frames;
}
// Draws and cycles through the frames
function drawAnimatedSprite(category, action, x, y, w, h, speed = 0.15, startFrame = 0) {
  if (animations[category] && animations[category][action]) {
    let frames = animations[category][action];
    let index;
    if (action === "death") {
      // Calculate frames passed since death began
      let elapsed = frameCount - startFrame;
      index = min(floor(elapsed * speed), frames.length - 1);
    } else {
      index = floor(frameCount * speed) % frames.length;
    }
    push();
    imageMode(CENTER);
    image(frames[index], x, y, w, h);
    pop();
  }
}
We provide the sheet image and how many columns and rows it has; the function slices the image accordingly so that each frame is extracted. Once all the frames are extracted, we can start drawing them with our second function, which loops through the frames using the formula:
index = floor(frameCount * speed) % frames.length;
The formula for death is different: when a drone dies, we want the animation to stop at the last frame, hence we use min, which acts as a clamp and forces the index to stop at the last frame of the animation and stay there, preventing it from looping back to the beginning.
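The difference between the looping idle index and the clamped death index is easy to verify with a quick stand-in (Python here, with frameCount replaced by a plain counter and a hypothetical 6-frame animation):

```python
# Stand-in for the p5.js indexing: frame_count replaces frameCount,
# and NUM_FRAMES / SPEED are hypothetical example values.
from math import floor

NUM_FRAMES = 6
SPEED = 0.15

def loop_index(frame_count):
    # Idle animation: wraps around forever.
    return floor(frame_count * SPEED) % NUM_FRAMES

def death_index(frame_count, start_frame):
    # Death animation: clamps at the final frame instead of wrapping.
    return min(floor((frame_count - start_frame) * SPEED), NUM_FRAMES - 1)
```

With these values, the loop index cycles 0 through 5 and starts over, while the death index climbs to 5 and stays there no matter how many frames pass.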
With all these separate files, we get a pretty clean sketch.js file that comes in just under 100 lines.
function preload() {
  // Variable declared in handTracking.js
  handPose = ml5.handPose(() => {
    modelReady = true;
  });
  menubg = loadImage("assets/menu.jpeg");
  cyberFont = loadFont("assets/Cyberpunk.ttf");
  syncmusic = loadSound("assets/sync.mp3");
  game1music = loadSound("assets/game1.mp3");
  game2music = loadSound("assets/game2.mp3");
  breachedmusic = loadSound("assets/breach.mp3");
  mainmenumusic = loadSound("assets/mainmenusoundtrack.mp3");
  onboardingmusic = loadSound("assets/onboarding.mp3");
  sheets.normalIdle = loadImage("assets/mobidle.png");
  sheets.normaldeath = loadImage("assets/mobdeath.png");
}
function setup() {
  createCanvas(windowWidth, windowHeight);
  recognizer = new DollarRecognizer();
  gameplaymusic = [game1music, game2music];
  let constraints = {
    video: { width: 640, height: 480 },
    audio: false,
  };
  animations.normal = {
    idle: extractFrames(sheets.normalIdle, 4, 1),
    death: extractFrames(sheets.normaldeath, 6, 1)
  };
  video = createCapture(constraints);
  video.hide();
  handPose.detectStart(video, gotHands);
  textFont(cyberFont);
  for (let track of gameplaymusic) {
    track.setVolume(0.2);
    track.playMode('untilDone');
  }
  if (state === "menu") {
    makeMenuButtons();
  }
}
function draw() {
  background(0);
  let { pointerX, pointerY, clicking, rawDist } = handTracking();
  if (state === "splash") {
    drawSplashScreen();
    if (hands.length > 0) drawHandIndicator(pointerX, pointerY, rawDist);
  } else if (state === "onboarding") {
    drawOnboarding();
  } else if (state === "menu") {
    menu();
    for (let btn of btns) {
      btn.update(pointerX, pointerY, clicking);
      btn.draw();
    }
  } else if (state === "play") {
    runGameplay(pointerX, pointerY, clicking);
  } else if (state === "gameover") {
    drawGameOver(pointerX, pointerY, clicking);
  } else if (state === "quit") {
    // Stop script and quit
    remove();
  } else if (state === "options") {
    drawOptions(pointerX, pointerY, clicking);
  }
  if (hands.length > 0 && state !== "onboarding") {
    drawCursor(pointerX, pointerY);
  }
}
function windowResized() {
  resizeCanvas(windowWidth, windowHeight);
  if (state === "menu") {
    makeMenuButtons();
  }
}
function mousePressed() {
  let fs = fullscreen();
  fullscreen(!fs);
}
I am pretty happy with how it turned out: all the interactions use only the camera, and I am happy with how the aesthetics of the game came out overall.
Reflection:
A lot of the errors I ran into stemmed from figuring out how I was going to get symbol recognition and smooth hand tracking, both of which I was able to resolve: the open-source recognizer code for the symbol recognition, and Kalman filtering for smooth hand tracking.
Improvements I think could be made: the general aesthetics of the game could be more detailed, and some more game modes could be added so that there is more variety.
The name of the game right now is Cyberpunk Breach (tentative), and as you can see in the demo above, I am going for a cyberpunk-themed game!
The gameplay is currently a work in progress; I have started on it, but there is no character sprite implementation as of yet. The concept is as follows:
I took inspiration from a game called Magic Touch, which is on the App Store. The gist of the game is that you are a wizard and you need to stop the robots attacking you; the way you do that is to pop the balloons the robots are using. These balloons have specific glyphs that you need to draw, and if you draw one correctly, the balloon containing that glyph will pop.
Now I have added my own twist to this. I am making it cyberpunk themed, with drones instead, and the biggest functional change is that this entire game does not use your keyboard or mouse. It is entirely based on hand tracking: you use your hand to navigate the menus and play the game.
Now there are multiple issues with this that I have tackled or am in the process of tackling.
The problem with hand tracking in browsers is that it is really, really, REALLY latent and jittery. Latency is a hard problem to fix since it is a browser issue, but the jitter I can fix. This is where Kalman filtering comes into play.
To explain the core concept:
The filtering has 3 steps:
– Predict
– Update
– Estimate
The Kalman filter works in a simple loop. First, it predicts what the system should look like next based on what it already knows. Then, it checks that prediction against a new (noisy) measurement and corrects itself.
Because of this, the Kalman filter has two main steps. The prediction step moves the current estimate forward in time and guesses how uncertain that estimate is. The correction step takes in a new measurement and uses it to adjust the prediction, giving a more accurate final estimate.
And finally, using a threshold, we can choose between using the estimated path, or the camera path.
Using this we can have pretty smooth hand tracking.
Next is the issue of recognizing gestures, and even adding my own custom gestures, which I solved using a library called the $1 Unistroke Recognizer.
Alternate sketch just to test out the library:
The library has built-in strokes, so for example if we try to draw a triangle, the algorithm guesses what we drew along with how confident it is:
You can also add your own custom gestures:
The tracking and the gesture recognition were what I was worried about before I got started on this project.
For the final stages of the game, I will need to work on the gameplay itself and the process of implementing all of this into the game.
Computer vision has always been something I’ve been interested in: I used it in my 3rd assignment and I am currently using it in my midterm project. The article has given me answers to questions I have had while working with computer vision.
So far I have really only worked with hands, and it got me really curious: how does the AI model what is a hand and what isn’t, to the point that it can assign so many key points to a single hand, knowing where each fingertip is, the middle, the base and so on? I know this article doesn’t fully answer this, but it gave me an idea of what exactly computer vision is. To a computer with no inherent context, anything it “sees” is just a bunch of pixels with absolutely no relation whatsoever. It relies on mathematical calculations to build its own context for what is happening and what is what. But that is just an abstract definition; honestly, the techniques provided seem to only work in really specific cases, and the author says there is no computer vision algorithm that is “completely” general.
I am going to have to disagree with that on the basis that it is not specific enough. Hand detection algorithms seem to work in almost any environment: they can detect when a hand is on screen or not, even multiple hands. Now, if we take a hand algorithm and complain that it won’t detect some other object in any environment? Of course it won’t. When we say general, we need some context for what general means! A lot of hand detection algorithms can be considered general at detecting hands, no matter the environment, for example.
There is a detection technique that I had to learn to improve my hand detection in the midterm project, called Kalman filtering. To briefly describe it: the algorithm predicts the location of what it is tracking in the next frame and compares that to what the location actually is, and depending on a threshold we give the algorithm, the visualization of the tracking will either follow our predicted calculation or the camera’s. It is an algorithm I found quite intuitive in how it works, and I have noticed a considerable difference in my hand tracking after implementing it.
Honestly, computer vision’s potential in interactive art is extremely untapped. I do not see many people implementing it besides a very few, and considering how accessible it is now, that is such a shame. We can have true interaction with our artwork if we have the computer make decisions based on what it sees, giving us a new piece not just every time the program is run, but every time the background or the person does something.
One of the reasons products fail in real life is over-engineering. I know this wasn’t explicitly mentioned in the chapter, but it fits the description of designing a product that solves a simple problem in a complicated way. However, besides designers following the perspective of an engineer, there is another factor at play: control. Many don’t realize it, but over-engineering is sometimes done on purpose to control people. It’s not that these companies don’t understand how humans work; rather, they understand exactly how they work. What do I mean by that?
Let us take printers as an example. Modern printers are so frustrating to work with, and I absolutely hate dealing with them: you have to download their specific app, then press some button 2 times for 5 seconds or some nonsense to turn on Bluetooth mode, and honestly half the buttons on the printer you will never end up using in your life. But that’s not all: you also need every cartridge filled to print anything.
Let’s say you want to print a document in black and white and you don’t have any colored ink cartridges; the printer won’t let you print the document unless everything is filled. Not to mention you need to use the printer brand’s specific cartridges, which are most probably overpriced. All this is done so that the customer keeps buying only their products for the printer. My venting aside, it is true that most of the time products are over-engineered because engineers don’t take the perspective of the average Joe. For the midterm project specifically, I plan on implementing proper feedback and instructions so that users feel in control the entire time and don’t have to second-guess anything they do while playing the game.