Reading Response Week 5: OpenAI Sora, Apple Vision Pro, Virtual Reality, and the Rise of Computer Vision

For this reading response I decided to take a different kind of approach: instead of acknowledging and rephrasing what is said in the original piece, I decided to look at the topic through a different pair of lenses.

Starting off, when we talk about computer vision and this interaction between computer systems and humans, it always comes across as some kind of new concept. One example is virtual reality and the new Apple Vision Pro headset (which is actually an augmented reality (AR) headset, but I don't want to get deeper into that). What if I told you that these concepts have actually been around since the 1970s? YES, THE 1970S, that is like 50 years ago!

To explore the concept further you can read the piece "What Should You Wear to an Artificial Reality?", but to summarize, the author describes the development of this artificial reality world starting in the 1970s with his exhibition called METAPLAY, which involved two people playing with a ball that was not even a real one (it was just a projection on a screen). That quickly escalated to his projects VIDEOPLACE, DIGITAL DRIVING, and CRITTER, which all built on this idea of connecting people through computer vision and managing interaction in spaces that don't really exist in real life.

On the other side, what I found interesting is the rise of AI systems in the past few years, specifically one that was announced just this past week: OpenAI Sora, an AI system that can supposedly make videos out of simple prompts, all without filming a single second. I am really interested in how this will affect the computer vision, film, and interactive media worlds.


Reading Reflection – Week 5

In this passage the author talks about computer vision, a technology that allows computers to interpret and understand visual information from the surrounding environment.

What is so cool about this is how computers can understand what they see through the use of computer vision, like movements and objects. Let's take the game "Limbo Time" as an example, where players use their hands and a computer tracks their movements to play the game. It's fascinating how simple techniques like movement tracking can create such intriguing interactive experiences.

Another example which fascinates me is "Messa di Voce," a performance where voices were transformed into images. It's crazy how voices and sounds themselves transform into images. As a musician, this art piece really caught my attention and showed me the range of possibilities that are achievable using computers.

Lastly, I found it interesting how computer vision is becoming more accessible in multimedia tools. The author talks about plug-ins for programs like Processing and Max/MSP/Jitter that let artists and designers easily add computer vision features to their projects. It's like having a toolbox full of cool gadgets that makes it easier to create interactive art or games, which could be useful for our future projects.

These examples show how artists use technology to make interactive projects in lots of different ways. In the age of artificial intelligence, it's cool to see how these early ideas helped make the tech world we have now; they are like the building blocks for the interactive designs we see all the time.


Midterm Process Update

My game is inspired by the Club Penguin Puffle Park games, specifically Puffle Roundup. The goal is to gather as many Puffles as possible into a designated area (the cage) in under 120 seconds. The more you gather, the more points you earn. The twist is that if you're not careful with the movement of the mouse, you could push a Puffle away, making it escape, which means you lose points.
The most complex parts of the project will be finding images/assets for the game, making the mouse interactivity smooth, and probably the timer and point count system. I did some research and found a tool (http://free-tex-packer.com/) that lets you pack your images together for use in the code, which would be very useful for the clock timer in my game.
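To get a head start on the timer and point system, here is a minimal sketch of how they could work in p5.js (placeholder logic, not my final game code):

let score = 0;
let startTime;

function setup() {
  createCanvas(600, 400);
  startTime = millis(); // remember when the round began
}

function draw() {
  background(220);
  const elapsed = floor((millis() - startTime) / 1000);
  const remaining = max(0, 120 - elapsed);

  text('Time: ' + remaining + 's    Score: ' + score, 10, 20);

  if (remaining === 0) {
    text('Time is up! Final score: ' + score, 10, 40);
    noLoop(); // freeze the sketch when the round ends
  }
}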

Assignment #5 – Progress on the midterm

For my midterm assignment I decided to go a little bit back in time to the Windows XP era, basically the Windows of my childhood. Just hearing the sounds of a Windows XP computer turning on brings back so many memories.

My midterm is going to be exactly that: a Windows XP emulator with a twist. It has errors all over it, and it gives us the famous Blue Screen of Death. Fun, isn't it? Let me show you my progress so far.

For the opening screen I decided to have a button which would be used to “turn on the computer”. It looks something like this:

Furthermore, when we click the button, the original Windows XP startup sound plays, and I have also added the original background and the My Computer icon:

Snippet of code where I load these elements upon a click:

if (mouseIsPressed === true) {
  noLoop();                // stop redrawing once the desktop is shown
  clickSound.play();       // button click sound
  windowsSound.play();     // Windows XP startup sound
  noTint();
  imageMode(CENTER);       // set before drawing so images are centered
  image(windowsImage, width / 2, height / 2, 400, 400);  // desktop background
  image(computerIcon, width / 10, height / 10, 60, 60);  // My Computer icon
}
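As a side note, the images and sounds referenced above would typically be loaded once in preload() so they are ready before the first click. A minimal sketch of that, assuming the p5.sound library and placeholder file names:

let clickSound, windowsSound, windowsImage, computerIcon;

function preload() {
  clickSound = loadSound('click.mp3');        // button click
  windowsSound = loadSound('xp-startup.mp3'); // Windows XP startup chime
  windowsImage = loadImage('xp-background.png');
  computerIcon = loadImage('my-computer.png');
}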


I know there is a long way to go, but for now you can enjoy the sketch using the window below.

Reading Reflection – Week 5

The reading "Computer Vision for Artists and Designers" discusses how computer vision is becoming more accessible to students and artists due to easier-to-use software and open-source communities. Out of the many projects showcased, I was really impressed (and slightly creeped out) by Rafael Lozano-Hemmer's installation Standards and Double Standards (2004), where belts were controlled by a computer vision-based tracking system, causing the buckles to rotate automatically to follow the public. It was really interesting to see this type of interaction that, in a way, isn't intentional, direct, or digital. However, when it came to the project Suicide Box by the Bureau of Inverse Technology (1996), where a motion-detection video system was utilized to record real data of suicides, I found that ethically concerning. You don't need to record such events to store data; yet, on the other hand, it might serve as a security measure for people who have presumably gone missing. It is a pretty controversial project, to say the least.

Field-of-view comparison of a conventional and a telecentric lens. Note the conventional lens's angular field of view and the telecentric lens's zero-angle field of view.

Furthermore, the reading discussed the different kinds of problems that vision algorithms have been developed to address and their basic mechanisms of operation, such as detecting motion, detecting presence, object tracking, and basic interactions, all of which the designers of C2's doors should have taken into account. Moreover, something new I came across is the term "telecentric lenses," lenses used to improve object recognition by maintaining constant magnification regardless of distance. Yet I found out that they are high in cost, large in size, and heavy, in addition to causing some distortion issues. So I wonder when it is appropriate to use them, or if it is smart to do so to begin with. All in all, this was a very interesting read that showed me that interaction can be more than just keyboard or screen based; rather, it's about innovative ways to bridge the two different worlds we live in! Last but not least, I wonder where the line is drawn when it comes to privacy and motion/facial detection. Have we as a society come to accept that we are being watched and listened to all the time, whether it's your phone's facial recognition or the immediate response after "Hey Siri!"?

Computer Vision Reading Response – Redha

The main point that stood out to me from this week's reading was the wide range of use cases surrounding computer vision.

To begin with, two artworks stood out to me for two varying reasons. Both artworks, however, expanded the scope of possibilities for me concerning the applications of computer vision within the context of art.

The first of these artworks is Rafael Lozano-Hemmer's Standards and Double Standards (2004). This work piqued my interest due to its incorporation of space and inanimate objects, which are activated with the help of computer vision. Personally, I find the overlap between the digital and the tangible to be an interesting area of focus, so this work immediately caught my attention for its symbolic repurposing of an everyday object, which is then given a sense of agency through programming supported by computer vision. Moreover, this work allowed me to consider the potential of using computer vision without requiring a visual output based on the data that the program is using. For example, in Krueger's Videoplace, the user can see a visualisation of the input that the computer vision system is receiving (their silhouette), and it becomes central to the work. Conversely, Standards and Double Standards makes use of the input internally in order to trigger another action. Finally, I definitely appreciated that this work does not feature a screen (!), as I feel that screens have become an overly predictable method of presenting interactive art.

Rafael Lozano-Hemmer, "Standards and Double Standards," 2004 on Vimeo

That being said, the next work that I have identified is Christian Moeller's Cheese (2003), an installation which solely presents screen-based work. While I do feel that this installation is an exception to the statement above (due to its bold imagery, its simple presentation, and the fact that the work itself is not interactive), what stood out to me was not the effectiveness of the work itself but the technical implications of the computer vision system that made the work possible. The reading mentions how sophisticated the computer vision system needed to be in order to recognise slight changes in emotion and provide a response (albeit a simple one). Considering the exponential development of technology, and the fact that the work was produced over two decades ago, one can't help but wonder what can be done with facial recognition technology today.

Cheese - Christian Moeller

This has led me to ponder what is possible with facial recognition technology (and computer vision as a whole) within the artistic space today. I was reminded of an installation produced in 2019, which I had looked at for another class, entitled Presence and Erasure by Random International. As part of my presentation on this work I discussed the concept of consent within interactive art and, as an Arab and a Muslim, I immediately recognised that such a work may not be able to exist in certain parts of the world (such as this one) as a result of social and cultural beliefs. Ultimately, going down this rabbit hole has led me to consider the endless possibilities we have with today's technology, but it has also helped me understand that just because you can pursue an idea does not always mean that you should.

RANDOM INTERNATIONAL

Raya Tabassum: Midterm Project Progress


Concept:
I'm trying to make a sort of "Super Mario"-style game with one player who walks along a path collecting gold coins; the game ends when the player collides with the enemy. The player has to jump (using the UP key) to collect coins, and there'll be sound incorporated with each jump and coin collection. When the game is over, a screen will say "Game Over," and if the player wants to play again they can press the SHIFT key to restart the game. There'll be a scoring system displayed on the screen too.

Difficulty/Challenges:
The challenge will be to move the sprite smoothly along the background. I want to design the background myself and make the player and enemy designs too, to make the game unique. I also expect that getting all the elements of the game to work together and respond to user interaction will be a challenge, so that the game runs properly.
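For the smooth movement, one common approach in p5.js is easing the sprite toward a target with lerp(). A minimal sketch of the idea (not the game's actual movement code):

let x = 0;
let targetX = 0;

function setup() {
  createCanvas(600, 400);
}

function draw() {
  background(220);
  targetX = mouseX;           // wherever the player points
  x = lerp(x, targetX, 0.1);  // move 10% of the remaining distance each frame
  circle(x, height / 2, 40);  // stand-in for the sprite
}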

Visualization:
I want my game screen to look like this (this is a preliminary design rather than a definite one, just drawn in Procreate, as I'm currently using dummy sprites to run the game code):


Coding:
There'll be a Player class, an Enemy class, and a Coin class. I've designed the basic code for collisions, etc. Here are some highlighted code snippets:

The Player class:

class Player {
  constructor() {
    this.playerYOnGround = 550;
    this.playerSize = 60;
    this.bgGroundHeight = 45;
    this.animationSlowDown = 8; // advance the walk cycle every 8 frames
    this.width = 1000;          // width of the game world
    this.jumpHeight = 0;
    this.jumpStrength = 0;
    this.jumpStrengthMax = 5;
    this.gravity = 0.1;
    this.jumping = false;
    this.playerImg = [];
    this.numberPlayerImg = 3;   // number of walk-cycle frames
    this.playerImgIndex = 0;
    // (these would ideally be loaded in preload())
    for (let i = 1; i <= this.numberPlayerImg; i++) {
      this.playerImg.push(loadImage(`guy-${i}.png`));
    }
  }

  initPlayer() {
    // start centered horizontally, standing on the ground
    xpos = (this.width * 0.5) - (this.playerSize * 0.5);
    ypos = this.playerYOnGround;
  }

  animatePlayer() {
    if (this.jumping) {
      // decay the upward strength and apply gravity each frame
      this.jumpStrength = (this.jumpStrength * 0.99) - this.gravity;
      this.jumpHeight += this.jumpStrength;
      if (this.jumpHeight <= 0) {
        // landed: reset the jump state
        this.jumping = false;
        this.jumpHeight = 0;
        this.jumpStrength = 0;
      }
    }

    ypos = this.playerYOnGround - this.jumpHeight;

    if (this.jumping) {
      image(this.playerImg[0], xpos, ypos); // hold the first frame while airborne
    } else {
      image(this.playerImg[this.playerImgIndex], xpos, ypos);
      if (frameCount % this.animationSlowDown === 0) {
        this.playerImgIndex = (this.playerImgIndex + 1) % this.numberPlayerImg;
      }
    }
  }
}

When the player collides with the enemy:

if (dist(this.enemyX, this.enemyY, xpos, ypos) <= (this.playerSize / 2 + this.enemySize / 2)) {
  win = false;
}

When the player collects a coin:

if (dist(this.coinX, this.coinY, xpos, ypos) <= (this.playerSize / 2 + this.coinSize / 2)) {
  this.initCoin();
  score += 10;
}
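To round out the picture, here is a minimal sketch of how the controls described in the concept (UP to jump, SHIFT to restart) could be handled; player, jumpSound, and resetGame() are placeholders for the actual game objects:

function keyPressed() {
  if (keyCode === UP_ARROW && !player.jumping) {
    player.jumping = true;
    player.jumpStrength = player.jumpStrengthMax; // from the Player class above
    jumpSound.play(); // assumes a sound loaded via p5.sound in preload()
  } else if (keyCode === SHIFT && !win) {
    resetGame(); // hypothetical helper that resets score, coins, and enemy
  }
}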

Reading Response: Computer Vision

I was interested in the Detecting Motion concept (Code Listing 1) in this week's reading. Frame differencing is a simple technique that may be used to detect and quantify the motion of humans (or other objects) within a video frame.
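Since the technique is simple, here is a minimal frame-differencing sketch in p5.js, in the spirit of the reading's code listing (my own sketch, not the original code). It sums the pixel differences between the current and previous webcam frames, so a large total means a lot of motion:

let video;
let prevFrame;

function setup() {
  createCanvas(320, 240);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  prevFrame = createImage(width, height); // stores the previous frame
}

function draw() {
  image(video, 0, 0);
  video.loadPixels();
  prevFrame.loadPixels();

  let movementSum = 0;
  for (let i = 0; i < video.pixels.length; i += 4) {
    // compare the red channel as a cheap brightness proxy
    movementSum += abs(video.pixels[i] - prevFrame.pixels[i]);
  }

  // remember this frame for the next comparison
  prevFrame.copy(video, 0, 0, width, height, 0, 0, width, height);

  fill(255);
  text('motion: ' + movementSum, 10, 20);
}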

The reading emphasizes how computer vision algorithms are not naturally capable of comprehending the subtleties of various physical surroundings, despite their advanced capacity to process and analyze visual input. The physical environments in which these algorithms function can greatly increase or decrease their effectiveness. The use of surface treatments like high contrast paints or controlled illumination like backlighting, for instance, is discussed in the reading as ways to enhance algorithmic resilience and performance. This suggests that the software and the real environment need to have a symbiotic connection in which they are both designed to enhance one another.

This concept reminded me of Rain Room. As I understand it, this installation uses motion sensors that act like computer vision, allowing people to move through a space where it is raining everywhere except where they are standing. The computer vision system's exact calibration with the physical environment is crucial to the immersive experience's success, because it allows the sensors to recognize human movement and stop the rain from falling on the people.

Pi Week 5 Midterm Progress: G-POET – Guitar-Powered Operatic Epic Tale

Concept

I am super jealous 😤 of people who are extremely good at one thing. Hayao Miyazaki doesn't have to think that much; he just keeps making animated movies... he's the greatest artist I look up to. Slash and Jimi Hendrix do not get confused; they just play guitar all the time, because they are the greatest musicians. Joseph Fourier did just mathematics and physics, with a little history on the side... but he's still a mathematician.

My lifelong problem is that I specialize in everything, in extreme depth. When you are a competent artist, engineer, musician, mathematician, roboticist, researcher, poet, game developer, filmmaker, and storyteller all at once, it's really, really hard to focus on one thing...

which is a problem that can be solved by doing EVERYTHING.

Hence, for my midterm, I am fully utilizing all of (well, a fraction of) my skills to create the most beautiful interactive narrative game ever executed in the history of the p5.js editor, where I can control the game by improvising on my guitar in real time. The story will be told in the form of a poem I wrote.

Ladies and gents, I present you G-POET – the Guitar-Powered Operatic Epic Tale 🎸 – a live performance with your host Pi.

This is not showing off. This is devotion, to prove my eternal loyalty and love of the arts! I don't even care about the grades. For me, art is a matter of life or death. The beauty and the story arc of this narrative should reflect my overflowing and exploding emotions and feelings for the arts.

Also, despite it being a super short game made using JavaScript, I want it to be on the same level as, if not better than, the most stunning cinematic 2D games ever made in history – titles like Ori and the Blind Forest, Forgotton Anne, and Hollow Knight. It took studios to produce that quality; I want to show what a one-man army can achieve in two weeks.

Design

I am saving the story for the hype, so below are sneak peeks of the bare minimum. It's an open world. No more spoilers.

(If the p5 sketch below loads, use the arrow keys to move left and right, and Space to jump.)

And below is my demonstration of controlling my game character with the guitar. If you think I can't play the guitar... no, no, no. Pi plays the guitar and narrates you the story; you interact with him and tell him, "Oh, I wanna go to that shop." And Pi will go, "Sure, so I eventually went to that shop...", improvise some tunes on the spot to accompany his narration, and the game character will do that thing in real time.

See? There's a human storyteller in the loop. If this is not interactive, I don't know what is.

People play games alone… This is pathetic.

~ Pi

💡 Because let's face it: people play games alone... and that is pathetic. My live performance using the G-POET system will bring back the vibes of a community hanging out around a bonfire, listening to stories told by a storyteller... the same experience, but on steroids, cranked up to 11.

To my knowledge, no such guitar-assisted, real-time interactive performance in a Ghibli-style game has been done before.

(In the video, you might notice that low-pitch notes cause the player to go right, and high-pitch notes make it go left. There is some noise, which I will filter out later.)

To plug my guitar into my p5.js editor, I wrote a C++ native app that calculates the frequency of my guitar notes through a Fast Fourier Transform and maps particular ranges of frequencies to key press events, which are propagated to the p5.js browser tab. A fraction of the C++ code that simulates the key presses:

char buffer[1024];
bool leftArrowPressed = false;
bool rightArrowPressed = false;
CGKeyCode leftArrowKeyCode = 0x7B;  // macOS key code for the left arrow
CGKeyCode rightArrowKeyCode = 0x7C; // macOS key code for the right arrow

while (true) {
    int n = recv(sockfd, buffer, sizeof(buffer) - 1, 0);
    if (n > 0) {
        buffer[n] = '\0';
        // qValue is the pitch estimate sent over the socket by the FFT app
        int qValue = std::stoi(buffer);

        if (qValue > 400 && qValue <= 700 && !leftArrowPressed) {
            std::cout << "Moving Left" << std::endl;
            simulateKeyPress(leftArrowKeyCode, true);
            leftArrowPressed = true;
            if (rightArrowPressed) {
                // release the opposite direction first
                std::cout << "Stop" << std::endl;
                simulateKeyPress(rightArrowKeyCode, false);
                rightArrowPressed = false;
            }
        } else if ((qValue <= 400 || qValue > 700) && leftArrowPressed) {
            std::cout << "Stop" << std::endl;
            simulateKeyPress(leftArrowKeyCode, false);
            leftArrowPressed = false;
        }

        // ... symmetric handling for the right arrow key ...
    }
}
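On the p5.js side, the simulated key events arrive as ordinary keyboard input, so the sketch only needs standard key polling. A minimal sketch ('player' is a placeholder for the actual game character):

function draw() {
  if (keyIsDown(LEFT_ARROW)) {
    player.moveLeft();  // triggered by the frequency ranges mapped in the C++ app
  } else if (keyIsDown(RIGHT_ARROW)) {
    player.moveRight();
  }
}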


Of course, I need the characters. Why would I browse the web for low-quality graphics (or ones which do not meet my art standards) when I can create my own graphics tailored specifically to this game?

So I created a sprite sheet of myself as a game character, in my leather jacket, with my red hair tie, leather boots, and sunglasses.

But is that not time-consuming? Not if you are lazy and automate it 🫵. You just model an Unreal MetaHuman of yourself, plug the FBX model into Mixamo to rig it, bring it into Unity, do wind and cloth simulation and animate, then apply non-photorealistic cel shading to give it a hand-drawn feel. Use the Unity Recorder to capture each animation frame, clean up the images a bit with ffmpeg, then assemble the sprite sheet in TexturePacker, and voilà... a quality sprite sheet of "your own" in half an hour.

Also, when improvising on the guitar during the storytelling performance, I need the game's background music to (1) be specifically tailored to my game and (2) follow a particular key and chord progression so that I can improvise on the spot in real time without messing up. Hence, I am composing the background music myself; below is a backing track from the game.

In terms of code, there are a lot (a lot!) of refactored classes I am implementing, including a data loader, a player state machine, animation controllers, a weather system, an NPC system, parallax scrolling, a UI system, a dialogue system, cinematic and cutscene systems, post-processing systems, and shader loaders. I will elaborate more in the actual report, but for now I will show an example of my sprite sheet loader class.

class TextureAtlasLoader {
  constructor(scene) {
    this.scene = scene; // the scene whose loader and animation manager we use
  }

  // Queue a texture atlas (image + JSON frame data) for loading.
  loadAtlas(key, textureURL, atlasURL) {
    this.scene.load.atlas(key, textureURL, atlasURL);
  }

  // Build a named animation from a range of frames in the atlas.
  createAnimation(key, atlasKey, animationDetails) {
    const frameNames = this.scene.anims.generateFrameNames(atlasKey, {
      start: animationDetails.start,
      end: animationDetails.end,
      zeroPad: animationDetails.zeroPad,
      prefix: animationDetails.prefix,
      suffix: animationDetails.suffix,
    });

    this.scene.anims.create({
      key: key,
      frames: frameNames,
      frameRate: animationDetails.frameRate,
      repeat: animationDetails.repeat,
    });
  }
}
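For context, here is a hypothetical usage of the loader above (the asset names and animation details are placeholders, not the actual game files):

const atlasLoader = new TextureAtlasLoader(this);
atlasLoader.loadAtlas('pi', 'pi-sprites.png', 'pi-sprites.json');
atlasLoader.createAnimation('pi-run', 'pi', {
  prefix: 'run_',
  suffix: '.png',
  start: 0,
  end: 5,
  zeroPad: 2,
  frameRate: 12,
  repeat: -1, // loop forever
});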

And I am also writing some of the GLSL fragment shaders myself, so that the look can be enhanced to match studio-quality games. An example of the in-game shaders is given below (this one creates a plasma texture overlay on the entire screen).

precision mediump float;

uniform float     uTime;
uniform vec2      uResolution;
uniform sampler2D uMainSampler;
varying vec2 outTexCoord;

#define MAX_ITER 4

void main( void )
{
    vec2 v_texCoord = gl_FragCoord.xy / uResolution;

    vec2 p =  v_texCoord * 8.0 - vec2(20.0);
    vec2 i = p;
    float c = 1.0;
    float inten = .05;

    for (int n = 0; n < MAX_ITER; n++)
    {
        float t = uTime * (1.0 - (3.0 / float(n+1)));

        i = p + vec2(cos(t - i.x) + sin(t + i.y),
        sin(t - i.y) + cos(t + i.x));

        c += 1.0/length(vec2(p.x / (sin(i.x+t)/inten),
        p.y / (cos(i.y+t)/inten)));
    }

    c /= float(MAX_ITER);
    c = 1.5 - sqrt(c);

    vec4 texColor = vec4(0.0, 0.01, 0.015, 1.0);

    texColor.rgb *= (1.0 / (1.0 - (c + 0.05)));
    vec4 pixel = texture2D(uMainSampler, outTexCoord);

    gl_FragColor = pixel + texColor;
}

Frightening / Challenging Aspects

Yes, there were a lot of frightening aspects. I frightened my computer by forcing it to do exactly what I want.

Challenges? Well, I just imagine what I want. In the name of my true and genuine love for the arts, God revealed to me, through the angels, all the code and skills required to turn my thoughts into reality.

Hence, the implementation of this project is like Ariana Grande's "7 Rings" lyrics.

I see it, I like it, I want it, I got it (Yep)

Risk Prevention

Nope, no risk. The project is complete, so I know there are no risks to be prevented; I am just showing a fraction of it because this is the midterm "progress" report.

Midterm Progress – Sara Al Mehairi

Concept

The game I am planning to create takes inspiration from "Diary of a Wimpy Kid," but with a twist of our university's culture. That said, I titled it "Diary of an NYUAD Kid," with each game representing a struggle or something we, as students, relate to. At first, the player will be introduced to a menu screen with options to play. To keep things from being too simple, I decided to gather a group of games, 4 to be specific (or 3.5, as one of them is a doodling notepad), and implement them to provide users with a variety of choices. All games will be single-player.

Design & Implementation

  1. The first game is a simple game of trying to avoid "the cheese touch," referenced from the movie. If you fail to avoid the cheese touch, you lose points, but if you succeed, you gain points. The goal here is to gather as many falcon points as possible.
  2. The second game is a memory game titled “Have We Met?” It aims to depict the struggle of being new to campus and meeting so many people, with each card representing a character.
  3. The third game is an elevator game, which I’m planning to title “Rush Hour” or “Elevator Rush.” The goal is to get as many students in the elevator as possible to prevent them from being late to class, inspired by the slow elevators on our campus, specifically C2.
  4. Finally, the fourth semi-game is a student's notebook where you can sketch or take notes and then save your sketch as a PNG to your laptop (see the sketch after this list).

With the vision set & the base code established, my next step is to digitally design the game aesthetics.
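As a rough proof of the notepad idea, here is a minimal p5.js sketch (not the actual base code): drag the mouse to draw, and press 's' to save the canvas as a PNG.

function setup() {
  createCanvas(400, 400);
  background(255);
}

function draw() {
  if (mouseIsPressed) {
    stroke(0);
    strokeWeight(3);
    line(pmouseX, pmouseY, mouseX, mouseY); // follow the mouse while dragging
  }
}

function keyPressed() {
  if (key === 's') {
    saveCanvas('my-notes', 'png'); // downloads the sketch as my-notes.png
  }
}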

Challenges

Although I have the base code designed for most of the games (with many bugs), I believe it will be a challenge to implement them all perfectly without errors. My goal is to complete all four games, which I know is ambitious, yet I have faith. Another challenge I expect is maintaining consistency while trying to recreate the same aesthetic as the sketches in the original "Diary of a Wimpy Kid," but with time and effort, I believe it is possible.