I think as a kid, for the longest time, I was obsessed with telekinesis. I had just watched Matilda, and although I didn’t know the name for it yet, I already knew how groundbreaking it was that Matilda could move things by the power of her mind, without touching anything. I would focus really hard on something like she did, and try to make it move (just a centimeter would have been enough), but it never worked.
I am happy, elated even, to announce that I think I am finally old enough to realize my childhood dreams. So for my midterm project, I will be building a computer vision game that lets you interact with the virtual world without a mouse or keyboard; your body becomes the "stdin". I am looking at classic games for inspiration, such as Fruit Ninja, or maybe the PlayStation EyeToy games mentioned in the reading.
I can use ml5.js + PoseNet for this project, and there are so many online resources for learning about them. Even Daniel Shiffman has PoseNet tutorials and code examples, so I hope this idea isn't too ambitious for a midterm project. I still haven't written any code, but this is how I imagine the final project structure will look:
1. The basic structure is built with p5.js
2. Using the keypoints that PoseNet (which comes pretrained) returns, I will train a classifier to recognize specific poses (such as throwing a punch, or dicing the fruit with a slash of the hand)
3. The data returned by the model will be used in the core p5.js code to alter the appearance of elements on screen. For example, if it is something like Fruit Ninja, I could use sprite sheets to animate the fruit being cut up.
4. Obviously, sound plays a big part in our reality, so to make the game more immersive, I would have to trigger particular sound effects based on the user’s actions.
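To get a feel for steps 1–3, here is a minimal sketch of what the detection side might look like, assuming the classic ml5.js PoseNet API (`ml5.poseNet(video, callback)` and its `'pose'` event). The `isSlash` helper and its speed threshold are my own guesses at how a slash gesture could be detected, not anything from the docs:

```javascript
// Hypothetical p5.js + ml5.js sketch: track the right wrist and treat a
// fast movement between frames as a "slash" gesture.

let video, poseNet;
let wrist = null;      // latest right-wrist position from PoseNet
let prevWrist = null;  // wrist position from the previous frame

// Pure helper: does the wrist movement between two frames count as a slash?
// p1/p2 are {x, y}; dt is elapsed time in seconds; the 800 px/s threshold
// is a made-up starting point I would tune by hand.
function isSlash(p1, p2, dt, threshold = 800) {
  if (!p1 || !p2 || dt <= 0) return false;
  const dx = p2.x - p1.x;
  const dy = p2.y - p1.y;
  const speed = Math.sqrt(dx * dx + dy * dy) / dt;
  return speed >= threshold;
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  // PoseNet is pretrained; the sketch only listens for its keypoints.
  poseNet = ml5.poseNet(video, () => console.log('PoseNet ready'));
  poseNet.on('pose', (poses) => {
    if (poses.length > 0) {
      prevWrist = wrist;
      wrist = poses[0].pose.rightWrist; // {x, y, confidence}
    }
  });
}

function draw() {
  image(video, 0, 0);
  // deltaTime is p5's milliseconds since the last frame.
  if (isSlash(prevWrist, wrist, deltaTime / 1000)) {
    // Here the core p5.js code would react: advance the fruit's
    // sprite-sheet frame, play a slicing sound effect, etc.
    fill(255, 0, 0);
    circle(wrist.x, wrist.y, 40);
  }
}
```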
I am a little wary of using the ML model and of having to train it. Everything else I could just look up and implement, but training an ML model seems a little daunting. I have worked with PoseNet before, but it was a much simpler task than the one I now plan to undertake.
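For the training step I am nervous about, my understanding from Shiffman's pose-classification tutorials is that ml5's `neuralNetwork` just takes the PoseNet keypoints flattened into a plain array of numbers. A rough sketch of that idea, where the class labels and the epoch count are placeholders I made up:

```javascript
// Hypothetical training setup: classify poses ("punch" vs "idle" as
// placeholder labels) from PoseNet keypoints using ml5.neuralNetwork.

// Pure helper: flatten PoseNet's 17 keypoints into a 34-number input array.
// Each keypoint has a position: {x, y}.
function keypointsToInputs(pose) {
  const inputs = [];
  for (const kp of pose.keypoints) {
    inputs.push(kp.position.x, kp.position.y);
  }
  return inputs;
}

// Everything below assumes the browser + ml5.js and is never invoked here.
function setupClassifier() {
  const brain = ml5.neuralNetwork({
    inputs: 34, // 17 keypoints x (x, y)
    outputs: ['punch', 'idle'], // placeholder labels
    task: 'classification',
  });

  // While collecting data, each detected pose goes in with its label:
  // brain.addData(keypointsToInputs(pose), { label: 'punch' });

  // Then normalize and train; 50 epochs is a guess to tune:
  // brain.normalizeData();
  // brain.train({ epochs: 50 }, () => console.log('done training'));

  // At play time, classify the live pose:
  // brain.classify(keypointsToInputs(pose), (err, results) => {
  //   if (!err) console.log(results[0].label, results[0].confidence);
  // });
  return brain;
}
```

If this holds up, the scary part is mostly recording enough labeled examples of each pose, not the training call itself.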