Week 5 Reading – Computer Vision

Computer Vision has always been something I’ve been interested in, I used it in my 3rd assignment and I am currently using it in my midterm project. The article has given me answers to questions I have had while working with computer vision.

So far I have really only worked with hands, and it got me really curious, how does the AI model what is a hand and what isn’t, to the point it can assign so many key points to a single hand, it knows where the each finger tip is, the middle, the base and so on. And I know this article doesn’t fully answer this, but it gave me an idea to what exactly is computer vision. To a computer with no inherent context, anything it “sees” is just a bunch of no pixels with absolutely no relation whatsoever. It relies on mathematical calculations to make it’s own to context to what is happening and what is what.  But that is just an abstract definition, honestly the techniques provided seem to only work in really specific cases, the author says there is no computer vision algorithm which is “completely” general.

I am going to have to disagree with that on the basis that this is not specific enough. Hand detection algorithms seem to work in almost any environment. It is able to detect when a hand is on screen or not, even multiple hands. Now if we take a hand algorithm and say that this algorithm wont detect this object in any environment? Of course it won’t. When we say general we need some sort of context to what general is! A lot of hand detection algorithms can be considered general in detecting hands no matter the environment for example.

There is a detection technique that I had to learn to improve my hand detection in the midterm project, and it is called Kalman filtering. To briefly describe this technique, the algorithm tries to predict the location of what it is tracking in the next frame, an compares it to the what the location actually is, and depending on a threshold we give this algorithm, the visualization of this tracking will either follow our predicted calculation, or the camera’s calculation. And this is an algorithm which I found to be quite intuitive in how it works, and I have noticed considerable difference in my hand tracking after implementation.

Honestly computer vision’s potential in interactive art is so extremely untapped. I do not see many people implementing it besides very few, and considering how accessible it is now to implement, it is such a shame We can have true interaction with our art work if we have the computer make decisions based on what it sees, giving us a new piece, not just every time the program is ran, but every time the background or the person does something.

Leave a Reply