Reading the essay Computer Vision for Artists and Designers made me realize how differently computers and humans actually “see.” Our eyes and brains process the world in ways that feel natural: we recognize faces instantly, understand depth, guess intentions from gestures, and fill in missing details without even noticing. Computers, on the other hand, don’t have that intuitive grasp. They just see pixels and patterns. A shadow or a little blur can confuse them. Where we understand context, like knowing a cat is still a cat even if half hidden, computers rely on strict rules or training data, and they often fail when something doesn’t match what they’ve been taught to expect.
To bridge that gap, a lot of effort goes into helping machines track what we want them to notice. Instead of raw pixels, we give them features: edges, colors, corners, or textures. Algorithms can then use those features to keep track of an object as it moves. More recently, deep learning has allowed computers to learn patterns themselves, so they can recognize faces or bodies in a way that feels closer to human intuition (though still fragile). Sometimes, extra sensors like depth cameras or infrared are added to give more reliable information. It’s almost like building a whole toolkit around vision just to get machines to do what we take for granted with a single glance.
Thinking about how this plays into interactive art is both exciting and a little unsettling. On one hand, the ability to track people makes art installations much more engaging — an artwork can respond to where you’re standing, how you move, or even who you are (as I observed in TeamLab). That creates playful, immersive experiences that wouldn’t be possible without computer vision. But the same technology that enables this interactivity also raises questions about surveillance. If art can “see” you, then it’s also observing and recording in ways that feel uncomfortably close to security cameras. I think this tension is part of what makes computer vision so interesting in art: it’s not just about making something interactive, but also about asking us to reflect on how much we’re being watched.