I have been grappling with the linkage between sound and image in the context of music performance with computers. The idea is to foster a sense of a world set apart from our own by linking movement, vision, and sound. In light of this goal, I have been investigating a number of computer vision techniques for extracting meaningful data from a live video stream. (Warning: this is technical.)
For some time, my use of OpenCV was limited to other people's implementations of object tracking with Haar classifiers and blob tracking. I am impressed with the robustness of the Haar tracker, but the technique is computationally expensive, and training a classifier to recognize a particular object can take a week. Blob tracking is good but limited, and a week ago I did not have enough knowledge to create solutions more robust than ones based on simple background subtraction.
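As a point of reference, that kind of background-subtraction baseline can be sketched in a few lines of NumPy. The function name and the threshold and adaptation constants below are illustrative choices of mine, not values from OpenCV:

```python
import numpy as np

def foreground_mask(frame, background, thresh=0.15, alpha=0.05):
    """Simple background subtraction with a running-average model.
    `thresh` and `alpha` are illustrative values, not tuned ones."""
    mask = np.abs(frame - background) > thresh   # pixels that differ enough
    # Slowly adapt the background toward the current frame
    background = (1 - alpha) * background + alpha * frame
    return mask, background

# Toy demo: a bright blob appears against an all-black background
background = np.zeros((20, 20))
frame = np.zeros((20, 20))
frame[5:10, 5:10] = 1.0
mask, background = foreground_mask(frame, background)
```

A blob tracker built on this would then group the `True` pixels of `mask` into connected regions and report their centroids; the limitation the text alludes to is that anything that changes the scene (lighting, camera motion) ends up in the mask.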
The first OpenCV function I found potentially useful in artistic contexts is calcOpticalFlowPyrLK(). Given two frames of video and a set of landmarks corresponding to points of interest in the first frame, calcOpticalFlowPyrLK() returns a second set of points representing where those landmarks are found in the second frame. Essentially, these points describe detected movement. I used these measurements of movement to create the above video.
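The core of what calcOpticalFlowPyrLK() computes is a Lucas-Kanade step (the real function adds image pyramids and iterative refinement on top of it). A single-level version can be sketched in plain NumPy; the function name, window size, and synthetic frames here are my own, for illustration:

```python
import numpy as np

def lucas_kanade_point(prev, curr, x, y, win=7):
    """Estimate the (dx, dy) motion of the point (x, y) between two
    grayscale frames with a single-level Lucas-Kanade step."""
    half = win // 2
    # Spatial gradients of the first frame (central differences)
    Ix = np.gradient(prev, axis=1)
    Iy = np.gradient(prev, axis=0)
    # Temporal gradient between the frames
    It = curr - prev
    # Gather the window of samples around the point
    ys = slice(y - half, y + half + 1)
    xs = slice(x - half, x + half + 1)
    A = np.stack([Ix[ys, xs].ravel(), Iy[ys, xs].ravel()], axis=1)
    b = -It[ys, xs].ravel()
    # Least-squares solution of Ix*dx + Iy*dy = -It over the window
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v  # (dx, dy)

# Demo on synthetic frames: a Gaussian bump shifted 1 px to the right
yy, xx = np.mgrid[0:40, 0:40].astype(float)
prev = np.exp(-((xx - 20) ** 2 + (yy - 20) ** 2) / 18.0)
curr = np.exp(-((xx - 21) ** 2 + (yy - 20) ** 2) / 18.0)
dx, dy = lucas_kanade_point(prev, curr, 20, 20)  # roughly (1, 0)
```

The single-level version only works for small displacements; this is exactly why the real function runs the same step on a pyramid of downscaled images, so that large motions become small at coarse scales.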
Having discovered the optical flow function, I wondered if I could find a method for tracking objects that was less expensive than the Haar classifier. I found this project report that describes a method of tracking objects using optical flow. I tried to implement something similar, using a moving mask to filter the landmark detector. This produced interesting results. Unfortunately, I had significant problems with features drifting and the tracker losing the object.
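The moving-mask idea can be sketched roughly as follows; `motion_mask`, its threshold, and its dilation radius are my own illustrative choices, not taken from the report:

```python
import numpy as np

def motion_mask(prev, curr, thresh=0.1, grow=3):
    """Binary mask of regions that moved between two grayscale frames,
    dilated so features near the motion survive the filter.  `thresh`
    and `grow` are illustrative parameters."""
    moved = np.abs(curr - prev) > thresh
    # Crude dilation: logical OR over a (2*grow+1)^2 neighborhood.
    # (np.roll wraps at the borders, which is fine for a sketch.)
    out = np.zeros_like(moved)
    for dy in range(-grow, grow + 1):
        for dx in range(-grow, grow + 1):
            out |= np.roll(np.roll(moved, dy, axis=0), dx, axis=1)
    return out

# Demo: a square moves 3 px right; one landmark sits on it, one is static
prev = np.zeros((30, 30))
curr = np.zeros((30, 30))
prev[5:10, 5:10] = 1.0
curr[5:10, 8:13] = 1.0
mask = motion_mask(prev, curr)
candidates = [(7, 7), (25, 25)]          # (row, col) landmark positions
kept = [p for p in candidates if mask[p]]  # only the moving landmark survives
```

With OpenCV itself, the same gating can be done by passing a mask of this shape as the `mask` argument of `goodFeaturesToTrack()`, so new features are only detected where motion was seen.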
I decided to try implementing Mathias Kölsch and Matthew Turk's "Flocks of Features" method for object tracking, as described here. To say the least, it was a learning experience. For the most part, I am not sufficiently experienced in computer vision and mathematics to implement things just by reading a paper. While the accompanying video shows some successful tracking, the flock can be seen drifting off the object.
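One piece of the method, the constraint that keeps the flock together, can be sketched as follows. This is my simplification: `enforce_flock` and its parameters are invented, and as I understand the paper, the full method also enforces a minimum pairwise distance between features and prefers respawn locations with a good color match rather than a random offset:

```python
import numpy as np

def enforce_flock(points, max_dist=20.0, rng=None):
    """One simplified 'flocking' step: features that stray farther than
    max_dist from the flock median are respawned near the median."""
    rng = rng or np.random.default_rng(0)
    pts = np.asarray(points, dtype=float)
    med = np.median(pts, axis=0)                 # robust flock center
    dist = np.linalg.norm(pts - med, axis=1)
    stray = dist > max_dist
    # Respawn strays at a small random offset from the median
    pts[stray] = med + rng.uniform(-3, 3, size=(stray.sum(), 2))
    return pts

# Demo: four features cluster near (50, 50); one has drifted far away
pts = enforce_flock([[48, 50], [50, 50], [52, 50], [50, 48], [200, 200]])
```

Using the median rather than the mean matters here: a single far-drifted feature barely moves the median, so the flock center stays on the object while the stray gets pulled back.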
While I am confident that I could develop a better implementation, the process of recording the video made me question my motives. I did not see a musical application for tracking the position of arbitrary objects. Generally, one's interaction with a musical instrument is more sophisticated than merely placing it in space. While crafting metaphors to musical instruments may be limiting, I can say that as a musician, I found merely moving objects around in 3D space less than inspiring.
This is, perhaps, one of my more successful experiments to date. In it, I use the optical flow algorithm to detect motion. Whenever the slightest amount of motion occurs, two things happen. One, a "galaxy" appears, moving at the detected rate of motion. Two, one of several pitches is given a little bit of volume. The speed of motion determines the pitch: faster movements trigger different pitches (not necessarily higher or lower ones). There is also a process of accumulation, so that the more movement there is at a particular speed, the louder the corresponding pitch becomes. (This is admittedly not all that evident.)
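The speed-to-pitch mapping can be sketched as follows. The pitch set, bucketing, and accumulation constants here are invented for illustration; they are not the values used in the piece:

```python
import numpy as np

PITCHES_HZ = [220.0, 277.2, 329.6, 415.3, 523.3]  # illustrative pitch set

def accumulate(volumes, speeds, gain=0.1, decay=0.95, max_speed=50.0):
    """Map each detected motion speed to one of several pitches and add
    a little volume there.  Faster motion selects a *different* pitch,
    not necessarily a higher one; repeated motion at the same speed
    makes that pitch louder, while everything else slowly fades."""
    volumes = np.asarray(volumes, dtype=float) * decay  # global fade-out
    for s in speeds:
        # Bucket the speed into one of the available pitches
        idx = min(int(len(PITCHES_HZ) * s / max_speed), len(PITCHES_HZ) - 1)
        volumes[idx] += gain
    return np.clip(volumes, 0.0, 1.0)

# Demo: three motion events at the same slow speed pile up on one pitch
vols = accumulate(np.zeros(len(PITCHES_HZ)), [5.0, 5.0, 5.0])
```

The `decay` term is what makes the accumulation audible as a process: a pitch only stays loud if motion at its speed keeps occurring.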
The process of crafting this taught me something important about the linkage between physical movement and music.
This particular implementation could be improved by detecting different kinds of movement and reflecting them in more highly differentiated sound processes. Differentiating movement has proved challenging. With the release of more refined sensors, this may become more easily achievable. Until then, the grind continues.