Observing Pianist Accuracy and Form with Computer Vision
Jangwon Lee, Bardia Doosti, Yupeng Gu, David Cartledge, David J. Crandall, Christopher Raphael
IEEE Winter Conference on Applications of Computer Vision (WACV) 2019
Abstract: We present a first step towards developing an interactive piano tutoring system that can observe a student playing the piano and give feedback about hand movements and musical accuracy. In particular, we have two primary aims: (1) to determine which notes on a piano are being played at any moment in time, and (2) to identify which finger is pressing each note. We introduce a novel two-stream convolutional neural network that takes video and audio inputs together for detecting pressed notes and fingerings. We formulate our two problems in terms of multi-task learning and extend a state-of-the-art object detection model to incorporate both audio and visual features. We also introduce a technique for identifying fingerings when the pressed piano keys are already known. We evaluate our techniques on a new dataset of multiple people playing several pieces of different difficulties on an ordinary piano.
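
To make the second aim concrete, here is a minimal sketch of what "identifying fingerings when the pressed keys are already known" could look like. It is not the technique from the paper, which the abstract does not describe. It is a simple nearest-fingertip baseline, and it assumes that the horizontal center of each pressed key and the image position of each fingertip are already available (for example from some hand pose estimator). The function name `assign_fingers`, the input layout, and all coordinates are hypothetical.

```python
from typing import Dict, Tuple


def assign_fingers(
    pressed_keys: Dict[str, float],
    fingertips: Dict[str, Tuple[float, float]],
) -> Dict[str, str]:
    """Assign each pressed key to the horizontally nearest fingertip.

    pressed_keys: key name -> x coordinate of the key's center (pixels).
    fingertips:   finger name -> (x, y) fingertip position (pixels).

    Hypothetical baseline for illustration only, not the paper's method.
    """
    assignment = {}
    for key_name, key_x in pressed_keys.items():
        # Pick the finger whose tip is closest to the key center along x.
        nearest = min(fingertips, key=lambda f: abs(fingertips[f][0] - key_x))
        assignment[key_name] = nearest
    return assignment


# Made-up coordinates for one right hand in a single video frame.
pressed = {"C4": 100.0, "E4": 146.0}
tips = {
    "thumb": (98.0, 210.0),
    "index": (124.0, 190.0),
    "middle": (149.0, 185.0),
    "ring": (171.0, 190.0),
    "pinky": (192.0, 205.0),
}
print(assign_fingers(pressed, tips))  # {'C4': 'thumb', 'E4': 'middle'}
```

A baseline like this ignores depth, occlusion, and the possibility that two keys map to the same finger, which is part of why a learned model that combines audio and visual features is attractive for this problem.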