Recent Projects

Human-Interactive Optical Music Recognition (ongoing)

DLfM’17: Project Homepage
ISMIR’16: Poster, Project Homepage
HCML’16: Poster, Bird's-eye-view video
ISMIR’15: Poster, Project Homepage
Tutorial: Link
OMR proofreading is laborious if it is left entirely to humans. To improve the efficiency of OMR systems, we combine recognition and human proofreading into a single computational loop. Since our recognizers are highly constrained, a small amount of human instruction can yield significant improvements. We demonstrated in our project that Human-Directed OMR requires far fewer human operations than conventional notation systems.
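The loop can be sketched with a toy simulation (the recognizer below and all its names are hypothetical stand-ins for the real constrained OMR engine): the human corrects one low-confidence symbol per pass, the correction is locked as a hard constraint, and recognition is re-run until no errors remain.

```python
import random

def recognize(truth, locked, error_rate=0.3, seed=0):
    """Simulated constrained recognizer (a stand-in for the real OMR engine).

    Symbols the human has locked are kept verbatim; the rest are recognized
    with some chance of error, each paired with a confidence score.
    """
    rng = random.Random(seed)
    out = {}
    for i, sym in enumerate(truth):
        if i in locked:
            out[i] = (locked[i], 1.0)             # human-verified: fully trusted
        elif rng.random() < error_rate:
            out[i] = ("<wrong>", rng.uniform(0.2, 0.5))
        else:
            out[i] = (sym, rng.uniform(0.7, 1.0))
    return out

truth = list("abcdefgh")                          # toy ground-truth symbols
locked, ops = {}, 0
while True:
    result = recognize(truth, locked)
    errors = [i for i in result if result[i][0] != truth[i]]
    if not errors:
        break                                     # proofreading finished
    fix = min(errors, key=lambda i: result[i][1]) # correct the least confident
    locked[fix] = truth[fix]                      # one human operation
    ops += 1
```

Because each human operation becomes a hard constraint on re-recognition, the number of operations is bounded by the number of erroneous symbols rather than by the size of the score.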

Concatenative Instrumental Sound Synthesis

Project Homepage

HMM-RNN for Optical Music Recognition

Project Homepage

MIDI-Assisted Egocentric Optical Music Recognition

Dataset, Poster
An OMR framework for egocentric applications. The idea is to incorporate MIDI data into the OMR process to achieve better performance. Our experiments focused on static score images captured with Google Glass. The task is more challenging than offline OMR because egocentric images suffer from various degradations, such as noise, motion blur and distortion. We therefore employed methodologies different from traditional OMR that can adapt to this new application scenario.

Renotation from Optical Music Recognition (ongoing)

New Version:
Project Homepage
We propose a new model for renotation, formulated as a quadratic programming problem. The notation graph we construct contains three types of edges: desired-distance edges, alignment edges and conflict edges. The first type is a soft constraint expressed in the quadratic terms; the second is a hard linear equality constraint; and the third is a hard linear inequality constraint. The Mehrotra predictor-corrector interior-point method is applied for the optimization.
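The three edge types map directly onto a standard QP. A minimal sketch, using SciPy's SLSQP solver as a convenient stand-in for the Mehrotra interior-point method, and an invented five-symbol graph (two voices; all edge data is illustrative, not from the real system):

```python
import numpy as np
from scipy.optimize import minimize

# Toy notation graph over 5 symbol x-positions:
# voice one = symbols 0,1,2; voice two = symbols 3,4.
desired  = [(0, 1, 10.0), (1, 2, 10.0), (3, 4, 15.0)]  # soft: x[j]-x[i] ~ d
align    = [(0, 3), (2, 4)]                            # hard: x[i] == x[j]
conflict = [(3, 4, 12.0)]                              # hard: x[j]-x[i] >= g

def objective(x):
    # desired-distance edges become the quadratic terms
    return sum((x[j] - x[i] - d) ** 2 for i, j, d in desired)

cons = [{"type": "eq", "fun": lambda x: x[0]}]         # pin x[0]=0 (gauge)
cons += [{"type": "eq", "fun": lambda x, i=i, j=j: x[i] - x[j]}
         for i, j in align]                            # alignment edges
cons += [{"type": "ineq", "fun": lambda x, i=i, j=j, g=g: x[j] - x[i] - g}
         for i, j, g in conflict]                      # conflict edges

res = minimize(objective, np.zeros(5), method="SLSQP", constraints=cons)
x = res.x
```

The aligned voices must span the same width, so the solver compromises between the 20-unit spacing wanted by voice one and the 15-unit spacing wanted by voice two, while respecting the minimum conflict gap.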

Old Version:
Project Homepage
One of the most important post-OMR applications is renotation: arranging, formatting and rendering music symbols from recognition results. To this end, we constructed a connected primitive graph that bridges musical primitives with three types of edges (horizontal, vertical and conflict) according to the primitives' spatial relations. This graph facilitated the so-called "Force-Directed Rendering" of music notation. In the meantime, we applied dynamic programming to determine the line breaks for page layout.
The idea was demonstrated in our score-to-parts and automatic transposition experiments.
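The line-breaking step can be sketched as a classic dynamic program (the measure widths, target width and squared-slack cost below are illustrative assumptions, not the system's actual cost function):

```python
def line_breaks(widths, target):
    """Choose line breaks minimizing the squared slack of each line (DP)."""
    n = len(widths)
    INF = float("inf")
    # dp[i] = (best cost, index of last break) for the first i measures
    dp = [(0.0, -1)] + [(INF, -1)] * n
    for i in range(1, n + 1):
        line = 0.0
        for j in range(i, 0, -1):        # last line holds measures j-1 .. i-1
            line += widths[j - 1]
            if line > target and j < i:  # overfull (single measures allowed)
                break
            cost = dp[j - 1][0] + (target - line) ** 2
            if cost < dp[i][0]:
                dp[i] = (cost, j - 1)
    lines, i = [], n                     # recover the chosen breaks
    while i > 0:
        j = dp[i][1]
        lines.append(widths[j:i])
        i = j
    return lines[::-1]

widths = [3, 2, 4, 3, 2, 4]              # toy measure widths
lines = line_breaks(widths, target=7)
```

Because the cost of a line depends only on the measures it contains, the optimal global layout decomposes over break points, which is exactly what makes dynamic programming applicable here.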

Ceres Optical Music Recognition System

Project Homepage, 2015 Poster
OMR, also known as Music OCR, aims to convert scanned music scores into computer-readable formats. With OMR we can generate symbolic data from score images, which can then be analyzed, processed and played back.
The system uses multiple grammatically constrained graphical models to recognize different sorts of musical symbols. The symbol-level configuration is then explored through a conflict-resolution process, and primitive-level proofreading is used to correct recognition errors.

Musical Score Recognition via Scene Understanding

Poster (Excellent Project Award, SOIC Robotics Open House, 2014)
This was my second-year independent study project in IU's Computer Vision Lab. Dr. Kun Duan and Prof. David Crandall offered many insightful suggestions for this project. We built a two-layer holistic scene model (a CRF) to represent monophonic music scores at the measure level: each measure is decomposed into symbols, and each symbol into parts. The part models were trained with a HOG feature extractor and a linear SVM, while the structural parameters were learned with a structured SVM.
Pros: tree structure, very fast inference.
Cons: unable to recover from failures of lower-level detection.
Possible remedy: bring a human into the computational loop.
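The "very fast inference" pro follows from the tree structure: exact MAP inference reduces to one bottom-up max-sum dynamic-programming pass, linear in the number of nodes. A minimal sketch with made-up scores (not the actual measure/symbol/part model):

```python
def map_inference(children, unary, pairwise, root=0):
    """Exact MAP inference on a tree-structured model via max-sum DP.

    children[v]            -> list of v's children
    unary[v][s]            -> local score of node v in state s
    pairwise[(u, v)][s][t] -> score of parent state s with child state t
    """
    K = len(unary[root])
    # post-order traversal = reverse of an iterative pre-order
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    f, bt = {}, {}          # f[v][s]: best subtree score; bt: backpointers
    for v in reversed(order):
        f[v] = list(unary[v])
        for c in children.get(v, []):
            bt[(v, c)] = []
            for s in range(K):
                score, best_t = max((pairwise[(v, c)][s][t] + f[c][t], t)
                                    for t in range(K))
                f[v][s] += score
                bt[(v, c)].append(best_t)

    states = {root: max(range(K), key=lambda s: f[root][s])}
    stack = [root]
    while stack:            # top-down backtracking of the optimal states
        v = stack.pop()
        for c in children.get(v, []):
            states[c] = bt[(v, c)][states[v]]
            stack.append(c)
    return max(f[root]), states

children = {0: [1, 2], 1: [], 2: []}               # tiny 3-node tree
unary = {0: [1, 0], 1: [0, 2], 2: [2, 0]}
pairwise = {(0, 1): [[1, 0], [0, 1]], (0, 2): [[1, 0], [0, 1]]}
score, states = map_inference(children, unary, pairwise)
```

The same pass also explains the listed con: if a part detector at the leaves scores the true state poorly, the error propagates upward and nothing in the tree can overrule it, which motivates putting a human into the loop.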