IDEAS FOR MUSIC INFORMATICS INDEPENDENT PROJECTS - 10 Jan. 2007
(DAB = Don Byrd)

Here are some ideas for semester projects for the courses I teach. NB: I don't claim they're all _good_ ideas, though I haven't included bad ones intentionally! Also, a few of these are much more appropriate for one of my courses than for the other, so I reserve the right to reject even these projects in particular cases. I'd be happy to consider anything else you think is relevant to the course; in fact, it's better in some ways if students make up their own projects. Please feel free to discuss any of this with me at any time.

"*" in front of an item = brand-new or recently changed in a major way. "!" indicates projects I'd especially like to see someone work on.

Requiring programming...

!- Write a simple program to compute the similarity between two melodies, rhythm patterns, chord progressions, or even complete pieces, represented in some symbolic form (such a program could be used as the basis for a music-retrieval program). [topic: symbolic retrieval]
- Attempt to improve NightingaleSearch's music searching and evaluate the results in some way, preferably by MIREX or standard TREC (Cranfield model) methods. [topic: symbolic retrieval]
- Build a database (e.g., from MIDI files, or from an existing collection like CCARH's) of at least 500 music "documents"; build a suitable set of queries; and/or investigate how the choice of search parameters affects the results, as evaluated by MIREX or TREC methods. [topic: symbolic retrieval]
- Same as above, but using one of the MIREX tasks, with M2K/D2K or any other software.
- Use M2K/D2K to investigate any music-representation or searching question.

Could be done with or without programming...

- Extend OMRAS harmonic distributions, as used in the "OMRAS Audio-degraded Music IR Experiment" (cf.
the ISMIR 2002 paper by Pickens et al.; reference in the list of publications on my website)
!- Use NightingaleSearch to study something appropriate to some collection of music, e.g., to confirm or refute accepted wisdom about that music. The 24 Preludes and 24 Fugues of Bach's Well-Tempered Clavier and a relative handful of other pieces already exist in NightingaleSearch format; otherwise, there are utilities to convert music from some other forms (e.g., MuseData) to its format, though they may not work very well. [topic: music analysis]
!- Same as above, but using the Humdrum Toolkit; far more music is already available in a format Humdrum can use. [topic: music analysis]
- Investigate converting between representations of the same type (e.g., one type of notation to another); optionally write a program to implement your ideas. [topic: music representation]
- Investigate converting between representations of different types (e.g., notation to MIDI); optionally write a program to implement your ideas. [topic: music representation]
!- Most music-IR research to date has been on tonal, functional-harmony Western music; investigate how it could be extended to other music(s).
!- It's obvious that automated Schenkerian analysis, even going just two or three levels down from the surface, would be incredibly valuable for music IR; however, a general, style-independent solution is probably not possible. But how about with restrictions -- e.g., only Anglo-American folksongs or 12-bar blues? Cf. "controllers" in David Cope's EMI system. [topic: music analysis, music perception, cognitive science]
!- Adapt Steve Larson's theory of musical forces to recognize similarity between melodies or even complete polyphonic pieces of music. [topic: music analysis]
!- Devise a way to test Steve Larson's theory of musical forces with a database of melodies.
[topic: music analysis]
- Investigate clustering musical documents on whatever basis; this could be very useful for visualization, recommender or improvisation systems, etc. Cf. several papers from ISMIR and elsewhere, and techniques like Kohonen maps and spring embedding.
!- Investigate user-interface issues in music searching, either content-based or bibliographic. One option would be to actually design a user interface.
- Follow up on the ISMIR 2000 Mozart Variations survey: do it more scientifically, or at least investigate how that could be done, preferably by designing a valid experiment. [topic: relevance judgments]
!- DAB's Extremes of CMN list (on my website) is interesting, but _distributions_ showing how often various values of one or more of the features (e.g., written pitch or duration, or just number of augmentation dots!) occur in a significant body of music would be much more revealing; such distributions could be useful in statistical authorship studies, for example. Compute distributions for some of the items in the list. For a music collection, you could use the CCARH database (http://www.ccarh.org/), with kern data (http://kern.humdrum.net/) accessed via the Humdrum toolkit, or with MusicXML data (available from me) accessed via a program of your own. In any case, the programming part of this is relatively easy.
- Work on any of the topics listed in a recent ISMIR Call for Papers (http://www.ismir.net/).
- Investigate methods for finding music that is unplayable. Playability is usually a very subjective thing, but published music that is _clearly_ unplayable exists. For example, a Scriabin piano sonata includes a note that's above the range of any piano, and I believe that in one of his symphonies Beethoven asks the violins to play below their lowest note.
- The "V2V" offshoot of the Variations2 project was an attempt to combine content-based and metadata-based searching.
Investigate further how the two forms of searching could be combined from any standpoint: user interface, ranking, etc.
!- Investigate the "Mickey Mouse Club theme" problem: to what extent is a music-searching program likely to find matches in inner voices that are of little or no interest because they're completely inaudible? (The answer may well depend on whether the program knows about the voices and does not look for matches that cross voices: see the "disastrous loss of precision" idea below.)
!- Investigate what it would take to identify 12-bar blues in a collection of, say, MIDI files or Humdrum/kern files, and try out your technique.
!- Investigate the extent to which a performer's chosen medium influences their perception of music. For example, do tuba, bassoon, and double-bass players tend to hear lower lines as more salient than flutists or violinists do? How about basses vs. sopranos? What about salience of rhythm vs. pitch, e.g., for drummers vs. other musicians?
- Byrd & Crawford (2002) speculate on the disastrous loss of precision they believe would result from taking "matches" that cross voices as seriously as those that stay within a voice, without considering the audibility of the matches. Investigate and produce evidence one way or the other.
!- Study a widely-used existing style-genre classification, e.g., that of All Music Guide, iTunes, Amazon.com, etc. Describe in some detail how it could be implemented by computer. Optionally, implement and test part of it, probably with a symbolic representation (audio is probably too difficult to do anything with in a semester). [topic: music classification]
- Study "national" style classification from either audio or CMN. What features that a program might really be able to identify make music sound French, Slavic, American, etc.? [topic: music classification]
- Propose a new task for MIREX. Why is this task significant? How could entries be evaluated?
[topic: evaluation]

Almost certainly _not_ involving programming...

- Walter Hewlett and DAB have found previously-unknown instances of the famous "B A C H" motive in the music of Bach and Douglas Hofstadter, respectively. DAB used NightingaleSearch. Use any other music-searching technology to find anything interesting in any database (the CCARH database is a good one for this purpose).
!- Investigate a basis for ranking music documents in search results. With music as with text, this is normally done by similarity, and justified via "relevance". But are these the best concepts for ranking music?
- Extend or otherwise significantly improve DAB's table of candidate music-IR testbed databases (on my website).
!- Extend DAB's comparison of music to text, images, etc. by adding other media, more details, or both.
- Test/compare existing audio music recognition programs; compare them to optical music recognition programs. Cf. www.music-notation.info/en/compmus/audio2midi.html .
- Compare in detail two or more music-notation encoding systems. This could be based on a table Natalia Minibayeva did several years ago, or could be completely new work. [topic: music representation]
- Investigate and compare existing metadata formats for music, and/or design a new one.
!- Extend, improve, or just evaluate DAB and Eric Isaacson's music-representation requirements specification (created for Variations2).
!- Annotate or otherwise significantly improve DAB's music-IR bibliography (on my website). When I taught music IR in 2003, someone annotated 50 entries in terms of how useful they would likely be to someone in the class; more of that would be worthwhile, or more of the type of annotations some entries have now.
- Compare classifications of music representations, e.g., DAB's, Selfridge-Field's, Castan's, Wiggins'; perhaps propose a new classification. [topic: music representation]
- Study how MIREX works.
Compare it to similar undertakings in other domains (TREC for text IR, the standard speech-recognition and question-answering tests, etc.). How could MIREX be improved? [topic: evaluation]
!- List and discuss several of what you consider the most important unsolved problems of music IR. (I'll be glad to tell you what I think some of these problems are, but you're welcome to choose your own.)
!- In Sept. 2005, a former director of engineering for All Music Guide said, in so many words, that programs that do automatic genre classification from audio are probably finding _something_, and something useful, but it may not be genres as people understand them. Investigate and report on the accuracy of this statement. [topic: music classification]
!- There is very little agreement among existing style-genre classifications: the number of categories varies wildly (All Music Guide has 34, iTunes 37, Amazon.com 23, etc.) -- and even those numbers overestimate the agreement, since they're not all "flat" lists, and some confuse styles and forms. Compare at least three existing classifications, and comment on which seems most practical for computer implementation and why. [topic: music classification]
!- Study existing sets of relevance judgments for music and/or create a new set. [topic: relevance judgments]
!- Find a small number of pieces of music (at least five or six), each of which is, in your opinion, as different as possible from all the others. Once you've chosen the pieces, either justify or refute your claim that each is as different as possible from the others on a basis that's as objective as possible, most likely a survey of listeners. Better, use such a measure to find the pieces in the first place. In either case, "as objective as possible" is not likely to be very objective: discuss the inherent limitations of objectivity here. [topic: music classification]
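
A possible starting point for the melody-similarity project listed under "Requiring programming": the Python sketch below compares two melodies by the edit distance between their pitch-interval sequences, so that a transposed copy of a melody counts as identical. This is only one of many possible measures, not a recommended one. The function names, the use of MIDI note numbers as input, the unit edit costs, and the normalization by the longer sequence are all assumptions made for illustration; rhythm is ignored entirely.

```python
def intervals(pitches):
    """Convert MIDI note numbers to successive pitch intervals in semitones."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def edit_distance(a, b):
    """Levenshtein distance between two sequences, with unit costs (an assumption)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def melody_similarity(p1, p2):
    """Return a score in [0, 1]; 1.0 means identical interval sequences."""
    i1, i2 = intervals(p1), intervals(p2)
    longest = max(len(i1), len(i2))
    if longest == 0:
        return 1.0  # two single-note (or empty) melodies have no intervals to compare
    return 1.0 - edit_distance(i1, i2) / longest


# Hypothetical example melodies, given as MIDI note numbers.
theme = [60, 62, 64, 65, 67]
transposed = [67, 69, 71, 72, 74]   # same contour, a fifth higher
variant = [60, 62, 64, 67, 67]      # one note changed

print(melody_similarity(theme, transposed))  # 1.0
print(melody_similarity(theme, variant))     # 0.5
```

Working on intervals rather than absolute pitches is a design choice: it buys transposition invariance at the price of making a single changed note affect two intervals, which is why the one-note variant above scores only 0.5.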