This is a list of project guides and their areas of interest for the 2018 WiMIR workshop. These folks will be leading the prototyping and early research investigations at the workshop. You can read about them and their work in detail below!
Rachel Bittner: MIR with Stems
The majority of digital audio exists as mono or stereo mixtures, and because of this MIR research has largely focused on estimating musical information (beats, chords, melody, etc.) from these polyphonic mixtures. However, stems (the individual components of a mixture) are becoming an increasingly common audio format. This project focuses on how MIR techniques could be adapted if stems were available for all music. Which MIR problems suddenly become more important? What information – that was previously difficult to estimate from mixtures – is now simple to estimate? What new questions can we ask about music that we couldn’t before? As part of the project, we will try to answer some of these questions and create demos that demonstrate our hypotheses.
Rachel is a Research Scientist at Spotify in New York City, and recently completed her Ph.D. at the Music and Audio Research Lab at New York University under Dr. Juan P. Bello. Previously, she was a research assistant at NASA Ames Research Center working with Durand Begault in the Advanced Controls and Displays Laboratory. She did her master’s degree in math at NYU’s Courant Institute, and her bachelor’s degree in music performance and math at UC Irvine. Her research interests are at the intersection of audio signal processing and machine learning, applied to musical audio. Her dissertation work applied machine learning to various types of fundamental frequency estimation.
Johanna Devaney: Cover Songs for Musical Performance Comparison and Musical Style Transfer
Cover versions of a song typically retain the basic musical material of the song being covered but may vary a great deal in their fidelity to other aspects of the original recording. Some covers differ only in minor ways, such as timing and dynamics, while others may use completely different instrumentation, performance techniques, or genre. This workshop will explore the potential of cover songs for studying musical performance and for performing musical style transfer. In contrast to making comparisons between different performances of different songs, cover songs provide a unique opportunity to evaluate differences in musical performance, both within and across genres. For musical style transfer, the stability of the musical material serves as an invariant representation, which allows for paired examples for training machine learning algorithms. The workshop will consider issues in dataset creation as well as metrics for evaluating performance similarity and style transfer.
Johanna is an Assistant Professor of Music Technology at Brooklyn College, City University of New York and the speciality chief editor for the Digital Musicology section of Frontiers in Digital Humanities. Previously she taught in the Music Technology program at NYU Steinhardt and the Music Theory and Cognition program at Ohio State University. Johanna completed her post-doc at the Center for New Music and Audio Technologies (CNMAT) at the University of California at Berkeley and her PhD in music technology at the Schulich School of Music of McGill University. She also holds an MPhil degree in music theory from Columbia University, as well as an MA in composition from York University in Toronto. Johanna’s research seeks to understand how humans engage with music, primarily through performance, with a particular focus on intonation in the singing voice, and how computers can be used to model and augment our understanding of this engagement.
Doug Eck: Building Collaborations Among Artists, Coders and Machine Learning
We propose to talk about challenges and future directions for building collaborations among artists, coders and machine learning researchers. The starting point is g.co/magenta. We’ve learned a lot about what works and (more importantly) what doesn’t work in building bridges across these areas. We’ll explore community building, UX/HCI issues, research directions, open source advocacy and the more general question of deciding what to focus on in such an open-ended, ill-defined domain. We hope that the session is useful even for people who don’t know of or don’t care about Magenta. In other words, we’ll use Magenta as a starting point for exploring these issues, but we don’t need to focus solely on that project.
Douglas Eck is a Principal Research Scientist at Google working in the areas of music, art and machine learning. Currently he is leading the Magenta Project, a Google Brain effort to generate music, video, images and text using deep learning and reinforcement learning. One of the primary goals of Magenta is to better understand how machine learning algorithms can learn to produce more compelling media based on feedback from artists, musicians and consumers. Before focusing on generative models for media, Doug worked in areas such as rhythm and meter perception, aspects of music performance, machine learning for large audio datasets and music recommendation for Google Play Music. He completed his PhD in Computer Science and Cognitive Science at Indiana University in 2000 and went on to a postdoctoral fellowship with Juergen Schmidhuber at IDSIA in Lugano, Switzerland. Before joining Google in 2010, Doug worked in Computer Science at the University of Montreal (MILA machine learning lab), where he became Associate Professor.
Ryan Groves: Discovering Emotion from Musical Segments
In this project, we’ll first survey the existing literature for research on detecting emotions from musical audio, and find relevant software tools and datasets to assist in the process. Then, we’ll try to formalize our own expertise in how musical emotion might be perceived, elicited and automatically evaluated from musical audio. The goal of the project will be to create a software service or tool that can take a musical audio segment that is shorter than a whole song, and detect the emotion from it.
Ryan Groves is an award-winning music researcher and veteran developer of intelligent music systems. He completed a Master’s in Music Technology at McGill University under Ichiro Fujinaga and has published in conference proceedings including Mathematics and Computation in Music, Musical Metacreation (ICCC & AIIDE), and ISMIR. In 2016, he won the Best Paper award at ISMIR for his paper on “Automatic melodic reduction using a supervised probabilistic context-free grammar”. He is currently the President and Chief Product Officer at Melodrive – an adaptive music generation system. Using cutting-edge artificial intelligence techniques, Melodrive allows any developer to automatically create and integrate a musical soundtrack into their game, virtual world or augmented reality system. With a strong technical background, extensive industry experience in R&D, and solid research footing in academia, Ryan is focused on delivering innovative and robust musical products.
Christine Ho, Oriol Nieto, & Kristi Schneck: Large-scale Karaoke Song Detection
We propose to investigate the problem of automatically identifying Karaoke tracks in a large music catalog. Karaoke songs are typically instrumental renditions of popular tracks, often including backing vocals in the mix, such that a live performer can sing on top of them. The automatic identification of such tracks would benefit not only the curation of large collections, but also their navigation and exploration. We challenge the participants to think about the types of classifiers we could use for this problem, what features would be ideal, and what dataset would be beneficial to the community, with a view to proposing this as a novel MIREX (MIR Evaluation eXchange) task in the near future.
Oriol Nieto is a Senior Scientist at Pandora. Prior to that, he defended his Ph.D. dissertation in the Music and Audio Research Lab at NYU, focusing on the automatic analysis of structure in music. He holds an M.A. in Music, Science and Technology from the Center for Computer Research in Music and Acoustics at Stanford University, an M.S. in Information Theories from the Music Technology Group at Pompeu Fabra University, and a Bachelor’s degree in Computer Science from the Polytechnic University of Catalonia. His research focuses on music information retrieval, large-scale recommendation systems, and machine learning, with special emphasis on deep architectures. Oriol plays guitar, violin, and sings (and screams) in his spare time.
Kristi Schneck is a Senior Scientist at Pandora, where she is leading several science initiatives on Pandora’s next-generation podcast recommendation system. She has driven the science work for a variety of applications, including concert recommendations and content management systems. Kristi holds a PhD in physics from Stanford University and dual bachelor’s degrees in physics and music from MIT.
Christine Ho is a scientist on Pandora’s content science team, where she works on detecting music spam and helps teams design their A/B experiments. Before joining Pandora, she completed her PhD in Statistics at the University of California, Berkeley and interned at Veracyte, a company focused on applying machine learning to genomic data to improve outcomes for patients with hard-to-diagnose diseases.
Xiao Hu: MIR for Mood Modulation: A Multidisciplinary Research Agenda
Mood modulation is a main reason behind people’s engagement with music, and the questions of how people use music to modulate mood and how MIR techniques and systems can facilitate this process continue to fascinate researchers in various related fields. In this workshop group, we will discuss how MIR researchers with diverse backgrounds and interests can participate in this broad direction of research. Engaging activities are designed to enable hands-on practice with multiple research methods and study designs (both qualitative and quantitative/computational). Through feedback from peers and the project guide, participants are expected to start developing a focused research agenda with theoretical, methodological and practical significance, based on their own strengths and interests. Participants from all disciplines and levels are welcome. Depending on the background and interests of the participants, a small new dataset will be prepared for fast prototyping of how MIR techniques and tools can help enhance this multidisciplinary research agenda.
Dr. Xiao Hu has been studying music mood recognition and MIR evaluation since 2006. Her research on affective interactions between music and users has been funded by the National Science Foundation of China and the Research Grants Council (RGC) of the Hong Kong S. A. R. Dr. Hu was a tutorial speaker at the ISMIR conferences in 2012 and 2016. Her papers have won several awards at international conferences and have been cited extensively. She has served as a conference co-chair (2014) and a program co-chair (2017 and 2018) for ISMIR, and as an editorial board member of TISMIR. She was on the Board of Directors of ISMIR from 2012 to 2017. Dr. Hu has a multidisciplinary background, holding a PhD degree in Library and Information Science, a Multi-disciplinary Certificate in Language and Speech Processing, a Master’s degree in Computer Science, a Master’s degree in Electrical Engineering, and a Bachelor’s degree in Electronics and Information Systems.
Anja Volk, Iris Yuping Ren, & Hendrik Vincent Koops: Modeling Repetition and Variation for MIR
Repetition and variation are fundamental principles in music. Accordingly, many MIR tasks are based on automatically detecting repeating units in music, such as repeating time intervals that establish the beat, repeating segments in pop songs that establish the chorus, or repeating patterns that constitute the most characteristic part of a composition. In many cases, repetitions are not literal, but subject to slight variations, which raises the question of what types of variation of a musical unit can reasonably be considered a re-occurrence of this unit. In this project we look into the computational modelling of rhythmic, melodic, and harmonic units, and the challenge of evaluating state-of-the-art computational models by comparing their output to human annotations. Specifically, for the MIR tasks of 1) automatic chord extraction from audio, and 2) repeated pattern discovery from symbolic data, we investigate how to obtain high-quality human annotations that account for different plausible interpretations of complex musical units. In this workshop we discuss different strategies for instructing annotators and undertake case studies on annotating patterns and chords on small data sets. We compare different annotations, jointly reflect on the rationales behind these annotations, develop novel ideas on how to set up annotation tasks, and discuss the implications for the computational modelling of these musical units for MIR.
Anja Volk holds master’s degrees in both Mathematics and Musicology, and a PhD from Humboldt University Berlin, Germany. Her area of specialization is the development and application of computational and mathematical models for music research. The results of her research have substantially contributed to areas such as music information retrieval, computational musicology, digital cultural heritage, music cognition, and mathematical music theory. In 2003 she was awarded a Postdoctoral Fellowship Award at the University of Southern California, and in 2006 she joined Utrecht University as a postdoc in the area of Music Information Retrieval. In 2010 she was awarded a highly prestigious NWO-VIDI grant from the Netherlands Organisation for Scientific Research, which allowed her to start her own research group. In 2016 she co-launched the international Women in MIR mentoring program, and in 2017 she co-organized the launch of the Transactions of the International Society for Music Information Retrieval; she is serving as Editor-in-Chief for the journal’s first term.
Dr. Cynthia C. S. Liem, MMus: Beyond the Fun: Can Music We Do Not Actively Like Still Have Personal Significance?
In today’s digital information society, music is typically perceived and framed as ‘mere entertainment’. However, historically, the significance of music to human practitioners and listeners has been much broader and more profound. Music has been used to emphasize social status, to express praise or protest, to accompany shared social experiences and activities, and to moderate activity, mood and self-established identity as a ‘technology of the self’. Yet our present-day music services (and their underlying Music Information Retrieval (MIR) technology) do not focus explicitly on fostering these broader effects: they may be hidden in existing user interaction data, but this data usually lacks sufficient context to tell for sure. As a controversial thought, music that is appropriate for the scenarios above may not necessarily need to be our favorite music, yet may still be of considerable personal value and significance to us. How can and should we deal with this in the context of MIR and recommendation? Could MIR systems then become the tools that surface such items, and thus create better user experiences that users could not have imagined themselves? What ethical and methodological considerations should we take into account when pursuing this? And, for technologists in need of quantifiable and measurable criteria of success, how should the impact of suggested items on users be measured in these types of scenarios? In this workshop, we will focus on discussing these questions from an interdisciplinary perspective, and jointly designing corresponding initial MIR experimental setups.
Cynthia Liem graduated in Computer Science at Delft University of Technology, and in Classical Piano Performance at the Royal Conservatoire in The Hague. Now an Assistant Professor at the Multimedia Computing Group of Delft University of Technology, her research focuses on music and multimedia search and recommendation, with special interest in fostering the discovery of content which is not trivially on users’ radars. She gained industrial experience at Bell Labs Netherlands, Philips Research and Google, was a recipient of multiple scholarships and awards (e.g. Lucent Global Science & Google Anita Borg Europe Memorial scholarships, Google European Doctoral Fellowship, NWO Veni) and is a 2018 Researcher-in-Residence at the National Library of The Netherlands. Always interested in discussion across disciplines, she also is co-editor of the Multidisciplinary Column of the ACM SIGMM Records. As a musician, she still has an active performing career, particularly with the (inter)nationally award-winning Magma Duo.
Matt McVicar: Creative Applications of MIR Data
In this workshop, you’ll explore the possibility of building creative tools using MIR data. You’ll discuss the abundance of existing data for creative applications, where “creative application” in the context of this workshop simply means “a human making something musical”. You, as a team, may come up with new product or research ideas based on your own backgrounds, or you may develop an idea from existing products or research papers. You may find that the data for your application exists already, so that you can spend the time in the workshop fleshing out the details of how your application will work. Otherwise, you may discover that the data for your task does not exist, in which case you, as a team, could start gathering these data or planning how to gather them.
Matt is Head of Research at Jukedeck. He began his PhD at the University of Bristol under the supervision of Tijl De Bie and finished it whilst on a Fulbright Scholarship at Columbia University in the City of New York with Dan Ellis. He then went on to work under Masataka Goto at the National Institute for Advanced Industrial Science and Technology in Tsukuba, Japan. Subsequently, he returned to Bristol to undertake a two-year grant. He joined Jukedeck in April 2016, and his main interests are the creative applications of MIR to domains such as algorithmic composition.