WiMIR Workshop 2019 Project Guides


This is a list of Project Guides and their areas of interest for the 2019 WiMIR Workshop, which will take place on Sunday, 3rd November 2019 as a satellite event of ISMIR2019.  These folks will be leading the prototyping and early research investigations at the workshop.

This year’s Workshop is organized by Blair Kaneshiro (Stanford University), Katherine M. Kinnaird (Smith College), Thor Kell (Spotify), and Jordan B. L. Smith (Queen Mary University of London) and is made possible by generous sponsorship from Pandora and Spotify.

Planning to attend the WiMIR Workshop?
Read about the Project Guides and their work in detail below, and sign up to attend at https://forms.gle/mCEod8AvtqnBcMJz7

 

Amelie Anglade

Ryan Groves

Amélie Anglade and Ryan Groves:  Auto-BeatSaber: Generating New Content for VR Music Games

In this workshop we will dive into a specific problem at the intersection of music, gaming, and dance: the generation of a BeatSaber song level. BeatSaber is one of the most popular VR titles, in which the core of the gameplay is to rhythmically slice incoming boxes with light sabers to the sound of the beat. The game has sparked a huge community of modders who create their own choreographies to existing songs, as well as MIR-based tools (such as a MIDI converter or a BPM estimator) designed specifically to support level creation. The central question of this workshop will be: how could we use machine learning to generate these choreographies automatically? Participants will have the opportunity to learn from our experience as Data Science & MIR consultants as we share our own structured process for problem-solving in the music tech industry.
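For intuition, here is a deliberately naive baseline sketch (not necessarily the approach the workshop will take): beat-track the audio with librosa and drop one randomly placed note per detected beat into a simplified, BeatSaber-like JSON layout. The file name, field names, and grid values are illustrative rather than the exact community map format.

```python
# Naive baseline: one note per detected beat, random placement.
# The JSON schema below is BeatSaber-like but illustrative only.
import json
import random

import librosa

y, sr = librosa.load("song.mp3")                       # placeholder path
_, beat_frames = librosa.beat.beat_track(y=y, sr=sr)

# Map times are expressed in beats, so the i-th detected beat gets _time == i.
notes = [{
    "_time": i,
    "_lineIndex": random.randint(0, 3),     # column on the 4x3 grid
    "_lineLayer": random.randint(0, 2),     # row
    "_type": random.randint(0, 1),          # left or right saber
    "_cutDirection": random.randint(0, 8),  # slice direction
} for i in range(len(beat_frames))]

with open("ExpertPlus.dat", "w") as f:
    json.dump({"_version": "2.0.0", "_notes": notes}, f)
```

A learned model would replace the random placement with choices conditioned on the audio, which is exactly the gap the workshop aims to explore.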

Dr. Amélie Anglade is a Music Information Retrieval and Data Science consultant. She completed her PhD at Queen Mary University of London before moving to industry, initially taking on positions in R&D labs such as Sony CSL, Philips Research and CNRS, and then being employed as an MIR expert at Music Tech startups such as SoundCloud and frestyl. For the past five years she has further developed her expertise in music identification and discovery, assisting startups and larger companies in the AI and music or multimedia space as an independent consultant, researching, prototyping, and scaling up Machine Learning solutions for them. Additionally, Amélie is a contributor to the EU Commission as an independent technical expert in charge of reviewing proposals and ongoing EU projects. In her spare time she attends music hackathons (15+ so far), and is a teacher and mentor for women in the field of data science through multiple organizations.

Ryan Groves is an award-winning music researcher and veteran developer of intelligent music systems. He received his Master’s in Music Technology from McGill University. In 2016, his work on computational music theory was awarded the Best Paper at ISMIR. He also has extensive experience in industry, building musical products that leverage machine learning. As the former Director of R&D for Zya, he developed Ditty, a musical messenger app that automatically sings your texts; Ditty won Best Music App of 2015 at the Appy Awards. More recently, he co-founded Melodrive, where he and his team built the first artificially intelligent composer that could compose music in real time and react to interactive scenarios such as games and VR experiences. He now works as a consultant and startup advisor in Berlin, with a focus on expanding the use cases of music and audio through the application of AI.


Ashley Burgoyne: Cognitive MIR with the Eurovision Song Contest

When Duncan Laurence triumphed at the 2019 Eurovision Song Contest in Tel Aviv, it was the Netherlands’ first victory in the contest since 1975 – and perfect timing for the ISMIR conference! As one of the most-watched and most-discussed broadcasts in Europe, the Song Contest generates data that offer an excellent opportunity to link the patterns we can find using MIR tools in audio to real-world human behaviour. This workshop will show you how, and teach you techniques you can use wherever you want to use MIR to understand not just music but also people.

We will consider a number of questions. Every year, the bookmakers try to predict the contest winner: can MIR do better? The same songwriters write the songs for multiple countries each year: is there nonetheless a typical sound for each country’s entry? People assume that voting is politically rather than musically based, but recent research has called those assumptions into question: what does the music tell us? And can we link Eurovision tracks to what fans say on Twitter, or to direct experiments about what they hear?
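As a concrete starting point for the bookmaker question, a hedged sketch of a purely audio-based predictor is shown below. It assumes a hypothetical entries.csv listing audio excerpts and final rankings; the feature set and model are simply reasonable defaults, not a recipe for beating the bookmakers.

```python
# Predict final contest rank from audio features alone (a rough baseline).
# Assumes a hypothetical entries.csv with columns "audio_path" and "final_rank".
import librosa
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def summarize(path):
    """Compact audio description: MFCC statistics, spectral contrast, tempo."""
    y, sr = librosa.load(path, duration=30.0)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
    return np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1),
                      contrast.mean(axis=1), [tempo]])

entries = pd.read_csv("entries.csv")
X = np.vstack([summarize(p) for p in entries["audio_path"]])
y = entries["final_rank"].to_numpy()

model = RandomForestRegressor(n_estimators=200, random_state=0)
print(cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error"))
```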

John Ashley Burgoyne is the Lecturer in Computational Musicology at the University of Amsterdam and part of the Music Cognition Group at the Institute for Logic, Language, and Computation. Dr Burgoyne teaches in both musicology and artificial intelligence and is especially interested in musicometrics: developing behavioural and audio models that are conceptually sound, reliable, and musicologically interpretable as music enters the digital humanities era. He was the leader of the Hooked on Music project, an online citizen science experiment to explore long-term musical memory that attracted more than 170,000 participants across more than 200 countries.


Estefanía Cano and Jakob Abeßer: Learning about Music with MIR

What does John Coltrane have in common with Cannonball Adderley? What makes micro-timing in Brazilian samba unique? What are the tuning characteristics of the harpsichord? Which cues do musicians use to control ensemble intonation? The MIR community has been working for decades on developing reliable methods for research tasks such as beat tracking, melody estimation, chord detection, and music tagging, among many others. While most of these methods are not yet perfect, they can certainly be useful tools when attempting to answer questions such as the ones above. This holds true especially if computational analysis tools are combined with the experience of musicians, the insights of human listeners, and musical knowledge.

This workshop will focus on exploring ways in which we can gain new knowledge about music by combining available MIR techniques and human musical expertise. Instead of focusing on improving MIR methods or on proposing new ways to solve MIR tasks, we want to use this workshop as a platform to brainstorm new questions about the various aspects of music. We want to revisit old questions and propose new alternatives to address them. We want to look back at previous projects and studies, and use the lessons we learned to improve the way we address questions today.

Estefanía Cano is a research scientist in the Semantic Music Technologies group at Fraunhofer IDMT in Germany. Estefanía received her B.Sc. degree in electronic engineering from the Universidad Pontificia Bolivariana, Medellín, Colombia, in 2005, her B.A. degree in Music (Saxophone Performance) from the Universidad de Antioquia, Medellín, Colombia, in 2007, her M.Sc. degree in music engineering from the University of Miami, Florida, in 2009, and her Ph.D. degree in media technology from the Ilmenau University of Technology, Germany, in 2014. In 2009, she joined the Semantic Music Technologies group at the Fraunhofer Institute for Digital Media Technology IDMT as a research scientist. In 2018, she joined the Social and Cognitive Computing Department at the Agency for Science, Technology and Research (A*STAR) in Singapore. Her research interests include sound source separation, music education, and computational musicology.

Jakob Abeßer studied computer engineering (Dipl.-Ing., 2008) and media technology (Dr.-Ing., 2014) at the Ilmenau University of Technology. Since 2008, he has been working in the field of semantic music processing at the Fraunhofer Institute for Digital Media Technology (IDMT) in Ilmenau. In 2005 and 2010 he spent research stays abroad at the Université Paul Verlaine in Metz, France, and at the Finnish Centre of Excellence in Interdisciplinary Music Research at the University of Jyväskylä in Finland. Between 2012 and 2017, he also worked as a doctoral researcher in the Jazzomat Research Project at the Franz Liszt School of Music in Weimar, developing methods for the computer-aided analysis of jazz improvisations. Since 2018 he has been working as co-investigator of the research project “Informed Sound Activity Detection in Music Recordings” (ISAD) at Fraunhofer IDMT in collaboration with Prof. Dr. Meinard Müller from the International Audio Laboratories Erlangen, Germany. His current research interests include music information retrieval, machine listening, music education, machine learning and deep learning.


Matthew Davies and Sebastian Böck:  Building and Evaluating a Musical Audio Beat Tracking System

The task of musical audio beat tracking can be considered one of the foundational problems in the music information retrieval community. In this workshop we seek to take a tour of the entire beat tracking pipeline by addressing the following steps: i) how to manually annotate ground truth; ii) how to construct a lightweight beat tracking model using deep neural networks; iii) how to select appropriate musical material for training and testing; and iv) how to conduct evaluation in a musically meaningful way. In each of these areas, we seek to provide practical hands-on experience and share tacit knowledge about what works and what doesn’t. Throughout the workshop we will promote active participation and discussion with the aim of driving new research in beat tracking and fostering new collaborations.
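To make steps (ii) and (iv) concrete, here is a minimal sketch using off-the-shelf tools: librosa’s default tracker stands in for the lightweight neural model the workshop would build, and mir_eval provides the musically motivated metrics. File names are placeholders.

```python
# Estimate beats and score them against a manual annotation.
import librosa
import mir_eval
import numpy as np

y, sr = librosa.load("example.wav")                 # placeholder path
_, est_frames = librosa.beat.beat_track(y=y, sr=sr)
est_beats = librosa.frames_to_time(est_frames, sr=sr)

# Ground-truth beats (step i) would normally come from manual annotation;
# here we load them from a text file with one beat time (in seconds) per line.
ref_beats = np.loadtxt("example_beats.txt")

# Musically meaningful evaluation (step iv): F-measure and continuity scores.
print("F-measure:", mir_eval.beat.f_measure(ref_beats, est_beats))
print("Continuity (CMLc, CMLt, AMLc, AMLt):",
      mir_eval.beat.continuity(ref_beats, est_beats))
```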

Matthew Davies is a music information retrieval researcher with a background in digital signal processing. His main research interests include the analysis of rhythm in musical audio signals, evaluation methodology, creative music applications, and reproducible research. Since 2014, Matthew has coordinated the Sound and Music Computing Group in the Centre for Telecommunications and Multimedia at INESC TEC. From 2014 to 2018, he was an Associate Editor for the IEEE/ACM Transactions on Audio, Speech and Language Processing and coordinated the 4th Annual IEEE Signal Processing Cup. He was a keynote speaker at the 16th Rhythm Production and Perception Workshop, and General Chair of the 13th International Symposium on Computer Music Multidisciplinary Research.

Sebastian Böck received his diploma degree in electrical engineering from the Technical University of Munich in 2010 and his PhD in computer science from the Johannes Kepler University Linz in 2016. Within the MIR community he is probably best known for his machine learning-based algorithms, which pushed the performance of automatic beat tracking and other tasks into regions formerly achievable only by humans. Currently he is continuing his research at the Austrian Research Institute for Artificial Intelligence (OFAI) and the Technical University of Vienna.


Georgi Dzhambazov: Verse and Chorus Detection of Acoustic Cover Versions

Many MIR tasks have as a prerequisite the annotation of the structural segments of a song. While cover versions usually retain most musical aspects of the original song, they can have a completely new structure (sections appended or missing). In particular, covers with acoustic instrumental accompaniment are characterized by a predominant vocal line, while the accompaniment is occasionally missing or improvised. Therefore, structure detection algorithms based solely on harmonic features are most likely not a sufficient solution.

In this hands-on workshop, we will explore the problem of automatic segmentation and labeling of the verse and chorus sections of a given acoustic cover version. Information about the original song (lyrics, chords, guitar tabs, etc.) can be found online. Our goal is to come up with ideas and prototypes for approaching the problem by combining existing methods (e.g. vocal activity detection, chord recognition, lyrics alignment) in new ways, rather than designing something completely new. An industry database of acoustic cover songs with a varying degree of modifications to the original structure will be provided.
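One possible baseline to build on (a sketch assuming librosa is available, not the workshop’s solution) is chroma-based self-similarity segmentation, which could later be fused with vocal activity detection, chord recognition, and lyrics alignment:

```python
# Crude structural baseline: chroma features + agglomerative segmentation.
import librosa

y, sr = librosa.load("acoustic_cover.wav")          # placeholder path

# Focus on harmonic content, to reduce the influence of the sparse accompaniment.
y_harm = librosa.effects.harmonic(y)
chroma = librosa.feature.chroma_cqt(y=y_harm, sr=sr)

# Ask for a fixed number of segments as a first pass (e.g. 8 candidate sections).
bound_frames = librosa.segment.agglomerative(chroma, 8)
bound_times = librosa.frames_to_time(bound_frames, sr=sr)
print("candidate section boundaries (s):", bound_times)
```

Labeling which candidate sections are verses and which are choruses is exactly where the vocal line and the online lyrics could come in.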

Georgi holds a PhD in Music Information Retrieval from the Music Technology Group in Barcelona, where he worked on automatic alignment of lyrics under the supervision of Xavier Serra. He also has experience in applied research on speech recognition and natural language processing. In 2017 he founded VoiceMagix – a company providing solutions for automatic analysis of the singing voice.

He has been a WiMIR mentor and a MIREX task captain for several years. His research interests include algorithms for singing voice and speech, and machine learning in general. He is currently mainly interested in initiatives aiming to bridge the gap between MIR research and the music industry.


Brian McFee:  Coping with Bias in Audio Embeddings

An appealing general approach to modeling problems across many domains is to first transform raw input data through an embedding function, which has been trained on a large (but potentially unrelated) collection of data. This results in a vector representation of each object, which can then be used as input to a simple classifier (e.g., a linear model) to solve some downstream task using a limited amount of data. This approach has been successfully demonstrated in image and video analysis and natural language processing, and is becoming increasingly popular in audio and musical content analysis. However, general-purpose embedding models have been known to encode and propagate implicit biases, which can have detrimental and disparate population-dependent effects.

In this project, we will conduct a preliminary study of embedding bias in MIR data. Using pre-trained audio embeddings and well-known MIR datasets, we will first attempt to quantify the extent to which embedding-based classification exhibits biased results across data sets and/or genres. We will then attempt to de-bias the embedding by adapting recently proposed methods from the natural language processing literature.
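A hedged sketch of the first step might look like the following. It assumes OpenL3 as one example of a pre-trained audio embedding and a simple linear probe; the dataset’s paths, labels, and group tags (e.g. genre) are supplied by the user.

```python
# Quantify per-group accuracy of an embedding + linear-probe classifier.
import numpy as np
import openl3
import soundfile as sf
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def embed(path):
    """One embedding vector per clip: time-averaged OpenL3 frames."""
    audio, sr = sf.read(path)
    emb, _ = openl3.get_audio_embedding(audio, sr, content_type="music",
                                        embedding_size=512)
    return emb.mean(axis=0)

def per_group_accuracy(paths, labels, groups):
    """Train a linear probe on embeddings; report test accuracy per group."""
    X = np.vstack([embed(p) for p in paths])
    X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
        X, np.array(labels), np.array(groups), test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    # Large accuracy gaps between groups are a first, crude signal of bias.
    return {g: clf.score(X_te[g_te == g], y_te[g_te == g])
            for g in np.unique(g_te)}
```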

Brian McFee is Assistant Professor of Music Technology and Data Science at New York University. He received the B.S. degree (2003) in Computer Science from the University of California, Santa Cruz, and M.S. (2008) and Ph.D. (2012) degrees in Computer Science and Engineering from the University of California, San Diego. His work lies at the intersection of machine learning and audio analysis. He is an active open source software developer, and the principal maintainer of the librosa package for audio analysis.


Peter Sobot:  Software Engineering for Machine Learners (and Drummers) – Building Robust Applications with Audio Data  

Building machine learning systems is hard, but building systems that can scale can be even harder. In this workshop, we’ll discuss software engineering techniques to use when building machine learning systems, including methods to make your code easier to write, test, debug, and maintain. We’ll also build an audio sample classifier with these techniques using basic machine learning concepts, and discuss methods for deploying this system at scale. Finally, we’ll take the system to an extreme and use cloud computing to build a system that learns in response to user input.
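In that spirit, here is a small illustrative sketch (file and function names are hypothetical): keep feature extraction a pure function, wrap the model in a scikit-learn Pipeline, and unit-test the pieces.

```python
# Testable building blocks for an audio sample classifier.
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def features(y, sr):
    """Pure function: audio in, fixed-length feature vector out (easy to test)."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.hstack([mfcc.mean(axis=1), mfcc.std(axis=1)])

def make_model():
    """Scaling and classification live in one Pipeline, so they ship together."""
    return make_pipeline(StandardScaler(), SVC(probability=True))

def test_features_shape_is_stable():
    """Example unit test (pytest style): same shape regardless of clip length."""
    sr = 22050
    short = features(np.random.randn(sr), sr)
    longer = features(np.random.randn(3 * sr), sr)
    assert short.shape == longer.shape == (40,)
```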

Peter Sobot is a Staff Engineer at Spotify, where he works on recommendation products at massive scale, including the systems that power Discover Weekly.  His open-source software contributions range from low-level data tools to legendary internet-scale hacks like The Wub Machine. He has spoken at !!Con, Google Cloud Next, and Google Summit, and makes electronic music in his spare time.


Bob Sturm: Computer-Guided Analysis of Computer-Generated Music Corpora

Various iterations of the folkrnn system (folkrnn.org) have generated over 100,000 transcriptions of “machine folk”, e.g., https://highnoongmt.wordpress.com/2018/01/05/volumes-1-20-of-folk-rnn-v1-transcriptions/, but manually looking through these takes a lot of time. In this project we will think about and implement some methods that can help one grasp characteristics of such collections, and find interesting bits. For instance, we can look for instances of plagiarism, find anomalous material, judge similarity in terms of pitches, meter, melodic contour, etc.
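As one illustration of the kind of helper we might build, here is a small, dependency-free sketch that flags suspiciously similar tunes by reducing each ABC transcription to its pitch letters. Rhythm, octave marks, and ornaments are ignored, so this is only a crude first filter, not a plagiarism detector.

```python
# Flag near-duplicate ABC tunes via pitch-letter sequence similarity.
import difflib
import itertools
import re

def pitch_sequence(abc):
    """Keep only the note letters from the tune body of an ABC transcription."""
    body = [ln for ln in abc.splitlines() if not re.match(r"^[A-Za-z]:", ln)]
    return "".join(re.findall(r"[A-Ga-g]", "".join(body))).upper()

def similarity(abc1, abc2):
    """Ratio in [0, 1] of matching pitch-letter subsequences."""
    return difflib.SequenceMatcher(None, pitch_sequence(abc1),
                                   pitch_sequence(abc2)).ratio()

def most_similar_pairs(tunes, threshold=0.9):
    """tunes: dict mapping tune ids to ABC strings; returns suspicious pairs."""
    scored = []
    for a, b in itertools.combinations(tunes, 2):
        s = similarity(tunes[a], tunes[b])
        if s >= threshold:
            scored.append((s, a, b))
    return sorted(scored, reverse=True)
```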

Bob L. Sturm is currently an Associate Professor in the Speech, Music and Hearing Division of the School of Electrical Engineering and Computer Science at the Royal Institute of Technology (KTH), Sweden. Before that he was a Lecturer in Digital Media at the Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London. His research interests include digital signal processing for sound and music signals, machine listening, evaluation, and algorithmic composition. He is also a musician.


Chris Tralie: To What Extent Do Cyclic Inconsistencies Exist in Musical Preferences?

Music recommendation algorithms seek to rank a set of candidate songs in order of some estimated user preference.  It may be challenging to ascertain preferences from surveys, however, since Miller’s empirical “rule of 7” could be interpreted to suggest that humans lack the working memory to meaningfully rank much more than 7 items at a time.  To learn a longer list of preferences, then, one could consider presenting only a pair of alternatives at a time and aggregating these pairwise preferences into a global ranking. However, real pairwise rankings can lead to cyclic inconsistencies; that is, people often express that A > B and B > C, but also that C > A.  This is known as the “Condorcet paradox.” Fortunately, there exists a topological pairwise rank aggregation technique, known as “HodgeRank” [1], which can aggregate these rankings into the “most consistent” global order, while simultaneously quantifying the degree to which local (A > B > C > A) and global (A > B > C > … > A) inconsistencies exist.  In this workshop, we will first discuss these concepts in more detail; then we will each listen to pairs of 15-second clips from a diverse corpus of music [2] and rank our preferences, and then apply HodgeRank to see how consistent we all are. We will also use metrics between rankings to show which people in our group have similar preferences.  Zooming out, we will also discuss some social psychology literature that correlates musical preferences with personality traits [2], and we will discuss ethical pitfalls that can emerge when collecting data and interpreting results in such studies.

[1] http://www.ams.org/publicoutreach/feature-column/fc-2012-12

[2] Rentfrow, Peter J., et al. “The song remains the same: A replication and extension of the MUSIC model.” Music Perception: An Interdisciplinary Journal 30.2 (2012): 161-185.
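For reference, the aggregation step itself fits in a few lines of numpy. The sketch below (with a deliberately cyclic toy input) solves the HodgeRank least-squares problem and reports the fraction of the preference data that no global ranking can explain.

```python
# Minimal HodgeRank-style aggregation of pairwise preferences.
import numpy as np

# Each comparison (i, j, margin) means item j is preferred to item i by `margin`.
comparisons = [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)]   # a fully cyclic toy case
n_items = 3

# Edge-incidence matrix D and margin vector y, so that (D s)_e ≈ s_j - s_i.
D = np.zeros((len(comparisons), n_items))
y = np.zeros(len(comparisons))
for e, (i, j, margin) in enumerate(comparisons):
    D[e, i], D[e, j] = -1.0, 1.0
    y[e] = margin

# Least-squares fit of global scores s (the "gradient" component, in Hodge terms).
s, *_ = np.linalg.lstsq(D, y, rcond=None)

# The residual is the part of the data no global ranking can explain,
# i.e. the cyclic (local + harmonic) inconsistency.
residual = y - D @ s
inconsistency = np.linalg.norm(residual) ** 2 / np.linalg.norm(y) ** 2
print("scores:", s - s.mean(), "inconsistency fraction:", inconsistency)
```

For the toy input above, the scores come out equal and the inconsistency fraction is 1.0, since the preferences are purely cyclic.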

Christopher J. Tralie is a data science researcher working in applied geometry/topology and geometric signal processing. His work spans shape-based music structure analysis and cover song identification, video analysis, multimodal time series analysis, and geometry-aided data visualization. He received a B.S.E. from Princeton University in 2011, a master’s degree from Duke University in 2013, and a Ph.D. from Duke University in 2017, all in Electrical Engineering. His Ph.D. was primarily supported by an NSF Graduate Fellowship, and his dissertation is entitled “Geometric Multimedia Time Series.”  He then did a postdoc at Duke University in Mathematics and a postdoc at Johns Hopkins University in Complex Systems. He was awarded a Bass Instructional Teaching Fellowship at Duke University, and he maintains an active interest in pedagogy and outreach, including longitudinal mentoring of underprivileged youths in STEAM education. He is currently a tenure-track assistant professor at Ursinus College in the Department of Mathematics and Computer Science. For more info, please visit http://www.ctralie.com.


TJ Tsai: Generating Music by Superimposing and Adapting Existing Audio Tracks

There has been a lot of work on training models to generate novel music from scratch.  In this workshop, we will explore the possibility of generating music by taking a source audio track and enhancing it by superimposing other segments of existing audio material.  The specific task we will work on is to take a classical piano recording and to overlay techno beats/music in an aesthetically pleasing manner. We will brainstorm different ways to accomplish this task, develop some prototypes, and hopefully generate some new music by the end of the workshop!
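A deliberately naive first prototype (with assumed file names, and no promises about aesthetics) might simply tempo-match a drum loop to the piano recording and mix it in from the first detected beat:

```python
# Tempo-match a drum loop to a piano recording and overlay it quietly.
import numpy as np
import librosa
import soundfile as sf

piano, sr = librosa.load("piano.wav", sr=None)       # placeholder paths
loop, _ = librosa.load("techno_loop.wav", sr=sr)

piano_tempo, beat_frames = librosa.beat.beat_track(y=piano, sr=sr)
loop_tempo, _ = librosa.beat.beat_track(y=loop, sr=sr)

# Match the loop's tempo to the piano's (rate > 1 speeds the loop up).
loop = librosa.effects.time_stretch(loop, rate=float(piano_tempo / loop_tempo))

# Tile the loop from the first detected piano beat to the end of the piece.
start = librosa.frames_to_samples(beat_frames[0])
reps = int(np.ceil((len(piano) - start) / len(loop)))
bed = np.tile(loop, reps)[: len(piano) - start]

mix = piano.copy()
mix[start:] += 0.3 * bed           # keep the beats well below the piano level
sf.write("mashup.wav", mix / np.max(np.abs(mix)), sr)
```

Making the result genuinely aesthetically pleasing, e.g. aligning downbeats, respecting phrase boundaries, and adapting to rubato, is where the interesting work begins.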

Prof. TJ Tsai completed bachelor’s and master’s degrees in electrical engineering at Stanford University.  During college, he studied classical piano with George Barth and participated in the Stanford Jazz Orchestra and the chamber music program.  After graduating, he worked at SoundHound for a few years, and then went to UC Berkeley for his Ph.D. Since 2016 he has been a faculty member in the engineering department at Harvey Mudd College, a STEM-focused liberal arts college in Claremont, CA.


Gissel Velarde and Andre Holzapfel:  Music Research for Good

In recent years, important advances in artificial intelligence (AI) have led to a range of initiatives considering super-intelligence, its advantages, and its dangers. The initiatives fostering beneficial AI include (i) conferences such as the AI for Good Global Summit (running since 2017) and the Beneficial Artificial General Intelligence conference (held in 2015, 2017 and 2019), (ii) the establishment of organizations and projects like OpenAI, the Partnership on AI, Google’s AI for Social Good, and AI for Humanity from the Université de Montréal, and (iii) governments’ AI strategies. At last year’s ISMIR conference, our community dedicated a session to discussing ethics in MIR, and there are topics which are still to be addressed. During this workshop, we will review the state of AI for good. We will revisit the tentative ethical guidelines for MIR developers proposed by Holzapfel et al. (2018) and their alignment with the guidelines from the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems (Chatila & Havens, 2019) and the Ethics Guidelines for Trustworthy AI from the High-Level Expert Group on Artificial Intelligence (2019).  We will use tools such as SWOT (strengths, weaknesses, opportunities, and threats) analysis, the business canvas, the SCAMPER technique for creative thinking, and Gantt charts. After a situational analysis, we will define goals, scope, stakeholders, risks, benefits, impact, and an action plan. Finally, we will elaborate ethics guidelines in music research to be proposed to our community for consideration.

Chatila, R., & Havens, J. C. (2019). The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems. In Robotics and Well-Being (pp. 11-16). Springer, Cham.

High-Level Expert Group on Artificial Intelligence (2019). Ethics guidelines for trustworthy AI. Retrieved from https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai

Holzapfel, A., Sturm, B.L. and Coeckelbergh, M., 2018. Ethical Dimensions of Music Information Retrieval Technology. Transactions of the International Society for Music Information Retrieval, 1(1), pp.44–55. DOI: http://doi.org/10.5334/tismir.13

Gissel Velarde is a computer scientist, engineer, pianist and composer. She holds a PhD degree from Aalborg University for her doctoral thesis “Convolutional methods for music analysis”, supervised by David Meredith and Tillman Weyde. She participated as a research member of the European project “Learning to Create” (Lrn2Cre8), a collaborative project within the Future and Emerging Technologies (FET) programme of the Seventh Framework Programme for Research of the European Commission. She was a machine learning lead at Moodagent and worked as a consultant for Sony Computer Science Laboratories.  She is a DAAD alumna and a mentor in the WiMIR mentoring program.

Andre Holzapfel received M.Sc. and Ph.D. degrees in computer science from the University of Crete, Greece, and a second Ph.D. degree in music from the Centre of Advanced Music Studies (MIAM) in Istanbul, Turkey. He worked at several leading institutes in computer engineering as a postdoctoral researcher, with a focus on rhythm analysis in music information retrieval. His fieldwork in ethnomusicology was mainly conducted in Greece, with Cretan dance being the subject of his second dissertation. In 2016, he became Assistant Professor in Media Technology at the KTH Royal Institute of Technology in Stockholm, Sweden. Since then, his research subjects incorporate the computational analysis of human rhythmic behavior by means of sensor technology, and the investigation of ethical aspects of computational approaches to music.


Eva Zangerle: Multi-Dimensional User Models for MIR

In music information retrieval scenarios (particularly when it comes to personalization), users and their preferences are often modeled solely by their direct interactions with the system (e.g., songs listened to). However, a user’s perception and liking of recommended or retrieved tracks depends on a number of dimensions, which may include the user’s (situational) context (e.g., time, location or activity), the user’s intent, and the content descriptors and characteristics of the tracks the user has listened to; a user model should also capture how preferences change over time (short- vs. long-term). Comprehensive models that capture these multiple dimensions, however, have hardly been devised.

In the scope of this workshop, we aim to look into how users and their preferences (long- and short-term) can be modeled, the implications of such comprehensive user models for the underlying MIR algorithms, and how we can evaluate the contribution and impact of such user models.
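Purely as an illustration of the dimensions listed above, a user model might record something like the following hypothetical structure before any learning algorithm is chosen:

```python
# Illustrative data structure for a multi-dimensional user model.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ListeningEvent:
    track_id: str
    timestamp: float                  # when the track was played
    context: Dict[str, str]           # e.g. {"location": "commute", "device": "mobile"}
    intent: str                       # e.g. "background", "discovery", "focus"
    audio_features: Dict[str, float]  # content descriptors of the track

@dataclass
class UserModel:
    user_id: str
    history: List[ListeningEvent] = field(default_factory=list)

    def short_term_profile(self, horizon: float, now: float) -> Dict[str, float]:
        """Average content descriptors over recent events only (short-term taste)."""
        recent = [e for e in self.history if now - e.timestamp <= horizon]
        keys = {k for e in recent for k in e.audio_features}
        return {k: sum(e.audio_features.get(k, 0.0) for e in recent) / len(recent)
                for k in keys} if recent else {}
```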

Eva Zangerle is a postdoctoral researcher at the University of Innsbruck in the research group for Databases and Information Systems (Department of Computer Science). She earned her master’s degree in Computer Science at the University of Innsbruck and subsequently pursued her Ph.D. there in the field of recommender systems for collaborative social media platforms. Her main research interests are within the fields of social media analysis, recommender systems, and information retrieval. Over the last years, she has combined these three fields of research and investigated context-aware music recommender systems based on data retrieved from social media platforms, aiming to exploit new sources of information for recommender systems. She was awarded a Postdoctoral Fellowship for Overseas Researchers from the Japan Society for the Promotion of Science, allowing her to make a short-term research stay at Ritsumeikan University in Kyoto.


WiMIR Workshop 2018: Building Collaborations Among Artists, Coders and Machine Learning

We’re very pleased to link to this very fine blog post by Matt McCallum, Amy Hung, Karin Dressler, Gabriel Meseguer-Brocal, and Benedikte Wallace about their group work at the 2018 WiMIR Workshop!

WiMIR Workshop 2018: Success!


We’re pleased to tell you that the WiMIR 1st Annual Workshop was a resounding success!

Why organize a WiMIR Workshop? We saw this as a way to build upon the MIR community’s already strong support for diversity and inclusion in the field. The Workshop format was a fitting complement to the remote pairings of the mentoring program and brief introductions gained during the main ISMIR conference. We proposed three aims for the WiMIR 1st Annual Workshop:

  • Further amplify the scientific efforts of women in the field.
  • Encourage the discussion of proposed or unfinished work.
  • Create additional space for networking.

Thanks to support from Spotify, we were able to offer the WiMIR Workshop as a free event, and open it up to ALL members of the community!  The Workshop took place as a satellite event of ISMIR2018, at Télécom ParisTech. We had 65 pre-registrations, and closer to 80 people attending.  We had poster presentations from 18 women in the field, with topics ranging from Indian classical music to musical gestures.  We had 11 project groups, with topics ranging from karaoke at scale to music for mood modulation to the relationship between cardiac rhythms and music.  We had a staggering number of croissants and pains au chocolat, too – thanks, Paris.


The day started with the aforementioned pastries and coffee, and then people joined up with their project groups, introduced themselves, and got a big-picture overview from their Project Guides.  This led into a poster session, focusing on early-stage research ideas.

Posters turned into lunch, which was informally structured around topics like “Dealing with Sexism” and “Surviving Grad School”.  The lunch provided attendees with an opportunity to connect with new people and learn about topics that members in the field (especially those who are not women) don’t often discuss.  

After lunch, the project groups started a deeper dive into their topic areas, with an eye to presenting at 4 pm.  The presentations were great – we had everything from machine-learned piano melodies to microsurveys about music and emotion to a whole lot of post-it notes about cover songs.


It was, in general, a lot of fun, and we achieved the aims of the event.  We’re looking forward to next year – it seems like most folks are as well:

“It was a fruitful session and our group will certainly continue the work that we started yesterday.” – Elaine Chew, Professor of Digital Media, Queen Mary University of London

“The most inspiringly diverse event in the field of MIR!” – Oriol Nieto, Senior Scientist, Pandora

“It was exciting to see new diverse groups of people across different backgrounds, disciplines and institutions form new research collaborations!” – Rachel Bittner, Research Scientist, Spotify.

“The first WiMIR Workshop was an amazing way to meet a diverse set of people working in MIR who want to make the world better.  I loved our workshop chats as well as breaking the ice on tougher discussion points during lunch, such as overcoming sexism.  It was staggering to see what each group accomplished in such a short period of time at the first WiMIR Workshop, and I made many great new friendships as well.  Bravo!” – Tom Butcher, Principal Engineering & Science Manager, Microsoft

 

“I am already looking forward to next year’s!” – Kyungyun Lee, MS student, KAIST

“Very organized, inspiring and motivating event! Excellent way to meet the most welcoming people of the MIR community.” – Bruna Wundervald, PhD Candidate, Maynooth University

“I loved the format! Emerged at the end of the day full of ideas and new motivation.” – Polina Proutskova, Postdoc, Centre for Digital Media, Queen Mary University

And, of course, the tweets!

 

Big thanks to everyone who helped out: the ISMIR2018 volunteers & General Chairs, the ISMIR Board, WiMIR leadership, Télécom ParisTech, and Emile Marx from Spotify Paris.

We’ll see you next year in Delft!

The WiMIR Workshop Organizers,

WiMIR Workshop 2018 Project Guides

This is a list of project guides and their areas of interest for the 2018 WiMIR workshop.  These folks will be leading the prototyping and early research investigations at the workshop.  You can read about them and their work in detail below, and sign up to attend the WiMIR workshop here.

 


Rachel Bittner:  MIR with Stems

The majority of digital audio exists as mono or stereo mixtures, and because of this MIR research has largely focused on estimating musical information (beats, chords, melody, etc.) from these polyphonic mixtures. However, stems (the individual components of a mixture) are becoming an increasingly common audio format. This project focuses on how MIR techniques could be adapted if stems were available for all music. Which MIR problems suddenly become more important? What information – that was previously difficult to estimate from mixtures – is now simple to estimate? What new questions can we ask about music that we couldn’t before? As part of the project, we will try to answer some of these questions and create demos that demonstrate our hypotheses.
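As a tiny example of estimation becoming simpler, an isolated vocal stem reduces “melody extraction” to monophonic pitch tracking. The sketch below assumes librosa and a placeholder stem path.

```python
# Monophonic pitch tracking on an isolated vocal stem with pYIN.
import librosa

vocals, sr = librosa.load("stems/vocals.wav")        # placeholder stem path
f0, voiced_flag, voiced_prob = librosa.pyin(
    vocals, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"))
times = librosa.times_like(f0, sr=sr)
# f0[i] is the melody estimate (Hz) at times[i] wherever voiced_flag[i] is True.
```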

Rachel is a Research Scientist at Spotify in New York City, and recently completed her Ph.D. at the Music and Audio Research Lab at New York University under Dr. Juan P. Bello. Previously, she was a research assistant at NASA Ames Research Center, working with Durand Begault in the Advanced Controls and Displays Laboratory. She did her master’s degree in math at NYU’s Courant Institute, and her bachelor’s degree in music performance and math at UC Irvine. Her research interests are at the intersection of audio signal processing and machine learning, applied to musical audio. Her dissertation work applied machine learning to various types of fundamental frequency estimation.



Tom Butcher: Expanding the Human Impact of MIR with Mixed Reality

Mixed reality has the potential to transform our relationship with music. In this workshop, we will survey the new capabilities mixed reality affords as a new computing paradigm and explore how these new affordances can open the world of musical creation, curation, and enjoyment to new vistas. We will begin by discussing what mixed reality means, from sensors and hardware to engines and platforms for mixed reality experiences. From there, we will discuss how mixed reality can be applied to MIR-related fields of study and applications, considering some of the unique challenges and new research questions posed by the technology. Finally, we will discuss human factors and how mixed reality coupled with MIR can lead to greater understanding, empathy, expression, enjoyment, and fulfillment.

Tom Butcher leads a team of engineers and applied scientists in Microsoft’s Cloud & AI division focusing on audio sensing, machine listening, avatars, and applications of AI. In the technology realm, Tom is an award-winning creator of audio and music services, which include recommendation engines, continuous playlist systems, assisted composition agents, and other tools for creativity and productivity. Motivated by a deep enthusiasm for synthesizers and electronic sounds from an early age, Tom has released many pieces of original music as Orqid and Codebase and continues to record and perform. In 2015, Tom co-founded a Seattle-based business focusing on community, education, and retail for synthesizers and electronic music instruments called Patchwerks.



Elaine Chew: MIR Rhythm Analysis Techniques for Arrhythmia ECG Sequences

Cardiac arrhythmia has been credited as the source of the dotted rhythm at the beginning of Beethoven’s “Adieux” Sonata (Op.81a) (Goldberger, Whiting, Howell 2014); the authors have also ascribed Beethoven’s “Cavatina” (Op.130) and another piano sonata (Op.110) to his possible arrhythmia. It is arguably problematic and controversial to diagnose arrhythmia in a long-dead composer through his music. Without making any hypothesis on composers’ cardiac conditions, Chew (2018) linked the rhythms of trigeminy (a ventricular arrhythmia) to the Viennese Waltz and scored atrial fibrillation rhythms to mixed meters, Bach’s Siciliano, and the tango; she also made collaborative compositions (Chew et al. 2017-8) from longer ventricular tachycardia sequences. Given the established links between heart and musical rhythms, in this workshop, we shall take the pragmatic and prosaic approach of applying a wide variety of MIR rhythm analysis techniques to ECG recordings of cardiac arrhythmias, exploring the limits of what is currently possible.

Chew, E. (2018). Notating Disfluencies and Temporal Deviations in Music and Arrhythmia. Music and Science.
Chew, E., A. Krishna, D. Soberanes, M. Ybarra, M. Orini, P. Lambiase (2017-8). Arrhythmia Suite: bit.ly/heart-music-recordings
Goldberger, Z. D., S. M. Whiting, J. D. Howell (2014). The Heartfelt Music of Ludwig van Beethoven. Perspectives in Biology and Medicine, 57(2): 285-294.
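As a proof of concept of this pragmatic approach, the sketch below (assuming a single ECG lead stored as a numpy array at a known sampling rate) treats R-peak picking as “beat tracking” and derives inter-beat intervals that can then feed the same rhythm representations used in MIR.

```python
# Treat an ECG lead like a rhythm signal: detect "beats", compute intervals.
import numpy as np
from scipy.signal import find_peaks

fs = 360                                    # assumed sampling rate (Hz)
ecg = np.load("ecg_lead.npy")               # placeholder single-lead recording
ecg = (ecg - ecg.mean()) / (np.abs(ecg).max() + 1e-9)

# Crude R-peak picking stands in for beat tracking on the cardiac signal.
peaks, _ = find_peaks(ecg, height=0.4, distance=int(0.25 * fs))
beat_times = peaks / fs
ibi = np.diff(beat_times)                   # inter-beat intervals (the "rhythm")

# These beat times can now be fed to MIR-style rhythm representations,
# e.g. inter-onset-interval histograms or score-like rhythm notation.
print("mean heart rate (bpm):", 60.0 / ibi.mean())
print("rhythmic irregularity (CV of IBIs):", ibi.std() / ibi.mean())
```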

Elaine Chew is Professor of Digital Media at Queen Mary University of London, where she is affiliated with the Centre for Digital Music in the School of Electronic Engineering and Computer Science. She was awarded a 2018 ERC ADG for the project COSMOS: Computational Shaping and Modeling of Musical Structures, and is recipient of a 2005 Presidential Early Career Award in Science and Engineering / NSF CAREER Award, and 2007/2017 Fellowships at Harvard’s Radcliffe Institute for Advanced Studies. Her research, which centers on computational analysis of music structures in performed music, performed speech, and cardiac arrhythmias, has been supported by the ERC, EPSRC, AHRC, and NSF, and featured on BBC World Service/Radio 3, Smithsonian Magazine, Philadelphia Inquirer, Wired Blog, MIT Technology Review, etc. She has authored numerous articles and a Springer monograph (Mathematical and Computational Modeling of Tonality: Theory and Applications), and served on the ISMIR steering committee.


 


Johanna Devaney:  Cover Songs for Musical Performance Comparison and Musical Style Transfer

Cover versions of a song typically retain the basic musical material of the song being covered, but may vary a great deal in their fidelity to other aspects of the original recording. Some covers differ only in minor ways, such as timing and dynamics, while others may use completely different instrumentation, performance techniques, or genre. This workshop will explore the potential of cover songs for studying musical performance and for performing musical style transfer. In contrast to making comparisons between different performances of different songs, cover songs provide a unique opportunity to evaluate differences in musical performance, both within and across genres. For musical style transfer, the stability of the musical material serves as an invariant representation, which allows for paired examples for training machine learning algorithms. The workshop will consider issues in dataset creation as well as metrics for evaluating performance similarity and style transfer.

Johanna is an Assistant Professor of Music Technology at Brooklyn College, City University of New York, and the Specialty Chief Editor for the Digital Musicology section of Frontiers in Digital Humanities. Previously she taught in the Music Technology program at NYU Steinhardt and the Music Theory and Cognition program at Ohio State University. Johanna completed her post-doc at the Center for New Music and Audio Technologies (CNMAT) at the University of California, Berkeley and her PhD in music technology at the Schulich School of Music of McGill University. She also holds an MPhil degree in music theory from Columbia University, as well as an MA in composition from York University in Toronto. Johanna’s research seeks to understand how humans engage with music, primarily through performance, with a particular focus on intonation in the singing voice, and how computers can be used to model and augment our understanding of this engagement.


 


Doug Eck: Building Collaborations Among Artists, Coders and Machine Learning

We propose to talk about challenges and future directions for building collaborations among artists, coders and machine learning researchers. The starting point is g.co/magenta. We’ve learned a lot about what works and (more importantly) what doesn’t work in building bridges across these areas. We’ll explore community building, UX/HCI issues, research directions, open source advocacy and the more general question of deciding what to focus on in such an open-ended, ill-defined domain. We hope that the session is useful even for people who don’t know of or don’t care about Magenta. In other words, we’ll use Magenta as a starting point for exploring these issues, but we don’t need to focus solely on that project.

Douglas Eck is a Principal Research Scientist at Google working in the areas of music, art and machine learning. Currently he is leading the Magenta Project, a Google Brain effort to generate music, video, images and text using deep learning and reinforcement learning. One of the primary goals of Magenta is to better understand how machine learning algorithms can learn to produce more compelling media based on feedback from artists, musicians and consumers. Before focusing on generative models for media, Doug worked in areas such as rhythm and meter perception, aspects of music performance, machine learning for large audio datasets and music recommendation for Google Play Music. He completed his PhD in Computer Science and Cognitive Science at Indiana University in 2000 and went on to a postdoctoral fellowship with Juergen Schmidhuber at IDSIA in Lugano Switzerland. Before joining Google in 2010, Doug worked in Computer Science at the University of Montreal (MILA machine learning lab) where he became Associate Professor.


 


Ryan Groves:  Discovering Emotion from Musical Segments

In this project, we’ll first survey the existing literature for research on detecting emotions from musical audio, and find relevant software tools and datasets to assist in the process. Then, we’ll try to formalize our own expertise in how musical emotion might be perceived, elicited and automatically evaluated from musical audio. The goal of the project will be to create a software service or tool that can take a musical audio segment that is shorter than a whole song, and detect the emotion from it.

Ryan Groves is an award-winning music researcher and veteran developer of intelligent music systems. He did a Master’s in Music Technology at McGill University under Ichiro Fujinaga, and has published in conference proceedings including Mathematics and Computation in Music, Musical Metacreation (ICCC & AIIDE), and ISMIR. In 2016, he won the Best Paper award at ISMIR for his paper “Automatic melodic reduction using a supervised probabilistic context-free grammar”.  He is currently the President and Chief Product Officer at Melodrive – an adaptive music generation system. Using cutting-edge artificial intelligence techniques, Melodrive allows any developer to automatically create and integrate a musical soundtrack into their game, virtual world or augmented reality system.  With a strong technical background, extensive industry experience in R&D, and a solid research footing in academia, Ryan is focused on delivering innovative and robust musical products.


 

 

Christine Ho, Oriol Nieto, & Kristi Schneck:  Large-scale Karaoke Song Detection

We propose to investigate the problem of automatically identifying Karaoke tracks in a large music catalog. Karaoke songs are typically instrumental renditions of popular tracks, often including backing vocals in the mix, such that a live performer can sing on top of them. The automatic identification of such tracks would not only benefit the curation of large collections, but also its navigation and exploration. We challenge the participants to think about the type of classifiers we could use in this problem, what features would be ideal, and what dataset would be beneficial to the community to potentially propose this as a novel MIREX (MIR Evaluation eXchange) task in the near future.

Oriol Nieto is a Senior Scientist at Pandora. Prior to that, he defended his Ph.D. dissertation in the Music and Audio Research Lab at NYU, focusing on the automatic analysis of structure in music. He holds an M.A. in Music, Science and Technology from the Center for Computer Research in Music and Acoustics at Stanford University, an M.S. in Information Theories from the Music Technology Group at Pompeu Fabra University, and a Bachelor’s degree in Computer Science from the Polytechnic University of Catalonia. His research focuses on music information retrieval, large-scale recommendation systems, and machine learning with special emphasis on deep architectures. Oriol plays guitar, violin, and sings (and screams) in his spare time.

Kristi Schneck is a Senior Scientist at Pandora, where she is leading several science initiatives on Pandora’s next-generation podcast recommendation system. She has driven the science work for a variety of applications, including concert recommendations and content management systems. Kristi holds a PhD in physics from Stanford University and dual bachelor’s degrees in physics and music from MIT.

Christine Ho is a scientist on Pandora’s content science team, where she works on detecting music spam and helps teams with designing their A/B experiments. Before joining Pandora, she completed her PhD in Statistics at the University of California, Berkeley, and interned at Veracyte, a company focused on applying machine learning to genomic data to improve outcomes for patients with hard-to-diagnose diseases.



Xiao Hu: MIR for Mood Modulation: A Multidisciplinary Research Agenda

Mood modulation is a main reason behind people’s engagement with music, and researchers across related fields continue to be fascinated by how people use music to modulate mood and how MIR techniques and systems can facilitate this process. In this workshop group, we will discuss how MIR researchers with diverse backgrounds and interests can participate in this broad direction of research. Engaging activities are designed to enable hands-on practice with multiple research methods and study designs (both qualitative and quantitative/computational). Through feedback from peers and the project guide, participants are expected to start developing a focused research agenda with theoretical, methodological and practical significance, based on their own strengths and interests. Participants from different disciplines and levels are all welcome. Depending on the background and interests of the participants, a small new dataset is prepared for fast prototyping on how MIR techniques and tools can help enhance this multidisciplinary research agenda.

Dr. Xiao Hu has been studying music mood recognition and MIR evaluation since 2006. Her research on affective interactions between music and users has been funded by the National Science Foundation of China and the Research Grants Council (RGC) of the Hong Kong S.A.R. Dr. Hu was a tutorial speaker at the ISMIR conferences in 2012 and 2016. Her papers have won several awards at international conferences and have been cited extensively. She has served as a conference co-chair (2014) and a program co-chair (2017 and 2018) for ISMIR, and is an editorial board member of TISMIR. She served on the Board of Directors of ISMIR from 2012 to 2017. Dr. Hu has a multidisciplinary background, holding a PhD degree in Library and Information Science, a Multidisciplinary Certificate in Language and Speech Processing, a Master’s degree in Computer Science, a Master’s degree in Electrical Engineering, and a Bachelor’s degree in Electronics and Information Systems.


Anja Volk, Iris Yuping Ren, & Hendrik Vincent Koops:  Modeling Repetition and Variation for MIR

Repetition and variation are fundamental principles in music. Accordingly, many MIR tasks are based on automatically detecting repeating units in music, such as repeating time intervals that establish the beat, repeating segments in pop songs that establish the chorus, or repeating patterns that constitute the most characteristic part of a composition. In many cases, repetitions are not literal, but subject to slight variations, which introduces the challenge as to what types of variation of a musical unit can be reasonably considered as a re-occurrence of this unit. In this project we look into the computational modelling of rhythmic, melodic, and harmonic units, and the challenge of evaluating state-of-the-art computational models by comparing the output to human annotations. Specifically, we investigate for the MIR tasks of 1) automatic chord extraction from audio, and 2) repeated pattern discovery from symbolic data, how to gain high-quality human annotations which account for different plausible interpretations of complex musical units. In this workshop we discuss different strategies of instructing annotators and undertake case studies on annotating patterns and chords on small data sets. We compare different annotations, jointly reflect on the rationales regarding these annotations, develop novel ideas on how to setup annotation tasks and discuss the implications for the computational modelling of these musical units for MIR.
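For the chord case study, one possible starting point is to quantify how far two human annotations of the same track diverge by reusing mir_eval’s chord comparison metrics as inter-annotator agreement scores (file names below are placeholders):

```python
# Compare two human chord annotations of the same track with mir_eval.
import mir_eval

# Each .lab file holds lines of "start_time end_time chord_label".
ref_int, ref_labels = mir_eval.io.load_labeled_intervals("annotator_A.lab")
est_int, est_labels = mir_eval.io.load_labeled_intervals("annotator_B.lab")

# Treat annotator A as "reference" and B as "estimate"; the scores then act as
# agreement measures at different levels of harmonic detail.
scores = mir_eval.chord.evaluate(ref_int, ref_labels, est_int, est_labels)
for metric in ("root", "majmin", "sevenths", "mirex"):
    print(metric, scores[metric])
```

Systematic disagreements at, say, the sevenths level but not the root level would be exactly the kind of plausible alternative interpretation the workshop wants to surface.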

Anja Volk holds master’s degrees in both Mathematics and Musicology, and a PhD from Humboldt University Berlin, Germany. Her area of specialization is the development and application of computational and mathematical models for music research. The results of her research have substantially contributed to areas such as music information retrieval, computational musicology, digital cultural heritage, music cognition, and mathematical music theory. In 2003 she was awarded a Postdoctoral Fellowship Award at the University of Southern California, and in 2006 she joined Utrecht University as a Postdoc in the area of Music Information Retrieval. In 2010 she was awarded a highly prestigious NWO-VIDI grant from the Netherlands Organisation for Scientific Research, which allowed her to start her own research group. In 2016 she co-launched the international Women in MIR mentoring program, in 2017 she co-organized the launch of the Transactions of the International Society for Music Information Retrieval, and she is serving as Editor-in-Chief for the journal’s first term.


Cynthia C. S. Liem & Andrew Demetriou:  Beyond the Fun: Can Music We Do Not Actively Like Still Have Personal Significance?

In today’s digital information society, music is typically perceived and framed as ‘mere entertainment’. However, historically, the significance of music to human practitioners and listeners has been much broader and more profound. Music has been used to emphasize social status, to express praise or protest, to accompany shared social experiences and activities, and to moderate activity, mood and self-established identity as a ‘technology of the self’. Yet today, our present-day music services (and their underlying Music Information Retrieval (MIR) technology) do not focus explicitly on fostering these broader effects: they may be hidden in existing user interaction data, but this data usually lacks sufficient context to tell for sure.  As a controversial thought, music that is appropriate for the scenarios above may not necessarily be our favorite music, yet it may still be of considerable personal value and significance to us. How can and should we deal with this in the context of MIR and recommendation? May MIR systems then become the tools that can surface such items, and thus create better user experiences that users could not have imagined themselves? What ethical and methodological considerations should we take into account when pursuing this? And, for technologists in need of quantifiable and measurable criteria of success, how should the impact of suggested items on users be measured in these types of scenarios?   In this workshop, we will focus on discussing these questions from an interdisciplinary perspective, and jointly designing corresponding initial MIR experimental setups.

Cynthia Liem graduated in Computer Science at Delft University of Technology, and in Classical Piano Performance at the Royal Conservatoire in The Hague. Now an Assistant Professor at the Multimedia Computing Group of Delft University of Technology, her research focuses on music and multimedia search and recommendation, with special interest in fostering the discovery of content which is not trivially on users’ radars. She gained industrial experience at Bell Labs Netherlands, Philips Research and Google, was a recipient of multiple scholarships and awards (e.g. Lucent Global Science & Google Anita Borg Europe Memorial scholarships, Google European Doctoral Fellowship, NWO Veni) and is a 2018 Researcher-in-Residence at the National Library of The Netherlands. Always interested in discussion across disciplines, she also is co-editor of the Multidisciplinary Column of the ACM SIGMM Records. As a musician, she still has an active performing career, particularly with the (inter)nationally award-winning Magma Duo.

Andrew Demetriou is currently a PhD candidate in the Multimedia Computing Group at the Technical University at Delft. His academic interests lie in the intersection of the psychological and biological sciences, and the relevant data sciences, and furthering our understanding of 1) love, relationships, and social bonding, and 2) optimal, ego-dissolutive, and meditative mental states, 3) by studying people performing, rehearsing, and listening to music. His prior experience includes: assessing the relationship between initial romantic attraction and hormonal assays (saliva and hair) during speed-dating events, validating new classes of experimental criminology VR paradigms using electrocardiography data collected both in a lab and in a wild setting (Lowlands music festival), and syntheses of musical psychology literature which were presented at ISMIR 2016 and 2017.


 


Matt McVicar: Creative applications of MIR Data

In this workshop, you’ll explore the possibility of building creative tools using MIR data. You’ll discuss the abundance of existing data for creative applications, which in the context of this workshop simply means “a human making something musical”. You, as a team, may come up with new product or research ideas based on your own backgrounds, or you may develop an existing idea from existing products or research papers. You may find that the data for your application already exists, so that you can spend the time in the workshop fleshing out the details of how your application will work. Alternatively, you may discover that the data for your task does not exist, in which case you, as a team, could start gathering or planning the gathering of these data.

Matt is Head of Research at Jukedeck. He began his PhD at the University of Bristol under the supervision of Tijl De Bie and finished it whilst on a Fulbright Scholarship at Columbia University in the City of New York with Dan Ellis. He then went on to work under Masataka Goto at the National Institute of Advanced Industrial Science and Technology in Tsukuba, Japan. Subsequently, he returned to Bristol to undertake a two-year grant. He joined Jukedeck in April 2016, and his main interests are the creative applications of MIR to domains such as algorithmic composition.