WiMIR Workshop 2021 Project Guides

This is a list of Project Guides and their areas of interest for the 2021 WiMIR Virtual Workshop, which will take place as an online-only satellite event of ISMIR 2021.

The Workshop will take place on Friday, October 29 and Saturday, October 30 – please sign up by using this form: https://forms.gle/GHjqwaHWBciX9tuT7

We know that timezones for this are complicated, so we’ve made a Google Calendar with all the events on it – visit this link to add them to your calendar

This year’s Workshop is organized by Courtney Reed (Queen Mary University), Kitty Shi (Stanford University), Jordan B. L. Smith (ByteDance), Thor Kell (Spotify), and Blair Kaneshiro (Stanford University).

October 29

Dorien Herremans: Music Generation – from musical dice games to controllable AI models

This event will take place at 1500, GMT+8

In this fireside chat, Dorien will give a brief overview of the history of music generation systems, with a focus on the current challenges in the field, followed by an open discussion and Ask-Me-Anything (AMA) session. Prof. Herremans’ recent work has focused on creating controllable music generation systems using deep learning technologies. One challenge in particular – generating music with steerable emotion – has been central in her research. When it comes to affect and emotion, computer models still do not compare to humans. Using affective computing techniques and deep learning, Dorien’s team has built models that learn to predict perceived emotion from music. These models are then used to generate new fragments in a controllable manner, so that users can steer the desired arousal/valence level or tension in newly generated music. Other challenges tackled by Dorien’s team include ensuring repeated themes in music, automatic music transcription, and novel music representations, including the PyTorch GPU library nnAudio.

Dorien Herremans is an Assistant Professor at Singapore University of Technology and Design, where she is also Director of Game Lab. Before joining SUTD, she was a Marie Sklodowska-Curie Postdoctoral Fellow at the Centre for Digital Music at Queen Mary University of London, where she worked on the project “MorpheuS: Hybrid Machine Learning – Optimization techniques To Generate Structured Music Through Morphing And Fusion”. She received her Ph.D. in Applied Economics on the topic of Computer Generation and Classification of Music through Operations Research Methods, and graduated as a Business Engineer in Management Information Systems at the University of Antwerp in 2005. After that, she worked as a consultant and was an IT lecturer at Les Roches University in Bluche, Switzerland. Dr. Herremans’ research interests include AI for novel applications in music and audio.

Kat Agres: Music, Brains, and Computers, Oh My!

This event will take place at 1600, GMT+8

In an informal, ask-me-anything chat, Kat will discuss her career path from cognitive science, through computational approaches to music cognition, to her current research in music, computing, and health.

Kat Agres is an Assistant Professor at the Yong Siew Toh Conservatory of Music (YSTCM) at the National University of Singapore (NUS), and teaches classes at YSTCM, Yale-NUS, and the NUS YLL School of Medicine. She was previously a Research Scientist III and founder of the Music Cognition group at the Institute of High Performance Computing, A*STAR. Kat received her PhD in Psychology (with a graduate minor in Cognitive Science) from Cornell University in 2013, and holds a bachelor’s degree in Cognitive Psychology and Cello Performance from Carnegie Mellon University. Her postdoctoral research was conducted at Queen Mary University of London, in the areas of Music Cognition and Computational Creativity. She has received numerous grants to support her research, including Fellowships from the National Institutes of Health (NIH) and the National Institute of Mental Health (NIMH) in the US, postdoctoral funding from the European Commission’s Future and Emerging Technologies (FET) program, and grants from various funding agencies in Singapore. Kat’s research explores a wide range of topics, including music technology for healthcare and well-being, music perception and cognition, computational modeling of learning and memory, statistical learning, automatic music generation, and computational creativity. She has presented her work in over fifteen countries across four continents, and remains an active cellist in Singapore.

Tian Cheng: Beat Tracking with Sequence Models

This event will take place at 1830, GMT+9

Beat tracking is an important MIR task with a long history. It provides basic metrical information and is the foundation of synchronization-based applications. In this talk, I will summarize common choices for building a beat tracking model based on research on related topics (beat, downbeat, and tempo). I will also compare simple sequence models for beat tracking. In the last part, I will give some examples to show how beat tracking is used in real-world applications.

Tian Cheng is a researcher in the Media Interaction Group at the National Institute of Advanced Industrial Science and Technology (AIST), Japan. From 2016 to 2018, she was a postdoctoral researcher in the same group. Her research interests include beat tracking and music structure analysis. Her work provides basic music content estimations to support applications for music editing and creation. She received her PhD from Queen Mary University of London in 2016, and her dissertation focused on using music acoustics for piano transcription.

Stefania Serafin: Sonic Interactions for All

This event will take place at 1800, GMT+2

In this workshop I will introduce our recent work on using novel technologies and sonic interaction design to help hearing-impaired users and individuals with limited mobility enjoy music. The talk will present the technologies we develop in the Multisensory Experience lab at Aalborg University in Copenhagen, such as VR, AR, novel interfaces, and haptic devices, as well as how these technologies can be used to help populations in need.

Stefania Serafin is Professor of Sonic Interaction Design at Aalborg University in Copenhagen. She received a Ph.D. in Computer Based Music Theory and Acoustics from Stanford University. She is the president of the Sound and Music Computing Association and principal investigator of the Nordic Sound and Music Computing Network. Her research interests include sonic interaction design, sound for VR and AR, and multisensory processing.

Oriol Nieto: Overview, Challenges, and Applications of Audio-based Music Structure Analysis

This event will take place at 1000, GMT-7

The task of audio-based music structure analysis aims at identifying the different parts of a given music signal and labeling them accordingly (e.g., verse, chorus). The automatic approach to this problem can help several applications such as intra- and inter-track navigation, section-aware automatic DJ-ing, section-based music recommendation, etc. This is a fundamental MIR task that has significantly advanced over the past two decades, yet still poses several interesting research challenges. In this talk I will give an overview of the task, discuss its open challenges, and explore the potential applications, some of which have been employed at Adobe Research to help our users have better creative experiences.

Oriol Nieto (he/him or they/them) is a Senior Audio Research Engineer at Adobe Research in San Francisco. He is a former Staff Scientist in the Radio and Music Informatics team at Pandora, and holds a PhD from the Music and Audio Research Laboratory of New York University. His research focuses on topics such as music information retrieval, large scale recommendation systems, music generation, and machine learning on audio with special emphasis on deep architectures. His PhD thesis is about teaching computers to better “understand” the structure of music. Oriol develops open source Python packages, plays guitar, violin, and cajón, and sings (and screams) in their spare time.

Emma Frid: Music Technology for Health

This event will take place at 2000, GMT+2

There is a growing interest in sound and music technologies designed to promote health, well-being, and inclusion, with many multidisciplinary research teams aiming to bridge the fields of accessibility, music therapy, universal design, and music technology. This talk will explore some of these topics through examples from two projects within my postdoctoral work at IRCAM/KTH: Accessible Digital Musical Instruments – Multimodal Feedback and Artificial Intelligence for Improved Musical Frontiers for People with Disabilities, focused on the design and customization of Digital Musical Instruments (DMIs) to promote access to music-making; and COSMOS (Computational Shaping and Modeling of Musical Structures), focused on the use of data science, optimization, and citizen science to study musical structures as they are created in music performances and in unusual sources such as heart signals. 

Emma Frid is a postdoctoral researcher at the Sciences et technologies de la musique et du son (STMS) Laboratory, at the Institut de Recherche et Coordination Acoustique/Musique (IRCAM) in Paris, where she is working on the COSMOS project, under a Swedish Research Council International Postdoctoral Grant hosted by the Sound and Music Computing Group at KTH Royal Institute of Technology. She holds a PhD in Sound and Music Computing from the Division of Media Technology and Interaction Design at KTH, and a Master of Science in Engineering in Media Technology from the same university. Her PhD thesis focused on how Sonic Interaction Design can be used to promote inclusion and diversity in music-making. Emma’s research is centered on multimodal sound and music interfaces designed to promote health and inclusion, predominantly through work on Accessible Digital Musical Instruments (ADMIs).

Nick Bryan: Learning to Control Signal Processing Algorithms with Deep Learning

This event will take place at 1230, GMT-7

Expertly designed signal processing algorithms have been ubiquitous for decades and helped create the foundation of countless industries and areas of research (e.g. music information retrieval, audio fx, voice processing). In the last decade, however, expertly designed signal processing algorithms have been rapidly replaced with data-driven neural networks, posing the question: is signal processing still useful? And if so, how? In this talk, I will attempt to address these questions and provide an overview of how we can combine both disciplines and use neural networks to control (or optimize) existing signal processing algorithms from data and perform a variety of tasks such as guitar distortion modeling, automatic removal of breaths and pops from voice recordings, automatic music mastering, acoustic echo cancelation, and automatic voice production. I will then discuss open research questions and future research directions with a focus on music applications.

Nicholas J. Bryan is a senior research scientist at Adobe Research and is interested in (neural) audio and music signal processing, analysis, and synthesis. Nick received his PhD and MA from CCRMA, Stanford University, and an MS in Electrical Engineering, also from Stanford, as well as a Bachelor of Music and a BS in Electrical Engineering with summa cum laude honors from the University of Miami-FL. Before Adobe, Nick was a senior audio algorithm engineer at Apple and worked on voice processing algorithms for 4.5 years.

October 30

Lamtharn “Hanoi” Hantrakul: Transcultural Machine Learning in Music and Technology

This event will take place at 1600, GMT+8

Transcultural Technologies empower cultural pluralism at every phase of engineering and design. We often think of technology as a neutral tool, but technology is always created and optimized within the cultural scope of its inventors. This cultural mismatch is most apparent when tools are used across a range of contrasting traditions. Music and Art from different cultures, and the people that create and breathe these mediums, are an uncompromising sandbox to both interrogate these limitations and develop breakthroughs that empower a plurality of cultures. In this talk, we will be taking a deep dive into tangible audio technologies incubated in musical traditions from Southeast Asia, South Asia, South America and beyond.

Hanoi is a Bangkok-born Shanghai-based Cultural Technologist, Research Scientist and Composer. As an AI researcher, Hanoi focuses on audio ML that is inclusive of musical traditions from around the world. At Google AI, he co-authored the breakthrough Differentiable Digital Signal Processing (DDSP) library with the Magenta team and led its deployment across two Google projects: Tone Transfer and Sounds of India. At TikTok, he continues to develop AI tools that empower music making across borders and skill levels. As a Cultural Technologist, Hanoi has won international acclaim for his transcultural fiddle “Fidular” (Core77, A’), which has been displayed in museums and exhibitions in the US, EU and Asia. He is fluent in French, Thai, and English, and is working on his Mandarin.

Jason Hockman + Jake Drysdale: Give the Drummer Some

This event will take place at 1000, GMT+1

In the late 1980s, popular electronic music (EM) emerged at the critical intersection between affordable computer technology and the consumer market, and has since grown to become one of the most popular genres in the world. Ubiquitous within EM creation, digital sampling has facilitated the incorporation of professional-quality recorded performances into productions; among the most frequently sampled types of recordings used in EM are short percussion solos from funk and jazz performances, or breakbeats. While these samples add an essential energetic edge to productions, they are generally used without consent or recognition. Thus, there is an urgency for the ethical redistribution of cultural value to account for the influence of a previous generation of artists. This workshop will present an overview of breakbeats and their relation to modern music genres, as well as current approaches for breakbeat analysis, synthesis and transformative effects developed in the SoMA Group at Birmingham City University.

Jake Drysdale is currently a PhD student in the Sound and Music Analysis Group (SoMA) at Birmingham City University, where he specialises in neural audio synthesis and structural analysis in electronic music genres. Jake leverages his perspective as a professional electronic music producer and DJ towards the development of intelligent music production tools that break down boundaries imposed by current technology.

Jason Hockman is an associate professor of audio engineering at Birmingham City University. He is a member of the Digital Media Technology Laboratory (DMTLab), in which he leads the Sound and Music Analysis (SoMA) Group for computational analysis of sound and music and digital audio processing. Jason conducts research in music informatics, machine listening and computational musicology, with a focus on rhythm and metre detection, music transcription, and content-based audio effects. As an electronic musician, he has had several critically acclaimed releases on established international record labels, including his own Detuned Transmissions imprint.

Olumide Okubadejo: Ask Me Anything

This event will take place at 1800, GMT+2

Bring your industry questions for Spotify’s Olumide Okubadejo in this informal discussion, covering moving research into production, industry scale vs. academic scale, and moving between the two worlds.

Olumide Okubadejo was born and raised in Nigeria. The musical nature of the family he was born into and the streets he was raised in instilled in him an early penchant for music. This appreciation for music led him to play the drum in his local community by the age of 9. He later went on to learn to play the piano and the guitar by the age of 15. He studied for his undergraduate degree at FUTMinna, Nigeria, earning a Bachelor of Engineering degree in Electrical and Computer Engineering. He proceeded to the University of Southampton, where he earned a master’s degree in Artificial Intelligence, and then to France, where he earned a PhD. Since then he has focused his research on machine learning for sound and music. These days, with Spotify, he researches assisted music creation using machine learning.

Cory McKay: What can MIR teach us about music? What can music teach us in MIR?

This event will take place at 1300, GMT-4

Part of MIR’s richness is that it brings together experts in diverse fields, from both academia and industry, and gets us to think about music together. However, there is perhaps an increasing tendency to segment MIR into discrete, narrowly defined problems, and to attempt to address them largely by grinding huge, noisy datasets, often with the goal of eventually accomplishing something with commercial applications. While all of this is certainly valuable, and much good has come from it, there has been an accompanying movement away from introspective thought about music, from investigating fundamental questions about what music is intrinsically, and how and why people create, consume and are changed by it. The goals of this workshop are to discuss how we can use the diverse expertise of the MIR community to do better in addressing foundational music research, and how we can reinforce and expand collaborations with other research communities.

Cory McKay is a professor of music and humanities at Marianopolis College and a member of the Centre for Interdisciplinary Research in Music Media and Technology in Montréal, Canada. His multidisciplinary background in information science, jazz, physics and sound recording has helped him publish research in a diverse range of music-related fields, including multimodal work involving symbolic music representations, audio, text and mined cultural data. He received his Ph.D., M.A. and B.Sc. from McGill University, completed a second bachelor’s degree at the University of Guelph, and did a postdoc at the University of Waikato. He is the primary designer of the jMIR software framework for performing multimodal music information retrieval research, which includes the jSymbolic framework for extracting musical features from digital scores, and also serves as music director of the Marianopolis Laptop Computer Orchestra (MLOrk).
