M3309.002900: Machine Listening

Lecturer
- Prof. Kyogu Lee ([email protected])
  
  Department of Intelligence and Information; jointly affiliated with the Graduate School of Artificial Intelligence
Teaching assistant
- Jin Woo Lee ([email protected])

Lecture Info

Lectures & Labs
- Where: GSCST 18-212
- When: Mondays, 09:00-11:50 AM
Office hours
- After class (other times by appointment only)

Course Details

Credit-Lecture-Lab: 3-3-0
Course Completion Classification: Combined Masters/Doctorate
Teaching materials
- Lecture slides
- Textbooks (references) you will find useful:
  - Spectral Audio Signal Processing by Julius O. Smith
  - Pattern Recognition and Machine Learning by Christopher Bishop
  - Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Papers from relevant conferences and/or journals will be used as references.

Course Outline

Machine listening (or computer audition) is, alongside computer vision, one of the most widely applied fields of artificial intelligence. Machine listening services such as speech recognition (as in Siri) and automatic music search through audio fingerprinting are already a deep part of everyday life. In this course, a series of lectures covers the fundamentals of the state-of-the-art machine learning algorithms used to build artificial hearing, or machine listening, systems, including:

- LTI systems, convolution theorem, sampling theorem
- Mel spectrogram, MFCC, spectral envelope, linear prediction, phase vocoder
- Onset detection, F0 detection, chroma, CQT, key and chord estimation
- Deep learning, back-propagation, CNNs, RNNs, Transformers
- Acoustic scene classification
- Audio-text multimodal learning
- Neural harmonic-plus-noise model

Students will have a chance to implement these algorithms in the lab sessions. In the final project, the aim is to build a real-world system that can be applied to audio, music, or auditory perception.