CC300: Advanced Audio-Visual Processing

Lecturer: Professor Michael Casey

Drawing from research and teaching at leading arts research centres around the world, such as the MIT Media Lab, this module develops the mathematical and computational framework for analysing and synthesising audio-visual materials for application in digital media arts.

Topics include multimedia signals, filters, multimedia synchronisation, the structure and interpretation of audio-visual synthesis algorithms, and multimedia content organisation and information retrieval using MPEG-7.

Assessment comprises the evaluation of a significant audio-visual project built with real-time synthesis software (Max/MSP, PD, Csound, MPEG-4 SAOL). Example projects include digital audio processors, interactive video works, synthesisers and electroacoustic sound compositions, each demonstrating in-depth understanding of one or more audio-visual signal analysis/synthesis techniques.

Indicative Content

  • Perception: light, sound, frequency ranges of eye/ear, decibels, power, cochlea models, retinal models
  • Multimedia theory: interactive cinema, languages of electroacoustic music and interactive media
  • Audio-visual signals: convolution sum; Parseval's relation; Fourier series; Discrete Fourier Transform (DFT); Fast Fourier Transform (FFT); wavelets; singular value decomposition; machine learning from audio-visual signals
  • Analysis/Synthesis methods: filterbank analysis/synthesis (e.g. MP3), object-based coding (MPEG-4)
  • Audio-Visual programming: structured video/audio, Processing, Max/MSP/Jitter, PD
  • MIDI, Jitter and audio-visual software
  • Filters: Z-Transform; Impulse Response; System Function; FIR filters; IIR filters
  • Effects: feedback, pitch shifting, time stretching and band-limited re-sampling
  • Detection: video input devices and audio-visual tracking
  • Generative systems: synthesis of audio-visual content using algorithms

Text Books

  • C. Reas and B. Fry, The Processing Book, MIT Press, 2007
  • P. R. Cook, Real Sound Synthesis for Interactive Applications, A K Peters, 2002
  • T. Cornsweet, Vision, Oxford University Press, 1970
  • S. Handel, Listening: An Introduction to the Perception of Auditory Events, MIT Press, 1989