web: charlesmartin.au mastodon: @charlesmartin@aus.social
| Creative Deep Learning Systems | NIMEs |
|---|---|
| Focus on MIDI data (e.g., Magenta Studio) | Yes MIDI, but also many custom sensors |
| Focus on digital audio | Focus on performer gestures |
| Focus on composition/artefact generation | Focus on interaction |
| Rhythm on a 16th-note grid | Complex or no rhythm |
| Focus on categorical data | Continuous data more interesting |
Mixture density networks (MDNs) are good at predicting creative, continuous, multi-dimensional data: handwriting, sketches… musical gestures?
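To make the idea concrete, here is a minimal sketch of sampling from an MDN's output: a Gaussian mixture over a continuous value. All parameter values (`pi`, `mu`, `sigma`) below are hypothetical placeholders standing in for what a trained MDN head would emit; this is not the deck's actual implementation.

```python
import numpy as np

def sample_mdn(pi, mu, sigma, rng):
    """Draw one sample from a 1-D Gaussian mixture: pick a
    component using the categorical weights pi, then sample
    from that component's Gaussian."""
    i = rng.choice(len(pi), p=pi)
    return rng.normal(mu[i], sigma[i])

# Hypothetical mixture parameters (placeholders, not trained values):
pi = np.array([0.7, 0.3])     # mixture weights (sum to 1)
mu = np.array([0.0, 5.0])     # component means
sigma = np.array([0.5, 0.5])  # component standard deviations

rng = np.random.default_rng(0)
samples = np.array([sample_mdn(pi, mu, sigma, rng) for _ in range(2000)])
```

Because the output is a distribution rather than a single point, the model can represent the multi-modal, continuous structure of gestural data that a plain regression output would average away.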
Time per prediction (ms) with different sizes of LSTM layers.
Time per prediction (ms) with different MDN output dimensions. (64 LSTM units)
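A rough way to reproduce this kind of measurement is to time a single recurrent step at different layer sizes. The sketch below uses a pure-NumPy LSTM cell as a stand-in for the real framework code, so the absolute numbers will differ from the slides; only the scaling trend with layer size is meaningful.

```python
import time
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step in plain NumPy; gates stacked as [i, f, g, o]."""
    n = h.shape[0]
    z = W @ x + U @ h + b
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2 * n]))   # forget gate
    g = np.tanh(z[2 * n:3 * n])             # candidate cell state
    o = 1.0 / (1.0 + np.exp(-z[3 * n:]))    # output gate
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def time_per_prediction(units, dim=4, reps=100):
    """Average wall-clock time (ms) for one LSTM step."""
    rng = np.random.default_rng(0)
    W = rng.standard_normal((4 * units, dim))
    U = rng.standard_normal((4 * units, units))
    b = np.zeros(4 * units)
    x = rng.standard_normal(dim)
    h, c = np.zeros(units), np.zeros(units)
    t0 = time.perf_counter()
    for _ in range(reps):
        h, c = lstm_step(x, h, c, W, U, b)
    return 1000.0 * (time.perf_counter() - t0) / reps

for units in (64, 128, 256, 512):
    print(f"{units} units: {time_per_prediction(units):.3f} ms")
```

The per-step cost is dominated by the `U @ h` product, which grows quadratically with the number of units, so keeping layers small matters for real-time interaction.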
12K sample dataset (15 minutes of performance)
Takeaway: the smallest model works best for small datasets. Don’t bother training for too long.
100K sample dataset (120 minutes of performance)
Takeaway: the 64- and 128-unit models are still best!
Takeaway: make the Gaussians less diverse and the categorical distribution more diverse.
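One common way to realise this takeaway is with separate sampling temperatures for the two parts of the MDN output. The sketch below is an illustration, not the deck's exact code; the precise scaling rule (and parameter names `pi_temp`, `sigma_temp`) are assumptions, and implementations differ in how they apply the sigma temperature.

```python
import numpy as np

def adjust_temperature(pi, sigma, pi_temp=2.0, sigma_temp=0.25):
    """Reshape MDN outputs before sampling.
    pi_temp > 1 flattens the mixture weights, so the component
    choice becomes MORE diverse; sigma_temp < 1 shrinks each
    Gaussian's spread, so samples within a component become
    LESS diverse."""
    logits = np.log(pi) / pi_temp          # temper the categorical part
    pi_adj = np.exp(logits - logits.max()) # re-normalise via softmax
    pi_adj /= pi_adj.sum()
    sigma_adj = sigma * np.sqrt(sigma_temp)  # shrink Gaussian spread
    return pi_adj, sigma_adj
```

With `pi_temp=2.0` a dominant weight like 0.9 is pulled towards uniform, while `sigma_temp=0.25` halves each component's standard deviation: the model explores more modes but plays each one more precisely.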
Twitter: @cpmpercussion
Website: https://cpmpercussion.github.io/creative-prediction/