Project Overview
This project focuses on the application of deep learning models to a creative, non-visual domain: music. The AI is trained on a dataset of musical pieces (in MIDI format) to learn patterns in melody, rhythm, and harmony. The trained model can then generate new, original musical compositions, showing that it has captured the sequential and structural patterns present in the training data.
Tech Stack & Tools
- Python: The primary programming language for the model.
- TensorFlow/Keras: The deep learning frameworks used to build and train the neural network.
- music21: A Python library for parsing and manipulating music data.
- Recurrent Neural Networks (RNNs) / LSTMs: The primary model architecture used for sequential data generation.
My Process
Phase 1: Data Preparation
I collected a dataset of classical music pieces in MIDI format. The raw MIDI data was then processed to extract the musical notes and their durations, which were converted into a numerical format suitable for model training.
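To illustrate this step, here is a minimal sketch of the kind of preprocessing pipeline described above, built with music21. The `midi_songs/` folder name, the sequence length of 50, and the choice to keep only pitch symbols (durations are dropped for brevity) are illustrative assumptions, not the project's exact settings.

```python
import glob
from music21 import converter, note, chord

# Extract note/chord symbols from every MIDI file in a (hypothetical) folder.
notes = []
for midi_path in glob.glob("midi_songs/*.mid"):
    score = converter.parse(midi_path)
    for element in score.flatten().notes:
        if isinstance(element, note.Note):
            # Store the pitch name, e.g. "C4".
            notes.append(str(element.pitch))
        elif isinstance(element, chord.Chord):
            # Represent a chord as dot-joined pitch classes, e.g. "0.4.7".
            notes.append(".".join(str(p) for p in element.normalOrder))

# Map each unique symbol to an integer index for the model.
vocab = sorted(set(notes))
note_to_int = {symbol: i for i, symbol in enumerate(vocab)}

# Build fixed-length input sequences and their next-note targets.
SEQ_LEN = 50  # illustrative window size
inputs, targets = [], []
for i in range(len(notes) - SEQ_LEN):
    inputs.append([note_to_int[s] for s in notes[i:i + SEQ_LEN]])
    targets.append(note_to_int[notes[i + SEQ_LEN]])
```

Representing chords as dot-joined pitch classes keeps the vocabulary discrete, so next-note prediction becomes a classification problem over a fixed symbol set.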
Phase 2: Model Training
I designed and trained a recurrent neural network to predict the next note in a sequence based on the preceding notes. The model learned the patterns and structures inherent in the musical data.
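A sketch of one way such a next-note predictor could be defined in Keras, continuing from the `inputs`, `targets`, `vocab`, and `SEQ_LEN` names in the preparation sketch above; the layer sizes, dropout rate, and training settings are illustrative assumptions rather than the project's actual hyperparameters.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = len(vocab)  # vocab, inputs, targets, SEQ_LEN come from the preparation step

# Shape the data for the LSTM: (samples, timesteps, features), normalized to [0, 1].
X = np.reshape(inputs, (len(inputs), SEQ_LEN, 1)) / float(vocab_size)
y = keras.utils.to_categorical(targets, num_classes=vocab_size)

model = keras.Sequential([
    layers.Input(shape=(SEQ_LEN, 1)),
    layers.LSTM(256, return_sequences=True),
    layers.Dropout(0.3),
    layers.LSTM(256),
    layers.Dense(vocab_size, activation="softmax"),  # distribution over the note vocabulary
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=50, batch_size=64)
```

The softmax output gives a probability distribution over the note vocabulary, and categorical cross-entropy trains the network to assign high probability to the note that actually follows each input sequence.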
Phase 3: Music Generation
Once the model was trained, I used it to generate new sequences of notes. These sequences were then converted back into a MIDI file that could be played and listened to.
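Continuing the earlier sketches, the generation and MIDI-export step might look like the following; the greedy argmax sampling, the 200-note length, and the `generated.mid` filename are illustrative choices.

```python
import numpy as np
from music21 import stream, note, chord

int_to_note = {i: symbol for symbol, i in note_to_int.items()}

# Seed generation with a random training sequence (an assumed choice).
seed = list(inputs[np.random.randint(len(inputs))])
generated = []
for _ in range(200):  # number of notes to generate (illustrative)
    x = np.reshape(seed, (1, len(seed), 1)) / float(vocab_size)
    probs = model.predict(x, verbose=0)[0]
    idx = int(np.argmax(probs))     # greedy pick of the most likely next note
    generated.append(int_to_note[idx])
    seed = seed[1:] + [idx]         # slide the input window forward

# Convert the generated symbols back into a playable MIDI file.
output = stream.Stream()
for symbol in generated:
    if "." in symbol:
        # Chord stored as dot-joined pitch classes (see the preparation sketch).
        output.append(chord.Chord([int(p) for p in symbol.split(".")]))
    else:
        output.append(note.Note(symbol))
output.write("midi", fp="generated.mid")
```

Sampling from the predicted distribution instead of always taking the argmax is a common variation that produces more varied, less repetitive output.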
Results & Future Work
The AI successfully generated original and coherent short musical phrases. While the compositions are not as complex as human-created pieces, the project demonstrates the full generative-modeling workflow for sequential data, from preprocessing through training to generation.
Future Enhancements:
- Implement a more expressive architecture, such as a Transformer, to support longer and more structured compositions.
- Add a user interface to allow users to specify a musical style or theme for the AI to follow.
- Integrate the model with a digital audio workstation (DAW) for real-time collaboration.