Friday, September 4, 2009

Some more papers on instrument ID

Three papers my supervisor has brought to my attention:

Thursday, August 27, 2009

IPEM and the psychology of electronic music

I'm working with the IPEM Toolbox, a difficult task given that it's tied to Matlab versions 5.3.1 and 6.0. The source code is available and can be toyed with to try to make it compatible with later versions. A quicker fix is to find the original Matlab versions, more as a time-saving exercise! As an aside, the concept of IPEM and the extraction of perceptual features is of interest to my own research. In particular, as an electronic music fan, I was interested in this article, which discusses the "Psychology of Electronic Music".

Wednesday, August 12, 2009

4475 separated notes

I separated the notes from the majority of the IOWA musical instrument samples. Total notes = 4475. WOW! It wasn't easy! I'll be writing about how I did it in my next publication. Look out for a few short lines describing about 9 days of hard work!!

Monday, August 3, 2009

Another Instrument ID paper

I came across another paper on musical instrument ID today:

Loughran, R., Walker, J., O'Neill, M., O'Farrell, M., 2008 - Musical Instrument Identification Using Principal Component Analysis and Multi-Layered Perceptrons.pdf

Loughran, R. uses the temporal envelope of the signal residual (post removal of the RMS temporal envelope) as a feature which I haven't come across before in the literature.
"Temporal and spectral envelopes. The temporal envelope was found by calculating the RMS energy envelope of each sound, which was then filtered using a 3rd order low pass Butterworth filter. This envelope was calculated over the length of each note and so includes temporal information on how the energy within the sound changes over time. Thus this envelope incorporates information regarding the attack time which has been shown to be of high importance to instrument classification [13]. The temporal envelope was then subtracted from the original sound to find the residual. The temporal residual envelope was calculated from the RMS of this residual."
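
As a note to self, here is a minimal Python sketch of how I read that passage. It is not the authors' code. The sliding-window RMS, the window length (`win`), the 20 Hz cutoff (`cutoff_hz`) and the use of zero-phase filtering via `filtfilt` are all my own assumptions, since the quote only specifies a 3rd order low pass Butterworth filter.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def rms_envelope(x, win=512):
    """Sliding-window RMS at the signal's own sample rate (window length is an assumption)."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(win) / win
    mean_sq = np.convolve(x ** 2, kernel, mode="same")
    return np.sqrt(mean_sq)


def smooth(env, sr, cutoff_hz=20.0, order=3):
    """3rd order low pass Butterworth; cutoff and forward-backward filtering are assumptions."""
    b, a = butter(order, cutoff_hz / (sr / 2.0), btype="low")
    return filtfilt(b, a, env)


def temporal_envelopes(x, sr, win=512, cutoff_hz=20.0):
    """Return (temporal envelope, residual, temporal residual envelope)."""
    x = np.asarray(x, dtype=float)
    env = smooth(rms_envelope(x, win), sr, cutoff_hz)
    residual = x - env                      # envelope subtracted from the original sound
    residual_env = rms_envelope(residual, win)
    return env, residual, residual_env
```

For a single note `x` sampled at `sr`, `temporal_envelopes(x, sr)` gives three arrays of the same length as `x`, which could then be reduced (for example with PCA, as in the paper) before classification.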

Friday, July 31, 2009

A possible approach to data preparation

As more of a note to self, there are some interesting aspects of the final year project "Musical Instrument Detection" by Gautham J. Mysore, Gregory Sell and SongHui Chon, which I may consider:
  1. The ensemble approach to classification:
    "...we attempt the problem of identifying the instrumentation of a musical signal at any given time using several machine learning techniques (logistic regression, K-NN, SVM). We approached the problem as a series of separate binary classifications (as opposed to a multivariate problem) so that we could mix and match the best algorithm for each instrument to create the best overall classifier."
  2. The mixture signals were created artificially:
    "Then, to create one of the combinations above, a random signal for each instrument were all combined randomly. In this way, we created 52 total signals for each instrumental combination."
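
Again as a note to self, the artificial mixing in point 2 could be sketched roughly as below. The report does not say exactly how the signals were "combined", so I am assuming a plain sum of one randomly chosen note per instrument, truncated to the shortest note and peak-normalised. The names `pool`, `make_mixture` and `make_dataset` are mine, not theirs.

```python
import numpy as np


def make_mixture(pool, instruments, rng):
    """Sum one randomly chosen signal per instrument (summation is an assumption)."""
    picks = [pool[name][rng.integers(len(pool[name]))] for name in instruments]
    n = min(len(p) for p in picks)          # truncate to the shortest signal
    mix = np.sum([p[:n] for p in picks], axis=0)
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 0 else mix  # peak-normalise to avoid clipping


def make_dataset(pool, instruments, n_mixtures=52, seed=0):
    """Create n_mixtures random mixtures for one instrument combination."""
    rng = np.random.default_rng(seed)
    return [make_mixture(pool, instruments, rng) for _ in range(n_mixtures)]
```

Here `pool` is a dictionary mapping an instrument name to a list of 1-D arrays, e.g. the separated IOWA notes, and the default of 52 mixtures simply mirrors the number quoted above.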

Thursday, July 30, 2009

HMM toolboxes

Some toolboxes and links:

Hidden Markov Model Toolbox for Matlab
Mendel HMM Toolbox for Matlab
H2M: A set of Matlab/Octave functions for the estimation of mixtures and hidden Markov models

Modelling the temporal dynamics of timbre

"Musical Instrument Timbres Classification with Spectral Features", Giulio Agostini, Maurizio Longari, Emanuele Pollastri (2001):
A considerable number of features is currently available in the literature, each one describing some aspects of audio content [22, 23]. In the digital domain, features are usually calculated from a window of samples, which is normally very short compared to the total duration of a tone. Thus, we must face the problem of summarizing their temporal evolution into a small set of values. Mean, standard deviation, skewness, and autocorrelation have been the preferred strategies for their simplicity, but more advanced methods like hidden Markov models could be employed, as illustrated in [21, 22]. By combining these time-spanning statistics with the known features, an impressive number of variables can be extracted from each sound. The researcher, though, has to carefully select them in order to both keep the time required for the extraction to a minimum and, more importantly, to prevent from incurring into the so-called curse of dimensionality.
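
A small sketch of the "time-spanning statistics" idea, assuming a single frame-wise feature track (say, the spectral centroid per window) is already available. This is my own illustration rather than anything from the paper: the function name `summarise`, the population standard deviation and the single-lag autocorrelation are assumptions.

```python
import numpy as np


def summarise(track, lag=1):
    """Collapse a frame-wise feature track into a few summary statistics (lag must be >= 1)."""
    track = np.asarray(track, dtype=float)
    mean = track.mean()
    std = track.std()                       # population standard deviation
    centred = track - mean
    if std > 0 and len(track) > lag:
        skew = np.mean(centred ** 3) / std ** 3
        # normalised autocorrelation at the given lag
        autocorr = np.sum(centred[:-lag] * centred[lag:]) / np.sum(centred ** 2)
    else:
        skew, autocorr = 0.0, 0.0
    return {"mean": mean, "std": std, "skewness": skew, "autocorr": autocorr}
```

Each feature then contributes four numbers per note regardless of its duration, which is exactly where the dimensionality warning in the quote starts to bite once many features are stacked.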
Taken from "Computer Models for Musical Instrument Identification", Nicolas D. Chétry (pg.180):
When modelling timbre, our system and the ones encountered in the literature lose time consideration. In other words, the temporal organisation of the various acoustic events is not represented at the model level. This approach is understandable if one considers timbre as a global attribute of sound. However, we showed in our experiments that, for example, onset and steady-state segments of tones have different characteristics, so that they each contribute to a particular aspect of timbre. Instead of averaging them in one single model, one could think of independently and explicitly modelling them. Similar to speaker recognition, the incorporation of Dynamic Time Warping (DTW) or Hidden Markov Models (HMM) can constitute a possible orientation for future research.
In a discussion with my supervisor, we noted the importance of considering the temporal dynamics of an instrument sound. As the uniqueness of the spectral envelope cannot be absolutely guaranteed across instruments, extra information about the signal's temporal behaviour is often required in order to increase the system's performance. The main difficulty in extracting such temporal features is that robust automated pre-processing techniques for onset or transient detection are difficult to design, especially in the case of pitched musical sounds. As noted by Chétry,
"For this reason, a more general approach is preferred. It consists of appending the delta (speed) and delta-deltas (acceleration) coefficients to the feature vector in order to include information about its evolution with time."
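
To make the delta idea concrete for myself, here is a minimal sketch using the common regression-style delta over a window of ±`width` frames with edge padding. The thesis does not state which delta formula or window was used, so the formula, `width=2` and the function names are assumptions on my part.

```python
import numpy as np


def delta(features, width=2):
    """Regression-style delta of a (n_frames, n_coeffs) array; edges are padded by repetition."""
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[0]
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n: width + n + n_frames]    # frames t + n
        behind = padded[width - n: width - n + n_frames]   # frames t - n
        out += n * (ahead - behind)
    return out / denom


def add_deltas(features, width=2):
    """Append delta (speed) and delta-delta (acceleration) columns to the feature matrix."""
    d = delta(features, width)
    dd = delta(d, width)
    return np.hstack([np.asarray(features, dtype=float), d, dd])
```

Given, say, a matrix of 13 cepstral coefficients per frame, `add_deltas` returns 39 columns per frame, so the classifier sees some of the temporal evolution without needing an explicit onset detector.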