Datasets

    The datasets I have used in my publications can be downloaded here. Because they are intended for academic research only, all archives are password protected. Please contact me by e-mail if you would like to use a dataset.

    Dataset                               Description
    Annotated (992 MB)                    Contains 190 songs from popular and unknown artists, together with the genre votes from a listening experiment.
    Unique (389 MB)                       Contains 3115 song excerpts from popular artists.
    1517-Artists (16.1 GB, parts 1 to 9)  Contains 3180 full-length tracks by popular and unknown artists.
    Pop (2.7 GB)                          A tempo classification dataset used in "From Rhythm Patterns to Perceived Tempo".
    Music Detection (160 MB)              A music detection dataset used in "Automatic Music Detection in Television Productions".

     

    Distance Matrices

    To make it easier to compare other music similarity algorithms against them, the distance matrices produced by the algorithms evaluated in my PhD thesis can be downloaded from this page.

    Algorithm  ismir2004  Homburg  GTZAN  Unique  1517-Artists  Ballroom  Annotated
    SG         x          x        x      x       x             x         x
    BLS        x          x        x      x       x             x         x
    MARSYAS    x          x        x      x       x             x         x
    RTBOF      x          x        x      x       x             x         x
    RND        x          x        x      x       x             x         x
    TAGVS      x          x        x      x       x             x         x
    G1C        x          x        x      x       x             x         x
    TAG        x          x        x      x       x             x         x
    CMB1       x          x        x      x       x             x         x
    CMB2       x          x        x      x       x             x         x
    CMB3       x          x        x      x       x             x         x
    CMB4       x          x        x      x       x             x         x
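
    As an illustration of how two distance matrices for the same collection might be compared, the sketch below computes leave-one-out k-nearest-neighbour genre classification accuracy, a common proxy for music similarity quality. It is not necessarily the evaluation used in the thesis. The function name knn_accuracy, the in-memory list-of-lists layout, and the toy matrices and labels are all assumptions made for this example; the file format of the downloadable matrices is not described here, so loading them is left out.

```python
from collections import Counter


def knn_accuracy(dist, labels, k=1):
    """Leave-one-out k-NN genre classification accuracy.

    dist   -- square matrix (list of lists), dist[i][j] = distance between tracks i and j
    labels -- genre label of each track, in the same order as the matrix rows
    (Both the layout and the function name are assumptions for this sketch.)
    """
    n = len(labels)
    correct = 0
    for i in range(n):
        # Rank all other tracks by their distance to track i; the query itself is excluded.
        neighbours = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        # Majority vote over the genres of the k nearest tracks.
        votes = Counter(labels[j] for j in neighbours)
        predicted = votes.most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / n


# Toy data (hypothetical): four tracks, two genres, two competing algorithms.
labels = ["rock", "rock", "jazz", "jazz"]
dist_a = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
dist_b = [
    [0.0, 0.9, 0.1, 0.8],
    [0.9, 0.0, 0.7, 0.2],
    [0.1, 0.7, 0.0, 0.9],
    [0.8, 0.2, 0.9, 0.0],
]

print("algorithm A:", knn_accuracy(dist_a, labels))
print("algorithm B:", knn_accuracy(dist_b, labels))
```

    In the toy data, algorithm A places tracks of the same genre close together and algorithm B does not, so A scores higher. With real matrices, an artist filter (excluding tracks by the query's artist from the neighbours) is often applied as well; it is omitted here to keep the sketch short.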

     

    Tag Affinity and Tag Binary Matrices

     

    Software

    MIREX 2009 Submission (Genre Classification)

    Tag Classification Evaluation Script