EMI Music is to release parts of its ‘EMI Million Interview Dataset’ to around 150 data scientists from non-profit organisation Data Science London, who will study what EMI describes as the ‘most extensive collection of data on music consumers ever shared’.
The dataset contains information on music fans’ interests, attitudes and behaviours, as well as their familiarity with and appreciation of music. It has been partly generated by WPP-owned Lightspeed Research, which has been working with EMI for two years across 25 countries to better understand how to connect artists and their music with consumers.
A subset of data from those interviews will now be shared to enable extraction of new insights. This collaborative project will also feature a series of community events and conferences, and starts with the Music Data Science Hackathon competition taking place this weekend.
The competition asks: ‘Can you predict if a listener will love a new song?’ Entrants must develop an algorithm that can predict a listener’s level of appreciation for songs and artists, based on demographics, word associations, and the past interviews contained in the dataset. Data scientists taking part in the hackathon will be competing for £6,500 in cash prizes, sponsored by EMI and by EMC, a data science and big data solutions firm.
Crowdsourcing provider Kaggle will be hosting the competition on its online platform, after which the firm will make parts of the Million Interview Dataset available to its online community of more than 44,000 data scientists around the world.
David Boyle, SVP Insight for EMI Music, comments: ‘EMI’s insight has profoundly increased our understanding of music consumers and the service we now provide for our artists. With the EMI Million Interview Dataset we hope to bring more new ways of thinking into our industry that will deliver enormous benefits to artists and their fans.’
Web sites: www.emimusic.com