Music Data and Music Search at Insane Scale at The Echo Nest

Brian Whitman (The Echo Nest)
Operations and Culture
Location: King's Suite

In the past few years, The Echo Nest has built the largest database of music anywhere: over 2 million artists and 30 million songs, each with detailed information down to the pitch of each note in each guitar solo and every adjective ever said about your favorite new band. We’ve done it with a nimble and speedy custom infrastructure: web crawling, natural language processing, audio analysis and synthesis, audio fingerprinting and deduplication, and front ends to our massive key-value stores and text indexes. Our real-time music data API handles hundreds of queries a second and powers most music discovery experiences you have on the internet today, from iHeartRadio and Spotify to eMusic, VEVO, MOG and MTV.

During this talk, The Echo Nest’s co-founder and CTO will run through the challenges and solutions involved in building music recommendation, search and identification at “severe scale,” with the constraint that most of our results are computed on the fly with little caching. It’s hard to store results when data about music on the internet changes as quickly as the tastes and preferences of your customers’ listeners.
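
To make the caching constraint concrete, here is a minimal sketch of a time-to-live cache in Python. It is an illustration only, not The Echo Nest's code: the class name `TTLCache`, its `get_or_compute` method, and the key format are all hypothetical. The point it shows is that when the underlying data changes quickly, the time-to-live has to be short, so most requests miss the cache and are computed on the fly anyway.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny time-to-live cache: an entry is reused only while it is fresh."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value if it is younger than the TTL, else recompute."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            self.hits += 1
            return entry[1]
        # Missing or stale: do the expensive work again and remember when we did it.
        self.misses += 1
        value = compute()
        self._entries[key] = (now, value)
        return value
```

With a short `ttl_seconds` and a wide spread of distinct queries, `misses` dominates `hits`, which is one way to read the abstract's point that results are mostly computed per request.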

Intro – the world of music data

  • “The future music platform” and APIs in the music industry
  • The Echo Nest music intelligence platform
  • Recommendation & playlisting
  • Audio fingerprinting
  • Music resolving and search

Scale challenges with a music database

  • Acoustic analysis
  • Text indexing
  • Key-value stores

Scaling Solr and Lucene for recommendation and playlisting

  • Building a Pandora with 30 million songs
  • Text indexing challenges with Solr over 1 billion documents
  • Cutting text indexes to dynamic user queries

API front end & workers

  • Scaling to hundreds of queries a second without much caching
  • Fingerprinting
  • Shazam for the world in Solr and Tokyo Tyrant
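
The outline pairs fingerprinting with a text index and a key-value store. As a rough illustration of why a text index suits fingerprint lookup, the sketch below treats each fingerprint code as a "term" in an in-memory inverted index and picks the track that shares the most codes with the query. This is an assumption-laden toy, not the system described in the talk: `FingerprintIndex`, `add_track`, `best_match`, and the integer codes are hypothetical, and a plain Python dictionary stands in for any real index or store.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


class FingerprintIndex:
    """Toy inverted index: fingerprint code -> set of track IDs containing it."""

    def __init__(self) -> None:
        self._postings: Dict[int, Set[str]] = defaultdict(set)

    def add_track(self, track_id: str, codes: Iterable[int]) -> None:
        """Index every code of a known track."""
        for code in codes:
            self._postings[code].add(track_id)

    def best_match(self, query_codes: Iterable[int],
                   min_overlap: int = 2) -> Optional[Tuple[str, int]]:
        """Return (track_id, shared_code_count) for the best candidate, or None."""
        votes: Counter = Counter()
        for code in set(query_codes):  # count each distinct query code once
            for track_id in self._postings.get(code, ()):
                votes[track_id] += 1
        if not votes:
            return None
        # Highest overlap wins; ties are broken by track ID so results are stable.
        track_id, overlap = max(votes.items(), key=lambda kv: (kv[1], kv[0]))
        return (track_id, overlap) if overlap >= min_overlap else None
```

For example, after indexing `"track-a"` with codes `[1, 2, 3, 4]` and `"track-b"` with codes `[3, 4, 5, 6]`, a noisy query of `[1, 2, 3, 99]` resolves to `"track-a"` with three shared codes. A production matcher would also need to check that matching codes line up in time, which this sketch leaves out.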

What’s next

Brian Whitman

The Echo Nest

Brian Whitman (The Echo Nest) teaches computers how to make, listen to, and read about music. He received his doctorate from the Machine Listening group at MIT’s Media Lab in 2005 and his master’s in Computer Science from Columbia University’s Natural Language Processing group in 2000. His research links automatically extracted community knowledge of music to its acoustic properties to “learn the meaning of music.” His composition and sound art projects consider the effects of machine interpretation of large amounts of media, such as the first actual “computer music” (as in music for computers) of “Eigenradio.” As the co-founder and CTO of the Echo Nest Corporation, Brian architects an open platform with billions of data points about the world of music: from the listeners to the musicians to the sounds within the songs.
