Hadoop and MapReduce are enabling consumers of big data to store and process ever-increasing volumes of data. Logs, sensor data, user-generated content, feeds – the list goes on.
At Cloudera, we’ve witnessed the same process occur at organizations across the world: what starts as a research project on a small Hadoop cluster ends up in the production pipeline, thus creating a new burden for the operations team. A number of tools for managing a reliable and scalable Hadoop installation have been produced by Hadoop’s vibrant user community, including major contributions from Yahoo!, Facebook, and IBM Research. We’ll demonstrate how to put these tools to use in your Hadoop cluster.
During this session, we’ll share war stories from clusters we’ve managed, as well as specific tips and tricks for scaling Hadoop from tens to thousands of nodes. We’ll cover the following in detail:
Jeff Hammerbacher was an Entrepreneur in Residence at Accel Partners immediately prior to joining Cloudera. Before Accel, he conceived, built, and led the Data team at Facebook. The Data team was responsible for driving many of the applications of statistics and machine learning at Facebook, as well as building out the infrastructure to support these tasks for massive data sets. The team produced two open source projects: Hive, a system for offline analysis built above Hadoop, and Cassandra, a structured storage system on a P2P network. Before joining Facebook, Jeff was a quantitative analyst on Wall Street. Jeff earned his Bachelor’s Degree in Mathematics from Harvard University.
Comments
An excellent presentation. Great to get technical.
Good coverage of HDFS.
I was hoping for more information on streaming logs into the warehouse from production systems.
Again, the A/V (audio) tech in there with his iPhone ringing away. Bad form. On a separate note, I found this presentation to be very informative and extremely well organized.
Really informative and, though technically in depth, well organized.