A Day in the Life of Facebook Operations

Tom Cook (Dropbox)
Average rating: 4.13 (38 ratings)

Facebook is now the #2 global website, responsible for billions of photos, conversations, and interactions between people all around the world, running on top of tens of thousands of servers spread across multiple geographically separated datacenters. When problems arise in the infrastructure behind the scenes, they directly impact the ability of people to connect and share with those they care about around the world.

Facebook’s Technical Operations team has to balance this need for constant availability with a fast-moving and experimental engineering culture. We release code every day. Additionally, we are supporting exponential user growth while still managing an exceptionally high ratio of users per employee within engineering and operations.

This talk will go into how Facebook is “run” day-to-day, with particular focus on the actual tools in use (configuration management systems, monitoring, automation, etc.), how we detect anomalies and respond to them, and the processes we use internally for rapidly pushing out changes while still keeping a handle on site stability.

Tom Cook

Dropbox

Tom is a Systems Engineer on the Technical Operations team at Facebook, where he is responsible for a variety of low-level services and systems within the production environment. During his time at Facebook, the systems footprint has expanded over 10×. Prior to joining the company, Tom worked for a number of smaller tech companies in Texas.


Comments

kathy allen
06/23/2010 4:14pm PDT

great talk, but for me, i wanted to see more nitty-gritty of what happens in the trenches, specific problems solved. thank you for showing how, if ops can make the right decisions early on about automation and config mgmnt, the company can move fast and not get dragged down by manual labor.

Ernest Mueller
06/23/2010 2:33pm PDT

Great information. I assume the ops and engineer per user numbers mean that Facebook has a 2:1 dev to ops ratio? Anyway, some session notes to tide people over till the presentation’s posted: theagileadmin.com/2010/06/2...

Matthew Sacks
06/23/2010 2:30pm PDT

Rocking presentation. Some interesting statistics were presented, and I’d be interested to hear more about the technical reasons Facebook spins their own custom CentOS distro.

Suzanne Axtell
06/23/2010 2:20pm PDT

@Bjoern, we recorded this talk and will be posting it in a few days, so you’ll be able to watch it soon.

Bjoern Kaiser
06/23/2010 2:09pm PDT

too little room, so I could not attend the talk :-(
