In spite of our best intentions to make a faster, stronger web, things happen. Power outages. Failovers. Scaling problems. What separates you as a hidden hero is how well you anticipate, react, learn, and apply that to future planning. The Operations + Culture track at Velocity offers you tried-and-true case studies, tools, and experts that not only teach you how to scale like a beast and navigate the growing DevOps ecosystem, but also how to anticipate those future pitfalls and tackle them as a team. Check back for more developments in this program track.
Using a job queue to run tasks in the background is an easy way to get huge performance improvements in most web apps, but is also an easy way to create huge engineering headaches. This session looks at the different ways of running tasks offline, and how to avoid the operational, data consistency and user experience problems this can cause.
An overview of Google Compute Engine. GCE provides virtual machines optimized for large-scale data processing and analytics. We will dive into the core concepts, API, unique features and architectural best practices in the context of concrete examples.
In the very different environment of the 1990s, the speaker introduced the concepts of desired-state management and convergent repair, which formed the basis for tools like CFEngine, Puppet and Chef. Today, the industry has evolved considerably, and Mark Burgess is not standing still. In this talk, he asks: what are the next steps for IT infrastructure tech?
Adriaan shares his experience in helping big companies quickly deploy changes to production while minimizing the risk of performance problems. This talk will show engineers some small steps they can take to work towards a fast and stable website.
There was a time not long ago when Etsy was laden with barriers, silos, broken communication, and noncooperation. This talk will focus on the various stages of Etsy's cultural development from the early days to present. We will tell of how Etsy overcame numerous challenges and built a strong company culture while continuing to scale.
In this talk, we describe the “DevOps Cookbook Project,” where we catalog and codify the practices of “high performing DevOps organizations” that result in their extraordinary performance. Our goal is to create a prescriptive playbook that organizations can follow to replicate the extraordinary culture and outcomes so that IT Operations can operate at scale and win in the marketplace.
In this talk we'll dig deep into outage events in order to understand the hurdles we face when we find ourselves in tough times.
* Human behavior in disorienting situations
* Team troubleshooting pitfalls and successes
* Preparing for outage handling
* "Situational awareness" and "crew resource management" applied to web operations
* Coordination and decision-making under pressure
How do you support the kind of growth you hope for, without breaking the bank, and while sustaining a compelling experience? This talk focuses on the data management challenges game developers face and outlines the key criteria for selecting a data management model that will provide the scalability and performance needed to support your app from birth through massive growth.
You are a developer; you live and breathe agile development, automated testing and continuous integration. You are also on-call, tasked with keeping the site up and running. Rather than relying on trial and error, you are looking for pointers to fast-track your operations to a more mature level. In this session you will learn how to create enough structure in your operations to scale without going crazy.
Half code, half service, half data: Databases continue to be a challenge. Even in a "devops" environment, figuring out how to deal with issues around your database is critical to long-term success, and the old-school ways won't cut it. Join us to learn how we've worked with some of today's leading webops shops to make database systems something ops, developers, and DBAs can all embrace together.
As demands for performance increase and the complexity of Internet application architectures grows, operations teams are abandoned in a perfect storm. The problems are ever harder to diagnose and the need for rapid remediation is greater than ever. You need the right techniques and methodologies or you'll be ineffective or, worse, an obstacle. Learn them here.
London Multi Asset Exchange is a financial exchange focused on the retail customer. Ensuring low-millisecond latency and stability for customers connecting over the Internet is very challenging, and only possible with a careful approach to monitoring.
The Echo Nest powers most music discovery experiences you have on the internet today, from Spotify to MTV to MOG to iHeartRadio, and it's all built on a huge datastore of music data we've created using custom storage and indexing approaches on top of stock tools like Solr, Tokyo Tyrant and SQL. Find out how we can fingerprint a song in 50ms or recommend a playlist in 100ms.
Everyone should use a CDN as part of making their site perform. But how do CDNs work, and how do you use them effectively? Look behind the magic curtain and learn how a CDN works.
China is a huge emerging Internet market, but web optimization there presents unique challenges that require insight. This workshop will show the specific factors that undermine web performance in China, and the tools and methods to resolve them. From DNS to the GFW to tracking performance down to the key-city and regional level, this workshop will show how to avoid key mistakes in China.
Instagram is one of the fastest-growing mobile services in history. This talk will cover the twists and turns, architectural decisions and evolving culture that led us to successfully scale the site to match our growth.
Betfair's Martin Anderson (Site Architect) and Abraham Ingersoll (Principal Site Reliability Engineer) step away from the world's largest gambling exchange to offer a humorous example-by-example dissection of the age-old battle between developers and operations.
Right now the BBC is preparing their network for the 2012 Olympics.
This presentation will take place after the Olympics.
Come along and find out how we prepared and how well (or badly) we did. At the very least, you should see some pretty graphs.
A step-by-step presentation of how Box transitioned its 2-million-line web application codebase from a single bottlenecked MySQL database to a fully sharded, horizontally scalable database architecture. The focus will be on the incremental steps and best practices that enabled the successful execution of this fundamental change, all the while continuously serving 2 billion queries per day.
Working in Web Operations means dealing with production systems that in most cases need to be operational 24x7x365. To reach 99.999% uptime, you must fail as little as possible.
This talk will go through a few real-world incidents and failures experienced by our WebOps team, and outline what we are learning (the hard way), and how we're trying to improve.
What could possibly go wrong?
This is a story of trying to make good-quality metrics and monitoring a reality. We’ll talk about the products we settled on, how we set about reducing friction and removing the roadblocks we hit, the many mistakes we made along the way, and how hiring someone just for metrics was one of the best things The Guardian Ops team has done.
How Operations approaches their work is just as important as the work itself. It can either be a haphazard and constantly interrupt-driven pile of tickets, or a sane method that suits your team. This talk will go over the concerns and considerations for prioritizing, coordinating, and tracking work to be done.