Two weeks ago, my colleague Marijn and I were at the Velocity conference in Santa Clara. We saw two days of keynotes and sessions presented by some of today's star companies and speakers. Besides attending the conference, we had a fantastic week with great experiences, including visits to the Google Campus, San Francisco and Stanford University.
In this post I’ll share our experiences during the conference and the findings that we brought back to Coolblue. The conference was packed with the greatest tech companies from all around the world. It is very inspirational to hear them sharing their knowledge. Each time specific tools were discussed, our fingers itched to try them out ourselves. In fact, that might be one of the biggest challenges in IT today: how do you pick the right tool for the right job? How do you pick one tool amongst the dozens of tools that are out there?
The main title of the conference was: “build resilient systems at scale”. I was mainly interested in topics related to containerisation, orchestration, DevOps (whatever that might mean), microservices architecture and building a solid, mature production platform for developers. Marijn, on the other hand, wanted to learn more about website performance and how to measure it, and about organisational optimisations. Together we got the most out of the conference and flew home with a lot of new insights to share within Coolblue.
So, what concrete findings did we take with us?
It’s all about the people. People, people, people. There were so many talks that emphasised this. It’s really easy to get lost in the tools and technology and forget why you are building something in the first place and, more importantly, for whom. If your fellow developers are the end-users of your product, don’t forget to include them in the process just like you would with any other (non-technical) customer.
You are never done improving things. Some of the hottest topics during the conference were containerisation, application orchestration and microservices. These are topics that Coolblue can definitely improve on. For example, there was a talk from Karl Isenberg on Container Orchestration Wars, which he ended with comparison tables of different container orchestration tools (see slides). Such talks are very valuable, as are talks where companies describe their paths to an improved architecture and the pitfalls they ran into along the way. We should definitely keep these in mind while traversing the same path, for example when moving more and more services to the cloud.
There are many, many options out there. Options for what? Options for everything. Whether it’s a microservice framework, an IaaS/PaaS/SaaS platform, or anything else. You are going to have to make decisions, which is hard, but what is even harder is to stay with those decisions. You will have to accept that there is no tool that is perfect for your situation.
We should break stuff more often. Everyone knows Chaos Monkey from Netflix. It randomly breaks machines and services during business hours to continuously test both the infrastructure and the team’s capability to fix whatever breaks.
As it turns out, more and more companies do this, though often in a more controlled manner, say every Friday at 11:00 AM (which is how PagerDuty does it). This is the best way to prepare for uncontrolled outages, and improve the knowledge of your teams at the same time.
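To make the idea of a controlled experiment concrete, here is a minimal sketch in Python. It is not how Netflix or PagerDuty implement it. The service names are hypothetical, and the one-hour window starting Friday at 11:00 is our own assumption. The sketch only picks a victim; actually stopping it is left out on purpose.

```python
import random
from datetime import datetime

# Hypothetical service names; a real setup would query an inventory or cloud API.
SERVICES = ["checkout", "search", "product-page"]


def in_game_window(now: datetime) -> bool:
    """True only on Fridays between 11:00 and 12:00 (an assumed one-hour window)."""
    return now.weekday() == 4 and now.hour == 11


def pick_victim(services, now, rng=random):
    """Return a randomly chosen service to break, or None outside the window."""
    if not services or not in_game_window(now):
        return None
    return rng.choice(services)


if __name__ == "__main__":
    victim = pick_victim(SERVICES, datetime.now())
    if victim:
        print(f"Would stop: {victim}")
    else:
        print("Outside the window, nothing to break.")
```

Keeping the window check separate from the random choice makes it easy to start small (one hour a week) and widen the window once teams are comfortable.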
One talk described how to do this for databases, where you destroy your database node and use the backup (which you have of course, right?) to restore it to the previous state. Another talk described how to prevent breakage by design: don't aim for the perfect system, aim for a resilient one. Assume stuff will break, and either let your system cope with the failure or build it in such a way that you can easily respond when it happens.
While practising such situations, don’t forget to document the processes you come up with while fixing things. This documentation is invaluable when things get real.
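As an illustration of "assume stuff will break", here is a small Python sketch of one common pattern: retry a call a few times and then fall back to a degraded answer instead of failing. This is our own example, not something shown in the talks. The function name `call_with_fallback` and its parameters are hypothetical.

```python
import time


def call_with_fallback(primary, fallback, retries=3, delay=0.1):
    """Try `primary` up to `retries` times, then return `fallback()` instead of failing."""
    for attempt in range(retries):
        try:
            return primary()
        except Exception:
            # Wait a little longer after each failed attempt before retrying.
            time.sleep(delay * (attempt + 1))
    return fallback()


if __name__ == "__main__":
    def recommendations():
        # Stand-in for a dependency that is currently down.
        raise ConnectionError("recommendation service unavailable")

    # The page still renders, just with an empty list of recommendations.
    print(call_with_fallback(recommendations, lambda: [], delay=0.0))
```

The point is that the caller decides up front what an acceptable degraded result looks like, so an outage in one dependency does not take the whole page down.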
Developers shouldn’t need to care about where their applications live. One of the most important jobs of operational teams these days is to build a solid platform on which developers can easily provision their applications. Ideally, all a developer needs to do is describe their application in a Dockerfile and submit it to the platform, and it will run. The platform takes care of logging, monitoring, load balancing, high availability and more. A developer doesn’t need to care whether the application runs in a private, public or hybrid cloud environment. Building such a platform is the real challenge.
Service workers and browsers. Where last year was all about what a service worker is, this year it is all about what you can actually do with it. Many presentations showed some nice implementations of service workers that improve and influence the loading time of your site. Patrick Meenan showed that browsers themselves are joining the game as well, by blocking content based on connection speed and loading time. Such technologies are, however, still under development.
All in all it was a great conference with some amazing talks and a lot of knowledge sharing. Now the real challenge starts: applying some of the new-found knowledge at Coolblue. We can’t wait.