
Spotify Engineering Culture Part 1

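Original English text. The video is too fast and too long and has no subtitles, so I transcribed it by ear with the help of YouTube. A Chinese translation will follow when I have time.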

 

One of the big success factors here at Spotify is our agile engineering culture. Culture tends to be invisible; we don't notice it because it's there all the time, kind of like the air we breathe.
But if everyone understands the culture, we're more likely to be able to keep it and even strengthen it as we grow. So that's the purpose of this video.

When our first music player was launched in 2008, we were pretty much a Scrum company. Scrum is a well-established agile development approach, and it gave us a nice team-based culture.
However, a few years later, we had grown into a bunch of teams and found that some of the standard Scrum practices were actually getting in the way. So we decided to make all this optional.

Rules are a good start, but then break them when needed.
We decided that:
agile matters more than Scrum,
and agile principles matter more than any specific practices.

We renamed the Scrum master role to agile coach, because we wanted servant leaders more than process masters. We also started using the term squad instead of Scrum team. And our key driving force became autonomy.

So what's an autonomous squad?
A squad is a small, cross-functional, self-organizing team, usually fewer than 8 people. They sit together and they have end-to-end responsibility for the stuff they build: design, commit, deploy, maintenance, operations, the whole thing.
Each squad has a long-term mission, such as "make Spotify the best place to discover music", or internal stuff like infrastructure for A/B testing. Autonomy basically means that the squad decides what to build, how to build it, and how to work together while doing it.
There are of course some boundaries to this, such as the squad mission, the overall product strategy for whatever area they're working on, and short-term goals that are renegotiated every quarter.

Our office is optimized for collaboration. Here's a typical squad area. The squad members work closely together here, with adjustable desks and easy access to each other's screens. They gather over here in the lounge for things like planning sessions and retrospectives. And back there is a huddle room for smaller meetings or just to get some quiet time. Almost all walls are whiteboards.

So why is autonomy so important? Well, because it's motivating, and motivated people build better stuff. Also, autonomy makes us fast by letting decisions happen locally in the squad instead of via a bunch of managers and committees and stuff. It helps us minimize handoffs and waiting, so we can scale without getting bogged down with dependencies and coordination.
Although each squad has its own mission, they need to be aligned with product strategy, company priorities, and other squads. Basically, be a good citizen in the Spotify ecosystem.
Spotify's overall mission is more important than any individual squad. So the key principle is really: be autonomous, but don't suboptimize. It's kind of like a jazz band: although each musician is autonomous and plays their own instrument, they listen to each other and focus on the whole song together. That's how great music is created. So our goal is loosely coupled but tightly aligned squads.
We're not all there yet, but we experiment a lot with different ways of getting closer.
In fact, that applies to most things in this video. This culture description is really a mix of what we are today and what we're trying to become in the future.

Alignment and autonomy may seem like different ends of a scale, as in more autonomy equals less alignment. However, we think of it more like two different dimensions. Down here is low alignment and low autonomy: a micromanagement culture, no high-level purpose, just shut up and follow orders. Up here is high alignment but still low autonomy, so leaders are good at communicating what problem needs to be solved, but they're also telling people how to solve it. High alignment and high autonomy means leaders focus on what problem to solve, but let the teams figure out how to solve it. What about down here? Low alignment and high autonomy means teams do whatever they want and basically all run in different directions. Leaders are helpless and our product becomes a Frankenstein.
We're trying hard to be up here, at aligned autonomy, and we keep experimenting with different ways of doing that. So alignment enables autonomy: the stronger alignment we have, the more autonomy we can afford to grant. That means the leader's job is to communicate what problem needs to be solved and why, and the squads collaborate with each other to find the best solution.

One consequence of autonomy is that we have very little standardization. When people ask things like "which code editor do you use?" or "how do you plan?", the answer is mostly "depends on which squad". Some do Scrum sprints, others do Kanban; some estimate stories and measure velocity, others don't. It's really up to each squad.
Instead of formal standards, we have a strong culture of cross-pollination. When enough squads use a specific practice or tool, such as Git, that becomes the path of least resistance, and other squads tend to pick the same tool. Squads start supporting that tool and helping each other, and it becomes like a de facto standard. This informal approach gives us a healthy balance between consistency and flexibility.

Our architecture is based on over a hundred separate systems, coded and deployed independently. There's plenty of interaction, but each system focuses on one specific need, such as playlist management, search, or monitoring. We try to keep them small and decoupled, with clear interfaces and protocols.
Technically, each system is owned by one squad; in fact, most squads own several. But we have an internal open source model, and our culture is more about sharing than owning. Suppose squad 1 here needs something done in system B, and squad 2 knows that code best. They'll typically ask squad 2 to do it. However, if squad 2 doesn't have time or has other priorities, then squad 1 doesn't necessarily need to wait. We hate waiting. Instead, they're welcome to go ahead and edit the code themselves, and then ask squad 2 to review the changes. So anyone can edit any code, but we have a culture of peer code review. This improves quality and, more importantly, spreads knowledge.

Over time we've evolved design guidelines, code standards, and other things to reduce engineering friction, but only when badly needed.
So on a scale from authoritative to liberal, we're definitely more on the liberal side.

Now, none of this would work if it wasn't for the people. We have a really strong culture of mutual respect. I keep hearing comments like "my colleagues are awesome". People often give credit to each other for great work and seldom take credit for themselves. Considering how much talent we have, there is surprisingly little ego.

One big "aha" for new hires is that autonomy is kind of scary at first. You and your squad mates are expected to find your own solution. No one will tell you what to do. But it turns out that if you ask for help, you get lots of it, and fast. There's genuine respect for the fact that we're all in this boat together and need to help each other succeed.

We focus a lot on motivation. Here is an example, an actual email from the head of people operations:
Hi everyone,
Our employee satisfaction survey says 91% enjoy working here and 4% don't. (Now, that may seem like a pretty high satisfaction rate, especially considering our growth pain: from 2006 to 2013 we've doubled every year and now have over 1,200 people. But then he continues.)
This is of course not satisfactory and we want to fix it.
If you're one of those unhappy 4%, please contact us. We're here for your sake, and nothing else.

So good enough isn't good enough. Half a year later, things had improved, and the satisfaction rate was up to 94%.

With this strong focus on motivation, it's no coincidence that we have an awesome reputation as a workplace. Nevertheless, we do have plenty of problems to deal with, so we need to keep improving.


OK, so we have over 50 squads spread across 4 cities. Some kind of structure is needed. Currently, squads are grouped into tribes. A tribe is a lightweight matrix. Each person is a member of a squad as well as a chapter. The squad is the primary dimension, focusing on product delivery and quality, while the chapter is a competency area, such as quality assistance, agile coaching, or web development. As a squad member, my chapter lead is my formal line manager, a servant leader focusing on coaching and mentoring me as an engineer, so I can switch squads without getting a new manager.

It's a pretty picture, huh? Except that it's not really true. In reality, the lines aren't nice and straight, and things keep changing. Here's a real-life example from one moment in time, for one tribe, and of course it's all different by now. And that's OK.
The most valuable communication happens in informal and unpredictable ways. To support this, we also have guilds. A guild is a lightweight community of interest where people across the whole company gather and share knowledge within a specific area, for example leadership, web development, or continuous delivery. Anyone can join or leave a guild at any time. Guilds typically have a mailing list, biannual unconferences, and other informal communication methods. Most organizational charts are an illusion, so our main focus is community rather than hierarchical structure.
We've found that a strong enough community can get away with an informal, volatile structure. If you always need to know exactly who is making decisions, you are in the wrong place.
One thing that matters a lot for autonomy is how easily we can get our stuff into production. If releasing is hard, we'll be tempted to release seldom to avoid the pain. That means each release is bigger and therefore even harder. It's a vicious cycle. But if releasing is easy, we can release often. That means each release is smaller and therefore easier. To stay in this loop and avoid that one, we encourage small, frequent releases and invest heavily in test automation and continuous delivery infrastructure. Release should be routine, not drama.
Sometimes we make big investments to make releasing easier. For example, the original Spotify desktop client was a single monolithic application. In the early days, with just a handful of developers, that was fine. But as we grew, this became a huge problem. Dozens of squads had to synchronize with each other for each release, and it could take months to get a stable version.

Instead of creating lots of process and rules and stuff to manage this, we changed the architecture to enable decoupled releases. Using Chromium Embedded Framework, the client is now basically a web browser in disguise. Each section is like a frame on a website, and squads can release their own stuff directly. As part of this architectural change, we started seeing each client platform as a client app, and evolved three different flavors of squads: client app squads, feature squads, and infrastructure squads. A feature squad focuses on one feature area, such as search. This squad will build, ship, and maintain search-related features on all platforms. A client app squad focuses on making release easy on one specific client platform, such as desktop, iOS, or Android. Infrastructure squads focus on making other squads more effective. They provide tools and routines for things like continuous delivery, A/B testing, monitoring, and operations.
Regardless of the current structure, we always strive for a self-service model, kind of like a buffet. The restaurant staff don't serve you directly; they enable you to serve yourself. So we avoid handoffs like the plague. For example, an operations squad or client app squad does not put code into production for people. Instead, their job is to make it easy for feature squads to put their own code into production. Despite the self-service model, we sometimes need a bit of sync between squads when doing releases.

We manage this using release trains and feature toggles. Each client app has a release train that departs on a regular schedule, typically every week or every 3 weeks depending on which client. Just like in the physical world, if trains depart frequently and reliably, you don't need much up-front planning. Just show up and take the next train.
Suppose these 3 squads are building stuff, and when the next release train arrives, features A, B, and C are done, while D is still in progress. The release train will include all 4 features, but the unfinished one is hidden using a feature toggle. It may sound weird to release unfinished features and hide them, but it's nice because it exposes integration problems early and minimizes the need for code branches. Unmerged code hides problems and is a form of technical debt. Feature toggles let us dynamically show and hide stuff in test as well as production.
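The release train example above can be sketched in a few lines of Python. This is only an illustration of the idea, not Spotify's actual tooling: the `Feature` and `ReleaseTrain` classes and their method names are hypothetical, and a real toggle system would read its state from a remote configuration service rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Feature:
    name: str
    done: bool  # True once the owning squad considers the feature finished


@dataclass
class ReleaseTrain:
    """Hypothetical release that ships every merged feature, finished or not."""
    features: List[Feature]
    # Toggle state: feature name -> visible to users?
    toggles: Dict[str, bool] = field(default_factory=dict)

    def depart(self) -> None:
        # All code ships on the train; unfinished features start hidden.
        for feature in self.features:
            self.toggles[feature.name] = feature.done

    def set_toggle(self, name: str, enabled: bool) -> None:
        # Toggles are flipped at runtime, without waiting for a new release.
        if name not in self.toggles:
            raise KeyError(name)
        self.toggles[name] = enabled

    def visible_features(self) -> List[str]:
        # Only features whose toggle is on are shown to users.
        return [f.name for f in self.features if self.toggles.get(f.name, False)]


# Features A, B and C are done; D is still in progress when the train departs.
train = ReleaseTrain([
    Feature("A", done=True),
    Feature("B", done=True),
    Feature("C", done=True),
    Feature("D", done=False),
])
train.depart()

shipped = [f.name for f in train.features]  # all four are in the release
visible = train.visible_features()          # D is shipped but hidden
```

Once D is finished, calling `train.set_toggle("D", True)` would make it visible without a new release, which is the property that lets unfinished code be merged and integrated early.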
In addition to hiding unfinished work, we use this to A/B test and gradually roll out finished features. All in all, our release process is better than it used to be, but we still see plenty of improvement areas, so we'll keep experimenting.
This may seem like a scary model, letting each squad put their own stuff into production without any form of centralized control, and we do screw up sometimes. But we've learned that trust is more important than control. Why would we hire someone who we don't trust? Agile at scale requires trust at scale, and that means no politics. It also means no fear. Fear doesn't just kill trust, it kills innovation, because if failure gets punished, people won't dare try new things.

So let's talk about failure. Actually, no. Let's take a break. Get on your feet, get some coffee, let this stuff sink in for a bit, and then come back when you're ready for part two, OK?

to be continued...