The software engineering lifecycle: How we built the new Dropbox Plus

By Matthew Gerstman • Sep 15, 2020

A few weeks ago, we released a whole bunch of new features to Dropbox Plus, our paid plan for personal users. While we started as a storage company, we’ve grown to be a hub to manage your digital life. About 150 people worked on this launch: engineers, product managers, designers, copywriters, and many more.

Through a combination of luck and happenstance, I was fortunate enough to touch almost every part of this launch. I got to see how different teams work, as I joined each at different stages of the software engineering lifecycle.

In this post I’ll distill my experience working on different parts of the launch, and discuss how we think about building software. I’ll go into a few technical details, but mostly focus on how teams organize and operate.

Problem Scoping

Toward the end of 2019 we had a realization: while we had built incredible new products for our professional users like the new Dropbox and Dropbox Transfer, it had been a while since we delivered new value to our personal users. 

We set out to change that. We put together a team focused exclusively on helping personal users manage their digital lives. At the time we didn’t know how much those lives and needs would change in the year to come, but in hindsight our timing was great.

Our contract with our customers is simple: We build products that people like so much that millions of them pay for them. As a rule, we aspire to be worthy of trust in all situations. You are our customer, not our product.

Merging our mission and our values, we came up with a simple plan: ship a bunch of timely, useful features to our most loyal users. After a lot of discussion about what to build, we came up with the new Dropbox Plus.

We knew our users wanted computer backup. This was finally possible thanks to our brand new sync engine, which we had just finished rolling out in June.

We did weeks of customer research and learned that users wanted a handful of specific new features from us. They wanted a special folder where they could store their most important files; this became Dropbox Vault.

Our research also surfaced that most of our users still weren’t using a password manager. We knew this would be fundamental to their digital security, so we acquired Valt and integrated their product into the Dropbox product suite to fulfill our users’ needs. The irony wasn’t lost on us that we were building a secure folder called Vault while acquiring a company called Valt.

Finally, users expressed a desire to share their Dropbox plan with their family, without sharing their account. So we came up with the Dropbox Family plan.

Family

I started 2020 as a member of the Family team. Our mission was to build a new paid plan for users that let them share all of our new and existing features with family members under one subscription, as they do with other subscriptions at home. 

This team was brand new, formed while the product spec was still in progress. We hadn’t written any code yet. There were a lot of questions we had to answer first: Do users want one family folder or a shared quota? How many people can be in a family? What makes a family different from a workplace “team”?

Plan Before Coding

We also put a lot of time into architecture decisions before we began implementation, to design a product that would both fit today’s needs and accommodate future growth. When starting out on a new project it’s important to lay out a rough plan, even though you know those plans will shift later as you learn more and as theory meets practice.

What should our data model look like? How do we want to compose APIs? How do we integrate with our existing payments system? Planning ahead kept us from wasting developer cycles once we started coding.

Team Dynamics

Early on, we realized we shouldn’t build this product on the existing Dropbox for Teams infrastructure. Our existing teams model is built for professional use cases and involves lots of advanced sharing, permissions, and user management features that family users wouldn’t need. It would have slowed us down to retrofit this system for family plans, and in some cases would force abstractions that would be anathema to families. For example, a team admin can access any team member’s files—something many families wouldn’t want.

As we were hammering down the exact requirements, we started planning technical infrastructure for the project. This was in January while we were still working in physical offices. All of the engineers on the core Family team were in the New York office.

Each member of the engineering team brought a different set of expertise to the table. I was personally tasked with leading frontend architecture. My job was to figure out how to build all of the Family management and invite pages. Others focused on APIs, data model, or handling shared quota.

We sat together in a pod, constantly bouncing ideas off of one another. Because this was a greenfield project, rather than extending existing architecture, we were able to design the data model, API, and user interface each with the others in mind. The team was often at each other’s desks, discussing how we might model the problems we were solving. With all-day proximity, everyone had enough context on everything.

Speeding Up Development

I’m especially proud that we decided to build all of our pages as independent applications. For example, we knew the Family management page was likely to end up as part of Account Settings, but we didn’t want to be encumbered by that development process. Building directly on the already established Account Settings page would have slowed us down.

Here’s why: To test on the Account page we’d have to create a Plus user, navigate to the Account page, and invite members to the Family before we could start testing the existing core features on the page. This meant a 2-3 minute edit/refresh cycle, totally unacceptable in our rapid iteration phase. We needed to be able to test things in seconds, as we were still figuring out the exact experience we would be offering.

Instead, we spun up a developer sandbox with a set of test fixtures. We embedded the root Family management component at the top. This allowed us to test the Family settings page in isolation and only worry about the parts actively in development.

As part of this we invested in a new internal technology called API-QL. We use an in-house interface description language to build REST endpoints for our APIs. This provides convenient features like client/server type validation, but doesn’t provide caching, polling, or React hooks. Apollo, a popular GraphQL client, provides all of these things and would allow us to experiment with GraphQL without changing our serving stack.

API-QL is a layer between Apollo and these REST endpoints. It is built on top of Apollo’s local resolvers and implements a lightweight GraphQL server in the client. API-QL allowed us to build our REST endpoints, but still leverage the caching and other developer experience wins provided by Apollo. Armed with API-QL and a culture of collaboration, we designed all of our APIs with API-QL in mind.
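To make the idea concrete, here is a minimal sketch of a client-side resolver that answers a GraphQL-style field by calling a REST endpoint. It does not use Apollo and does not reproduce API-QL’s actual interface. The names `RestClient`, `familyMembers`, and `withCache`, the `/family/members` path, and the response shape are all hypothetical. The small cache only stands in for the caching a GraphQL client would provide.

```typescript
// Hypothetical sketch: resolve a GraphQL-style field by calling a REST
// endpoint on the client. None of these names come from API-QL itself.

// The REST client is injected so the resolver stays easy to test.
type RestClient = (path: string, body?: unknown) => Promise<unknown>;

interface FamilyMember {
  id: string;
  email: string;
}

// A resolver receives the field arguments plus a context holding the REST client.
type Resolver<Args, Result> = (
  args: Args,
  ctx: { rest: RestClient }
) => Promise<Result>;

// Assumed endpoint path and response shape, for illustration only.
const familyMembers: Resolver<{ familyId: string }, FamilyMember[]> = async (
  args,
  ctx
) => {
  const raw = (await ctx.rest("/family/members", {
    family_id: args.familyId,
  })) as { members: FamilyMember[] };
  return raw.members;
};

// A tiny cache keyed by field name and arguments. A real GraphQL client
// would use a normalized cache; this only shows where caching slots in.
function withCache<Args, Result>(
  field: string,
  resolver: Resolver<Args, Result>
): Resolver<Args, Result> {
  const cache = new Map<string, Promise<Result>>();
  return (args, ctx) => {
    const key = `${field}:${JSON.stringify(args)}`;
    let hit = cache.get(key);
    if (hit === undefined) {
      hit = resolver(args, ctx);
      cache.set(key, hit);
    }
    return hit;
  };
}

const cachedFamilyMembers = withCache("familyMembers", familyMembers);
```

In this sketch the component code only ever sees the resolver, so the REST endpoint behind it could change without touching the UI.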

Milestones

For each new feature we set up a series of milestones for the project. In the case of Family plans, we were focused on the following: internal alpha, external alpha, beta, and finally GA (general availability). I was on the team through internal alpha so we’ll cover that here. Appendix A contains a list of all of the remaining milestones and their goals.

Internal Alpha: Dogfooding

The goal of the internal alpha was to ship something to other Dropboxers so we could dogfood the feature. We strive to do this for every feature, so we can test the basic functionality and make sure all the plumbing works. This helps us build confidence in product quality before shipping to any external users.

We were determined to be constantly shipping value to users, whether those users be Dropbox employees, friends and family, or curious beta testers. To reach that first internal alpha we constantly cut scope. Our goal was to ship a minimal viable product for Family plans as quickly as possible for our internal alpha, without including anything that might cause them trouble. It needed to work end to end. It didn’t need to be polished.

So any time a modal appeared that said “are you sure you want to do this?” we rescheduled that specific functionality for a later milestone. 

We also held off on nice-to-have features like contact suggestion or email reminders. These features would be necessary for a finished product, but would slow us down and weren’t unique to the new Family plan, which is what we wanted our alpha users to test.

Onboarding a new teammate

Midway through the quarter we had a very experienced frontend engineer join Dropbox and the Family team. As frontend lead, it was my responsibility to come up with a well scoped project that would help them learn our stack.

Our milestone-based planning was really helpful here. We had already decided that we needed contact suggestion for the final product, but had not made it part of the internal alpha since we could get away with a simple text field. This made for a perfect starter project: It touched every layer of our stack, but wasn’t necessary until a later milestone. Our new hire could immerse themselves in our stack without feeling they were holding up the rest of us. 

As they ramped up, it became clear they could take over frontend for Family plans from me. This was timely, because another product outside of Family needed some attention.

Joining Vault

During the first week of March, my manager and I discussed in our weekly one-on-one that the Vault team needed someone to come in and set frontend technical direction. The team’s mission was to build a secure folder for users’ most important documents. The majority of their efforts had been focused on making that folder as secure as possible.

They had built a minimal viable product in under nine weeks. Now we needed to shape it into a stable foundation. We set a transition plan for me to hand off all of my work on Family the last week of March, while it was in alpha, and begin ramping up on Vault the first week of April.

Class Dismissed

Shortly after, 2020 hit. On March 13, Dropbox announced we’d all be working from home due to COVID-19. At the time we thought it would just be a few weeks; how naive we all were. Before I knew it, my time on the Family team was over and we were still at home. I had to figure out how to onboard onto a new team in the middle of a pandemic.

The very first thing I did was set up a 1:1 with each and every person with whom I’d be working. We had engineers working on locking/unlocking the folder, serving APIs, updating the desktop app, and working on the existing frontend. We also had a product manager and a designer. I knew it was important to establish these relationships early on if I were to become an effective member of the team.

We held these 1:1s over Zoom as soon as the company had sent us all to work from home, rather than waiting until April. It was a shift for everyone but it worked. Over time I figured out who I’d be working with the most. I set up weekly 1:1s with them to make sure we stayed in sync from afar.

After a few weeks on the team I wrapped up my own starter project—updating user onboarding—and began to focus on codebase quality. I began to tackle my mandate of setting technical direction on the frontend. Our existing frontend code was hard to test. It had a lot more spaghetti than we’d like because it had been built so quickly. We wanted to migrate to a stronger foundation, without slowing down team progress.

To accomplish this, I began a doc called Vault Frontend: Refactoring in the Right Direction. An abridged version of this doc is available at the bottom of this post in Appendix B.

Shipping Our Team Coding Standards

Defining a solid set of standards for the team’s code was one thing. But in order to ship these standards in a product, I needed to get buy-in from the rest of the team. I shared the first draft with the two other frontend engineers on the team. The three of us then jammed on it until we were all satisfied with its specs.

Besides making it clear what the team should do, we had to make it clear why. Of course, it’s more fun to write code following the latest best practice. But there are important business wins as well that the team should understand. Higher quality code is easier to read, easier to update, and easier to maintain. Our measurable goal was to reduce the number of bugs coming in on any given week. 

The three of us presented this doc to the team and everyone was on board with it. Over time, we were increasingly happy with the state of our codebase. 

However, we noticed that one of the decisions we’d made was slowing us down.

Nothing’s Perfect

At my recommendation, the team had begun using API-QL for API calls. While it boosted the Family team’s output, for the Vault team it turned out to be a mistake. The Family team operated in a greenfield codebase and built APIs with API-QL in mind; on Vault we were retrofitting API-QL onto existing APIs. The data models didn’t line up, which slowed down development on the Trusted Contact feature.

There was an important lesson here: Ideas that work flawlessly on one project might not transfer to another. API-QL is a fantastic technology and we’re continuing to invest in it, but it wasn’t the right fit for this project at this time.

Reflecting

Overall, our standards doc provided a solid foundation for our code going forward. While API-QL wasn’t the right fit, we were able to amend the doc and continue refactoring in the right direction. This reduced the number of incoming bugs, made our code more delightful to work with, and overall sped up developer velocity.

Most important, it put us all on the same page. Not only the doc, but the process of co-authoring it helped us define what good code looks like to us. This made code review faster and more consistent, since we all had a shared frame of reference.

Entrypoints

As I was wrapping up frontend technical direction for Vault, I was pulled into a related project: entrypoints for new features. It’s not enough that features like Vault, Passwords, and Computer Backup exist. Users need to be given ways to find them. We can’t expect them to go looking. We need to meet them where they are.

This might sound simple, but it was a large collaborative effort involving product, platform, and infra teams. Our job was to dive into all of those other teams’ codebases and integrate our new features with their existing source. We already had two engineers working on this full time, but one of them was imminently going on paternity leave.

I had joined this project to take over the frontend work. We had about two weeks of transition where all three of us worked together. Then it was just the two of us—me and Michael. I was in charge of frontend, he did backend. 

This job presented its own challenges. On Family we had started from scratch, but on Vault I joined a product in motion. Now, to present users with entrypoints we needed to make a ton of tiny changes all over the codebase. Yet while Family and Vault were teams of about 10 developers each, entrypoints was just Michael and me.

For two months we cranked out tickets, integrating our new features with Dropbox’s existing surfaces. We had to dive into everything from the web UI to the sync engine. Unlike on Family or Vault, where the focus was architecture or code quality, we were touching code we didn’t own, and our focus was flat-out “get it done.”

On one particularly exciting occasion we paired for half a day with an engineer on the sync team. We needed to show a warning when a user deleted one of their file system entrypoints. Neither Michael nor I was familiar with Rust, the language used for the sync engine, so we needed to bring in expertise for a day.

It sounds stressful, but the experience was a blast. Pair programming in a language you don’t know feels like having superpowers. I was also happy I could claim that I’d touched our new backup feature, even if I only added a few lines of code. We got this change into the client just under the wire, then waited with bated breath as it went out to users.

Launch

Before we knew it, our new and improved Dropbox Plus began going to market. This process involved rolling out the features to a subset of our users over time. We have an internal gating technology called Stormcrow that allows us to set a percent of the population we want to receive a feature. We turned on all of our new features to 1% of our users, then 10%, then 25%, then 50%, then 100%. 
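Stormcrow’s internals aren’t covered in this post, but the general technique behind a staged percentage rollout can be sketched. The example below is a generic illustration, not Stormcrow’s implementation. The function names, the feature name in the usage note, and the choice of an FNV-1a-style hash are all assumptions.

```typescript
// Hypothetical sketch of percentage-based feature gating. This is not
// Stormcrow's implementation; it only illustrates the general technique.

// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a (feature, user) pair to a stable bucket in [0, 100).
function bucketFor(feature: string, userId: string): number {
  return fnv1a(`${feature}:${userId}`) % 100;
}

// A user is in the rollout when their bucket falls below the current percentage.
function isEnabled(feature: string, userId: string, percent: number): boolean {
  return bucketFor(feature, userId) < percent;
}

// The stages described above: 1%, 10%, 25%, 50%, then everyone.
const rolloutStages = [1, 10, 25, 50, 100];
```

Because each user’s bucket is stable, raising the percentage only ever adds users: anyone enabled at 10% stays enabled at 25%. Calling `isEnabled("new-plus", userId, 0)` turns the feature off for everyone, which is what a rollback amounts to in this sketch.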

I wish I could say this went off without a hitch, but it didn’t.

We realized after sending out our announcement email that so many users were trying to sign up for Plus at once that our payments system couldn’t keep up. We rolled back the features, fixed the payments system, then turned the features back on.

After fixing this and tweaking our email cadence to keep traffic in mind, the features were live for everyone. Within just a few weeks, millions of users had tried out our new features.

Appendix A: Family Plan Milestones

For each new feature we set up a series of milestones for the project. In the case of Family plans, we were focused on the following: internal alpha, external alpha, beta, and finally GA (general availability).

Internal Alpha: Plumbing

Internal alpha was our first milestone. The goal here was to “test the plumbing.” We wanted Dropboxers to upgrade, invite family members, and test our surfaces. This milestone allowed us to test with real users in production, even if they were all Dropboxers and their families. More details above.

External Alpha: Early Feedback

External alpha was about gaining data from a larger, more diverse group of real users to inform and sharpen our decisions about the product. This meant going outside the circle of Dropboxers and their families. We wanted to see how often users upgraded, how many members they added to their families, and gather general feedback from the outside world.

Beta: Public QA

While external alpha is about collecting signals to help finalize product decisions, beta is about ensuring stability in that final product that we’ve defined and built. Our goal in beta is to remove as many unknown unknowns as possible and figure out what bugs real users are running into.

General Availability: Stable

Finally, GA is when we remove the beta tag and release a complete product to all of our users. At this point we’re confident in both the product decisions we’ve made and the quality of the software itself. 

Appendix B: Vault Frontend - Refactoring in the Right Direction

Guidelines for refactoring

As we work on Vault frontend, we should be refactoring our code in the right direction. All future diffs should aim to correct preexisting code that breaks these rules.

Don’t block progress
Our team’s mission is to ship a product for storing your most sensitive files. A maintainable codebase is the means to that end, not a goal in and of itself.

Try to keep it on a per file basis
It’s easy to rabbit hole when refactoring a file. Try to keep it to one refactored file (and its tests) per diff.

Maintain the interface in the first diff
First refactor the internals of the component, then change its props in a followup.

Write new tests
Many of our current tests are ineffective. Write new tests based on the guidelines below when you change a component.

Focus on the customer and shipping features first, as opposed to focusing on clean code for the sake of clean code.

Components

All UI components should live in the component library.
A good rule of thumb: if it imports from our design system, it belongs in the team-level component library.

Function components over class components
Hooks are now considered standard best practice. We should work to refactor class components into function components with hooks.

i18n belongs at the bottom of the file
Create a function called getStrings or a hook called useStrings to return strings needed for a component. Don’t inline i18n in components.

Declarative, not Imperative
Avoid functions like Modal.showInstance. Keep track of the open/close state for a modal.

Design System Styling
Avoid overriding styles within design system components.  Where possible, reach out to Design Systems to find supported configuration options.
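As an illustration of the “i18n belongs at the bottom of the file” guideline above, here is a minimal sketch. The `translate` function stands in for whatever i18n helper the codebase really uses, and the strings and the `describeInvite` function are invented for this example.

```typescript
// Hypothetical stand-in for a real i18n helper; assumed for illustration.
function translate(message: string): string {
  return message;
}

// Business logic reads its strings through one call instead of inlining i18n.
function describeInvite(inviteeName: string): string {
  const strings = getStrings();
  return `${strings.inviteTitle}: ${strings.inviteBody(inviteeName)}`;
}

// --- i18n lives at the bottom of the file ---
// Function declarations are hoisted, so the code above can call getStrings
// even though it is defined down here.
function getStrings() {
  return {
    inviteTitle: translate("Add a trusted contact"),
    inviteBody: (name: string) =>
      translate("{name} will be able to request access.").replace("{name}", name),
  };
}
```

A `useStrings` hook would follow the same shape, with the hook returning the same kind of object.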

Testing

All tests should be written with react-testing-library.
This library is geared towards functional testing rather than tests around DOM output.

Don’t use enzyme
Enzyme is semi-deprecated and has poor support for hooks.

Avoid Duplicating Test Coverage
If a component is effectively tested by the tests of its subcomponents, don’t reimplement those tests. For example, don’t test that a modal opens and closes when it uses the underlying DIG modal.

Never use a className for anything other than css
Use data-testid to get components. Do not use classNames or text inside a component.

State Management

Note: we ended up reverting these decisions regarding state management. See the section Nothing’s Perfect above for more details.

Avoid Redux For API State
API integration should be done through API-QL. A setup diff will be coming.

Consider removing Redux entirely
Put component or UI state in either context or a hook.

Logging

Put logs at the bottom of the file
There are just so. many. logging. statements. We put i18n and logging at the bottom of the file so we can reduce noise while reading business logic.

Instead of this


const onInviteModalSend = () => {
  logProductAnalyticsEvent({
    eventAction: AnalyticsEventAction.SELECT,
    eventObject: AnalyticsEventObject.SEND_CONTACT,
    actionSurface: AnalyticsEventActionSurface.VAULT_SETTINGS,
  });
  closeModal();
};

Do this


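const onInviteModalSend = () => {
  // Business logic stays readable: one short logging call, then the action.
  loggingFuncs.logInviteModalSend();
  closeModal();
};

// Logging lives at the bottom of the file. Exported so tests can spy on it.
export const loggingFuncs = {
  logInviteModalSend: () => {
    logProductAnalyticsEvent({
      eventAction: AnalyticsEventAction.SELECT,
      eventObject: AnalyticsEventObject.SEND_CONTACT,
      actionSurface: AnalyticsEventActionSurface.VAULT_SETTINGS,
    });
  },
};

In Tests


import { loggingFuncs } from 'component-file';
// Spy on the same function name the component calls.
spyOn(loggingFuncs, 'logInviteModalSend');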

Code Isolation - Future proofing for code sharing

Avoid importing from elsewhere in the server repo.

  • If you need to import something from outside of modules/clean/react/vault or the vault component library look into dependency injection.
  • We will likely handle these on a case by case basis.
  • We will likely use React context to handle these.
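The dependency injection idea in the bullets above can be sketched without any framework. In the sketch below, `VaultDependencies`, `createInviteSender`, and the two injected functions are hypothetical names chosen for illustration. In a React codebase, a context provider would likely play the role of passing `deps` down.

```typescript
// Hypothetical: the things Vault code needs from the rest of the server repo.
interface VaultDependencies {
  getCurrentUserEmail: () => string;
  logEvent: (name: string) => void;
}

// Vault code depends only on the interface above, never on outside modules.
// The caller decides which concrete implementations to pass in.
function createInviteSender(deps: VaultDependencies) {
  return (inviteeEmail: string): string => {
    deps.logEvent("invite_sent");
    return `${deps.getCurrentUserEmail()} invited ${inviteeEmail}`;
  };
}
```

With this shape, the only place that imports from outside the Vault directory is the code that builds the `VaultDependencies` object, and tests can pass in simple fakes.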

Styling

Avoid advanced SCSS
Make your classnames easy to grep for.

Do This

.vault-link {
  // styles
}
.vault-link:hover {
}
.vault-link-bold {
}

Don’t do this
These classes are hard to map back to components from code.

.vault-link {
  // styles
  &-bold {
  }
  &:hover {
  }
}
