Engineering a disruption-tolerant supply chain

// By Bharat Mediratta • Apr 28, 2020

On January 23, the Chinese government ordered the lockdown of Wuhan and other cities in Hubei Province in an attempt to contain the COVID-19 outbreak. For many outside of China this still seemed like a regional problem, but it turned out to be the start of the global health crisis we are still living through.

It is also an economic crisis of unprecedented velocity. The location of the outbreak in China was significant for tech companies because most of the world’s electronic components are manufactured in China. And even as our own supply chain for server hardware at Dropbox has diversified in recent years, we and many of our suppliers still rely on China for materials and components.

Just two days after the start of the Wuhan lockdown, Ali Zafar, the leader of our Infrastructure Strategy and Operations team, told Andrew Fong, our VP of Infrastructure, “This will not be an ordinary year.” Just how extraordinary, no one knew, but Ali wasn’t waiting to find out.

I’d only been the CTO of Dropbox for a couple of months at that point, and a number of the teams I manage were about to go into overdrive. Our ITS team scrambled to turn our business continuity plan into an effective remote support system for thousands of distributed workers just six weeks later. Our data center team proactively swapped out 30,000 components in eight weeks so we could safely reduce on-site staffing for an extended period.

By the time he flagged Andrew on the 25th, Ali had been in touch with contacts throughout Asia and the supply chain world. He understood what could happen if factory workers were not able to return from their family homes after the Chinese New Year holiday. Wuhan itself has become a central logistical hub in the server component industry, so he knew there were many possible failure modes that could disrupt hardware shipments to our data centers as soon as Q2.

What’s at stake

The job of supply chain is to keep ahead of capacity, and that job was on the line. If server racks were going to be late, Dropbox would only be months away from having to make software accommodations that would directly affect product performance and our customers’ experience. Andrew asked Ali how he could help, offering to call vendors personally. “I said, no, let me put together a tiger team of four people, and let us manage it,” says Ali. “I may need your support later on.” To manage a crisis you have to take ownership. And in supply chain what you have to own are the relationships with your vendors.

“It’s the human element and the relationship piece that really sets us apart in the supply chain,” says Ali. “We've had almost daily calls with Asia over the last two months to find out exactly what we need to know about factory shutdowns, global component shortages, and logistics challenges. This situation really requires us to be on top of things.”

Typically we work directly with two to three rack vendors and six or seven strategic component suppliers, and we expect them to manage their sub-tier suppliers. But as we saw the impact of COVID-19 grow, Ali’s team realized they had to go much deeper in the supply chain. “Now,” says Ali, “we need to know exactly what’s going on. So we have to keep going one level deeper until we can connect all the dots.” This meant expanding our reach into tier-two and tier-three suppliers (the suppliers of our suppliers) to compile all the information we needed to assess the realistic impact on delivery dates.

How to own it

This was a monumental task and there were a lot of dots to connect. Ali knew that success was only possible if ownership and trust rippled through the organization directly to the suppliers. “I formed a team of four,” he says, “and I told each one of them, ‘you are accountable to make sure this happens. You are on this until it is finished.’” But Ali has also made sure to be in every meeting with them. “You have to be there. It takes a toll, but I want my people to see that Ali is fighting the fire with them.”

The members of this team, which includes Jenny Hu, Jesse Lee, Refugio Fernandez, and Vishal Patel, all have one thing in common: they live for this kind of thing. Some people are crisis people; they thrive during challenging times. Ali admits he’s one too, and has filled his team with just enough of them to cover a situation like this. He’s also balanced the team with colleagues who excel at sustaining operations and keeping the ship on course. “It's because they all feel accountable to themselves to deliver the final product,” explains Ali, “it causes them to act like Dropbox is their own company.” He ties this directly to the size of the team. “If we had a hundred people with each one working on small parts of the system, I don't think we would have been able to move so fast.”

They moved fast and tirelessly, working very long days to juggle the time zones between all the daily calls. And then on March 5th, we decided to send all but essential infrastructure staff to work from home. The sudden shift to an almost completely distributed workforce has definitely been a challenge, but thanks to the underlying strength of our culture, people have really risen to the occasion. Sheltering in place has been a particular strain for our working parents who’ve had to add child care and homeschooling to their busy workdays. Jenny Hu has been managing our rack supply with vendors in China, Taiwan, Mexico, and the US, each of which has had disruptions. She also has two young children at home to work into her routines.

The supply chain team has Zoom meetings many times a day to stay updated and make adjustments to plans as new information becomes available. Jenny’s four-year-old son Nathan has become a fixture at these meetings and helped the team keep their work human. He tells people to turn their cameras on so he can see their faces, and also asks them to “work harder, so my mom can play with me.” Little does he know how hard they are already working, catching catnaps between time zones, and balancing personal responsibilities with global uncertainties.

Delivering relationships

The strength of our vendor relationships starts with the trust we build within Dropbox. Ali relishes his autonomy, but he also knows that Andrew has his back if he needs to escalate the situation. This trust extends from Ali to his team to our suppliers in a web of ownership and candor. Through these relationships, the team has succeeded against long odds at keeping our server deliveries on track and on plan for the year. 

We have four different types of servers in our data centers: storage, database, Hadoop, and compute. Each requires different components, but crucially, at any time each has a different priority in terms of need-by dates. So not only has our supply chain team gone deeper by managing availability of individual server components and materials, they’ve also shared fine-grained detail with vendors about our actual needs. This kind of candor and transparency is highly unusual in the hyper-competitive world of vendor relations, but it’s second nature to our engineers.

“The vendors have actually told me that we are a model customer for them,” says Ali. “Because if we have flexibility to a certain extent, I share that flexibility and transparency with them, so that, in turn, they're also very candid with me.” This honesty is based on mutual respect and creates mutual interest. “I don't want only Dropbox to be successful and my partners to fail,” says Ali. “That's not how we operate.” Cooperation is a competitive advantage when supply is uncertain.

Since 2016, when Dropbox moved from the public cloud into our own, custom-built Magic Pocket infrastructure, we have been innovating on our own hardware. Our engineers have led the industry by putting new technologies into production. We were the first large cloud service provider to adopt SMR drive technology for our storage servers, which has led to substantial cost and energy savings. Last year we optimized our replication scheme for cold storage, and this year, we’re deploying our next generation of compute servers (more on that soon). All of this innovation has earned us recognition and respect in the supplier community as well.

Our investments in building a truly global supply chain are really helping us now. We always try to multi-source components, but we also push to make sure our reach extends not only throughout Asia, but also into Europe and the Americas. This diversity is giving us resilience against disruption as well as access to new partners, new products, and new ideas we can use to continue innovating our infrastructure. As Ali says, “this is what we live for.”
