By: David Waller
This article first appeared in the MIT Sloan Management Review Blog on April 11, 2019.
Today, most companies equate doing analysis with writing formulas in spreadsheets. But the business landscape has shifted seismically since the invention of the spreadsheet. Organizations must now think in terms of millions of individual customers, not just a handful of segments, and solve problems with reusable solutions to avoid reengineering the process from the ground up. And they want to benefit from the latest advances in machine learning and AI, not simply throw regressions at whatever analytical problem they face. In short, companies need to retrain for writing code, not formulas, as the future of work will entail thinking not just analytically but also algorithmically.
This change of perspective is significant. Most companies might see code as something confined to obscure corners of the IT department or as the exclusive province of a select group of data scientists. But organizations that manage to make code the natural language for diffusing analysis across their business can often grow and innovate faster than their peers.
Taking a code-centered approach will benefit organizations in three ways:
First, thinking in code allows companies to cleanly separate data from the analysis of that data. With the two separated, different teams can improve each one independently of the other, which leads to faster progress.
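To make the separation of data and analysis concrete, here is a minimal sketch in Python. The revenue figures and the function names (`average_growth`, `summarize`) are hypothetical and chosen only for illustration; in practice the data would come from a file or database maintained by another team.

```python
from statistics import mean

# Hypothetical data. This part can change (new regions, new months)
# without anyone touching the analysis below.
monthly_revenue = {
    "north": [120.0, 135.0, 150.0],
    "south": [80.0, 82.0, 79.0],
}


def average_growth(series):
    """Return the mean period-over-period growth rate of a numeric series."""
    changes = [(later - earlier) / earlier
               for earlier, later in zip(series, series[1:])]
    return mean(changes)


def summarize(data):
    """Apply the same analysis to every named series in the data."""
    return {name: average_growth(values) for name, values in data.items()}


report = summarize(monthly_revenue)
```

Because `summarize` receives the data as an argument, one team could refine the growth calculation while another extends the data set, and neither change would require editing the other's work. In a spreadsheet, by contrast, the formula and the cells it reads usually live in the same sheet.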
Second, code is much easier to share and reuse – the entire open-source software movement rests on this idea. Software developers have spent years building tools to make their work easy to trace, modify, and share. By adopting key principles of software development, such as version control, enterprise teams can be more efficient and collaborative as updates to files are tracked throughout their lifetime and changes can be reversed easily.
Finally, code is better for both simple and complex analysis. Breakthroughs in machine learning and AI techniques are implemented as code, and by cloning the code researchers are using, individuals can gain access to state-of-the-art techniques in analysis, quickly and for free.
So what must managers do to move their existing workforce along the spectrum from formula to code? In our experience, leading companies in this area take three practical steps.
Tear down the “Tower of Babel.” Communication is a prerequisite to collaboration, and language differences are among the strongest barriers to sharing ideas effectively. This is true not just for text exchanges and spoken conversations; it’s equally true for code. Having to mentally recast ideas in several programming languages is cognitively demanding and requires additional expertise.
What’s the solution?
Companies should aim to select at most two, but ideally one, analytical programming language as a company-wide standard — something everyone can “speak.” To be clear: No single choice is perfect for every situation, and reasonable people can disagree on the choice of standard, so teams should prepare for familiar change-management challenges. Companies can assuage naysayers and stay current by agreeing to revisit standards every couple of years.
A good first step for companies is to learn from what experts are doing. Seek out those highly regarded by peers and managers in your company’s core quantitative areas — for instance, in finance, marketing, or at the center of any product group whose product relies on analytics. One global financial services company took just this approach and learned that its top data scientists had settled on Python as a language, and that even junior data scientists were sharing Python code through Jupyter notebooks, a tool widely adopted in the scientific community for conducting and documenting reproducible research with code.
People who have spent years trying to hone their applied quantitative skills will inevitably be opinionated when it comes to the choice of tools and methods and would likely be delighted if their unofficial standards became official. These individuals will act as torchbearers and teachers in the organization, so raising their profiles and amplifying their impact is both sound business practice and a useful talent management strategy.
Create shared-code repositories. Once people transcribe ideas in a common language, companies should take a cue from open-source communities and establish their own shared-code repositories and knowledge bases. This makes it possible for people to share their coding work quickly and easily and to avoid constantly reinventing the wheel.
As with any central system, companies need to be thoughtful about security and permissions, and they should vary access credentials according to their own standards for confidentiality or intellectual property protection. But creating a rich space where ideas can benefit from a wide array of contributions is a powerful engine of progress, and companies can benefit enormously.
With shared-code repositories, multiple groups within an organization can use the same code files to solve similar problems. For instance, the marketing team in a bank might want to know about customers who are thinking about mortgage refinancing so it can target certain products at these customers, while the finance team might want data on possible refinancing as it projects budgets and billings. The problem formulation is the same in both cases (how many people, and which ones, are likely to refinance?), so why not use the same code to get to the answer?
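The refinancing example might look something like the following sketch, in which one shared function serves both teams. Everything here is an assumption made for illustration: the `Customer` fields, the sample records, and especially the rule that a customer is "likely to refinance" when the market rate sits at least one percentage point below their current rate. A real bank would substitute its own model.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    current_rate: float  # annual mortgage rate, in percent
    balance: float       # outstanding mortgage balance


def likely_refinancers(customers, market_rate, min_gap=1.0):
    """Return customers whose rate exceeds the market rate by min_gap or more.

    Hypothetical rule for illustration only, not an actual refinancing model.
    """
    return [c for c in customers if c.current_rate - market_rate >= min_gap]


# Hypothetical sample records.
customers = [
    Customer("A", 6.5, 200_000.0),
    Customer("B", 4.1, 150_000.0),
    Customer("C", 5.6, 300_000.0),
]

candidates = likely_refinancers(customers, market_rate=4.5)

# Marketing asks "which ones?" in order to target offers.
marketing_targets = [c.customer_id for c in candidates]

# Finance asks "how many, and how much?" in order to project budgets.
finance_count = len(candidates)
finance_balance = sum(c.balance for c in candidates)
```

Both teams call the same `likely_refinancers` function and differ only in what they do with the result, so an improvement to the shared rule benefits both at once.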
A good way to get going quickly is to pick a project, create a code repository around it, and invite contributions from a wide audience. Code-sharing platforms like GitHub and Bitbucket make this easy. It’s useful to start with broadly applicable and noncontroversial projects — such as time-series forecasting, generating customer segmentations, and calculating price elasticities, to name a few.
Some companies have gone beyond internal shared repositories and publicly shared their efforts. Leading technology companies like Google and Microsoft have been doing this for some time. But now, companies in other industries are also beginning to see the advantages in adopting this strategy. One telecom carrier, for example, has made its shared-code repositories part of the open-source community, which allows the company to avail itself of help from others even outside the company and potentially set the standard platform for the telecom industry.
Make code part of business as usual. Companies that want to generate the most value possible from advanced analytics face one final, and daunting, challenge: They must make code-based modeling the rule, not the exception. It must become business as usual, as unremarkable and reflexive as attaching a spreadsheet to an email. What makes this challenge formidable is that it requires not just a change in perspective but also a change in habits. But there are pragmatic strategies for accelerating this shift.
First, companies that truly view analytics as a strategic priority will go to great lengths to communicate clear and specific expectations at all levels. Senior executives broadcast company-wide messages emphasizing their belief in and renewed focus on analytical excellence; they explicitly connect it to their strategies in town hall-style meetings; and often, they signal their intentions to shareholders and the market as a whole by highlighting their efforts in everything from annual Securities and Exchange Commission filings to investor calls.
A second strategy for making this change happen quickly and smoothly is to protect and provide time for employees to get training. This strategy works because developing true technical skills requires focus, feedback, elapsed time, and repetition. Today, there is a vast array of options available to companies and individuals alike, ranging from boot camps to massive open online courses (MOOCs) to customized, onsite instruction. In our experience, any of these options can succeed as long as trainees have sustained blocks of time to learn without constantly toggling back to their day jobs. The learning cost of context switching is enormous. With focus, becoming a competent coder is not an insurmountable task, and managers shouldn’t assume their employees are not up to it.
A third and powerful tactic is setting up a viable support structure. People need to know whom to ask for help; the angst of learning can be considerably lowered when that help is timely and relevant. Progress stalls when the same handful of individual super-users are questioned repeatedly. They quickly become overwhelmed. But people who are just one step ahead on the journey can become mentors for others just starting out. Managers who communicate new expectations to their teams need to be prepared to be the first port of call for unblocking their teams — and few things provide a stronger incentive to learn something than knowing that you’ll have to teach it to others.
But there’s no need to fear. There are many guideposts in this new world. Popular answers, whether found through a search engine, a training resource, or a peer teacher, are almost always elegant and reusable. And sometimes, those answers will contain links to extensive open-source code repositories with solutions to all manner of related problems. The same is generally not true for spreadsheets, whose intermingling of data and analysis makes it difficult to abstract away just the reusable and improvable solution to your problem, especially when that solution requires more than just one step.