The R&D Design Sprint: an innovation process for deep tech startups

A great tool for shortening the innovation cycle, doing end-to-end value-driven development, and allowing organizations to fail often and fail early


Four and a half years ago I started on a quest along with my co-founders to commercialize speech emotion recognition and behavioral tracking technologies. In January 2019, following our second round of funding, I had just transitioned from the CEO to the CTO role at Behavioral Signals, and our technical team was faced with the task of making our technology work not only in the lab but also “in the wild”, i.e., under adverse recording conditions, across diverse use cases and languages, and with limited computing resources. Our team of machine learning engineers consisted at the time of eight people, both junior and senior, all of them top-notch. The technology had reached a mature, state-of-the-art level, and the continuous data annotation/integration process was already in place.

The Problem

The major challenges we were facing at the time were making the technology fast, scalable, and robust to diverse data, as well as presenting the behavioral outputs of the oliverAPI to the customer in a way that maximized their value. In summary, we were in urgent need of innovation at both the technological and the value-creation level. Our ML engineers had been trained (and trained well) to work in small groups or in isolation on a well-defined research problem, identifying or creating homogeneous datasets to work with, and finally producing technological inventions that led to improved performance and well-cited publications. They were used to performing this work in two- to three-month cycles tied to conference or publication deadlines. They were now faced with a multitude of not fully defined problems and tons of very diverse data, while working in an agile team with biweekly sprints. And they were stuck and not terribly happy! We were in urgent need of redefining our innovation process.

The Solution

In the past, we had successfully run a couple of design sprints at Behavioral Signals, one of them leading to Furhatino, a runaway hit at Interspeech 2018. I was enamored with the design sprint concept. It was a great tool for shortening the innovation cycle, doing end-to-end value-driven development, and allowing organizations to fail often and fail early. So I sat down and tried to figure out a way to adapt the design sprint to an R&D setting. I wanted to create a synergistic, democratic innovation process that achieves the cross-pollination of ideas you get at a workshop and avoids every-person-for-themselves research heroics, while retaining the synergies of an agile team. I also wanted to create cross-functional teams of ML engineers and platform engineers, making sure that algorithmic ideas were tested for scalability and speed and that the platform engineers were involved in the algorithmic design from the get-go. Most importantly, I wanted to rapidly deep dive into one or two ideas, go all the way to a mock implementation, get some initial indication of the potential of the innovation, and decide whether it made sense to invest more resources in our regular biweekly sprints.

The R&D sprint

A typical design sprint consists of five phases of approximately equal duration: define/map/understand the problem; individual contributors brainstorm to find solutions (diverge); the team selects the most promising of these ideas and digs deeper (converge); the team designs prototypes that can be tested by people (prototype); and, finally, the target audience tests the prototype for usability (test). For more details, see the appendix. These five stages also apply to an R&D sprint. Typical problems that an ML R&D team faces are to improve performance, to speed up or improve the scalability of an algorithm, or to make the algorithm more robust to unseen data (the latter was also our main goal). So the five stages of the R&D sprint are: mapping the problem and setting performance metric targets (e.g., speed, accuracy); each individual contributor performing a literature search and brainstorming ideas that could reach those targets (diverge); the team getting together, voting on and selecting the two most promising directions, and forming two sub-teams that dig deeper into the literature and propose research innovations, which are presented to the full team and selected for prototyping (converge); quick prototyping of the ideas using as many off-the-shelf open-source tools as possible (prototype); and, finally, the presentation of the evaluation results and the decision-maker’s planning of how to incorporate the innovation into a regular development sprint cycle (test). An R&D sprint can involve 5 to 9 people, typically including a decision-maker (in our case the CTO), a facilitator (in our case the VP of engineering), at least two ML engineers (leaders of the two subgroups), and at least one software/platform engineer.

Our first R&D sprint

We ran our first R&D sprint over a period of two days during the last week of January 2019. Our team consisted of 8 people: 4 ML engineers (2 senior, 2 junior), 2 software/platform engineers, the VP of engineering, and the CTO. The goal was to improve the performance of our emotion AI platform by 20% for “seen” and 40% for “unseen” data (relative error rate reduction of emotion event detection) without increasing the computation cost by more than 10%.

After mapping the problem (1h), each team member spent ½ day brainstorming, and together we came up with the following tags corresponding to areas of innovation (diverge phase): transfer semi-supervised learning, solve data imbalance via data augmentation, improved multimodal fusion, better temporal modeling, multitasking, feature engineering & selection, active learning. During the converge phase, we voted and jointly decided to dig deeper into data augmentation and multimodal fusion. Two teams, each consisting of two ML engineers and one platform engineer, then volunteered to dig deeper into each area. The converge phase took ½ day, and its output (a set of PowerPoint slides) was presented to the team for comments. The first team presented a proposal for using adversarial networks for data augmentation, and the second team presented a new architecture for “deep fusion”. Both ideas were motivated by the literature, specifically works in computer vision. Both proposals were approved, and each team went home to prepare for implementation the next day.

The prototype phase lasted ¾ of a day of the sprint (in practice a bit longer, because both teams cheated and started implementing overnight). Initial results were presented in the late afternoon of the 2nd day. Team 1 was able to provide a mock implementation and some good first results (relative error rate improvement of 10%). Team 2 was not able to provide a full implementation of the proposed architecture, but an oracle experiment provided encouraging results. The platform engineers provided speed and scalability evaluations. Both proposals were picked for development. We skipped the regular sprint cycle planning for the two proposals and the exhausted team members went home.
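For readers less familiar with the metric, here is a minimal sketch of how a relative error rate reduction can be computed and checked against sprint targets like ours. The function names and the example error rates are hypothetical, chosen only to illustrate the arithmetic; they are not our production code or our actual numbers. An error rate that drops from 30% to 27%, for instance, is a 10% relative reduction.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Fraction of the baseline error rate removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


def meets_sprint_targets(
    rer_seen: float,
    rer_unseen: float,
    cost_increase: float,
    seen_target: float = 0.20,       # 20% relative reduction on "seen" data
    unseen_target: float = 0.40,     # 40% relative reduction on "unseen" data
    max_cost_increase: float = 0.10  # at most 10% extra computation
) -> bool:
    """Check all three sprint targets at once (all values are fractions)."""
    return (
        rer_seen >= seen_target
        and rer_unseen >= unseen_target
        and cost_increase <= max_cost_increase
    )


if __name__ == "__main__":
    # Hypothetical error rates, for illustration only.
    rer = relative_error_reduction(baseline_error=0.30, new_error=0.27)
    print(f"relative error rate reduction: {rer:.1%}")
    print("targets met:", meets_sprint_targets(rer, rer_unseen=0.45, cost_increase=0.05))
```

Tracking the computation cost next to the error metrics keeps the platform engineers’ speed and scalability evaluation part of the same go/no-go decision.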

Aftermath

Both innovations were integrated into the platform and led to two Interspeech 2019 papers (deep fusion, data augmentation) and two associated patent applications (1, 2).

Epilogue

In mid-2019, as we were rolling out the technology, we observed that although the team was now producing technological inventions at a high rate, the value-creation part was still lagging. The ML team was not very responsive to customer requests about how to consume the data, especially when such requests led to redesigning backend APIs or tuning algorithms. We realized that our original functional (horizontal) organization into ML, platform, and application teams was not allowing the ML engineers direct access to the customer. We planned and soon thereafter executed a reorg, creating use-case (vertical) teams. Despite initial pushback, especially from members of the ML team, innovation rates (measured as inventions per month integrated into the platform) improved significantly. We finally converged on an organizational model where ML engineers spent 70% of their time in the use-case/vertical teams and 30% with their ML peers, further improving innovation efficiency. But that’s a story for another post.

Appendix: The Design Sprint — Wikipedia description summarized

For convenience, I have summarized the Wikipedia entry for the design sprint below. Note that the five stages of the design sprint are often run in three days instead of five (mini design sprint).

Summary of Wikipedia entry: A design sprint is a time-constrained, five-phase process that uses design thinking with the aim of reducing the risk when bringing a new product, service, or feature to the market. The process aims to help teams clearly define goals, validate assumptions, and decide on a product roadmap before starting development. It seeks to address strategic issues using interdisciplinary expertise, rapid prototyping, and usability testing. It was started in 2010 by a handful of pioneers from different parts of the Google ecosystem (UX designers, engineers, researchers, and product managers). It is used as a testbed for launching a new product or a service, extending an existing experience to a new platform, improving the user experience, or adding new features to an existing product. The creators of the design sprint approach recommend preparation by picking the proper team, environment, materials, and tools, working with six key ‘ingredients’. The five phases are:

  1. Understand: Discover the business opportunity, the audience, the competition, the value proposition, and define metrics of success.
  2. Diverge: Explore, develop and iterate creative ways of solving the problem, regardless of feasibility.
  3. Converge: Identify ideas that fit the next product cycle and explore them in further detail through storyboarding.
  4. Prototype: Design and prepare prototype(s) that can be tested with people.
  5. Test: Conduct 1:1 usability testing with 5–6 people from the product’s primary target audience. Ask good questions.

The main deliverables after the design sprint are answers to vital questions; storyboards, user stories/journeys, and architectural diagrams; prototypes; usability reports; plans/next steps for full implementation; and hypotheses validated before committing further resources.

The suggested ideal number of people involved in the sprint is 4–7, and they include the facilitator, a designer, a decision-maker (often the CEO if the company is a startup), a product manager, an engineer, and someone from the company’s core business departments (Marketing, Content, Operations, etc.).

A succinct and more practical description of the five phases of the design sprint can also be found in the following source: https://www.thesprintbook.com/how

“On Monday, you make a map of the problem [map or understand phase]. On Tuesday, each individual sketches solutions [diverge phase]. On Wednesday, you decide which sketches are strongest [converge stage]. On Thursday, you build a realistic prototype [prototype]. And on Friday, you test that prototype with five target customers [test].”

CTO and co-founder of Behavioral Signals