When it comes to scoping, it is critical to start by defining exactly what problem needs to be solved. Now, if you've been following the recommendations in this book, it should be quite clear what problem you are solving. However, in many organizations we've witnessed projects being developed and approved without anyone explicitly stating what problem is being solved and what concrete objectives are to be achieved.
Without concrete objectives, we cannot define a concrete metric to be optimized. In that case, it is often left to the developers to choose a metric to optimize for. And after a lot of time has been spent in implementation, the team discovers that this is not what the organization wanted.
To avoid this, clearly state the problem. What will the final product look like? What will it do? How will it behave? What are its outputs? Exercise empathy. How will this product affect the members of our organization? Third parties? Create a mockup. This will allow you to anticipate possible side effects. Just as carefully as you specify what it should do, define what it should *not* do. What is a non-goal? What features can be left for later iterations? Think of the 80-20 rule.
Avoid following the hype. While it is important to research state-of-the-art technologies to choose the best possible solution for the problem at hand, be driven not by what's trendy but by what is the most cost-effective way to solve your problem. We've found that many problems can be solved with 20-year-old statistical techniques, or may just require a very simple machine learning layer on top of a robust data collection and processing platform.
Define milestones and evaluation points. You should have clearly defined 3- or 6-month milestones to evaluate whether the project is on the right track, and if not, kill it or pivot. This is only possible if the problem has been concretely defined and can be quantitatively evaluated. At the same time, if the project requires long-term research, make sure the milestones are abstract enough to provide leeway for experimentation (for example, instead of requiring a finished prototype, a valid research milestone could check whether a specific hypothesis has been verified or falsified).
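To make the idea of a quantitatively evaluable milestone concrete, here is a minimal sketch in Python. All of the names (`MetricMilestone`, `ResearchMilestone`, `decide`) and the example numbers are hypothetical, chosen only for illustration. The sketch assumes a product milestone is met when a single metric reaches its target, and a research milestone is met when its hypothesis has been resolved either way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricMilestone:
    """Hypothetical product milestone: a metric must reach a target by a given month."""
    name: str
    due_month: int
    metric: str
    target: float
    higher_is_better: bool = True

    def on_track(self, observed: float) -> bool:
        # Compare the observed value against the target in the right direction.
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target


@dataclass
class ResearchMilestone:
    """Hypothetical research milestone: a hypothesis must be resolved either way."""
    name: str
    due_month: int
    hypothesis: str

    def on_track(self, verified: Optional[bool]) -> bool:
        # None means the hypothesis is still open. True and False both count,
        # because a falsified hypothesis is also a valid result.
        return verified is not None


def decide(on_track: bool) -> str:
    """Map a milestone check to the decision described in the text."""
    return "continue" if on_track else "kill or pivot"


# Illustrative values only, not real project data.
recall = MetricMilestone("Beta model", due_month=6, metric="recall", target=0.90)
print(decide(recall.on_track(0.84)))   # prints "kill or pivot"

study = ResearchMilestone("Feasibility study", due_month=3,
                          hypothesis="Sensor data predicts failures a day ahead")
print(decide(study.on_track(False)))   # prints "continue"
```

Writing milestones down in this form forces the question of which metric and which target the organization actually cares about before implementation starts.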
Over-budget for compute. You will always need more computational resources than your engineers estimated. And even if not, you can always use those resources for further improvements of the initial prototype.
Recognize that the devil is in the details. Often, more time is spent adapting a finished prototype for production than on the research and development of the original prototype. Data in production might be different from the dataset we've collected. Or the production hardware may have more limited computational resources, so the models must be optimized to run correctly and with lower power consumption. It is crucial to budget enough resources for this deployment step, or risk external parties concluding that the technology does not hold the promise they expected.