When we scope out an AI development project, it is impossible to predict exactly how progress will play out. Unexpected issues may arise, such as model performance falling short, perhaps because our dataset is too small or because the model architecture doesn't generalize adequately. Or we might discover in testing that while we have 99% accuracy, the predictions fail catastrophically in the remaining 1% of cases.
Given this lack of predictability, how can we most effectively manage an AI development project? While there are no industry-standard best practices, we can share some observations gathered over years of experience, as well as through discussions with many industry players.
Define all stages in the project lifecycle from the outset. This includes scoping, data acquisition, modeling, and deployment. Doing so will allow you to budget enough time for the full project, all the way through to final implementation. We've seen many projects start with a quick POC that delivers promising results, only to get cancelled for lack of resources once the scale of the work required to make it robust enough for production becomes clear.
Budget much more than you think for data labelling. Quick prototypes can be developed with off-the-shelf datasets available on the internet or quickly generated by an engineer. For a robust deployment in the field that will deliver positive ROI, the model needs to be trained on orders of magnitude more data. You will probably need to collect additional data and label it manually.
Set up a testing pipeline from the outset. This is more of an engineering tip, but it should also be considered when planning. Machine learning models are brittle, and some bugs break things subtly: the model still trains, but its accuracy is now lower. A continuous testing pipeline, at the code level as well as the training level, will help catch such regressions quickly.
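One way to catch silent accuracy regressions is an assert-based test that retrains on a fixed dataset and checks accuracy against a known-good floor. The sketch below is illustrative: `train_model`, `evaluate`, and `ACCURACY_FLOOR` are hypothetical stand-ins for your real pipeline, and the toy majority-class "model" exists only to make the example self-contained.

```python
# Hypothetical regression test for a training pipeline.
# train_model/evaluate stand in for your real code.

def train_model(data):
    # Toy "model": always predicts the majority label seen in training.
    labels = [y for _, y in data]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def evaluate(model, data):
    correct = sum(1 for x, y in data if model(x) == y)
    return correct / len(data)

# Baseline from the last known-good run (assumed value).
ACCURACY_FLOOR = 0.70

def test_no_accuracy_regression():
    train = [(i, 1) for i in range(8)] + [(i, 0) for i in range(2)]
    holdout = [(i, 1) for i in range(7)] + [(i, 0) for i in range(3)]
    model = train_model(train)
    acc = evaluate(model, holdout)
    assert acc >= ACCURACY_FLOOR, f"accuracy {acc:.2f} fell below floor"

test_no_accuracy_regression()
```

Run in CI on every commit, a test like this turns "the model silently got worse" into a failing build.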
90% of the work happens after deployment. Remember that after deployment, your model is alive. It will probably need constant monitoring, data acquisition, relabelling and retraining.
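As a minimal sketch of what that monitoring can look like, the snippet below flags a deployed model for retraining when the distribution of live labels drifts too far from the training distribution. All names and the `tolerance` value are hypothetical; real systems typically monitor input features and prediction confidence too.

```python
# Sketch of post-deployment drift monitoring (names are illustrative).
from collections import Counter

def class_fractions(labels):
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}

def needs_retraining(train_labels, live_labels, tolerance=0.15):
    """Return True if any class fraction drifted by more than `tolerance`."""
    train_frac = class_fractions(train_labels)
    live_frac = class_fractions(live_labels)
    classes = set(train_frac) | set(live_frac)
    return any(
        abs(train_frac.get(c, 0.0) - live_frac.get(c, 0.0)) > tolerance
        for c in classes
    )

# Example: training data was balanced, but live traffic is now 90/10.
print(needs_retraining(["cat"] * 50 + ["dog"] * 50,
                       ["cat"] * 90 + ["dog"] * 10))  # → True
```

A check like this, run on a schedule against recent production traffic, gives you an early signal that the relabelling and retraining work mentioned above is due.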
Build up from a prototype. Instead of tackling the full problem from the outset, break it down into minimum viable objectives. Develop a model for the lowest-hanging fruit and build up from there. Not only will it be easier and faster to show results to management, it will also help you identify faulty assumptions early on and correct them.