Designing and Building Data Science Solutions
Data science, machine learning and artificial intelligence (AI) can have game-changing impacts for businesses, empowering them to increase operational efficiency, improve the quality of their services and understand their customers better. Yet for these benefits to be realised, data science initiatives must be designed and executed in a sensible way. Often these projects, while successful from a scientific standpoint, miss the mark in terms of business impact. Many business leaders are left feeling unsettled, balancing the need for innovation and the adoption of revolutionary technologies with an uncomfortable degree of uncertainty and risk of failure. For the data scientist the situation can be equally unnerving, with uncertainties about how to deliver a successful project when the path is not clear.
Yet these uncertainties and risks – for the business leader and the data scientist alike – can be controlled and managed if approached in a sensible manner. Your authors have designed and delivered hundreds of projects across a wide range of industries. We have made many mistakes, and in the process we have learned what works well and where the common pitfalls lie. We wrote this book to share our experiences in the hope that it will help the reader – whether a data science practitioner or a business leader – reduce these risks and design projects that have the greatest chance of success.
Much of the content in this guide is derived from lessons we have given to our students. Here we have gathered, organised and expanded on those bits of advice to serve as a resource for anyone considering embarking on a data science journey. We share our approach to data science projects, addressing topics such as alignment to business imperatives, project design, project delivery and evaluation of success.
Data science can be an exciting, invigorating field, and for the business leader, it can bring about revolutionary changes to an organisation that can come with huge returns on investment and value added. For the data scientist, designing and delivering successful projects is rewarding, stimulating and tremendously gratifying. We hope this guide gives you the confidence to understand the risks and approach your project in a sensible way.
We have each worn many hats: data scientists, project managers, mentors and consultants. We met via the S2DS programme, where we worked together as project mentors.
Like many data scientists, we have both come to the field via routes in other disciplines. Jonathan Leslie obtained his PhD in Biology from the University of London, studying blood vessel formation at the Cancer Research UK London Research Institute. After 22 years in biomedical research, he turned to data science and founded a freelance consultancy business. In 2017 he joined Pivigo, where he is now the Director of Data Science. He is passionate about promoting open-source software and routinely volunteers as a mentor in the R-programming and data science communities.
Neri Van Otten holds an M.Sc. in Computer Science Engineering from Ghent University, where she wrote her thesis on optimising energy usage in electric vehicles by training online personalised deep neural networks. In her 8 years of professional data science experience, she has worked at start-ups, at a hedge fund and as an independent data science consultant helping a wide variety of companies. Technically, she focuses mainly on optimising and scaling machine learning systems to use data in real time, researching deep learning solutions and building natural language processing (NLP) applications. At the moment, she leads the data science efforts at SourceBreaker, runs her own deep learning start-up, Spot Intelligence, and mentors (aspiring) data scientists. She is passionate about diversity and education and regularly teaches machine learning.
We have designed and executed hundreds of projects using data science, machine learning and artificial intelligence (AI), most of which we would consider “successful” (you can read much, much more on the topic of “success” in Chapter 2). We have also made countless mistakes along the way and have done our best to learn from them. Designing projects, ensuring stakeholder buy-in and delivering successful, relevant outcomes is a difficult task, and it doesn’t come naturally to anyone. We hope that you can benefit from the mistakes we have made and the learnings we have acquired in our careers.
Work with us
Both authors work professionally with companies to help them succeed with data science.
Jonathan Leslie is the Director of Data Science at Pivigo. Pivigo is a data science marketplace and training provider, passionate about what the data revolution will bring, both to industry and to the public sector. They believe data science should be democratised and brought to all businesses, irrespective of size, age and ‘data maturity’. The core of Pivigo’s service offering is outsourced data science projects. There are two main services: participation in their data science bootcamp S2DS (Science to Data Science) and partnering on a bespoke project through the Pivigo marketplace.
Neri Van Otten provides data science consulting services to companies. She helps define road maps, de-risk data science projects, train and mentor data scientists in good data science and software engineering practices, and drive projects forward. Her experience deploying systems into stable production environments is invaluable for the long-term success of any impactful data science project.
We would like to thank several people who have contributed to this work. Without you, this would never have happened. We thank Andras Szabo and Kim Nilsson, who spent many hours carefully reading our drafts and providing invaluable feedback. We would also like to thank Ole Moeller Nilsson, Maryam Qurashi, Mandeep Soor, Jason Muller, Neil Forest and Deepak Mahtani for their many suggestions and useful comments. We are especially grateful to our families for their patience and support: Mariam Orme, Daryan Leslie, Zara Leslie (for Jon) and Thea Van Otten, Paul Van Otten and Hilde De Boeck (for Neri).
The content of this book is open for everyone to read for free, as we strongly believe in knowledge sharing. If, however, you prefer a different format or want to support the authors, the book is also available as a published book or in e-reader format.
We welcome corrections, suggestions for edits or recommendations for additional topics or chapters. If you would like to contribute, please refer to our contributing guidelines on the issues page.
Copyright © 2020 Jonathan Leslie and Neri Van Otten