2 What success looks like
2.1 The four levels of project evaluation
When approaching a data science project, we often look to the end: knowing what a successful outcome would look like is a good way to determine the direction of a project and the steps required to get there. The idea is that if you meet those criteria – if you deliver something that resembles this desired outcome – you have succeeded. However, while faithful production of the agreed-upon deliverables is indeed important, this is often only part of a much larger picture.
In early conversations with a stakeholder, you may have agreed upon a certain set of outcomes and deliverables for the project based on the circumstances and understanding at the time. However, circumstances can change. Similarly, simply because you have achieved a predetermined goal for a project does not necessarily mean that this goal is the best outcome for the business. Part of our job as data scientists, especially if tasked with designing a project, is to understand what the business needs and to help stakeholders see what would be most beneficial. Thus, simply focusing on deliverables falls short when assessing how successful a project has been. Moreover, in a data science project we often make unexpected discoveries. If we view these solely through the lens of agreed project objectives, they could be considered mistakes or shortcomings; however, these unexpected outcomes are often the most valuable to the business.
Eskander Howsawi and colleagues have proposed a framework for evaluating project success that has four levels, termed “context”, “business”, “product” and “project process” (Howsawi et al. (2014); Figure 2.1). In our approach to data science project design, we have found striking parallels to the Howsawi framework. (For a more detailed explanation of these levels and this framework, we recommend reading the paper.)
In the sections below we discuss each of these levels in detail. If you are involved with defining and scoping a project, we strongly recommend you think carefully about how your proposed work satisfies each of these levels. As a consultant, this will be an important part of your role and an important step in developing a project that has real value to its stakeholders. Even if your specific role does not involve project design, we recommend you take the time to consider these levels at both the beginning and the end of a project: if nothing else, it’s a good thought exercise that will be important if you are ever involved in project design.
2.2 Contextual level
The first and highest level of the project evaluation framework is termed contextual and is the most abstract. This relates to the circumstances surrounding a project and the externalities that affect it. While considering this may seem beyond the remit of the data scientist’s role, this level is arguably the most impactful: if a project delivers business value, but the circumstances upon which that value is based change, the realised value may change dramatically. Consider, for example, how Brexit may affect UK businesses – project outcomes that may have been valuable in 2016 may not hold up after the UK leaves the European Union. Similarly, the COVID-19 pandemic has had seismic effects on many businesses, changing how they operate and the landscape of business opportunities before them. Projects aimed at pre-COVID business cases may no longer have contextual relevance. Understanding and adjusting for these changing circumstances helps ensure that your work is topical, relevant and impactful.
Contextual considerations are often related to business strategy, a good understanding of which is key when designing a project that is going to have an impact well into the future. A company’s business strategy describes its vision, culture and image. At its core is an understanding of the business’s goals and how its leaders intend to achieve those goals. Naturally, business goals are often centred around performance: attracting customers, increasing profits and reducing costs are almost always major driving forces. But the strategy can also extend beyond that to include things such as organisational culture, brand and image, and the company’s place in the wider community. It is important to note that the underlying goals driving strategy can vary across sectors. For example, the goals of a government agency will likely be very different from those of private-sector or not-for-profit organisations.
Understanding an organisation’s strategy will help you to more clearly see how a data science project fits in. Often such projects are part of an overall move towards innovation, so it can be helpful to clarify what the business is hoping to achieve with that initiative. In our experience, we generally turn to several key questions that can help paint a picture of what the business strategy is:
- What are your business drivers and your strategic imperatives?
- What are the main pain points or challenges in delivering your strategy?
- Who are your competitors and what do you think you need to do to stay or get ahead of them?
- Where do you think you may be missing opportunities?
- Do you have any particular business objectives in mind, such as increasing revenue, reducing costs or improving your products or services?
- Do you have any ongoing data science work already? What is its focus, and how does the current project fit into it?
These questions are ones that we, ourselves, often use to help understand the context of the project. They are gleaned from several existing frameworks that can help you to identify the business strategy. We have highlighted a few below.
2.2.1 SWOT analysis
SWOT (Strengths, Weaknesses, Opportunities and Threats) analysis is a simple, yet powerful tool that is often used to develop a business strategy. While your job as a data scientist or a consultant is not to define your client’s overall business strategy, the exercise of going through a SWOT analysis can help you to understand how the company views its position in the market. In short, this analysis allows you to identify internal capabilities (strengths and weaknesses) and external factors (opportunities and threats) to understand the business’s competitive advantage and the factors that are favourable or unfavourable to achieving its objectives. LivePlan has an excellent blog post that provides an overview of the process along with an example and questions to help drive the conversation.
2.2.2 Porter’s Five Forces
Porter’s Five Forces framework has echoes in the SWOT analysis described above. Indeed, its originator, Michael E. Porter, developed it in reaction to SWOT analysis, which he felt fell short in analysing the competition a business faces. It is generally used to assess an industry in terms of its potential for profitability. While this framework can be a powerful tool, in our experience it is less useful in identifying the contextual environment of how data science, or technical innovation in general, fits into a business’s strategy. Nevertheless, we mention it here for the benefit of those readers who would like to learn more.
2.2.3 PESTEL Analysis
PESTEL analysis is used to understand the external forces that an organisation faces. The acronym stands for Political, Economic, Social, Technological, Environmental and Legal. The premise is that organisations that are more tuned-in to the changes in these forces will be better positioned to compete. It is often used in conjunction with a SWOT analysis and, when used well, allows an organisation to not only identify these relevant forces but also to assess the potential impact that they may have.
These tools can help give you a better understanding of the circumstances and business motivations driving an organisation’s decision to embark on a data science project. You don’t necessarily have to go through the formality of a consultation session or a workshop with your client to understand a project’s wider context, but our very strong advice is to at least go through the exercise of considering some of the questions we have provided. It will give you a better understanding of how your work fits into the business as a whole and will demonstrate to your client that you are willing to take the time to fully understand the forces that drive their decisions. Usually, your client is looking to you to guide them on their data science journey, and knowing the larger context of your work will go a long way to ensuring that the outcome is relevant and valuable.
2.3 Business level
The second level of the framework corresponds to the business. In short, this describes how much value the project brings to the business. Unlike concrete, tangible deliverables, business value can be hard to measure. Success on this level may not be realised immediately, but rather may only be understood well after the project has been completed.
How does one plan for business-level success when designing a project? This is a complicated question with no single right answer; however, this is often where the creative beauty of project design comes into play. To do this well, you will want to engage with stakeholders to identify opportunities for business value based on an understanding of the organisation’s strategy. The identification of business-level goals should therefore be thought of as a collaborative effort between the data scientist and the business partner.
Naturally, projects must be feasible to generate business value. Thus when defining the business case in Phase 1 (Chapter 3), you will often find yourself moving between high-level discussions about what sorts of outcomes would be useful to the organisation and more concrete discussions about feasibility in terms of objectives, budget, data availability and appetite for risk. We discuss ways to drive this conversation during the stages of project design in Chapter 3.
In the best cases, the business value generated from a project can be expressed in concrete terms: increased revenue, time saved or measurable increases in customer satisfaction are a few examples. Exactly how a project’s outcome results in changes to these values is often confounded by other factors. In other words, your data science project will likely be one of many factors that, collectively, affect revenue or costs. Therefore, defining KPIs (Key Performance Indicators) that can accurately and quantifiably assess the impact of your work in an isolated environment will help when assessing your project’s value at the business level. We recommend that you work with your client or internal stakeholders to identify the relevant KPIs during the design stages of your project. This is covered in more detail in Chapter 5.1. For another discussion of how to consider business objectives in data science projects, we recommend this piece by Peter Skomoroch and Mike Loukides.
Sometimes the business value from a project cannot be measured objectively. Indeed, the value of exploratory and proof-of-concept projects is often in the knowledge generated or insights into the potential for future investments. For example, many of our clients undertake proof-of-concept projects to answer specific questions such as, “Is it worth it for us to invest in hiring a data science team to take this work further?” While it’s hard to place a dollar value on generating a reliable answer to this question, the value is certainly high given the strategic aspirations of the client.
One way to assess a project’s value in such cases is to align it to the stakeholder’s OKRs (Objectives and Key Results).
OKRs and business goals
The OKR framework is a system by which a company aligns priorities across all levels of the organisation. For example, three to five objectives could be set for the organisation as a whole, with multiple concrete, measurable key results attached to each. The idea is that the key results will provide evidence as to what extent the objectives have been met. Each level of the company should go through the process, such that company objectives inform departmental OKRs, which in turn help to define the team and individual OKRs. Thus, everyone in the company can be aligned as to what the business priorities are. Often OKRs are defined and reviewed quarterly.
Understanding your client’s OKRs can help you to identify key metrics that may be important for assessing a project’s success. For example, your project may be aimed at improving customer insights. Exactly how you would assess the degree to which your project has succeeded is imprecise, and your idea of success may be very different from your client’s. If framed as an objective in an OKR framework, however, you are forced to define a series of key results – measurable outcomes that you can use to assess how you have met the objective. Key results in this example could be, “Define five features that are associated with increased customer churn”, “Build a model that can accurately identify those customers who churned last year 90% of the time” or “Identify which customer segments are most likely to churn.” If your project achieves these key results, you can feel safe in declaring it a success.
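To make the idea of measurable key results concrete, the following is a minimal sketch of how an objective and one of its key results might be tracked in Python. The `KeyResult` and `Objective` classes, the toy labels and the decision to read “identify those customers who churned last year 90% of the time” as a recall of at least 0.90 on last year’s churners are all our own assumptions for illustration; your client may define the key result differently.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class KeyResult:
    """A measurable outcome with a numeric target (hypothetical structure)."""
    description: str
    target: float
    actual: float

    def is_met(self) -> bool:
        return self.actual >= self.target


@dataclass
class Objective:
    """An objective is considered met only if all of its key results are met."""
    description: str
    key_results: List[KeyResult] = field(default_factory=list)

    def is_met(self) -> bool:
        # An objective with no key results cannot be assessed, so treat it as unmet.
        return bool(self.key_results) and all(kr.is_met() for kr in self.key_results)


def recall(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Share of actual churners (label 1) that the model also flagged as churners."""
    actual_positives = sum(1 for a in actual if a == 1)
    if actual_positives == 0:
        raise ValueError("No churned customers in the data; recall is undefined.")
    true_positives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    return true_positives / actual_positives


# Toy data (assumed): 10 customers churned last year, 5 did not.
actual = [1] * 10 + [0] * 5
# The hypothetical model finds 9 of the 10 churners and raises 1 false alarm.
predicted = [1] * 9 + [0] + [0] * 4 + [1]

churn_recall = recall(actual, predicted)
objective = Objective(
    description="Improve customer insights",
    key_results=[
        KeyResult(
            description="Identify last year's churned customers",
            target=0.90,
            actual=churn_recall,
        )
    ],
)
print(f"Recall on churners: {churn_recall:.2f}; objective met: {objective.is_met()}")
```

The value of writing key results down in this way is less about the code and more about the conversation it forces: you and your client have to agree on the metric and the target before the work begins.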
In cases where it is a struggle to attach metrics to a project’s outcome, consider less tangible benefits to the work. For example, the fact that your client is undertaking a data science project shows that they are adopting a more data-driven approach to their business; while the project itself may not have a tangible ROI (return on investment), this could allow them to position themselves as a data-driven player in the market. In many cases, this resonates with higher-level stakeholders, such as boards of directors or investors. While it may not seem impactful in terms of concrete outcomes, this could certainly have contextual implications such as those outlined in the preceding section.
Similarly, a common problem we encounter with organisations that are new to data science or advanced analytics is siloed teams that rarely interact. For example, a marketing team may have one set of data, the sales team may have another and the IT department may have yet another. Your project’s value could be in bringing these disparate datasets together to generate a clearer picture of the organisation’s data landscape. While it’s hard to place a number on the value of this outcome, it is probably hugely important to the client and would be a critical first step in improving customer insights in the example above.
Whether thinking in terms of measurable quantities or insight generation, we encourage you to look back at the defined project objectives often and consider whether your work is addressing them. It can be easy to lose sight of the target when immersed in a project – reminding yourself about the project goals can help keep you on track.
2.4 Product level
The product level is concerned with the deliverables themselves and whether they meet the technical requirements of the project. In data science, this could include factors such as model performance, the accuracy of predictions across the range of data points or the performance of a data product such as a dashboard or interactive visualisation. It may also correspond to the code base and reports; these must meet the requirements and expectations of a project’s stakeholders. One of the most frustrating project outcomes is the delivery of a fantastic code base that the rest of the company does not know how to use: while the project may have succeeded on a project process level (discussed below), it may not be useful on a product level.
Consider, for example, a project where the final deliverable is a productionised solution, such as a recommendation engine. Naturally, the ability to make accurate recommendations is crucial. This should be evaluated empirically, with a metric such as MAP@K (mean average precision at K). You will want to test it on a subset of actual customers and compare its performance to a benchmark of previously used models using an A/B test. You will also likely want to include functionality that measures MAP@K going forward and integrate a model retraining protocol that will be activated (automatically or manually) when the metric dips below a certain threshold. For the recommendation engine to work efficiently, it will have to be integrated into the larger data infrastructure of the client’s system, so your product will also include the plumbing that connects it. For instance, your recommendation engine could be contained in an API. How that API is built and the structure of data coming in and out of it are important considerations, as is the speed with which the API can communicate with the rest of the system. These are a few examples of the considerations that fall under product-level evaluation.
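As an illustration of the metric mentioned above, here is a minimal sketch of MAP@K in plain Python, together with a simple threshold check that could trigger retraining. The function names, the toy data and the threshold of 0.7 are hypothetical. We also assume that each recommendation list contains no duplicates and that average precision is normalised by min(number of relevant items, K), which is one common convention among several.

```python
from typing import Hashable, Iterable, Sequence


def average_precision_at_k(
    recommended: Sequence[Hashable], relevant: Iterable[Hashable], k: int
) -> float:
    """Average precision over the top-k recommendations for a single customer."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant_set:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumed convention: normalise by the best achievable number of hits.
    return precision_sum / min(len(relevant_set), k)


def map_at_k(
    all_recommended: Sequence[Sequence[Hashable]],
    all_relevant: Sequence[Iterable[Hashable]],
    k: int,
) -> float:
    """Mean of the per-customer average precision at k."""
    scores = [
        average_precision_at_k(rec, rel, k)
        for rec, rel in zip(all_recommended, all_relevant)
    ]
    return sum(scores) / len(scores) if scores else 0.0


def needs_retraining(current_map: float, threshold: float) -> bool:
    """Flag the model for retraining when the metric dips below the threshold."""
    return current_map < threshold


# Toy example (assumed data): two customers, top-3 recommendations each.
recommended = [["a", "b", "c"], ["x", "y", "z"]]
relevant = [{"a", "c"}, {"y"}]

score = map_at_k(recommended, relevant, k=3)
print(f"MAP@3 = {score:.3f}; retrain: {needs_retraining(score, threshold=0.7)}")
```

In a production setting you would more likely compute this over logged interactions and compare it against the benchmark model within the A/B test, but the same calculation applies.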
Success at this level also corresponds to more granular aspects of your delivered product, such as the code base itself and associated documentation. Products are rarely intended to be completely static; rather, they need to be updated, modified and looked after. It is possible that you won’t be the person who does this in the future, so it is your responsibility as the product designer and creator to ensure that you leave your software in a state that others can work with, including future you. Complicated, cryptic and brittle code will make that difficult. If you take pride in your work (and you should!), leaving your project in a fit state is essential.
2.5 Project process level
The lowest level is the project process level, which is focused on the actions taken towards producing deliverables. This can include working on schedule and within budget. Within the context of data science, this can also pertain to actions related to data analysis: we place activities such as EDA (exploratory data analysis), model selection and statistical analyses at this level. In short, this represents the “hands-on” activities of the data scientist. These processes are often managed through project management methodologies like Scrum or Kanban.
For many data scientists, this is where most of your time and energy is spent. It somewhat describes how you use your tools…your scientific chops. When we think of project process level success, we consider things like the quality of your code, your fluency in coding, your knowledge of existing methodologies and approaches, how well you understand the algorithms that drive your models and how you interpret the outcomes. Naturally, many of these items are things that you must improve upon over time. For example, becoming a skilled coder and developing a deep understanding of modelling approaches are skills that must be practised and honed over years, and few of us ever feel that we don’t have room for improvement. Don’t feel that you have to be a coding expert or that you must know the details of every algorithm in the world to have success at this level, but do understand that for those with less experience, it will take more time to deliver successfully at this level.
2.6 Final thoughts
We have described four hierarchical levels of project success: contextual, business, product and project process (Figure 2.1). The reader may notice that the lower levels – product and project process – are more concrete and that, as we move up the hierarchy, the levels increase in abstraction. What should also be clear is that they similarly increase in long-term impact: a project that is successful at the product level but that falls short at the contextual level may run the risk of being ineffective or of limited usefulness to the business.
We therefore recommend that data scientists consider the framework for project evaluation in descending order, defining the higher levels of success before the lower levels. As you will see in the next section, this is directly related to the phases of project execution and helps to frame the execution framework proposed below.
The reader may also notice some overlap between the levels. For example, we place metrics such as model performance under the product level, but they certainly also have relevance at the business level. Similarly, ROI sits across both the business and contextual levels. This ambiguity reflects the fact that each level is a spectrum, and exactly where the borders lie between them is somewhat arbitrary. Nevertheless, we find it useful to work within the framework of these levels: if for no other reason, it gives structure and organisation to your approach to project evaluation and design.
We have provided a lot of information about evaluating projects, and you might be wondering if thinking about all of this is necessary. We feel that the answer is yes, although exactly how much you think about each level will vary considerably depending on the needs of the engagement and your role. Indeed, we do not expect junior data scientists to sit down with a company’s CEO to hammer out the business’s long-term strategic imperatives. However, understanding how your project fits into those imperatives is important, and it will allow you to deliver a more relevant, valuable project. Moreover, as you mature as a data scientist, it will become increasingly important to think about your work in this way. Thus, we strongly encourage you to take the time to consider these levels whenever you embark on a project and to keep them in mind as you work. At the least, it will be a valuable thought exercise; in all likelihood, it will also be a valuable practice that becomes more and more important as you mature in your field.