Part 2: How to scope data science projects


This is the second in a four-part series on managing data science projects in government.

  1. Part 1: How to solicit and select data science projects
  2. Part 2: How to scope data science projects
  3. Part 3: How to deliver a data science project
  4. Part 4: How to tell your data science story

Data science projects can quickly drift off track – there’s just so much fun data to play with! That’s why, once you have a viable data science project, you need to scope it so you can deliver actionable insights in a reasonable amount of time.

Work in cohorts to time-bound scope

At DataSF, our data science projects happen in cohorts – groups of projects that run during a specific time period. A cohort conveniently builds in a deadline: each cohort ends and a new one begins.

We launch each cohort with a kickoff meeting. The kickoff walks the teams through the project charter, reviews the timeline for each project and starts the data access process. This helps set timing expectations for every project.

Bringing everyone into one room also subtly signals the constraints your team operates under. Each client sees that theirs is not your only project, and that asking for out-of-scope analytical support or causing unnecessary data delays affects the others.

Use a project charter as a scoping tool

The project charter is a document jointly signed by our team and the department’s project champion and leadership. Although it is a formal agreement, its real purpose is to serve as a conversation tool that surfaces ambiguities, crystallizes responsibilities, and identifies and measures the deliverable.

Most of the sections – problem statement, service change and business case – can be lifted and polished from the application (our clients apply to work with us). The real conversation starts when you dig a bit deeper.

Identify issues and stakeholders early

Our section on “constraints, assumptions, risks and dependencies” provides space for clients to lay out any fears and concerns they may have. Better to know and address these issues now than later.

This section may also surface important scoping issues. For example, in one project, we identified a law that constrained the type of analysis that could be used in the service change.

We also have a stakeholder exercise at our cohort kickoff meeting to ensure we’re building support for the project from the beginning. It also helps identify whose expectations we need to surface and manage in terms of the scope and deliverable.

Create milestones to benchmark progress

In any project, milestones help measure progress and identify who is responsible for what. In data science projects, it can be hard to decide what counts as a milestone.

At DataSF, we build in milestones tied to how we approach the analysis to help keep us on track (e.g. exploratory analysis briefing, research briefing, model briefings, etc.). Basically, we put ourselves on the hook for meaningful, intermediate deliverables.

Define success and success metrics

Hearing what success looks like in the client’s words helps get everyone on the same page. It also helps elicit any lurking expectations. If their expectations of success are way beyond what your team believes is feasible, you need to know this before diving in.

You should also be leery of a project whose success cannot be quantified. Quantified success metrics also help us stay on target in terms of scope. Most projects can split into many analytical branches bearing juicy, insightful data fruit. But you don’t have infinite time. So your analytical forays need to tie back to these key success metrics. A good gating question on additional analysis is: “How does this support our key success measure or the service change we plan?”

We often refer back to this success statement to help keep us on track.
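If your team tracks proposed analyses in code or a notebook, the gating question can be sketched as a simple check. This is only an illustration, not something DataSF has published: the `ProposedAnalysis` class, the `passes_gate` function and the example metric and service change are all hypothetical names made up here.

```python
from dataclasses import dataclass


@dataclass
class ProposedAnalysis:
    """A candidate analytical branch (hypothetical structure for illustration)."""
    name: str
    # The success metrics or service change this analysis claims to support.
    supports: frozenset = frozenset()


def passes_gate(analysis, success_metrics, service_change):
    """Return True if the analysis ties back to a success metric or the service change."""
    targets = set(success_metrics) | {service_change}
    return bool(analysis.supports & targets)


# Made-up example values, not taken from a real charter.
metrics = ["median days to resolve a request"]
change = "prioritize inspections by predicted risk"

on_target = ProposedAnalysis(
    "resolution time by request type",
    frozenset({"median days to resolve a request"}),
)
tangent = ProposedAnalysis("interesting seasonal pattern in an unrelated field")

for candidate in (on_target, tangent):
    verdict = "in scope" if passes_gate(candidate, metrics, change) else "park it"
    print(f"{candidate.name}: {verdict}")
```

The point of the sketch is the habit, not the code: every new branch of analysis has to name the success measure or service change it supports before anyone spends time on it.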

Define the deliverables

What is the thing clients expect in their hands when the project is complete? Is it a script? A workbook? A tool? This starts the important conversation of what your client can reasonably be expected to maintain. Depending on technical ability, some clients may be better served by an Excel workbook than an R script. This also surfaces the question of who will maintain the end product going forward. More on this in our next article.

You may need to revisit the deliverable as you learn more (and should communicate that to your client), but by defining it early, you are also helping to define what “done” looks like.
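For teams that like to keep project paperwork next to their code, the charter sections discussed in this post could be sketched as a small record with a check for sections nobody has filled in yet. This is a rough sketch under our own assumptions, not the actual DataSF charter template: the field names simply mirror the sections named above, and `ProjectCharter` and `open_sections` are hypothetical names.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ProjectCharter:
    """Charter sections named in this post (hypothetical layout, not the real template)."""
    problem_statement: str = ""
    service_change: str = ""
    business_case: str = ""
    constraints_assumptions_risks_dependencies: list = field(default_factory=list)
    milestones: list = field(default_factory=list)
    success_statement: str = ""
    success_metrics: list = field(default_factory=list)
    deliverable: str = ""


def open_sections(charter):
    """List the sections that are still empty and need a conversation with the client."""
    return [f.name for f in fields(charter) if not getattr(charter, f.name)]


# Made-up draft: only two sections have been filled in so far.
draft = ProjectCharter(
    problem_statement="Placeholder problem statement lifted from the application",
    deliverable="Excel workbook",
)

for section in open_sections(draft):
    print("Still to discuss:", section)
```

An empty section is a prompt for a conversation rather than a form field to fill: the list of open sections doubles as a kickoff agenda.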

Remember, whenever you’re knee-deep in the data and starting to feel lost, refer back to the charter as your guide. A thoughtful charter can help both you and your client stay on task.

Feel free to modify our charter, available in our DataScienceSF resource collection.