Sawtooth Data


Strategic ML


Machine learning projects often fail for two reasons: irrelevance and impossibility. Both failures stem from a lack of strategy when choosing and building machine learning projects. Projects are irrelevant when they aren’t connected to the goals of the business. It doesn’t matter how fancy a model is or how small its error is if it doesn’t do anything for the business. Projects are impossible when the data on hand can’t be used to solve the problem. I’ve seen businesses generate a ton of data that they think is super valuable and hand it off to a data scientist to “build a model”. Once the data scientist understands how the data was generated and does some exploratory data analysis, most of the time that’s the end of the road.

In this post, I’ll work through one process for how to choose and complete successful projects.

Start with the basics

The first thing a successful machine learning project must have is a business objective. In organizations where OKRs are used, the KPIs behind them are the ideal metrics to optimize. They are directly relevant to the business goals, and they don’t require research to compute. These metrics are probably already generated as the output of an existing business process. Make sure you understand the business process as well as how it gets translated into metrics. When you’re trying to improve these things, the devil is in the details.

If you are in an organization that doesn’t have good metrics to optimize, then that’s where you should start. Respect the “data science hierarchy of needs” and start by building out a good analytics and metrics base.

Define your problem

Once you have an idea of which business problems are important, you need to figure out where to focus your project. A good technique for figuring this out is building a “funnel” view of the business process from start to finish. At each step you should write down the inputs and outputs as well as any metrics and KPIs that get computed. As a side note, even if you are unable to make any progress on the problem itself, this view will be invaluable to your business partners. They may have never seen their process laid out this way!

Using the funnel view, you should figure out which step is performing the worst. That’s a place where you might expect to be able to make the biggest improvement and is probably a good place to start.
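As a rough illustration, the funnel view can be written down as a small table of steps with a conversion rate for each. This is only a sketch: the step names and counts below are made up for illustration, and `FunnelStep` and `worst_step` are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class FunnelStep:
    """One step of a business process (hypothetical structure)."""
    name: str
    entered: int    # how many items or users entered this step
    completed: int  # how many made it through to the next step

    @property
    def conversion(self) -> float:
        # Fraction of inputs that became outputs; 0.0 if nothing entered.
        return self.completed / self.entered if self.entered else 0.0


def worst_step(steps: list[FunnelStep]) -> FunnelStep:
    """Return the step with the lowest conversion rate."""
    return min(steps, key=lambda step: step.conversion)


# Illustrative numbers only, not real data.
funnel = [
    FunnelStep("visit", entered=10_000, completed=4_000),
    FunnelStep("signup", entered=4_000, completed=1_000),
    FunnelStep("checkout", entered=1_000, completed=600),
]

for step in funnel:
    print(f"{step.name:<10} {step.conversion:.0%}")

print("Worst step:", worst_step(funnel).name)
```

The lowest conversion rate is only a starting heuristic. A step with a poor rate may still be the wrong place to focus if very little volume flows through it, so it is worth looking at the counts alongside the rates.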

Drill down into that step in the process. The goal is to take the inputs, outputs, and metrics and rewrite them as a well-posed problem to optimize. If you can reframe the step as a series of decisions that an expert needs to make, then you might have a human-in-the-loop optimization problem. If users are deciding whether or not to continue to the next step, then you have an A/B testing optimization problem.
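For the A/B testing case, one common way to compare two variants of a step is a two-proportion z-test on their conversion rates. The sketch below is one possible approach under simple assumptions (independent users, reasonably large samples), not a prescribed method; the function name and the counts are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare conversion rates of variants A and B.

    Returns (z, p_value) for a two-sided test using the pooled
    standard error. Assumes independent samples that are large
    enough for the normal approximation to be reasonable.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("No variation in outcomes; the test is undefined.")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 1,000 users per variant.
z, p_value = two_proportion_z_test(conv_a=250, n_a=1000, conv_b=300, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be noise, but it says nothing about whether the improvement is large enough to matter to the business. That judgment still comes from the KPI you chose to optimize.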

Once you have the problem framed in a way that it can be solved, you get to brainstorm ways to solve it. This is the fun part. I suggest you start by analyzing the existing data and talking to people. While analyzing the data, try to break complex parts down into the smallest series of simple steps you can. Those smaller steps will be easier to understand and to optimize. But don’t let the analysis paralyze you: you have to talk to users and stakeholders to really understand how things work. Talk to everyone you can! Really good solutions grow out of a really good understanding of the problem space (and iteration).

Iterate to success

Now that your problem is well posed and you have some ideas to try out, you can iterate using the build-measure-learn framework. Because you started the project by choosing to optimize an existing process, you won’t have to worry about your data being irrelevant. The data you have was generated with a purpose.

Incorporate your insights back into the process. If the problem is really hard to solve, change the problem! You’ll usually find some flexibility in how things are done. That flexibility lends itself to redefining problems to make them easier to solve, while still delivering value. Remember that each step in a process has a goal, and most of the time it doesn’t matter how you achieve that goal. What matters is that you achieve it.