When is it Time to Think About Incorporating Machine Learning into your Data Strategy?

Chris Cannon

Associate Analytics Consultant

March 14, 2024

Enhancing Your Data Strategy for Success

You’ve probably seen many posts and articles about why machine learning (ML) projects tend not to deliver business value. These articles offer many possible reasons. Examples include the chosen project management process, failure to automate, lack of governance, and more.

So now what? If authors, researchers, and journalists can’t agree on why ML projects fail, should businesses implement ML tools and practices anyway and hope for the best? Or should we throw our hands up and return to carving wheels out of stone?

Let’s take a step back from the technical implementation of ML and discuss what we argue is the most critical step — thinking about ML and its incorporation into your data strategy.

Before we do, here is a quick side note: We acknowledge and appreciate the difference between ML and artificial intelligence (AI). To level set, ML is a subset of AI in which computers learn from data rather than from explicitly programmed rules; that is, they infer the rules from the data on their own. For consistency, we call projects enabled by these technologies ML projects.

Hiring

Before you can build ML solutions, you must have the right people. If you’re an enterprise with a data team of 50+ engineers and analysts, bringing on a team of data science specialists may be suitable for you. But the odds are you will be better served by ensuring your team is well-rounded, with some understanding of engineering, analysis, and data science. More important still, you need people who can talk about and understand business needs and translate them into data requirements.

Your data strategy will only come to fruition if you have the right people on a well-balanced team. Therefore, consider hiring as a component of your data strategy.

Data Transformation

ML requires curated, cleaned, and prepared data. But are you overengineering a process that leans exclusively toward supporting ML solutions? Alternatively, are you underemphasizing the specific requirements for ML?

For example, many ML algorithms require categorical values to be one-hot encoded, which turns the Raw table below into the One-hot Encoded table. Both say the same thing, but the format is very different. Should you store all of your data like this?

Raw

User_ID    Status
12345      Active
55555      Active
22334      Inactive
54321      Churned

One-hot Encoded

User_ID    Is_Active    Is_Churned
12345      1            0
55555      1            0
22334      0            0
54321      0            1

This encoding is very useful for ML applications. Still, is the one-hot encoded table the most helpful form for analytics users? Not necessarily.
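To make the transformation concrete, here is a minimal sketch in plain Python that reproduces the encoded table from the raw one. The function name `one_hot_encode` and the `Is_` column prefix are our own illustrative choices, not part of any particular library. Like the table above, the sketch omits an Is_Inactive column, so a row with zeros in both indicator columns is implied to be inactive.

```python
# Raw rows taken from the example table above.
raw_rows = [
    {"User_ID": 12345, "Status": "Active"},
    {"User_ID": 55555, "Status": "Active"},
    {"User_ID": 22334, "Status": "Inactive"},
    {"User_ID": 54321, "Status": "Churned"},
]


def one_hot_encode(rows, column, categories):
    """Replace `column` with one 0/1 indicator column per listed category.

    Hypothetical helper written for this illustration only.
    """
    encoded = []
    for row in rows:
        # Keep every field except the categorical column being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"Is_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# "Inactive" is deliberately left out to match the example table.
encoded_rows = one_hot_encode(raw_rows, "Status", ["Active", "Churned"])

for row in encoded_rows:
    print(row)
```

Because the encoded rows are derived from the raw rows on demand, the raw table can remain the stored, analyst-friendly form while ML pipelines generate the encoded form when they need it.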

Therefore, it is critical to let ML ambitions inform, but not dictate, your overall data strategy for data collection, transformation, and storage. This way, you invest in ML-specific engineering only for ML projects that are on the roadmap. Conversely, engaging in these thought processes helps you prepare for those ML projects that might be right around the corner.

Thinking about ML will also help inform what data you’re collecting. 

We add an important ML-agnostic point before we close out on data transformation. At DAS42, we strongly advocate for a multi-layered approach to data storage and transformation. Whether you call the layers of your data solution “bronze, silver, gold,” “raw, transformed, reporting,” or another series of names, our experience shows that it is best to keep all your raw data and convert it, layer by layer, into curated, business-ready datasets.
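As a rough sketch of the layered idea, the plain-Python example below keeps a raw layer untouched and derives a transformed layer, a reporting layer, and an ML-ready layer from it. The records, their messy formatting, and all function names are hypothetical and exist only for illustration; in practice these layers would typically be tables or views in your data platform.

```python
# Raw layer: stored exactly as received and never modified in place.
# The inconsistent formatting is invented for this illustration.
raw_layer = [
    {"user_id": "12345", "status": " active "},
    {"user_id": "55555", "status": "Active"},
    {"user_id": "22334", "status": "INACTIVE"},
    {"user_id": "54321", "status": "churned"},
]


def to_transformed(raw):
    """Transformed layer: typed IDs and trimmed, consistently cased statuses."""
    return [
        {"user_id": int(r["user_id"]), "status": r["status"].strip().capitalize()}
        for r in raw
    ]


def to_reporting(transformed):
    """Reporting layer: a business-ready summary (user count per status)."""
    counts = {}
    for r in transformed:
        counts[r["status"]] = counts.get(r["status"], 0) + 1
    return counts


def to_ml_features(transformed):
    """ML layer: 0/1 indicator columns derived from the transformed layer."""
    return [
        {
            "user_id": r["user_id"],
            "is_active": int(r["status"] == "Active"),
            "is_churned": int(r["status"] == "Churned"),
        }
        for r in transformed
    ]


transformed_layer = to_transformed(raw_layer)
reporting_layer = to_reporting(transformed_layer)
ml_layer = to_ml_features(transformed_layer)

print(reporting_layer)
print(ml_layer)
```

The design point is that analytics users and ML pipelines each get a shape suited to them, while both can be rebuilt at any time from the preserved raw layer.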

Technology Selection

Whether we’re talking about the technologies that connect, store, or serve your data, choosing the right technologies is critical to incorporating ML into your data strategy. The decisions about technology you’re making now may impact how easy or difficult it is to implement ML projects in your organization in the future.

For example, take your data storage. Does your data storage solution natively connect to ML training, serving, and monitoring tools like Snowflake or Dataiku? Or, if your data science team plans on building its own tools for business stakeholders to use, are internal service level agreements (SLAs) established? Will maintaining those tools in line with the internal SLA remain part of the team’s scope of work?

New Product Lines or Capabilities

Imagine it — the days of bootstrapping your business are over. Your successful product has expanded into new lines. Or your product lines are staying the same, but your organization is re-evaluating how it thinks about and builds its capabilities.

So, your organization is busy and moving fast. Is this a time to consider ML in your data strategy? Absolutely.

New product lines or services may generate new data or present new ML-informed decision-making opportunities. Similarly, if your organization is developing a new capability, there may be opportunities to enhance this capability with ML.

For example, suppose the new capability is a new quality assurance process that aims to improve the catch rate for products that fall outside acceptable bounds. Then, incorporating ML may be the correct answer. In this case, it may be a computer vision project to detect problems in manufactured goods automatically.

Shifting Business Strategies

If your organization is undergoing a strategic shift, such as being the subject of an acquisition, expanding into a new market, or restructuring internally, considering ML in your data strategy is as crucial as ever.

Here’s a concrete example. Imagine another company recently acquired your organization. Your target market, product lines, and capabilities have undoubtedly changed. As you adapt your data strategy to this new reality, you may have new opportunities to use ML in your data strategy.

New data sources combine with changing requirements for your architecture, ownership, security, access, and privacy from your new enterprise. You will find an increased need for standardization and data quality in this environment. While the business must answer questions like, “Whose definition of a customer are we using?” technologists must address questions around data cleansing and standardization processes to unify data formats and definitions.
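To illustrate what unifying data formats and definitions can look like, here is a small sketch that maps customer records from two organizations onto one shared schema. Every field name, ID format, value, and the assumption that the acquired company stores dates as MM/DD/YYYY is hypothetical. The business decision about what counts as a customer still has to be made before code like this can be written.

```python
from datetime import date

# Hypothetical records from the two organizations after an acquisition.
acquirer_customers = [
    {"customer_id": "A-001", "signup": "2023-01-15", "country": "US"},
]
acquired_customers = [
    {"id": 17, "created": "03/02/2022", "country_name": "United States"},
]

# Shared lookup so both sources end up with the same country representation.
COUNTRY_CODES = {"US": "US", "United States": "US"}


def from_acquirer(rec):
    """Map an acquirer record (ISO dates, country codes) to the unified schema."""
    return {
        "customer_id": rec["customer_id"],
        "signup_date": date.fromisoformat(rec["signup"]),
        "country": COUNTRY_CODES[rec["country"]],
    }


def from_acquired(rec):
    """Map an acquired-company record (assumed MM/DD/YYYY dates) to the unified schema."""
    month, day, year = (int(part) for part in rec["created"].split("/"))
    return {
        "customer_id": f"B-{rec['id']:03d}",  # prefix avoids ID collisions
        "signup_date": date(year, month, day),
        "country": COUNTRY_CODES[rec["country_name"]],
    }


unified_customers = [from_acquirer(r) for r in acquirer_customers] + [
    from_acquired(r) for r in acquired_customers
]

for customer in unified_customers:
    print(customer)
```

Once both sources share one schema, the combined records can feed both reporting and model training without per-source special cases.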

As the data architecture changes, so will the processes and technologies used to collect and integrate data. For example, your organization will build new data pipelines or modify old ones to support the emerging architecture.

These changes in data architecture can open new opportunities for ML in your data strategy. New data to train models may mean new opportunities to provide insights. For example, becoming part of a larger organization with more employee hiring data could enable a model to predict which employees will churn in the near future.

All of the above will impact model training and deployment. You may need to train models more frequently, integrate ML models into new production environments, or provide more users access to the models and their insights.

Fundamentally, ML can prescribe actions or automate processes, even in the face of changing strategies. However, this requires action on your part. You must consider ML as a component of your data strategy and adapt to this changing environment.

Conclusion

If you noticed the word “when” in the title and anticipated a straightforward answer akin to a date on a calendar, your intuition was spot on. We do provide such an answer, and that date is today.

Some will counter that they’re not “ready” to think about ML. They still struggle with what they perceive as the basics, such as establishing data quality and conformity checks and building organizational trust in their data.

You are already performing the activities discussed. We provide examples from differing ends of the data maturity spectrum, but you’re already on that spectrum, so it’s time to think about ML in your data strategy.

Whether you are hiring to build a data team of one or 100 professionals, you’re probably thinking about hiring. Whether you’re choosing to store data in spreadsheets on a shared drive or deploy a robust database architecture to Snowflake, you’re making decisions around technology, data collection, and more.

Partnering with DAS42

DAS42 is your dedicated partner, committed to unleashing the full potential of your data and fostering a data-centric company culture.

For those in search of expert guidance to shape a robust predictive analytics framework, aligning with DAS42 offers a wealth of experience and proven methodologies. If you’re driven to unveil untapped opportunities for growth and success, while instilling predictability into your strategies, let’s connect and embark on this transformative journey together.


DAS42 is a premier data and analytics consultancy with a modern point of view. We specialize in solving some of the most complex business challenges for the world’s most successful companies. As a Snowflake Elite Partner, DAS42 crafts customized strategies that create a single source of truth and enable enhanced and faster decision-making. DAS42 has a presence across the U.S. with primary offices in New York City and Denver. Connect with us at das42.com and stay updated on LinkedIn. Join us today on our journey to help you realize the possibilities of transforming your business through data and analytics.