Thought Leadership

Harnessing the Power of Unstructured Data with Snowflake

Chris Lugg

Associate Analytics Consultant

December 15, 2023
Featured image for “Harnessing the Power of Unstructured Data with Snowflake”

Transforming Data Analysis: Insights from Snowflake’s BUILD 2023

The BUILD session at Snowflake has set a new direction in data analytics, with an emphasis on extracting insights from unstructured data with Machine Learning (ML). The keynote illustrated this with an example from a hypothetical company, Frosty Toys, showing how easy it can be to analyze call transcripts, customer feedback, and reviews using upcoming features in Snowflake. This approach opens up a wealth of data previously untapped, offering deeper insights into customer behavior and preferences.

A New Era in Data Analysis

A core theme of the session was integrating every step in the data processing workflow within Snowflake. Key features in Snowflake as an end-to-end solution include…

Model Fine-Tuning and Deployment. Leveraging features such as Snowpark Container Services, Snowflake Cortex, and Snowpark ML (all in Private Preview), users can train their own models, fine-tune open-source models, and share these models with the Snowflake ML Registry.

Streamlit Integration. Streamlit provides an interactive front-end experience for end users and is written entirely in Python. Streamlit’s presence continues to grow as the Snowflake-integrated front end of choice for displaying data analysis and interacting with data ML models.

Snowflake Native Apps and Streamlit

Other highlights included Snowflake Native Apps and the integration of Streamlit in Snowflake (SiS). One session showcased an SiS app that utilizes a custom Large Language Model (LLM) and the creation of a Native App seamlessly pushed to the Snowflake Marketplace. With Native Apps, Snowflake enables developers to create secure apps delivered through its platform, opening new revenue streams. These apps only share back to the developer user-approved information, like log events, ensuring safety and data privacy.

On the topic of Snowflake Native Apps, Snow CLI (Private Preview) will make it easy to initialize and build Snowflake Native Apps with convenience commands that generate the requisite folder structure and files.

LLM Innovators Conference: Pushing the Boundaries of AI

The LLM Innovators Conference showcased five winners from a hackathon with hundreds of submissions. A standout application was a “personality” saver for an LLM. This tool saved a set of prompts and previous writings in memory, allowing the LLM to complete tasks without human supervision and refer back to these results later. This innovation opens the door to “AGI Light,” granting more autonomy to LLMs.

A practical application of this technology could be in customizing prompt sets for routine tasks. For instance, a “data engineering” ChatGPT could be configured for short, code-heavy responses and multiple-method explanations.

Snowpark ML: The Future of Machine Learning in Snowflake

Snowpark ML, not yet in Public Preview, represents a significant advancement in Snowflake’s ML capabilities. It includes a library for common functions like preprocessing and model training directly in Snowpark and a Model Registry for accessing open-source models and sharing user-generated ones. This development promises a more streamlined approach to machine learning within the Snowflake ecosystem.

The Data Pipeline Monitoring with Snowflake Cortex was a great demonstration of this.  Snowflake Cortex introduces an innovative approach to ML with SQL and Python functions. The demonstration included time series forecasting using snowpark.ml.forecast and anomaly detection with snowpark.ml.anomaly_detection. The use of TASK for model re-training and ALERT for notifications exemplifies the platform’s ability to perform data monitoring and notification. This approach, requiring minimal SQL for ML model operations, democratizes advanced data analytics, allowing predictive modeling without extensive ML knowledge.

Conclusion

Snowflake’s BUILD session has set the stage for a new era in data analytics, where the focus is shifting towards maximizing the potential of unstructured data and democratizing ML. The integration of an ML library, the ease of model deployment, and the security of Snowflake Native Apps are just a few of the advancements paving the way for more insightful, efficient, and secure data analysis.


DAS42 is a premier data and analytics consultancy with a modern point of view. We specialize in solving some of the most complex business challenges for the world’s most successful companies. As a Snowflake Elite Partner, DAS42 crafts customized strategies that create a single source of truth and enable enhanced and faster decision-making. DAS42 has a presence across the U.S. with primary offices in New York City and Denver. Connect with us at das42.com and stay updated on LinkedIn. Join us today on our journey to help you realize the possibilities of transforming your business through data and analytics.