Don’t Miss Out on These Snowflake New Features and Announcements

Teresa Kovich

Principal Consultant

August 1, 2023

“How many demos?” asked Snowflake SVP of Product Christian Kleinerman during the June 27 keynote event at the Snowflake Summit 2023 annual conference. “Ten? You’re crazy.”

Then, over the next two hours, the Snowflake team showed no fewer than 14 demos for a range of features and products – some still in development, others in private or public previews, and some released immediately that morning.

Aside from the apparent industry trends of artificial intelligence (AI), machine learning, and large language models, one central theme of Snowflake’s announcements this year was eliminating limitations. “We don’t want you to have to worry about tradeoffs,” Snowflake’s product team members told the conference attendees.

There’s so much happening in the Snowflake world that even the tech-savvy among us have trouble keeping up. So here’s a roundup of the top features that DAS42 Principal Consultant and Snowflake Data Consultant Teresa Kovich feels you should know about and how they might affect your business.

Document AI

Hands down, one of the most incredible things we saw at the summit was the preview of Document AI. Note that it’s still in the early stages and available only in limited private preview. This feature uses large language models and machine-learning concepts to extract and parse valuable information from unstructured data, such as forms, warranties, and other text-heavy documents. It can even parse handwritten input, and it gives a confidence score for the value of each identified field. You can give the model live feedback and retrain it to adjust accordingly.

Snowflake’s streams and tasks let you process new files as they arrive, and you can set up alerts that send email notifications when a file is processed with a specific result. See the five-minute demo here.

Machine Learning Packages and Tools

If your organization is as enthusiastic about data science and machine learning (ML) capabilities as most of our clients are, you’ll be excited to know that Snowflake has a variety of new machine learning tools and functions in public preview.

If you’re using Snowpark, preprocessing tools are now available to help you with data preparation steps such as one-hot encoding (converting categorical dimensions into ML-friendly numerical values), and modeling tools help you train, manage, and deploy models. See details on the new features here.
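For readers new to the term, here is a minimal sketch of what one-hot encoding does to a categorical column. It is written in plain Python purely for illustration and is not Snowpark’s API; the function name `one_hot_encode` and the sample `plans` data are hypothetical.

```python
def one_hot_encode(values):
    """Return (categories, rows): each row is a 0/1 list with one slot per category."""
    # Sort the distinct values so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the slot that matches this value
        rows.append(row)
    return categories, rows


# Hypothetical sample column: a customer's subscription plan.
plans = ["basic", "pro", "basic", "enterprise"]
categories, encoded = one_hot_encode(plans)
print(categories)
for row in encoded:
    print(row)
```

Each text value becomes a row of zeros with a single one, which is the numeric shape most ML algorithms expect.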

There are also three new SQL functions powered by machine learning designed for use on your time-series datasets:

  1. Anomaly Detection helps detect outliers, such as unusual web traffic, sales, or signups.
  2. Forecasting allows you to identify likely future trends, incorporating seasonal analysis.
  3. Contribution Explorer helps to identify what data segments or dimensions are driving or contributing to shifts in metrics.

Anomaly detection and forecasting both use gradient-boosted machines and can currently be used on datasets up to 500,000 rows.
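To make the idea of flagging outliers in a time series concrete, here is a deliberately simple sketch. It uses a basic z-score rule, not the gradient-boosted approach Snowflake describes, and the function name `flag_outliers`, the threshold, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev


def flag_outliers(series, threshold=2.0):
    """Return the positions whose value is more than `threshold` standard deviations from the mean."""
    mu = mean(series)
    sigma = stdev(series)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [i for i, x in enumerate(series) if abs(x - mu) / sigma > threshold]


# Hypothetical daily signup counts with one unusual spike.
daily_signups = [100, 102, 98, 101, 99, 103, 97, 100, 250, 101]
outliers = flag_outliers(daily_signups)
print(outliers)
```

A rule this simple ignores trend and seasonality, which is exactly the gap a trained model is meant to close.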

Privacy and Security

Although we don’t yet have a firm timeline for its release or public preview availability, we see a lot of potential utility in one of the latest features announced on the privacy and security side: query constraints. While we’ve already seen a lot of enhancements to smooth out field-level restrictions (such as the tag-based masking now available for Enterprise edition clients), query constraints are a new concept that can help bring this to the next level. Projection constraints will make certain columns not selectable, and aggregation constraints will allow certain columns to be queried only in aggregate.

In the past, DAS42 has modeled designs in our customers’ business intelligence platforms to allow for similar restrictions on sensitive datasets, which is particularly relevant for our clients in healthcare and finance. These features will simplify that control, and we’re excited to see how they handle some of the tricky edge cases. (Can you limit based on the aggregate size, e.g., only allow a query to return if the aggregate buckets are a specific size?)
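The question in the parenthetical above can be sketched in a few lines. This is only an illustration of the minimum-bucket-size idea, not a description of how Snowflake’s aggregation constraints behave; the function `aggregate_with_min_group_size` and the sample `claims` rows are hypothetical.

```python
from collections import defaultdict


def aggregate_with_min_group_size(rows, group_key, value_key, min_group_size):
    """Average `value_key` per group, suppressing groups with fewer than `min_group_size` rows."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group_key]].append(row[value_key])
    # Only groups that are large enough are returned; small groups are withheld
    # because an aggregate over very few rows can reveal individual records.
    return {
        group: sum(values) / len(values)
        for group, values in buckets.items()
        if len(values) >= min_group_size
    }


# Hypothetical sensitive dataset: the "west" bucket contains a single record.
claims = [
    {"region": "east", "amount": 100.0},
    {"region": "east", "amount": 300.0},
    {"region": "east", "amount": 200.0},
    {"region": "west", "amount": 900.0},
]
result = aggregate_with_min_group_size(claims, "region", "amount", min_group_size=3)
print(result)
```

Here the single-row "west" bucket is dropped, since its "average" would simply expose one record’s value.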

Tools for Happier Engineers

Last but not least, Snowflake didn’t forget the needs of its “hands-on-keyboard” users, releasing a whole host of features to make life easier for engineers. One of the most exciting is GROUP BY ALL, a feature that we at DAS42 have anticipated for many years. It lets our query experts simply write “group by all” instead of what may in the past have looked something like “group by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34” (where columns 15, 19, 24, and 35 are aggregations that cannot be included in the group, but that require every other column to be grouped in order to work). GROUP BY ALL identifies the non-aggregated columns automatically and removes the pain of manually counting, typing out the index list, and updating it as you move columns around.
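To show the bookkeeping that GROUP BY ALL takes off your plate, here is a small sketch that builds the positional list by hand. The helper `positional_group_by` and the sample select list are hypothetical; in Snowflake itself you would simply end the query with GROUP BY ALL.

```python
def positional_group_by(select_items, aggregate_positions):
    """Build the 'GROUP BY 1, 2, ...' clause covering every non-aggregate select item."""
    positions = [
        str(position)
        for position in range(1, len(select_items) + 1)  # SQL positions are 1-based
        if position not in aggregate_positions
    ]
    return "GROUP BY " + ", ".join(positions)


# Hypothetical select list: items 3 and 5 are aggregations.
columns = ["region", "product", "SUM(revenue)", "order_month", "COUNT(*)"]
clause = positional_group_by(columns, aggregate_positions={3, 5})
print(clause)
```

Reordering the select list means recomputing this clause by hand, which is the maintenance step that GROUP BY ALL eliminates.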

Another feature that has us jazzed is the promise of Git integration. As with query constraints, we don’t yet have a release timeline for this, but it’s a critical milestone. Snowflake told us you will be able to synchronize code in a Snowflake stage with code in a specified branch of a Git repository. Git integration allows for more robust version control and shows that Snowflake is listening to its users: this is a commonly requested feature, and we hope that stage integration is just the first step.

Finally, two features now in public preview allow engineers to simplify their pipelines. Table schema evolution allows more dynamic adjustments to your tables when underlying data structures change (rather than having to drop your tables and recreate and re-backfill them when certain types of changes occur). And dynamic tables help when you have streaming data that needs to be joined or transformed, making it simpler to orchestrate the necessary transformations (and do so efficiently): refreshes are incremental, touching only the rows that changed, and a designated “target lag” parameter defines your data freshness threshold.

Conclusion

Snowflake Summit is always a highlight of our year at DAS42. As Snowflake Data Consultants, we’re impressed by and enthusiastic about the advancements Snowflake has made in the past year. We’ll keep our finger on the pulse for you as they progress. Keep an eye on our Point of View series for a more in-depth look at some of these releases and themes, and let us know what you’re most excited about!

Ready to get started or unsure where to begin? Take advantage of a free 30-minute consultation to learn how DAS42 can help your organization no matter where you are in the data maturity lifecycle. You can also take our data maturity quiz to help you identify the next steps of your journey.