Recently, several of DAS42’s experienced Snowflake architects and developers came together to talk about Streamlit. For several years, we’ve been using Streamlit to develop tools and applications to enable our clients’ data-driven use cases. From chatbots to data quality audit applications to no-code interfaces for end users to interact with advanced predictive analytics, we’ve found a wide range of ways to take advantage of all Streamlit has to offer. Read on to learn more about what it is, why we like it, and how we’re using it.
What is Streamlit and how does it integrate with Snowflake?
Streamlit is a lightweight UI framework that enables relatively simple applications to be built in Python, a language that data engineers are already familiar with and typically already using.
Yeah – there’s Streamlit, which as Logan said, is a UI framework – it’s a Python library, open source, simple to use. Then there’s Streamlit in Snowflake (SiS), which is a product of Snowflake’s acquisition of Streamlit a couple of years ago – it’s a native integration with Snowflake and the Snowflake universe, akin to how Snowpark runs Python natively in Snowflake. So you can use Streamlit on your local machine to connect to any database out there – go forth, do great things. Or you can use Streamlit in Snowflake, which makes it even simpler by directly querying your data in the Snowflake ecosystem with even fewer lines of code – but with some trade-offs.
It’s also usually the front-end for any Snowflake native app. If you build a native app with any sort of front-end capability, chances are the front end is going to be Streamlit. The exception being apps driven off containerized services.
What excites you about Streamlit? What specific features of Streamlit make it a preferred choice for building data-driven applications?
It doesn’t require extensive web development experience. You don’t need UI development experience to build an app on Streamlit, and it’s particularly known for its simplicity and speed, enabling you to transform data scripts into something you can share with other users in just a few lines of code.
We addressed one of the key points, which is the use of Python, but aside from that, one of the things I’m really excited about – and I’ve seen this both with Streamlit and with Snowflake itself – is really responsive product updates. Both teams really seem to be listening to users, continually updating things in a way that feels very natural and intuitive.
At the end of the day, one of the primary things is that it enables people to interact with the underlying data – say, a forecasting model – without needing to know how to code.
One thing that makes Streamlit exciting is the built-in widgets that come with it – sliders, buttons, text inputs. We’ve seen in past projects how big a role these play in creating Streamlit’s interactive feel. They’re highly customizable too, so whoever is developing can tailor the app to exactly what the stakeholder needs. For me, though, the biggest feature is the community itself – how responsive the Streamlit community is in general. One other thing we’ve used heavily is the caching mechanism – st.cache (since split into st.cache_data and st.cache_resource) lets you avoid redundant processing.
Plus-one to the community aspect – I have posted numerous questions on the Streamlit community and I have never waited more than 36 hours for a reply, and most of the time those replies got me unblocked. They were high quality.
Also – it’s owned by Snowflake, but at its core it’s open-source. So you can take Streamlit and point it anywhere – it doesn’t need to be pointed at Snowflake – and then you can deploy it anywhere. You can deploy it on Streamlit Community Cloud, which is one of the things that got me excited about Streamlit, because they have really streamlined the process of deploying an app quickly. But then it’s flexible, so you can point it at, say, GCP and deploy it to your private and secure instance over there, if that’s where your workloads are being held.
In what scenarios would DAS42 recommend the use of Streamlit over other BI tools, applications, or frameworks? What makes it stand out for certain projects or requirements?
I would say if there’s a need to interact directly with data, especially something like a forecasting or predictive model. Our subscriber analytics app is a perfect example, where the forecasting model lets users adjust different parameters – that’s not something you can easily serve through a typical BI tool, certainly not without a lot of prefabricated data modeling to support it.
That goes for the chatbot model as well – for example, interacting with Snowflake’s new Cortex “Complete” function – but for someone who doesn’t know SQL, Streamlit is a really lightweight interface to connect an end user directly to these functions. It would be hypothetically possible to do something similar in some BI tools, but it would just be unnecessarily complicated.
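To show the shape of this, here is a pure-Python sketch of the SQL a Streamlit chat app could send to Cortex’s Complete function – the Snowpark session setup is omitted, the model name is just an example, and the quoting is the bare minimum for illustration:

```python
# Sketch of the statement a Streamlit chat UI could send to Snowflake Cortex.
# Session handling is omitted; only the SQL shape is shown.
def cortex_complete_sql(model: str, prompt: str) -> str:
    # SNOWFLAKE.CORTEX.COMPLETE(model, prompt) returns the model's text reply.
    escaped = prompt.replace("'", "''")  # minimal SQL quoting for the sketch
    return (
        "SELECT SNOWFLAKE.CORTEX.COMPLETE("
        f"'{model}', '{escaped}') AS response"
    )

# In the app, st.chat_input would supply the prompt and st.chat_message
# would render the reply; here we only build the statement.
query = cortex_complete_sql("mistral-large", "What were Q3 sales drivers?")
```

The point is how little sits between the end user and the function: a text box, one generated statement, and a rendered reply.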
If you have a lot of knobs to turn, this is where Streamlit plays an important role compared to other BI tools.
There’s also that obligatory single Google Sheet in every data model, and you could use Streamlit to serve the role of that Google Sheet right in Snowflake, so at least it’s not living somewhere else. Really, any scenario that involves manipulating data – especially datasets that are relatively static but that you’re never going to be able to replace. The other thing I find interesting is doing something in a repeatable manner: since Snowflake can interact with external APIs, you could use Snowflake in a lot of ways as a central hub to drive automation across the rest of your data ecosystem. Streamlit becomes the conduit between a non-technical person who has some input for a given API and your data system – they specify what should happen and where, and then Snowflake simply goes out and does the work instead of it being done across multiple systems. It can really help you keep your compute inside Snowflake.
What are the main challenges that teams might face when you’re using Streamlit, and how can DAS42 help organizations overcome these challenges?
As far as extracting data from Snowflake (or any other data warehouse) and manipulating it, you’re only limited by how fast you can download and upload data and process it on your local machine – but with Streamlit in Snowflake, there are explicit limits on how much data a given query can return. That said, you can use Streamlit in such a way that you push your compute up to Snowflake: in our subscriber analytics app, we called Snowpark functions to build the model with Snowflake’s compute, and then we only needed to interpret the results locally. But in terms of limitations – I would not try to download a terabyte of data and manipulate it with Streamlit.
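The pushdown idea can be sketched without a live connection. In this hedged example – table and column names are hypothetical – the aggregation runs in the warehouse, so the app only ever receives a small summary instead of raw rows:

```python
# Pattern: push aggregation into Snowflake so only a compact summary comes
# back, rather than pulling raw rows into the Streamlit process.
# The table and column names below are hypothetical.
def daily_summary_sql(table: str, metric: str) -> str:
    return (
        f"SELECT DATE_TRUNC('day', event_ts) AS day, "
        f"SUM({metric}) AS total "
        f"FROM {table} GROUP BY day ORDER BY day"
    )

query = daily_summary_sql("events", "revenue")
# The app would run this via its Snowflake connection and chart the handful
# of summary rows, never downloading every raw event.
```

The same principle applies to model building: call Snowpark functions so the heavy lifting happens on warehouse compute, and keep only the results client-side.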
Another challenge is, well, garbage in, garbage out. But DAS42 will take a holistic approach to the data you want to analyze – we’ll build you more than a UI.
There’s a general learning curve to Streamlit. It’s simple compared to other app development, but along the way we’ve accumulated a lot of best practices that we distill for our clients. We help them avoid ending up with a Frankenstein Streamlit app that works, but that under the hood could never be debugged.
What are some examples of Streamlit applications that DAS42 has built, and how can they transform the way our clients interact with their data?
For one client, we built a forecasting model application used to forecast the business’s future under different strategies. The pain point was that the Chief Strategy Officer was using a local Excel sheet, which had grown very large and was difficult to share. Any time there were changes, any time he wanted to input new variables, it took a long time to run, and seeing the revenue breakdown individually across branches simply wasn’t viable – Excel had hit its limit. With Streamlit hosted on serverless compute in Google Cloud, we could scale it easily. We included lots of widgets, drop-downs, and tunable variables – and, most importantly, he could share the application with his other C-level executives and board members, so they could play around and see what the forecasting model would predict for different scenarios.
Our Classification Model Builder app demonstrates the art of the possible when it comes to predictive workloads in Streamlit. It brings the user through a series of feature engineering prompts and then asks if the user wants to build a Cortex classification model or if they want to build a scikit-learn classification model, which gets into that idea of abstracting away something really technical.
We’ve also built retrieval-augmented generation chatbots, which are great for any business that has a large repository of knowledge and wants a quick interface for finding the relevant information inside, say, dozens or hundreds of PDFs. Our public demo is for the Lincoln Park Zoo in Chicago – you can ask about its seasonal hours, or parking, or whether the cafe has gluten-free food, or what sensory accommodations are available. We also have use cases for open-season questions in health insurance, for a business’s customers to use a chatbot directly, and for internal users – say, phone agents or seasonal workers – to have a tool that points them to the right resources, speeds up customer service, and improves the customer experience.
The subscriber analytics application enables time-series forecasting alongside other BI functionality – for example, revenue over time, turnaround time, and predictions around revenue growth – with the ability to adjust those knobs based on user inputs about what they foresee happening in the future, and to analyze the impact on those key metrics.
We also built a data resolution and cleanup tool, originally intended for the telecommunications space, for our telco clients to have a no-code way to resolve device mapping issues, like device type, and actually write that back to the database.
Our STRIDE application abstracts away the need to code in Terraform. An end user can click a button that says “generate Terraform” (following a generic architecture), or click another button that says “I want to design my own,” click through a few options, type in the database and schema names they want, and it spits out all the Terraform code – and even allows a one-click deploy right there in the Streamlit app. So someone setting up a new Snowflake instance can do it in a flexible but repeatable way.
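The core of that kind of tool is simple templating. Here is a toy sketch – not STRIDE itself – where form inputs become Snowflake provider resources; the resource attributes follow the common Snowflake Terraform provider shape, but may vary by provider version:

```python
# Toy version of "generate Terraform from form inputs": the user types a
# database and a schema name, and the app emits Snowflake provider resources.
# (Illustrative only; attribute names may differ across provider versions.)
def render_terraform(database: str, schema: str) -> str:
    return f'''resource "snowflake_database" "{database.lower()}" {{
  name = "{database.upper()}"
}}

resource "snowflake_schema" "{schema.lower()}" {{
  database = snowflake_database.{database.lower()}.name
  name     = "{schema.upper()}"
}}
'''

# In the app, st.text_input would collect the names and st.code would
# display the result; here we just render the template.
tf = render_terraform("analytics", "raw")
```

Because the output is deterministic text, the same form always produces the same infrastructure code – which is exactly what makes the setup repeatable.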
Looking ahead, how do we see the role of Streamlit evolving in data-driven decision-making and app development?
For my part, I’ve seen executive reports that are generated with Python and delivered as PDFs. I don’t understand why we still live in a world where the manual delivery of a PDF by email is necessary. With automated QA processes in place, executives should be able to interact with a Streamlit app, turn the knobs they need, and see the data closer to real time, versus a PDF they get once a day.
In terms of data insights, I think Streamlit will evolve to help democratize data – enabling non-technical users and building their data literacy – because it’s easy to use, easy to share, and very intuitive.
I just think we’ll see more of it. We’ve been working with it for a while, but from a market perspective, it’s still pretty new. As Snowflake continues to grow, as new people get into it and use it at first as just a warehouse, they’ll then realize that, wow, there’s a lot more here.
Partnering with DAS42
At DAS42, we bring our deep expertise as a Snowflake Elite Services Partner and an innovative approach to help organizations harness the full potential of Streamlit and Snowflake for their data-driven needs. Our team of experts excels in building sophisticated, user-friendly applications that simplify data interaction and empower decision-making. Whether you want to implement predictive analytics, streamline data workflows, or create interactive dashboards, we can tailor solutions that meet your requirements.
DAS42 is a premier data and analytics consultancy with a modern point of view. We specialize in solving some of the most complex business challenges for the world’s most successful companies. As a Snowflake Elite Partner, DAS42 crafts customized strategies that create a single source of truth and enable enhanced and faster decision-making. DAS42 has a presence across the U.S. with primary offices in New York City and Denver. Connect with us at das42.com and stay updated on LinkedIn. Join us today on our journey to help you realize the possibilities of transforming your business through data and analytics.