
Iterate efficiently

No need to do exploration in a local notebook. You can iterate in a temporary View or Job instead.
If you are a data scientist, chances are that you are used to working like this:
  1. Iterate on the code in a notebook
  2. Move the code to a library function
  3. Import it back into the notebook
  4. Proceed to the next part of the code
The best way to carry this workflow over to Databutton is to iterate in a temporary View or Job.

Iterating in a View

The benefit of using a View to iterate is that you can use st.write to get graphical output. This makes it a good fit for:
  • Exploratory Data Analysis
  • Parsing JSON structures
  • Modeling and analysis
Streamlit supports Matplotlib and Plotly, and it renders both Pandas DataFrames and JSON nicely. Thus, you can carry over most of the graphical output you normally use in notebooks.
The caveat with using a View to iterate is that the entire script is re-run on every change or interaction. This means that you will have to wait for expensive function calls and may incur costs for calling third-party APIs. Luckily, this can be largely avoided by caching expensive function calls. As an example, here we cache fetching data and calculating the correlation matrix:
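import databutton as db
import matplotlib.pyplot as plt
import streamlit as st


@st.cache
def fetch():
    # Load the stored dataframe and compute its correlation matrix once;
    # later re-runs of the View reuse the cached result.
    df = db.storage.dataframes.get("df")
    corr = df.corr()

    return corr


corr = fetch()
# matshow returns an image artist; get_figure() gives the figure to display.
fig = plt.matshow(corr).get_figure()
st.pyplot(fig)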
​
The cache decorator ensures that the function is re-run only when something it depends on changes, such as its code or its input arguments. With cached functions, your iteration speed will be close to what you are used to in a notebook. If you keep iterating over a longer period or work with frequently updated data, you may need to add a time-out for the cache: st.cache(ttl=time_in_seconds).
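To make the time-out behaviour concrete, here is a plain-Python stand-in for a cache with a time-to-live. It is only a conceptual sketch and not how Streamlit implements st.cache; the ttl_cache helper, the short time-out, and the call counter are all hypothetical names introduced for illustration. In an actual View you would simply pass ttl to the decorator as described above.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds):
    """Hypothetical helper: cache a zero-argument function's result for ttl_seconds."""

    def decorator(func):
        state = {"value": None, "expires_at": None}

        @wraps(func)
        def wrapper():
            now = time.monotonic()
            if state["expires_at"] is None or now >= state["expires_at"]:
                # Cache is empty or stale: re-run the expensive function.
                state["value"] = func()
                state["expires_at"] = now + ttl_seconds
            return state["value"]

        return wrapper

    return decorator


call_count = 0  # counts how often the function body actually runs


@ttl_cache(ttl_seconds=0.5)
def fetch():
    # Stand-in for an expensive call such as loading data or hitting an API.
    global call_count
    call_count += 1
    return {"fetched_call": call_count}


first = fetch()    # runs the function body
second = fetch()   # served from the cache, body is not re-run
time.sleep(0.6)    # wait until the cached value has expired
third = fetch()    # cache expired, so the body runs again
```

The second call returns the cached object without doing any work, while the third call happens after the time-out and triggers a fresh fetch. This is the trade-off a time-out gives you: fast iteration most of the time, with data that is never older than the chosen number of seconds.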

Iterating in a Job

While a View is nice for iterating when you need graphical output, iterating in a Job can be more suitable for certain tasks. In particular, for scraping a website, parsing an API response, or similar tasks where terminal output is enough, a Job is the better choice. We recommend installing the rich package (see how to install packages here) to make the terminal output much easier to read.
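As a rough sketch of this kind of iteration, the snippet below parses a made-up API response and prints the result with both the built-in print and rich's print. The payload and the summarize helper are hypothetical and only serve as an illustration. The fallback to pprint is an assumption added so the sketch still runs where rich is not installed.

```python
import json
from pprint import pprint

try:
    # rich is a third-party package and must be installed first.
    from rich import print as rich_print
except ImportError:
    # Fallback (assumption for this sketch) so it runs without rich.
    rich_print = pprint

# Hypothetical API response, made up for illustration.
raw_response = (
    '{"status": "ok", "items": ['
    '{"id": 1, "tags": ["a", "b"]}, '
    '{"id": 2, "tags": []}]}'
)


def summarize(payload: str) -> dict:
    """Parse the JSON payload and keep only the parts we care about."""
    data = json.loads(payload)
    return {
        "status": data["status"],
        "ids": [item["id"] for item in data["items"]],
        "tag_count": sum(len(item["tags"]) for item in data["items"]),
    }


summary = summarize(raw_response)

print(summary)       # regular print: plain, single-line output
rich_print(summary)  # rich print: pretty-printed output in the terminal
```

While iterating, you can keep adjusting summarize and re-run the Job, using the printed structure to check each step before moving on to the next part of the code.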
Here's an example of rich's print output vs. the regular print method:
[Screenshots: rich's print output; the regular print output]