The OmniSci data science foundation includes the Altair visualization library. An overview of Altair from the project website:
Altair is a declarative statistical visualization library for Python, based on Vega and Vega-Lite, and the source is available on GitHub.
With Altair, you can spend more time understanding your data and its meaning. Altair’s API is simple, friendly and consistent and built on top of the powerful Vega-Lite visualization grammar. This elegant simplicity produces beautiful and effective visualizations with a minimal amount of code.
Altair and Ibis
Although Altair is typically used with smaller, local datasets, OmniSci has integrated it with Ibis (and this integration itself is open-source). This combination allows interactive visualization over extremely large datasets consisting of billions of data points, all with minimal Python code.
In addition, Altair supports composable visualization, which allows for more than just local data exploration on small datasets when combined with Ibis. Because Ibis can support multiple storage backends, you can, for example, create charts that cover more than one (remote) data source at a time.
The following examples highlight the capabilities of Altair and Ibis together with OmniSci.
JupyterLab version 2.0 or higher is required for the following examples.
First, install ibis-vega-transform, which in turn installs Altair and Ibis.
pip install ibis-vega-transform
jupyter labextension install ibis-vega-transform
The following minimal example of Ibis and Altair together starts with a simple pandas dataframe.
You can use Altair directly with pandas, without using Ibis (see the Altair documentation). This example shows how Ibis can support pandas itself as a backend in addition to the SQL backends Ibis supports.
Next, let's use Altair with a more scalable Ibis backend. This example uses OmniSciDB, but you can try this with other Ibis backends supported via the ibis-vega-transform project that bridges Altair to Ibis.
This example connects to a public OmniSci server, but you can use any OmniSci server you have access to.
Next, let's create a simple Altair chart. This chart groups the airlines by the number of records (i.e., flights) in this dataset. Doing so should produce a bar chart like the earlier example; the difference is that we're now connected to an OmniSci backend rather than using a local pandas dataframe.
In the background, the Ibis expression t[t.carrier_name] is translated into a SQL query, and the results are rendered directly as a chart, with no SQL knowledge required.
c = alt.Chart(t[t.carrier_name]).mark_bar().encode(
Next, let's create a more interesting chart than a simple bar chart: an Altair heatmap.
This should create a heatmap in which hovering over a cell shows an interactive tooltip.
Adding More Interactivity
Altair provides many ways to add interactivity to charts. Actions such as selections and brush filters make visualizations more dynamic, letting you explore data in a far richer manner than static charts allow.
#The next 2 lines create a selection slider to drive a parametrized Ibis expression
This creates an interactive chart that is parametrized by the slider. Moving the slider changes the selected month and updates the chart. Unlike working with a static, local dataset, you are now running SQL queries against OmniSciDB each time the slider value changes.
You can see this in the logs, in the final query generated:
SELECT "flight_dayofmonth", avg("depdelay") AS average_depdelay
WHERE "flight_month" = 3.0  -- this value comes from the slider
GROUP BY flight_dayofmonth
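To make the idea of re-running a query per slider value concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for OmniSciDB. The table name flights, the sample rows, and the helper average_delay_by_day are hypothetical; only the column names and the shape of the aggregate come from the logged query above.

```python
import sqlite3

# In-memory SQLite database standing in for a remote OmniSciDB server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights ("
    "flight_month INTEGER, flight_dayofmonth INTEGER, depdelay REAL)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [(3, 1, 10.0), (3, 1, 20.0), (3, 2, 5.0), (4, 1, 0.0)],
)


def average_delay_by_day(month):
    """Run the aggregate query for one month (hypothetical helper)."""
    query = """
        SELECT flight_dayofmonth, avg(depdelay) AS average_depdelay
        FROM flights
        WHERE flight_month = ?
        GROUP BY flight_dayofmonth
        ORDER BY flight_dayofmonth
    """
    return conn.execute(query, (month,)).fetchall()


# Each slider position corresponds to a fresh query with a new parameter.
march = average_delay_by_day(3)
april = average_delay_by_day(4)
```

Only the aggregated rows are returned to the client, which is why this pattern scales to tables far larger than local memory.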
You can build sophisticated chart combinations that combine several of Altair's capabilities with Ibis to create a crossfiltered visualization, like in OmniSci Immerse. In this example, every data source is an Ibis expression that generates SQL queries to an OmniSci backend. A total of five queries are generated and executed to create the crossfiltered visualization.
states = alt.selection_multi(fields=['origin_state'])
(count_filter | count_total) & (flights_by_state | carrier_delay) & time
This generates the following Altair visualization, which leverages composable charting and provides greater interactivity with enhanced selections powered by dynamic data loading via Ibis.
Altair and Ibis can also be used to visualize geospatial data. Altair supports multiple geospatial visualizations and can accept GeoPandas dataframes as input. Some Ibis backends, including OmniSci, support spatial operations, which output to GeoPandas dataframes. By combining the two, you can create map-based visualizations.
You can combine Ibis and Altair inside JupyterLab. By defining multiple Ibis backend connections, you can create complex interactive visualizations that span multiple data sources, all without moving data into local memory. This allows greater flexibility and productivity in data exploration.