Natural language queries on tabular data with column-based filtering and analysis

As BI enthusiasts on our team, we want to ask questions that compare concepts in our tabular data and have Storytell answer them.

We will know this is done when:

  1. Having provided a large table to Storytell, e.g. one with the columns 'clicks', 'battery level', 'screen brightness' and 'customer'

  2. Having identified a column as a label column (e.g. 'customer')

  3. We can ask a question that can be answered using common BI terms and get an answer in Storytell. The LLM can manipulate the answer after the data has been retrieved from BigQuery. Examples:

    1. What customer has the most clicks?

    2. How many clicks does our customer have? (Here our customer is a value in the 'customer' column, and the 'customer' column is auto-labelled.)

    3. For our customer, generate a table comparing the total number of clicks on two axes, device battery level and screen brightness. If the clicks are evenly distributed across all battery levels and screen brightnesses, output the string 'no fraud detected by screen brightness and battery level'. However, if a significant percentage of the clicks are well correlated by battery level and screen brightness, output the message 'possible fraud detected due to unusual correlation between battery level and screen brightness'. If possible fraud is detected, also render a heat map visually showing the numerical data in the table.
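
As a rough illustration of what examples 1 and 2 ask for, the aggregation can be sketched in plain Python. The sample rows and the `clicks_per_customer` helper are hypothetical and only stand in for the query Storytell would actually run against the table.

```python
from collections import Counter

# Hypothetical sample rows; the column names follow the example table above.
rows = [
    {"customer": "acme", "clicks": 120},
    {"customer": "globex", "clicks": 340},
    {"customer": "acme", "clicks": 80},
]


def clicks_per_customer(rows):
    """Sum the 'clicks' column grouped by the 'customer' label column."""
    totals = Counter()
    for row in rows:
        totals[row["customer"]] += row["clicks"]
    return totals


totals = clicks_per_customer(rows)

# Example 1: which customer has the most clicks?
top_customer, top_clicks = totals.most_common(1)[0]

# Example 2: how many clicks does one particular customer have?
acme_clicks = totals["acme"]
```

Both questions reduce to the same group-by on the label column, which is why identifying the label column (step 2) comes first.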

Feature goal: allow writing a question that references labels that are columns. To answer it, Storytell starts by writing a traditional BI query correlating those columns, then feeds the results of that query to the LLM to generate a story tile.
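
The first half of that goal, turning referenced columns into a traditional BI query, might look roughly like the sketch below. Everything here is an assumption made for illustration: the function name `build_crosstab_query`, the underscore-style column names, and the table name are hypothetical, and the identifier check is deliberately stricter than BigQuery itself requires. The customer value is left as a named query parameter (`@label_value`) rather than pasted into the SQL text.

```python
import re

# Deliberately strict: letters, digits and underscores only (an assumption for this sketch).
_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TABLE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def build_crosstab_query(table, label_column, row_column, col_column, metric_column):
    """Build a GROUP BY query totalling metric_column over two columns for one label value."""
    if not _TABLE.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    for name in (label_column, row_column, col_column, metric_column):
        if not _COLUMN.match(name):
            raise ValueError(f"unsafe column name: {name!r}")
    return (
        f"SELECT `{row_column}`, `{col_column}`, SUM(`{metric_column}`) AS total\n"
        f"FROM `{table}`\n"
        f"WHERE `{label_column}` = @label_value\n"
        f"GROUP BY `{row_column}`, `{col_column}`"
    )


sql = build_crosstab_query(
    "analytics.events", "customer", "battery_level", "screen_brightness", "clicks"
)
```

The rows returned by a query like this are what would then be handed to the LLM to produce the story tile.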

This lets us write questions such as: "For customer @customer_column_value, generate a table comparing the total number of queries on two axes, device battery level (a column in the table) and screen brightness (a column in the table). If the queries are evenly distributed across all battery levels and screen brightnesses, output the string 'no fraud detected by screen brightness and battery level'. However, if a significant percentage of the queries are well correlated by battery level and screen brightness, output the message 'possible fraud detected due to unusual correlation between battery level and screen brightness'. If possible fraud is detected, also render a heat map visually showing the numerical data in the table."
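
To make the expected behaviour concrete, here is a minimal sketch of the table and the decision that question describes. The request leaves "evenly distributed" and "significant percentage" undefined and expects the LLM to make that judgement, so the concentration check below (flag when a single cell holds more than `max_cell_share` of the total) is only a stand-in, and the 0.5 threshold is an arbitrary assumption. The function names are hypothetical, and the heat map is left out.

```python
from collections import defaultdict

NO_FRAUD = "no fraud detected by screen brightness and battery level"
POSSIBLE_FRAUD = (
    "possible fraud detected due to unusual correlation "
    "between battery level and screen brightness"
)


def crosstab(result_rows):
    """Pivot (battery_level, screen_brightness, total) rows into a nested dict."""
    table = defaultdict(dict)
    for battery, brightness, total in result_rows:
        table[battery][brightness] = table[battery].get(brightness, 0) + total
    return table


def fraud_message(table, max_cell_share=0.5):
    """Return the fraud message when one cell holds more than max_cell_share of the total."""
    cells = [total for row in table.values() for total in row.values()]
    grand_total = sum(cells)
    if grand_total == 0:
        return NO_FRAUD  # nothing to analyze, so nothing to flag
    busiest_share = max(cells) / grand_total
    return POSSIBLE_FRAUD if busiest_share > max_cell_share else NO_FRAUD


even = crosstab([("low", "dim", 10), ("low", "bright", 10),
                 ("high", "dim", 10), ("high", "bright", 10)])
skewed = crosstab([("low", "dim", 97), ("low", "bright", 1),
                   ("high", "dim", 1), ("high", "bright", 1)])
```

With these inputs, `fraud_message(even)` returns the 'no fraud' string and `fraud_message(skewed)` returns the 'possible fraud' string; in the latter case the same nested table is the data a heat map would be drawn from.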

Storytell serves this by generating a traditional BI query and running it to produce a table. It then analyzes the results with an LLM to answer the question. The result is used to generate an individual 'widget' in the dashboard, presented as a story view.

Status: In Review
Board: 💡 How I'd like to use Storytell
Date: 8 months ago
