Analytics Functions in Google BigQuery

Get quick insights using SQL analytics functions

A window (analytic) function computes values over a group of rows and returns a single result for each row. This differs from an aggregate function, which returns a single result for a whole group of rows.

A window function includes an OVER clause, which defines a window of rows around the row being evaluated. For each row, the window function result is computed using the selected window of rows as input, possibly doing aggregation.
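To make the contrast with aggregate functions concrete, here is a minimal sketch. The four inline rows and the alias customer_total are illustrative assumptions, not part of the sample data introduced later. A GROUP BY customer query over these rows would collapse them to two result rows, whereas the window version keeps all four and annotates each with its customer's total.

```sql
-- Sketch only: the inline rows and the alias customer_total are hypothetical.
WITH sales_order AS (
  SELECT DATE '2023-11-07' AS delivery_date, 'Customer A' AS customer, 10 AS amount
  UNION ALL SELECT DATE '2023-11-08', 'Customer A', 20
  UNION ALL SELECT DATE '2023-11-08', 'Customer B', 100
  UNION ALL SELECT DATE '2023-11-09', 'Customer B', 200
)
SELECT
  delivery_date,
  customer,
  amount,
  -- The window is every row that shares the current row's customer.
  SUM(amount) OVER (PARTITION BY customer) AS customer_total
FROM sales_order
ORDER BY customer, delivery_date
```

Each Customer A row carries a customer_total of 30 and each Customer B row carries 300, while the individual amount values remain visible.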

With window functions, you can, for example:

  1. compute moving averages,

  2. rank items,

  3. calculate cumulative sums.
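The three computations listed above can be sketched in a single query. The table name daily_sales, its rows, and the output aliases are assumptions chosen for illustration; the two-row moving average is likewise just one possible window size.

```sql
-- Sketch only: daily_sales and all aliases are hypothetical.
WITH daily_sales AS (
  SELECT DATE '2023-11-07' AS sale_date, 10 AS amount
  UNION ALL SELECT DATE '2023-11-08', 20
  UNION ALL SELECT DATE '2023-11-09', 30
  UNION ALL SELECT DATE '2023-11-10', 40
)
SELECT
  sale_date,
  amount,
  -- Moving average over the current row and the one before it.
  AVG(amount) OVER (ORDER BY sale_date ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_avg_2_rows,
  -- Rank 1 goes to the largest amount.
  RANK() OVER (ORDER BY amount DESC) AS amount_rank,
  -- Running total from the first date up to the current row.
  SUM(amount) OVER (ORDER BY sale_date) AS cumulative_amount
FROM daily_sales
ORDER BY sale_date
```

For these rows the moving average is 10, 15, 25, 35, the rank is 4, 3, 2, 1, and the cumulative sum is 10, 30, 60, 100.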

Step 1: Create Sample BigQuery Table

Let's create a simple sales_order table in Google BigQuery. The query below defines it inline as a common table expression (CTE):

  -------------- Create a sales_order table --------------
  with
  sales_order as (
      select DATE('2023-11-07') as delivery_date, 'Customer A' as customer, 'Kenya' as shipping_country, 10 as amount
      union all select DATE('2023-11-08') as delivery_date, 'Customer A' as customer, 'Kenya' as shipping_country, 20 as amount
      union all select DATE('2023-11-09') as delivery_date, 'Customer A' as customer, 'Kenya' as shipping_country, 30 as amount
      union all select DATE('2023-11-10') as delivery_date, 'Customer A' as customer, 'Kenya' as shipping_country, 40 as amount
      union all select DATE('2023-11-08') as delivery_date, 'Customer B' as customer, 'Kenya' as shipping_country, 100 as amount
      union all select DATE('2023-11-09') as delivery_date, 'Customer B' as customer, 'Kenya' as shipping_country, 200 as amount
      union all select DATE('2023-11-10') as delivery_date, 'Customer B' as customer, 'Kenya' as shipping_country, 300 as amount
  )
  select * from sales_order

Output:
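If a persistent table is preferred over an inline CTE, the same rows could be materialized with a CREATE TABLE ... AS SELECT statement. This is a sketch that assumes a dataset named sales already exists in the current project and that you have permission to create tables in it.

```sql
-- Assumption: the dataset `sales` already exists in the current project.
CREATE OR REPLACE TABLE sales.sales_order AS
SELECT DATE '2023-11-07' AS delivery_date, 'Customer A' AS customer, 'Kenya' AS shipping_country, 10 AS amount
UNION ALL SELECT DATE '2023-11-08', 'Customer A', 'Kenya', 20
UNION ALL SELECT DATE '2023-11-09', 'Customer A', 'Kenya', 30
UNION ALL SELECT DATE '2023-11-10', 'Customer A', 'Kenya', 40
UNION ALL SELECT DATE '2023-11-08', 'Customer B', 'Kenya', 100
UNION ALL SELECT DATE '2023-11-09', 'Customer B', 'Kenya', 200
UNION ALL SELECT DATE '2023-11-10', 'Customer B', 'Kenya', 300
```

Queries would then read from sales.sales_order instead of repeating the CTE.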

Analytics Function 1: FIRST_VALUE

  • FIRST_VALUE returns the value of the first row in the current window frame.

  • Use case:

    • Suppose we want a unique list of customers, each with its first delivery_date from sales_order.

    • We can use the FIRST_VALUE analytics function as follows:

         ------- get customer list with first delivery date
            select
                distinct customer,
                FIRST_VALUE(delivery_date) OVER (PARTITION BY customer ORDER BY delivery_date ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS first_delivery_date
            from sales_order
      

      Output:
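FIRST_VALUE is not limited to the ordering column. As a further sketch, the query below also picks up the amount of each customer's earliest delivery. The alias first_amount and the named window w are illustrative choices, and the CTE repeats the sample rows (without shipping_country) so the query runs on its own.

```sql
-- Same rows as the sample table; shipping_country is omitted for brevity.
WITH sales_order AS (
  SELECT DATE '2023-11-07' AS delivery_date, 'Customer A' AS customer, 10 AS amount
  UNION ALL SELECT DATE '2023-11-08', 'Customer A', 20
  UNION ALL SELECT DATE '2023-11-09', 'Customer A', 30
  UNION ALL SELECT DATE '2023-11-10', 'Customer A', 40
  UNION ALL SELECT DATE '2023-11-08', 'Customer B', 100
  UNION ALL SELECT DATE '2023-11-09', 'Customer B', 200
  UNION ALL SELECT DATE '2023-11-10', 'Customer B', 300
)
SELECT DISTINCT
  customer,
  FIRST_VALUE(delivery_date) OVER w AS first_delivery_date,
  -- Amount taken from the same (earliest) row of each customer's partition.
  FIRST_VALUE(amount) OVER w AS first_amount
FROM sales_order
-- Hypothetical named window, shared by both FIRST_VALUE calls.
WINDOW w AS (
  PARTITION BY customer
  ORDER BY delivery_date
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY customer
```

With the sample rows this yields 2023-11-07 with amount 10 for Customer A and 2023-11-08 with amount 100 for Customer B.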

Analytics Function 2: LAST_VALUE

  • LAST_VALUE returns the value of the last row in the current window frame.

  • Use case:

    • Suppose we want a unique list of customers, each with its last delivery_date from sales_order.

    • We can use the LAST_VALUE function as follows:

             ------- get customer list with last delivery date
              select
                  distinct customer,
                  LAST_VALUE(delivery_date) OVER (PARTITION BY customer ORDER BY delivery_date ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_delivery_date
              from sales_order
        

Output:
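The explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING frame matters most for LAST_VALUE. When a window has an ORDER BY but no frame clause, BigQuery's default frame ends at the current row, so LAST_VALUE simply returns the current row's own value. The sketch below puts both variants side by side; the aliases last_default_frame and last_full_frame are illustrative, and the CTE repeats the sample rows (without shipping_country) so the query runs on its own.

```sql
-- Same rows as the sample table; shipping_country is omitted for brevity.
WITH sales_order AS (
  SELECT DATE '2023-11-07' AS delivery_date, 'Customer A' AS customer, 10 AS amount
  UNION ALL SELECT DATE '2023-11-08', 'Customer A', 20
  UNION ALL SELECT DATE '2023-11-09', 'Customer A', 30
  UNION ALL SELECT DATE '2023-11-10', 'Customer A', 40
  UNION ALL SELECT DATE '2023-11-08', 'Customer B', 100
  UNION ALL SELECT DATE '2023-11-09', 'Customer B', 200
  UNION ALL SELECT DATE '2023-11-10', 'Customer B', 300
)
SELECT
  customer,
  delivery_date,
  -- Default frame: stops at the current row, so this echoes delivery_date.
  LAST_VALUE(delivery_date) OVER (
    PARTITION BY customer ORDER BY delivery_date
  ) AS last_default_frame,
  -- Full-partition frame: the customer's true last delivery date.
  LAST_VALUE(delivery_date) OVER (
    PARTITION BY customer ORDER BY delivery_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS last_full_frame
FROM sales_order
ORDER BY customer, delivery_date
```

For every row, last_default_frame equals that row's delivery_date, while last_full_frame is 2023-11-10 for both customers.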