Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more

PorcuBI

KQL: First Steps to Analyze and Visualize your Data

So, you’ve started exploring real-time intelligence in Fabric and maybe even followed the Microsoft tutorial to implement a real-time solution. Now, you’re looking to analyze and visualize that data, and that’s where KQL (Kusto Query Language) comes in.

When I started, I struggled a bit, so I thought it would be a good idea to provide some basic queries (and explain what they do) for people who want to get started but are unfamiliar with querying data. The data I use for these examples is the Microsoft bike sample dataset.

 

Organizing and Analyzing your Data

Before analyzing your data, it’s important to organize it to make the insights clear and actionable. By applying the right commands, you can focus on the most relevant information and filter unnecessary data. 

Commands to organize your data are, for example:

  • Project: Select specific columns from your dataset
  • Where: Filter data based on conditions
  • Summarize: Group and aggregate your data
  • Order by: Sort your results

Example 1: Selecting and filtering your data 

Let's start with an example where we want to select and filter data:

 

Bikes

| take 100

| project Neighbourhood, No_Bikes, No_Empty_Docks,      

  ['Timestamp'] = current_time

| where No_Bikes > 10 and No_Empty_Docks > 5

KQL with take, project and where statementKQL with take, project and where statement


What does this query do? 

  • Take: Limits the dataset to the first 100 rows, making it manageable for analysis.
  • Project: This selects specific columns, such as Neighbourhood, No_Bikes, and No_Empty_Docks, and current time (which I rename Timestamp in the query).
  • Where: Filters the data to include only neighborhoods with more than 10 bikes and at least 5 empty docks.

Example 2: Summarizing and Sorting Data

Summarizing and sorting data are important steps in analyzing large datasets. You can aggregate your data in different ways, such as sum, average, or maximum/minimum values, and then sort the results to identify trends or outliers.
An example:

Bikes

| summarize                                                      

    TotalBikes = sum(No_Bikes),

    TotalEmptyDocks = sum(No_Empty_Docks),                    

    AverageBikes = avg(No_Bikes),

    MaxBikes = max(No_Bikes),

    MinEmptyDocks = min(No_Empty_Docks)

by Neighbourhood

| order by TotalBikes desc

Summarizing and Ordering dataSummarizing and Ordering data

 

What does this query do?

  • Summarize: Groups the data by Neighbourhood and aggregates the data as sum, average, maximum, and minimum values.
  • Order by: Sorts the results in descending order based on the TotalBikes (which is the sum of bikes).

Example 3: Renaming Columns for Better Readability

I like to rename my columns and add whitespace for better readability since I also want to visualize them and want my end users to find what they are looking for easily.

| take 100

| project ['Neighbourhood'],

          ['Number of Bikes'] = No_Bikes,                               

          ['Empty Docks'] = No_Empty_Docks,

          ['Current Time'] = current_time

Changing a column nameChanging a column name

 

This query selects and renames specific columns from the dataset to improve clarity:

  • Take 100: Retrieves the first 100 rows from the dataset.
  • Project: Chooses specific columns to include in the output and renames them (using ['New name'] )
    • Neighbourhood remains unchanged.
    • No_Bikes is renamed to Number of Bikes.
    • No_Empty_Docks is renamed to Empty Docks.
    • current_time is renamed to Current Time.

Visualizing your Results

Visualizing data helps transform raw data into meaningful insights. You can effectively communicate trends, patterns, and key findings by creating visuals that reflect that information. In KQL, you can directly visualize your data using the render statement.

 

Example 1: Line Chart

If you're working with time-based data, you can visualize trends over time using a line chart. This is useful for monitoring how values change at different time intervals.

 

Bikes

| summarize TotalBikes = sum(No_Bikes) by bin(current_time, 1d)      

| order by current_time asc

| render linechart

Render linechartRender linechart

 

What does this query do?

  • Summarize TotalBikes = sum(No_Bikes) by bin(current_time, 1d):

    • Summarize: Groups the data and calculates an aggregated value.
    • sum(No_Bikes): Adds up the number of bikes (No_Bikes) for each group.
    • bin(current_time, 1d): Groups the data into 1-day intervals (1d) based on the current_time column.
  • Order by current_time asc:

    • Sorts the grouped data in ascending order of time.
  • Render linechart:

    • Displays the data as a line chart.

 

Example 2: Bar Chart

Bar charts are useful for comparing categories, like the number of bikes across different neighborhoods.

Bikes

| summarize TotalBikes = sum(No_Bikes) by Neighbourhood       

| order by TotalBikes desc

| render barchart

Render barchartRender barchart

 

What does this query do? 
This query calculates the number of bikes per neighborhood and visualizes the results in a bar chart, sorted by the total bike count in descending order. This helps to compare bike usage across different neighborhoods.

 

Example 3: Stacked Column Chart

In KQL, you can create a stacked column chart to visualize how different components contribute to the total in each category. Here's an example:

Bikes
| take 100
| where No_Bikes > 10 and No_Empty_Docks > 5
| summarize ['Total Bikes'] = sum(No_Bikes), ['Total Empty Docks'] = sum(No_Empty_Docks)
by Neighbourhood
| render columnchart with (kind=stacked)
Render: Stacked column chartRender: Stacked column chart


What does this query do?

This query calculates the number of bikes and empty docks for each neighborhood. It then visualizes the data using a stacked column chart, where each bar represents a neighborhood. The bars are divided into sections, one for "Total Bikes" and one for "Total Empty Docks." The kind=stacked ensures that the values are stacked on top of each other.

 

I hope this article helps you understand some basic KQL commands a little better. Knowing commands like project, summarize, and where will already get you a long way in organizing, filtering, and analyzing your data.

I created a video about this if you rather follow along than read, you can find the video here.