
Sahir_Maharaj

All in One Place - How Fabric Notebooks Simplify Data Science

As a data scientist, I know firsthand how challenging it can be to manage data analysis, with multiple tools and environments complicating the process. It can feel like you’re juggling many different apps just to get a clear picture of your data. But imagine having one environment that integrates all the steps from data exploration to visualization, a place where you can write, visualize, and transform your data all in one go. That’s where Microsoft Fabric notebooks come in.

 

What you will learn

In this edition, I'm going to guide you through how to explore and visualize data using Microsoft Fabric notebooks. We'll start with why this tool is so helpful for your workflow, then move on to how you can make use of its unique capabilities. By the end of this post, you'll know how to use Microsoft Fabric notebooks to explore your data, visualize key insights, and streamline your data analysis process. Stick with me, and let's dive into what makes this tool a game changer for data professionals.

 

Read Time: 6 minutes

 

Microsoft Fabric notebooks offer a convenient, unified environment that allows you to handle data exploration, transformation, and visualization effortlessly. They bring together the flexibility of notebooks, much like Jupyter, and integrate it within the Microsoft ecosystem, which makes it accessible if you’re already using other Microsoft tools like Power BI or Azure. If you've ever struggled with piecing together results across multiple applications, you’re going to appreciate the simplicity and power Fabric notebooks bring to the table.

 

One of the standout features of Microsoft Fabric notebooks is their integration within the Microsoft ecosystem. If you’re already using tools like Power BI, working with these notebooks feels very natural. You can conduct deep data analysis in the same environment where you create visual dashboards, with no need to import and export data back and forth. Everything is interconnected. Consider a scenario where you’re analyzing sales data: you can clean the data within the notebook, write some Python or SQL to explore trends, and then visualize those trends instantly, all without leaving Microsoft Fabric.

 

Source: Microsoft Learn

 

Another reason I love these notebooks is the flexibility they offer. Whether you prefer coding in Python or you’re more comfortable using SQL, Fabric notebooks provide the best of both worlds. Say you have data that needs a complex transformation: you can use Python for the heavy lifting, then switch to SQL for quick aggregations, all within the same notebook. This makes it an ideal tool if you have mixed experience across different coding languages. Plus, since it’s built on the foundation of familiar notebook environments, if you’ve used Jupyter or Azure Notebooks before, the learning curve is virtually non-existent.
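To make the Python-then-SQL pattern concrete, here is a minimal sketch you can run outside Fabric. The sales figures are made up, and I'm using pandas with an in-memory SQLite database as a stand-in for the lakehouse; in a Fabric notebook you would do the SQL part in a `%%sql` cell or with `spark.sql()` instead.

```python
import sqlite3

import pandas as pd

# Hypothetical sales data, standing in for a lakehouse table
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "revenue": [1200.0, 950.0, 1100.0, 1300.0],
})

# Heavy lifting in Python: derive a new column
sales["revenue_k"] = sales["revenue"] / 1000

# Quick aggregation in SQL on the transformed data
with sqlite3.connect(":memory:") as conn:
    sales.to_sql("sales", conn, index=False)
    totals = pd.read_sql(
        "SELECT region, SUM(revenue) AS total FROM sales GROUP BY region", conn
    )

print(totals)
```

The split mirrors how I tend to work in Fabric: complex row-level logic stays in Python, while summaries read more clearly as a SQL statement.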

 

Source: Sahir Maharaj

 

These notebooks also come with built-in visualization capabilities, which means you can easily go from numbers to insightful charts with just a few clicks. Imagine you’re tasked with figuring out which marketing channels performed best over the past quarter. In Fabric notebooks, you can write some quick analysis code to generate insights and then immediately create a chart to visualize which channels drove the most conversions. It’s fast, effective, and helps you communicate your findings with clarity.
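As a sketch of that marketing-channel scenario, the snippet below aggregates some invented conversion numbers with pandas and charts the result with Matplotlib, both of which work in a Fabric notebook cell. The channel names and figures are hypothetical.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs outside a notebook
import matplotlib.pyplot as plt

# Hypothetical conversions by marketing channel over the past quarter
conversions = pd.DataFrame({
    "channel": ["Email", "Social", "Search", "Email", "Search"],
    "conversions": [120, 80, 150, 95, 130],
})

# Quick analysis: total conversions per channel, best performer first
by_channel = (
    conversions.groupby("channel")["conversions"].sum().sort_values(ascending=False)
)

# Immediately visualize the result
ax = by_channel.plot(kind="bar", title="Conversions by Channel (Last Quarter)")
ax.set_xlabel("Channel")
ax.set_ylabel("Conversions")
plt.tight_layout()
```

In a notebook, running this cell shows the chart inline right under the analysis, which is exactly the "numbers to insight" loop the paragraph above describes.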

 

Source: Sahir Maharaj

 

Now that you understand the value of Fabric notebooks, let’s walk through how to set them up.

 

1. To begin, log in to your Microsoft Fabric environment and select the Data Science tile.

 

Source: Sahir Maharaj

 

2. In the dedicated Data Science area, under Recommended items to create, select Notebook. A blank notebook opens, and you’re ready to start exploring your data. The interface is straightforward: navigation is on the left-hand side, and on the right are your notebook cells, where you write code.

 

Source: Sahir Maharaj

 

3. There’s also a toolbar alongside each code cell that lets you run individual cells. If you’ve used Jupyter notebooks before, this will feel familiar, with a few added features that make life easier, like tighter integration with other Microsoft services.

 

Source: Sahir Maharaj

 

4. Next, let’s import some data. Select the Lakehouses menu on the left-side panel, and you’ll see a list of available lakehouses. For this demo, let’s add a new lakehouse that contains holiday data for a store.

 

Source: Sahir Maharaj

 

5. You can click on the table to load it directly into your notebook - a few lines of Python code will bring it into memory, ready for exploration. Alternatively, if SQL is more your style, you can run SQL commands directly in the notebook to query the data, which is extremely convenient for quick summaries and checks.

 

Source: Sahir Maharaj

 

6. Data cleaning is often the most time-consuming part of any analysis, and Fabric notebooks make it easy to handle. Suppose the dataset has some missing values: you can use PySpark’s DataFrame API to identify and fill in these gaps. In the notebook, it takes just a few lines of code.

from pyspark.sql.functions import col, when, lit, to_date

# Load the data from the Lakehouse
df = spark.sql("SELECT * FROM SalesLakehouse.sales LIMIT 1000")

# Ensure 'date' column is in the correct format
df = df.withColumn("date", to_date(col("date"), "yyyy-MM-dd HH:mm:ss"))

# Fill missing values in 'isPaidTimeOff' with False
df = df.withColumn("isPaidTimeOff", when(col("isPaidTimeOff").isNull(), lit(False)).otherwise(col("isPaidTimeOff")))

# Remove duplicate rows
df_cleaned = df.dropDuplicates()

# Rename columns for clarity
df_cleaned = df_cleaned.select(
    col("countryOrRegion").alias("Country/Region"),
    col("holidayName").alias("Holiday Name"),
    col("normalizeHolidayName").alias("Normalized Holiday Name"),
    col("isPaidTimeOff").alias("Is Paid Time Off"),
    col("countryRegionCode").alias("Country Code"),
    col("date").alias("Date")
)

# Display the cleaned data
display(df_cleaned)

 

Source: Sahir Maharaj

 

7. Once your data is clean, it’s time to explore it. Let’s say you’re interested in analyzing holidays by country/region. You can use Python to create a quick bar plot showing which countries or regions have more recorded holiday data, giving you insight into the distribution of holiday observations.

import matplotlib.pyplot as plt

df_pandas = df_cleaned.toPandas()

# Count of holidays by country/region
plt.figure(figsize=(12, 6))
df_pandas['Country/Region'].value_counts().plot(kind='bar', title='Count of Holidays by Country/Region')
plt.xlabel('Country/Region')
plt.ylabel('Count of Holidays')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

 

Source: Sahir Maharaj

 

Fabric notebooks also provide built-in charting capabilities, so once you have your dataframe ready, all it takes is a simple command to visualize it.

 

8. Visualization is where your data tells its story. In Microsoft Fabric notebooks, you can visualize your data using built-in charts or by integrating with popular Python libraries like Matplotlib or Seaborn. Let’s visualize the distribution of paid time off status: a pie chart lets you see this breakdown at a glance.

# Distribution of paid time off status
plt.figure(figsize=(6, 6))
df_pandas['Is Paid Time Off'].value_counts().plot(kind='pie', autopct='%1.1f%%', title='Paid Time Off Status')
plt.ylabel('')
plt.tight_layout()
plt.show()

 

Source: Sahir Maharaj

 

Whether you’re cleaning data, creating visualizations, or sharing insights with your team, Fabric notebooks offer a unified environment that saves you time and effort. If you’ve struggled with jumping between different tools and environments in the past, I encourage you to give Fabric notebooks a try. It’s time to take your data analysis to the next level: try Microsoft Fabric notebooks today, and see how they can make a difference in your workflow.
