Marco117
Frequent Visitor

MS Fabric Spark notebook with poor performance

I have a notebook that was running on a Databricks cluster with only 4 executor cores and 4 driver cores, and with Autoscale and dynamic executor allocation disabled. There, the notebook executed in approximately 3 minutes.

Now, in Fabric, the same notebook with the same inputs (which I read from Databricks) takes 14 minutes to execute (see the attached image; the notebook is invoked from a pipeline). What I see different in Fabric is that the workspace has Autoscale and Dynamically allocate executors enabled, and there are moments when the notebook scales up to 72 cores even though this is really not necessary.
[Attached image: Marco117_0-1727708906047.png]

What could I do to improve this?

4 REPLIES
prasbharat
Frequent Visitor

@Marco117 

It sounds like you’re experiencing a significant difference in execution time when running your Spark notebook in Microsoft Fabric compared to Databricks. Based on your description, here are some suggestions that could help optimize performance:

1. Validate Resource Allocation and Execution Overhead

Use the Monitor run series view to check where the additional time is being spent. Running the notebook directly, outside the pipeline, might reveal whether the pipeline execution is introducing delays.

  • If there are multiple notebooks in the pipeline, consider enabling High Concurrency Mode to minimize sequential bottlenecks.
  • If concurrency doesn’t apply, you can try running the notebooks independently for a direct comparison.

2. Since Autoscale and Dynamic Allocation are enabled, they might be overprovisioning resources unnecessarily. If your workload is stable and doesn’t require frequent scaling:

  • Disable Autoscale and Dynamic Allocation temporarily.
  • Configure fixed resources for Spark using settings similar to your Databricks cluster:

spark.conf.set("spark.executor.instances", "2")
spark.conf.set("spark.executor.cores", "4")
spark.conf.set("spark.driver.cores", "4")
spark.conf.set("spark.sql.shuffle.partitions", "8")

 

3. Ensure your input data is stored in an optimized format like Parquet or Delta for better read performance. If the current format is CSV or JSON, consider converting it:

df = spark.read.csv("path_to_data", header=True, inferSchema=True)
df.repartition(4).write.mode("overwrite").parquet("optimized_data_path")

Partition your dataset based on the number of executor cores to reduce shuffle overhead:

df = df.repartition(4)  # e.g., 4 partitions to match the 4 executor cores

 

4. Monitor job performance:

  • Use the Spark UI (reachable from the Monitoring hub in Fabric) to analyze job execution. Look for long-running stages or inefficient shuffling during transformations.
  • Use the explain() method to analyze your query execution plans and optimize joins, filters, or other transformations, as in the sketch below.
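
To make the explain() suggestion concrete, here is a minimal, hypothetical sketch; the paths and column names below are placeholders, not taken from the original post:

# Hypothetical example: inspect the physical plan of a filter + join
orders = spark.read.parquet("orders_path")          # placeholder path
customers = spark.read.parquet("customers_path")    # placeholder path
result = orders.filter(orders["amount"] > 100).join(customers, "customer_id")
result.explain()  # prints the physical plan to the cell output

In the printed plan, repeated Exchange operators indicate shuffles; if a small dimension table ends up in a SortMergeJoin instead of a BroadcastHashJoin, broadcasting it explicitly may help.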

If this post helps you resolve the issue, please consider accepting it as the solution to help other members find it more quickly. If you still have additional questions, feel free to let me know, and I’ll be happy to assist further. Thanks a lot!

 

abuislam
Frequent Visitor

It seems like the Autoscale and Dynamic Allocation settings in Fabric might be over-allocating resources, which could be slowing things down. Try adjusting those settings to better match your workload, or consider disabling Autoscale and Dynamic Executor Allocation to match the Databricks setup. You could also experiment with manually setting the number of cores to see if that improves performance.

lbendlin
Super User

I have been told that bucket sizes play an outsized role (sorry for the pun). Maybe that's something you can compare between your two setups.

v-cgao-msft
Community Support

Hi @Marco117 ,

 

1. I tested running notebooks from a pipeline versus running them directly, and the pipeline run took only a few seconds longer than the direct run (my query is simple). You can further confirm where the time is being spent in the Monitor run series view.

Monitor Apache Spark run series
I'm not sure how many notebooks are in the pipeline; if there is more than one, consider using High Concurrency Mode.

Introducing High Concurrency Mode for Notebooks in Pipelines for Fabric Spark

If no exceptions are found above, then it is time to look at Autoscale and dynamic executor allocation.

If your workload is relatively stable and you don't need additional cores to speed up execution, you might consider disabling these two features.

 

2. Why the notebook uses 72 cores:
When you enable autoscale for Spark pools, jobs execute with their minimum node configuration. During runtime, scaling may occur. These scale-up requests go through job admission control. Approved requests scale up to the maximum limits based on total available cores. Rejected requests don't affect active jobs; they continue to run with their current configuration until cores become available.
Job admission in Apache Spark for Fabric
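
As a quick sanity check (a minimal sketch; some of these properties may be unset depending on how the pool is configured), you can print the allocation settings the running session actually received:

# Print the effective allocation settings of the current Spark session
for key in [
    "spark.dynamicAllocation.enabled",
    "spark.dynamicAllocation.maxExecutors",
    "spark.executor.cores",
    "spark.executor.instances",
]:
    print(key, "=", spark.conf.get(key, "<not set>"))

This makes it easy to confirm whether dynamic allocation is actually enabled for the session before attributing the 72-core spikes to it.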

 

3. Other possible causes:
If you change the default pool from the Starter Pool to a custom Spark pool, you may see a longer session start time (~3 minutes).
In my test, both the session start and the first command execution took longer the first time (it only took about 20 seconds before).

[Attached image: vcgaomsft_0-1727766642335.png]

Workspace administration settings in Microsoft Fabric

 

Best Regards,
Gao

Community Support Team

 

If any post helps, please consider accepting it as the solution to help the other members find it more quickly.
If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!

How to get your questions answered quickly --  How to provide sample data in the Power BI Forum
