dbeavon3
Power Participant

Anyone having luck with pyspark workloads in Fabric? Getting assorted error messages.

Has anyone tried to open a "professional" support ticket for pyspark? I think there are some growing pains. Fabric's pyspark and the support for it may both be works in progress.

 

I am encountering some very unfamiliar messages in the Fabric Spark environment. The errors appear to be proprietary to Microsoft; I haven't seen them in other Spark implementations (Databricks, Synapse, HDI, or OSS). I'm pretty sure they would turn up in my Google results if they came from OSS Apache Spark. If anyone recognizes any of these errors, please let me know. They were encountered in various parts of the pyspark experience on Fabric. I'm not aware of any service degradation or known outages, so I'm assuming these are just bugs in Fabric.

 

 

Error 1.  This one is from the Fabric spark UI:


invalid_grant: Error(s): 501481 - Timestamp: 2025-01-01 - Description: AADSTS501481: The Code_Verifier does not match the code_challenge supplied in the authorization request

 

... it happens when trying to open the logs from the spark UI.
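
For context, AADSTS501481 refers to the PKCE check from RFC 7636: the client sends a code_challenge when sign-in starts and must later present the code_verifier it was derived from. Below is a minimal sketch of that standard check, not Fabric's or Entra ID's actual code; the function names are made up for illustration. If the Spark UI follows the standard flow (an assumption), the mismatch points at the sign-in round trip rather than at the Spark job itself.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Return a random URL-safe code_verifier (43 characters)."""
    return secrets.token_urlsafe(32)


def code_challenge_s256(code_verifier: str) -> str:
    """Derive the S256 code_challenge: BASE64URL(SHA256(verifier)), padding stripped."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verifier_matches(code_verifier: str, code_challenge: str) -> bool:
    """The comparison an authorization server performs when the code is redeemed."""
    return secrets.compare_digest(code_challenge_s256(code_verifier), code_challenge)


# The client sends the challenge first and the verifier later.
verifier = make_code_verifier()
challenge = code_challenge_s256(verifier)
print(verifier_matches(verifier, challenge))              # matching pair
print(verifier_matches(make_code_verifier(), challenge))  # a different verifier fails the check
```

A verifier generated in a different browser session than the one that sent the challenge would fail in exactly this way.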

 

 

Error 2.  From Livy notebook yesterday (aka "e01"): LD_PRELOAD:

 

2025-01-06 16:24:37,490 WARN YarnAllocator [Reporter]: Container from a bad node: container_1736180637918_0001_01_000003 on host: vm-95921137. Exit status: 1. Diagnostics: t be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/opt/gluten/dep/libjemalloc.so.2' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/opt/gluten/dep/libjemalloc.so.2' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object '/opt/gluten/dep/libjemalloc.so.2' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
Error files: stderr, stderr-active.
Last 4096 bytes of stderr :
11842@vm-95921137

 

... no idea what any of this means. It looks scary and is repeated throughout the stderr of the driver. It is logged at "warn" severity and says the problem is "ignored".

 

 

Error 3:  From Livy notebook yesterday (aka "e01"): successfully created connection, despite exception java.lang.reflect.UndeclaredThrowableException

 

2025-01-06 16:24:36,743 INFO TransportClientFactory [netty-rpc-connection-0]: Successfully created connection to vm-beb30181/10.0.160.9:45461 after 3 ms (0 ms spent in bootstraps)
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1923)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:471)
at org.apache.spark.executor.YarnCoarseGrainedExecutorBackend$.main(YarnCoarseGrainedExecutorBackend.scala:83)
at org.apache.spark.executor.YarnCoarseGrainedExecutorBackend.main(YarnCoarseGrainedExecutorBackend.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.shuffle.sort.ColumnarShuffleManager
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:398)
at org.apache.spark.util.SparkClassUtils.classForName(SparkClassUtils.scala:41)
at org.apache.spark.util.SparkClassUtils.classForName$(SparkClassUtils.scala:36)
at org.apache.spark.util.Utils$.classForName(Utils.scala:94)
at org.apache.spark.util.Utils$.instantiateSerializerOrShuffleManager(Utils.scala:2557)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:326)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:215)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$9(CoarseGrainedExecutorBackend.scala:520)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:62)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:61)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1907)
... 4 more

 

 

 

 

 

Error 4:  From Livy notebook yesterday (aka "e01") : ExecutorMonitor threw an exception

 

 

2025-01-06 16:24:59,020 ERROR AsyncEventQueue [spark-listener-group-executorManagement]: Listener ExecutorMonitor threw an exception
java.lang.NullPointerException
at org.apache.spark.scheduler.dynalloc.ExecutorMonitor.getRemovedExecutor(ExecutorMonitor.scala:466)
at org.apache.spark.scheduler.dynalloc.ExecutorMonitor.onExecutorRemoved(ExecutorMonitor.scala:483)
at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:65)
at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:120)
at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:104)
at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:127)
at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:127)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:121)
at org.apache.spark.scheduler.AsyncEventQueue$$anon$3.$anonfun$run$4(AsyncEventQueue.scala:117)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1356)
at org.apache.spark.scheduler.AsyncEventQueue$$anon$3.run(AsyncEventQueue.scala:117)

 

... Sorry for the assorted errors.  I will monitor to see which are most common and try to focus on them first.  As it is now, I'm just trying to keep up with the pace of these unfamiliar issues.  They are coming at us pretty fast!
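
To see which of these are most common, one option is to tally known signatures across the downloaded driver and executor logs. This is a rough sketch: the signature names are hypothetical labels, and the patterns are copied from the four errors above.

```python
import re
from collections import Counter

# Signatures taken from the four errors in this post; extend as new ones appear.
SIGNATURES = {
    "pkce_mismatch": re.compile(r"AADSTS501481"),
    "ld_preload_jemalloc": re.compile(
        r"libjemalloc\.so\.2' from LD_PRELOAD cannot be preloaded"
    ),
    "columnar_shuffle_missing": re.compile(
        r"ClassNotFoundException: org\.apache\.spark\.shuffle\.sort\.ColumnarShuffleManager"
    ),
    "executor_monitor_npe": re.compile(r"Listener ExecutorMonitor threw an exception"),
}


def tally_signatures(lines):
    """Count how many log lines match each known error signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


sample = [
    "ERROR: ld.so: object '/opt/gluten/dep/libjemalloc.so.2' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.",
    "ERROR: ld.so: object '/opt/gluten/dep/libjemalloc.so.2' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.",
    "Caused by: java.lang.ClassNotFoundException: org.apache.spark.shuffle.sort.ColumnarShuffleManager",
    "2025-01-06 16:24:59,020 ERROR AsyncEventQueue [spark-listener-group-executorManagement]: Listener ExecutorMonitor threw an exception",
]

# Print the most frequent signatures first.
for name, count in tally_signatures(sample).most_common():
    print(f"{count:4d}  {name}")
```

The same function accepts an open file handle, so a saved stderr file can be passed in directly instead of the sample list.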

 

Please let me know if any of these are familiar.

 

 

 

1 ACCEPTED SOLUTION
govindarajan_d
Super User

Hi @dbeavon3,

 

From Error-2, I see you are using native engine on spark. Did you try turning it off and running the same notebook? (https://learn.microsoft.com/en-us/fabric/data-engineering/native-execution-engine-overview?tabs=spar...

 

From all the error messages, I can offer a speculative explanation. When Spark runs, VMs are spun up with a specific configuration and run as executors. I believe a VM crashed for some reason (most likely because an unsupported query ran on the native engine), and once the VM crashed, the Spark application crashed too. Spark usually handles VM failures automatically, but I guess the native engine integration still needs improvement from Microsoft.
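
A minimal sketch of the "turn it off and rerun" test, assuming the native engine is controlled by the `spark.native.enabled` setting described in the Fabric native execution engine documentation (verify the key for your runtime). The helper name is hypothetical; it only uses the standard `spark.conf.get` and `spark.conf.set` calls.

```python
# Assumed key; check the Fabric documentation for your runtime version.
NATIVE_ENGINE_KEY = "spark.native.enabled"


def set_native_engine(spark, enabled: bool) -> str:
    """Toggle the native engine for the current session and return the previous value."""
    previous = spark.conf.get(NATIVE_ENGINE_KEY, "unset")
    spark.conf.set(NATIVE_ENGINE_KEY, "true" if enabled else "false")
    return previous


# In a Fabric notebook the `spark` session already exists:
# previous = set_native_engine(spark, enabled=False)
# ... rerun the failing cell ...
# set_native_engine(spark, enabled=(previous == "true"))
```

Errors 2 and 3 appear while executors are starting, so a change made after the session is up may not affect them. Setting the same key to "false" in a `%%configure` cell at the start of the session is probably the more telling test.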


4 REPLIES
v-ssriganesh
Community Support

Hi,

Thank you for reaching out to the MS Fabric community forum.

I understand that you are encountering unfamiliar errors. Let's go through each of the errors you've mentioned:

Error 1: invalid_grant: Error(s): 501481
This occurs when opening logs from the Spark UI. It means that the code verifier and the code challenge in the authorization request do not match. Ensure that these values are correctly configured and try regenerating them.
Error 2: LD_PRELOAD warnings
These warnings say that a shared object file (/opt/gluten/dep/libjemalloc.so.2) cannot be preloaded. This is a known issue and can be ignored. To avoid the warnings, set the LD_PRELOAD settings in your Spark configuration.
Error 3: java.lang.reflect.UndeclaredThrowableException
This error is encountered when trying to create a connection in the Livy notebook because of a ClassNotFoundException for org.apache.spark.shuffle.sort.ColumnarShuffleManager. Make sure all the necessary dependencies are in your Spark environment and add the missing library or jar file.
Error 4: ExecutorMonitor threw an exception
The null pointer exception in the ExecutorMonitor is an issue with the dynamic allocation of executors. Remove dynamic allocation or upgrade your Spark version.
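
Before changing anything, it may help to record what the session is actually configured with. The sketch below reads a few settings related to errors 2 to 4. `spark.shuffle.manager`, `spark.dynamicAllocation.enabled` and the `spark.executorEnv.*` prefix are standard Spark configuration; whether Fabric sets LD_PRELOAD through `spark.executorEnv`, and the `spark.native.enabled` key, are assumptions.

```python
# Keys related to errors 2-4. "spark.native.enabled" is assumed from the
# Fabric native execution engine documentation.
KEYS_OF_INTEREST = [
    "spark.shuffle.manager",            # error 3: which shuffle manager class is requested
    "spark.dynamicAllocation.enabled",  # error 4: ExecutorMonitor belongs to dynamic allocation
    "spark.executorEnv.LD_PRELOAD",     # error 2: only populated if set through Spark config
    "spark.native.enabled",
]


def snapshot_conf(spark, keys=KEYS_OF_INTEREST):
    """Return {key: value} for the given keys, using '<not set>' when a key is absent."""
    return {key: spark.conf.get(key, "<not set>") for key in keys}


# In a notebook, where `spark` is the active SparkSession:
# for key, value in snapshot_conf(spark).items():
#     print(f"{key} = {value}")
```

Saving this output alongside the failing logs gives a support ticket something concrete to compare against a healthy session.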

I hope this helps resolve the issues you're experiencing. Should the problems continue, please consider raising a Microsoft support ticket for further assistance. Here is the link: https://learn.microsoft.com/en-us/power-bi/support/create-support-ticket

If this helps, please accept it as a solution and drop a "Kudos" so other members can find it more easily.
Thanks.

Hi @govindarajan_d 

I'm new to the Fabric implementation of Spark and was worried about all these proprietary components and error messages.

 

I've only been using it for one week, and there was only one day when these unfamiliar errors appeared. However, it was my very first day with Spark in Fabric, which is what made me concerned. Since then, we have not seen them repeated. Still, I wanted to ask for tips in preparation for the next time we see these errors; otherwise I will be no better off than I was on day one.

 

I have a speculative explanation as well.  The weird/unusual thing about Spark in Fabric is the integration with Entra ID user credentials.  In the other Spark environments which I've used, the cluster was always running as a system-level account (OS account or service principal).  However I think that the Spark notebooks in Fabric are constantly referring to Entra ID in order to validate the PBI user's prior credentials, or retrieve new credentials.  This introduces code that may be (1) a weak link, and (2) very proprietary and very different than what is found in the OSS spark implementation. 

... this theory would also explain the presence of these strange error messages, which can't be found on Google. I may be the first person ever to post these error messages on the Internet!

 

 

 

Hi @dbeavon3,

 

You are right. Microsoft has implemented proprietary code on top of OSS Spark and that makes it a source for error messages that are uncommon for people who worked with OSS Spark. Microsoft has to add more informative error messages! 
