Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more

Reply
Element115
Super User
Super User

TIP::COPY DATA::SOURCE::TSQL QUERY::LIMITATION

SUMMARY:

The T-SQL window function ROW_NUMBER() OVER (ORDER BY T0.[DATE] ASC, T0.[TIME] ASC) AS ID does NOT work as expected when used in a T-SQL script in the pipeline activity Copy data->Source tab query field.  By 'not as expected,' I mean not as on a classic on-prem SQL Server.  

 

DETAILS:

Running ROW_NUMBER() OVER on an on-prem SQL Server you get a monotonically increasing sequence of numbers that increases independent of date or time of day.  

 

When doing the same against a table in a lakehouse, ie in the FROM clause where you reference a lakehouse table, and if this table has a date field, the row number count resets at midnight when the calendar goes from one day to the next.  

 

This results in duplicate IDs or row numbers in your lakehouse table.  Therefore depending on your use case, it could be an issue if you were using ROW_NUMBER() OVER expecting to get unique IDs.

 

CONTEXT:

Creating fact tables after ingesting raw data with a pipeline. The fact tables are created using T-SQL code that also performs some sort of data aggregation and therefore the original IDs from the raw data tables cannot be re-used, that is to say, if IDs are needed in the fact tables, then the T-SQL script creating these fact tables also needs to created the new ID sequence for each newly aggregated row.

 

ALTERNATIVE:

You could do this, up to a point, with Power Query M inside a Dataflow  Gen 2. But the CU cost is higher.

 

Please feel free to correct me or add to this if I got something wrong or forgot anything.  Hope this helps some greenhorns.

2 ACCEPTED SOLUTIONS

Hello @Element115 ,

Thank you for your update and for clarifying your progress.


You’re absolutely right in this case, PARTITION BY NULL isn’t strictly necessary. I included it to intentionally ensure ROW_NUMBER() treats the whole dataset as one partition, keeping the ID sequence continuous. It’s something we often do as a best practice to make the partitioning behavior clear.

That said, leaving it out gives you the same result since the query defaults to one partition anyway. So, feel free to skip it, it won’t affect how the query runs.

 

If this post helps, then please give us Kudos and consider Accept it as a solution to help the other members find it more quickly.

 

Thank you

View solution in original post

Hello @Element115 ,

 

The ROW_NUMBER() function generates unique row numbers, and PARTITION BY groups the data into sections where numbering restarts. To partition by year and month, you should write PARTITION BY YEAR(vis.[DATE]), MONTH(vis.[DATE]) . This creates a new row number sequence for each year-month combination. The ORDER BY vis.[DATE] DESC ensures the latest dates get the first numbers

 

If this post helps, then please give us Kudos and consider Accept it as a solution to help the other members find it more quickly.

 

Thank you.

 

View solution in original post

11 REPLIES 11
v-tsaipranay
Community Support
Community Support

Hi @Element115 ,

 

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will be helpful for other community members who have similar problems to solve it faster.

 

Thank you.

v-tsaipranay
Community Support
Community Support

Hi @Element115 ,

Thank you for reaching out to the Microsoft Fabric.

 

As @ObungiNiels  suggestion to combine DATE and TIME using CONCAT() is a good approach, as it ensures ROW_NUMBER() sorts over a single value, reducing the chance of resets. However, for better reliability and performance in Microsoft Fabric’s lakehouse, we recommend an enhancement: instead of relying on string concatenation, convert the date and time into a proper DATETIME2 type. Here’s the refined query:

 

SELECT 

    CAST(T0.[DATE] + ' ' + T0.[TIME] AS DATETIME2) AS FullDateTime, 

    ROW_NUMBER() OVER (ORDER BY FullDateTime) AS ID 

FROM T0 

 

This approach prevents the row number from resetting at midnight by ensuring the engine processes a continuously increasing datetime value — keeping IDs unique across all rows. This method is more efficient and avoids potential string formatting issues, making it a more reliable solution for generating unique IDs in fact tables.

 

If this post helps, then please give us Kudos and consider Accept it as a solution to help the other members find it more quickly.

 

Thank you

@v-tsaipranay  Unfortunately, your SQL statement does not work.  You can't reference another alias from within the same SELECT clause.  The SELECT clause needs to execute and complete firts, then the aliases become available. I just ran it and it fails with a syntax error.

 

But I get the point.  You could use it within a CTE, and then refer to the alias in the ROW_NUMBER() outside the CTE.

 

A nd you need to convert the types DATE and TIME to varchar explicitly as the '+' operator is not overloaded.

Screenshot 2025-03-21 094218.jpg

Hi @Element115 Just as you mentioned you can use the cast function and all the required columns inside a cte which will act as a composite key which then you can select and user ROW_NUMBER and order by the key I think this could work 

Hi @Element115 ,

Please consider the following approache which will help you to resolve the issue.


Try this query :

CONCAT(CAST(GETDATE() AS datetime2(6)),ROW_NUMBER() OVER (PARTITION BY NULL ORDER BY CreatedDate DESC)) ID

 

If this post helps, then please give us Kudos and consider Accept it as a solution to help the other members find it more quickly.

 

Thank you

@v-tsaipranay That's an interesting T-SQL trick!  However, not sure why you use 'PARTITION BY NULL' since without it, the statement seems to work just the same.  Not being a T-SQL god, I need to ask:  Is there some esoteric reason why 'PARTITION BY NULL' would be necessary here?  

 

Here is a revised version of your SQL because as I said in my previous post, having to work with strings, ie varchars, is less efficient than integers, therefore one needs to add a last step to extract the row number from the string and convert it to a BIGINT type since the purpose of the statement is to generate a unique ID.   

 

However, I hope everyone will appreciate that this is a monstrous kludge.  Just look at the length of this SELECT statement and how we repeat the CONCAT(). I guess one could then argue we should put all this code (the original script is a lot longer as the whole starting point is a CTE) in a stored procedure and declare a variable, then specify it in the Copy data->Source tab.  However it is not clear whether the result set will be properly handled by the Copy data process and passed correctly to the data sink specified in the Destination tab. 

 

 

SELECT 
	CONVERT(
		BIGINT,
		RIGHT(
			CONCAT(
				CAST(GETDATE() AS DATETIME2(0)), 
				':',
				ROW_NUMBER() OVER (ORDER BY vis.[DATE] DESC)
			),
			CHARINDEX(
				':', 
				REVERSE(
					CONCAT(
						CAST(GETDATE() AS DATETIME2(0)), 
						':',
						ROW_NUMBER() OVER (PARTITION BY NULL ORDER BY vis.[DATE] DESC)
					)
				)
			) - 1
		)
	)	AS ID

 

 

Hello @Element115 ,

Thank you for your update and for clarifying your progress.


You’re absolutely right in this case, PARTITION BY NULL isn’t strictly necessary. I included it to intentionally ensure ROW_NUMBER() treats the whole dataset as one partition, keeping the ID sequence continuous. It’s something we often do as a best practice to make the partitioning behavior clear.

That said, leaving it out gives you the same result since the query defaults to one partition anyway. So, feel free to skip it, it won’t affect how the query runs.

 

If this post helps, then please give us Kudos and consider Accept it as a solution to help the other members find it more quickly.

 

Thank you

@v-tsaipranay Thanks. I do have a follow-up question though.  What happens if the lakehouse data sink is partitioned by YEAR, or by YEAR and MONTH?  

 

Do we just change the window function call like so?

 

ROW_NUMBER() OVER (PARTITION BY YEAR(VIS.[DATE]) AND MONTH(vis.[date]) ORDER BY vis.[DATE] DESC)

Hello @Element115 ,

 

The ROW_NUMBER() function generates unique row numbers, and PARTITION BY groups the data into sections where numbering restarts. To partition by year and month, you should write PARTITION BY YEAR(vis.[DATE]), MONTH(vis.[DATE]) . This creates a new row number sequence for each year-month combination. The ORDER BY vis.[DATE] DESC ensures the latest dates get the first numbers

 

If this post helps, then please give us Kudos and consider Accept it as a solution to help the other members find it more quickly.

 

Thank you.

 

ObungiNiels
Resolver I
Resolver I

Hi @Element115 ,

did you try the following already? 

Create a new column concatenating the date and time information, creating unique timestamp value to order over: 

 

CONCAT(T0.[DATE] , ' ', T0.[TIME]) AS DATETIME_VAL,
ROW_NUMBER() OVER (ORDER BY DATETIME_VAL ASC) AS ID

Let me know if that helped. 

Kind regards,

Niels 

 

@v-tsaipranay  Thanks but I don't think we need or should lose datetime info and replace it by a string or VARCHAR.  Further, there is another issue related to VARCHAR (I already opened a support ticket for this as well).  For some reason, depending on some Copy data settings config, the system automatically converts VARCHAR type to NVARCHAR(MAX) and this causes Fabric to complain that NVARCHAR(MAX) is not a type compatible with the underlying version of SQL used causing the pipeline to terminate with an error.

 

Here is the statement I used:

 

 

ROW_NUMBER() OVER (ORDER BY vis.[DATE] ASC, vis.[M15] ASC) AS ID

 

 

Like I said, it works as it should on a SQL Server on-prem.  So this is an issue once the T-SQL is executed from a Copy data activity against a lakehouse table and therefore we should not have to convert types DATE, TIME, or DATETIME2 to type VARCHAR, which later on when sorting or performing datetime intelligence ops causes other issues, at which point you to use M and reconvert back to the original type.  A non-starter in my book.

 

 

Helpful resources

Announcements
MarchFBCvideo - carousel

Fabric Monthly Update - March 2025

Check out the March 2025 Fabric update to learn about new features.

March2025 Carousel

Fabric Community Update - March 2025

Find out what's new and trending in the Fabric community.

"); $(".slidesjs-pagination" ).prependTo(".pagination_sec"); $(".slidesjs-pagination" ).append("
"); $(".slidesjs-play.slidesjs-navigation").appendTo(".playpause_sec"); $(".slidesjs-stop.slidesjs-navigation").appendTo(".playpause_sec"); $(".slidesjs-pagination" ).append(""); $(".slidesjs-pagination" ).append(""); } catch(e){ } /* End: This code is added by iTalent as part of iTrack COMPL-455 */ $(".slidesjs-previous.slidesjs-navigation").attr('tabindex', '0'); $(".slidesjs-next.slidesjs-navigation").attr('tabindex', '0'); /* start: This code is added by iTalent as part of iTrack 1859082 */ $('.slidesjs-play.slidesjs-navigation').attr('id','playtitle'); $('.slidesjs-stop.slidesjs-navigation').attr('id','stoptitle'); $('.slidesjs-play.slidesjs-navigation').attr('role','tab'); $('.slidesjs-stop.slidesjs-navigation').attr('role','tab'); $('.slidesjs-play.slidesjs-navigation').attr('aria-describedby','tip1'); $('.slidesjs-stop.slidesjs-navigation').attr('aria-describedby','tip2'); /* End: This code is added by iTalent as part of iTrack 1859082 */ }); $(document).ready(function() { if($("#slides .item").length < 2 ) { /* Fixing Single Slide click issue (commented following code)*/ // $(".item").css("left","0px"); $(".item.slidesjs-slide").attr('style', 'left:0px !important'); $(".slidesjs-stop.slidesjs-navigation").trigger('click'); $(".slidesjs-previous").css("display", "none"); $(".slidesjs-next").css("display", "none"); } var items_length = $(".item.slidesjs-slide").length; $(".slidesjs-pagination-item > button").attr("aria-setsize",items_length); $(".slidesjs-next, .slidesjs-pagination-item button").attr("tabindex","-1"); $(".slidesjs-pagination-item button").attr("role", "tab"); $(".slidesjs-previous").attr("tabindex","-1"); $(".slidesjs-next").attr("aria-hidden","true"); $(".slidesjs-previous").attr("aria-hidden","true"); $(".slidesjs-next").attr("aria-label","Next"); $(".slidesjs-previous").attr("aria-label","Previous"); //$(".slidesjs-stop.slidesjs-navigation").attr("role","button"); //$(".slidesjs-play.slidesjs-navigation").attr("role","button"); $(".slidesjs-pagination").attr("role","tablist").attr("aria-busy","true"); $("li.slidesjs-pagination-item").attr("role","list"); $(".item.slidesjs-slide").attr("tabindex","-1"); $(".item.slidesjs-slide").attr("aria-label","item"); /*$(".slidesjs-stop.slidesjs-navigation").on('click', function() { var itemNumber = parseInt($('.slidesjs-pagination-item > a.active').attr('data-slidesjs-item')); $($('.item.slidesjs-slide')[itemNumber]).find('.c-call-to-action').attr('tabindex', '0'); });*/ $(".slidesjs-stop.slidesjs-navigation, .slidesjs-pagination-item > button").on('click keydown', function() { $.each($('.item.slidesjs-slide'),function(i,el){ $(el).find('.c-call-to-action').attr('tabindex', '-1'); }); var itemNumber = parseInt($('.slidesjs-pagination-item > button.active').attr('data-slidesjs-item')); $($('.item.slidesjs-slide')[itemNumber]).find('.c-call-to-action').attr('tabindex', '0'); }); $(".slidesjs-play.slidesjs-navigation").on('click', function() { $.each($('.item.slidesjs-slide'),function(i,el){ $(el).find('.c-call-to-action').attr('tabindex', '-1'); }); }); $(".slidesjs-pagination-item button").keyup(function(e){ var keyCode = e.keyCode || e.which; if (keyCode == 9) { e.preventDefault(); $(".slidesjs-stop.slidesjs-navigation").trigger('click').blur(); $("button.active").focus(); } }); $(".slidesjs-play").on("click",function (event) { if (event.handleObj.type === "click") { $(".slidesjs-stop").focus(); } else if(event.handleObj.type === "keydown"){ if (event.which === 13 && $(event.target).hasClass("slidesjs-play")) { $(".slidesjs-stop").focus(); } } }); $(".slidesjs-stop").on("click",function (event) { if (event.handleObj.type === "click") { $(".slidesjs-play").focus(); } else if(event.handleObj.type === "keydown"){ if (event.which === 13 && $(event.target).hasClass("slidesjs-stop")) { $(".slidesjs-play").focus(); } } }); $(".slidesjs-pagination-item").keydown(function(e){ switch (e.which){ case 37: //left arrow key $(".slidesjs-previous.slidesjs-navigation").trigger('click'); e.preventDefault(); break; case 39: //right arrow key $(".slidesjs-next.slidesjs-navigation").trigger('click'); e.preventDefault(); break; default: return; } $(".slidesjs-pagination-item button.active").focus(); }); }); // Start This code is added by iTalent as part of iTrack 1859082 $(document).ready(function(){ $("#tip1").attr("aria-hidden","true").addClass("hidden"); $("#tip2").attr("aria-hidden","true").addClass("hidden"); $(".slidesjs-stop.slidesjs-navigation, .slidesjs-play.slidesjs-navigation").attr('title', ''); $("a#playtitle").focus(function(){ $("#tip1").attr("aria-hidden","false").removeClass("hidden"); }); $("a#playtitle").mouseover(function(){ $("#tip1").attr("aria-hidden","false").removeClass("hidden"); }); $("a#playtitle").blur(function(){ $("#tip1").attr("aria-hidden","true").addClass("hidden"); }); $("a#playtitle").mouseleave(function(){ $("#tip1").attr("aria-hidden","true").addClass("hidden"); }); $("a#play").keydown(function(ev){ if (ev.which ==27) { $("#tip1").attr("aria-hidden","true").addClass("hidden"); ev.preventDefault(); return false; } }); $("a#stoptitle").focus(function(){ $("#tip2").attr("aria-hidden","false").removeClass("hidden"); }); $("a#stoptitle").mouseover(function(){ $("#tip2").attr("aria-hidden","false").removeClass("hidden"); }); $("a#stoptitle").blur(function(){ $("#tip2").attr("aria-hidden","true").addClass("hidden"); }); $("a#stoptitle").mouseleave(function(){ $("#tip2").attr("aria-hidden","true").addClass("hidden"); }); $("a#stoptitle").keydown(function(ev){ if (ev.which ==27) { $("#tip2").attr("aria-hidden","true").addClass("hidden"); ev.preventDefault(); return false; } }); }); // End This code is added by iTalent as part of iTrack 1859082