SamsonYoung20
New Member

Would I use a Dataflow for this use case, or is it easier to use a notebook for transformations?

Hello,

 

I am very new to Azure and MS services so I hope this is the right place to ask.

 

I have CSV files for 5 different tables. A new file is generated for each table every day (5 new CSVs daily) and placed in our Azure Blob storage account.

I want to create a pipeline that automates the ETL process, so that the new CSV files are processed each day and then appended to the tables I have created in my lakehouse.

 

If I create a dataflow, do I need to select Azure Blobs or Azure Data Lake Storage? If I connect to the storage account via either method, I am given just a single query with a table that lists all the contents of the container (all CSVs, all the subfolders each CSV is located in, etc.).

Suppose I open one of the CSV files, apply transformations to it, and set the destination to append to the table I have already created. How does Dataflow know to pick up all the files from that folder and process them using the same transformations I have already applied? Also, how does it know which datasets to apply the transformations to, based on which set of data I am transforming?

 

The CSV files always have the same column headers; only the rows of data differ.

I cannot find any YouTube videos that show how to automate an ETL process with dataflows and pipelines when a different CSV file has to be processed each day for multiple tables.

 

Is this possible with dataflows or do I just need to use a notebook instead? 

 

Thank You,

Paul S

 

 

1 ACCEPTED SOLUTION
v-tsaipranay
Community Support

Hi @SamsonYoung20 ,

Thank you for posting in the Microsoft Fabric Community.

 

For your use case, automating the daily ingestion and transformation of CSV files into your Fabric Lakehouse, both Dataflows and Notebooks can work; the choice depends on your requirements.

If your transformations are simple (such as renaming columns, filtering, or basic data cleaning), a Dataflow is the easier, no-code solution. You can connect directly to Azure Blob Storage or ADLS, select a folder rather than a single file (so that new files landing in it are included automatically), apply transformations using Power Query, and append the data to your Lakehouse table. Then use a Data Pipeline to schedule and automate the process daily.
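
To illustrate the "select a folder" idea for five tables: the container listing you see in the single query can be split into one group of files per table, and each group then gets its own transformations and destination. In a Dataflow this would be done in Power Query by filtering the file listing, not in Python; the sketch below only shows the logic. The folder layout (`<table>/<date>.csv`) and the table names are assumptions for illustration, not something taken from your storage account.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_table(paths):
    """Group CSV paths by their parent folder name (assumed to be the table name)."""
    groups = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        if path.suffix.lower() != ".csv":
            continue  # ignore anything in the container that is not a CSV
        groups[path.parent.name].append(p)
    # Sort each group so the files are processed in a stable order.
    return {table: sorted(files) for table, files in groups.items()}


# Hypothetical container listing, similar to what the single query shows.
listing = [
    "orders/2025-03-02.csv",
    "orders/2025-03-01.csv",
    "customers/2025-03-01.csv",
    "readme.txt",
]
print(group_by_table(listing))
```

Each key in the result corresponds to one query (and one destination table) in the dataflow, so five tables would mean five filtered queries over the same listing.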

 

However, if you need more control, such as dynamically selecting only the latest files, handling schema changes, or performing complex transformations, a Notebook (PySpark/Pandas) is the better choice. A Notebook lets you programmatically read only the new files, apply advanced transformations, and write the data to the Lakehouse in Delta format. While Notebooks require coding, they provide greater flexibility and scale better to large datasets. If your needs are straightforward, Dataflows should work well; otherwise, Notebooks give you more power and control over the process.
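
As a rough idea of what the Notebook route could look like, here is a minimal PySpark sketch that reads one day's CSV for a table and appends it to a Delta table. It assumes the notebook's built-in `spark` session is passed in, that the files follow a `<base>/<table>/<YYYY-MM-DD>.csv` layout, and that the Lakehouse table has the same name as the folder; all of these names are hypothetical and would need to match your environment.

```python
from datetime import date


def daily_csv_path(base_path: str, table: str, day: date) -> str:
    """Build the path of one table's CSV for one day (assumed layout)."""
    return f"{base_path}/{table}/{day.isoformat()}.csv"


def append_daily_file(spark, base_path: str, table: str, day: date) -> str:
    """Read one day's CSV and append its rows to the Delta table named `table`."""
    path = daily_csv_path(base_path, table, day)
    df = (
        spark.read
        .option("header", True)       # the files always carry column headers
        .option("inferSchema", True)  # or pass an explicit schema instead
        .csv(path)
    )
    # Table-specific transformations would go here.
    df.write.format("delta").mode("append").saveAsTable(table)
    return path


# Hypothetical usage inside a Fabric notebook, where `spark` already exists:
# for table in ["orders", "customers"]:
#     append_daily_file(spark, "Files/daily", table, date.today())
```

A Data Pipeline can then run the notebook on a daily schedule, in the same way it would schedule a dataflow refresh.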

 

To decide between these approaches, you can refer to official Microsoft documentation. The Microsoft Fabric Decision Guide provides insights into when to use Dataflows, Pipelines, or Notebooks. If you're specifically looking for guidance on transforming data with Dataflows and Pipelines, you may find the Move and Transform Data with Dataflows guide helpful. 

 

I hope this resolves your issue. If you need any further assistance, feel free to reach out.

If this post helps, please give us Kudos and consider accepting it as a solution to help other members find it more quickly.

 

Thank you.


5 REPLIES
v-tsaipranay
Community Support

Hi @SamsonYoung20 ,

 

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems find the answer faster.

 

Thank you.

miguel
Community Admin

What logic defines what a new file is? Both Notebooks and Dataflows can work, but it all relies on how you define what a "new file" is and how that logic can be translated inside of a Notebook or a Dataflow.
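
To make that concrete, here are two common ways a "new file" could be defined, sketched in plain Python. Both are assumptions for illustration: the first assumes the file name contains the run date, the second assumes you keep a record of the paths that were already loaded (for example in a small control table). Neither comes from the original question.

```python
from datetime import date


def new_by_name(paths, day):
    """Definition 1: a file is new if its file name contains the given date."""
    stamp = day.isoformat()
    return [p for p in paths if stamp in p.rsplit("/", 1)[-1]]


def new_by_log(paths, processed):
    """Definition 2: a file is new if it has not been loaded before."""
    return sorted(set(paths) - set(processed))


# Hypothetical listing and processed-file record.
paths = ["orders/2025-03-01.csv", "orders/2025-03-02.csv"]
print(new_by_name(paths, date(2025, 3, 2)))
print(new_by_log(paths, {"orders/2025-03-01.csv"}))
```

The first definition maps naturally onto a filter step in a Dataflow; the second is usually easier to maintain in a Notebook, because the record of processed files has to be read and updated on every run.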

v-tsaipranay
Community Support

Hi @SamsonYoung20 ,

I wanted to check whether you had the opportunity to review the information provided. Thank you, @miguel, for your input as well. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give it a 'Kudos' so other members can easily find it.


Thank you.


 
