ISSUE:
In a Fabric pipeline, after a Copy Data activity creates a new table for the first time or writes to an existing lakehouse table, pipeline execution moves on to the next activity before the lakehouse has finished updating and exposing the new data to downstream activities. When those activities try to read from the lakehouse table, they fail with an error. If this happens right after the table was first created, the pipeline does not yet see the table as existing, because execution moves forward faster than the lakehouse can register the new table or new data and report it to the pipeline.
POSSIBLE SOLUTIONS:
1__Add an option in the Settings tab that, when checked, makes the Copy Data activity wait for a response from the lakehouse. Once the lakehouse responds that all the data or the table has been committed, the wait flag is cleared, the Copy Data activity completes its execution, and pipeline execution resumes with the next activity.
OR
2__Provide a new activity called "Wait for I/O commit", which would pause pipeline execution and wait for the lakehouse specified in its settings to send a message that the data is now available for read operations. Upon receipt of that message, this activity would allow pipeline execution to resume.
Two simple, elegant, and user-friendly ways to get rid of this synchronization headache once and for all.
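
To make the requested behavior concrete, here is a rough sketch of the semantics a "Wait for I/O commit" step could have, written as a plain polling loop with a timeout. This is only an illustration, not an existing Fabric feature or API: `wait_for_commit` and the `is_readable` probe are hypothetical names, and the probe stands in for whatever check would confirm that the lakehouse table exists and its new data can be read.

```python
import time
from typing import Callable


def wait_for_commit(
    is_readable: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until is_readable() returns True and return the number of probes made.

    is_readable is a hypothetical caller-supplied check (not a Fabric API) that
    reports whether the lakehouse table is visible and readable.
    Raises TimeoutError if the table is still not readable after timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if is_readable():
            # The commit is visible: let the pipeline continue.
            return attempts
        if time.monotonic() >= deadline:
            raise TimeoutError(f"table not readable after {timeout_s} seconds")
        # Not visible yet: pause before probing again.
        sleep(interval_s)
```

The timeout is the important design choice here: if the lakehouse never reports the commit, the step should fail with a clear error instead of hanging the pipeline indefinitely.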