To handle late-arriving data in GCP Dataflow streaming pipelines, configure windowing with allowed lateness and triggers, which keeps state active beyond the window closing time to correctly bucket late messages and emit revised totals; this approach is superior to Pub/Sub message retention (which only manages queue duration) or manual sleep delays (which artificially lag the entire pipeline).
Deep Dive
GCP Data Engineer Question 12
GCP Data Engineer professional certification, question 12 of 30. Your Dataflow streaming pipeline reads from Pub/Sub and computes the total revenue per minute. Some messages arrive late, up to 5 minutes after their event time.
This late data is causing incorrect totals in your output. Which Dataflow feature would you use to fix this? Let us try to understand this question a bit more. The pipeline uses windowing to group events into per-minute buckets.
The issue is that some events arrive with up to a 5-minute delay, by which point their window may have closed and its output may already have been produced. The good news is that we know the maximum delay: 5 minutes. That means we can configure a precise tolerance for it. Here are the four choices given in the exam. Increase the Pub/Sub message retention period.
That's option number one. Option number two is to use Dataflow windowing with allowed lateness and triggers. Option number three is to switch to batch processing to avoid late data. Option number four is to add a 5-minute sleep delay before processing. Let's go through the options one by one.
Pub/Sub message retention controls how long messages wait in the queue before they expire. Once a message has entered the Dataflow pipeline, the retention setting no longer applies to it. This does not touch the late data problem, so we are going to rule this out.
Windowing with allowed lateness and triggers: this is the Dataflow feature built for this kind of problem. Let's keep it as a good candidate and go through the other two options for now.
Switching to batch eliminates the streaming pipeline entirely. This is not a fix. It does not solve the problem; it moves to a completely different solution. Let's rule this option out.
The fourth option, a 5-minute sleep delay, would artificially hold every single message before processing, not just the late ones. Here we would be assuming that every message in the pipeline can afford to be delayed, so this is not solving the problem either. This option is out.
So we are left with one option: the windowing option.
Let's read that again. Windowing with allowed lateness and triggers is correct. Configure a 5-minute allowed lateness on your window. Late messages that arrive within that allowance get added to the right time bucket and trigger an updated result.
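The allowed-lateness behavior described above can be illustrated with a small plain-Python model. This is only a sketch under stated assumptions, not the Dataflow or Apache Beam API: the class `RevenueWindows`, its methods, and the sample events are hypothetical names made up for illustration, and the watermark (the pipeline's estimate of how far event time has progressed) is advanced by hand rather than derived from the source.

```python
from collections import defaultdict

WINDOW_SECONDS = 60                 # per-minute revenue windows
ALLOWED_LATENESS_SECONDS = 5 * 60   # tolerate data up to 5 minutes late


def window_start(event_time: int) -> int:
    """Start of the fixed one-minute window that contains event_time."""
    return event_time - event_time % WINDOW_SECONDS


class RevenueWindows:
    """Toy model of fixed windows with allowed lateness (hypothetical, not Beam)."""

    def __init__(self) -> None:
        self.totals = defaultdict(float)  # window start -> running total
        self.fired = set()                # windows whose on-time result was emitted
        self.emitted = []                 # (window start, total, "on_time" or "late")
        self.watermark = 0

    def advance_watermark(self, watermark: int) -> None:
        """Move the watermark forward and emit on-time totals for windows it passed."""
        self.watermark = max(self.watermark, watermark)
        for start in sorted(self.totals):
            if start + WINDOW_SECONDS <= self.watermark and start not in self.fired:
                self.fired.add(start)
                self.emitted.append((start, self.totals[start], "on_time"))

    def add(self, event_time: int, amount: float) -> bool:
        """Add one event; return False if it is past the allowed lateness."""
        start = window_start(event_time)
        if self.watermark >= start + WINDOW_SECONDS + ALLOWED_LATENESS_SECONDS:
            return False  # too late: the event is dropped
        self.totals[start] += amount
        if start in self.fired:
            # Late firing: emit the revised, accumulated total for the same window.
            self.emitted.append((start, self.totals[start], "late"))
        return True


w = RevenueWindows()
w.add(10, 20.0)              # event time 00:00:10
w.add(45, 5.0)               # event time 00:00:45
w.advance_watermark(60)      # first minute closes, on-time total is emitted
w.add(30, 10.0)              # late event for the first minute, still within 5 minutes
w.advance_watermark(400)     # watermark is now past window end + allowed lateness
accepted = w.add(50, 1.0)    # arrives too late and is dropped
print(w.emitted)             # [(0, 25.0, 'on_time'), (0, 35.0, 'late')]
print(accepted)              # False
```

In the Apache Beam Python SDK, which Dataflow runs, the equivalent configuration is expressed on `beam.WindowInto`, for example with `FixedWindows(60)`, `allowed_lateness=300`, a trigger such as `AfterWatermark(late=AfterCount(1))`, and `accumulation_mode=AccumulationMode.ACCUMULATING` so that each late firing carries the revised total.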
Watermarks handle event-time tracking through the pipeline. This is the standard pattern for late data in streaming. So the correct option is option number two, Dataflow windowing with allowed lateness and triggers. Check out our GCP data engineering course on codecloud.com.