Study for the Google Cloud Professional Data Engineer Exam with engaging Q&A. Each question features hints and detailed explanations to enhance your understanding. Prepare confidently and ensure your success!

If you need to combine frequently changing data from Cloud SQL with large datasets in BigQuery, what is the most efficient approach?

  1. Copy the data from Cloud SQL to a new BigQuery table hourly

  2. Create a combined, normalized table hourly from Cloud SQL

  3. Use a federated query to get data from Cloud SQL

  4. Create a Dataflow pipeline to combine the data

The correct answer is: Use a federated query to get data from Cloud SQL

A federated query is the most efficient approach here because BigQuery reads the Cloud SQL data at query time, so nothing has to be copied or transformed ahead of time. The query sends a SQL statement to Cloud SQL through a connection (using the EXTERNAL_QUERY function), and the returned rows can be joined with BigQuery tables like any other table expression. Because the data is read when the query runs, results reflect the current state of the frequently changing Cloud SQL data rather than the last hourly snapshot.

This also removes the operational cost and complexity of scheduling loads and maintaining a second copy of the data. The trade-off is that a federated query is generally slower than a query over native BigQuery storage, since the external database has to execute its part first, so it fits best when freshness matters more than raw query speed. The hourly copy options leave the data up to an hour stale, and a Dataflow pipeline adds infrastructure to build and maintain. The federated query therefore gives the best balance of efficiency and up-to-date results.
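To make the idea concrete, here is a minimal Python sketch that builds a federated query joining live Cloud SQL rows to a BigQuery table. EXTERNAL_QUERY is BigQuery's function for Cloud SQL federation. The connection ID, table names, column names, and both helper functions are hypothetical placeholders chosen for illustration, not values from the question.

```python
def build_federated_join(connection_id: str, cloud_sql_query: str, bigquery_table: str) -> str:
    """Build a BigQuery statement that joins a BigQuery table to live Cloud SQL rows.

    The column names (order_id, amount, customer_id, status) are assumptions
    made for this example only.
    """
    # Conservative guard: the inner query is embedded in a triple-quoted
    # BigQuery string literal, so reject input that could end it early or
    # be altered by escape-sequence processing.
    if "'''" in cloud_sql_query or "\\" in cloud_sql_query or cloud_sql_query.endswith("'"):
        raise ValueError("inner query cannot be embedded safely in a triple-quoted literal")

    return (
        "SELECT o.order_id, o.amount, c.status\n"
        f"FROM `{bigquery_table}` AS o\n"
        "JOIN EXTERNAL_QUERY(\n"
        f"  '{connection_id}',\n"
        f"  '''{cloud_sql_query}'''\n"
        ") AS c\n"
        "ON o.customer_id = c.customer_id"
    )


def run_in_bigquery(sql: str):
    """Run the statement with the google-cloud-bigquery client.

    Assumes the package is installed, default credentials are configured,
    and the Cloud SQL connection already exists in BigQuery.
    """
    from google.cloud import bigquery  # imported lazily so the builder works without it

    client = bigquery.Client()
    return list(client.query(sql).result())


# Hypothetical identifiers, for illustration only.
example_sql = build_federated_join(
    connection_id="my-project.us.my-cloudsql-connection",
    cloud_sql_query="SELECT customer_id, status FROM customers",
    bigquery_table="my-project.sales.orders",
)
```

Passing `example_sql` to `run_in_bigquery` would execute the inner SELECT in Cloud SQL at query time and join the result to the BigQuery table, so no hourly copy job is needed. The inner statement must be written in the external database's SQL dialect (MySQL or PostgreSQL), not in BigQuery SQL.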