Exam Professional Machine Learning Engineer topic 1 question 176 discussion - ExamTopics


AI Summary

Problem

A food company needs to preprocess its historical sales data stored in BigQuery for training multiple TensorFlow models in Vertex AI's custom training service. The data preprocessing involves min-max scaling and bucketing on many features. The goal is to minimize preprocessing time, cost, and development effort.

Options

  • Use Spark with the spark-bigquery-connector and Dataproc for data preprocessing.
  • Write SQL queries to transform the data directly within BigQuery.
  • Add the transformations as a preprocessing layer within the TensorFlow models.
  • Create a Dataflow pipeline using the BigQueryIO connector to ingest the data, process it, and write it back to BigQuery.

Solution

The suggested answer is B: use SQL queries to transform the data in place within BigQuery. This approach is likely the most efficient because the transformations run in BigQuery's SQL engine, where the data already lives, so nothing has to be exported to a separate service such as Dataproc or Dataflow. The transformed data is also computed once and can then be read by every model, rather than being recomputed inside each TensorFlow model, which reduces overall cost and development effort.
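
As a rough illustration of what option B could look like, the sketch below builds a BigQuery SQL statement that min-max scales some columns and buckets others. It is a minimal sketch, not the exam's reference solution: the function name `build_preprocessing_sql`, the table names, the column names, and the bucket boundaries are all hypothetical, and column names are assumed to be trusted identifiers. It relies on BigQuery's `SAFE_DIVIDE` and `RANGE_BUCKET` functions and on `MIN`/`MAX` used as window functions over the whole table.

```python
def build_preprocessing_sql(source_table, dest_table, scale_cols, bucket_cols):
    """Return a BigQuery SQL string that min-max scales and buckets columns.

    scale_cols:  list of numeric column names to scale into [0, 1].
    bucket_cols: dict mapping a column name to its sorted bucket boundaries.
    All names here are illustrative assumptions, not taken from the question.
    """
    select_parts = ["*"]  # keep the original columns alongside the new ones

    # Min-max scaling: (x - min) / (max - min), computed over the whole table.
    # SAFE_DIVIDE returns NULL instead of failing when max equals min.
    for col in scale_cols:
        select_parts.append(
            f"SAFE_DIVIDE({col} - MIN({col}) OVER (), "
            f"MAX({col}) OVER () - MIN({col}) OVER ()) AS {col}_scaled"
        )

    # Bucketing: RANGE_BUCKET returns the index of the bucket the value falls in.
    for col, boundaries in bucket_cols.items():
        bounds = ", ".join(str(b) for b in boundaries)
        select_parts.append(f"RANGE_BUCKET({col}, [{bounds}]) AS {col}_bucket")

    select_list = ",\n  ".join(select_parts)
    return (
        f"CREATE OR REPLACE TABLE `{dest_table}` AS\n"
        f"SELECT\n  {select_list}\n"
        f"FROM `{source_table}`"
    )


if __name__ == "__main__":
    print(
        build_preprocessing_sql(
            "my_project.sales.history",        # hypothetical source table
            "my_project.sales.history_prep",   # hypothetical output table
            scale_cols=["units_sold"],
            bucket_cols={"unit_price": [5, 10, 20]},
        )
    )
```

The resulting string could then be submitted with the `google-cloud-bigquery` client, for example `bigquery.Client().query(sql).result()`, so that the preprocessed table is materialized once and read by each training job.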

You work for a food product company. Your company’s historical sales data is stored in BigQuery. You need to use Vertex AI’s custom training service to train multiple TensorFlow models that read the data from BigQuery and predict future sales. You plan to implement a data preprocessing algorithm that performs min-max scaling and bucketing on a large number of features before you start experimenting with the models. You want to minimize preprocessing time, cost, and development effort. How should you configure this workflow?

  • A. Write the transformations into Spark that uses the spark-bigquery-connector, and use Dataproc to preprocess the data.
  • B. Write SQL queries to transform the data in-place in BigQuery.
  • C. Add the transformations as a preprocessing layer in the TensorFlow models.
  • D. Create a Dataflow pipeline that uses the BigQueryIO connector to ingest the data, process it, and write it back to BigQuery.
Suggested Answer: B
