Exam Professional Data Engineer topic 1 question 286 discussion - ExamTopics


AI Summary

Problem

Migrate thousands of Apache Spark jobs from an on-premises Apache Hadoop cluster to Google Cloud, minimizing code changes and using managed services.

Options

  • A. Move data to BigQuery; convert Spark scripts to SQL.
  • B. Rewrite jobs in Apache Beam; run on Dataflow.
  • C. Copy data to Compute Engine disks; manage jobs directly on instances.
  • D. Move data to Cloud Storage; run jobs on Dataproc.

Suggested Answer

The suggested answer is D: Move data to Cloud Storage and run jobs on Dataproc.


You have thousands of Apache Spark jobs running in your on-premises Apache Hadoop cluster. You want to migrate the jobs to Google Cloud. You want to use managed services to run your jobs instead of maintaining a long-lived Hadoop cluster yourself. You have a tight timeline and want to keep code changes to a minimum. What should you do?

  • A. Move your data to BigQuery. Convert your Spark scripts to a SQL-based processing approach.
  • B. Rewrite your jobs in Apache Beam. Run your jobs in Dataflow.
  • C. Copy your data to Compute Engine disks. Manage and run your jobs directly on those instances.
  • D. Move your data to Cloud Storage. Run your jobs on Dataproc.
Suggested Answer: D
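
Option D keeps code changes small because Dataproc runs Spark natively and its clusters ship with the Cloud Storage connector, so an existing job can usually keep its logic and only swap hdfs:// input and output paths for gs:// ones. The sketch below illustrates that path rewrite in Python. The helper name hdfs_to_gcs and the bucket name example-bucket are hypothetical, and it assumes the data was copied to Cloud Storage with the same directory layout it had in HDFS.

```python
from urllib.parse import urlparse


def hdfs_to_gcs(path: str, bucket: str) -> str:
    """Map an hdfs:// URI to a gs:// URI in `bucket`, keeping the directory layout.

    Hypothetical helper: assumes the HDFS tree was copied to the bucket as-is.
    """
    parsed = urlparse(path)
    if parsed.scheme != "hdfs":
        # Leave local paths and already-migrated gs:// paths untouched.
        return path
    # Drop the NameNode host and port; only the file path carries over.
    return f"gs://{bucket}/{parsed.path.lstrip('/')}"


if __name__ == "__main__":
    # "example-bucket" is a placeholder bucket name.
    print(hdfs_to_gcs("hdfs://namenode:8020/warehouse/events", "example-bucket"))
```

With the paths rewritten, the same Spark code can be submitted to Dataproc unchanged, which is why the other options (rewriting in SQL or Apache Beam, or self-managing Compute Engine instances) fit the tight timeline and managed-service requirements less well.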
