Exam Professional Data Engineer topic 1 question 284 discussion - ExamTopics


AI Summary

Problem

A system needs to store and query time-series data from 1000 sensors, each generating one metric per second. Existing data is 1 TB, growing at 1 GB/day. Two access patterns exist: (1) retrieving a single sensor's metric at a specific timestamp, with median single-digit millisecond latency; (2) complex analytic queries, including joins, run once a day.
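For a sense of scale, the figures above can be turned into a row count and an average row size. This is a rough back-of-the-envelope sketch; it assumes "1 GB" means 10**9 bytes and that every sensor reports every second without gaps, neither of which the question states.

```python
# Figures taken from the problem statement.
SENSORS = 1000
READINGS_PER_SENSOR_PER_SECOND = 1
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: "1 GB" is read as 10**9 bytes.
DAILY_GROWTH_BYTES = 10**9

# One row per sensor per second.
rows_per_day = SENSORS * READINGS_PER_SENSOR_PER_SECOND * SECONDS_PER_DAY

# Average stored size per reading implied by the stated growth rate.
bytes_per_row = DAILY_GROWTH_BYTES / rows_per_day

print(f"rows per day:  {rows_per_day:,}")
print(f"bytes per row: {bytes_per_row:.1f}")
```

Under these assumptions the workload is about 86.4 million small rows per day, each averaging roughly 12 bytes.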

Options

  • A. BigQuery: Use concatenated sensor ID and timestamp as primary key.
  • B. Bigtable: Use concatenated sensor ID and timestamp as row key; daily export to BigQuery.
  • C. Bigtable: Use concatenated sensor ID and metric as row key; daily export to BigQuery.
  • D. BigQuery: Use metric as primary key.

Solution

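The suggested answer is B. Bigtable indexes each row by a single row key, so reading a row by its exact key is a low-latency point lookup. With the sensor ID and timestamp concatenated as the row key, the first access pattern becomes one such lookup. A daily export to BigQuery covers the second pattern, because BigQuery is built for complex analytic queries, including joins. Option C fails because a key built from sensor ID and metric cannot be looked up by timestamp. Options A and D rely on BigQuery alone, which is not designed for single-digit millisecond point lookups.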
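The row-key idea can be sketched in a few lines of Python. This is an illustration only: the key layout (`sensorNNNN#epoch-seconds`), the `make_row_key` helper, and the `SortedTable` class are hypothetical, and `SortedTable` is an in-memory stand-in for a store sorted by row key, not the Bigtable client API.

```python
from bisect import bisect_left
from datetime import datetime, timezone


def make_row_key(sensor_id, ts):
    """Build a 'sensor#timestamp' row key (hypothetical layout).

    Both parts are zero-padded to a fixed width so that lexicographic
    byte order matches (sensor, time) order.
    """
    epoch_seconds = int(ts.timestamp())
    return f"sensor{sensor_id:04d}#{epoch_seconds:010d}".encode("ascii")


class SortedTable:
    """Tiny in-memory stand-in for a store that keeps rows sorted by key."""

    def __init__(self):
        self._keys = []
        self._values = []

    def put(self, key, value):
        # Insert at the sorted position, or overwrite an existing key.
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key):
        # Point lookup: binary search for the exact key.
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None


table = SortedTable()
t0 = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
table.put(make_row_key(42, t0), 21.5)

# Access pattern 1: one sensor, one timestamp, one exact-key read.
reading = table.get(make_row_key(42, t0))
print(make_row_key(42, t0), reading)
```

Putting the sensor ID first keeps each sensor's readings contiguous and in time order. A key that starts with the timestamp would instead send all current writes to the same part of the key space.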


You have a network of 1000 sensors. The sensors generate time series data: one metric per sensor per second, along with a timestamp. You already have 1 TB of data, and expect the data to grow by 1 GB every day. You need to access this data in two ways. The first access pattern requires retrieving the metric from one specific sensor stored at a specific timestamp, with a median single-digit millisecond latency. The second access pattern requires running complex analytic queries on the data, including joins, once a day. How should you store this data?

  • A. Store your data in BigQuery. Concatenate the sensor ID and timestamp, and use it as the primary key.
  • B. Store your data in Bigtable. Concatenate the sensor ID and timestamp and use it as the row key. Perform an export to BigQuery every day.
  • C. Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day.
  • D. Store your data in BigQuery. Use the metric as a primary key.
Suggested Answer: B
