Exam Professional Data Engineer topic 1 question 284 discussion - ExamTopics


AI Summary

Problem

A system needs to store and query time-series data from 1000 sensors, each generating one metric per second. Existing data is 1 TB, growing at 1 GB/day. Two access patterns exist: (1) retrieving a single sensor's metric at a specific timestamp, with median single-digit-millisecond latency; (2) daily complex analytics queries, including joins.

Options

  • A. BigQuery: Use concatenated sensor ID and timestamp as primary key.
  • B. Bigtable: Use concatenated sensor ID and timestamp as row key; daily export to BigQuery.
  • C. Bigtable: Use concatenated sensor ID and metric as row key; daily export to BigQuery.
  • D. BigQuery: Use metric as primary key.

Solution

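The suggested answer is B. Bigtable keeps rows sorted by row key and serves single-row reads by key with low latency, which fits the first access pattern. A row key built from the sensor ID and timestamp can be reconstructed directly from the lookup inputs, so each read is a single-row get. Option C fails here because the metric value is not known at lookup time. BigQuery (options A and D) is an analytical warehouse and is not designed for single-digit-millisecond point lookups; it is, however, well suited to the daily complex queries with joins, which is why the data is exported to it once a day.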
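
A minimal sketch of the row-key idea in option B, using an in-memory sorted list as a stand-in for a store that orders rows lexicographically by key. The `#` separator, the zero-padded epoch-second format, and the names `make_row_key` and `SortedStore` are illustrative assumptions; they are not part of the question or of the Bigtable client API.

```python
import bisect


def make_row_key(sensor_id: str, epoch_seconds: int) -> bytes:
    # Zero-pad the timestamp so lexicographic order matches time order.
    return f"{sensor_id}#{epoch_seconds:010d}".encode()


class SortedStore:
    """Toy stand-in for a key-value store that keeps rows sorted by key."""

    def __init__(self):
        self._keys = []    # row keys, kept in sorted order
        self._values = {}  # row key -> metric value

    def put(self, key: bytes, value: float) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key: bytes):
        # Point lookup: the caller rebuilds the exact key from its inputs.
        return self._values.get(key)

    def scan_prefix(self, prefix: bytes):
        # Range scan: all rows whose key starts with the given prefix.
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._values[key]


store = SortedStore()
store.put(make_row_key("sensor-0042", 1700000001), 21.7)
store.put(make_row_key("sensor-0042", 1700000000), 21.5)
store.put(make_row_key("sensor-0007", 1700000000), 19.2)

# Access pattern 1: one sensor at one timestamp.
point = store.get(make_row_key("sensor-0042", 1700000000))

# Putting the sensor ID first also keeps one sensor's readings contiguous.
history = list(store.scan_prefix(b"sensor-0042#"))
```

With the sensor ID first and the timestamp second, the point lookup needs only the two values the caller already has, and a prefix scan returns one sensor's readings in time order.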


You have a network of 1000 sensors. The sensors generate time series data: one metric per sensor per second, along with a timestamp. You already have 1 TB of data, and expect the data to grow by 1 GB every day. You need to access this data in two ways. The first access pattern requires retrieving the metric from one specific sensor stored at a specific timestamp, with a median single-digit millisecond latency. The second access pattern requires running complex analytic queries on the data, including joins, once a day. How should you store this data?
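
For a sense of scale, the figures in the question can be worked through as a rough back-of-the-envelope calculation. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 TB = 1000 GB), which the question does not specify.

```python
SENSORS = 1000
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
BYTES_PER_GB = 10**9             # assumption: decimal gigabyte
GB_PER_TB = 1000                 # assumption: decimal terabyte

# One metric per sensor per second.
writes_per_second = SENSORS
readings_per_day = SENSORS * SECONDS_PER_DAY

# 1 GB of growth per day spread across all readings written that day.
bytes_per_reading = BYTES_PER_GB / readings_per_day

# Days of growth at 1 GB/day needed to add another 1 TB.
days_per_additional_tb = GB_PER_TB / 1

print(f"{readings_per_day:,} readings/day")
print(f"{bytes_per_reading:.2f} bytes per reading")
print(f"{days_per_additional_tb:.0f} days per additional TB")
```

Under these assumptions the workload is about 1000 small writes per second and 86.4 million rows per day, each only a few bytes of payload.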

  • A. Store your data in BigQuery. Concatenate the sensor ID and timestamp, and use it as the primary key.
  • B. Store your data in Bigtable. Concatenate the sensor ID and timestamp and use it as the row key. Perform an export to BigQuery every day.
  • C. Store your data in Bigtable. Concatenate the sensor ID and metric, and use it as the row key. Perform an export to BigQuery every day.
  • D. Store your data in BigQuery. Use the metric as a primary key.
Suggested Answer: B
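
A short sketch of why the key in option C does not serve the first access pattern: the lookup supplies a sensor ID and a timestamp, but a key built from the sensor ID and the metric can only be rebuilt if the metric value is already known. The key formats and sample readings below are illustrative assumptions.

```python
# Sample readings: (sensor_id, epoch_seconds, metric)
readings = [
    ("sensor-0042", 1700000000, 21.5),
    ("sensor-0042", 1700000001, 21.5),  # same metric value, one second later
    ("sensor-0042", 1700000002, 21.7),
]


def key_b(sensor_id: str, epoch_seconds: int) -> str:
    # Option B: sensor ID + timestamp
    return f"{sensor_id}#{epoch_seconds:010d}"


def key_c(sensor_id: str, metric: float) -> str:
    # Option C: sensor ID + metric value
    return f"{sensor_id}#{metric}"


keys_b = {key_b(s, ts) for s, ts, _ in readings}
keys_c = {key_c(s, m) for s, _, m in readings}

# The caller knows the sensor and the timestamp, so the option B key
# can be rebuilt exactly and looked up directly.
found = key_b("sensor-0042", 1700000001) in keys_b

print(len(keys_b), "distinct keys with option B")
print(len(keys_c), "distinct keys with option C")
```

With option B every reading gets its own key. With option C, two readings from the same sensor with the same value map to the same key, and the timestamp cannot be used to locate a row at all.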
