Page Reader AI | Exam Professional Machine Learning Engineer topic 1 question 137 discussion

Exam Professional Machine Learning Engineer topic 1 question 137 discussion - ExamTopics

AI Summary

Problem:

A deployed machine learning model exhibits unpredictable performance degradation over time, sometimes degrading quickly and sometimes slowly. The goal is to find a cost-effective method to maintain high performance by retraining the model at optimal intervals.

Options Considered:

A. Anomaly Detection: Train a separate anomaly detection model to flag unusual inputs, triggering labeling and retraining.
B. Temporal Pattern Analysis: Identify performance patterns from the past year to predict future degradation and schedule retraining.
C. Cost-Benefit Analysis: Compare the cost of labeling/retraining with revenue loss due to poor performance to adjust retraining frequency.
D. Training-Serving Skew Detection: Regularly compare training and serving data statistics; if skew is detected, trigger labeling and retraining.

Suggested Solution:

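The suggested answer is D: run regular checks for training-serving skew, that is, differences between the feature distributions in the training data and in recent serving data. Because these checks need only unlabeled requests, they are cheap to run often, and the costly labeling service is used only when the input distribution has shifted enough to put performance at risk. This fits the irregular pattern of degradation better than a fixed labeling schedule.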
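As a rough illustration of what "detecting skew" can mean for a single categorical feature, the sketch below compares the normalized value frequencies of a training sample and a serving sample using the L-infinity distance (the largest absolute difference in frequency for any one value). The function name `linf_distance`, the threshold of 0.1, and the sample data are all assumptions made for illustration; they do not come from the question.

```python
from collections import Counter

# Assumed threshold; in practice it would be tuned per feature.
SKEW_THRESHOLD = 0.1


def linf_distance(train_values, serving_values):
    """L-infinity distance between the value frequencies of two samples."""
    if not train_values or not serving_values:
        raise ValueError("both samples must be non-empty")
    train_counts = Counter(train_values)
    serving_counts = Counter(serving_values)
    n_train = len(train_values)
    n_serving = len(serving_values)
    # A Counter returns 0 for values it has not seen, so categories that
    # appear in only one of the two samples are handled naturally.
    categories = set(train_counts) | set(serving_counts)
    return max(
        abs(train_counts[c] / n_train - serving_counts[c] / n_serving)
        for c in categories
    )


# Hypothetical feature: the client type attached to each request.
train = ["web"] * 70 + ["mobile"] * 30
serving = ["web"] * 40 + ["mobile"] * 50 + ["tablet"] * 10

distance = linf_distance(train, serving)
skewed = distance > SKEW_THRESHOLD
print(f"distance={distance:.2f} skewed={skewed}")
```

No labels are involved at any point: the check looks only at feature values, which is why it can run far more often than a human labeling round.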

You deployed an ML model into production a year ago. Every month, you collect all raw requests that were sent to your model prediction service during the previous month. You send a subset of these requests to a human labeling service to evaluate your model’s performance. After a year, you notice that your model's performance sometimes degrades significantly after a month, while other times it takes several months to notice any decrease in performance. The labeling service is costly, but you also need to avoid large performance degradations. You want to determine how often you should retrain your model to maintain a high level of performance while minimizing cost. What should you do?

A. Train an anomaly detection model on the training dataset, and run all incoming requests through this model. If an anomaly is detected, send the most recent serving data to the labeling service.
B. Identify temporal patterns in your model’s performance over the previous year. Based on these patterns, create a schedule for sending serving data to the labeling service for the next year.
C. Compare the cost of the labeling service with the lost revenue due to model performance degradation over the past year. If the lost revenue is greater than the cost of the labeling service, increase the frequency of model retraining; otherwise, decrease the model retraining frequency.
D. Run training-serving skew detection batch jobs every few days to compare the aggregate statistics of the features in the training dataset with recent serving data. If skew is detected, send the most recent serving data to the labeling service.
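The batch job in option D could look roughly like the sketch below, which compares one aggregate statistic (the mean) of each numeric feature in the training data with the same statistic in recent serving data, and recommends a labeling round only when a feature has moved noticeably. The helper names (`feature_stats`, `detect_skew`, `should_request_labels`), the `max_shift` threshold of 0.5 training standard deviations, and the example rows are hypothetical; a production job would typically compare fuller distribution statistics.

```python
import statistics


def feature_stats(rows, feature):
    """Return (mean, population standard deviation) of one numeric feature."""
    values = [row[feature] for row in rows]
    return statistics.fmean(values), statistics.pstdev(values)


def detect_skew(training_rows, serving_rows, features, max_shift=0.5):
    """Map each skewed feature to its mean shift in training std devs."""
    skewed = {}
    for feature in features:
        train_mean, train_std = feature_stats(training_rows, feature)
        serving_mean, _ = feature_stats(serving_rows, feature)
        # Fall back to a scale of 1.0 for features that are constant
        # in the training data, to avoid dividing by zero.
        scale = train_std if train_std > 0 else 1.0
        shift = abs(serving_mean - train_mean) / scale
        if shift > max_shift:
            skewed[feature] = shift
    return skewed


def should_request_labels(training_rows, serving_rows, features):
    """True when at least one feature shows skew, so labeling is justified."""
    return bool(detect_skew(training_rows, serving_rows, features))


# Hypothetical data: "price" has drifted upward, "age" has not.
training_rows = [
    {"price": 10.0, "age": 30.0},
    {"price": 20.0, "age": 40.0},
    {"price": 30.0, "age": 50.0},
]
serving_rows = [
    {"price": 30.0, "age": 31.0},
    {"price": 40.0, "age": 40.0},
    {"price": 50.0, "age": 49.0},
]

report = detect_skew(training_rows, serving_rows, ["price", "age"])
print(sorted(report))
print(should_request_labels(training_rows, serving_rows, ["price", "age"]))
```

Scheduling a job like this every few days keeps the expensive step (human labeling) conditional on evidence of drift, rather than tied to a calendar.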

Category: MachineLearning

Tags: Machine Learning Model Retraining Anomaly Detection Performance Degradation Production
