Page Reader AI | Exam Professional Machine Learning Engineer topic 1 question 277 discussion

Exam Professional Machine Learning Engineer topic 1 question 277 discussion - ExamTopics

AI Summary

Problem

A PyTorch model deployed on n1-highcpu-16 machines in the us-central1 region of Google Cloud exhibits high latency, particularly for customers in Singapore. The model classifies transactions as fraudulent or not and uses numerical and categorical features.

Solutions and Analysis

Several solutions are proposed:

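A. Attaching an NVIDIA T4 GPU: This might speed up model computation, although a three-layer perceptron is already cheap to evaluate, and it doesn't address the geographical distance to Singapore.
B. Changing to n1-highcpu-32 machines: Adds processing power but doesn't solve the latency problem in Singapore, which comes from network distance rather than compute.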
C. Deploying to Vertex AI private endpoints in both us-central1 and asia-southeast1: This allows the application to choose the nearest endpoint, directly addressing the latency issue in Singapore. This is deemed the correct answer.
D. Creating another Vertex AI endpoint in asia-southeast1: Like option C, this places an endpoint close to Singapore users while keeping the us-central1 deployment, but it uses a standard Vertex AI endpoint rather than private endpoints.
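
The geographical distance point in the analysis above can be made concrete with a rough lower bound on network round-trip time. This is only a sketch under stated assumptions: the coordinates are approximate, the signal speed in fiber is taken as about two-thirds of the speed of light, and the path is assumed to follow the great circle, which real routes do not. Under these assumptions the bound comes out on the order of 150 ms, several times the 40 ms median latency quoted in the question.

```python
import math

# Approximate (lat, lon) in degrees; illustrative assumptions only.
US_CENTRAL1 = (41.26, -95.86)   # Council Bluffs, Iowa
SINGAPORE = (1.35, 103.82)

EARTH_RADIUS_KM = 6371.0
FIBER_SPEED_KM_PER_S = 200_000.0  # assumed: roughly two-thirds of c


def great_circle_km(a, b):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def min_round_trip_ms(a, b):
    """Lower bound on round-trip time over an ideal great-circle fiber path."""
    return 2 * great_circle_km(a, b) / FIBER_SPEED_KM_PER_S * 1000


distance_km = great_circle_km(SINGAPORE, US_CENTRAL1)
rtt_ms = min_round_trip_ms(SINGAPORE, US_CENTRAL1)
print(f"distance ~ {distance_km:.0f} km, minimum round trip ~ {rtt_ms:.0f} ms")
```

Because this bound applies to every request regardless of machine type, adding a GPU or more vCPUs in us-central1 cannot remove it; only serving from a region closer to the user can.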

Suggested Answer

The suggested answer is C, deploying the model to Vertex AI private endpoints in both the US and Singapore regions to minimize latency.
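
As a minimal sketch of what "allow the application to choose the appropriate endpoint" could look like, the application might keep a map from its own serving region to the endpoint deployed in that region. Everything named here is hypothetical: `ENDPOINTS`, `choose_endpoint`, and the `PROJECT_ID` and `ENDPOINT_ID_*` placeholders are illustrative and do not come from the question, and the sketch deliberately stops short of calling any Vertex AI client.

```python
# Hypothetical map from the application's region to the endpoint deployed
# in that same region. Project and endpoint IDs are placeholders.
ENDPOINTS = {
    "us-central1": "projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID_US",
    "asia-southeast1": "projects/PROJECT_ID/locations/asia-southeast1/endpoints/ENDPOINT_ID_SG",
}
DEFAULT_REGION = "us-central1"  # assumed fallback for unmapped regions


def choose_endpoint(app_region: str) -> str:
    """Return the endpoint in the caller's region, else the default endpoint."""
    return ENDPOINTS.get(app_region, ENDPOINTS[DEFAULT_REGION])


# The Singapore application instance resolves to the asia-southeast1 endpoint.
print(choose_endpoint("asia-southeast1"))
```

Keeping the choice in a lookup like this means each application instance sends predictions to an endpoint in its own region, so Singapore traffic no longer crosses the Pacific on every request.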

You work for a large bank that serves customers through an application hosted in Google Cloud that is running in the US and Singapore. You have developed a PyTorch model to classify transactions as potentially fraudulent or not. The model is a three-layer perceptron that uses both numerical and categorical features as input, and hashing happens within the model.

You deployed the model to the us-central1 region on n1-highcpu-16 machines, and predictions are served in real time. The model's current median response latency is 40 ms. You want to reduce latency, especially in Singapore, where some customers are experiencing the longest delays. What should you do?

A. Attach an NVIDIA T4 GPU to the machines being used for online inference.
B. Change the machines being used for online inference to n1-highcpu-32.
C. Deploy the model to Vertex AI private endpoints in the us-central1 and asia-southeast1 regions, and allow the application to choose the appropriate endpoint.
D. Create another Vertex AI endpoint in the asia-southeast1 region, and allow the application to choose the appropriate endpoint.

Category: Technology

Tags: MachineLearning GoogleCloud Latency ModelDeployment PyTorch