DP-100 Designing and Implementing a Data Science Solution on Azure sample Question + Exam 2025 Practice Exam Dumps

Question # 4

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Question # 5

You need to configure the Filter Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Question # 6

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Question # 7

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 8

You need to select a feature extraction method.

Which method should you use?

Mutual information

Mood's median test

Kendall correlation

Permutation Feature Importance

Question # 9

You need to configure the Edit Metadata module so that the structure of the datasets matches.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 10

You need to implement early stopping criteria as stated in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Question # 11

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 12

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Question # 13

You create an Azure Machine Learning workspace.

You must use the Python SDK v2 to implement an experiment from a Jupyter notebook in the workspace. The experiment must log string metrics.

You need to implement the method to log the string metrics.

Which method should you use?

mlflow.log_metric()

mlflow.log_artifact()

mlflow.log_dict()

mlflow.log_text()

Question # 14

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 15

You are training machine learning models in Azure Machine Learning. You use Hyperdrive to tune the hyperparameters. In previous model training and tuning runs, many models showed similar performance. You need to select an early termination policy that meets the following requirements:

• accounts for the performance of all previous runs when evaluating the current run

• avoids comparing the current run with only the best performing run to date

Which two early termination policies should you use? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Bandit

Median stopping

Default

Truncation selection
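
As a rough illustration of how an early termination rule can take every earlier run into account, the sketch below compares a run's best metric with the median of the other runs' running averages. It is a simplified stand-in written for this page, not Azure Machine Learning's implementation; the function name `should_stop` and the sample numbers are hypothetical.

```python
from statistics import mean, median


def should_stop(current_history, other_histories):
    """Return True if the current run's best metric is below the median
    of the other runs' running averages (higher metric = better)."""
    step = len(current_history)
    # Running average of each earlier run over the same number of steps.
    averages = [mean(h[:step]) for h in other_histories if len(h) >= step]
    if not averages:
        return False  # nothing to compare against yet
    return max(current_history) < median(averages)


# Hypothetical metric histories (one value per reporting interval).
previous_runs = [[0.60, 0.70, 0.80], [0.55, 0.65, 0.75], [0.50, 0.52, 0.54]]
print(should_stop([0.40, 0.45], previous_runs))  # True
print(should_stop([0.62, 0.71], previous_runs))  # False
```

Because the comparison uses a statistic over all runs, a single unusually good run does not by itself cause the others to be cancelled.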

Question # 16

A set of CSV files contains sales records. All the CSV files have the same data schema.

Each CSV file contains the sales record for a particular month and has the filename sales.csv. Each file is stored in a folder that indicates the month and year when the data was recorded. The folders are in an Azure blob container for which a datastore has been defined in an Azure Machine Learning workspace. The folders are organized in a parent folder named sales to create the following hierarchical structure:

At the end of each month, a new folder with that month's sales file is added to the sales folder.

You plan to use the sales data to train a machine learning model based on the following requirements:

You must define a dataset that loads all of the sales data to date into a structure that can be easily converted to a dataframe.

You must be able to create experiments that use only data that was created before a specific previous month, ignoring any data that was added after that month.

You must register the minimum number of datasets possible.

You need to register the sales data as a dataset in Azure Machine Learning service workspace.

What should you do?

Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/sales.csv' file every month. Register the dataset with the name sales_dataset each month, replacing the existing dataset and specifying a tag named month indicating the month and year it was registered. Use this dataset for all experiments.

Create a tabular dataset that references the datastore and specifies the path 'sales/*/sales.csv', register the dataset with the name sales_dataset and a tag named month indicating the month and year it was registered, and use this dataset for all experiments.

Create a new tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/sales.csv' file every month. Register the dataset with the name sales_dataset_MM-YYYY each month with appropriate MM and YYYY values for the month and year. Use the appropriate month-specific dataset for experiments.

Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/sales.csv' file. Register the dataset with the name sales_dataset each month as a new version and with a tag named month indicating the month and year it was registered. Use this dataset for all experiments, identifying the version to be used based on the month tag as necessary.
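
To make the wildcard path in the options concrete, here is a minimal sketch of which files a pattern such as 'sales/*/sales.csv' selects. It uses Python's standard `fnmatch` module purely as an approximation of path globbing; the folder names and the helper `matching_files` are hypothetical, and the matching rules of an actual datastore may differ in detail.

```python
from fnmatch import fnmatch

PATTERN = "sales/*/sales.csv"


def matching_files(paths, pattern=PATTERN):
    """Return the paths that the wildcard pattern selects."""
    return [p for p in paths if fnmatch(p, pattern)]


# Hypothetical blob paths following the sales/mm-yyyy/sales.csv layout.
blob_paths = [
    "sales/01-2019/sales.csv",
    "sales/02-2019/sales.csv",
    "sales/02-2019/notes.txt",
    "archive/01-2018/sales.csv",
]
print(matching_files(blob_paths))
# ['sales/01-2019/sales.csv', 'sales/02-2019/sales.csv']
```

A folder added next month would match the same pattern without the pattern being edited.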

Question # 17

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:

• /data/2018/Q1.csv

• /data/2018/Q2.csv

• /data/2018/Q3.csv

• /data/2018/Q4.csv

• /data/2019/Q1.csv

All files store data in the following format:

id,f1,f2,l

1,1,2,0

2,1,1,1

3,2,1,0

You run the following code:

You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:

Solution: Run the following code:

Does the solution meet the goal?

Yes

Question # 18

You create an Azure Machine Learning workspace and a new Azure DevOps organization. You register a model in the workspace and deploy the model to the target environment.

All new versions of the model registered in the workspace must automatically be deployed to the target environment.

You need to configure Azure Pipelines to deploy the model.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 19

You plan to use a Deep Learning Virtual Machine (DLVM) to train deep learning models using Compute Unified Device Architecture (CUDA) computations.

You need to configure the DLVM to support CUDA.

What should you implement?

Intel Software Guard Extensions (Intel SGX) technology

Solid State Drives (SSD)

Graphic Processing Unit (GPU)

Computer Processing Unit (CPU) speed increase by using overclocking

High Random Access Memory (RAM) configuration

Question # 20

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Machine Learning workspace. You connect to a terminal session from the Notebooks page in Azure Machine Learning studio.

You plan to add a new Jupyter kernel that will be accessible from the same terminal session.

You need to perform the task that must be completed before you can add the new kernel.

Solution: Create an environment.

Does the solution meet the goal?

Yes

Question # 21

You use the Azure Machine Learning Python SDK to create a batch inference pipeline.

You must publish the batch inference pipeline so that business groups in your organization can use the pipeline. Each business group must be able to specify a different location for the data that the pipeline submits to the model for scoring.

You need to publish the pipeline.

What should you do?

Create multiple endpoints for the published pipeline service and have each business group submit jobs to its own endpoint.

Define a PipelineParameter object for the pipeline and use it to specify the business group-specific input dataset for each pipeline run.

Define an OutputFileDatasetConfig object for the pipeline and use the object to specify the business group-specific input dataset for each pipeline run.

Have each business group run the pipeline on local compute and use a local file for the input data.

Question # 22

You use the designer to create a training pipeline for a classification model. The pipeline uses a dataset that includes the features and labels required for model training.

You create a real-time inference pipeline from the training pipeline. You observe that the schema for the generated web service input is based on the dataset and includes the label column that the model predicts. Client applications that use the service must not be required to submit this value.

You need to modify the inference pipeline to meet the requirement.

What should you do?

Add a Select Columns in Dataset module to the inference pipeline after the dataset and use it to select all columns other than the label.

Delete the dataset from the training pipeline and recreate the real-time inference pipeline.

Delete the Web Service Input module from the inference pipeline.

Replace the dataset in the inference pipeline with an Enter Data Manually module that includes data for the feature columns but not the label column.

Question # 23

You manage an Azure Machine Learning workspace named workspace1 and a Data Science Virtual Machine (DSVM) named DSMV1.

You must run an experiment in DSMV1 by using a Jupyter notebook and Python SDK v2 code. You must store metrics and artifacts in workspace1. You start by creating Python SDK v2 code to import all required packages.

You need to implement the Python SDK v2 code to store metrics and artifacts in workspace1.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 24

You use Azure Machine Learning Designer to load the following datasets into an experiment:

Dataset1

Dataset2

You need to create a dataset that has the same columns and header row as the input datasets and contains all rows from both input datasets.

Solution: Use the Join Data component.

Does the solution meet the goal?

Yes

Question # 25

You use the Azure Machine Learning Python SDK to define a pipeline to train a model.

The data used to train the model is read from a folder in a datastore.

You need to ensure the pipeline runs automatically whenever the data in the folder changes.

What should you do?

Set the regenerate_outputs property of the pipeline to True

Create a ScheduleRecurrence object with a Frequency of auto. Use the object to create a Schedule for the pipeline

Create a PipelineParameter with a default value that references the location where the training data is stored

Create a Schedule for the pipeline. Specify the datastore in the datastore property, and the folder containing the training data in the path_on_datastore property

Question # 26

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You create a model to forecast weather conditions based on historical data.

You need to create a pipeline that runs a processing script to load data from a datastore and pass the processed data to a machine learning model training script.

Solution: Run the following code:

Does the solution meet the goal?

Yes

Question # 27

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Machine Learning workspace. You connect to a terminal session from the Notebooks page in Azure Machine Learning studio.

You plan to add a new Jupyter kernel that will be accessible from the same terminal session.

You need to perform the task that must be completed before you can add the new kernel.

Solution: Delete the Python 3.6 - AzureML kernel.

Does the solution meet the goal?

Yes

Question # 28

You have an Azure Machine Learning workspace that contains a CPU-based compute cluster and an Azure Kubernetes Services (AKS) inference cluster. You create a tabular dataset containing data that you plan to use to create a classification model.

You need to use the Azure Machine Learning designer to create a web service through which client applications can consume the classification model by submitting new data and getting an immediate prediction as a response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 29

You load data from a notebook in an Azure Machine Learning workspace into a pandas dataframe named df. The data contains 10,000 patient records. Each record includes the Age property for the corresponding patient.

You must identify the mean age value from the differentially private data generated by SmartNoise SDK.

You need to complete the Python code that will generate the mean age value from the differentially private data.

Which code segments should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
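
For readers unfamiliar with what a differentially private mean involves, the following is a minimal sketch of the underlying idea using the Laplace mechanism. It deliberately does not use the SmartNoise SDK or reproduce the exam's code; the function `dp_mean`, the age bounds, and the epsilon value are all assumptions chosen for illustration.

```python
import random


def dp_mean(values, lower, upper, epsilon, rng=random):
    """Differentially private mean via the Laplace mechanism (sketch)."""
    # Clamp each value so one record has a bounded effect on the result.
    clamped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clamped) / len(clamped)
    # Replacing one record moves the mean by at most this much.
    sensitivity = (upper - lower) / len(clamped)
    scale = sensitivity / epsilon
    # The difference of two exponential draws is Laplace-distributed.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_mean + noise


# Hypothetical data: 10,000 synthetic ages, bounds 0..120, epsilon 1.0.
rng = random.Random(0)
ages = [rng.randint(18, 90) for _ in range(10_000)]
print(dp_mean(ages, lower=0, upper=120, epsilon=1.0, rng=rng))
```

A smaller epsilon gives a larger noise scale, so the reported mean drifts further from the true mean in exchange for stronger privacy.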

Question # 30

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Machine Learning workspace. You connect to a terminal session from the Notebooks page in Azure Machine Learning studio.

You plan to add a new Jupyter kernel that will be accessible from the same terminal session.

You need to perform the task that must be completed before you can add the new kernel.

Solution: Delete the Python 3.8 - AzureML kernel.

Does the solution meet the goal?

Yes

Question # 31

You are building a recurrent neural network to perform a binary classification. You review the training loss, validation loss, training accuracy, and validation accuracy for each training epoch.

You need to analyze model performance.

Which observation indicates that the classification model is overfitted?

The training loss stays constant and the validation loss stays on a constant value and close to the training loss value when training the model.

The training loss increases while the validation loss decreases when training the model.

The training loss decreases while the validation loss increases when training the model.

The training loss stays constant and the validation loss decreases when training the model.
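
When reviewing per-epoch losses like these, it can help to check the two trends programmatically. The sketch below flags the case where training loss keeps falling while validation loss rises over the last few epochs, which is the usual sign that a model is memorising its training data. The function name, the window size, and the loss values are hypothetical.

```python
def losses_diverge(train_loss, val_loss, window=3):
    """True if, over the last `window` epochs, training loss fell
    while validation loss rose."""
    if len(train_loss) < window + 1 or len(val_loss) < window + 1:
        return False  # not enough epochs to judge a trend
    train_falling = train_loss[-1] < train_loss[-1 - window]
    val_rising = val_loss[-1] > val_loss[-1 - window]
    return train_falling and val_rising


# Hypothetical per-epoch losses.
train = [0.90, 0.70, 0.55, 0.42, 0.33, 0.25]
val = [0.92, 0.75, 0.66, 0.68, 0.74, 0.81]
print(losses_diverge(train, val))  # True
```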

Question # 32

You manage an Azure Machine Learning workspace. You plan to import data from Azure Data Lake Storage Gen2. You need to build a URI that represents the storage location. Which protocol should you use?

abfss

https

adl

wasbs
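
For context on what such a storage URI looks like, here is a small sketch that assembles one in the form commonly documented for Azure Data Lake Storage Gen2, `abfss://<file_system>@<account>.dfs.core.windows.net/<path>`. The helper name and the container, account, and path values are hypothetical.

```python
def adls_gen2_uri(file_system, account, path=""):
    """Build an abfss:// URI for a path in an ADLS Gen2 file system."""
    return f"abfss://{file_system}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical container, storage account, and file path.
print(adls_gen2_uri("data", "mystorageacct", "/raw/2024/file.csv"))
# abfss://data@mystorageacct.dfs.core.windows.net/raw/2024/file.csv
```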

Question # 33

You have an Azure Machine Learning workspace. You are running an experiment on your local computer.

You need to use MLflow Tracking to store metrics and artifacts from your local experiment runs in the workspace.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

Question # 34

You create an Azure Machine Learning workspace.

You must configure an event handler to send an email notification when data drift is detected in the workspace datasets. You must minimize development efforts.

You need to configure an Azure service to send the notification.

Which Azure service should you use?

Azure Function apps

Azure DevOps pipeline

Azure Automation runbook

Azure Logic Apps

Question # 35

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are using Azure Machine Learning to run an experiment that trains a classification model.

You want to use Hyperdrive to find parameters that optimize the AUC metric for the model. You configure a HyperDriveConfig for the experiment by running the following code:

You plan to use this configuration to run a script that trains a random forest model and then tests it with validation data. The label values for the validation data are stored in a variable named y_test, and the predicted probabilities from the model are stored in a variable named y_predicted.

Solution: Run the following code:

Does the solution meet the goal?

Yes

Question # 36

You are conducting feature engineering to prepare data for further analysis.

The data includes seasonal patterns on inventory requirements.

You need to select the appropriate method to conduct feature engineering on the data.

Which method should you use?

Exponential Smoothing (ETS) function.

One Class Support Vector Machine module

Time Series Anomaly Detection module

Finite Impulse Response (FIR) Filter module.

Question # 37

You have a Python data frame named salesData in the following format:

The data frame must be unpivoted to a long data format as follows:

You need to use the pandas.melt() function in Python to perform the transformation.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
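
The exhibits for this question are not reproduced here, so the sketch below only illustrates the wide-to-long reshaping that pandas.melt() performs. It is written in plain Python rather than pandas so that it runs with the standard library alone; the column names (`shop`, `2017`, `2018`) and values are hypothetical, while the parameter names mirror those of pandas.melt().

```python
def melt(rows, id_vars, value_vars, var_name="variable", value_name="value"):
    """Unpivot wide rows (dicts) into long rows, one per value column."""
    long_rows = []
    for column in value_vars:  # column-major order, like pandas.melt
        for row in rows:
            record = {key: row[key] for key in id_vars}
            record[var_name] = column
            record[value_name] = row[column]
            long_rows.append(record)
    return long_rows


# Hypothetical wide-format data: one column per year.
wide = [
    {"shop": "A", "2017": 34, "2018": 25},
    {"shop": "B", "2017": 65, "2018": 76},
]
for record in melt(wide, id_vars=["shop"], value_vars=["2017", "2018"],
                   var_name="year", value_name="sales"):
    print(record)
```

The identifier columns are repeated on every output row, and each former column header becomes a value in the new `var_name` column.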

Question # 38

You have a feature set containing the following numerical features: X, Y, and Z.

The Pearson correlation coefficient (r-value) of X, Y, and Z features is shown in the following image:

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.
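
Since the image is not reproduced here, a short sketch of how a Pearson r-value is computed may help when interpreting such a graphic. The feature values below are invented for illustration and are not the values from the exhibit.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical features.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 6, 8, 10]  # rises with X
Z = [10, 8, 6, 4, 2]  # falls as X rises
print(pearson_r(X, Y))  # 1.0
print(pearson_r(X, Z))  # -1.0
```

Values near +1 or -1 indicate a strong linear relationship (positive or negative), and values near 0 indicate little linear relationship.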

Question # 39

You use the Azure Machine Learning SDK to run a training experiment that trains a classification model and calculates its accuracy metric.

The model will be retrained each month as new data is available.

You must register the model for use in a batch inference pipeline.

You need to register the model and ensure that the models created by subsequent retraining experiments are registered only if their accuracy is higher than the currently registered model.

What are two possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Specify a different name for the model each time you register it.

Register the model with the same name each time regardless of accuracy, and always use the latest version of the model in the batch inferencing pipeline.

Specify the model framework version when registering the model, and only register subsequent models if this value is higher.

Specify a property named accuracy with the accuracy metric as a value when registering the model, and only register subsequent models if their accuracy is higher than the accuracy property value of the currently registered model.

Specify a tag named accuracy with the accuracy metric as a value when registering the model, and only register subsequent models if their accuracy is higher than the accuracy tag value of the currently registered model.
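
The "register only if better" requirement can be sketched independently of any SDK. The example below keeps model versions in a plain dictionary and stores the accuracy as metadata alongside each version; it does not call Azure Machine Learning, and the registry structure, function name, and model name are hypothetical.

```python
def register_if_better(registry, name, accuracy):
    """Add a new version only if it beats the latest version's accuracy."""
    versions = registry.setdefault(name, [])
    if versions and accuracy <= versions[-1]["metadata"]["accuracy"]:
        return None  # keep the currently registered model
    version = {"version": len(versions) + 1, "metadata": {"accuracy": accuracy}}
    versions.append(version)
    return version


registry = {}  # stand-in for a model registry
print(register_if_better(registry, "classifier", 0.80))  # registered as version 1
print(register_if_better(registry, "classifier", 0.78))  # None: not registered
print(register_if_better(registry, "classifier", 0.85))  # registered as version 2
```

The comparison only works if the metric is stored with the registered model, which is why the sketch records it at registration time.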

Question # 40

You create a script that trains a convolutional neural network model over multiple epochs and logs the validation loss after each epoch. The script includes arguments for batch size and learning rate.

You identify a set of batch size and learning rate values that you want to try.

You need to use Azure Machine Learning to find the combination of batch size and learning rate that results in the model with the lowest validation loss.

What should you do?

Run the script in an experiment based on an AutoMLConfig object

Create a PythonScriptStep object for the script and run it in a pipeline

Use the Automated Machine Learning interface in Azure Machine Learning studio

Run the script in an experiment based on a ScriptRunConfig object

Run the script in an experiment based on a HyperDriveConfig object
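
Conceptually, the task is a search over every combination of the candidate values. The sketch below shows that search in plain Python with a toy scoring function standing in for a real training run; it does not use Azure Machine Learning, and the candidate values, `best_combination`, and `toy_validation_loss` are hypothetical.

```python
from itertools import product


def best_combination(batch_sizes, learning_rates, validation_loss):
    """Try every (batch size, learning rate) pair; return the lowest-loss one."""
    return min(product(batch_sizes, learning_rates),
               key=lambda pair: validation_loss(*pair))


def toy_validation_loss(batch_size, learning_rate):
    # Hypothetical stand-in for training a model and reading its logged loss.
    return abs(batch_size - 64) / 64 + abs(learning_rate - 0.01) * 10


print(best_combination([32, 64, 128], [0.1, 0.01, 0.001], toy_validation_loss))
# (64, 0.01)
```

In practice each evaluation is a full training run, so a managed service is normally used to schedule the runs and collect the logged metric.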

Question # 41

You create a pipeline in designer to train a model that predicts automobile prices.

Because of non-linear relationships in the data, the pipeline calculates the natural log (Ln) of the prices in the training data, trains a model to predict this natural log of price value, and then calculates the exponential of the scored label to get the predicted price.

The training pipeline is shown in the exhibit. (Click the Training pipeline tab.)

Training pipeline

You create a real-time inference pipeline from the training pipeline, as shown in the exhibit. (Click the Real-time pipeline tab.)

Real-time pipeline

You need to modify the inference pipeline to ensure that the web service returns the exponential of the scored label as the predicted automobile price and that client applications are not required to include a price value in the input values.

Which three modifications must you make to the inference pipeline? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Connect the output of the Apply SQL Transformation to the Web Service Output module.

Replace the Web Service Input module with a data input that does not include the price column.

Add a Select Columns module before the Score Model module to select all columns other than price.

Replace the training dataset module with a data input that does not include the price column.

Remove the Apply Math Operation module that replaces price with its natural log from the data flow.

Remove the Apply SQL Transformation module from the data flow.
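
The transformation described in this scenario relies on the exponential undoing the natural log. A minimal sketch of that round trip, with a hypothetical price and helper names, is:

```python
import math


def to_log_price(price):
    """Training-time transform: natural log of the price."""
    return math.log(price)


def to_price(scored_label):
    """Inference-time transform: exponential of the predicted log price."""
    return math.exp(scored_label)


price = 13495.0  # hypothetical automobile price
log_price = to_log_price(price)
recovered = to_price(log_price)
print(round(log_price, 4), round(recovered, 2))
```

A model trained on the log value predicts in log units, so the exponential step has to remain between the scored output and whatever is returned to the client.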

Question # 42

You use the following code to define the steps for a pipeline:

from azureml.core import Workspace, Experiment, Run

from azureml.pipeline.core import Pipeline

from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

. . .

step1 = PythonScriptStep(name="step1", ...)

step2 = PythonScriptStep(name="step2", ...)

pipeline_steps = [step1, step2]

You need to add code to run the steps.

Which two code segments can you use to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

experiment = Experiment(workspace=ws, name='pipeline-experiment')
run = experiment.submit(config=pipeline_steps)

run = Run(pipeline_steps)

pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
experiment = Experiment(workspace=ws, name='pipeline-experiment')
run = experiment.submit(pipeline)

pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
run = pipeline.submit(experiment_name='pipeline-experiment')

Question # 43

You need to implement a new cost factor scenario for the ad response models as illustrated in the performance curve exhibit.

Which technique should you use?

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Question # 44

You plan to use Hyperdrive to optimize the hyperparameters selected when training a model. You create the following code to define options for the hyperparameter experiment

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

Question # 45

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 46

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 47

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Azure HDInsight with Spark MLlib

Azure Cognitive Services

Azure Machine Learning Studio

Microsoft Machine Learning Server

Question # 48

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 49

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 50

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 51

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 52

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

Apply an analysis of variance (ANOVA).

Apply a Pearson correlation coefficient.

Apply a Spearman correlation coefficient.

Apply a linear discriminant analysis.

Question # 53

You need to resolve the local machine learning pipeline performance issue. What should you do?

Increase Graphic Processing Units (GPUs).

Increase the learning rate.

Increase the training iterations.

Increase Central Processing Units (CPUs).

Question # 54

You need to implement a model development strategy to determine a user's tendency to respond to an ad.

Which technique should you use?

Use a Relative Expression Split module to partition the data based on centroid distance.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on centroid distance.

Question # 55

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 56

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 57

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

Streaming

Weight

Batch

Cosine

Question # 58

You are planning to register a trained model in an Azure Machine Learning workspace.

You must store additional metadata about the model in a key-value format. You must be able to add new metadata and modify or delete metadata after creation.

You need to register the model.

Which parameter should you use?

description

model_framework

tags

properties

Question # 59

You are moving a large dataset from Azure Machine Learning Studio to a Weka environment.

You need to format the data for the Weka environment.

Which module should you use?

Convert to CSV

Convert to Dataset

Convert to ARFF

Convert to SVMLight

Question # 60

You manage an Azure Machine Learning workspace named workspace1 by using the Python SDK v2.

The default datastore of workspace1 contains a folder named sample_data. The folder structure contains the following content:

You write Python SDK v2 code to materialize the data from the files in the sample_data folder into a Pandas data frame. You need to complete the Python SDK v2 code to use the MLTable folder as the materialization blueprint. How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 61

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are using Azure Machine Learning to run an experiment that trains a classification model.

You want to use Hyperdrive to find parameters that optimize the AUC metric for the model. You configure a HyperDriveConfig for the experiment by running the following code:

Solution: Run the following code:

Does the solution meet the goal?

Yes

Question # 62

You are working on a classification task. You have a dataset indicating whether a student would like to play soccer and associated attributes. The dataset includes the following columns:

You need to classify variables by type.

Which variable should you add to each category? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 63

You define a datastore named ml-data for an Azure Storage blob container. In the container, you have a folder named train that contains a file named data.csv. You plan to use the file to train a model by using the Azure Machine Learning SDK.

You plan to train the model by using the Azure Machine Learning SDK to run an experiment on local compute.

You define a DataReference object by running the following code:

You need to load the training data.

Which code segment should you use?

Option A

Option B

Option C

Option D

Option E

Question # 64

You create an Azure Machine Learning workspace.

You must use the Python SDK v2 to implement an experiment from a Jupyter notebook in the workspace. The experiment must log a table in the following format:

You need to complete the Python code to log the table.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 65

You are profiling MLTable data assets by using Azure Machine Learning studio. You need to detect columns with odd or missing values. Which statistic should you analyze?

Profile

Std deviation

Error count

Type

Question # 66

You create a multi-class image classification deep learning model that uses a set of labeled images. You create a script file named train.py that uses the PyTorch 1.3 framework to train the model.

You must run the script by using an estimator. The code must not require any additional Python libraries to be installed in the environment for the estimator. The time required for model training must be minimized.

You need to define the estimator that will be used to run the script.

Which estimator type should you use?

TensorFlow

PyTorch

SKLearn

Estimator

Question # 67

You are performing feature engineering on a dataset.

You must add a feature named CityName and populate the column value with the text London.

You need to add the new feature to the dataset.

Which Azure Machine Learning Studio module should you use?

Edit Metadata

Preprocess Text

Execute Python Script

Latent Dirichlet Allocation

Question # 68

You plan to run a script as an experiment using a Script Run Configuration. The script uses modules from the scipy library as well as several Python packages that are not typically installed in a default conda environment.

You plan to run the experiment on your local workstation for small datasets and scale out the experiment by running it on more powerful remote compute clusters for larger datasets.

You need to ensure that the experiment runs successfully on local and remote compute with the least administrative effort.

What should you do?

Create and register an Environment that includes the required packages. Use this Environment for all experiment runs.

Always run the experiment with an Estimator by using the default packages.

Do not specify an environment in the run configuration for the experiment. Run the experiment by using the default environment.

Create a config.yaml file defining the conda packages that are required and save the file in the experiment folder.

Create a virtual machine (VM) with the required Python configuration and attach the VM as a compute target. Use this compute target for all experiment runs.

Question # 69

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

An IT department creates the following Azure resource groups and resources:

The IT department creates an Azure Kubernetes Service (AKS)-based inference compute target named aks-cluster in the Azure Machine Learning workspace.

You have a Microsoft Surface Book computer with a GPU. Python 3.6 and Visual Studio Code are installed.

You need to run a script that trains a deep neural network (DNN) model and logs the loss and accuracy metrics.

Solution: Attach the mlvm virtual machine as a compute target in the Azure Machine Learning workspace. Install the Azure ML SDK on the Surface Book and run Python code to connect to the workspace. Run the training script as an experiment on the mlvm remote compute resource.

Yes

Question # 70

You manage an Azure Machine Learning workspace. You configure an automated machine learning regression training job by using the Azure Machine Learning Python SDK v2. You configure the regression job by using the following script:

For each of the following statements, select Yes if the statement is true. Otherwise, select No.

Question # 71

You are tuning a hyperparameter for an algorithm. The following table shows a data set with different hyperparameter values, training errors, and validation errors.

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.

Question # 72

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Machine Learning workspace. You connect to a terminal session from the Notebooks page in Azure Machine Learning studio.

You plan to add a new Jupyter kernel that will be accessible from the same terminal session.

You need to perform the task that must be completed before you can add the new kernel.

Solution: Delete the Python 3.8 - AzureML kernel.

Does the solution meet the goal?

Yes

Question # 73

You manage an Azure Machine Learning workspace. The development environment for managing the workspace is configured to use Python SDK v2 in Azure Machine Learning Notebooks.

A Synapse Spark Compute is currently attached and uses system-assigned identity.

You need to use Python code to update the Synapse Spark Compute to use a user-assigned identity.

Solution: Initialize the DefaultAzureCredential class.

Does the solution meet the goal?

Yes

Question # 74

You manage an Azure Machine Learning workspace named workspace1 with a compute instance named compute1.

You must remove a kernel named kernel1 from compute1. You connect to compute1 by using a terminal window from workspace1.

You need to enter a command in the terminal window to remove kernel1.

Which command should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 75

You are creating an experiment by using Azure Machine Learning Studio.

You must divide the data into four subsets for evaluation. There is a high degree of missing values in the data. You must prepare the data for analysis.

You need to select appropriate methods for producing the experiment.

Which three modules should you run in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Question # 76

You manage an Azure Machine Learning workspace. You train a model named model1.

You must identify the features to modify for a differing model prediction result.

You need to configure the Responsible AI (RAI) dashboard for model1.

Which three actions should you perform in sequence? To answer move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 77

You plan to deliver a hands-on workshop to several students. The workshop will focus on creating data visualizations using Python. Each student will use a device that has internet access.

Student devices are not configured for Python development. Students do not have administrator access to install software on their devices. Azure subscriptions are not available for students.

You need to ensure that students can run Python-based data visualization code.

Which Azure tool should you use?

Anaconda Data Science Platform

Azure Batch AI

Azure Notebooks

Azure Machine Learning Service

Question # 78

You have an Azure Machine Learning workspace.

You plan to implement automated hyperparameter tuning for model training in the workspace.

You need to select the sweep jobs parameter sampling method that will randomize the selection of hyperparameters from the search space but allow for reproducing search results.

Which sampling method should you use?

Bayesian

Random

Sobol

Grid

Question # 79

You are using the Azure Machine Learning designer to transform a dataset containing the census data of all nations.

You must use the Split Data component to separate the dataset into two datasets. The first dataset must contain the census data of the United States. The second dataset must include the census data of the remaining nations.

You need to configure the component to create the datasets.

Which configuration values should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
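
The kind of split described here, one output for rows that match a condition on a column and another for everything else, can be sketched in plain Python. This is only an illustration of the idea, not the component's own configuration syntax; the column name `native_country` and the sample rows are hypothetical.

```python
import re


def split_by_regex(rows, column, pattern):
    """Return (matching rows, remaining rows) for a regex on one column."""
    regex = re.compile(pattern)
    matched, rest = [], []
    for row in rows:
        target = matched if regex.search(str(row[column])) else rest
        target.append(row)
    return matched, rest


# Hypothetical census rows, for illustration only.
census = [
    {"native_country": "United-States", "population": 1},
    {"native_country": "Canada", "population": 2},
    {"native_country": "United-Kingdom", "population": 3},
]

# Anchoring the pattern avoids matching other names that start with "United".
us_rows, other_rows = split_by_regex(census, "native_country", r"^United-States$")
print(len(us_rows), len(other_rows))
```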

Question # 80

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are analyzing a numerical dataset which contains missing values in several columns.

You must clean the missing values using an appropriate operation without affecting the dimensionality of the feature set.

You need to analyze a full dataset to include all values.

Solution: Remove the entire column that contains the missing data point.

Does the solution meet the goal?

Yes
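
To see how the two cleaning operations affect the dimensionality of the feature set, the sketch below compares dropping columns with replacing missing values by the column mean. It is a plain-Python illustration with a hypothetical three-row dataset, not the behavior of any specific Azure module.

```python
from statistics import mean

# Hypothetical numeric dataset; None marks a missing value.
rows = [
    {"height": 1.70, "weight": 65.0},
    {"height": None, "weight": 72.0},
    {"height": 1.82, "weight": None},
]


def drop_columns_with_missing(data):
    """Remove every column that has at least one missing value."""
    keep = [c for c in data[0] if all(r[c] is not None for r in data)]
    return [{c: r[c] for c in keep} for r in data]


def impute_mean(data):
    """Replace each missing value with the mean of its column."""
    means = {
        c: mean(r[c] for r in data if r[c] is not None) for c in data[0]
    }
    return [
        {c: (r[c] if r[c] is not None else means[c]) for c in r}
        for r in data
    ]


dropped = drop_columns_with_missing(rows)
imputed = impute_mean(rows)
print(len(dropped[0]), "columns left after dropping")
print(len(imputed[0]), "columns left after imputing")
```

Dropping removes features, so the dimensionality changes; imputing keeps every column and every row.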

Question # 81

You use an Azure Machine Learning workspace.

You must monitor cost at the endpoint and deployment level.

You have a trained model that must be deployed as an online endpoint. Users must authenticate by using Microsoft Entra ID.

What should you do?

Deploy the model to Azure Kubernetes Service (AKS). During deployment, set the token_auth_mode parameter of the target configuration object to true.

Deploy the model to a managed online endpoint. During deployment, set the token_auth_mode parameter of the target configuration object to true.

Deploy the model to Azure Kubernetes Service (AKS). During deployment, set the auth_mode parameter to configure the authentication type.

Deploy the model to a managed online endpoint. During deployment, set the auth_mode parameter to configure the authentication type.

Question # 82

You are performing filter-based feature selection for a dataset to build a multi-class classifier by using Azure Machine Learning Studio.

The dataset contains categorical features that are highly correlated to the output label column.

You need to select the appropriate feature scoring statistical method to identify the key predictors. Which method should you use?

Chi-squared

Spearman correlation

Kendall correlation

Pearson correlation
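
As background on scoring a categorical feature against a categorical label, the sketch below computes the Pearson chi-squared statistic from the observed and expected cell counts of the contingency table. It is a minimal illustration with hypothetical data, not the scoring code used by the Studio module.

```python
from collections import Counter


def chi_squared(feature, label):
    """Pearson chi-squared statistic for two categorical sequences."""
    n = len(feature)
    observed = Counter(zip(feature, label))
    feature_totals = Counter(feature)
    label_totals = Counter(label)
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Expected count if feature and label were independent.
            expected = f_count * l_count / n
            stat += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return stat


# Hypothetical three-class label and two candidate features.
label = ["a", "a", "b", "b", "c", "c"]
predictive = ["x", "x", "y", "y", "z", "z"]  # tracks the label exactly
noisy = ["x", "y", "x", "y", "x", "y"]       # unrelated to the label

print(chi_squared(predictive, label))
print(chi_squared(noisy, label))
```

A larger statistic indicates a stronger association between the feature and the label, which is why it can be used to rank categorical predictors.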

Question # 83

You need to select a prebuilt development environment for a series of data science experiments. You must use the R language for the experiments.

Which three environments can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

ML.NET Library on a local environment

Azure Machine Learning Studio

Data Science Virtual Machine (DSVM)

Azure Databricks

Azure Cognitive Services

Question # 84

You use Azure Machine Learning designer to create a real-time service endpoint. You have a single Azure Machine Learning service compute resource. You train the model and prepare the real-time pipeline for deployment. You need to publish the inference pipeline as a web service. Which compute type should you use?

HDInsight

Azure Databricks

Azure Kubernetes Service

the existing Machine Learning Compute resource

a new Machine Learning Compute resource

Question # 85

You have a binary classifier that predicts positive cases of diabetes within two separate age groups.

The classifier exhibits a high degree of disparity between the age groups.

You need to modify the output of the classifier to maximize its degree of fairness across the age groups and meet the following requirements:

• Eliminate the need to retrain the model on which the classifier is based.

• Minimize the disparity between true positive rates and false positive rates across age groups.

Which algorithm and parity constraint should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
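
The disparity named in the second requirement can be measured directly: compute the true positive rate and false positive rate for each group and compare them. Equal rates on both measures across groups is the condition usually called equalized odds. The sketch below is a plain-Python illustration with hypothetical labels, predictions, and group names; it measures the gaps only and does not perform any mitigation.

```python
def rates(y_true, y_pred):
    """Return (true positive rate, false positive rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def rate_gaps(y_true, y_pred, groups):
    """Absolute TPR and FPR differences between exactly two groups."""
    names = sorted(set(groups))
    per_group = {}
    for name in names:
        idx = [i for i, g in enumerate(groups) if g == name]
        per_group[name] = rates(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    (tpr_a, fpr_a), (tpr_b, fpr_b) = per_group[names[0]], per_group[names[1]]
    return abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)


# Hypothetical data: four people per age group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["under_50"] * 4 + ["50_plus"] * 4

tpr_gap, fpr_gap = rate_gaps(y_true, y_pred, groups)
print(tpr_gap, fpr_gap)
```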

Question # 86

You train and register a machine learning model. You create a batch inference pipeline that uses the model to generate predictions from multiple data files.

You must publish the batch inference pipeline as a service that can be scheduled to run every night.

You need to select an appropriate compute target for the inference service.

Which compute target should you use?

Azure Machine Learning compute instance

Azure Machine Learning compute cluster

Azure Kubernetes Service (AKS)-based inference cluster

Azure Container Instance (ACI) compute target

Question # 87

You are developing deep learning models to analyze semi-structured, unstructured, and structured data types.

You have the following data available for model building:

Video recordings of sporting events

Transcripts of radio commentary about events

Logs from related social media feeds captured during sporting events

You need to select an environment for creating the model.

Which environment should you use?

Azure Cognitive Services

Azure Data Lake Analytics

Azure HDInsight with Spark MLlib

Azure Machine Learning Studio

Question # 88

You manage an Azure Machine Learning workspace named workspace1.

You plan to create a registry named registry01 with the help of the following registry.yml (line numbers are used for reference only):

You need to use Azure Machine Learning Python SDK v2 with Python 3.10 in a notebook to interact with workspace1.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 89

You manage an Azure AI Foundry project.

You develop a Prompt flow that includes a large language model (LLM) node and an upstream node with a single output. You need to link the LLM node input with the output of the upstream node by using a YAML flow configuration. Which flow configuration should you use?

${upstream_node_name.output}

<#upstream_node_name.output#>

{% upstream_node_name.output %}

{{upstream_node_name.output}}

Question # 90

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train and register an Azure Machine Learning model.

You plan to deploy the model to an online endpoint.

You need to ensure that applications will be able to use the authentication method with a non-expiring artifact to access the model.

Solution:

Create a managed online endpoint and set the value of its auth_mode parameter to key. Deploy the model to the online endpoint.

Does the solution meet the goal?

Yes

Question # 91

You create a pipeline in designer to train a model that predicts automobile prices.

The training pipeline is shown in the exhibit. (Click the Training pipeline tab.)

Training pipeline

You create a real-time inference pipeline from the training pipeline, as shown in the exhibit. (Click the Real-time pipeline tab.)

Real-time pipeline

Which three modifications must you make to the inference pipeline? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Connect the output of the Apply SQL Transformation to the Web Service Output module.

Replace the Web Service Input module with a data input that does not include the price column.

Add a Select Columns module before the Score Model module to select all columns other than price.

Replace the training dataset module with a data input that does not include the price column.

Remove the Apply Math Operation module that replaces price with its natural log from the data flow.

Remove the Apply SQL Transformation module from the data flow.

Question # 92

You have a Jupyter Notebook that contains Python code that is used to train a model.

You must create a Python script for the production deployment. The solution must minimize code maintenance.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Refactor the Jupyter Notebook code into functions

Save each function to a separate Python file

Define a main() function in the Python script

Remove all comments and functions from the Python script
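
As an illustration of what a notebook refactored into a script can look like, the sketch below groups the steps into functions and calls them from a `main()` entry point. The function names, the sample data, and the toy model are all hypothetical stand-ins for the notebook's real cells.

```python
def load_data():
    """Stand-in for the notebook's data-loading cell (hypothetical data)."""
    return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def train(points):
    """Fit y = slope * x by least squares through the origin."""
    numerator = sum(x * y for x, y in points)
    denominator = sum(x * x for x, _ in points)
    return numerator / denominator


def main():
    """Run the steps in order, as the notebook cells did."""
    points = load_data()
    slope = train(points)
    print(f"slope={slope:.2f}")
    return slope


if __name__ == "__main__":
    main()
```

Keeping each step in its own function makes the steps individually testable, and the `__main__` guard lets the file be imported without running the training.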

Question # 93

You are evaluating a completed binary classification machine.

You need to use the precision as the evaluation metric.

Which visualization should you use?

scatter plot

coefficient of determination

Receiver Operating Characteristic (ROC) curve

Gradient descent
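
For reference, precision is the share of predicted positives that are actually positive, and its value depends on the classification threshold. The sketch below computes precision and recall at two thresholds; the labels and scores are hypothetical.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when scores >= threshold count as positive."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical true labels and model scores.
y_true = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]

for threshold in (0.5, 0.75):
    print(threshold, precision_recall(y_true, scores, threshold))
```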

Question # 94

You use Azure Machine Learning designer to load the following datasets into an experiment:

Dataset1:

Dataset2:

You need to create a dataset that has the same columns and header row as the input datasets and contains all rows from both input datasets.

Solution: Use the Add Rows component.

Does the solution meet the goal?

Yes
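
The stated goal, one dataset with the shared header and every row from both inputs, corresponds to appending rows. The sketch below shows that operation in plain Python on hypothetical data; it illustrates the goal and is not the designer component itself.

```python
def add_rows(first, second):
    """Append the rows of `second` to `first`; both must share one header."""
    header_1, rows_1 = first
    header_2, rows_2 = second
    if header_1 != header_2:
        raise ValueError("datasets must have identical columns")
    return header_1, rows_1 + rows_2


# Hypothetical datasets as (header, rows) pairs.
dataset1 = (["id", "value"], [[1, "a"], [2, "b"]])
dataset2 = (["id", "value"], [[3, "c"]])

header, rows = add_rows(dataset1, dataset2)
print(header, rows)
```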

Question # 95

You create an Azure Machine Learning workspace. The workspace contains a folder named src. The folder contains a Python script named script1.py.

You use the Azure Machine Learning Python SDK v2 to create a control script. You must use the control script to run script1.py as part of a training job.

You need to complete the section of script that defines the job parameters.

How should you complete the script? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 96

You have an Azure Machine Learning workspace. You are connecting an Azure Data Lake Storage Gen2 account to the workspace as a data store. You need to authorize access from the workspace to the Azure Data Lake Storage Gen2 account.

What should you use?

Managed identity

SAS token

Service principal

Account key

Question # 97

A coworker registers a datastore in a Machine Learning services workspace by using the following code:

You need to write code to access the datastore from a notebook.

Question # 98

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train a classification model by using a logistic regression algorithm.

You must be able to explain the model's predictions by calculating the importance of each feature, both as an overall global relative importance value and as a measure of local importance for a specific set of predictions.

You need to create an explainer that you can use to retrieve the required global and local feature importance values.

Solution: Create a MimicExplainer.

Does the solution meet the goal?

Yes

Question # 99

You manage an Azure Machine Learning workspace. You have an environment for training jobs which uses an existing Docker image. A new version of the Docker image is available.

You need to use the latest version of the Docker image for the environment configuration by using the Azure Machine Learning SDK v2. What should you do?

Modify the conda file to specify the new version of the Docker image.

Use the Environment class to create a new version of the environment.

Use the create_or_update method to change the tag of the image.

Change the description parameter of the environment configuration.

Question # 100

You train a model by using Azure Machine Learning. You use Azure Blob Storage to store production data.

The model must be re-trained when new data is uploaded to Azure Blob Storage. You need to minimize development and coding.

You need to configure Azure services to develop a re-training solution.

Which Azure services should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 101

You are analyzing the asymmetry in a statistical distribution.

The following image contains two density curves that show the probability distribution of two datasets.

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.
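
Asymmetry in a distribution is commonly quantified with the moment coefficient of skewness, the third central moment divided by the second central moment raised to the power 1.5. A positive value indicates a longer right tail and a negative value a longer left tail. The sketch below uses hypothetical samples, since the exhibit's data is not reproduced here.

```python
def skewness(values):
    """Moment coefficient of skewness (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical samples, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]
left_tailed = [-v for v in right_tailed]

for name, sample in [
    ("symmetric", symmetric),
    ("right_tailed", right_tailed),
    ("left_tailed", left_tailed),
]:
    print(name, round(skewness(sample), 3))
```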

Question # 102

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train a classification model by using a logistic regression algorithm.

You need to create an explainer that you can use to retrieve the required global and local feature importance values.

Solution: Create a PFIExplainer.

Does the solution meet the goal?

Yes

Question # 103

You manage an Azure Machine Learning workspace named workspace1.

You must register an Azure Blob storage datastore in workspace1 by using an access key. You develop Python SDK v2 code to import all modules required to register the datastore.

You need to complete the Python SDK v2 code to define the datastore.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 104

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Azure HDInsight with Spark MLlib

Azure Cognitive Services

Azure Machine Learning Studio

Microsoft Machine Learning Server

Question # 105

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 106

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 107

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 108

You need to resolve the local machine learning pipeline performance issue. What should you do?

Increase Graphic Processing Units (GPUs).

Increase the learning rate.

Increase the training iterations.

Increase Central Processing Units (CPUs).

Question # 109

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 110

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 111

You need to implement a new cost factor scenario for the ad response models as illustrated in the performance curve exhibit.

Which technique should you use?

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Question # 112

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

Streaming

Weight

Batch

Cosine

Question # 113

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

Apply an analysis of variance (ANOVA).

Apply a Pearson correlation coefficient.

Apply a Spearman correlation coefficient.

Apply a linear discriminant analysis.

Question # 114

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 115

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 116

You need to implement a model development strategy to determine a user's tendency to respond to an ad.

Which technique should you use?

Use a Relative Expression Split module to partition the data based on centroid distance.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on distance travelled to the event.

Use a Split Rows module to partition the data based on centroid distance.

Question # 117

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 118

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 119

You need to select a feature extraction method.

Which method should you use?

Mutual information

Mood's median test

Kendall correlation

Permutation Feature Importance

Question # 120

You need to replace the missing data in the AccessibilityToHighway columns.

How should you configure the Clean Missing Data module? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 121

You need to configure the Edit Metadata module so that the structure of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 122

You need to implement early stopping criteria as suited in the model training requirements.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Question # 123

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Question # 124

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 125

You need to configure the Filter Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Question # 126

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Question # 127

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.
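
One common convention for quantifying outliers, and the one box plots typically draw, flags values more than 1.5 times the interquartile range beyond the first or third quartile. The sketch below applies that rule to a hypothetical Age column; the values and the 1.5 multiplier are assumptions for illustration.

```python
from statistics import quantiles

# Hypothetical Age values, for illustration only.
ages = [22, 25, 27, 29, 31, 33, 35, 38, 41, 120]

# Quartiles of the column; q2 is the median.
q1, q2, q3 = quantiles(ages, n=4)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [a for a in ages if a < lower_fence or a > upper_fence]
print(f"fences=({lower_fence}, {upper_fence}) outliers={outliers}")
```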

Question # 128

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 129

You need to select a feature extraction method.

Which method should you use?

Spearman correlation

Mutual information

Mann-Whitney test

Pearson's correlation

Question # 130

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.
