

DSA-C02 SnowPro Advanced: Data Scientist Certification Exam Questions and Answers

Question # 4

What can a Snowflake data scientist do in the Snowflake Marketplace as a provider?

A.

Publish listings for free-to-use datasets to generate interest and new opportunities among the Snowflake customer base.

B.

Publish listings for datasets that can be customized for the consumer.

C.

Share live datasets securely and in real time without creating copies of the data or imposing data integration tasks on the consumer.

D.

Eliminate the costs of building and maintaining APIs and data pipelines to deliver data to customers.

Question # 5

Which command manually triggers a single run of a scheduled task (either a standalone task or the root task in a DAG) independent of the schedule defined for the task?

A.

RUN TASK

B.

CALL TASK

C.

EXECUTE TASK

D.

RUN ROOT TASK

Question # 6

What is the formula for measuring skewness in a dataset?

A.

MEAN - MEDIAN

B.

MODE - MEDIAN

C.

(3(MEAN - MEDIAN))/ STANDARD DEVIATION

D.

(MEAN - MODE)/ STANDARD DEVIATION
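
The two standardized expressions among the options are commonly known as Pearson's second (median) and first (mode) skewness coefficients. A minimal sketch of both, using only Python's standard library; the function names and the sample data are illustrative assumptions, not part of the exam material:

```python
import statistics


def pearson_median_skewness(values):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return 3 * (mean - median) / sd


def pearson_mode_skewness(values):
    """Pearson's first coefficient: (mean - mode) / standard deviation."""
    mean = statistics.mean(values)
    mode = statistics.mode(values)
    sd = statistics.stdev(values)
    return (mean - mode) / sd


# Hypothetical right-skewed sample: one large value pulls the mean above the median.
sample = [1, 2, 2, 3, 10]
median_skew = pearson_median_skewness(sample)
mode_skew = pearson_mode_skewness(sample)

# A symmetric sample has mean == median, so the median-based coefficient is zero.
symmetric_skew = pearson_median_skewness([1, 2, 3, 4, 5])
```

A positive value suggests a longer right tail and a negative value a longer left tail. The unscaled differences in options A and B depend on the units of the data, which is why the coefficients divide by the standard deviation.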

Question # 7

Which of the following are key actions in the data collection phase of machine learning?

A.

Label

B.

Ingest and Aggregate

C.

Probability

D.

Measure

Question # 8

Which of the following is an incorrect statement about providers of a Direct Share?

A.

A data provider is any Snowflake account that creates shares and makes them available to other Snowflake accounts to consume.

B.

As a data provider, you share a database with one or more Snowflake accounts.

C.

You can create as many shares as you want, and add as many accounts to a share as you want.

D.

If you want to provide a share to many accounts, you can do the same via Direct Share.

Question # 9

Consider a data frame df with 10 rows and index [ 'r1', 'r2', 'r3', 'row4', 'row5', 'row6', 'r7', 'r8', 'r9', 'row10']. What does the expression g = df.groupby(df.index.str.len()) do?

A.

Groups df based on index values

B.

Groups df based on length of each index value

C.

Groups df based on index strings

D.

Data frames cannot be grouped by index values. Hence it results in Error.
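
To see what the expression in this question does, it can help to run it on a small frame. The sketch below assumes pandas is installed; the columns A and B and their values are hypothetical, since the question only specifies the index:

```python
import pandas as pd

# Index taken from the question; the column contents are made up for illustration.
index = ['r1', 'r2', 'r3', 'row4', 'row5', 'row6', 'r7', 'r8', 'r9', 'row10']
df = pd.DataFrame({'A': range(10), 'B': range(10, 20)}, index=index)

# df.index.str.len() yields one integer per row: the character length of its label.
label_lengths = df.index.str.len()

# Passing that array to groupby uses the lengths as the grouping keys.
g = df.groupby(label_lengths)

# Number of rows that fall into each length bucket.
group_sizes = g.size().to_dict()
print(group_sizes)
```

With this index, labels such as 'r1' have length 2, 'row4' has length 4, and 'row10' has length 5, so the rows fall into three groups.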

Question # 10

Which Python method can a data scientist use to remove duplicates?

A.

remove_duplicates()

B.

duplicates()

C.

drop_duplicates()

D.

clean_duplicates()
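
Of the names listed, drop_duplicates() is the one that exists on a pandas DataFrame. A short sketch of how it behaves, assuming pandas is installed; the frame and its column names are hypothetical:

```python
import pandas as pd

# Hypothetical data: the second row repeats the first; the last two rows share an id.
df = pd.DataFrame({
    'id': [1, 1, 2, 3, 3],
    'value': ['a', 'a', 'b', 'c', 'd'],
})

# With no arguments, rows are dropped only when every column matches an earlier row.
deduped = df.drop_duplicates()

# subset limits the comparison to the given columns; keep='first' retains the
# first occurrence of each id and drops the later ones.
deduped_by_id = df.drop_duplicates(subset='id', keep='first')
```

The method returns a new DataFrame by default and leaves df unchanged.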

Question # 11

Select the correct statements regarding normalization.

A.

The normalization technique uses the minimum and maximum values for scaling.

B.

The normalization technique uses the mean and standard deviation for scaling.

C.

Scikit-Learn provides a transformer RecommendedScaler for Normalization.

D.

Normalization is affected by outliers.
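
The difference between min-max scaling and mean/standard-deviation scaling can be sketched with plain NumPy. This is an illustrative sketch with made-up data, not a reference implementation; in scikit-learn the corresponding transformers are MinMaxScaler and StandardScaler:

```python
import numpy as np

# Hypothetical feature with one large outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Min-max normalization: rescale to [0, 1] using the minimum and maximum.
min_max = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): center on the mean, divide by the standard deviation.
z_score = (x - x.mean()) / x.std()

print(min_max)
print(z_score)
```

Because the outlier defines the maximum, the four ordinary values are squeezed into a narrow band near 0 after min-max scaling, which illustrates how sensitive this technique is to outliers.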

Question # 12

Which metric is not used for evaluating classification models?

A.

Recall

B.

Accuracy

C.

Mean absolute error

D.

Precision
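
Recall, accuracy, and precision are all computed from predicted versus true class labels, whereas mean absolute error averages numeric differences and is typically used for regression. A small hand-rolled sketch of each, with made-up labels and values:

```python
# Hypothetical binary labels (1 = positive class) and model predictions.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are correct
precision = tp / (tp + fp)          # share of predicted positives that are correct
recall = tp / (tp + fn)             # share of actual positives that were found

# Mean absolute error on hypothetical continuous targets, for contrast.
r_true = [2.0, 3.5, 5.0]
r_pred = [2.5, 3.0, 4.0]
mae = sum(abs(t - p) for t, p in zip(r_true, r_pred)) / len(r_true)
```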

Question # 13

Consider a data frame df with 10 rows and index ['r1', 'r2', 'r3', 'row4', 'row5', 'row6', 'r7', 'r8', 'r9', 'row10']. What does the aggregate method shown in the code below do?

g = df.groupby(df.index.str.len())

g.aggregate({'A':len, 'B':np.sum})

A.

Computes Sum of column A values

B.

Computes length of column A

C.

Computes length of column A and Sum of Column B values of each group

D.

Computes length of column A and Sum of Column B values
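
The behavior of the aggregate call can be checked on a small frame. The sketch below assumes pandas is installed and uses hypothetical values for columns A and B. It passes the string 'sum' in place of np.sum, a substitution made here because recent pandas versions emit a deprecation warning when NumPy reducers are passed to aggregate:

```python
import pandas as pd

# Index taken from the question; the column values are made up for illustration.
index = ['r1', 'r2', 'r3', 'row4', 'row5', 'row6', 'r7', 'r8', 'r9', 'row10']
df = pd.DataFrame({'A': range(10), 'B': range(10, 20)}, index=index)

# Group rows by the character length of their index label (2, 4, or 5).
g = df.groupby(df.index.str.len())

# The dict maps each column to its own reducer, applied once per group:
# len counts the rows of A in the group, 'sum' adds up B within the group.
result = g.aggregate({'A': len, 'B': 'sum'})
print(result)
```

The result has one row per group, so both the length of A and the sum of B are reported separately for each index-length bucket.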

Question # 14

Which tools help a data scientist manage the ML lifecycle and model versioning?

A.

MLflow

B.

Pachyderm

C.

Albert

D.

CRUX

Question # 15

Select the correct mappings:

I. W (weights or coefficients) of independent variables in the linear regression model --> Model Parameter

II. K in the K-Nearest Neighbour algorithm --> Model Hyperparameter

III. Learning rate for training a neural network --> Model Hyperparameter

IV. Batch Size --> Model Parameter

A.

I,II

B.

I,II,III

C.

III,IV

D.

II,III,IV

Question # 16

Which object records data manipulation language (DML) changes made to tables, including inserts, updates, and deletes, as well as metadata about each change, so that data science pipelines can act on the changed data?

A.

Task

B.

Dynamic tables

C.

Stream

D.

Tags

E.

Delta

F.

OFFSET

Question # 17

Secure Data Sharing does not let you share which of the following objects in a database in your account with other Snowflake accounts?

A.

Sequences

B.

Tables

C.

External tables

D.

Secure UDFs

Question # 18

Which type of machine learning do data scientists generally use to solve classification and regression problems?

A.

Supervised

B.

Unsupervised

C.

Reinforcement Learning

D.

Instructor Learning

E.

Regression Learning

Question # 19

Which of the following additional metadata columns does a stream contain that can be used to build efficient data science pipelines and help transform only the new or modified data?

A.

METADATA$ACTION

B.

METADATA$FILE_ID

C.

METADATA$ISUPDATE

D.

METADATA$DELETE

E.

METADATA$ROW_ID
