Databricks Certified Data Analyst Associate Exam Questions and Answers

Question # 4

Data professionals with varying titles use the Databricks SQL service as the primary touchpoint with the Databricks Lakehouse Platform. However, some users will use other services like Databricks Machine Learning or Databricks Data Science and Engineering.

Which of the following roles uses Databricks SQL as a secondary service while primarily using one of the other services?

A. Business analyst

B. SQL analyst

C. Data engineer

D. Business intelligence analyst

E. Data analyst

Question # 5

Data professionals with varying responsibilities use the Databricks Lakehouse Platform. Which role in the Databricks Lakehouse Platform uses Databricks SQL as its primary service?

A. Data scientist

B. Data engineer

C. Platform architect

D. Business analyst

Question # 6

What is a benefit of using Databricks SQL for business intelligence (BI) analytics projects instead of using third-party BI tools?

A. Computations, data, and analytical tools on the same platform

B. Advanced dashboarding capabilities

C. Simultaneous multi-user support

D. Automated alerting systems

Question # 7

What describes the variance of a set of values?

A. Variance is a measure of how far a single observed value is from a set of values.

B. Variance is a measure of how far an observed value is from the variable's maximum or minimum value.

C. Variance is a measure of central tendency of a set of values.

D. Variance is a measure of how far a set of values is spread out from the set's central value.
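
As a quick illustration of how variance is computed, the sketch below uses Python's standard `statistics` module. The sample values are made up for this example and do not come from the exam.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Central value (arithmetic mean) of the set.
mean = statistics.fmean(values)

# Population variance: the average squared distance of each value from the mean.
manual_variance = sum((v - mean) ** 2 for v in values) / len(values)

# The standard library computes the same quantity directly.
library_variance = statistics.pvariance(values)

print(mean, manual_variance, library_variance)
```

Squaring the distances keeps positive and negative deviations from cancelling out, so a wider spread always yields a larger variance.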

Question # 8

A data organization has a team of engineers developing data pipelines following the medallion architecture using Delta Live Tables. While the data analysis team working on a project is using gold-layer tables from these pipelines, they need to perform some additional processing of these tables prior to performing their analysis.

Which of the following terms is used to describe this type of work?

A. Data blending

B. Last-mile

C. Data testing

D. Last-mile ETL

E. Data enhancement

Question # 9

In which of the following situations should a data analyst use higher-order functions?

A. When custom logic needs to be applied to simple, unnested data

B. When custom logic needs to be converted to Python-native code

C. When custom logic needs to be applied at scale to array data objects

D. When built-in functions are taking too long to perform tasks

E. When built-in functions need to run through the Catalyst Optimizer
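
For readers unfamiliar with the term, a higher-order function takes another function (usually a lambda) as an argument and applies it to each element of an array. Databricks SQL offers functions of this kind such as `transform` and `filter`. The sketch below is only a plain-Python analogue of that idea, not Databricks SQL itself; the `orders` data and the helper names `transform` and `array_filter` are hypothetical.

```python
from typing import Callable

# Hypothetical rows: each order carries an array of item prices.
orders = [
    {"order_id": 1, "prices": [10.0, 20.0, 5.0]},
    {"order_id": 2, "prices": [3.0, 50.0]},
]


def transform(array: list, fn: Callable) -> list:
    """Apply fn to every element of the array and return a new array."""
    return [fn(x) for x in array]


def array_filter(array: list, predicate: Callable) -> list:
    """Keep only the elements for which predicate returns True."""
    return [x for x in array if predicate(x)]


# Custom logic is passed in as a lambda and applied per array element.
doubled = [transform(row["prices"], lambda p: p * 2) for row in orders]
large = [array_filter(row["prices"], lambda p: p >= 10.0) for row in orders]

print(doubled)
print(large)
```

The point of the pattern is that the nested array stays in place: the custom logic runs on each element without first flattening the data into separate rows.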

Question # 10

Which of the following benefits of using Databricks SQL is provided by Data Explorer?

A. It can be used to run UPDATE queries to update any tables in a database.

B. It can be used to view metadata and data, as well as view/change permissions.

C. It can be used to produce dashboards that allow data exploration.

D. It can be used to make visualizations that can be shared with stakeholders.

E. It can be used to connect to third-party BI tools.

Question # 11

What describes Partner Connect in Databricks?

A. It allows for free use of Databricks partner tools through a common API.

B. It makes multi-directional connections between Databricks and Databricks partners easier.

C. It exposes connection information to third-party tools via Databricks partners.

D. It is a feature that runs Databricks partner tools on a Databricks SQL Warehouse (formerly known as a SQL endpoint).

Question # 12

A data analyst has created a Query in Databricks SQL, and now they want to create two data visualizations from that Query and add both of those data visualizations to the same Databricks SQL Dashboard.

Which of the following steps will they need to take when creating and adding both data visualizations to the Databricks SQL Dashboard?

A. They will need to alter the Query to return two separate sets of results.

B. They will need to add two separate visualizations to the dashboard based on the same Query.

C. They will need to create two separate dashboards.

D. They will need to decide on a single data visualization to add to the dashboard.

E. They will need to copy the Query and create one data visualization per query.

Question # 13

A data analyst wants to create a Databricks SQL dashboard with multiple data visualizations and multiple counters. What must be completed before adding the data visualizations and counters to the dashboard?

A. All data visualizations and counters must be created using Queries.

B. A SQL warehouse (formerly known as SQL endpoint) must be turned on and selected.

C. A markdown-based tile must be added to the top of the dashboard displaying the dashboard's name.

D. The dashboard owner must also be the owner of the queries, data visualizations, and counters.

Question # 14

A business analyst has been asked to create a data entity/object called sales_by_employee. It should always stay up-to-date when new data are added to the sales table. The new entity should have the columns sales_person, which will be the name of the employee from the employees table, and sales, which will be all sales for that particular sales person. Both the sales table and the employees table have an employee_id column that is used to identify the sales person.

Which of the following code blocks will accomplish this task?

[The code blocks for options A) through D) are missing from this text.]

A. Option A

B. Option B

C. Option C

D. Option D

Question # 15

A data analyst has created a user-defined function using the following line of code:

CREATE FUNCTION price(spend DOUBLE, units DOUBLE)
RETURNS DOUBLE
RETURN spend / units;

Which of the following code blocks can be used to apply this function to the customer_spend and customer_units columns of the table customer_summary to create column customer_price?

A. SELECT PRICE customer_spend, customer_units AS customer_price FROM customer_summary

B. SELECT price FROM customer_summary

C. SELECT function(price(customer_spend, customer_units)) AS customer_price FROM customer_summary

D. SELECT double(price(customer_spend, customer_units)) AS customer_price FROM customer_summary

E. SELECT price(customer_spend, customer_units) AS customer_price FROM customer_summary
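
To see how a scalar user-defined function is typically called from a query, the sketch below registers a `price` function and applies it to two columns. It uses Python's built-in `sqlite3` module as a local stand-in, which is an assumption of this example: SQLite is not Databricks SQL, and the function is registered through `create_function` rather than `CREATE FUNCTION`. The table rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the SQL UDF above: price(spend, units) returns spend / units.
conn.create_function("price", 2, lambda spend, units: spend / units)

# Hypothetical table contents, for illustration only.
conn.execute(
    "CREATE TABLE customer_summary (customer_spend REAL, customer_units REAL)"
)
conn.executemany(
    "INSERT INTO customer_summary VALUES (?, ?)",
    [(100.0, 4.0), (45.0, 9.0)],
)

# A scalar UDF is called like a built-in function: name, then arguments in parentheses.
rows = conn.execute(
    "SELECT price(customer_spend, customer_units) AS customer_price "
    "FROM customer_summary"
).fetchall()

print(rows)
conn.close()
```

The function is evaluated once per row, and the `AS` alias names the resulting column.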

Question # 16

Which location can be used to determine the owner of a managed table?

A. Review the Owner field in the table page using Catalog Explorer

B. Review the Owner field in the database page using Data Explorer

C. Review the Owner field in the schema page using Data Explorer

D. Review the Owner field in the table page using the SQL Editor

Question # 17

How can a data analyst determine if query results were pulled from the cache?

A. Go to the Query History tab and click on the text of the query. The slideout shows if the results came from the cache.

B. Go to the Alerts tab and check the Cache Status alert.

C. Go to the Queries tab and click on Cache Status. The status will be green if the results from the last run came from the cache.

D. Go to the SQL Warehouse (formerly SQL Endpoints) tab and click on Cache. The Cache file will show the contents of the cache.

E. Go to the Data tab and click Last Query. The details of the query will show if the results came from the cache.

Question # 18

A data analyst is working with gold-layer tables to complete an ad-hoc project. A stakeholder has provided the analyst with an additional dataset that can be used to augment the gold-layer tables already in use.

Which of the following terms is used to describe this data augmentation?

A. Data testing

B. Ad-hoc improvements

C. Last-mile

D. Last-mile ETL

E. Data enhancement

Question # 19

A data analyst has set up a SQL query to run every four hours on a SQL endpoint, but the SQL endpoint is taking too long to start up with each run.

Which of the following changes can the data analyst make to reduce the start-up time for the endpoint while managing costs?

A. Reduce the SQL endpoint cluster size

B. Increase the SQL endpoint cluster size

C. Turn off the Auto stop feature

D. Increase the minimum scaling value

E. Use a Serverless SQL endpoint
