

Databricks Certified Data Engineer Associate Exam Questions and Answers

Question # 4

Which of the following commands can be used to write data into a Delta table while avoiding the writing of duplicate records?

A.

DROP

B.

IGNORE

C.

MERGE

D.

APPEND

E.

INSERT
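
MERGE is the only command listed that can insert new records while skipping those that already exist in the target. A minimal sketch, assuming a hypothetical target Delta table target_sales keyed on sale_id and a hypothetical staging view sales_updates:

spark.sql("""
    MERGE INTO target_sales AS t
    USING sales_updates AS s
    ON t.sale_id = s.sale_id
    WHEN NOT MATCHED THEN INSERT *
""")

Rows whose sale_id already exists in the target are left untouched, so re-running the load does not write duplicates.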

Question # 5

Which of the following code blocks will remove the rows where the value in column age is greater than 25 from the existing Delta table my_table and save the updated table?

A.

SELECT * FROM my_table WHERE age > 25;

B.

UPDATE my_table WHERE age > 25;

C.

DELETE FROM my_table WHERE age > 25;

D.

UPDATE my_table WHERE age <= 25;

E.

DELETE FROM my_table WHERE age <= 25;
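
As a quick illustration, DELETE removes the matching rows in place and commits the change to the Delta table, so no separate save step is needed. Run from Python, using the table from the question:

spark.sql("DELETE FROM my_table WHERE age > 25")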

Question # 6

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.

The code block used by the data engineer is below:

If the data engineer only wants the query to execute a micro-batch to process data every 5 seconds, which of the following lines of code should the data engineer use to fill in the blank?

A.

trigger("5 seconds")

B.

trigger()

C.

trigger(once="5 seconds")

D.

trigger(processingTime="5 seconds")

E.

trigger(continuous="5 seconds")
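
For reference, a minimal streaming write with a 5-second micro-batch trigger could look like the sketch below; the source and target table names and the checkpoint path are hypothetical:

(spark.readStream
    .table("source_table")                                  # hypothetical source
    .writeStream
    .trigger(processingTime="5 seconds")                    # run a micro-batch every 5 seconds
    .option("checkpointLocation", "/tmp/checkpoints/demo")  # hypothetical path
    .toTable("target_table"))                               # hypothetical target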

Question # 7

A data engineer needs to provide access to a group named manufacturing-team. The team needs privileges to create tables in the production schema.

Which set of SQL commands will grant the group manufacturing-team the privileges to create tables in a schema named production, under the parent catalog named manufacturing, while following the principle of least privilege?

[Options A–D are shown as SQL images in the source and are not reproduced here.]

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question # 8

Which method should a Data Engineer apply to ensure Workflows are being triggered on schedule?

A.

Scheduled Workflows require an always-running cluster, which is more expensive but reduces processing latency.

B.

Scheduled Workflows process data as it arrives at configured sources.

C.

Scheduled Workflows can reduce resource consumption and expense since the cluster runs only long enough to execute the pipeline.

D.

Scheduled Workflows run continuously until manually stopped.

Question # 9

A data engineer is working in a Python notebook on Databricks to process data, but notices that the output is not as expected. The data engineer wants to investigate the issue by stepping through the code and checking the values of certain variables during execution.

Which tool should the data engineer use to inspect the code execution and variables in real-time?

A.

Python Notebook Interactive Debugger

B.

Cluster Logs

C.

SQL Analytics

D.

Job Execution Dashboard

Question # 10

A data engineer is developing an ETL process based on Spark SQL. The execution fails. The data engineer checks the Spark UI and can see the errors as follows:

Which two corrective actions should the data engineer perform to resolve this issue?

Choose 2 answers.

A.

Narrow the filters in order to collect less data in the query

B.

Upsize the worker nodes and activate autoshuffle partitions

C.

Upsize the driver node and deactivate autoshuffle partitions

D.

Cache the dataset in order to boost the query performance

E.

Fix the shuffle partitions to 50 to ensure the allocation

Question # 11

A data engineer is processing ingested streaming tables and needs to filter out NULL values in the order_datetime column from the raw streaming table orders_raw and store the results in a new table orders_valid using DLT.

Which code snippet should the data engineer use?

[Options A–D are shown as code images in the source and are not reproduced here.]

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question # 12

A dataset has been defined using Delta Live Tables and includes an expectations clause:

CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION FAIL UPDATE

What is the expected behavior when a batch of data containing data that violates these constraints is processed?

A.

Records that violate the expectation cause the job to fail.

B.

Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.

C.

Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.

D.

Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.
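
In DLT's Python syntax, the same fail-on-violation expectation can be declared with the expect_or_fail decorator; the function and upstream dataset names below are hypothetical:

import dlt

@dlt.table
@dlt.expect_or_fail("valid_timestamp", "timestamp > '2020-01-01'")
def validated_events():
    # A single violating row causes the update to fail.
    return dlt.read_stream("events_raw")   # hypothetical upstream dataset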

Question # 13

Which of the following must be specified when creating a new Delta Live Tables pipeline?

A.

A key-value pair configuration

B.

The preferred DBU/hour cost

C.

A path to cloud storage location for the written data

D.

A location of a target database for the written data

E.

At least one notebook library to be executed

Question # 14

A data engineer and data analyst are working together on a data pipeline. The data engineer is working on the raw, bronze, and silver layers of the pipeline using Python, and the data analyst is working on the gold layer of the pipeline using SQL. The raw source of the pipeline is a streaming input. They now want to migrate their pipeline to use Delta Live Tables.

Which change will need to be made to the pipeline when migrating to Delta Live Tables?

A.

The pipeline can have different notebook sources in SQL & Python.

B.

The pipeline will need to be written entirely in SQL.

C.

The pipeline will need to be written entirely in Python.

D.

The pipeline will need to use a batch source in place of a streaming source.

Question # 15

A data engineer has a Python notebook in Databricks, but they need to use SQL to accomplish a specific task within a cell. They still want all of the other cells to use Python without making any changes to those cells.

Which of the following describes how the data engineer can use SQL within a cell of their Python notebook?

A.

It is not possible to use SQL in a Python notebook

B.

They can attach the cell to a SQL endpoint rather than a Databricks cluster

C.

They can simply write SQL syntax in the cell

D.

They can add %sql to the first line of the cell

E.

They can change the default language of the notebook to SQL
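
To illustrate the magic command: placing %sql on the first line switches only that cell to SQL while the notebook's default language remains Python. The table name below is hypothetical:

%sql
SELECT * FROM my_table LIMIT 10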

Question # 16

In which of the following scenarios should a data engineer select a Task in the Depends On field of a new Databricks Job Task?

A.

When another task needs to be replaced by the new task

B.

When another task needs to fail before the new task begins

C.

When another task has the same dependency libraries as the new task

D.

When another task needs to use as little compute resources as possible

E.

When another task needs to successfully complete before the new task begins

Question # 17

What is stored in a Databricks customer's cloud account?

A.

Data

B.

Cluster management metadata

C.

Databricks web application

D.

Notebooks

Question # 18

A data engineer has joined an existing project and they see the following query in the project repository:

CREATE STREAMING LIVE TABLE loyal_customers AS

SELECT customer_id

FROM STREAM(LIVE.customers)

WHERE loyalty_level = 'high';

Which of the following describes why the STREAM function is included in the query?

A.

The STREAM function is not needed and will cause an error.

B.

The table being created is a live table.

C.

The customers table is a streaming live table.

D.

The customers table is a reference to a Structured Streaming query on a PySpark DataFrame.

E.

The data in the customers table has been updated since its last run.

Question # 19

A data engineer runs a statement every day to copy the previous day’s sales into the table transactions. Each day’s sales are in their own file in the location "/transactions/raw".

Today, the data engineer runs the following command to complete this task:

After running the command today, the data engineer notices that the number of records in table transactions has not changed.

Which of the following describes why the statement might not have copied any new records into the table?

A.

The format of the files to be copied were not included with the FORMAT_OPTIONS keyword.

B.

The names of the files to be copied were not included with the FILES keyword.

C.

The previous day’s file has already been copied into the table.

D.

The PARQUET file format does not support COPY INTO.

E.

The COPY INTO statement requires the table to be refreshed to view the copied rows.
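
For background, COPY INTO is idempotent: files it has already loaded into the target table are skipped on later runs, which is why re-running it against the same file adds no records. A sketch of the statement, assuming Parquet files since the original command image is not reproduced:

spark.sql("""
    COPY INTO transactions
    FROM '/transactions/raw'
    FILEFORMAT = PARQUET
""")

Adding COPY_OPTIONS ('force' = 'true') would reload files that were previously ingested.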

Question # 20

A data engineer is migrating pipeline tasks to reduce operational toil. The workspace uses Unity Catalog and is in a region that supports serverless. The engineer wants Databricks to auto-select instance types, manage scaling, apply Photon, and handle runtime upgrades automatically for job runs.

How should the data engineer meet this requirement while adhering to Databricks constraints?

A.

Use a Pro SQL warehouse and schedule Python notebook tasks to execute as pipeline steps.

B.

Use an all-purpose cluster with cluster policies to enforce standard sizes and enable autoscaling.

C.

Create a job with a single-task job cluster and manually set the instance families and minimum/maximum workers.

D.

Run the job on a serverless compute for workflows configuration, ensuring Unity Catalog is enabled and regional support is available.

Question # 21

A Databricks workflow fails at the last stage due to an error in a notebook. This workflow runs daily. The data engineer fixes the mistake and wants to rerun the pipeline. This workflow is very costly and time-intensive to run.

Which action should the data engineer do in order to minimise downtime and cost?

A.

Switch to another cluster

B.

Repair run

C.

Re-run the entire workflow

D.

Restart the cluster

Question # 22

A data engineering team is using Kafka to capture event data and then ingest it into Databricks. The team wants to be able to see these historical events. Medallion architecture is already in place. The team wants to be mindful of costs.

Where should this historical event data be stored?

A.

Gold

B.

Silver

C.

Bronze

D.

Raw layer

Question # 23

A data engineer is running code in a Databricks Repo that is cloned from a central Git repository. A colleague of the data engineer informs them that changes have been made and synced to the central Git repository. The data engineer now needs to sync their Databricks Repo to get the changes from the central Git repository.

Which of the following Git operations does the data engineer need to run to accomplish this task?

A.

Merge

B.

Push

C.

Pull

D.

Commit

E.

Clone

Question # 24

Identify how the count_if function and count behave when the counted column contains NULL values.

Consider a table random_values with the below data in column col1:

col1
0
1
2
NULL
2
3

What would be the output of the below query?

SELECT count_if(col1 > 1) AS count_a, count(*) AS count_b, count(col1) AS count_c FROM random_values;

A.

3 6 5

B.

4 6 5

C.

3 6 6

D.

4 6 6
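
The behavior is easy to verify; a self-contained sketch that rebuilds the sample column and runs the corrected query:

df = spark.createDataFrame([(0,), (1,), (2,), (None,), (2,), (3,)], "col1 INT")
df.createOrReplaceTempView("random_values")
spark.sql("""
    SELECT count_if(col1 > 1) AS count_a,   -- counts 2, 2, 3     -> 3
           count(*)           AS count_b,   -- counts every row   -> 6
           count(col1)        AS count_c    -- skips the NULL     -> 5
    FROM random_values
""").show()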

Question # 25

Which of the following describes a scenario in which a data engineer will want to use a single-node cluster?

A.

When they are working interactively with a small amount of data

B.

When they are running automated reports to be refreshed as quickly as possible

C.

When they are working with SQL within Databricks SQL

D.

When they are concerned about the ability to automatically scale with larger data

E.

When they are manually running reports with a large amount of data

Question # 26

A departing platform owner currently holds ownership of multiple catalogs and controls storage credentials and external locations. The data engineer wants to ensure continuity: transfer catalog ownership to the platform team group, delegate ongoing privilege management, and retain the ability to receive and share data via Delta Sharing.

Which role must be in place to perform these actions across the metastore?

A.

Account Admin, because account admins can only create metastores but cannot change ownership of catalogs.

B.

Workspace Admin, because workspace admins can transfer ownership of any Unity Catalog object.

C.

Metastore Admin, because metastore admins can transfer ownership and manage privileges across all metastore objects, including shares and recipients.

D.

Catalog Owner, because catalog owners can transfer any object in any catalog in the metastore.

Question # 27

A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.

Which of the following commands could the data engineering team use to access sales in PySpark?

A.

SELECT * FROM sales

B.

There is no way to share data between PySpark and SQL.

C.

spark.sql("sales")

D.

spark.delta.table("sales")

E.

spark.table("sales")
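
For illustration, spark.table returns a registered table as a DataFrame, so the analysts' SQL table and the engineers' PySpark tests can operate on the same data; the column checked below is hypothetical:

df = spark.table("sales")
assert df.filter("customer_id IS NULL").count() == 0   # hypothetical cleanliness test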

Question # 28

Which of the following Structured Streaming queries is performing a hop from a Silver table to a Gold table?

[Options A–E are shown as code images in the source and are not reproduced here.]

Question # 29

A data engineer streams customer orders into a Kafka topic (orders_topic) and is currently writing the ingestion script of a DLT pipeline. The data engineer needs to ingest the data from the Kafka brokers into DLT using Databricks.

What is the correct code for ingesting the data?

[Options A–D are shown as code images in the source and are not reproduced here.]

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question # 30

A data engineer is reviewing the documentation on audit logs in Databricks for compliance purposes and needs to understand the format in which audit logs output events.

How are events formatted in Databricks audit logs?

A.

In Databricks, audit logs output events in a plain text format.

B.

In Databricks, audit logs output events in a JSON format.

C.

In Databricks, audit logs output events in an XML format.

D.

In Databricks, audit logs output events in a CSV format.

Question # 31

A data engineer is using the following code block as part of a batch ingestion pipeline to read from a table:

Which of the following changes needs to be made so this code block will work when the transactions table is a stream source?

A.

Replace predict with a stream-friendly prediction function

B.

Replace schema(schema) with option ("maxFilesPerTrigger", 1)

C.

Replace "transactions" with the path to the location of the Delta table

D.

Replace format("delta") with format("stream")

E.

Replace spark.read with spark.readStream
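
To make the contrast concrete, the only structural change needed for a streaming source is swapping the batch reader for the streaming one; a minimal sketch using the transactions table from the question:

batch_df  = spark.read.table("transactions")        # batch read
stream_df = spark.readStream.table("transactions")  # same table as a stream source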

Question # 32

A data analyst has a series of queries in a SQL program. The data analyst wants this program to run every day. They only want the final query in the program to run on Sundays. They ask for help from the data engineering team to complete this task.

Which of the following approaches could be used by the data engineering team to complete this task?

A.

They could submit a feature request with Databricks to add this functionality.

B.

They could wrap the queries using PySpark and use Python’s control flow system to determine when to run the final query.

C.

They could only run the entire program on Sundays.

D.

They could automatically restrict access to the source table in the final query so that it is only accessible on Sundays.

E.

They could redesign the data model to separate the data used in the final query into a new table.
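
A minimal sketch of the control-flow approach, with hypothetical placeholder queries standing in for the analyst's program:

from datetime import date

def run_daily_queries():
    spark.sql("SELECT 1").collect()   # placeholder for the everyday queries

def run_final_query():
    spark.sql("SELECT 2").collect()   # placeholder for the Sunday-only query

run_daily_queries()
if date.today().weekday() == 6:       # Python's weekday(): Monday=0 ... Sunday=6
    run_final_query()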

Question # 33

A data engineer wants to create a data entity from a couple of tables. The data entity must be used by other data engineers in other sessions. It also must be saved to a physical location.

Which of the following data entities should the data engineer create?

A.

Database

B.

Function

C.

View

D.

Temporary view

E.

Table

Question # 34

A dataset has been defined using Delta Live Tables and includes an expectations clause:

CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW

What is the expected behavior when a batch of data containing data that violates these constraints is processed?

A.

Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.

B.

Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.

C.

Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.

D.

Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.

E.

Records that violate the expectation cause the job to fail.
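
The Python DLT equivalent of the drop-on-violation clause is the expect_or_drop decorator; the function and upstream dataset names below are hypothetical:

import dlt

@dlt.table
@dlt.expect_or_drop("valid_timestamp", "timestamp > '2020-01-01'")
def cleaned_events():
    # Violating rows are dropped and recorded in the event log; the update continues.
    return dlt.read_stream("events_raw")   # hypothetical upstream dataset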

Question # 35

A data engineering team has noticed that their Databricks SQL queries are running too slowly when they are submitted to a non-running SQL endpoint. The data engineering team wants this issue to be resolved.

Which of the following approaches can the team use to reduce the time it takes to return results in this scenario?

A.

They can turn on the Serverless feature for the SQL endpoint and change the Spot Instance Policy to "Reliability Optimized."

B.

They can turn on the Auto Stop feature for the SQL endpoint.

C.

They can increase the cluster size of the SQL endpoint.

D.

They can turn on the Serverless feature for the SQL endpoint.

E.

They can increase the maximum bound of the SQL endpoint's scaling range

Question # 36

The Delta transaction log for the ‘students’ table is shown using the ‘DESCRIBE HISTORY students’ command. A Data Engineer needs to query the table as it existed before the UPDATE operation listed in the log.

Which command should the Data Engineer use to achieve this? (Choose two.)

A.

SELECT * FROM students@v4

B.

SELECT * FROM students TIMESTAMP AS OF ‘2024-04-22T14:32:47.000+00:00’

C.

SELECT * FROM students FROM HISTORY VERSION AS OF 3

D.

SELECT * FROM students VERSION AS OF 5

E.

SELECT * FROM students TIMESTAMP AS OF ‘2024-04-22T14:32:58.000+00:00’
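
Both time-travel forms can also be issued from Python; the version number and timestamp below are hypothetical:

df_by_version   = spark.sql("SELECT * FROM students VERSION AS OF 3")
df_by_timestamp = spark.sql(
    "SELECT * FROM students TIMESTAMP AS OF '2024-04-22T14:32:47.000+00:00'"
)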

Question # 37

Which of the following describes the relationship between Bronze tables and raw data?

A.

Bronze tables contain less data than raw data files.

B.

Bronze tables contain more truthful data than raw data.

C.

Bronze tables contain aggregates while raw data is unaggregated.

D.

Bronze tables contain a less refined view of data than raw data.

E.

Bronze tables contain raw data with a schema applied.

Question # 38

A team creates YAML manifests that declare jobs, resources, and dependencies, then deploys them to Databricks using the Databricks CLI. The deployment succeeds.

Which feature are they using?

A.

Databricks Asset Bundles

B.

GitHub

C.

Terraform

D.

DataOps

Question # 39

Which of the following describes the storage organization of a Delta table?

A.

Delta tables are stored in a single file that contains data, history, metadata, and other attributes.

B.

Delta tables store their data in a single file and all metadata in a collection of files in a separate location.

C.

Delta tables are stored in a collection of files that contain data, history, metadata, and other attributes.

D.

Delta tables are stored in a collection of files that contain only the data stored within the table.

E.

Delta tables are stored in a single file that contains only the data stored within the table.

Question # 40

Calculate the total sales amount for each region and store the results in a new DataFrame called region_sales.

Given the expected result:

Which code will generate the expected result?

A.

region_sales = sales_df.groupBy("region").agg(sum("sales_amount").alias("total_sales_amount"))

B.

region_sales = sales_df.sum("sales_amount").groupBy("region").alias("total_sales_amount")

C.

region_sales = sales_df.groupBy("category").sum("sales_amount").alias("total_sales_amount")

D.

region_sales = sales_df.agg(sum("sales_amount").groupBy("region").alias("total_sales_amount"))
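
A runnable sketch of the groupBy-then-agg pattern, using hypothetical sample rows:

from pyspark.sql.functions import sum as _sum

sales_df = spark.createDataFrame(
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],   # hypothetical rows
    ["region", "sales_amount"],
)
region_sales = sales_df.groupBy("region").agg(
    _sum("sales_amount").alias("total_sales_amount")
)
region_sales.show()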

Question # 41

A data engineer needs to process SQL queries on a large dataset with fluctuating workloads. The workload requires automatic scaling based on the volume of queries, without the need to manage or provision infrastructure. The solution should be cost-efficient and charge only for the compute resources used during query execution.

Which compute option should the data engineer use?

A.

Databricks SQL Analytics

B.

Databricks Jobs

C.

Databricks Runtime for ML

D.

Serverless SQL Warehouse

Question # 42

A data engineer needs access to a table new_table, but they do not have the correct permissions. They can ask the table owner for permission, but they do not know who the table owner is.

Which of the following approaches can be used to identify the owner of new_table?

A.

Review the Permissions tab in the table's page in Data Explorer

B.

All of these options can be used to identify the owner of the table

C.

Review the Owner field in the table's page in Data Explorer

D.

Review the Owner field in the table's page in the cloud storage solution

E.

There is no way to identify the owner of the table

Question # 43

A data analysis team has noticed that their Databricks SQL queries are running too slowly when connected to their always-on SQL endpoint. They claim that this issue is present when many members of the team are running small queries simultaneously. They ask the data engineering team for help. The data engineering team notices that each of the team’s queries uses the same SQL endpoint.

Which of the following approaches can the data engineering team use to improve the latency of the team’s queries?

A.

They can increase the cluster size of the SQL endpoint.

B.

They can increase the maximum bound of the SQL endpoint’s scaling range.

C.

They can turn on the Auto Stop feature for the SQL endpoint.

D.

They can turn on the Serverless feature for the SQL endpoint.

E.

They can turn on the Serverless feature for the SQL endpoint and change the Spot Instance Policy to “Reliability Optimized.”

Question # 44

Which of the following SQL keywords can be used to convert a table from a long format to a wide format?

A.

PIVOT

B.

CONVERT

C.

WHERE

D.

TRANSFORM

E.

SUM
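
A short PIVOT sketch that turns a long-format table wide; the table and column names are hypothetical:

spark.sql("""
    SELECT * FROM sales_long
    PIVOT (
        SUM(amount) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
    )
""")

Each quarter value becomes its own column, with SUM(amount) filling the cells.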

Question # 45

A data engineer needs to create a table in Databricks using data from their organization’s existing SQLite database.

They run the following command:

Which of the following lines of code fills in the above blank to successfully complete the task?

A.

org.apache.spark.sql.jdbc

B.

autoloader

C.

DELTA

D.

sqlite

E.

org.apache.spark.sql.sqlite
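
For context, a table over an external JDBC source is declared with the JDBC data source. A sketch assuming a hypothetical SQLite file path and source table, and assuming a SQLite JDBC driver is installed on the cluster:

spark.sql("""
    CREATE TABLE employees_jdbc
    USING org.apache.spark.sql.jdbc
    OPTIONS (
        url 'jdbc:sqlite:/dbfs/tmp/company.db',   -- hypothetical connection URL
        dbtable 'employees'                       -- hypothetical source table
    )
""")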

Question # 46

A data engineer needs to parse only png files in a directory that contains files with different suffixes. Which code should the data engineer use to achieve this task?

[Options A–D are shown as code images in the source and are not reproduced here.]

A.

Option A

B.

Option B

C.

Option C

D.

Option D
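
Since the original options are images, here is one common pattern for reading only .png files: Spark's binaryFile source combined with a glob filter. The directory path is hypothetical:

df = (spark.read
    .format("binaryFile")
    .option("pathGlobFilter", "*.png")   # keep only files ending in .png
    .load("/mnt/images"))                # hypothetical directory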

Question # 47

A data engineer is designing a data pipeline. The source system generates files in a shared directory that is also used by other processes. As a result, the files should be kept as is and will accumulate in the directory. The data engineer needs to identify which files are new since the previous run in the pipeline, and set up the pipeline to only ingest those new files with each run.

Which of the following tools can the data engineer use to solve this problem?

A.

Unity Catalog

B.

Delta Lake

C.

Databricks SQL

D.

Data Explorer

E.

Auto Loader
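
A minimal Auto Loader sketch for ingesting only files that arrived since the previous run; the file format, schema location, and source directory are hypothetical:

df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")                        # hypothetical source format
    .option("cloudFiles.schemaLocation", "/tmp/schemas/demo")   # hypothetical schema path
    .load("/shared/source/dir"))                                # hypothetical directory

Auto Loader checkpoints which files it has already processed, so files left in place are not re-ingested.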
