DA0-001 CompTIA Data+ Certification Exam Question and Answers

Question # 4

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?

A.

Determine the data needs and sources for analysis.

B.

Initiate the analysis for exploratory data analysis.

C.

Review the business questions to understand the scope.

D.

Finalize the methodology to solve the problem.

Question # 5

Which of the following is a domain-specific language used in programming that is designed for managing data that is held in a relational data stream management system?

A.

SAS

B.

SQL

C.

Python

D.

R

Question # 6

Which of the following would be considered non-personally identifiable information?

A.

Cell phone device name

B.

Customer’s name

C.

Government ID number

D.

Telephone number

Question # 7

Joe, an analyst, tests the loading time on a dashboard he is preparing to go live and finds it is slower than he would like. Which of the following must occur to decrease the loading time?

A.

Deploy the dashboard to production.

B.

Change the field definitions.

C.

Update the dashboard subscribers.

D.

Optimize the dashboard.

Question # 8

Which of the following database schemas features normalized dimension tables?

A.

Flat

B.

Snowflake

C.

Hierarchical

D.

Star

Question # 9

A data analyst is developing a data dictionary that aligns with a company's data management processes and policies. Which of the following best describes what should be included in the data dictionary?

A.

Information containing the links to business data

B.

Information explaining the business methodologies

C.

Information containing definitions of the business data

D.

Information describing the data analysis phases

Question # 10

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question # 11

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question # 12

Which of the following is a characteristic of a relational database?

A.

It utilizes key-value pairs.

B.

It has undefined fields.

C.

It is structured in nature.

D.

It uses minimal memory.

Question # 13

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question # 14

Q3 2020 has just ended, and now a data analyst needs to create an ad-hoc sales report that demonstrates how well the Q3 2020 promotion went versus last year's Q3 promotion.

Which of the following date parameters should the analyst use?

A.

2019 vs. YTD 2020

B.

Q3 2019 vs. Q3 2020

C.

YTD 2019 vs. YTD 2020

D.

Q4 2019 vs. Q3 2020

Question # 16

A data analyst needs to create a dashboard using the company's yearly revenue data sets. Which of the following would be the best way to plot the information to show the top-performing region?

A.

A line chart

B.

A waterfall chart

C.

A heat map

D.

A stacked bar chart

Question # 17

Which of the following is an example of a discrete data type?

A.

8in (20cm)

B.

5 kids

C.

2.5mi (4km)

D.

10.7lbs (4.9kg)

Question # 18

Which of the following data manipulation techniques is an example of a logical function?

A.

WHERE

B.

AGGREGATE

C.

BOOLEAN

D.

IF

Question # 19

An analyst has written the following code:

SELECT *

FROM Cust_table

WHERE age > 60 AND City = "New York"

Which of the following criteria is the analyst retrieving?

A.

All customers older than age 60 in New York state

B.

All customers aged 60 and older in New York state

C.

All customers older than age 60 in New York City

D.

All customers younger than age 60 in New York City
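
A minimal sketch of how the query above filters rows, run from Python against an in-memory SQLite database. The sample rows and the name column are hypothetical; only Cust_table, age, and City come from the question. The string literal is written with single quotes, which is the standard SQL form.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cust_table (name TEXT, age INTEGER, City TEXT)")
conn.executemany(
    "INSERT INTO Cust_table VALUES (?, ?, ?)",
    [
        ("Ana", 72, "New York"),
        ("Ben", 60, "New York"),  # exactly 60: excluded by the strict '>'
        ("Cy", 65, "Albany"),     # City does not match: excluded
        ("Dee", 45, "New York"),  # too young: excluded
    ],
)

# Both conditions must hold because they are combined with AND.
rows = conn.execute(
    "SELECT * FROM Cust_table WHERE age > 60 AND City = 'New York'"
).fetchall()
print(rows)
conn.close()
```

The strict comparison age > 60 leaves out customers who are exactly 60, and the filter is on the City column rather than on a state.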

Question # 20

After completing web scraping, which of the following file formats needs to be parsed?

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question # 21

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question # 22

A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:

Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?

A.

Standardize the field names.

B.

Recode the data values.

C.

Overwrite the field names in one of the tables.

D.

Edit the field names in the data dictionary.

Question # 23

The ACME Corporation hired an analyst to detect data quality issues in their Excel documents. Which of the following are the most common issues? (Select TWO)

A.

Apostrophe.

B.

Commas.

C.

Symbols.

D.

Duplicates.

E.

Misspellings.

Question # 24

A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?

A.

1.16

B.

6

C.

36

D.

72
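
As a short worked sketch of the relationship the question relies on: the standard deviation is the square root of the variance. The variable names below are illustrative only.

```python
import math

variance = 36.0                  # variance stated in the question
std_dev = math.sqrt(variance)    # standard deviation = sqrt(variance)
print(std_dev)
```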

Question # 25

Given the table below:

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question # 26

An analyst is reviewing the following data:

Car ID    Speed
1231      55
5664      36
5644      18
6505      67
5464      36
6456      38

Which of the following should the analyst include in the measures of central tendency for speed?

A.

Mode = 38 Range = 31 Mean = 42.5

B.

Range = 49 Max = 67 Min = 18

C.

Mode = 36 Max = 67 Min = 18

D.

Mode = 36 Median = 37 Mean = 41.5
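
A hedged sketch of computing the three measures of central tendency with Python's statistics module. It assumes each data row is a four-digit car ID followed by a two-digit speed, so the speeds are 55, 36, 18, 67, 36, and 38. Range, minimum, and maximum describe spread and extremes rather than central tendency, so they are not computed here.

```python
import statistics

# Speeds as read from the table, assuming the last two digits of each row are the speed.
speeds = [55, 36, 18, 67, 36, 38]

mode = statistics.mode(speeds)      # most frequent value
median = statistics.median(speeds)  # middle of the sorted values (average of the two middle ones)
mean = statistics.mean(speeds)      # arithmetic average

print(mode, median, mean)
```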

Question # 27

Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.

In what phase are the group's R skills most relevant?

A.

Extract.

B.

Load.

C.

Transform.

D.

Purge.

Question # 28

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question # 29

Which of the following is an example of a data-mining ETL tool?

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question # 30

A data analyst needs to calculate the mean for Q1 sales using the data set below:

Which of the following is the mean?

A.

$2,466.18

B.

$2,667.60

C.

$3,082.72

D.

$12,330.88

Question # 31

Given the diagram below:

Which of the following data schemas is shown?

A.

Key-value pairs

B.

Online transactional processing

C.

Data Lake

D.

Relational database

Question # 32

Which of the following is a difference between a primary key and a unique key?

A.

A unique key cannot take null values, whereas a primary key can take null values.

B.

There can be only one primary key in a data set, whereas there can be multiple unique keys.

C.

A primary key can take a value more than once, whereas a unique key cannot take a value more than once.

D.

A primary key cannot be a date variable, whereas a unique key can be.

Question # 33

A data analyst needs to perform a full outer join of a customer's orders using the tables below:

Which of the following is the mean of the order quantity?

A.

73.5

B.

76.5

C.

78.8

D.

81.5

Question # 34

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Question # 35

Given the information in the following tables:

Which of the following describes merging these tables to create a master file that includes all transactions for both online and in-store sales?

A.

Data audit

B.

Data completeness

C.

Data validation

D.

Data consolidation

Question # 36

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Question # 37

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question # 38

Given the image below:

The data should be cleaned because of the presence of:

A.

outliers.

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Question # 39

A data analyst needs to create a data visualization that aids in understanding the cumulative impact of sequentially introduced values that are positive or negative. Which of the following data visualization methods should the analyst use?

A.

A bubble chart

B.

A waterfall chart

C.

A scatter plot

D.

A line chart

Question # 40

Which of the following concepts should be applied if a data set with 40 fields needs to be pared down to 20 fields and contains similar data across multiple fields?

A.

Duplication

B.

Consolidation

C.

Compliance

D.

Standardization

Question # 41

An analyst must obtain the average daily sales for the following week:

Which of the following must the analyst perform to obtain this value?

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending

Question # 42

Which one of the following programming languages is specifically designed for use in analytics applications?

A.

Python.

B.

R

C.

C++

D.

Java.

Question # 43

Which of the following would be used to store unstructured data from different sources?

A.

A data lake

B.

A database management system

C.

A database

D.

A data warehouse

Question # 44

A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?

A.

Append all date columns and parse the strings.

B.

Impute all three date columns and then merge.

C.

Merge all date columns and unify the format.

D.

Separate the columns into a table and merge.

Question # 45

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the mean height for the five dogs?

A.

394mm

B.

405mm

C.

493mm

D.

504mm
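
The mean is the sum of the values divided by their count. A minimal sketch using the five heights from the question (the variable names are illustrative):

```python
# Heights in millimeters, taken from the question.
heights_mm = [300, 430, 170, 470, 600]

total = sum(heights_mm)                # 1970
mean_height = total / len(heights_mm)  # sum divided by the number of dogs
print(mean_height)
```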

Question # 46

Given the image below:

Which of the following file formats is depicted?

A.

JSON

B.

CSV

C.

XML

D.

HTML

Question # 47

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question # 48

You are working with a dataset and want to change the names of categories that you used for different types of books.

What term best describes this action?

A.

Recoding.

B.

Summarizing

C.

Aggregating.

D.

Filtering.

Question # 49

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question # 50

An analyst is preparing a report that contains weather data. The temperatures are shown in Fahrenheit. but they must be reported in Celsius. Which of the following should the analyst do to fix this issue?

A.

Normalize the data.

B.

Standardize the data.

C.

Rescale the data.

D.

Aggregate the data.
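
For illustration, the Fahrenheit-to-Celsius conversion itself is C = (F - 32) × 5 / 9. The sketch below applies it to a few hypothetical readings; the function name and the sample values are assumptions, not data from the report in the question.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9

# Hypothetical readings in Fahrenheit.
readings_f = [32, 50, 212]
readings_c = [fahrenheit_to_celsius(t) for t in readings_f]
print(readings_c)
```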

Question # 51

During data profiling, an analyst decides to recode the status column in the following data set:

Which of the following data concerns explains why the analyst wants to take this action?

A.

Redundancy

B.

Duplication

C.

Invalidity

D.

Inconsistency

Question # 52

Which one of the following values will appear first if they are sorted in descending order?

A.

Aaron.

B.

Molly.

C.

Xavier.

D.

Adam.
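
A quick sketch of a descending alphabetical sort on the four names, using Python's built-in sorted with reverse=True. Other tools may apply different collation rules, so treat this as one illustration.

```python
names = ["Aaron", "Molly", "Xavier", "Adam"]

# reverse=True sorts from Z to A; strings are compared character by character.
descending = sorted(names, reverse=True)
print(descending)
```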

Question # 53

Which of the following summary statements upholds integrity in data reporting?

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D, over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question # 54

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

A.

Logical

B.

Date

C.

Aggregate

D.

System

Question # 55

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question # 56

Which of the following roles is responsible for ensuring an organization's data quality, security, privacy, and regulatory compliance?

A.

Data owner.

B.

Data steward.

C.

Data custodian.

D.

Data processor.

Question # 57

A user receives a large custom report to track company sales across various date ranges. The user then completes a series of manual calculations for each date range. Which of the following should an analyst suggest so the user has a dynamic, seamless experience?

A.

Create multiple reports, one for each needed date range.

B.

Build calculations into the report so they are done automatically.

C.

Add macros to the report to speed up the filtering and calculations process.

D.

Create a dashboard with a date range picker and calculations built in.

Question # 58

Which one of the following would not normally be considered a summary statistic?

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question # 59

A JSON file is an example of:

A.

structured data.

B.

web data.

C.

machine data.

D.

processed data.

Question # 60

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate
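
A minimal sketch of deriving such a flag with a conditional (IF-style) expression. The salesperson names and quantities are hypothetical, since the table from the question is not reproduced here; only the Promotion_flag and Quantity_sold names and the 1,000,000 threshold come from the question.

```python
# Hypothetical Quantity_sold values per salesperson.
quantity_sold = {"Alice": 1_250_000, "Bob": 1_000_000, "Chen": 430_000}

# "Yes" only when the quantity is strictly above 1,000,000, otherwise "No".
promotion_flag = {
    name: ("Yes" if qty > 1_000_000 else "No")
    for name, qty in quantity_sold.items()
}
print(promotion_flag)
```

A value of exactly 1,000,000 is flagged "No" here because the condition is "above", not "at or above".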

Question # 62

What subset of Structured Query Language (SQL) is used to add, remove, modify, or retrieve the information stored within a relational database?

A.

DDL.

B.

DSL.

C.

DQL.

D.

DML.

Question # 63

Given the diagram below:

Which of the following types of sampling is depicted in the image?

A.

Stratified

B.

Random

C.

Cluster

D.

Systematic

Question # 64

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation

Question # 65

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Question # 66

A customer list from a financial services company is shown below:

A data analyst wants to create a likely-to-buy score on a scale from 0 to 100, based on an average of the three numerical variables: number of credit cards, age, and income. Which of the following should the analyst do to the variables to ensure they all have the same weight in the score calculation?

A.

Recode the variables.

B.

Calculate the percentiles of the variables.

C.

Calculate the standard deviations of the variables.

D.

Normalize the variables.
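
One possible way to put variables with very different ranges on a common scale is min-max scaling to 0 to 100, sketched below. This is an illustration under stated assumptions, not the method the question prescribes: the customer values and the helper name min_max_scale are hypothetical, since the customer list is not reproduced here.

```python
def min_max_scale(values, new_min=0.0, new_max=100.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [new_min for _ in values]
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

# Hypothetical values for three customers.
credit_cards = [1, 3, 5]
ages = [25, 40, 55]
incomes = [30_000, 60_000, 120_000]

scaled = [min_max_scale(col) for col in (credit_cards, ages, incomes)]

# Average the three scaled variables per customer; each now contributes on the same 0-100 scale.
scores = [sum(vals) / 3 for vals in zip(*scaled)]
print(scores)
```

Without rescaling, income would dominate a plain average because its raw values are several orders of magnitude larger than the other two variables.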

Question # 67

Given the following data table:

Which of the following are appropriate reasons to undertake data cleansing? (Select two).

A.

Non-parametric data

B.

Missing data

C.

Duplicate data

D.

Invalid data

E.

Redundant data

F.

Normalized data

Question # 68

A company’s marketing department wants to do a promotional campaign next month. A data analyst on the team has been asked to perform customer segmentation, looking at how recently a customer bought the product, at what frequency, and at what value. Which of the following types of analysis would this practice be considered?

A.

Prescriptive

B.

Trend

C.

Gap

D.

Cluster

Question # 69

Which of the following variable name formats would be problematic if used in the majority of data software programs?

A.

First_Name_

B.

FirstName

C.

First_Name

D.

First Name
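
As one illustration of a typical naming restriction, Python's str.isidentifier() reports whether a string is a legal variable name in Python. Other data tools have their own rules, so this is a sketch of the idea rather than a universal check.

```python
candidates = ["First_Name_", "FirstName", "First_Name", "First Name"]

# True when the string is a valid Python identifier (letters, digits, underscores, no spaces).
validity = {name: name.isidentifier() for name in candidates}
print(validity)
```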

Question # 70

A military commander would like to see the health scorecards of the troops daily and filter them based on gender and rank. Considering this data is PHI, which of the following would be the best way for the commander to view the information?

A.

An emailed report

B.

A password-protected dashboard

C.

A daily printout of a report

D.

A cloud-hosted spreadsheet

Question # 71

A Chief Executive Officer (CEO) is requesting more up-to-date sales data for improved visibility prior to month-end. An analyst must determine the frequency of a sales report that was previously distributed on an as-needed basis. Which of the following would be the most appropriate frequency for this report?

A.

Monthly

B.

Quarterly

C.

Weekly

D.

Every other month

Question # 72

A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government it deals with, some account information is not shown. Which of the following fields should be masked?

A.

Sales volume

B.

Start date

C.

Product name

D.

Customer name

Question # 73

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to prevent confusion as she seeks feedback before publishing the report?

Choose the best answer.

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question # 74

Which of the following is a best practice when updating a legacy data source?

A.

Placing old data in new fields

B.

Keeping only the most recent data

C.

Creating a codebook to document field changes

D.

Removing the data source from production

Question # 75

Which of the following is the BEST reason to use database views instead of tables?

A.

Views reduce the need for repetitive, complex data joins.

B.

Views allow for the storage of temporary data, whereas tables do not.

C.

Views allow for the joining of multiple data sources, whereas tables do not.

D.

Views can be used to restrict sensitive information.

Question # 76

Given the table below:

Which of the following variable types BEST describes the “Year” column?

A.

Numeric

B.

Date

C.

Alphanumeric

D.

Text

Question # 77

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question # 78

Which of the following are reasons to conduct data cleansing? (Select two).

A.

To perform web scraping

B.

To track KPIs

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends
