DA0-001 CompTIA Data+ Certification Exam Questions and Answers

Question # 4

A data analyst was asked to create a chart that shows the relationship between study hours and exam scores for each student using the data sets in the table below:

Which of the following charts would BEST represent the relationship between the variables?

A.

A histogram

B.

A scatter plot

C.

A heat map

D.

A bar chart

Question # 5

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question # 6

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed

Question # 7

Which of the following is an example of a flat file?

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question # 8

Which of the following can be used to translate data into another form so it can only be read by a user who has a key or a password?

A.

Data encryption.

B.

Data transmission.

C.

Data protection.

D.

Data masking.

Question # 9

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

Which of the following types of functions would be the most appropriate to use?

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question # 10

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?

A.

Data merge

B.

Data append

C.

Data blending

D.

Data imputation

Question # 11

Which of the following is a relational database?

A.

SQL

B.

Excel

C.

JSON

D.

NoSQL

Question # 12

Which of the following roles is responsible for ensuring an organization's data quality, security, privacy, and regulatory compliance?

A.

Data owner.

B.

Data steward.

C.

Data custodian.

D.

Data processor.

Question # 13

A data engineer needs to store data that can be natively used by an API. Which of the following should the engineer use to best accomplish this task?

A.

HTML

B.

JSON

C.

ZIF

D.

CSS

Question # 14

A sales manager requested a report that contains the first name, last name, and phone number of all of the company's customers and employees. The data engineer needs to return all the records from several tables, even duplicates. Which of the following is the best way to join the two tables?

A.

FULL OUTER JOIN

B.

FULL INNER JOIN

C.

LEFT OUTER JOIN

D.

CROSS JOIN

Question # 15

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question # 16

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question # 17

Which of the following should an analyst do to best summarize the data on a data set?

A.

Filtering

B.

Aggregation

C.

Sorting

D.

Concatenation

Question # 18

Which of the following data governance concepts fits into the security requirements category?

A.

Data transmission

B.

Data deletion

C.

Data use agreements

D.

Personally identifiable information

Question # 19

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Question # 20

A research analyst wants to determine whether the data being analyzed is connected to other data points. Which of the following is the BEST type of analysis to conduct?

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory analysis

Question # 21

Q3 2020 has just ended, and now a data analyst needs to create an ad-hoc sales report that demonstrates how well the Q3 2020 promotion went versus last year's Q3 promotion.

Which of the following date parameters should the analyst use?

A.

2019 vs. YTD 2020

B.

Q3 2019 vs. Q3 2020

C.

YTD 2019 vs. YTD 2020

D.

Q4 2019 vs. Q3 2020

Question # 22

Which of the following best defines SCD?

A.

A technique used to profile data.

B.

A technique used to sort large data sets.

C.

A technique used to archive data.

D.

A technique used to manage historical data changes.

Question # 23

Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?

A.

Simple random

B.

Cluster

C.

Systematic

D.

Stratified

Question # 24

Which of the following differentiates a flat text file from other data types?

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question # 25

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the standard deviation for the five dogs?

A.

147mm

B.

154mm

C.

394 mm

D.

21,704mm
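
The arithmetic behind this question can be checked with a short sketch. It assumes the population formula (dividing by n rather than n - 1), which the question does not state explicitly; the variable names are illustrative only.

```python
import math

# Heights of the five dogs, in millimeters, as listed in the question.
heights_mm = [300, 430, 170, 470, 600]

# Mean: the sum of the values divided by their count.
mean = sum(heights_mm) / len(heights_mm)

# Population variance (assumption: divide by n, not n - 1).
variance = sum((h - mean) ** 2 for h in heights_mm) / len(heights_mm)

# Standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(f"mean={mean} variance={variance} std_dev={std_dev:.1f}")
```

Under that assumption, the mean and variance both appear among the answer choices as distractors, and the standard deviation is their square-root counterpart. With the sample formula (n - 1) the result would be larger.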

Question # 26

Which one of the following is NOT a common data integration tool?

A.

XSS

B.

ELT

C.

ETL

D.

APIs

Question # 27

Which of the following is a KPI metric for tracking sales performance?

A.

Order status percentage

B.

Customer acquisition percentage

C.

Gross profit percentage

D.

Click-through rate percentage

Question # 28

An analyst needs to create an analytics dashboard for an employee intranet site to improve the search functionality, display relevant information, and maintain an updated FAQ page. Which of the following visualizations would best represent what employees are searching for?

A.

A word cloud

B.

A histogram

C.

A pie chart

D.

A scatter plot

Question # 29

The process of performing initial investigations on data to spot outliers, discover patterns, and test assumptions with statistical insight and graphical visualization is called:

A.

a t-test.

B.

a performance analysis.

C.

an exploratory data analysis.

D.

a link analysis.

Question # 30

An analyst notices changes in sales ratios when analyzing a quarterly report. Which of the following is the analyst conducting?

A.

A gap analysis

B.

A link analysis

C.

A trend analysis

D.

A statistical analysis

Question # 31

A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?

A.

Systematic sampling

B.

Convenience sampling

C.

Stratified sampling
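
To make the idea of collecting a similar proportion from every subgroup concrete, here is a minimal sketch. The population, the state codes, and the `stratified_sample` helper are all hypothetical and exist only for illustration.

```python
import random


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction of records from every subgroup (stratum)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Group the records by the stratum they belong to.
    strata = {}
    for record in records:
        strata.setdefault(record[key], []).append(record)

    # Sample each stratum separately, keeping at least one record per group.
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 100 records from one state, 40 from another.
population = [{"state": "TX", "id": i} for i in range(100)]
population += [{"state": "VT", "id": i} for i in range(40)]

sample = stratified_sample(population, key="state", fraction=0.1)

# Count how many sampled records came from each state.
counts = {}
for record in sample:
    counts[record["state"]] = counts.get(record["state"], 0) + 1
print(counts)
```

A simple random draw of 14 records from the same population could easily leave the smaller state under-represented, which is the situation this approach is meant to avoid.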

D.

Random sampling

Question # 32

Given the following table:

Which of the following methods is the best way to describe the changes in the values in the table?

A.

Average

B.

Range

C.

Standard deviation

D.

Median

Question # 33

A user imports a data file into the accounts payable system each day. On a regular basis, the field input is not what the system is expecting, so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts, though. Which of the following changes should be made to this process to reduce the number of errors?

A.

Delete all incorrect inputs and upload the corrected file.

B.

Have the user manually review the file for data completeness before loading it.

C.

Create a data field to data type validator to run the file through prior to import.

D.

Spot-check the file prior to import to catch and correct field errors.
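
As an illustration of what an automated field-to-data-type check might look like, the sketch below validates every field of a delimited file before import. The schema, column names, and sample file are hypothetical; a real accounts payable system would define its own.

```python
import csv
import io

# Hypothetical schema: column name -> callable that raises ValueError on bad input.
SCHEMA = {"invoice_id": int, "amount": float, "vendor": str}


def validate_rows(text, schema):
    """Return (row_number, column, value) for every field that fails its type check."""
    errors = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column, caster in schema.items():
            value = row.get(column)
            try:
                caster(value)
            except (TypeError, ValueError):
                errors.append((row_number, column, value))
    return errors


# Hypothetical file: row 3 has a letter O inside the ID, row 4 has a non-numeric amount.
sample_file = (
    "invoice_id,amount,vendor\n"
    "1001,250.00,Acme\n"
    "10O2,99.5,Globex\n"
    "1003,abc,Initech\n"
)

errors = validate_rows(sample_file, SCHEMA)
print(errors)
```

Because the check reports every failing row in one pass, the file can be corrected once instead of being re-imported after each individual failure.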

Question # 34

Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.

What can she do to prevent confusion as she seeks feedback before publishing the report?

Choose the best answer.

A.

Distribute the report to the appropriate stakeholders via email.

B.

Use a watermark to identify the report as a draft.

C.

Show the report to her immediate supervisor.

D.

Publish the report on an internally facing website.

Question # 35

An analyst has written the following code:

SELECT *

FROM Cust_table

WHERE age > 60 AND City = "New York"

Which of the following criteria is the analyst retrieving?

A.

All customers older than age 60 in New York state

B.

All customers aged 60 and older in New York state

C.

All customers older than age 60 in New York City

D.

All customers younger than age 60 in New York City
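
The behavior of the two conditions in the query can be tried out with an in-memory SQLite database. The table contents and the `name` column below are hypothetical, and the string literal uses single quotes, which is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cust_table (name TEXT, age INTEGER, City TEXT)")

# Hypothetical rows chosen to exercise each part of the WHERE clause.
conn.executemany(
    "INSERT INTO Cust_table VALUES (?, ?, ?)",
    [
        ("Ana", 72, "New York"),  # older than 60 and City matches
        ("Ben", 60, "New York"),  # exactly 60: excluded by age > 60
        ("Cy", 65, "Albany"),     # older than 60 but a different City value
        ("Dee", 45, "New York"),  # City matches but too young
    ],
)

rows = conn.execute(
    "SELECT * FROM Cust_table WHERE age > 60 AND City = 'New York'"
).fetchall()
print(rows)
```

Note that `>` is a strict comparison, and that the filter is on the City column, so a row for another city in the same state does not match.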

Question # 36

A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the most efficient way to deliver this report?

A.

A workbook with multiple tabs for each region

B.

A daily email with snapshots of regional summaries

C.

A static report with a different page for every filtered view

D.

A dashboard with filters at the top that the user can toggle

Question # 37

Which of the following file formats is best suited to start exploratory analysis within statistical software?

A.

CSV

B.

XLSM

C.

XML

D.

JSON

Question # 38

An analyst is reviewing the following data:

Car ID    Speed
1231      55
5664      36
5644      18
6505      67
5464      36
6456      38

Which of the following should the analyst include in the measures of central tendency for speed?

A.

Mode = 38 Range = 31 Mean = 42.5

B.

Range = 49 Max = 67 Min = 18

C.

Mode = 36 Max = 67 Min = 18

D.

Mode = 36 Median = 37 Mean = 41.5
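
The measures of central tendency can be computed directly with Python's standard library. This sketch assumes each row of the data is a four-digit car ID followed by a two-digit speed, which is consistent with the minimum and maximum quoted in the options.

```python
import statistics

# Speed values, assuming each data row is a 4-digit car ID followed by a 2-digit speed.
speeds = [55, 36, 18, 67, 36, 38]

mode = statistics.mode(speeds)      # most frequent value
median = statistics.median(speeds)  # middle of the sorted values
mean = statistics.mean(speeds)      # arithmetic average

# Range, max, and min describe spread, not central tendency.
spread = max(speeds) - min(speeds)

print(f"mode={mode} median={median} mean={mean:.2f} range={spread}")
```

Under that reading of the data, the computed mean comes out to roughly 41.67, so the mean printed in the options may be rounded or slightly off.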

Question # 39

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency.

D.

Complete a trend analysis to be included in the report.

Question # 40

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question # 41

Which of the following is a difference between a primary key and a unique key?

A.

A unique key cannot take null values, whereas a primary key can take null values.

B.

There can be only one primary key in a data set, whereas there can be multiple unique keys.

C.

A primary key can take a value more than once, whereas a unique key cannot take a value more than once.

D.

A primary key cannot be a date variable, whereas a unique key can be.

Question # 42

The director of operations at a power company needs data to help identify where company resources should be allocated in order to monitor activity for outages and restoration of power in the entire state. Specifically, the director wants to see the following:

* County outages

* Status

* Overall trend of outages

INSTRUCTIONS:

Please, select each visualization to fit the appropriate space on the dashboard and choose an appropriate color scheme. Once you have selected all visualizations, please, select the appropriate titles and labels, if applicable. Titles and labels may be used more than once.

If at any time you would like to bring back the initial state of the simulation, please click the Reset All button.

Question # 43

Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?

A.

Systematic

B.

Simple random

C.

Convenience

D.

Stratified

Question # 44

An analyst is building a monthly report for production and wants to ensure the audience is aware of its once-a-month cadence. Which of the following is the MOST important to convey that information?

A.

The date of the dashboard build

B.

The data refresh date

C.

A report summary

D.

Frequently asked questions

Question # 45

A data analyst is working with a data set and would like to combine two fields into a single field. Which of the following data manipulation techniques should the analyst use?

A.

Data merge

B.

Transpose

C.

Data append

D.

Concatenation

Question # 46

A data analyst is using a two-tailed, independent t-test to determine whether the type of stretching, dynamic or static, has any influence on a dancer's flexibility. Which of the following is the alternative hypothesis?

A.

A dancer's flexibility is improved through static stretching.

B.

The change in a dancer's flexibility is not equal to zero.

C.

There is a difference in a dancer's flexibility between static and dynamic stretching.

D.

The means of the static and dynamic stretching groups do not differ from each other.

Question # 47

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL

Question # 48

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question # 49

Andy is a pricing analyst for a retailer. Using a hypothesis test, he wants to assess whether people who receive electronic coupons spend more on average.

What should Andy's null hypothesis be?

A.

People who receive electronic coupons spend more on average.

B.

People who receive electronic coupons spend less on average.

C.

People who receive electronic coupons do not spend more on average.

D.

People who do not receive electronic coupons spend more on average.

Question # 50

A reporting analyst needs to create a report that refreshes automatically and is accessible to the entire sales organization. Which of the following tools is the most appropriate to use for this task?

A.

R

B.

Excel

C.

Tableau

D.

Python

Question # 51

A data governance analyst who is reviewing a retailer's data set notices that sales data is captured at the regional level but not at the individual store level. Which of the following best describes the issue with this data set?

A.

Data attribute limitations

B.

Data accuracy

C.

Data integrity

D.

Data consistency

Question # 52

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question # 53

Which of the following is concatenate typically used to combine?

A.

Rows

B.

Columns

C.

Tables

D.

Databases

Question # 54

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question # 55

Alex wants to use data from his corporate sales, CRM, and shipping systems to try to predict future sales.

Which of the following systems is the most appropriate?

Choose the best answer.

A.

Data mart.

B.

OLAP.

C.

Data Warehouse.

D.

OLTP.

Question # 56

Given the image below:

The data should be cleaned because of the presence of:

A.

outliers.

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Question # 57

Which of the following occurs if a 90% confidence interval increases to 95%?

A.

The margin of error does not change.

B.

The interval remains the same.

C.

The interval becomes narrower.

D.

The margin of error doubles.

Question # 58

Which of the following values is the range, a measure of dispersion, of the scores of ten students in a test?

The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.

A.

90

B.

60

C.

70

D.

80
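
The range is the difference between the largest and smallest values. A minimal check using the scores listed in the question:

```python
# Test scores of the ten students, as listed in the question.
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum value minus minimum value.
score_range = max(scores) - min(scores)

print(score_range)
```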

Question # 59

A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:

Income category 1: less than $1.

Income category 2: more than $1 and less than $20,000.

Income category 3: more than $20,001 and less than $40,000.

Income category 4: more than $40,001.

Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?

A.

Data merge

B.

Derived variables

C.

Data blending

D.

Data append
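
A new field computed from an existing one can be sketched as a small function. The bands as written leave gaps (for example, exactly $1, or values between $20,000 and $20,001), so this sketch assumes inclusive upper bounds; the `income_category` function and the sample incomes are hypothetical.

```python
def income_category(income):
    """Map an income value to a category number.

    Assumption: upper bounds are inclusive, because the bands as
    written in the question leave small gaps between categories.
    """
    if income < 1:
        return 1
    if income <= 20_000:
        return 2
    if income <= 40_000:
        return 3
    return 4


# Hypothetical incomes, one per band.
incomes = [0, 15_000, 35_000, 90_000]

# The new column is calculated from the existing "Income" values.
categories = [income_category(value) for value in incomes]
print(categories)
```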

Question # 60

Angela is aggregating data from a CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issue is Angela facing?

Choose the best answer.

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Question # 61

Which of the following best describes the process of examining data for statistics and information about the data?

A.

Cleansing

B.

Search

C.

Profiling

D.

Governance

Question # 62

An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?

A.

Conduct an exploratory analysis and use descriptive statistics.

B.

Conduct a trend analysis and use a scatter chart.

C.

Conduct a link analysis and illustrate the connection points.

D.

Conduct an initial analysis and use a Pareto chart.

Question # 63

Which of the following is used for calculations and pivot tables?

A.

IBM SPSS

B.

SAS

C.

Microsoft Excel

D.

Domo

Question # 64

A data analyst is working for a shipping company and calculating the volume of boxes according to the following formula:

volume = height × width × depth.

Which of the following variable types describes volume?

A.

Derived

B.

Normalized

C.

Concatenated

D.

Aggregated

Question # 65

Which of the following BEST describes the issue in which character values are mixed with integer values in a data set column?

A.

Duplicate data

B.

Missing data

C.

Data outliers

D.

Invalid data type

Question # 66

A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government it deals with, some account information is not shown. Which of the following fields should be masked?

A.

Sales volume

B.

Start date

C.

Product name

D.

Customer name

Question # 67

A data analyst has removed the outliers from a data set due to large variances. Which of the following central tendencies would be the best measure to use?

A.

Range

B.

Mean

C.

Mode

D.

Median

Question # 68

An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?

A.

Derived

B.

Categorical

C.

Continuous

D.

Control

Question # 69

Which of the following will MOST likely be streamed live?

A.

Machine data

B.

Key-value pairs

C.

Delimited rows

D.

Flat files

Question # 70

After the daily ETL jobs are completed, the data in the reports does not appear complete, and a lot of data seems to be missing. Which of the following concepts should be used to assess and investigate further?

A.

Cross-validation

B.

Data profiling

C.

Data integrity

D.

Data consistency

Question # 71

Given the table below:

Which of the following variable types BEST describes the “Year” column?

A.

Numeric

B.

Date

C.

Alphanumeric

D.

Text

Question # 72

An analyst wants to test the association between the number of doors in a car and the number of gears in the car. Which of the following is the best test to use?

A.

F-test

B.

Acceptance test

C.

Chi-squared test

D.

Z-test

Question # 73

A business unit made the following modification to the values in a table:

Which of the following data quality dimensions was applied in this scenario?

A.

Integrity

B.

Consistency

C.

Completeness

D.

Accuracy

Question # 74

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question # 75

Which one of the following would not normally be considered a summary statistic?

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question # 76

A data analyst is attempting to understand how ice cream consumption is affected by different attributes, such as cost, temperature, and income level. Which of the following regression analyses should the data analyst perform to understand this relationship?

A.

Logistic

B.

Ordinary least squares

C.

Cox

D.

Polynomial

Question # 77

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question # 78

Jenny wants to study the academic performance of undergraduate sophomores and wants to determine the average grade point average at different points during an academic year.

What best describes the data set she needs?

A.

Sample.

B.

Observation.

C.

Variable.

D.

Population.

Question # 79

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

Customer Table -

In-store Transactions –

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Question # 80

Which of the following analysis techniques is an unsupervised data mining process?

A.

Clustering

B.

Descriptive

C.

Regression

D.

Predictive

Question # 81

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question # 82

An analyst is designing a dashboard that will provide a story of the sales and sales customer ratio. The following data is available:

Which of the following charts should the analyst consider including in the dashboard?

A.

A column chart with site and sales

B.

A line chart with site and sales

C.

A pie chart with site and sales

D.

A scatter chart with site and sales

Question # 83

Which of the following data analysis tools increases the efficiency of data visualizations?

A.

SQL

B.

Microsoft Excel

C.

SAS

D.

RapidMiner

Question # 84

An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?

A.

PII

B.

PCI

C.

PBI

D.

PHI

Question # 85

Given the following report:

Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period the report covers

F.

The date on which the report was run

Question # 86

During data profiling, an analyst decides to recode the status column in the following data set:

Which of the following data concerns explains why the analyst wants to take this action?

A.

Redundancy

B.

Duplication

C.

Invalidity

D.

Inconsistency

Question # 87

Given the following data table:

Which of the following are appropriate reasons to undertake data cleansing? (Select two).

A.

Non-parametric data

B.

Missing data

C.

Duplicate data

D.

Invalid data

E.

Redundant data

F.

Normalized data

Question # 88

A web developer wants to ensure that malicious users can't type SQL statements when they are asked for input, such as their username or user ID.

Which of the following query optimization techniques would effectively prevent SQL Injection attacks?

A.

Indexing.

B.

Subset of records.

C.

Temporary table in the query set.

D.

Parametrization.
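
The following sketch shows how a parameterized query keeps user input separate from the SQL text, using Python's built-in `sqlite3` module. The `users` table, its rows, and the `find_user` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice@example.com"), ("bob", "bob@example.com")],
)


def find_user(connection, username):
    """Look up a user with a parameterized query.

    The ? placeholder passes the value separately from the SQL text,
    so the input is treated as data and never parsed as SQL.
    """
    return connection.execute(
        "SELECT username, email FROM users WHERE username = ?",
        (username,),
    ).fetchall()


# A classic injection attempt: harmless here, because it is compared
# as a literal string against the username column.
malicious_input = "alice' OR '1'='1"

safe_rows = find_user(conn, malicious_input)
normal_rows = find_user(conn, "alice")
print(safe_rows, normal_rows)
```

If the same input were pasted into the query with string concatenation, the `OR '1'='1'` fragment would become part of the statement and match every row.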

Question # 89

Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?

A.

Rephrase the business requirement.

B.

Determine the data necessary for the analysis

C.

Build a mock dashboard/presentation layout.

D.

Perform exploratory data analysis.

Question # 90

Jhon is working on an ELT process that sources data from six different source systems.

Looking at the source data, he finds that data about the same people exists in two of the six systems.

What does he have to make sure he checks for in his ELT process?

Choose the best answer.

A.

Duplicate Data.

B.

Redundant Data.

C.

Invalid Data.

D.

Missing Data.

Question # 91

Which of the following is a non-parametric test?

A.

One-sample t-test

B.

Two-way ANOVA

C.

Correlation coefficient

D.

Spearman's rank correlation

Question # 92

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question # 93

A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?

A.

The data analyst is not querying the databases correctly.

B.

The databases are recording different events.

C.

The databases are recording the event in different time zones.

D.

The second database is logging incorrectly.

Question # 94

Given the following data set:

Which of the following is the best reason for cleansing the data?

A.

Duplicate data

B.

Imputed data

C.

Redundant data

D.

Corrupt data

Question # 95

Which of the following would be the best way to identify multicollinear attributes in a data set?

A.

Correlation coefficient

B.

Chi-squared test

C.

Two-sample f-test

D.

Two-way ANOVA

Question # 96

A data analyst has been asked to create a sales report that calculates the rolling 12-month average for sales. If the report will be published on November 1, 2020, which of the following months should the report cover?

A.

October 1, 2019 to October 31, 2020

B.

October 31, 2020 to November 1, 2021

C.

November 1, 2019 to October 31, 2020

D.

October 31, 2019 to October 31, 2020
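
The date arithmetic for a rolling window can be sketched as follows. It assumes the window covers the 12 complete months ending the day before publication and that the report is published on the first day of a month; the `rolling_12_month_window` helper is hypothetical.

```python
from datetime import date, timedelta


def rolling_12_month_window(publish_date):
    """Return (start, end) for the 12 complete months before publish_date.

    Assumption: publish_date is the first day of a month, so the window
    ends on the previous day and starts on the same calendar day one
    year earlier.
    """
    end = publish_date - timedelta(days=1)
    start = publish_date.replace(year=publish_date.year - 1)
    return start, end


start, end = rolling_12_month_window(date(2020, 11, 1))
print(start.isoformat(), end.isoformat())
```

A window that starts one month or one day earlier would span 13 calendar months, not 12.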

Question # 97

Which of the following query optimization techniques involves examining only the data that is needed for a particular task?

A.

Making a temporary table

B.

Creating a flat file

C.

Indexing documents

D.

Creating an execution plan

Question # 98

Which one of the following programming languages is specifically designed for use in analytics applications?

A.

Python.

B.

R

C.

C++

D.

Java.

Question # 99

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question # 100

Given the following tables:

Which of the following will be the dimensions from a FULL JOIN of the tables above?

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Question # 101

Which of the following contains alphanumeric values?

A.

10.1E²

B.

13.6

C.

1347

D.

A3J7

Question # 102

A data set for sales per month includes the following data:

Which of the following cleaning and profiling methods should be applied to the data set?

A.

Data outliers

B.

Invalid data

C.

Duplicate data

D.

Data type validation

Question # 103

Which of the following is an example of structured data?

A.

A credit card number

B.

An email

C.

A photo

D.

Social media correspondence

Question # 104

A data analyst needs to write a SQL query measuring last month's website visits and distribute a summary report to the marketing team. Which of the following is the analyst creating?

A.

Date range

B.

Distribution list

C.

Data content

D.

Report view

Question # 105

An analyst is updating a customer contacts database with information obtained from a survey of new customers. Which of the following data manipulation techniques should the analyst use?

A.

Join

B.

Append

C.

Transform

D.

Blend

Question # 106

Under which of the following circumstances should the null hypothesis be accepted when α = 0.05?

A.

When p is 0.00003

B.

When p is 0.001

C.

When p is 0.04

D.

When p is 0.06
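
The usual decision rule compares the p-value with the significance level. The sketch below applies it to the four p-values in the options; the `decision` helper is hypothetical, and it uses the stricter wording "fail to reject" where the question says "accepted".

```python
ALPHA = 0.05  # significance level given in the question


def decision(p_value, alpha=ALPHA):
    """Reject the null hypothesis only when p is below the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# The p-values listed in the answer options.
p_values = [0.00003, 0.001, 0.04, 0.06]
decisions = {p: decision(p) for p in p_values}

for p, outcome in decisions.items():
    print(p, outcome)
```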

Question # 107

Which of the following defines the policies and procedures for managing the master data?

A.

Data administration

B.

Data stewardship

C.

Data ownership

D.

Data governance

Question # 108

A data analyst has been asked to create one table that has each employee's first name, last name, sales, and address. The sales and addresses are listed in the tables below:

Which of the following steps should the analyst take to create the table?

A.

Transpose the first name and last name in both tables. Use lookup to pull the address field from Table 2 into Table 1.

B.

Use lookup with the first name or last name to pull the address field from Table 2 into Table 1.

C.

Use the append formula in both tables for the first name and last name. Use lookup to pull the address field from Table 2 into Table 1.

D.

Create a column that concatenates the first name and last name in each table. Use concatenate and lookup to bring the address field into Table 1.
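
Why a combined key matters for a lookup can be seen in a small sketch. The two tables below are hypothetical stand-ins for the ones referenced in the question, and they deliberately share a first name so that a lookup on first name alone would be ambiguous.

```python
# Hypothetical Table 1: sales per employee.
sales = [
    {"first": "Ana", "last": "Lee", "sales": 1200},
    {"first": "Ana", "last": "Diaz", "sales": 800},
]

# Hypothetical Table 2: address per employee, in a different row order.
addresses = [
    {"first": "Ana", "last": "Diaz", "address": "12 Oak St"},
    {"first": "Ana", "last": "Lee", "address": "9 Elm Ave"},
]


def full_name(row):
    """Concatenate first and last name into a single lookup key."""
    return f"{row['first']} {row['last']}"


# Build the lookup from the concatenated key to the address.
address_by_name = {full_name(row): row["address"] for row in addresses}

# Pull the address into each sales row using the same concatenated key.
combined = [
    {**row, "address": address_by_name.get(full_name(row))} for row in sales
]
print(combined)
```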

Question # 109

Given the following data:

CustomerID    ItemBought    Date
Tre_234       Sofa          2022-09-08
216_Tre       Shoes         08/02/2021
215/Tre       Blanket       2021/06/20
045/Tre       Mug           12-26-2021
Tre-345       Lamp          31/08/2022
TREJD19       Bucket        2022'08/01

Which of the following best describes the main issue in the data set?

A.

Inconsistent data

B.

Data mismatch

C.

Invalid data

D.

Redundant data
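
One way to profile a column like Date is to classify each value by its layout and count the distinct layouts. The pattern labels below are hypothetical, and the sketch does not try to decide whether an ambiguous value such as 08/02/2021 is day-first or month-first.

```python
import re

# Date values as they appear in the data set.
dates = [
    "2022-09-08",
    "08/02/2021",
    "2021/06/20",
    "12-26-2021",
    "31/08/2022",
    "2022'08/01",
]

# Hypothetical layout labels; NN stands for a two-digit day or month.
PATTERNS = {
    "YYYY-MM-DD": r"\d{4}-\d{2}-\d{2}",
    "YYYY/MM/DD": r"\d{4}/\d{2}/\d{2}",
    "NN/NN/YYYY": r"\d{2}/\d{2}/\d{4}",
    "NN-NN-YYYY": r"\d{2}-\d{2}-\d{4}",
}


def classify(value):
    """Return the label of the first layout the value matches exactly."""
    for label, pattern in PATTERNS.items():
        if re.fullmatch(pattern, value):
            return label
    return "unrecognized"


labels = [classify(value) for value in dates]
distinct_formats = set(labels)
print(labels)
print(len(distinct_formats), "distinct layouts")
```

The same kind of check could be applied to the CustomerID column, which also mixes several layouts.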

Question # 110

An analyst is training a new coworker on the importance of data governance and is focusing on security requirements. Which of the following should the analyst include in the training? (Select two).

A.

Data masking

B.

Data encryption

C.

Data parallelism

D.

Data inclusiveness

E.

Data exclusiveness

F.

Data openness

Question # 111

An analyst has generated a report that includes the number of months in the first two quarters of 2019 when sales exceeded $50,000:

Which of the following functions did the analyst use to generate the data in the Sales_indicator column?

A.

Aggregate

B.

Logical

C.

Date

D.

Sort

Question # 112

Which of the following is the best variable format to store a customer's age using the least possible amount of storage?

A.

Int

B.

Float

C.

Char

D.

Double

Question # 113

Which of the following types of analyses is best to use when tracking sales revenue against quarterly targets?

A.

Trend

B.

Performance

C.

Link

D.

Scope

Question # 114

A healthcare data analyst notices that one data set in the column for BloodPressure contains several outliers that need to be replaced with meaningful values. Which of the following data manipulation techniques should the analyst use?

A.

Recode

B.

Impute

C.

Append

D.

Reduction

Question # 115

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question # 116

Given the below:

Which of the following numbers represents a Type I error?

A.

1

B.

2

C.

3

D.

4

Question # 117

An analyst needs to know what data an organization possesses. Which of the following is the best document for the analyst to consult?

A.

Data destruction policy

B.

Data use document

C.

Data dictionary

D.

Data retention policy

Question # 118

An analyst must obtain the average daily sales for the following week:

Which of the following must the analyst perform to obtain this value?

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending
