A data analyst was asked to create a chart that shows the relationship between study hours and exam scores for each student using the data sets in the table below:
Which of the following charts would BEST represent the relationship between the variables?
Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?
Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?
Which of the following can be used to translate data into another form so it can only be read by a user who has a key or a password?
A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:
Which of the following types of functions would be the most appropriate to use?
An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?
Which of the following roles is responsible for ensuring an organization's data quality, security, privacy, and regulatory compliance?
A data engineer needs to store data that can be natively used by an API. Which of the following should the engineer use to best accomplish this task?
A sales manager requested a report that contains the first name, last name, and phone number of all of the company's customers and employees. The data engineer needs to return all the records from several tables, even duplicates. Which of the following is the best way to join the two tables?
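As a minimal sketch of how stacking two result sets treats duplicates, the snippet below compares UNION with UNION ALL in an in-memory SQLite database. The table names, column names, and rows are hypothetical and were invented only for illustration.

```python
import sqlite3

# Hypothetical tables and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (first_name TEXT, last_name TEXT, phone TEXT);
    CREATE TABLE employees (first_name TEXT, last_name TEXT, phone TEXT);
    INSERT INTO customers VALUES ('Ana', 'Lee', '555-0100'), ('Raj', 'Patel', '555-0101');
    INSERT INTO employees VALUES ('Ana', 'Lee', '555-0100'), ('Kim', 'Cho', '555-0102');
""")

query = """
    SELECT first_name, last_name, phone FROM customers
    {op}
    SELECT first_name, last_name, phone FROM employees
"""

# UNION removes rows that appear in both tables; UNION ALL keeps every row.
union_rows = conn.execute(query.format(op="UNION")).fetchall()
union_all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()

print(len(union_rows))      # number of distinct rows
print(len(union_all_rows))  # number of rows including duplicates
conn.close()
```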
The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?
A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:
Which of the following types of charts should be considered?
Which of the following should an analyst do to best summarize the data on a data set?
Which of the following data governance concepts fits into the security requirements category?
A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?
A research analyst wants to determine whether the data being analyzed is connected to other data points. Which of the following is the BEST type of analysis to conduct?
Q3 2020 has just ended, and now a data analyst needs to create an ad-hoc sales report that demonstrates how well the Q3 2020 promotion went versus last year's Q3 promotion.
Which of the following date parameters should the analyst use?
Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?
Which of the following differentiates a flat text file from other data types?
Five dogs have the following heights in millimeters:
300, 430, 170, 470, 600
Which of the following is the standard deviation for the five dogs?
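The calculation can be sketched with Python's standard library. Whether the population formula (divide by n) or the sample formula (divide by n - 1) is intended is an assumption the question leaves open, so both are shown.

```python
import statistics

heights_mm = [300, 430, 170, 470, 600]

mean_mm = statistics.mean(heights_mm)
population_sd = statistics.pstdev(heights_mm)  # divides the squared deviations by n
sample_sd = statistics.stdev(heights_mm)       # divides the squared deviations by n - 1

print(mean_mm)
print(round(population_sd, 2))
print(round(sample_sd, 2))
```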
An analyst needs to create an analytics dashboard for an employee intranet site to improve the search functionality, display relevant information, and maintain an updated FAQ page. Which of the following visualizations would best represent what employees are searching for?
The process of performing initial investigations on data to spot outliers, discover patterns, and test assumptions with statistical insight and graphical visualization is called:
An analyst notices changes in sales ratios when analyzing a quarterly report. Which of the following is the analyst conducting?
A data analyst needs to collect a similar proportion of data from every state. Which of the following sampling methods would be the most appropriate?
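One way to draw the same proportion from every subgroup is sketched below. The function name `stratified_sample`, the record layout, and the 25% fraction are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction of records from every subgroup (stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[record[key]].append(record)
    sample = []
    for group in strata.values():
        # Take at least one record so small strata are still represented.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical data: 40 rows for one state and 20 rows for another.
records = [{"state": "TX", "id": i} for i in range(40)]
records += [{"state": "VT", "id": i} for i in range(20)]

sample = stratified_sample(records, key="state", fraction=0.25)
counts = {s: sum(r["state"] == s for r in sample) for s in ("TX", "VT")}
print(counts)
```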
Given the following table:
Which of the following methods is the best way to describe the changes in the values in the table?
A user imports a data file into the accounts payable system each day. On a regular basis, the field input is not what the system is expecting, so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts, though. Which of the following changes should be made to this process to reduce the number of errors?
Kelly wants to get feedback on the final draft of a strategic report that has taken her six months to develop.
What can she do to prevent confusion as she seeks feedback before publishing the report?
Choose the best answer.
An analyst has written the following code:
SELECT *
FROM Cust_table
WHERE age > 60 AND City = 'New York'
Which of the following criteria is the analyst retrieving?
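To see which rows a filter of this shape returns, the query can be run against a small in-memory SQLite table. The `name` column and the sample rows are hypothetical; only `Cust_table`, `age`, and `City` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The name column and all rows below are invented for illustration.
conn.execute("CREATE TABLE Cust_table (name TEXT, age INTEGER, City TEXT)")
conn.executemany(
    "INSERT INTO Cust_table VALUES (?, ?, ?)",
    [
        ("A", 72, "New York"),
        ("B", 45, "New York"),
        ("C", 68, "Boston"),
        ("D", 60, "New York"),
    ],
)

# Both conditions must hold: age strictly greater than 60 and City equal to New York.
rows = conn.execute(
    "SELECT * FROM Cust_table WHERE age > 60 AND City = 'New York'"
).fetchall()
print(rows)
conn.close()
```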
A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the most efficient way to deliver this report?
Which of the following file formats is best suited to start exploratory analysis within statistical software?
An analyst is reviewing the following data:
Car ID    Speed
1231      55
5664      36
5644      18
6505      67
5464      36
6456      38
Which of the following should the analyst include in the measures of central tendency for speed?
An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?
Given the customer table below:
Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?
Which of the following is a difference between a primary key and a unique key?
The director of operations at a power company needs data to help identify where company resources should be allocated in order to monitor activity for outages and restoration of power in the entire state. Specifically, the director wants to see the following:
* County outages
* Status
* Overall trend of outages
INSTRUCTIONS:
Please select each visualization to fit the appropriate space on the dashboard and choose an appropriate color scheme. Once you have selected all visualizations, please select the appropriate titles and labels, if applicable. Titles and labels may be used more than once.
If at any time you would like to bring back the initial state of the simulation, please click the Reset All button.
Which of the following data sampling methods involves dividing a population into subgroups by similar characteristics?
An analyst is building a monthly report for production and wants to ensure the audience is aware of its once-a-month cadence. Which of the following is the MOST important to convey that information?
A data analyst is working with a data set and would like to combine two fields into a single field. Which of the following data manipulation techniques should the analyst use?
A data analyst is using a two-tailed, independent t-test to determine whether the type of stretching, dynamic or static, has any influence on a dancer's flexibility. Which of the following is the alternative hypothesis?
Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?
A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.
Which of the following data manipulation techniques would he use to obtain this information?
Andy is a pricing analyst for a retailer. Using a hypothesis test, he wants to assess whether people who receive electronic coupons spend more on average.
What should Andy's null hypothesis be?
A reporting analyst needs to create a report that refreshes automatically and is accessible to the entire sales organization. Which of the following tools is the most appropriate to use for this task?
A data governance analyst who is reviewing a retailer's data set notices that sales data is captured at the regional level but not at the individual store level. Which of the following best describes the issue with this data set?
A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:
Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?
An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:
Which of the following conclusions is accurate at a 95% confidence interval?
Alex wants to use data from his corporate sales, CRM, and shipping systems to try to predict future sales.
Which of the following systems is the most appropriate?
Choose the best answer.
Given the image below:
The data should be cleaned because of the presence of:
Which of the following occurs if a 90% confidence interval increases to 95%?
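The effect of the confidence level on interval width can be sketched with the standard normal critical values. The standard error of 2.0 is a hypothetical figure used only for illustration, and a normal-based interval is assumed.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

# Hypothetical standard error, chosen only for illustration.
standard_error = 2.0

margin_90 = z_critical(0.90) * standard_error
margin_95 = z_critical(0.95) * standard_error

print(round(z_critical(0.90), 3), round(z_critical(0.95), 3))
print(margin_95 > margin_90)
```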
The scores of ten students on a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.
Which of the following values is the range, a measure of dispersion, of these scores?
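The range is the difference between the largest and smallest values, which can be checked with a short sketch:

```python
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum value minus minimum value.
score_range = max(scores) - min(scores)
print(score_range)
```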
A data analyst wants to create "Income Categories" that would be calculated based on the existing variable "Income". The "Income Categories" would be as follows:
Income category 1: less than $1.
Income category 2: more than $1 and less than $20,000.
Income category 3: more than $20,001 and less than $40,000.
Income category 4: more than $40,001.
Which of the following data manipulation techniques should the data analyst use to create "Income Categories"?
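Deriving a category from an existing numeric variable can be sketched as a small recoding function. The function name `income_category` is hypothetical. The listed bands leave some boundary values uncovered (for example exactly $1, or values between $20,000 and $20,001), so assigning those to the lower band is an assumption made here, not something the question states.

```python
def income_category(income):
    """Recode a numeric income into one of four categories.

    Assumption: boundary values not covered by the stated bands
    fall into the lower of the two neighboring categories.
    """
    if income < 1:
        return 1
    if income <= 20_000:
        return 2
    if income <= 40_000:
        return 3
    return 4

# Hypothetical incomes used only to exercise each band.
incomes = [0, 15_000, 30_000, 50_000]
categories = [income_category(value) for value in incomes]
print(categories)
```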
Angela is aggregating data from a CRM system with data from an employee system.
While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.
What kind of issue is Angela facing?
Choose the best answer.
Which of the following best describes the process of examining data for statistics and information about the data?
An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?
A data analyst is working for a shipping company and calculating the volume of boxes according to the following formula:
volume = height × width × depth.
Which of the following variable types describes volume?
Which of the following BEST describes the issue in which character values are mixed with integer values in a data set column?
A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government entities it deals with, some account information is not shown. Which of the following fields should be masked?
A data analyst has removed the outliers from a data set due to large variances. Which of the following central tendencies would be the best measure to use?
An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?
After the daily ETL jobs are completed, the data in the reports does not appear complete, and a lot of data seems to be missing. Which of the following concepts should be used to assess and investigate further?
Given the table below:
Which of the following variable types BEST describes the “Year” column?
An analyst wants to test the association between the number of doors in a car and the number of gears in the car. Which of the following is the best test to use?
A business unit made the following modification to the values in a table:
Which of the following data quality dimensions was applied in this scenario?
Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)
Which one of the following would not normally be considered a summary statistic?
A data analyst is attempting to understand how ice cream consumption is affected by different attributes, such as cost, temperature, and income level. Which of the following regression analyses should the data analyst perform to understand this relationship?
Given the following data tables:
Which of the following MDM processes needs to take place FIRST?
Jenny wants to study the academic performance of undergraduate sophomores and wants to determine the average grade point average at different points during an academic year.
What best describes the data set she needs?
A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:
Customer Table -
In-store Transactions –
Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?
Which of the following analysis techniques is an unsupervised data mining process?
A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?
An analyst is designing a dashboard that will provide a story of the sales and sales customer ratio. The following data is available:
Which of the following charts should the analyst consider including in the dashboard?
Which of the following data analysis tools increases the efficiency of data visualizations?
An analyst collected data that includes primary account numbers, expiration dates, and service codes. Which of the following data governance classifications is used to describe this data?
Given the following report:
Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).
During data profiling, an analyst decides to recode the status column in the following data set:
Which of the following data concerns explains why the analyst wants to take this action?
Given the following data table:
Which of the following are appropriate reasons to undertake data cleansing? (Select two).
A web developer wants to ensure that malicious users can't type SQL statements when they are asked for input, such as their username or user ID.
Which of the following query optimization techniques would effectively prevent SQL Injection attacks?
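As a sketch of how binding user input as a parameter differs from concatenating it into the SQL text, the snippet below uses Python's `sqlite3` placeholders. The `users` table, its rows, and the input string are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, invented for illustration.
conn.execute("CREATE TABLE users (username TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a1"), ("bob", "b2")],
)

malicious_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text and changes the query's logic.
unsafe_sql = f"SELECT username FROM users WHERE username = '{malicious_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterized: the input is bound as a value and compared literally.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?",
    (malicious_input,),
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

In the concatenated version the injected `OR '1'='1'` condition matches every row, while the bound version matches only a user whose name is literally the input string.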
Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?
Jhon is working on an ELT process that sources data from six different source systems.
Looking at the source data, he finds that data about the same people exists in two of the six systems.
What does he have to make sure he checks for in his ELT process?
Choose the best answer.
A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:
A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?
Given the following data set:
Which of the following is the best reason for cleansing the data?
Which of the following would be the best way to identify multicollinear attributes in a data set?
A data analyst has been asked to create a sales report that calculates the rolling 12-month average for sales. If the report will be published on November 1, 2020, which of the following months should the report cover?
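A rolling window can be enumerated with a short sketch. It assumes the window consists of the 12 complete calendar months immediately before the publication date; the helper name `trailing_months` is hypothetical.

```python
from datetime import date

def trailing_months(publish_date, n=12):
    """List the n complete calendar months before publish_date as (year, month)."""
    months = []
    year, month = publish_date.year, publish_date.month
    for _ in range(n):
        # Step back one month, wrapping from January to the previous December.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        months.append((year, month))
    return list(reversed(months))

window = trailing_months(date(2020, 11, 1))
print(window[0], window[-1])
```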
Which of the following query optimization techniques involves examining only the data that is needed for a particular task?
Which one of the following programming languages is specifically designed for use in analytics applications?
A data analyst for a media company needs to determine the most popular movie genre. Given the table below:
Which of the following must be done to the Genre column before this task can be completed?
Given the following tables:
Which of the following will be the dimensions from a FULL JOIN of the tables above?
A data set for sales per month includes the following data:
Which of the following cleaning and profiling methods should be applied to the data set?
A data analyst needs to write a SQL query measuring last month's website visits and distribute a summary report to the marketing team. Which of the following is the analyst creating?
An analyst is updating a customer contacts database with information obtained from a survey of new customers. Which of the following data manipulation techniques should the analyst use?
Under which of the following circumstances should the null hypothesis be accepted when α = 0.05?
Which of the following defines the policies and procedures for managing the master data?
A data analyst has been asked to create one table that has each employee's first name, last name, sales, and address. The sales and addresses are listed in the tables below:
Which of the following steps should the analyst take to create the table?
Given the following data:
CustomerID    ItemBought    Date
Tre_234       Sofa          2022-09-08
216_Tre       Shoes         08/02/2021
215/Tre       Blanket       2021/06/20
045/Tre       Mug           12-26-2021
Tre-345       Lamp          31/08/2022
TREJD19       Bucket        2022'08/01
Which of the following best describes the main issue in the data set?
An analyst is training a new coworker on the importance of data governance and is focusing on security requirements. Which of the following should the analyst include in the training? (Select two).
An analyst has generated a report that includes the number of months in the first two quarters of 2019 when sales exceeded $50,000:
Which of the following functions did the analyst use to generate the data in the Sales_indicator column?
Which of the following is the best variable format to store a customer's age using the least possible amount of storage?
Which of the following types of analyses is best to use when tracking sales revenue against quarterly targets?
A healthcare data analyst notices that one data set in the column for BloodPressure contains several outliers that need to be replaced with meaningful values. Which of the following data manipulation techniques should the analyst use?
Given the below:
Which of the following numbers represents a Type I error?
An analyst needs to know what data an organization possesses. Which of the following is the best document for the analyst to consult?
An analyst must obtain the average daily sales for the following week:
Which of the following must the analyst perform to obtain this value?