Data mining exam questions

Question Bank for Data Mining Course

Data Mining

Assiut University

Academic year: 2023/2024

49. In association rule mining, the generation of the frequent itemsets is the computationally intensive step.
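
To illustrate why frequent itemset generation dominates the cost, here is a minimal brute-force sketch. It is not Apriori itself: it simply enumerates every candidate subset, so the number of candidates grows as 2^n - 1 for n distinct items. The function name `frequent_itemsets` and the toy transactions are assumptions made up for this example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets meeting min_support.

    Brute force: every non-empty subset of the item universe is a candidate
    and is counted against every transaction.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support count = number of transactions containing the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count >= min_support:
                result[frozenset(candidate)] = count
    return result

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent = frequent_itemsets(transactions, min_support=2)
```

Apriori reduces this work by only extending itemsets that are already known to be frequent, but the counting passes over the data remain the expensive part.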

50. The problem of finding hidden structure in unlabeled data is called

51. The choice of a data mining tool is made at this step of the KDD process.

a. goal identification

b. creating a target dataset

c. data preprocessing

d. data mining

52. Attributes may be eliminated from the target dataset during this step of the KDD process.

a. creating a target dataset

b. data preprocessing

c. data transformation

d. data mining

53. A common method used by some data mining techniques to deal with missing data items during the learning process.

a. replace missing real-valued data items with class means

b. discard records with missing data

c. replace missing attribute values with the values found within other similar instances

d. ignore missing attribute values

54. The term data mining was originally used to ______.

a. include most forms of data analysis in order to increase sales

b. describe the process through which previously unknown patterns in data were discovered

c. describe the analysis of huge datasets stored in data warehouses

d. All of the above

55. What is a major characteristic of data mining?

a. Because of the large amounts of data and massive search efforts, it is sometimes necessary to use serial processing for data mining

59. __________ may be defined as the data objects that do not comply with the

60. "Efficiency and scalability of data mining algorithms" issues come under?

61. What is the use of data cleaning?

62. Data Mining System Classification consists of?

63. Which of the following is correct application of data mining?

67. The first step involved in knowledge discovery is?

68. In which step of Knowledge Discovery, multiple data sources are combined?

69. The most commonly used algorithm to discover association rules by recursively

70. A process that uses statistical, mathematical, artificial intelligence, and machine-

71. A machine learning process that performs rule induction or a related procedure to

72. Commonly co-occurring groupings of things. AKA market-basket analysis.

73. A type of data that represents the numeric values of specific variables. for

78. Supervised induction used to analyze the historical data stored in a database and

79. The number of iterations in Apriori ___________

80. To determine association rules from frequent itemsets

81. If {A,B,C,D} is a frequent itemset, the candidate rule which is not possible is

82. Feedback: D -> ABCD

83. ______ routines attempt to fill in missing values, smooth out noise while

84. ________ is used to refer to systems and technologies that provide the business

85. In smoothing by bin means, each value in a bin is replaced by the mean value of

86. ______ regression involves finding the “best” line to fit two variables so that

87. _____ works to remove the noise from the data that includes techniques like

88. Redundancies can be detected by correlation analysis. (True/False)

89. The ______ technique uses encoding mechanisms to reduce the data set size.

90. In which strategy of data reduction are redundant attributes detected?

91. The _____ rule can be used to segment numeric data into relatively uniform,

92. Oracle, SQL/Server, DB2 are examples of _____________.

1. Data mining is best described as the process of

a. identifying patterns in data.

b. deducing relationships in data.

c. representing data.

d. simulating trends in data.

2. Data used to build a data mining model.

a. validation data

b. training data

c. test data

d. hidden data

3. Supervised learning and unsupervised clustering both require at least one

a. hidden attribute.

b. output attribute.

c. input attribute.

d. categorical attribute.

4. Supervised learning differs from unsupervised clustering in that supervised learning requires

a. at least one input attribute.

b. input attributes to be categorical.

c. at least one output attribute.

d. output attributes to be categorical.

5. Which of the following is a valid production rule for the decision tree below?

[Figure: decision tree]
Business Appointment?
  Yes: Decision = wear slacks
  No: Temp above 70?
    No: Decision = wear jeans
    Yes: Decision = wear shorts

a. IF Business Appointment = No & Temp above 70 = No

THEN Decision = wear slacks

b. IF Business Appointment = Yes & Temp above 70 = Yes

THEN Decision = wear shorts

c. IF Temp above 70 = No

THEN Decision = wear shorts

d. IF Business Appointment = No & Temp above 70 = No

THEN Decision = wear jeans
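
A production rule is valid for a decision tree when its conditions follow one complete path from the root to a leaf. The sketch below enumerates those paths. It assumes the figure reads as follows: Business Appointment = Yes leads to wear slacks, and Business Appointment = No leads to the Temp above 70 test, where No gives wear jeans and Yes gives wear shorts. The nested-dictionary encoding and the names `TREE`, `production_rules`, and `is_valid` are hypothetical, chosen only for this illustration.

```python
# Assumed reading of the figure: {attribute: {branch value: subtree or decision}}.
TREE = {
    "Business Appointment": {
        "Yes": "wear slacks",
        "No": {
            "Temp above 70": {
                "No": "wear jeans",
                "Yes": "wear shorts",
            }
        },
    }
}

def production_rules(node, conditions=()):
    """Yield (conditions, decision) for every root-to-leaf path."""
    if isinstance(node, str):
        # A leaf: the accumulated tests form the IF part of one rule.
        yield dict(conditions), node
        return
    # Each internal node holds exactly one attribute to test.
    (attribute, branches), = node.items()
    for value, child in branches.items():
        yield from production_rules(child, conditions + ((attribute, value),))

rules = list(production_rules(TREE))

def is_valid(conditions, decision):
    """True if the rule matches one complete path of the tree."""
    return (conditions, decision) in rules
```

Each answer option can then be checked by passing its IF conditions as a dictionary and its THEN decision as a string to `is_valid`.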

6. A statement to be tested.

a. theory

b. procedure

c. principle

d. hypothesis

7. Which statement about outliers is true?

a. Outliers should be identified and removed from a dataset.

b. Outliers should be part of the training dataset but should not be present in the test data.

c. Outliers should be part of the test dataset but should not be present in the training data.

d. The nature of the problem determines how outliers are used.

e. More than one of a, b, c or d is true.

8. Assume that we have a dataset containing information about 200 individuals. One hundred of these individuals have purchased life insurance. A supervised data mining session has discovered the following rule:

IF age < 30 & credit card insurance = yes

THEN life insurance = yes

Rule Accuracy: 70%

Rule Coverage: 63%

How many individuals in the class life insurance = no have credit card insurance and are less than 30 years old?

a. 63

b. 70

c. 30

d. 27
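
The arithmetic behind question 8 can be sketched as follows. The sketch assumes the usual definitions, which the question itself does not spell out: rule coverage is the fraction of the target class (life insurance = yes) that satisfies the rule's conditions, and rule accuracy is the fraction of all individuals satisfying the conditions who belong to the target class. The variable names are illustrative.

```python
from fractions import Fraction

class_size = 100                # individuals with life insurance = yes
coverage = Fraction(63, 100)    # share of the "yes" class matching the IF part
accuracy = Fraction(70, 100)    # share of all matches that are in the "yes" class

covered_yes = coverage * class_size         # "yes" individuals matching the IF part
total_matching = covered_yes / accuracy     # everyone matching the IF part
matching_no = total_matching - covered_yes  # matches in the "no" class

print(covered_yes, total_matching, matching_no)  # prints: 63 90 27
```

Using `Fraction` keeps the percentages exact, so the intermediate counts come out as whole numbers without rounding.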

9. Unlike traditional production rules, association rules

a. allow the same variable to be an input attribute in one rule and an output attribute