Friday, November 30, 2012

.::VULMSIT::.eNoxel.com CS614 Quiz No.2 Nov 29, 2012

CS614 - Data Warehousing

Question # 1 of 10 ( Start time: 10:29:52 PM ) Total Marks: 1
Data mining uses _________ algorithms to discover patterns and regularities in data.
Select correct option:
Mathematical
Computational
Statistical
None of these

Question # 2 of 10 ( Start time: 10:31:13 PM ) Total Marks: 1
The goal of ___________ is to look at as few blocks as possible to find the matching record(s).
Select correct option:
Indexing
Partitioning
Joining
None of these

Question # 3 of 10 ( Start time: 10:32:34 PM ) Total Marks: 1
An optimized structure which is built primarily for retrieval, with update being only a secondary consideration is
Select correct option:
OLTP
OLAP
DSS
Inverted Index

Question # 4 of 10 ( Start time: 10:33:23 PM ) Total Marks: 1
If every key in the data file is represented in the index file then index is
Select correct option:
Dense Index
Sparse Index
Inverted Index
None of these
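
As an illustration of the dense/sparse distinction this question relies on, the following minimal Python sketch builds both kinds of index over a small sorted data file. The block layout, the `blocks` list, and the function names are assumptions made for the example, not part of the course material.

```python
from bisect import bisect_right

# Hypothetical sorted data file: three blocks of (key, payload) records.
blocks = [
    [(10, "a"), (20, "b")],
    [(30, "c"), (40, "d")],
    [(50, "e"), (60, "f")],
]

# Dense index: one entry for every key in the data file.
dense_index = {key: block_no
               for block_no, block in enumerate(blocks)
               for key, _ in block}

# Sparse index: one entry per block (the first key stored in that block).
sparse_keys = [block[0][0] for block in blocks]


def lookup_dense(key):
    """Find the block through the dense index, then read that one block."""
    block_no = dense_index.get(key)
    if block_no is None:
        return None  # A missing key is detected without reading any data block.
    return dict(blocks[block_no]).get(key)


def lookup_sparse(key):
    """Pick the last block whose first key is <= key, then search inside it."""
    block_no = bisect_right(sparse_keys, key) - 1
    if block_no < 0:
        return None
    return dict(blocks[block_no]).get(key)
```

With the dense index held in memory, a lookup reads exactly one data block; the sparse index is smaller but only works because the data file is sorted.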

Question # 5 of 10 ( Start time: 10:34:47 PM ) Total Marks: 1
There are many variants of the traditional nested-loop join. If the index is built as part of the query plan and subsequently dropped, it is called
Select correct option:
Naive nested-loop join
Index nested-loop join
Temporary index nested-loop join
None of these
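
The nested-loop variants mentioned in this question can be sketched in Python as follows. This is only an illustration under simple assumptions: tables are lists of dictionaries, the join is an equi-join on one column, and the function names are hypothetical.

```python
def naive_nested_loop_join(outer, inner, key):
    """Compare every outer row with every inner row (no index is used)."""
    return [(o, i) for o in outer for i in inner if o[key] == i[key]]


def temporary_index_nested_loop_join(outer, inner, key):
    """Build an index on the inner table for this query only, probe it, then drop it."""
    temp_index = {}
    for row in inner:
        temp_index.setdefault(row[key], []).append(row)
    # Each outer row probes the index instead of scanning the whole inner table.
    result = [(o, i) for o in outer for i in temp_index.get(o[key], [])]
    temp_index.clear()  # The index does not outlive the query.
    return result
```

Both functions return the same pairs; the difference is that the second replaces the inner scan with an index probe at the cost of building the index first.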

Question # 6 of 10 ( Start time: 10:36:08 PM ) Total Marks: 1
Data mining evolved as a mechanism to cater for the limitations of ________ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.
Select correct option:
OLTP
OLAP
DSS
DWH

Question # 7 of 10 ( Start time: 10:37:30 PM ) Total Marks: 1
A dense index, if it fits into memory, costs only ______ disk I/O access to locate a record by a given key.
Select correct option:
One
Two
Linear
Quadratic

Question # 8 of 10 ( Start time: 10:38:29 PM ) Total Marks: 1
Data mining derives its name from the similarities between searching for valuable business information in a large database, for example, finding linked products in gigabytes of store scanner data, and mining a mountain for a _________ of valuable ore.
Select correct option:
Furrow
Streak
Trough
Vein

Question # 9 of 10 ( Start time: 10:39:49 PM ) Total Marks: 1
If 'M' rows from table-A match the conditions in the query, then table-B is accessed 'M' times. Suppose table-B has an index on the join column. If 'a' I/Os are required to read the index block for each scan plus 'b' I/Os for each data block, then the total cost of accessing table-B is approximately _____________ logical I/Os.
Select correct option:
(a + b)M
(a - b)M
(a + b + M)
(a * b * M)
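
The cost reasoning in this question can be worked through numerically. The sketch below assumes that each of the M probes of table-B pays one index cost and one data-block cost; the function name and the sample figures are made up for illustration.

```python
def table_b_access_cost(index_ios_per_probe, data_ios_per_probe, matching_rows):
    """Approximate logical I/Os when table-B is probed once per matching outer row."""
    cost_per_probe = index_ios_per_probe + data_ios_per_probe
    return cost_per_probe * matching_rows


# Example figures (assumed): 2 index I/Os and 1 data-block I/O per probe, 1000 matching rows.
print(table_b_access_cost(2, 1, 1000))  # 3000
```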

Question # 10 of 10 ( Start time: 10:41:16 PM ) Total Marks: 1
________ is the technique in which existing heterogeneous segments are reshuffled and relocated into homogeneous segments.
Select correct option:
Clustering
Aggregation
Segmentation
Partitioning

 

 

The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The ______ the portion of the program that must be executed sequentially, the greater the scalability of the computation.

Larger

Smaller

Unambiguous

Superior
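
The relationship between the sequential portion and scalability is commonly formalized as Amdahl's law. The quiz does not name that law, so treat the sketch below as one standard way to quantify the statement rather than as the course's own formula.

```python
def amdahl_speedup(sequential_fraction, processors):
    """Theoretical speedup when only the non-sequential part is spread over the processors."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


# Compare two sequential fractions on the same 16 processors.
for fraction in (0.10, 0.01):
    print(fraction, round(amdahl_speedup(fraction, 16), 2))
```

However many processors are added, the speedup can never exceed 1 / sequential_fraction.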

 

_______________, if it fits into memory, costs only one disk I/O access to locate a record by a given key.

An Inverted Index

A Sparse Index

A Dense Index

None of these

 

If someone told you that he had a good model to predict customer usage, the first thing you might try would be to ask him to apply his model to your customer _______, where you already knew the answer.

Base

Drive

File

Log

 

The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by _____________ tools typical of decision support systems.

Introspective

Intuitive

Reminiscent

Retrospective

 

 

With data mining, the best way to accomplish this is by setting aside some of your data in a vault to isolate it from the mining process; once the mining is complete, the results can be tested against the isolated data to confirm the model's _______.

Validity           

Security

Integrity

None of these
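
Setting data aside before mining, as this question describes, can be sketched in Python as below. The 30% hold-out fraction, the fixed seed, and the function names are assumptions for the example; `model` stands for any callable that maps a record's features to a predicted label.

```python
import random


def split_holdout(records, holdout_fraction=0.3, seed=0):
    """Shuffle the records and set a fraction aside, untouched by the mining step."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    holdout_size = max(1, int(len(shuffled) * holdout_fraction))
    cut = len(shuffled) - holdout_size
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, labelled_records):
    """Fraction of (features, label) pairs that the model predicts correctly."""
    hits = sum(1 for features, label in labelled_records if model(features) == label)
    return hits / len(labelled_records)
```

A model would be built from the first part only and then scored with `accuracy` on the part that was set aside.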

 

 

_______________, if it is too big to fit into memory, will be expensive when used to find a record by a given key.

An Inverted Index

A Sparse Index

A Dense Index

None of these

 

 

 

 

With data mining, the best way to accomplish this is by setting aside some of your data in a ________ to isolate it from the mining process; once the mining is complete, the results can be tested against the isolated data to confirm the model's validity.

Cell

Disk

Folder

Vault

 

The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The smaller the portion of the program that must be executed __________, the greater the scalability of the computation.

In Parallel

Distributed

Sequentially

None of these

 

 

Data mining is a/an __________ approach, where browsing through data using data mining techniques may reveal something that might be of interest to the user as information that was unknown previously.

Non-Exploratory

Exploratory

Computer Science

None of these

 

To identify the __________________ required, we need to perform data profiling.
Degree of Transformation
Complexity
Cost
Time


Execution can be completed successfully or it may be stopped due to some error. If some error occurs, execution will be terminated abnormally and all transactions will be ___________
Committed to the database
Rolled back

Companies collect and record their own operational data, but at the same time they also use reference data obtained from _______ sources such as codes, prices etc.
Operational
None of these
Internal
External

 


Ad-hoc access means running queries that are already known.
True
False

 


____________ in agriculture extension is that pest population beyond which the benefit of spraying outweighs its cost.
Profit Threshold Level
Economic Threshold Level
Medicine Threshold Level
None of these


People that design and build the data warehouse must be capable of working across the organization at all levels
True
False

The _________ is only a small part in realizing the true business value buried within the mountain of data collected and stored within organizations business systems and operational databases.
Independence on technology
Dependence on technology
None of these

Many data warehouse project teams waste enormous amounts of time searching in vain for a ___________________.
Silver Bullet
Golden Bullet
Suitable Hardware
Compatible Product

 

Multidimensional databases typically use proprietary __________ format to store pre-summarized cube structures.
File
Application
Aggregate
Database

A dense index, if it fits into memory, costs only ______ disk I/O access to locate a record by a given key.
One
Two 
lg (n)
n

All data is ______________ of something real.
I An Abstraction
II A Representation
Which of the following options is true?
I Only
II Only
Both I & II
None of I & II

The key idea behind ___________ is to take a big task and break it into subtasks that can be processed concurrently on a stream of data inputs in multiple, overlapping stages of execution.
Pipeline Parallelism
Overlapped Parallelism
Massive Parallelism
Distributed Parallelism
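
The idea of overlapping stages working on a stream of inputs can be sketched with one thread per stage connected by queues. This is a minimal illustration; the `DONE` sentinel, `stage`, and `run_pipeline` are hypothetical names, and a real parallel database engine would not be built this way.

```python
import queue
import threading

DONE = object()  # Sentinel marking the end of the input stream.


def stage(func, inbox, outbox):
    """Apply func to each item taken from inbox and pass the result downstream."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)
            return
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Run one thread per stage; stage k can work on item n while stage k+1 handles item n-1."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
               for i, f in enumerate(funcs)]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 10])` passes each number through both stages in order. In CPython the stages overlap in time, but CPU-bound work is still serialized by the global interpreter lock, so the sketch shows the structure rather than a real speedup.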

Non-uniform distribution of data across the processors is called ______.
Skew in Partition
Pipeline Distribution
Distributed Distribution
Uncontrolled Distribution
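
To make non-uniform distribution concrete, the sketch below assigns integer keys to processors with a simple modulo rule and measures how uneven the result is. The modulo rule and the `skew_ratio` measure are assumptions chosen for the example.

```python
from collections import Counter


def partition_counts(keys, n_processors):
    """Count the rows each processor receives under modulo partitioning."""
    counts = Counter(key % n_processors for key in keys)
    return [counts.get(p, 0) for p in range(n_processors)]


def skew_ratio(counts):
    """Largest partition divided by the average partition size (1.0 means no skew)."""
    return max(counts) / (sum(counts) / len(counts))


uniform = partition_counts(range(100), 4)
# 70 rows share the same key, so one processor receives most of the work.
skewed = partition_counts([0] * 70 + list(range(30)), 4)
print(uniform, skew_ratio(uniform))
print(skewed, skew_ratio(skewed))
```

The overloaded processor determines the total elapsed time, which is why a high ratio hurts parallel execution.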

Different techniques are available to measure or quantify similarity or dissimilarity. Which of the following options names the available techniques?
Pearson correlation is the only technique
Euclidean distance is the only technique
Both Pearson correlation and Euclidean distance
None of these
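
For reference, the two measures named in the options can be computed as in the sketch below, which uses plain Python lists and the textbook definitions. The function names are chosen for the example.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_correlation(x, y):
    """Linear correlation in [-1, 1]; undefined if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Euclidean distance measures dissimilarity (0 means identical), while Pearson correlation measures similarity of shape regardless of scale.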

 

For a DWH project, the key requirements are ________ and product experience.
Tools
Industry
Software
None of these

Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.
Increasing
Decreasing
Maintaining
None of these

Focusing only on data warehouse delivery often ends up in _________.
Rebuilding
Success
Good Stable Product
None of these

Pakistan is one of the five major ________ countries in the world.
Cotton-growing
Rice-growing
Weapon Producing

_____________ is a process which involves gathering information about columns through the execution of certain queries, with the intention of identifying erroneous records.
Data profiling
Data Anomaly Detection
Record Duplicate Detection
None of these
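
The kind of per-column information gathering described in this question might look like the following sketch. It assumes rows are dictionaries and that a missing or `None` value counts as a null; the statistics chosen are only examples of what such queries could collect.

```python
def profile_column(rows, column):
    """Collect simple statistics about one column to help spot suspicious values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical sample: an age of 300 stands out as a likely error.
sample = [{"age": 25}, {"age": None}, {"age": 25}, {"age": 300}, {}]
print(profile_column(sample, "age"))
```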

Relational databases allow you to navigate the data in ____________ that is appropriate using the primary, foreign key structure within the data model.
Only One Direction
Any Direction
Two Directions
None of these

DSS queries do not involve a primary key
True
False

__________________ contributes to an under-utilization of valuable and expensive historical data, and inevitably results in a limited capability to provide decision support and analysis.
The lack of data integration and standardization
Missing Data
Data Stored in Heterogeneous Sources

 

 

DTS allows us to connect through any data source or destination that is supported by ____________
OLE DB
OLAP
OLTP
Data Warehouse

Data Transformation Services (DTS) provide a set of _____ that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by DTS connectivity.
Tools
Documentations
Guidelines

If some error occurs, execution will be terminated abnormally and all transactions will be rolled back. In this case, when we access the database we will find it in the state it was in before the ____________.
Execution of package
Creation of package
Connection of package

To judge effectiveness we perform data profiling twice. 
One before Extraction and the other after Extraction
One before Transformation and the other after Transformation
One before Loading and the other after Loading

The need to synchronize data upon update is called
Data Manipulation
Data Replication
Data Coherency
Data Imitation

Taken jointly, the extract programs or naturally evolving systems formed a spider web, also known as
Distributed Systems Architecture
Legacy Systems Architecture
Online Systems Architecture
Intranet Systems Architecture

 

A node of a B-Tree is stored in a memory block, and traversing a B-Tree involves ______ page faults.
O(n)
O(n^2)
O(n lg n)
O(lg n)

Which statement is true for De-Normalization?
Redundant data is a performance liability at query time, but is a performance benefit at update time.
Redundant data is a performance benefit at both query time and update time.
Redundant data is a performance liability at both query time and update time.
Redundant data is a performance benefit at query time, but is a performance liability at update time.
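
The query-time versus update-time trade-off of redundant data can be seen in a small sketch. The tables, column names, and helper functions below are hypothetical; the point is only to count how many rows each operation has to touch.

```python
# Normalized form: the city is stored once per customer.
customers = {1: {"name": "Customer A", "city": "Lahore"}}
orders = [{"order_id": 100, "customer_id": 1},
          {"order_id": 101, "customer_id": 1}]


def denormalize(orders, customers):
    """Copy the customer's city into every order row so queries need no join."""
    return [{**o, "city": customers[o["customer_id"]]["city"]} for o in orders]


def update_city_normalized(customers, customer_id, city):
    """One row changes, because the value is stored in one place."""
    customers[customer_id]["city"] = city
    return 1


def update_city_denormalized(wide_orders, customer_id, city):
    """Every redundant copy must change; returns the number of rows touched."""
    touched = 0
    for row in wide_orders:
        if row["customer_id"] == customer_id:
            row["city"] = city
            touched += 1
    return touched
```

Reading the city of an order from the denormalized rows needs no lookup in `customers`, but changing a city has to rewrite every copy.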

It is observed that every year the amount of data recorded in an organization ______.

Doubles  

Triples

Quadruples

Remains same as previous year

 

Pre-computed _______ can solve performance problems

Aggregates  

Facts

Dimensions

 

The degree of similarity between two records, often measured by a numerical value between _______, usually depends on application characteristics.

0 and 1  

0 and 10

0 and 100

0 and 99

The purpose of the House of Quality technique is to reduce ______ types of risk.

Two  

Three

Four

All

NUMA stands for __________

Non-uniform Memory Access

Non-updateable Memory Architecture

New Universal Memory Architecture

 

Kimball's iterative data warehouse development approach drew on decades of experience to develop the _____________.

Business Dimensional Lifecycle

Data Warehouse Dimension

Business Definition Lifecycle

OLAP Dimension

 

For a smooth DWH implementation we must be a technologist.

True

False  

During the application specification activity, we also must give consideration to the organization of the applications.

True  

False

 

The most recent attack is the ________ attack on the cotton crop during 2003-04, resulting in a loss of nearly 0.5 million bales.

Boll Worm  

Purple Worm

Blue Worm

Cotton Worm

 

The users of a data warehouse are knowledge workers; in other words, they are _________ in the organization.

Decision maker

Manager

Database Administrator

DWH Analyst

 

_________ breaks a table into multiple tables based upon common column values.

Horizontal splitting  

Vertical splitting
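
Breaking a table into several tables by the value of one column can be sketched as below. Rows are assumed to be dictionaries, and the function name and sample data are invented for the example.

```python
from collections import defaultdict


def split_by_column_value(rows, column):
    """Return one table (list of rows) per distinct value of the given column."""
    tables = defaultdict(list)
    for row in rows:
        tables[row[column]].append(row)
    return dict(tables)


# Hypothetical sales rows split by region.
sales = [
    {"region": "North", "amount": 10},
    {"region": "South", "amount": 20},
    {"region": "North", "amount": 30},
]
by_region = split_by_column_value(sales, "region")
print(sorted(by_region))  # ['North', 'South']
```

Every resulting table keeps all columns; only the rows are divided, so a query restricted to one region scans a smaller table.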

 

As opposed to the outcome of classification, estimation deals with a ____________ valued outcome.

Discrete

Isolated

Continuous

Distinct

 

 

 

 

 

The technique that is used to perform these feats in data mining is called modeling, and this act of model building is something that people have been doing for a long time, certainly before the _______ of computers or data mining technology.

Access

Advent

Ascent

Avowal

 

A data warehouse may include
Legacy systems
Only internal data sources
Privacy restrictions
Small data mart

De-Normalization normally speeds up 
Data Retrieval
Data Modification
Development Cycle
Data Replication

In horizontal splitting, we split a relation into multiple tables on the basis of 
Common Column Values
Common Row Values
Different Index Values
Value resulted by ad-hoc query

 

For a given data set, to get a global view in un-supervised learning we use
One-way Clustering
Bi-clustering
Pearson correlation
Euclidean distance

In DWH project, it is assured that ___________ environment is similar to the production environment.
Designing
Development
Analysis
Implementation

For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This is to give the total view of organization from:
Owner's Perspective
Customer's Perspective
Decision Maker's Perspective
Employee's Perspective

Which is the least appropriate join operation for Pipeline parallelism?

Hash Join

Inner Join

Outer Join

Sort-Merge Join

 

We must try to find the one access tool that will handle all the needs of the users.

True

False

Investing years in architecture and forgetting the primary purpose of solving business problems results in an inefficient application. This is an example of the _________ mistake.

Extreme Technology Design

Extreme Architecture Design

 

The automated, prospective analyses offered by data mining move beyond the analysis of past events provided by retrospective tools typical of ___________.

OLTP

OLAP

Decision Support systems

None of these

There are many variants of the traditional nested-loop join. If an index is exploited, then it is called ______

Naïve nested loop join index

Nested loop join temporary index

Index nested-loop joins

 


A data warehouse implementation without an OLAP tool is always possible.

True

False

 

 

_____modeling technique is more appropriate for data warehouses.

entity-relationship

dimensional

physical

None of the given

 

 

The performance in a MOLAP cube comes from the O(1) look-up time for the array data structure.

 

True

False
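
The array look-up this statement refers to can be illustrated with a small dense cube stored as one flat list. This is a sketch, not how any particular MOLAP product lays out its storage; the `DenseCube` class and the sample dimensions are assumptions.

```python
class DenseCube:
    """A cube stored as one flat array, addressed by a row-major offset."""

    def __init__(self, dims):
        self.dims = dims  # One list of member names per dimension.
        self.positions = [{m: i for i, m in enumerate(d)} for d in dims]
        size = 1
        for d in dims:
            size *= len(d)
        self.cells = [0.0] * size

    def _offset(self, members):
        # Cost depends on the number of dimensions, not on the number of cells.
        offset = 0
        for pos, dim, member in zip(self.positions, self.dims, members):
            offset = offset * len(dim) + pos[member]
        return offset

    def set(self, members, value):
        self.cells[self._offset(members)] = value

    def get(self, members):
        return self.cells[self._offset(members)]


cube = DenseCube([["2011", "2012"], ["Lahore", "Karachi", "Multan"]])
cube.set(("2012", "Karachi"), 42.0)
print(cube.get(("2012", "Karachi")))  # 42.0
```

Because the cell address is computed arithmetically, no search through the data is needed; the price is that every combination of members occupies a cell, even when it is empty.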

 

 

Multi-dimensional databases (MDDs) typically use ___________ formats to store pre-summarized cube structures.

 

SQL

proprietary file

Object oriented

Non- proprietary file

 

Slice and Dice is changing the view of the data.

True

False

 

 

 

Data warehousing and on-line analytical processing (OLAP) are _______ elements of decision support system.

 

Unusual

Essential

Optional

None of the given

 

 

Virtual cube is used to query two similar cubes by creating a third "virtual" cube by a join between two cubes.

 

True

False

 

Analytical processing uses ____________ , instead of record level access.

multi-level aggregates

Single-level aggregates

Single-level hierarchy

None of the Given

 

 

The divide-and-conquer cube partitioning approach helps alleviate the ____________ limitations of a MOLAP implementation.

Flexibility

Maintainability

Security

Scalability

 

 

In a traditional MIS system, there is an almost linear sequence of queries.

True

False

 

 

Data Warehouse provides the best support for analysis while OLAP carries out the _________ task.

Mandatory

Whole

Analysis

Prediction

 

 

DOLAP allows download of "cube" structures to a desktop platform with the need for shared relational or cube server.

 

True

False

 

 

The STAR schema used for data design is a __________ consisting of fact and dimension tables.

Select correct option:

Network model

Relational model

Hierarchical data model

None of the given

 


Regards,
Umair Saulat Mc100403250
