Sale!

CS 411: Database Systems Assignment 5 solved

Original price was: $35.00.Current price is: $30.00. $25.50

Category:

Description

5/5 - (5 votes)

1 Query Execution (25 pts)
Consider the following relations with no indexes on them:
• Relation R has 5,000 tuples, 100 tuples per block
• Relation S has 2,000 tuples, unknown number of tuples per block
The number of blocks in memory is 10. Say, S is as large as possible (within the limits that
main memory can afford) for the two pass sort-merge join (slide 57). That is, if S was larger,
the two-pass sort-merge join would not work.
Answer the following questions:
A How many tuples per block does S have? (Do not forget to show your calculations.)
B Using your answer from A, what is the cost of joining R and S using the two pass
sort-merge join algorithm (slide 57)?
C Using your answer from A, what is the cost of joining R and S using the optimized
block nested-loop join algorithm?
D What is the cost of joining R and S using a two pass hash-based join?
E Based on questions 2, 3, and 4, explain which variant of the algorithm you would
choose in terms of I/O cost. If multiple algorithms have the same I/O cost, explain
other considerations that may influence your choice.
3
2 Query Optimization (35 pts)
Consider the relations A(x,y,z), B(w,x), and C(u,v,w), with the following properties:
A(x,y) B(y,z) C(z,x,u)
T(A) = 2500 T(B) = 1000 T(C) = 6000
V(A, x) = 30 V(B, y) = 250 V(C, z) = 20
V(A, y) = 500 V(B, z) = 100 V(C, x) = 60
V(C, u) = 40
where, T(R) = number of tuples in relation R and V(R, a) = number of distinct values of
attribute a in relation R. Estimate the sizes (measured in number of tuples) of the result of
the following expressions:
1. A × C
2. A ./ B
3. SELECT u FROM C WHERE u=20
4. σx = 10 and y=30 (B ./ C)
4
3 Dynamic Programming (40 pts)
Consider the following relations, where T(R) = number of tuples in relation R and V(R, a)
= number of distinct values of attribute a in relation R.
A(x,y) B(y,z) C(z,x,u) D(u,v)
T(A) = 2500 T(B) = 1000 T(C) = 6000 T(D) = 2000
V(A, x) = 30 V(B, y) = 250 V(C, z) = 20 V(D, u) = 100
V(A, y) = 200 V(B, z) = 100 V(C, x) = 60 V(D, v) = 40
V(C, u) = 40
We want to join all these relations as efficiently as possible. Determine the most efficient
way to do the join. Clearly state any assumptions you have made. Show your work by
completing the following table (each step in the dynamic programming algorithm should be
one row):
Subset Size Lowest Cost Lowest cost plan
. . . . . . . . . . . .