CSCI 5408 Assignment 2 solution

$30.00

Download Details:

  • Name: Assignment-2-cmli21.zip
  • Type: zip
  • Size: 787.00 B

Category:

Description

5/5 - (7 votes)

Problem #1: This problem contains one reading task.
Reading Material #1: To retrieve the paper, visit IEEE database through libraries.dal.ca
Y. Gao, X. Gao, X. Yang, J. Liu and G. Chen, “An Efficient Ring-Based Metadata Management Policy for
Large-Scale Distributed File Systems,” in IEEE Transactions on Parallel and Distributed Systems, vol. 30,
no. 9, pp. 1962-1974, 1 Sept. 2019, doi: 10.1109/TPDS.2019.2901883.
→ Read the paper and perform the following:
• Write a summary ( 1.5 page) on the paper in your own words. (you do not
need to add images/figures/tables from the paper. However, you can add your
own block diagrams or flowcharts to support the summary you have written)
• What is the central idea of discussion?
• Did you find any topic of interest in this paper? If Yes, what are those, and why
do you think those are interesting? If No, then as per you, what are the
shortcomings of this paper?
Problem #1 Submission Requirements:
Approx. 1.5-page Report (problem1.pdf) containing the summary and analysis
Winter 2022 saurabh.dey@dal.ca
Problem #2: This problem contains series of tasks. You need to fill a worksheet and
provide screenshots to support your worksheet
Research and Development: You need to create distributed DBMS infrastructure, and
perform a comparative study of local and distributed transactions
Visit the website and extract the following datasets:
https://www.kaggle.com/olistbr/brazilian-ecommerce
Task 1: Building database from given dataset

• Download the CSV files given in the Kaggle link, and consider those as your data
points.
o Similar to previous assignment (Assignment 1), clean and format each csv file
(replace null values with ‘0’ if it is a number column || replace null values with
‘ ’ if it is a text/string column) – Write a ½ page summary on what type of
cleaning and formatting you have applied on the CSV files. A single ½ page
summary for all CSV files is sufficient. Add summary to problem2.pdf
o You do not need to remove any columns, keep all columns as it is in CSVs.
• Create a single database (name of the database “A2”) in your local
MySQL RDBMS.
• Create 9 tables within the newly created database. These tables will contain structure
identical to the CSV files you cleaned, and you will populate the tables by importing
data from the CSV files. (if files are large, you need to import using command prompt.
Workbench import option may not work)
• Export the SQL (structure + value) of the database with all tables and data. Name the
file “MyDatabase.SQL”
• Create a GCP MySQL Virtual Machine instance and use the MyDatabase.SQL to create
the database, tables, and uploading data.
• If you have successfully completed the above steps – it means, at this point, you have
two identical databases. One in your local machine and another in the cloud.
• In Problem2.pdf, attach screenshots of your cloud instance, and local instance
running mysql command “SHOW TABLES FROM A2;”

Task 2: Perform Transaction in your local machine
• In your local machine MySQL, you have 9 tables in the database A2. You
need to write a Transaction with at least (2 insert, 2 delete, 2 update, 3 select
operations) considering all the 9 tables. For insert, and update you can create your
own set of dummy data. Based on your transaction query formation, complexity, and
uniqueness you will get points for Novelty
• Capture the transaction execution time (i.e. execution time for all 9 or more
statements
Begin Transaction
—–
—–
CSCI 5408
End Transaction
Task 3: Perform Transaction in the Remote machine
• In your GCP MySQL virtual machine MySQL, you have 9 tables in the database
A2. You need to write a Remote Transaction with at least (2 insert, 2
delete, 2 update, 3 select operations) considering all the 9 tables. For insert, and
update you can create your own set of dummy data. Based on your transaction query
formation, complexity, and uniqueness you will get points for Novelty. You should not
use the same queries that you used in Task 2. Please do some major modifications in
the SQL queries
• Capture the transaction execution time (i.e. execution time for all 9 or more
statements
Begin Transaction
—–
—–
End Transaction

Task 4: Perform Distributed Transaction
• In your GCP MySQL virtual machine, and in your local MySQL, you have 9 tables in the
database A2. You need to write a Distributed Transaction with at least
(2 insert, 2 delete, 2 update, 3 select operations) considering all the 9 tables.
• This time, from the 9 unique tables, you need to use any 5 tables (your choice) from
your Local machine database, and the remaining 4 tables in the GCP instance for
performing the distributed transaction.
• For insert, and update you can create your own set of dummy data. Based on your
transaction query formation, complexity, and uniqueness you will get points for
Novelty. Please do some major modifications in the SQL queries.
• You need to write a Java program to call the distributed transaction.
Hint: Set connection.setAutoCommit(false); in your Java program to control
the execution and database update
o Run Java Program:
o within Java program select the 1st database (Local machine)
Begin Transaction
—–
—–
select the 2nd database (Remote machine)
—–
—–
End Transaction
connection.setAutoCommit(true);
• Capture the distributed transaction execution time (i.e. execution time for all 9 or
more statements
• Fill the worksheet and submit it as part of Problem2.pdf
Tasks Execution
Time
Transaction Query Your Observations
Task 2 Write the transaction query here for
each transaction
Task 3
Task 4
Problem #2 Submission Requirements:
1. Upload your program code for Problem 2 (Task 4) to gitlab (https://git.cs.dal.ca).
2. Provide summary needed for Task 1, and screenshots
3. Provide Queries of all transactions you performed in Task2, Task3, and Task4.
4. Provide the filled worksheet