BIF524/CSC463 Project - Phase 3

The project is split into three phases that match the learning outcomes throughout the
course. Each phase accounts for 10% of your total grade.


The aim of this project is to demonstrate your ability to apply and discuss the outcomes
of various data mining techniques on a problem and a dataset of your interest.
• The dataset must include both quantitative and qualitative attributes.
• Your work should not be limited to what you learn in the practical sessions of the course.
• You must submit an R Markdown document, knitted to a PDF file, for every phase.
• You can work in a group of two, with the same group in all phases.
• Your grade will be subject to a 5% penalty for every day of submission delay.
– Phase III: (10%) due Wednesday, Dec. 7, 11:59pm.
• Use the dataset that you picked in Phase 2, or choose a new dataset; in that case, discuss
your choice with me. (1%)

• N.B. Your dataset should not be associated with any existing work related to the
required tasks, e.g., on Kaggle, GitHub, …
• Apply tree-based approaches, including decision trees, random forest, bagging, and boosting.
• Apply unsupervised techniques, including k-means and hierarchical clustering, as well as
principal component analysis. Analyze and comment on your results. (6%)
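
As an illustrative starting point for the unsupervised part only, the sketch below runs principal component analysis, k-means, and hierarchical clustering using base R functions. It is not a model answer: the built-in iris data is an assumed stand-in for your own dataset, and k = 3 and complete linkage are arbitrary choices that you would need to justify or tune for your data.

```r
# Illustrative only: the built-in iris data stands in for your own dataset.
data(iris)

# Keep the quantitative columns and standardize them so that no single
# variable dominates the distance computations.
X <- scale(iris[, 1:4])

# Principal component analysis on the standardized data.
pca <- prcomp(X)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# k-means with k = 3 (an assumed value); several random starts reduce the
# risk of ending in a poor local optimum.
set.seed(1)
km <- kmeans(X, centers = 3, nstart = 25)

# Hierarchical clustering with complete linkage on Euclidean distances,
# cut into the same number of clusters for comparison.
hc <- hclust(dist(X), method = "complete")
hc_clusters <- cutree(hc, k = 3)

# Cross-tabulate the two clusterings, then k-means against the
# qualitative attribute that was left out of the clustering.
print(table(kmeans = km$cluster, hclust = hc_clusters))
print(table(kmeans = km$cluster, species = iris$Species))
```

The cross-tabulations give a quick way to compare the two clustering methods and to relate the clusters to a qualitative attribute, which can feed into the comparison and interpretation parts of the report.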

For each phase, make sure to highlight the following in your R Markdown PDF file:

• Dataset description, including context and features
• Data mining tasks
• Model performance
• Results
• Comparison of results
• Comments and interpretation
Name your R Markdown PDF file following this template:
NameOfTeamMember1-NameOfTeamMember2_Phase PhaseNumber.