SIT718: Modelling Red Wine Quality from Physicochemical Tests: Real World Analytics Assignment, Deakin University, Australia
Subject: SIT718: Real World Analytics
Red wine quality Dataset
The given dataset, “RedWine.txt”, is used to model wine quality based on physicochemical tests. It provides 1,599 red wine samples from the north of Portugal and is a modified version of the data used in an earlier study. The dataset includes five input variables, denoted X1, X2, X3, X4 and X5, and one output variable, Y, described as follows:
X1 – citric acid
X2 – chlorides
X3 – total sulfur dioxide
X4 – pH
X5 – alcohol
Y – quality (score between 0 and 10)
* Q4 and Q5 are for students who are aiming for HD.
1. Understand the data
(i) Download the text file (RedWine.txt) and save it to your R working directory.
(ii) Assign the data to a matrix, e.g. using the.data <- as.matrix(read.table("RedWine.txt"))
(iii) The variable of interest is quality (Y). To investigate Y, generate a random subset of 450 observations, e.g. using: my.data <- the.data[sample(1:1599,450),c(1:6)]
(iv) Using scatter plots and histograms, report on the general relationship between each of the variables X1, X2, X3, X4, X5 and the variable of interest Y. Include 5 scatter plots, 6 histograms, and 1 or 2 sentences for each of the variables, including the variable of interest Y.
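The loading, sampling and plotting steps in 1(i) to 1(iv) can be sketched in R as follows. This is a minimal sketch rather than a model answer: the seed value, the axis labels and the random fallback matrix (used only so the snippet runs when RedWine.txt is not in the working directory) are assumptions.

```r
# Load the data if the file is available; otherwise build a random
# stand-in with the same shape (ASSUMPTION: placeholder values only,
# not real wine measurements).
if (file.exists("RedWine.txt")) {
  the.data <- as.matrix(read.table("RedWine.txt"))
} else {
  set.seed(1)
  the.data <- cbind(matrix(runif(1599 * 5), ncol = 5),
                    sample(3:8, 1599, replace = TRUE))
}

set.seed(718)  # ASSUMPTION: any fixed seed makes the subset reproducible
my.data <- the.data[sample(1:1599, 450), c(1:6)]

# Labels taken from the variable list in the dataset description
var.names <- c("X1 citric acid", "X2 chlorides", "X3 total sulfur dioxide",
               "X4 pH", "X5 alcohol", "Y quality")

# Five scatter plots: each input against Y
par(mfrow = c(2, 3))
for (i in 1:5) {
  plot(my.data[, i], my.data[, 6], xlab = var.names[i], ylab = var.names[6])
}

# Six histograms: one per variable, including Y
par(mfrow = c(2, 3))
for (i in 1:6) {
  hist(my.data[, i], main = var.names[i], xlab = var.names[i])
}
```

Fixing the seed means the same 450 rows are drawn on every run, so the plots and the later model fits stay consistent with the written report.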
2. Transform the data
(i) Choose any four of the five variables (X1, X2, ..., X5). Apply appropriate transformations to each of the chosen four variables and to the variable of interest Y, so that the values can be aggregated in order to predict the variable of interest.
Remove or impute outliers if necessary. Assign your transformed data, along with your transformed variable of interest, to an array. Save it to a txt file titled "name-transformed.txt" using write.table(your.data, "name-transformed.txt")
where “name” is replaced with your name – you can use your surname or first name. [All the following tasks are based on the saved transformed data]
(ii) Briefly explain the transformations applied to the selected four variables and the variable of interest. (1-2 sentences each)
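One common way to approach step 2 is to cap outliers, reduce skew with a log transform where needed, negate variables that move against Y, and rescale everything to [0, 1]. The sketch below assumes that approach. The helper names (minmax, cap.outliers), the choice of X1, X2, X4 and X5, and the decisions about which variable to log or negate are all hypothetical; your own plots should drive those choices. It runs on a small made-up matrix.

```r
# Hypothetical helper: linear rescaling to the unit interval [0, 1]
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

# Hypothetical helper: pull values outside the 1.5 * IQR fences back
# to the nearest fence (one simple way to treat outliers)
cap.outliers <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75)))
  fence <- 1.5 * (q[2] - q[1])
  pmin(pmax(x, q[1] - fence), q[2] + fence)
}

# Small made-up stand-in for four chosen inputs and Y (placeholder values)
set.seed(2)
my.data <- cbind(X1 = runif(50, 0, 1),
                 X2 = rlnorm(50, -2.5, 0.4),   # right-skewed on purpose
                 X4 = rnorm(50, 3.3, 0.15),
                 X5 = runif(50, 8.5, 14),
                 Y  = sample(3:8, 50, replace = TRUE))

your.data <- my.data
your.data[, "X2"] <- log(your.data[, "X2"])  # ASSUMPTION: X2 is skewed
your.data[, "X4"] <- -your.data[, "X4"]      # ASSUMPTION: X4 falls as Y rises
for (j in colnames(your.data)) {
  if (j != "Y") your.data[, j] <- cap.outliers(your.data[, j])
  your.data[, j] <- minmax(your.data[, j])
}

# Uncomment to save, replacing "name" with your own name:
# write.table(your.data, "name-transformed.txt")
```

Rescaling every column to the same interval is what makes the values comparable, so that an averaging function can combine them meaningfully.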
3. Build models and investigate the importance of each variable [30 marks]
(i) Download the AggWaFit718.R file to your working directory and load it into the R workspace using source("AggWaFit718.R")
(ii) Use the fitting functions to learn the parameters for
• A weighted arithmetic mean (WAM)
• Weighted power means (WPM) with p = 0.1 and p = 6 [define your own generators]
• An ordered weighted averaging function (OWA), and
• A Choquet integral.
(iii) Include two tables in your report – one with the error measures and correlation coefficients, and one summarising the weights/parameters and any other useful information learned for your data.
(iv) Compare and interpret the data in your tables. Comment on
a. How good the model is,
b. The importance of each of the variables (the four variables that you have selected),
c. Any interaction between any of those variables (are they complementary or redundant?) and
d. Whether the better models favour higher or lower inputs.
(1-3 paragraphs for part 3(iv))
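To make the aggregation functions in 3(ii) concrete, here is a minimal base-R sketch of how a WAM, a weighted power mean built from a generator and its inverse, and an OWA are evaluated for given weights, together with typical error measures. The function names (PM01, invPM01, PM6, invPM6, qam, owa, fit.stats) and the example weights are illustrative assumptions. They are not taken from AggWaFit718.R, whose fitting functions learn the weights from the data and whose exact interface should be checked in that file.

```r
# Generators for weighted power means: g(x) = x^p, g.inv(y) = y^(1/p)
PM01    <- function(x) x^0.1
invPM01 <- function(y) y^(1 / 0.1)
PM6     <- function(x) x^6
invPM6  <- function(y) y^(1 / 6)

# Quasi-arithmetic mean: g.inv(sum_i w_i * g(x_i)).
# With the identity generator this reduces to the WAM.
qam <- function(x, w, g = identity, g.inv = identity) g.inv(sum(w * g(x)))

# OWA: weights are applied to the inputs sorted in increasing order
# (ASSUMPTION: conventions differ; some texts sort in decreasing order)
owa <- function(x, w) sum(w * sort(x))

# Error measures and correlations between observed and predicted values
fit.stats <- function(y, y.hat) {
  c(RMSE     = sqrt(mean((y - y.hat)^2)),
    AvAbsErr = mean(abs(y - y.hat)),
    Pearson  = cor(y, y.hat),
    Spearman = cor(y, y.hat, method = "spearman"))
}

x <- c(0.2, 0.9, 0.5, 0.4)   # one transformed observation (made up)
w <- c(0.1, 0.4, 0.3, 0.2)   # example weights, must sum to 1

wam   <- qam(x, w)
wpm01 <- qam(x, w, PM01, invPM01)
wpm6  <- qam(x, w, PM6, invPM6)
owa.x <- owa(x, w)
```

For the same weights, the p = 0.1 mean sits below the WAM and the p = 6 mean above it. Which value of p fits better therefore indicates whether the data favour lower or higher inputs, which is the point of item (d).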
4. Use your model for prediction
(i) Choose your best fitting model based on Q3(iv). Using this model, predict the wine quality for the following input: X1 = 0; X2 = 0.075; X3 = 41; X4 = 3.53; X5 = 9.3. [Use the same pre-processing as in Q2]
(ii) Give your result and comment on whether you think it is reasonable. (1-2 sentences).
(iii) Comment on the best conditions (in terms of your chosen four variables) under which a higher quality wine will occur. (1-2 sentences).
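The prediction step in 4(i) amounts to pushing the new input through the same transformations as the training data, aggregating, and then undoing the scaling of Y. The sketch below assumes plain min-max scaling and a WAM. The choice of X1, X2, X4 and X5, the minimum and maximum values, and the weights are all placeholders to be replaced with the values from your own sample and fitted model.

```r
# New input from 4(i); X1, X2, X4 and X5 are the (assumed) chosen four
new.x <- c(X1 = 0, X2 = 0.075, X4 = 3.53, X5 = 9.3)

# PLACEHOLDERS: replace with the min/max of your own 450-row sample
# and the weights learned in Question 3
x.min <- c(X1 = 0,    X2 = 0.01, X4 = 2.7, X5 = 8.0)
x.max <- c(X1 = 1.0,  X2 = 0.60, X4 = 4.0, X5 = 15.0)
y.min <- 3
y.max <- 8
w     <- c(X1 = 0.25, X2 = 0.25, X4 = 0.25, X5 = 0.25)

# Same min-max scaling as used for the training data, clamped to [0, 1]
# in case the new point falls outside the sampled range
scaled <- pmin(pmax((new.x - x.min) / (x.max - x.min), 0), 1)

# Aggregate with a WAM (swap in your best fitting function), then undo
# the scaling of Y to return to the original quality scale
y.scaled <- sum(w * scaled)
y.pred   <- y.scaled * (y.max - y.min) + y.min
```

If Q2 used other transformations as well (a log, a negation), they must be applied to new.x in the same order before scaling.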
5. Comparing with a linear regression model
Linear regression is used to predict the value of an outcome variable Y based on one or more input predictor variables X.
The equation is Y = β0 + β1X1 + β2X2 + ⋯ + βnXn + ε. The built-in function lm() is used to fit linear models in R.
(i) Build your linear model using the same dataset in Question 3 and describe the summary statistics for your model using the function summary().
(ii) Compare the performance of your linear model with that of your best fitting model from Question 4. You can visualise the predicted Y values of both models on the data (used in Question 3) and compare them with the true Y values.
(iii) Give your comment on the differences between the linear model and your best fitting model. (2-4 sentences).
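A minimal sketch of the linear regression steps in 5(i) and 5(ii) is given below. It uses a made-up data frame in place of the transformed data, so the column names, coefficients and noise level are assumptions for illustration only; lm(), summary() and predict() are standard R functions.

```r
# Made-up stand-in for the transformed data (placeholder values only)
set.seed(3)
n <- 100
dat <- data.frame(X1 = runif(n), X2 = runif(n), X4 = runif(n), X5 = runif(n))
dat$Y <- 0.2 * dat$X1 + 0.5 * dat$X5 + rnorm(n, sd = 0.1)

# Fit Y = b0 + b1*X1 + b2*X2 + b3*X4 + b4*X5 + error
lin.model <- lm(Y ~ X1 + X2 + X4 + X5, data = dat)
print(summary(lin.model))   # coefficients, standard errors, R-squared

# Predictions on the same data and their root mean squared error
y.lm    <- predict(lin.model, dat)
rmse.lm <- sqrt(mean((dat$Y - y.lm)^2))

# Visual check: points close to the diagonal are well predicted
plot(dat$Y, y.lm, xlab = "True Y", ylab = "Predicted Y (linear model)")
abline(0, 1)
```

Computing the same error measure for the linear model and for the aggregation model, on the same rows, gives a like-for-like comparison. Unlike the weights of an averaging function, the regression coefficients may be negative and need not sum to 1.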