  * Decision Tree Regression
  * Random Forest Regression

==== Simple Linear Regression ====
**Form:** y = w0 + w1*x1\\
  - Apply the **.predict** method to your regressor to make any predictions about your data
**Sample code and output:**\\
{{ :forecasting:simpregcode.png?1600 |}}
{{ :forecasting:simpleregressoutput.png |}}\\
//Note that we can easily visualize our results by using a 2D plot. This type of plot is only possible when we have no more than **1** independent variable.//
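The workflow above can be sketched in a few lines; the data here is made up for illustration (roughly y = 1 + 2*x1), not the dataset from the screenshots.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up 1-feature data, roughly following y = 1 + 2*x1
X = np.array([[1], [2], [3], [4], [5]])   # must be 2D: (n_samples, n_features)
y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])

regressor = LinearRegression()
regressor.fit(X, y)                # learns w0 (intercept_) and w1 (coef_)
y_pred = regressor.predict([[6]])  # prediction for x1 = 6
print(regressor.intercept_, regressor.coef_, y_pred)
```

Because there is only one independent variable, plotting `X` against `y` and the fitted line in 2D is straightforward.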

==== Multiple Linear Regression ====
**Form:** y = w0 + w1*x1 + w2*x2 + ... + wn*xn\\
  - Use the **iloc[//rows//, //columns//].values** method from the **pandas** library to grab all columns which correspond to your various independent variables. Note that using iloc[:, :-1].values grabs all rows and all columns except for the last column of your dataset. This assumes that your .csv file was organized such that the dependent variable is in the last column.
  - Repeat the previous step for your dependent variable using y = dataset.iloc[:, -1].values
  - Import the **LinearRegression** class from **sklearn.linear_model**
  - Create an instance of the **LinearRegression()** class
  - Apply the **.fit** method to your independent and dependent variables
{{ :forecasting:multline_sampleout.png |}}\\
//Note that we sometimes cannot create a plot for Multiple Linear Regression because we may have more than **2** independent variables (i.e., the graph would have more than 3 dimensions). In this case, we can visualize our results by printing the predicted output and true output side-by-side in a matrix.//
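A minimal sketch of the steps above. The DataFrame here is a hypothetical stand-in for pd.read_csv, built so the dependent variable sits in the last column (it follows y = 1 + 2*x1 + 1*x2 exactly).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for pd.read_csv: dependent variable in the last column
dataset = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 1, 4, 3, 6, 5],
    "y":  [5, 6, 11, 12, 17, 18],  # exactly y = 1 + 2*x1 + 1*x2
})
X = dataset.iloc[:, :-1].values  # every column except the last
y = dataset.iloc[:, -1].values   # the last column

regressor = LinearRegression()
regressor.fit(X, y)
y_pred = regressor.predict(X)

# With more than 2 independent variables there is no simple plot,
# so compare predicted and true outputs side by side instead
print(np.column_stack((y_pred, y)))
```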

==== Polynomial Regression ====
**Form:** y = w0 + w1*x1^1 + w2*x1^2 + w3*x1^3 + ... + wn*x1^n\\
**Library used:** [[https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html#sklearn.preprocessing.PolynomialFeatures|sklearn.preprocessing.PolynomialFeatures]]\\
**General workflow:**
  - Use the **iloc[//rows//, //columns//].values** method from the **pandas** library to grab all columns which correspond to your various independent variables. Note that using iloc[:, :-1].values grabs all rows and all columns except for the last column of your dataset. This assumes that your .csv file was organized such that the dependent variable is in the last column.
  - Repeat the previous step for your dependent variable using y = dataset.iloc[:, -1].values
  - Import the **LinearRegression** class from **sklearn.linear_model**
  - Create an instance of the **LinearRegression()** class
  - Import the **PolynomialFeatures** class from **sklearn.preprocessing**
  - Create an instance of the **PolynomialFeatures()** class and define the degree of your polynomial
  - Apply the **PolynomialFeatures().fit_transform** method to your independent variable to change it into a polynomial matrix and save this into a new variable
  - Apply the **LinearRegression().fit** method to your polynomial independent variable and your dependent variable
  - Apply the **.predict** method to your regressor to make any predictions about your data
**Sample Code and Output:**
{{ :forecasting:polyregcode.png?1600 |}}
{{ :forecasting:polyregoutput.png?400 |}}
//Note that we can easily visualize our results by using a 2D plot. This type of plot is only possible when we have no more than **1** independent variable.//
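The steps above can be sketched as follows; the toy data is made up to follow y = x^2 + 1 exactly, so a degree-2 fit recovers it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Made-up data that follows y = x^2 + 1 exactly
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 5, 10, 17, 26])

poly = PolynomialFeatures(degree=2)  # define the degree of the polynomial here
X_poly = poly.fit_transform(X)       # columns: 1, x, x^2
regressor = LinearRegression()
regressor.fit(X_poly, y)

# New inputs must go through the same PolynomialFeatures instance
y_pred = regressor.predict(poly.transform([[6]]))
print(y_pred)  # about 37, i.e. 6^2 + 1
```

Note that `predict` takes the transformed matrix, not the raw `X`; forgetting the `poly.transform` step is a common mistake.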

==== Support Vector Regression ====
**Form:** Universal. Can be used with any form. Just adjust the SVM kernel accordingly.\\
**Library used:** [[https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html#sklearn.svm.SVR|sklearn.svm.SVR]]\\
**General workflow:**
  - Use the **iloc[//rows//, //columns//].values** method from the **pandas** library to grab all columns which correspond to your various independent variables. Note that using iloc[:, :-1].values grabs all rows and all columns except for the last column of your dataset. This assumes that your .csv file was organized such that the dependent variable is in the last column.
  - Repeat the previous step for your dependent variable using y = dataset.iloc[:, -1].values
  - Import the **StandardScaler** class from **sklearn.preprocessing**
  - Create 2 instances of the **StandardScaler()** class, one for your independent matrix and one for your dependent matrix
  - Apply the **StandardScaler().fit_transform** method to your independent and dependent variables to perform feature scaling accordingly
  - Import the **SVR** class from **sklearn.svm**
  - Create an instance of the **SVR()** class and set your kernel to whatever you want (Radial Basis Function is commonly used)
  - Apply the **SVR().fit** method to your independent variable and your dependent variable
**Sample Code and Output:**
{{ :forecasting:suppvectregcode.png?1600 |}}
{{ :forecasting:suppvectregoutput.png?400 |}}
//Note that we can easily visualize our results by using a 2D plot. This type of plot is only possible when we have no more than **1** independent variable.//
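A sketch of the workflow above on made-up linear data (y = 2x + 1). The key detail is that SVR does not scale features itself, so each matrix gets its own scaler, and predictions must be inverse-transformed back to original units.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Made-up data: y = 2x + 1
X = np.linspace(1, 10, 20).reshape(-1, 1)
y = 2 * X.ravel() + 1

# One scaler per matrix, as in the workflow
sc_X = StandardScaler()
sc_y = StandardScaler()
X_scaled = sc_X.fit_transform(X)
y_scaled = sc_y.fit_transform(y.reshape(-1, 1)).ravel()

regressor = SVR(kernel="rbf")  # Radial Basis Function kernel
regressor.fit(X_scaled, y_scaled)

# Predictions come out in scaled units; invert the scaling to read them
pred_scaled = regressor.predict(sc_X.transform([[5.0]]))
pred = sc_y.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
print(pred)  # close to 2*5 + 1 = 11 (not exact: SVR tolerates errors within epsilon)
```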

==== Decision Tree Regression ====
**Form:** Universal. Can be used with any form.\\
**When to use it:** Used when you want to divide your dataset into smaller subsets in the form of a tree structure\\
**Library used:** [[https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html#sklearn.tree.DecisionTreeRegressor|sklearn.tree.DecisionTreeRegressor]]\\
**General workflow:**
  - Import the **DecisionTreeRegressor** class from **sklearn.tree**
  - Create an instance of the **DecisionTreeRegressor()** class
  - Apply the **.fit** method to your independent and dependent variables
  - Apply the **.predict** method to your regressor to make any predictions about your data
**Sample Code and Output:**
{{ :forecasting:dectreeregcode.png?1600 |}}
{{ :forecasting:dectreeregoutput.png?400 |}}
//Note that we can easily visualize our results by using a 2D plot. This type of plot is only possible when we have no more than **1** independent variable.//
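The four steps above can be sketched directly; the data is made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Made-up 1-feature data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X, y)

# A regression tree predicts the mean of the training targets in the
# leaf an input falls into, so its output is piecewise constant
y_pred = regressor.predict([[3]])
print(y_pred)  # 7.0 here, since x=3 lands in its own leaf
```

This piecewise-constant behavior is why a 2D plot of a fitted tree looks like a staircase rather than a smooth curve.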
  
==== Random Forest Regression ====
**Form:** Universal. Can be used with any form.\\
**When to use it:** Used when you want to combine multiple randomized decision trees to improve your regression results.\\
**Library used:** [[https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor|sklearn.ensemble.RandomForestRegressor]]\\
**General workflow:**
  - Import the **RandomForestRegressor** class from **sklearn.ensemble**
  - Create an instance of the **RandomForestRegressor()** class
  - Apply the **.fit** method to your independent and dependent variables
  - Apply the **.predict** method to your regressor to make any predictions about your data
**Sample Code and Output:**
{{ :forecasting:randforestregcode.png?1600 |}}
{{ :forecasting:randforestregoutput.png?400 |}}
//Note that we can easily visualize our results by using a 2D plot. This type of plot is only possible when we have no more than **1** independent variable.//
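A sketch of the workflow above on made-up data (y = 2x). The `n_estimators` value is just an illustrative choice, not taken from the wiki's screenshots.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Made-up data: y = 2x
X = np.arange(1, 9).reshape(-1, 1)  # 1, 2, ..., 8
y = 2.0 * X.ravel()                 # 2, 4, ..., 16

# n_estimators is the number of randomized trees whose outputs are averaged
regressor = RandomForestRegressor(n_estimators=100, random_state=0)
regressor.fit(X, y)

y_pred = regressor.predict([[4]])  # average of all 100 trees' predictions
print(y_pred)  # near the true value of 8, but not exact
```

Unlike a single decision tree, each tree here trains on a bootstrap sample of the data, so the averaged prediction smooths out individual trees' quirks at the cost of no longer fitting the training data exactly.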
Last modified: 2021/09/19 21:59