{"id":1128,"date":"2023-11-19T16:10:40","date_gmt":"2023-11-19T16:10:40","guid":{"rendered":"https:\/\/savvaspanagi.com\/?p=1128"},"modified":"2023-11-20T17:46:58","modified_gmt":"2023-11-20T17:46:58","slug":"deploying-diverse-regression-machine-learning-algorithms-an-implementation-overview","status":"publish","type":"post","link":"https:\/\/savvaspanagi.com\/?p=1128","title":{"rendered":"Deploying Diverse Regression Machine Learning Algorithms: An Implementation Overview"},"content":{"rendered":"\n<p class=\"has-black-color has-text-color\">In this post, we explore and analyze various regression machine learning methods. Our objective is to apply a range of machine learning techniques to forecast the load demand of the Cyprus power system.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color\">In the initial phase, the &#8216;data&#8217; dataframe includes all essential features and measurements spanning a 16-month period. Throughout the analysis, our focus is directed towards two important features: temperature and hour. 
While incorporating additional features could improve accuracy, we have kept only these two key features to keep the code and the methodology simple.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nimport matplotlib.pyplot as plt\nimport numpy as np\n\n# Create a scatter plot with demand as color\nplt.scatter(data&#x5B;&#039;hour&#039;], data&#x5B;&#039;T2M&#039;], c=data&#x5B;&#039;Demand&#039;], cmap=&#039;plasma&#039;, s=50)\n\n# Add colorbar\ncbar = plt.colorbar()\ncbar.set_label(&#039;Demand (MW)&#039;)\n\n# Set labels for axes\nplt.xlabel(&#039;Hour of the Day&#039;)\nplt.ylabel(&#039;Temperature (\u00b0C)&#039;)\n\nplt.title(&#039;Temperature vs Hour (Color-coded by Demand)&#039;)\n\nplt.show()\n\n<\/pre><\/div>\n\n<figure><img decoding=\"async\" style=\"aspect-ratio: 1.260485651214128; width: 526px; height: auto;\" src=\"http:\/\/savvaspanagi.com\/wp-content\/uploads\/2023\/11\/image-2.png\" alt=\"\" \/><\/figure>\n\n\n<p class=\"has-black-color has-text-color\">The first plot illustrates how demand in megawatts (MW) fluctuates with temperature and hour. Points with low demand appear in a darker color, transitioning to brighter hues as demand increases.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color\"><strong>Data Preparation<\/strong><\/p>\n\n\n\n<p class=\"has-black-color has-text-color\">Prior to normalizing the data, it&#8217;s crucial to recognize that the &#8216;hour&#8217; feature is cyclical. A detailed analysis on handling cyclical features has been discussed in a previous post, accessible <a href=\"https:\/\/savvaspanagi.com\/?p=803\"><span style=\"text-decoration: underline;\">here<\/span><\/a>. 
Furthermore, following the guidance on scaling data developed in another post <a href=\"https:\/\/savvaspanagi.com\/?p=844\"><span style=\"text-decoration: underline;\">here<\/span><\/a>, the normalisation and standardisation methods were applied. The data was then split into training and test sets.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# Standardise the cyclical hour encodings and the temperature feature\nfrom sklearn.preprocessing import StandardScaler\nsc=StandardScaler()\ndata.loc&#x5B;:,&#x5B;&#039;hour_sin&#039;,&#039;hour_cos&#039;,&#039;T2M_norm&#039;]]=sc.fit_transform(data.loc&#x5B;:,&#x5B;&#039;hour_sin&#039;,&#039;hour_cos&#039;,&#039;T2M&#039;]])\n\n# Hold out 20% of the samples as a test set\nfrom sklearn.model_selection import train_test_split\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)\n<\/pre><\/div>\n\n\n<p class=\"has-black-color has-text-color\"><strong>Assessing Machine Learning Regression Models: RMSE and R-squared Evaluation<\/strong><\/p>\n\n\n\n<p class=\"has-black-color has-text-color\">When it comes to checking how well a machine learning regression model is doing, we use two important measures: RMSE and R-squared. RMSE (root mean squared error) measures the typical size of the gap between the predicted values and the actual values, expressed in the units of the target; a lower RMSE means more accurate predictions. R-squared, on the other hand, tells us how much of the variability in the data our model can explain. A higher R-squared is better, showing that the model fits the data well. 
So, by looking at both RMSE and R-squared, we can judge how accurate our regression model is and how well it explains the data.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nfrom sklearn.metrics import r2_score\nfrom sklearn.metrics import mean_squared_error\n# Calculate RMSE (column 5 of X_test holds the TSO&#039;s demand forecast)\nrmse = np.sqrt(mean_squared_error(y_test, X_test&#x5B;:,5]))\n\n# Calculate R-squared\nr_squared = r2_score(y_test, X_test&#x5B;:,5])\n\nprint(f&#039;RMSE: {rmse}&#039;)\nprint(&quot;R-squared:&quot;, r_squared)\n<\/pre><\/div>\n\n\n<p class=\"has-black-color has-text-color\">Current TSO&#8217;s demand forecast accuracy: RMSE = 30.19 MW, R-squared = 0.9701<\/p>\n\n\n\n<p class=\"has-black-color has-text-color\"><strong>1. Multiple Linear Regression<\/strong><\/p>\n\n\n\n<p class=\"has-black-color has-text-color\">Multiple Linear Regression is a statistical technique that helps us make predictions by analyzing the relationships between multiple variables. 
Imagine you have a dataset with one outcome you want to predict (let&#8217;s call it &#8216;Y&#8217;) and several factors that might influence it (let&#8217;s call them &#8216;X1,&#8217; &#8216;X2,&#8217; etc.).<\/p>\n\n\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 54px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img decoding=\"async\" src=\"https:\/\/savvaspanagi.com\/wp-content\/ql-cache\/quicklatex.com-90ba97b794832cae1ed1a56e47f98209_l3.png\" height=\"54\" width=\"274\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\" &#92;&#98;&#101;&#103;&#105;&#110;&#123;&#99;&#101;&#110;&#116;&#101;&#114;&#125; &#32;&#32;&#32;&#32;&#77;&#117;&#108;&#116;&#105;&#112;&#108;&#101;&#32;&#76;&#105;&#110;&#101;&#97;&#114;&#32;&#82;&#101;&#103;&#114;&#101;&#115;&#115;&#105;&#111;&#110;&#58; &#32;&#32;&#32;&#32;&#92;&#91;&#32;&#89;&#32;&#61;&#32;&#98;&#95;&#48;&#32;&#43;&#32;&#98;&#95;&#49;&#88;&#95;&#49;&#32;&#43;&#32;&#98;&#95;&#50;&#88;&#95;&#50;&#32;&#43;&#32;&#92;&#108;&#100;&#111;&#116;&#115;&#32;&#43;&#32;&#98;&#95;&#107;&#88;&#95;&#107;&#32;&#92;&#93; &#92;&#101;&#110;&#100;&#123;&#99;&#101;&#110;&#116;&#101;&#114;&#125; \" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n\n\n\n<p class=\"has-black-color has-text-color\">The goal is to find the best values for b0,b1,\u2026,<em>bk<\/em> that minimize the difference between our predicted <em>Y<\/em> and the actual <em>Y<\/em> in our dataset. Once we find these values, we can use them to make predictions. For example, if we want to predict someone&#8217;s weight (<em>Y<\/em>), we&#8217;d use their height (X1), age (X2), and other relevant factors. The coefficients (<em>b<\/em>0,<em>b<\/em>1,\u2026,<em>bk<\/em>) tell us how much <em>Y<\/em> is expected to change for a one-unit change in each <em>X<\/em> variable, holding other variables constant. 
A positive coefficient means an increase in <em>Y<\/em> with an increase in the corresponding <em>X<\/em>, and a negative coefficient means the opposite.<\/p>\n\n\n\n<p class=\"has-black-color has-text-color\"><strong>Implementation of Multiple Linear Regression<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# Training the Multiple Linear Regression model on the Training set\nfrom sklearn.linear_model import LinearRegression\nMLR = LinearRegression()\nMLR.fit(X_train&#x5B;:,:3], y_train)\n\n# Predicting the Test set results\ny_predMLR = MLR.predict(X_test&#x5B;:,:3])\n\nfrom sklearn.metrics import r2_score\nfrom sklearn.metrics import mean_squared_error\n# Calculate RMSE\nrmse = np.sqrt(mean_squared_error(y_test, y_predMLR))\n\n# Calculate R-squared\nr_squared = r2_score(y_test, y_predMLR)\n\nprint(f&#039;RMSE: {rmse}&#039;)\nprint(&quot;R-squared:&quot;, r_squared)\n\n<\/pre><\/div>\n\n\n<p class=\"has-black-color has-text-color\"><strong>2. Polynomial Regression<\/strong><\/p>\n\n\n\n<p class=\"has-black-color has-text-color\">Polynomial regression focuses on capturing nonlinear patterns by introducing higher-degree polynomial terms of a single independent variable. 
Polynomial regression is particularly useful when the relationship between the variables is nonlinear, providing a more accurate fit to the data compared to the linear assumptions of multiple linear regression.<\/p>\n\n\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 54px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img decoding=\"async\" src=\"https:\/\/savvaspanagi.com\/wp-content\/ql-cache\/quicklatex.com-aadc198dee34793d80b8daa9f5d7374a_l3.png\" height=\"54\" width=\"272\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\" &#92;&#98;&#101;&#103;&#105;&#110;&#123;&#99;&#101;&#110;&#116;&#101;&#114;&#125; &#32;&#32;&#32;&#32;&#80;&#111;&#108;&#121;&#110;&#111;&#109;&#105;&#97;&#108;&#32;&#82;&#101;&#103;&#114;&#101;&#115;&#115;&#105;&#111;&#110;&#58; &#32;&#32;&#32;&#32;&#92;&#91;&#32;&#89;&#32;&#61;&#32;&#98;&#95;&#48;&#32;&#43;&#32;&#98;&#95;&#49;&#88;&#32;&#43;&#32;&#98;&#95;&#50;&#88;&#94;&#50;&#32;&#43;&#32;&#92;&#108;&#100;&#111;&#116;&#115;&#32;&#43;&#32;&#98;&#95;&#110;&#88;&#94;&#110;&#32;&#92;&#93; &#92;&#101;&#110;&#100;&#123;&#99;&#101;&#110;&#116;&#101;&#114;&#125; \" title=\"Rendered by QuickLaTeX.com\"\/><\/p>\n\n\n\n<p class=\"has-black-color has-text-color\"><strong>Implementation of Polynomial Regression<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nfrom sklearn.preprocessing import PolynomialFeatures\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.pipeline import make_pipeline\n\n# Assuming X_train is your feature matrix with 3 columns, and y_train is your target variable\n\n# Create a polynomial regression model with degree 2 (you can change the degree as needed)\ndegree = 2\nPR = make_pipeline(PolynomialFeatures(degree), LinearRegression())\n\n# Train the model\nPR.fit(X_train&#x5B;:,:3], y_train)\n\n# Make predictions\ny_predPR = 
PR.predict(X_test&#x5B;:,:3])\n\nfrom sklearn.metrics import mean_squared_error, r2_score\n# Calculate RMSE\nrmse = np.sqrt(mean_squared_error(y_test, y_predPR))\n\n# Calculate R-squared\nr_squared = r2_score(y_test, y_predPR)\n\nprint(f&#039;RMSE: {rmse}&#039;)\nprint(&quot;R-squared:&quot;, r_squared)\n<\/pre><\/div>\n\n\n<p class=\"has-black-color has-text-color\"><strong>Implementation of Support Vector Regression<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# Training the SVR model on the Training set\nfrom sklearn.svm import SVR\n# Use a lowercase variable name so the SVR class is not shadowed\nsvr = SVR(kernel = &#039;rbf&#039;)\nsvr.fit(X_train&#x5B;:,:3], y_train)\n\n# Predicting the Test set results\ny_predSVR=svr.predict(X_test&#x5B;:,:3])\n\n# Show predictions next to the actual values\nprint(np.concatenate((y_predSVR.reshape(len(y_predSVR),1), y_test.reshape(len(y_test),1)),1))\nfrom sklearn.metrics import mean_squared_error, r2_score\n# Calculate RMSE\nrmse = np.sqrt(mean_squared_error(y_test, y_predSVR))\n# Calculate R-squared\nr_squared = r2_score(y_test, y_predSVR)\n\nprint(f&#039;RMSE: {rmse}&#039;)\nprint(&quot;R-squared:&quot;, r_squared)\n<\/pre><\/div>\n\n\n<p class=\"has-black-color has-text-color\"><strong>Implementation of Decision Tree Regression<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# Training the Decision Tree Regression model on the Training set\nfrom sklearn.tree import DecisionTreeRegressor\nDTR = DecisionTreeRegressor(random_state = 0)\nDTR.fit(X_train&#x5B;:,:3], y_train)\n# Predicting the Test set results\ny_predDTR=DTR.predict(X_test&#x5B;:,:3])\nfrom sklearn.metrics import mean_squared_error, r2_score\n# Calculate RMSE\nrmse = np.sqrt(mean_squared_error(y_test, y_predDTR))\n# Calculate R-squared\nr_squared = r2_score(y_test, y_predDTR)\nprint(f&#039;RMSE: {rmse}&#039;)\nprint(&quot;R-squared:&quot;, r_squared)\n<\/pre><\/div>\n\n\n<p class=\"has-black-color has-text-color\"><strong>Implementation of Random Forest 
Regression<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n# Training the Random Forest Regression model on the Training set\nfrom sklearn.ensemble import RandomForestRegressor\nRFR = RandomForestRegressor(n_estimators = 10, random_state = 0)\nRFR.fit(X_train&#x5B;:,:3], y_train)\n# Predicting the Test set results\ny_predRFR=RFR.predict(X_test&#x5B;:,:3])\nfrom sklearn.metrics import mean_squared_error, r2_score\n# Calculate RMSE\nrmse = np.sqrt(mean_squared_error(y_test, y_predRFR))\n# Calculate R-squared\nr_squared = r2_score(y_test, y_predRFR)\nprint(f&#039;RMSE: {rmse}&#039;)\nprint(&quot;R-squared:&quot;, r_squared)\n<\/pre><\/div>\n\n\n<p><strong>Results<\/strong><\/p>\n\n\n\n<figure>\n<table>\n<tbody>\n<tr>\n<td>&nbsp;<\/td>\n<td>TSO&#8217;s Forecast<\/td>\n<td>Multiple Linear Regression<\/td>\n<td>Polynomial Regression<\/td>\n<td>Support Vector Regression<\/td>\n<td>Decision Tree Regression<\/td>\n<td>Random Forest Regression<\/td>\n<\/tr>\n<tr>\n<td>RMSE (MW)<\/td>\n<td>30.2<\/td>\n<td>109.9<\/td>\n<td>63.1<\/td>\n<td>59.7<\/td>\n<td>27.3<\/td>\n<td>29.4<\/td>\n<\/tr>\n<tr>\n<td>R-Squared<\/td>\n<td>0.970<\/td>\n<td>0.603<\/td>\n<td>0.869<\/td>\n<td>0.883<\/td>\n<td>0.976<\/td>\n<td>0.972<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n\n\n\n<p class=\"has-black-color has-text-color\">Your thoughts and questions are important to me. Feel free to share your insights or inquire about anything in the comments section below. Let&#8217;s keep the conversation going!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post, we explore and analyze various regression machine learning methods. Our objective is to apply a range of machine learning techniques to forecast the load demand of the Cyprus power system. In the initial phase, the &#8216;data&#8217; dataframe includes all essential features and measurements spanning a 16-month period. 
&hellip;<\/p>\n<p class=\"read-more\"> <a class=\"\" href=\"https:\/\/savvaspanagi.com\/?p=1128\"> <span class=\"screen-reader-text\">Deploying Diverse Regression Machine Learning Algorithms: An Implementation Overview<\/span> Read More &raquo;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"footnotes":"[]"},"categories":[7],"tags":[],"jetpack_featured_media_url":"","rttpg_featured_image_url":null,"rttpg_author":{"display_name":"Savvas Panagi","author_link":"https:\/\/savvaspanagi.com\/?author=1"},"rttpg_comment":2,"rttpg_category":"<a href=\"https:\/\/savvaspanagi.com\/?cat=7\" rel=\"category\">Machine Learning<\/a>","rttpg_excerpt":"In this current post, we go into the exploration and analysis of various regression machine learning methods. Our objective is to utilize a range of machine learning techniques to forecast the load demand of the Cyprus power system. 
In the initial phase, the &#8216;data&#8217; dataframe include all essential features and measurements spanning a 16-month period.&hellip;","_links":{"self":[{"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=\/wp\/v2\/posts\/1128"}],"collection":[{"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1128"}],"version-history":[{"count":52,"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=\/wp\/v2\/posts\/1128\/revisions"}],"predecessor-version":[{"id":1197,"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=\/wp\/v2\/posts\/1128\/revisions\/1197"}],"wp:attachment":[{"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1128"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1128"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/savvaspanagi.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1128"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}