Sales & Marketing Forecast
Author: Jannisthomas • November 11, 2018 • 2,593 Words (11 Pages) • 698 Views
...
a five-year span. This chart represents a forecast produced with Holt's forecasting method and shows three series: Engility's actual revenue, the predicted values, and the fitted values. The ‘actual’ line of the chart represents the data that was pulled from Engility’s financial reports. The forecast is what the program generated from the actual data and the preferences that we chose while creating the model. It shows Engility's revenue leveling off, with no positive or negative trend. Lastly, the fitted values are the model's estimates of the actual data over the historical period used to build the model. Holt's method did a decent job of accounting for the trend in the data. However, it was not able to account for the acquisition that took place in the last quarter of 2014. Even so, it still captured the trend in our data, and the forecast had a MAPE of only 6.53%.
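As a rough illustration of the Holt recursion and the MAPE calculation described above, a minimal Python sketch follows. The revenue figures, the smoothing constants alpha and beta, the initialisation of the trend, and the function names are all hypothetical; the settings chosen in the forecasting program are not given in the text.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing.

    Returns (fitted, forecast): one-step-ahead fitted values for
    series[1:], and `horizon` forecasts past the end of the series.
    """
    level = series[0]
    trend = series[1] - series[0]  # simple initialisation (an assumption)
    fitted = []
    for actual in series[1:]:
        fitted.append(level + trend)  # prediction made before seeing `actual`
        prev_level = level
        level = alpha * actual + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    forecast = [level + h * trend for h in range(1, horizon + 1)]
    return fitted, forecast


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Hypothetical quarterly revenue, NOT Engility's reported figures.
revenue = [400.0, 390.0, 385.0, 370.0, 365.0, 560.0, 550.0, 540.0]
fitted, forecast = holt_forecast(revenue, alpha=0.5, beta=0.3, horizon=4)
print(f"MAPE: {mape(revenue[1:], fitted):.2f}%")
print("Forecast:", [round(f, 1) for f in forecast])
```

The jump in the sixth value plays the role of the acquisition: the smoothing recursion only catches up with it after the fact, which is the kind of error the dummy variable is meant to remove.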
The Holt's forecast showed a noticeable error because of the event that took place in the last quarter of 2014. To account for the event, we must create a dummy variable (D), an independent variable that captures the effect of the event rather than typical financial data such as sales or income. Since we are using a linear trend regression model, we must also add a time index column (t) to the data. The time index serves as another independent variable that assists the forecast by accounting for the time trend. By adding the dummy variable and the time index to the linear regression model, we were able to create a more accurate forecast, decreasing the MAPE to 5.46%.
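The two extra columns can be built mechanically from the list of quarters. The sketch below is illustrative only: the quarter labels, the function name, and in particular the choice to keep the dummy switched on from the acquisition quarter onward are assumptions, since the text does not say how D was coded.

```python
def build_regressors(quarters, flagged):
    """Return the (D, t) columns for a list of quarter labels.

    D is 1 for every quarter listed in `flagged` and 0 otherwise;
    t is a time index running 1, 2, 3, ...
    """
    flagged = set(flagged)
    dummy = [1 if q in flagged else 0 for q in quarters]
    time_index = list(range(1, len(quarters) + 1))
    return dummy, time_index


# Hypothetical quarter labels around the acquisition.
quarters = ["2014Q1", "2014Q2", "2014Q3", "2014Q4", "2015Q1", "2015Q2"]
# Assumption: the dummy stays on from the acquisition quarter onward.
flagged = [q for q in quarters if q >= "2014Q4"]
D, t = build_regressors(quarters, flagged)
for row in zip(quarters, D, t):
    print(row)
```

If the event were treated as a one-quarter shock instead, `flagged` would contain only "2014Q4".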
The first coefficient, 200,858.54, is the slope term for the event dummy (D), and the second coefficient, –4,459.31, is the slope term for the time index (t). Since we have two independent variables, our linear model is extended to Y = a + b1x1 + b2x2, which here gives Revenue = 386,852.78 + 200,858.54D – 4,459.31t. Now that we have our forecast results, we must evaluate the model.
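To make the equation concrete, the short sketch below plugs values of D and t into the reported coefficients. The coefficients come from the regression above; the function name and the choice of t = 10 are purely illustrative.

```python
# Coefficients as reported in the regression output above.
INTERCEPT = 386_852.78
B_EVENT = 200_858.54   # slope on the event dummy D
B_TIME = -4_459.31     # slope on the time index t


def predicted_revenue(d, t):
    """Revenue = a + b1*D + b2*t for a given dummy value and time index."""
    return INTERCEPT + B_EVENT * d + B_TIME * t


# Hypothetical time index, with and without the event switched on.
without_event = predicted_revenue(0, 10)
with_event = predicted_revenue(1, 10)
print(round(without_event, 2), round(with_event, 2))
print("Difference:", round(with_event - without_event, 2))
```

At any fixed t, the difference between the two predictions equals the event coefficient, which is how the dummy isolates the effect of the acquisition.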
First, we have to determine whether the model makes sense logically. Logic tells us that there should be a positive relationship between the event and revenue; in this case, there is. When an acquisition happens, revenue should go up because of the value added by the acquired companies. The inverse relationship between time and revenue indicates that, as more time passes after the closing of the acquisition, we can expect revenue to begin to balance out. Secondly, we must determine whether the slope for each independent variable is significantly greater or less than zero. To do so, we run a t-test on each independent variable, testing the hypothesis at a 95% confidence level (5% significance level). After conducting the t-tests for D (event) and t (time index), we determine that the slope term for D is significantly greater than zero and the slope term for t is significantly less than zero.
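The t ratio behind each of these tests is the slope estimate divided by its standard error. The sketch below fits the regression by ordinary least squares in plain Python and reports the t ratios. The data are hypothetical, the function names are our own, and the two-sided critical value of 2.571 applies only to this toy example (5 degrees of freedom at the 5% level); the report's own degrees of freedom are not stated.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_t_ratios(y, columns):
    """Fit y = a + b1*x1 + b2*x2 + ... by least squares.

    Returns (coefficients, t_ratios), with the intercept first.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    coefs = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    residuals = [yi - sum(c * v for c, v in zip(coefs, r))
                 for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in residuals) / (n - k)  # residual variance
    std_errors = [math.sqrt(s2 * inv[i][i]) for i in range(k)]
    return coefs, [c / se for c, se in zip(coefs, std_errors)]


# Hypothetical data: 8 quarters, event dummy on for the last four.
revenue = [400.0, 392.0, 389.0, 381.0, 575.0, 568.0, 566.0, 555.0]
dummy = [0, 0, 0, 0, 1, 1, 1, 1]
time_index = [1, 2, 3, 4, 5, 6, 7, 8]

coefs, t_ratios = ols_t_ratios(revenue, [dummy, time_index])
CRITICAL = 2.571  # two-sided 5% t value for 8 - 3 = 5 degrees of freedom
for name, c, tr in zip(["intercept", "D", "t"], coefs, t_ratios):
    verdict = "significant" if abs(tr) > CRITICAL else "not significant"
    print(f"{name}: coef={c:.2f}, t ratio={tr:.2f} ({verdict})")
```

A slope is judged significantly positive or negative when its t ratio lies beyond the critical value in the corresponding direction.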
Thirdly, we must determine how much explanatory power the model has by looking at the coefficient of determination, which is the percentage of the variation in the dependent variable that is explained by the model. The R square value measures the coefficient of determination and can be found in the regression analysis (see Figure 2 in the Appendix). When there is only one independent variable, the R square can be used as is; however, since our model has multiple independent variables, we have to use the adjusted R square, which corrects for the number of independent variables. The adjusted R square tells us that 80.7% of the variation in revenue is explained by variation in the event variable and the time index.
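The adjustment itself is a one-line formula that penalises the R square for each additional independent variable. The sketch below shows the standard calculation; the function names and the example inputs (an R square of 0.83 over 20 observations) are hypothetical and are not taken from the regression output.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    sst = sum((a - mean) ** 2 for a in actual)                  # total
    return 1 - sse / sst


def adjusted_r_squared(r2, n_obs, n_predictors):
    """R square corrected for the number of independent variables."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs: two predictors (D and t), 20 quarterly observations.
print(round(adjusted_r_squared(0.83, n_obs=20, n_predictors=2), 3))
```

Because the penalty grows with the number of predictors, the adjusted value is never larger than the plain R square.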
Lastly, we must determine whether there is any serial correlation in the data. Serial correlation concerns the pattern of the error terms in our regression model over time. The error terms are the vertical distances between each observation and the regression line, calculated as the actual value minus the predicted value. We would like the error terms to show no positive or negative pattern; ideally, they are randomly distributed. We use the Durbin-Watson (DW) statistic to determine whether there is a serial correlation problem. The statistic ranges from zero to four, and ideally it falls between 1.5 and 2.5. A value well below two more than likely indicates positive serial correlation, and a value well above two more than likely indicates negative serial correlation. Serial correlation can bias the way the standard errors are calculated: the standard error of each slope term might be smaller than it really should be, which makes the t ratio too big. The result is that we may think a slope term is significantly different from zero when in actuality it isn’t. Serial correlation is a particular concern with time series data, which is what we are using. The DW statistic is 1.965, which indicates that there is no serial correlation present in our data. Results from the regression analysis and the DW statistic can be seen in Figure 3 in the Appendix. Since the regression model has passed the evaluation steps, we know that it is useful for the forecast. By using the linear regression forecasting method that accounts for the event and the time trend in our data, we are able to create a more accurate forecast, decreasing the MAPE by 1.07 percentage points (from 6.53% to 5.46%).
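The DW statistic is the sum of squared changes between consecutive error terms divided by the sum of squared error terms. The sketch below computes it and applies the 1.5 to 2.5 rule of thumb used above; the residuals and function names are hypothetical, and the band is a rule of thumb rather than a formal test.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in time order."""
    changes = sum((residuals[i] - residuals[i - 1]) ** 2
                  for i in range(1, len(residuals)))
    total = sum(e * e for e in residuals)
    return changes / total


def interpret(dw, low=1.5, high=2.5):
    """Apply the 1.5 to 2.5 rule of thumb described in the text."""
    if dw < low:
        return "possible positive serial correlation"
    if dw > high:
        return "possible negative serial correlation"
    return "no strong sign of serial correlation"


# Hypothetical residuals (actual minus predicted), in time order.
errors = [12.0, -8.0, 5.0, -11.0, 9.0, 3.0, -7.0, -2.0]
dw = durbin_watson(errors)
print(round(dw, 3), "->", interpret(dw))

# The value reported for the regression model.
print(1.965, "->", interpret(1.965))
```

Residuals that keep the same sign for long runs push the statistic toward zero, while residuals that flip sign every period push it toward four.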
Before we can combine the two forecasts, we have to determine whether doing so will create a bias. To make this determination, we run a multiple regression with the actual values (A) as a function of the two forecasting methods, Holt's and linear regression. This gives us the equation A = a + b1F1 + b2F2, where F1 and F2 are the two forecasts, b1 and b2 are their slope terms, and a is the intercept (the constant). We will do a t-test on each term of this regression; for bias, the test that matters is the one on the intercept. If the p-value for the intercept is greater than 0.05, the constant is not statistically different from zero, and combining the forecasts will not create a bias. If the p-value is smaller than 0.05, the constant is significantly different from zero and we do not want to combine. When the two forecasts are combined, the regression analysis assigns an optimal weight to each forecast through its slope term. The regression analysis (see Figure 4 in the Appendix) shows that the p-value for the intercept is 0.432. Since this value is larger than 0.05, the constant is not statistically different from zero, and combining the forecasts will not create a bias. The combination method gave Holt's method a weight of 0.377 and linear regression
...