**Abstract:**

One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). However, when it comes to time series forecasting, because of the inherent serial correlation and potential non-stationarity of the data, its application is not straightforward and is often omitted by practitioners in favor of an out-of-sample (OOS) evaluation. In this paper, we show that the particular setup in which time series forecasting is usually performed using Machine Learning methods renders the use of standard K-fold CV possible. We present theoretical insights supporting our arguments. Furthermore, we present a simulation study where we show empirically that K-fold CV performs favorably compared to both OOS evaluation and other time-series-specific techniques such as non-dependent cross-validation.
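As a rough illustration of the idea, the sketch below applies ordinary K-fold CV to a series that has been embedded into a lagged regression data set. It assumes that the "particular setup" is a regression of each observation on its own recent lags; the simulated AR(2) data, the linear model, and all variable names are illustrative choices, not taken from the paper.

```r
set.seed(1)

# Simulated AR(2) series (illustrative data only).
y <- as.numeric(arima.sim(model = list(ar = c(0.5, -0.3)), n = 300))

# Embed the series: column 1 is y_t, columns 2 and 3 are y_{t-1} and y_{t-2}.
p <- 2
emb <- embed(y, p + 1)
dat <- data.frame(y = emb[, 1], lag1 = emb[, 2], lag2 = emb[, 3])

# Assign each row of the embedded data to one of K folds at random.
K <- 5
fold <- sample(rep(seq_len(K), length.out = nrow(dat)))

# Fit on K-1 folds, evaluate the squared error on the held-out fold.
cv_mse <- sapply(seq_len(K), function(k) {
  fit  <- lm(y ~ lag1 + lag2, data = dat[fold != k, ])
  pred <- predict(fit, newdata = dat[fold == k, ])
  mean((dat$y[fold == k] - pred)^2)
})

cv_estimate <- mean(cv_mse)  # K-fold CV estimate of one-step-ahead MSE
```

Each row of `dat` is treated like an ordinary regression case, which is what allows the folds to be drawn at random rather than in time order.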

by Timothy L. McMurry and Dimitris N. Politis.


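**Time**: 2015/01/12 11:00, Monday

**Venue**: Lounge, 2F, Institute of Statistical Science, Academia Sinica

**Note**: Tea reception at 10:40 a.m. in the 2F lounge of the Institute of Statistical Science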

Time series can often be naturally disaggregated in a hierarchical or grouped structure. For example, a manufacturing company can disaggregate total demand for their products by country of sale, retail outlet, product type, package size, and so on. As a result, there can be millions of individual time series to forecast at the most disaggregated level, plus additional series to forecast at higher levels of aggregation.

The first problem with handling such large numbers of time series is how to produce useful graphics to uncover structures and relationships between series. I will demonstrate some data visualization tools that help in exploring big time series data.

The second problem is that the disaggregated forecasts need to add up to the forecasts of the aggregated data. This is known as reconciliation. I will show that the optimal reconciliation method involves fitting an ill-conditioned linear regression model where the design matrix has one column for each of the series at the most disaggregated level. For problems involving huge numbers of series, the model is impossible to estimate using standard regression algorithms. I will also discuss some fast algorithms for fitting this model that make it practical to use in business contexts.
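To make the regression view concrete, here is a minimal sketch on a toy hierarchy with one total and two bottom-level series. It uses plain ordinary least squares as a stand-in; the summing matrix `S`, the base forecasts, and the variable names are hypothetical, and the optimal method described in the talk may weight the regression differently.

```r
# Toy hierarchy: Total = A + B. Rows of S are (Total, A, B); columns are the
# bottom-level series (A, B). S plays the role of the design matrix.
S <- rbind(c(1, 1),
           diag(2))

# Hypothetical base forecasts that do not add up: 45 + 50 != 100.
yhat <- c(Total = 100, A = 45, B = 50)

# Regress the base forecasts on S (ordinary least squares) ...
beta <- solve(crossprod(S), crossprod(S, yhat))

# ... and map the fitted bottom-level values back to every level.
recon <- as.numeric(S %*% beta)
names(recon) <- names(yhat)
```

The reconciled forecasts in `recon` are coherent by construction, because every level is rebuilt from the same bottom-level estimates. With millions of series, `crossprod(S)` becomes too large to solve directly, which is the computational problem the talk addresses.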

**Abstract:**

In this article we explore some bivariate smoothing methods with partial differential regularizations designed to handle smooth bivariate surfaces with occasional ridges. We apply our technique to smoothing mortality rates.

Mortality rates are typically smoothed over two dimensions: age and time. Occasional ridges occur due to period effects (e.g., deaths due to wars and epidemics) and cohort effects (e.g., the effects of wars and epidemics on the survivors).

We propose three new practical methods of smoothing mortality rates over age and time. The first method uses bivariate thin plate splines. The second uses a similar procedure but with lasso-type regularization. The third method also uses bivariate lasso-type regularization, but allows for both period and cohort effects. In these smoothing methods, the logarithms of mortality rates are modelled as the sum of four components: a smooth bivariate function of age and time, smooth one-dimensional cohort effects, smooth one-dimensional period effects and random errors. Cross validation is used to compare these new smoothing methods with existing approaches.
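One way to write the four-component decomposition is shown below. The notation is an assumption made here for illustration: $m_{x,t}$ is the mortality rate at age $x$ in year $t$, and the cohort is indexed by year of birth, $t - x$.

$$
\log m_{x,t} = f(x, t) + g_{\text{cohort}}(t - x) + g_{\text{period}}(t) + \varepsilon_{x,t}
$$

Here $f$ is the smooth bivariate function of age and time, $g_{\text{cohort}}$ and $g_{\text{period}}$ are the smooth one-dimensional cohort and period effects, and $\varepsilon_{x,t}$ is the random error.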

Although our methods are designed to smooth logarithms of mortality rates, they are generic enough to be applied to any bivariate data with occasional ridges.

**Keywords:** Bivariate data, nonparametric smoothing, mortality rates, graduation, cohort effects, period effects.


The package requires the following data as input: half-hourly/hourly electricity demands; half-hourly/hourly temperatures at one or two locations; seasonal demographic and economic data; public holiday data. The formats of the required data are described in the help files.

Some documentation of the underlying model is provided at http://robjhyndman.com/working-papers/mefm/.

The package itself is hosted on github and can be installed as follows:

    install.packages("devtools")
    library(devtools)
    install_github("robjhyndman/MEFM-package")

*Foresight* (Fall, 2014). pp.42-48.

This is an introduction to our approach to forecast reconciliation without using any matrices. The original research is available here:

The software is available in the hts package for R with some notes on usage in the vignette. There is also a gentle introduction in my forecasting textbook.

**Venue**: The University Club, University of Western Australia, Nedlands WA.

**Requirements:** a laptop with R installed, along with the fpp package and its dependencies. We will also use the hts and vars packages on the third day.

Hyndman and Athanasopoulos (2014)

*Forecasting: principles and practice*,

OTexts: Melbourne, Australia.

- Introduction to forecasting [Slides, R code, Lab solutions]
- Forecasting tools [Slides, R code, Lab solutions]
- Exponential smoothing I [Slides, R code, Lab solutions]
- Exponential smoothing II [Slides, R code, Lab solutions]
- Time series decomposition and cross-validation [Slides, R code, Lab solutions]
- Transformations, stationarity and differencing [Slides, R code, Lab solutions]
- Non-seasonal ARIMA models [Slides, R code, Lab solutions]
- Seasonal ARIMA models [Slides, R code, Lab solutions]
- State space models [Slides, R code, Lab solutions]
- Dynamic regression [Slides, R code, Lab solutions]
- Hierarchical forecasting [Slides, R code, Lab solutions]
- Advanced methods [Slides, R code, Lab solutions]

*European Respiratory Journal (2014)*, *44*(Suppl 58).

**Introduction**

Asthma can be exacerbated by exposure to various fungal spores and Human Rhinovirus [HRV], but current understanding of the importance of fungal exposure to child asthma hospitalisations is limited. Moreover, the interaction between HRV and fungal spore exposure on admission has not been examined.

**Aim**

To investigate the role of outdoor fungal spores in child asthma hospitalisations and whether HRV modifies any such effect.

**Methods**

We conducted a case-crossover study of 644 child asthma hospitalisations in Melbourne, Australia (2009–11). On admission, participants had nasal and throat swabs that were tested using a sensitive nested multiplex PCR for HRV infection. Daily ambient spore counts of 14 fungi species were obtained using a Burkard Volumetric spore trap. Conditional logistic regression assessed the role of fungi adjusting for confounders. Interaction terms were included if there was evidence of effect modification from HRV. Results are presented as odds ratios [OR] per unit increase in daily number of fungi spores/m³ of air sampled.
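For readers unfamiliar with per-unit odds ratios, the relation below shows how such a figure scales with the size of the exposure change. It assumes the exposure enters the logistic model linearly; the symbols $\beta$ (the fitted log-odds coefficient) and $\Delta$ (the increase in spores/m³) are notation introduced here, not taken from the study.

$$
\mathrm{OR}_{\Delta} = \exp(\beta \Delta) = \left(\mathrm{OR}_{1}\right)^{\Delta}, \qquad \mathrm{OR}_{1} = \exp(\beta)
$$

A per-unit odds ratio only slightly above 1 can therefore correspond to a sizeable change in odds when daily spore counts vary by tens or hundreds of units.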

**Results**

Overall, a higher risk of hospitalisation was observed when participants were exposed to *Alternaria* (OR=1.011, 95%CI 1.004-1.017), *Coprinus* (1.009, 1.000-1.017) and *Leptosphaeria* (1.001, 1.000-1.013), independent of air pollution, HRV and sensitization to common allergens. There was evidence of effect modification by HRV in boys exposed to *Leptosphaeria* (1.028, 1.006-1.050) and *Ganoderma* (1.320, 1.048-1.660). There was no evidence of HRV effect modification in girls.

**Conclusion**

Some fungal genera are associated with increased risk of asthma hospitalisation in both sexes but the risk increased with two specific fungal genera in boys infected with HRV.

- School of Mathematical Sciences, Monash University, Clayton, Australia
- Department of Econometrics & Business Statistics, Monash University, Clayton, Australia
- Ceramic Fuel Cells Limited, Noble Park, Australia

Learning and Intelligent Optimization

Lecture Notes in Computer Science, vol 8426, 341-352.

**Abstract.** In this paper, we focus on expensive multiobjective optimization problems and propose a method to predict an approximation of the Pareto optimal set using classification of sampled decision vectors as dominated or nondominated. The performance of our method, called EPIC, is demonstrated on a set of benchmark problems used in the multiobjective optimization literature and compared with state-of-the-art methods, ParEGO and PAL. The initial results are promising and encourage further research in this direction.
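The labelling step that such a classifier would be trained on can be sketched as follows. This is not the EPIC method itself, only a minimal illustration of marking sampled points as dominated or nondominated, assuming all objectives are minimised; the function name `is_nondominated` and the sample data are hypothetical.

```r
# For a matrix `f` with one row per sampled decision vector and one column per
# objective (all minimised), return TRUE for rows not dominated by any other row.
is_nondominated <- function(f) {
  n <- nrow(f)
  sapply(seq_len(n), function(i) {
    dominated <- FALSE
    for (j in seq_len(n)) {
      # Row j dominates row i if it is no worse everywhere and better somewhere.
      if (j != i && all(f[j, ] <= f[i, ]) && any(f[j, ] < f[i, ])) {
        dominated <- TRUE
        break
      }
    }
    !dominated
  })
}

# Four sampled points in a two-objective problem (illustrative values).
f <- rbind(c(1, 4), c(2, 2), c(3, 3), c(4, 1))
labels <- is_nondominated(f)  # class labels a classifier could be trained on
```

In this example only the point (3, 3) is dominated, by (2, 2). The pairwise check costs on the order of n² comparisons, which is acceptable when function evaluations are expensive and samples are few.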