Leman et al. and Endert et al. develop an interactive data visualization framework called visual to parametric interaction (V2PI). With V2PI, experts may explore data visually (assess multiple data visualizations) based on their judgments and an underlying data analytic method. Specifically, V2PI offers a deterministic procedure to quantify expert judgments and update analytical parameters to create new data visualizations. In this article, we explain V2PI from a probabilistic perspective and develop Bayesian visual analytics (BaVA). We model data probabilistically, develop parallels between quantifying expert judgments and eliciting prior distributions from experts, and justify how we update parameters using Bayesian sequential updating. We apply BaVA using two linear projection methods to assess simulated and real-world datasets.
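The sequential-updating idea can be sketched in a minimal conjugate setting. This is not the BaVA model itself; it only illustrates the mechanism the abstract names, under the assumption that each quantified expert judgment is treated as a noisy normal observation of a parameter, so that the posterior after one interaction becomes the prior for the next.

```python
# Minimal sketch of Bayesian sequential updating (illustrative, not BaVA):
# each quantified expert judgment y_t is treated as a noisy observation of
# a parameter theta, with known judgment-noise variance tau2.  The posterior
# from interaction t serves as the prior for interaction t + 1
# (conjugate normal-normal updating).

def sequential_update(mu0, sigma2_0, judgments, tau2):
    """Return (posterior mean, posterior variance) after each judgment."""
    mu, s2 = mu0, sigma2_0
    history = []
    for y in judgments:
        s2_new = 1.0 / (1.0 / s2 + 1.0 / tau2)   # precisions add
        mu = s2_new * (mu / s2 + y / tau2)        # precision-weighted mean
        s2 = s2_new
        history.append((mu, s2))
    return history

# Three hypothetical quantified judgments about theta
trace = sequential_update(mu0=0.0, sigma2_0=10.0,
                          judgments=[1.2, 0.8, 1.0], tau2=1.0)
post_mean, post_var = trace[-1]
```

A useful sanity check on the design: updating sequentially gives exactly the same posterior as processing all three judgments in one batch, which is what makes the interaction loop coherent.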

We present a new technique for comparing models using a median form of cross-validation and least median of squares estimation (MCV-LMS). Rather than minimizing the sum of squared residual errors, we minimize the median of the squared residual errors. We compare this with a robustified form of cross-validation using the Huber loss function and robust coefficient estimators (HCV). Through extensive simulations we find that for linear models MCV-LMS outperforms HCV for data that is representative of the data generator when the tails of the noise distribution are sufficiently heavy and asymmetric. We also find that MCV-LMS is often better able to detect the presence of small terms. Otherwise, HCV typically outperforms MCV-LMS for 'good' data. MCV-LMS also outperforms HCV in the presence of sufficiently many severe outliers.

One of MCV-LMS and HCV also generally gives better model selection for linear models than the conventional version of cross-validation with least squares estimators (CV-LS) when the tails of the noise distribution are heavy or asymmetric, or when the coefficients are small and the data is representative. CV-LS only performs well when the tails of the error distribution are light and symmetric and the coefficients are large relative to the noise variance. Outside of these contexts and the contexts noted above, HCV outperforms CV-LS and MCV-LMS.

We illustrate CV-LS, HCV, and MCV-LMS via numerous simulations to map out when each does best on representative data, and then apply all three to a real dataset from econometrics that includes outliers.
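The core contrast between the least squares criterion underlying CV-LS and the least median of squares criterion underlying MCV-LMS can be sketched as follows. The data-generating setup is purely illustrative, the LMS fit is approximated by the standard elemental-subset resampling scheme (exact fits on random subsets of size p, keeping the fit with the smallest median squared residual), and HCV's Huber loss is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: y = 2x + light noise, plus 10 severe outliers
# placed at high leverage (x = 5, y = -10) to pull least squares off course.
n = 100
x = rng.uniform(-3, 3, n)
y = 2.0 * x + rng.normal(0, 0.5, n)
x[:10] = 5.0
y[:10] = -10.0

X = np.column_stack([np.ones(n), x])

# Least squares (the estimator behind CV-LS): minimizes the SUM of
# squared residuals, so the outliers dominate the fit.
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Least median of squares (the estimator behind MCV-LMS): minimize the
# MEDIAN of the squared residuals, approximated by elemental-subset
# resampling.  The median ignores the 10% of residuals that are huge.
best_beta, best_crit = None, np.inf
for _ in range(500):
    idx = rng.choice(n, size=2, replace=False)
    try:
        b = np.linalg.solve(X[idx], y[idx])   # exact fit on 2 points
    except np.linalg.LinAlgError:
        continue                              # degenerate subset, skip
    crit = np.median((y - X @ b) ** 2)
    if crit < best_crit:
        best_beta, best_crit = b, crit

beta_lms = best_beta
```

With this setup the LMS slope stays near the true value of 2 while the least squares slope is dragged toward the outliers, which is the behavior the simulations above probe systematically.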

Expanding a lower-dimensional problem to a higher-dimensional space and then projecting back is often beneficial. This article rigorously investigates this perspective in the context of finite mixture models, specifically how to improve inference for mixture models by using auxiliary variables. Despite the large literature on mixture models and several empirical examples, there is no previous work that gives general theoretical justification for including auxiliary variables in mixture models, even for special cases. We provide a theoretical basis for comparing inference for multivariate mixture models with the corresponding inference for marginal univariate mixture models. Analytical results for several special cases are established. We show that the probability of correctly allocating mixture memberships and the information number for the means of the primary outcome in a bivariate model with two Gaussian mixtures are generally larger than those in each univariate model. Simulations under a range of scenarios, including mis-specified models, are conducted to examine the improvement. The method is illustrated by two real applications in ecology and causal inference.
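The allocation-probability claim can be illustrated with a small simulation under known parameters. All parameter values below (component means, the covariance, equal mixing weights) are illustrative choices, not the paper's settings: with equal covariances and priors, each point is allocated to the component with the higher density, and the bivariate rule should beat the rule based on the primary coordinate alone.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-component bivariate Gaussian mixture; the second coordinate is an
# auxiliary variable correlated with the primary one.  Illustrative values.
mu1 = np.array([0.0, 0.0])
mu2 = np.array([1.5, 1.5])
cov = np.array([[1.0, 0.3],
                [0.3, 1.0]])

n = 20000
z = rng.integers(0, 2, n)                       # true memberships, pi = 0.5
means = np.where(z[:, None] == 0, mu1, mu2)
x = means + rng.multivariate_normal([0.0, 0.0], cov, n)

prec = np.linalg.inv(cov)

def log_density_core(x, mu, prec):
    # Log density up to constants shared by both components
    d = x - mu
    return -0.5 * np.einsum('ij,jk,ik->i', d, prec, d)

# Bivariate allocation: higher-density component wins
alloc_biv = (log_density_core(x, mu2, prec) >
             log_density_core(x, mu1, prec)).astype(int)

# Univariate allocation: primary coordinate only (nearer mean wins)
alloc_uni = ((x[:, 0] - mu2[0]) ** 2 < (x[:, 0] - mu1[0]) ** 2).astype(int)

acc_biv = np.mean(alloc_biv == z)
acc_uni = np.mean(alloc_uni == z)
```

Under these parameters the bivariate rule uses the larger Mahalanobis separation between the components, so its correct-allocation rate exceeds the univariate rate, in line with the theoretical comparison the abstract describes.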

The support vector machine (SVM) and other reproducing kernel Hilbert space (RKHS) based classifier systems have drawn much attention recently owing to their robustness and generalization capability. The general theme is to construct classifiers based on the training data in a high-dimensional space by using all available dimensions. The SVM achieves substantial data compression by selecting only the few observations that lie close to the boundary of the classifier function. However, when the number of observations is not very large (small *n*) but the number of dimensions/features is large (large *p*), not all available features are necessarily of equal importance in the classification context. Selecting a useful fraction of the available features may result in substantial data compression. In this paper, we propose an algorithmic approach by means of which such an *optimal* set of features can be selected. In short, we reverse the traditional sequential observation selection strategy of the SVM to one of sequential feature selection. To achieve this, we modify the solution proposed by Zhu and Hastie in the context of the import vector machine (IVM) to select an *optimal* sub-dimensional model on which to build the final classifier with sufficient accuracy.
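The sequential-feature-selection idea can be sketched with a generic greedy forward loop around an RKHS classifier. This is not the authors' IVM-based algorithm: the sketch assumes a plain RBF kernel ridge classifier in place of the SVM/IVM machinery, toy data in which a single feature determines the label, and validation accuracy as the selection criterion.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy large-p, small-n data: only feature 0 carries the class signal.
n, p = 120, 30
X = rng.normal(0.0, 1.0, (n, p))
y = np.where(X[:, 0] > 0, 1.0, -1.0)

train, val = np.arange(0, 80), np.arange(80, n)

def rbf(A, B, gamma=1.0):
    # RBF (Gaussian) kernel matrix between the rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def val_accuracy(features, lam=1e-2):
    """Validation accuracy of a kernel ridge classifier on a feature subset."""
    Xt = X[np.ix_(train, features)]
    Xv = X[np.ix_(val, features)]
    K = rbf(Xt, Xt)
    alpha = np.linalg.solve(K + lam * np.eye(len(train)), y[train])
    pred = np.sign(rbf(Xv, Xt) @ alpha)
    return float(np.mean(pred == y[val]))

# Greedy forward selection: repeatedly add the feature that most improves
# validation accuracy; stop when no remaining feature helps.
selected, remaining, best_acc = [], list(range(p)), 0.0
while remaining:
    acc, j = max((val_accuracy(selected + [j]), j) for j in remaining)
    if acc <= best_acc:
        break
    selected.append(j)
    remaining.remove(j)
    best_acc = acc
```

On this toy problem the loop should pick the signal feature first and reach high accuracy with a very small feature subset, which is the compression effect the paper targets; the authors' actual procedure replaces this generic scoring step with the modified IVM solution.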