The rational Arnoldi process is a popular method for computing a few eigenvalues of a large non-Hermitian matrix and for approximating matrix functions. The method is particularly attractive when the rational functions that determine the process have only a few distinct poles, because then only a few factorizations of matrices of the form *A* − *z*_{j}*I* have to be computed. We discuss recursion relations for orthogonal bases of rational Krylov subspaces determined by rational functions with few distinct poles. These recursion formulas yield a new implementation of the rational Arnoldi process. Applications of the rational Arnoldi process to the approximation of matrix functions as well as to the computation of eigenvalues and pseudospectra of *A* are described. The new implementation is compared with several available implementations.
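As a rough numerical illustration of why few distinct poles are attractive (a sketch of the general idea, not the implementation proposed in the paper; all names below are our own), the following fragment builds an orthonormal rational Krylov basis with a single repeated pole *z*, so that one factorization of *A* − *zI* serves every step.

```python
import numpy as np

def rational_arnoldi_single_pole(A, b, z, m):
    """Orthonormal basis V of span{b, (A-zI)^{-1}b, ..., (A-zI)^{-m}b}.
    With a single (repeated) pole z, one factorization of A - zI serves
    every iteration; the explicit inverse below stands in for a reusable
    LU factorization, which is what one would store in practice."""
    n = A.shape[0]
    M = np.linalg.inv(A - z * np.eye(n))     # "factor once, apply often"
    V = np.zeros((n, m + 1))
    V[:, 0] = b / np.linalg.norm(b)
    H = np.zeros((m + 1, m))
    for j in range(m):
        w = M @ V[:, j]                      # apply (A - zI)^{-1}
        for i in range(j + 1):               # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))              # small non-Hermitian example
b = rng.standard_normal(8)
V, H = rational_arnoldi_single_pole(A, b, z=-1.0, m=4)
```

With several distinct poles, the same loop would switch factorizations between steps; the paper's recursions address exactly the case where only a few such switches occur.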

In this paper, we first introduce a new tensor product for transition probability tensors arising from higher-order Markov chains. We then establish some properties of the new tensor product and study its relationship with the stationary probability vector, showing that the results obtained with this product parallel those of the first-order case. Furthermore, we prove the convergence of a transition probability tensor to the stationary probability vector. Finally, we show how to compute a stationary probability vector in some numerical examples and compare the proposed method with an existing method for obtaining stationary probability vectors. Copyright © 2016 John Wiley & Sons, Ltd.
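For readers unfamiliar with stationary vectors of transition tensors, the following sketch computes one by the commonly used fixed-point iteration x ← Px² (the standard tensor–vector product); the paper's new tensor product is a different construction, so this is background, not the proposed method, and the toy tensor is our own.

```python
import numpy as np

# Toy third-order transition probability tensor for a 2-state, second-order
# Markov chain: P[i, j, k] = Prob(next = i | current = j, previous = k);
# every fiber P[:, j, k] sums to 1.
P = np.array([[[0.7, 0.4], [0.5, 0.2]],
              [[0.3, 0.6], [0.5, 0.8]]])

def stationary_vector(P, tol=1e-12, max_iter=1000):
    """Fixed-point iteration x <- P x^2, a common way to define the
    stationary probability vector of a transition probability tensor."""
    n = P.shape[0]
    x = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        x_new = np.einsum('ijk,j,k->i', P, x, x)   # (P x^2)_i
        x_new /= x_new.sum()                       # keep it a probability vector
        if np.linalg.norm(x_new - x, 1) < tol:
            return x_new
        x = x_new
    return x

x = stationary_vector(P)
```

For this tensor the iteration converges to x = (0.4, 0.6), which satisfies Px² = x; comparing such fixed points with the first-order (matrix) case is exactly the kind of parallel the abstract refers to.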

The matrix logarithmic norm is an important quantity that characterizes the stability of linear dynamical systems. We propose logarithmic norms for tensors and tensor pairs and extend some classical results from the matrix case. Moreover, we derive explicit forms of several tensor logarithmic norms and semi-norms. Employing the tensor logarithmic norms, we bound the real parts of all eigenvalues of a complex tensor and study the stability of a class of nonlinear dynamical systems. Copyright © 2016 John Wiley & Sons, Ltd.
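The matrix case that the paper generalizes can be stated in a few lines: for the 2-norm, the logarithmic norm of A is the largest eigenvalue of its symmetric part, and it bounds the real parts of all eigenvalues of A. A minimal sketch in our own notation:

```python
import numpy as np

def log_norm_2(A):
    """Logarithmic norm for the 2-norm:
    mu_2(A) = largest eigenvalue of the symmetric part (A + A^H)/2."""
    return np.linalg.eigvalsh((A + A.conj().T) / 2.0)[-1]

A = np.array([[-2.0, 1.0],
              [ 0.0, -3.0]])
mu = log_norm_2(A)
# Classical facts: Re(lambda) <= mu for every eigenvalue lambda of A, and
# solutions of x' = A x satisfy ||x(t)|| <= exp(mu * t) * ||x(0)||,
# so mu < 0 certifies asymptotic stability.
```

Here mu = (√2 − 5)/2 ≈ −1.79 < 0, so this A is stable; the tensor logarithmic norms in the paper play the analogous role for a class of nonlinear systems.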

Large-scale scientific computing models are needed for the simulation of wave propagation, especially for multiple-frequency and high-frequency models in complex heterogeneous media. Multigrid methods provide efficient iterative solvers for many large sign-definite systems of equations resulting from physical models. Time-harmonic wave propagation models, however, lead to sign-indefinite systems with eigenvalues in the left half of the complex plane, so standard multigrid approaches applied in conjunction with a low-order finite difference or finite element method are not sufficient. In this work, we describe a high-order finite element model for multiple (low to high) frequency time-harmonic acoustic wave propagation on general curved, non-convex, and non-smooth domains with heterogeneous media, using a multigrid approximation of the shifted Laplacian operator as a preconditioner. We implement the model using an efficient geometric multigrid approach with parallel grid-transfer operator calculations and the BiCGStab iterative solver. We demonstrate the efficiency and parallel performance of the computational model with input incident waves ranging from low (5 wavelengths) to high (100 wavelengths) frequency. Copyright © 2016 John Wiley & Sons, Ltd.

The incompressible Stokes equations are a widely used model of viscous or tightly confined flow in which convection effects are negligible. In order to strongly enforce the conservation of mass at the element scale, special discretization techniques must be employed. In this paper, we consider a discontinuous Galerkin approximation in which the velocity field is *H*(div,Ω)-conforming and divergence-free, based on the Brezzi–Douglas–Marini finite element space, with the complementary piecewise constant (*P*_{0}) space for the pressure. Because of the saddle-point structure and the nature of the resulting variational formulation, the linear systems can be difficult to solve, and specialized preconditioning strategies are required to solve them efficiently. We compare the effectiveness of two families of preconditioners for saddle-point systems when applied to the resulting matrix problem: block-factorization techniques, in which the velocity block is preconditioned using geometric multigrid, and fully coupled monolithic multigrid methods. We present parameter study data and a serial timing comparison, and we show that a monolithic multigrid preconditioner using Braess–Sarazin style relaxation provides the fastest time to solution for the test problem considered. Copyright © 2016 John Wiley & Sons, Ltd.
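The block-factorization family can be illustrated on a generic saddle-point matrix (this is the classical Murphy–Golub–Wathen observation on an ideal block-diagonal preconditioner, not the paper's multigrid preconditioners; the random blocks below are our own): with P = diag(A, BA⁻¹Bᵀ), the preconditioned matrix has only three distinct eigenvalues, so a Krylov method such as MINRES converges in three iterations.

```python
import numpy as np

# Saddle-point system K = [[A, B^T], [B, 0]] with A SPD and B full rank.
rng = np.random.default_rng(0)
n, m = 8, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                 # SPD "velocity" block
B = rng.standard_normal((m, n))             # full-rank "divergence" block
K = np.block([[A, B.T],
              [B, np.zeros((m, m))]])

# Ideal block-diagonal preconditioner: exact velocity block and exact
# pressure Schur complement S = B A^{-1} B^T (multigrid replaces these
# exact solves in practice).
S = B @ np.linalg.solve(A, B.T)
P = np.block([[A, np.zeros((n, m))],
              [np.zeros((m, n)), S]])

eigs = np.linalg.eigvals(np.linalg.solve(P, K))   # real, by symmetry of the pencil
distinct = np.unique(np.round(np.real(eigs), 8))
# Exactly three clusters: 1 and (1 +/- sqrt(5))/2.
```

This clustering is why the velocity block and a Schur-complement approximation are the two ingredients that block-factorization preconditioners try to approximate cheaply.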

This paper proposes a new, low-communication algorithm for solving PDEs on massively parallel computers. The range decomposition (RD) algorithm exposes coarse-grain parallelism by applying nested iteration and adaptive mesh refinement locally before performing a global communication step. Just a few such steps are observed to be sufficient to obtain accuracy within a small multiple of discretization error. The target applications are petascale and exascale machines, where hierarchical parallelism is required and traditional parallel numerical PDE communication patterns are costly because of message latency. The RD algorithm uses a partition of unity to equally distribute the error, and thus the work. The computational advantage of this approach is that the decomposed problems can be solved in parallel without any communication until the partitioned solutions are summed, which is attractive in the paradigm of expensive communication but very cheap computation. This paper introduces the method and explains the details of the communication step. Two performance models are developed, showing that the latency cost associated with a traditional parallel implementation of nested iteration is proportional to log(*P*)^{2}, whereas the RD method reduces the communication latency to log(*P*), while maintaining similar bandwidth costs. Numerical results for two problems, Laplace and advection–diffusion, demonstrate the enhanced performance, and a heuristic argument explains why the method converges quickly. Copyright © 2016 John Wiley & Sons, Ltd.

This paper discusses the application of a multigrid-in-time scheme to Least Squares Shadowing (LSS), a novel sensitivity analysis method for chaotic dynamical systems. While traditional sensitivity analysis methods break down for chaotic dynamical systems, LSS is able to compute accurate gradients. Multigrid is used because LSS requires solving a very large Karush–Kuhn–Tucker system constructed from the solution of the dynamical system over the entire time interval of interest. Several different multigrid-in-time schemes are examined, and a number of factors are found to heavily influence the convergence rate of multigrid-in-time for LSS, including the iterative method used for the smoother, how the coarse-grid system is formed, and how the least squares objective function at the center of LSS is weighted. Copyright © 2014 John Wiley & Sons, Ltd.

No abstract is available for this article.

We consider symmetrized Karush–Kuhn–Tucker systems arising in the solution of convex quadratic programming problems in standard form by interior point methods. Their coefficient matrices usually have a 3 × 3 block structure, and, under suitable conditions on both the quadratic programming problem and the solution, they are nonsingular in the limit. We present new spectral estimates for these matrices: the new bounds are established for the unpreconditioned matrices and for the matrices preconditioned by symmetric positive definite augmented preconditioners. Some of the obtained results complete the analysis recently given by Greif, Moulding, and Orban in [*SIAM J. Optim., 24 (2014), pp. 49–83*]. The sharpness of the new estimates is illustrated by numerical experiments. Copyright © 2016 John Wiley & Sons, Ltd.

This paper deals with the problem of recovering an unknown low-rank matrix from a sampling of its entries. For its solution, we consider a nonconvex approach based on the minimization of a functional that is the sum of a convex fidelity term and a nonconvex, nonsmooth relaxation of the rank function. We show that, by a suitable choice of this nonconvex penalty, it is possible, under mild assumptions, to use the iterative forward–backward splitting method in this matrix setting as well. Specifically, we propose certain parameter-dependent nonconvex penalties that, for a good choice of the parameter value, allow the backward step to be solved as a convex minimization problem, and we exploit this result to prove the convergence of the iterative forward–backward splitting algorithm. Based on the theoretical results, we develop an efficient iterative algorithm for matrix completion, the improved matrix completion forward–backward algorithm, which exhibits lower computing times and improved recovery performance compared with the best state-of-the-art algorithms for matrix completion. Copyright © 2016 John Wiley & Sons, Ltd.
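A minimal sketch of forward–backward splitting for matrix completion, using for clarity the convex nuclear-norm prox (singular-value soft-thresholding) in the backward step; the paper's parameter-dependent nonconvex penalties replace this shrinkage rule, and all parameter values below are illustrative only.

```python
import numpy as np

def fb_complete(M_obs, mask, lam=0.05, step=1.0, iters=500):
    """Forward-backward splitting for matrix completion.
    Forward step: gradient step on the quadratic fidelity term over the
    observed entries.  Backward step: prox of lam * nuclear norm, i.e.
    soft-thresholding of the singular values (convex surrogate used here
    in place of the paper's nonconvex penalty)."""
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        G = X + step * mask * (M_obs - X)              # forward (gradient) step
        U, s, Vt = np.linalg.svd(G, full_matrices=False)
        X = U @ (np.maximum(s - step * lam, 0.0)[:, None] * Vt)  # backward step
    return X

rng = np.random.default_rng(1)
L_true = rng.standard_normal((20, 4)) @ rng.standard_normal((4, 20))  # rank 4
mask = rng.random((20, 20)) < 0.6                       # ~60% entries observed
X = fb_complete(mask * L_true, mask)
rel_err = np.linalg.norm(X - L_true) / np.linalg.norm(L_true)
```

With a step size no larger than the reciprocal Lipschitz constant of the fidelity gradient (here 1), each iteration decreases the objective; the nonconvex penalties studied in the paper sharpen the shrinkage near large singular values.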

An algorithm to generate samples with approximate first-, second-, third-, and fourth-order moments is presented by extending the Cholesky matrix decomposition to a Cholesky tensor decomposition of arbitrary order. The tensor decomposition of the first-, second-, third-, and fourth-order objective moments generates a nonlinear system of equations, which the algorithm solves by numerical methods. The results show that the optimization algorithm delivers samples with an approximate residual error of less than 10^{−16} between the components of the objective and the sample moments. The algorithm is extended to an *n*-th-order approximate tensor moment version, and simulations of non-normal samples replicated from distributions with asymmetries and heavy tails are presented. An application to sensitivity analysis of portfolio risk assessment with Value-at-Risk (*VaR*) is provided. A comparison with previous methods available in the literature suggests that the proposed methodology reduces the error of the objective moments in the generated samples. Copyright © 2016 John Wiley & Sons, Ltd.
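The second-order special case that the Cholesky tensor decomposition generalizes is the familiar matrix construction: transforming whitened draws by the Cholesky factor of the target covariance matches the first two sample moments exactly. A sketch with our own choice of targets (the paper's algorithm additionally matches third- and fourth-order moments):

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, -2.0, 0.5])          # target first-order moments
C = np.array([[2.0, 0.6, 0.3],           # target second-order moments (SPD)
              [0.6, 1.5, 0.2],
              [0.3, 0.2, 1.0]])
L = np.linalg.cholesky(C)                # C = L L^T

Z = rng.standard_normal((1000, 3))       # raw i.i.d. draws
Zc = Z - Z.mean(axis=0)                  # center: exact zero sample mean
W = np.linalg.cholesky(np.cov(Zc, rowvar=False, bias=True))
Y = Zc @ np.linalg.inv(W).T              # whiten: sample covariance = I
X = mu + Y @ L.T                         # sample mean mu, sample covariance C
```

Matching third- and fourth-order moment tensors in the same spirit yields the nonlinear system of equations mentioned in the abstract, which no closed-form factor solves and which therefore requires numerical optimization.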

We consider the parallel factorization of sparse finite element matrices on distributed memory machines. Our method is based on a nested dissection approach combined with a cyclic re-distribution of the interface Schur complements. We present a detailed definition of the parallel method, and the well-posedness and the complexity of the algorithm are analyzed. A lean and transparent functional interface to existing finite element software is defined, and the performance is demonstrated for several representative examples. Copyright © 2016 John Wiley & Sons, Ltd.
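The interface Schur complement at the heart of a nested dissection factorization can be sketched in a few lines for two subdomains separated by an interface (a toy 1D example with names of our own choosing, not the paper's parallel algorithm): interior unknowns are eliminated independently per subdomain, and only the interface system couples them.

```python
import numpy as np

def laplacian(n):
    """1D finite element/difference Laplacian (tridiagonal)."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

A = laplacian(7)
b = np.arange(1.0, 8.0)
I1, I2, G = [0, 1, 2], [4, 5, 6], [3]    # two subdomains, one interface node

A11, A22, AGG = A[np.ix_(I1, I1)], A[np.ix_(I2, I2)], A[np.ix_(G, G)]
A1G, AG1 = A[np.ix_(I1, G)], A[np.ix_(G, I1)]
A2G, AG2 = A[np.ix_(I2, G)], A[np.ix_(G, I2)]

# Interface Schur complement: the (dense) system left after the interior
# blocks are eliminated -- each solve below is subdomain-local, hence parallel.
S = AGG - AG1 @ np.linalg.solve(A11, A1G) - AG2 @ np.linalg.solve(A22, A2G)
bG = b[G] - AG1 @ np.linalg.solve(A11, b[I1]) - AG2 @ np.linalg.solve(A22, b[I2])

xG = np.linalg.solve(S, bG)              # interface solve
x1 = np.linalg.solve(A11, b[I1] - A1G @ xG)   # back-substitute per subdomain
x2 = np.linalg.solve(A22, b[I2] - A2G @ xG)
x = np.empty(7)
x[I1], x[G], x[I2] = x1, xG, x2
```

With many subdomains the interface Schur complements become the dominant, densest part of the factorization, which is what the cyclic re-distribution in the paper targets.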

This paper presents a class of limited memory preconditioners (LMPs) for solving linear systems of equations with symmetric indefinite matrices and multiple right-hand sides. These preconditioners, based on limited memory quasi-Newton formulas, require a small number *k* of linearly independent vectors and may be used to improve an existing first-level preconditioner. The contributions of the paper are threefold. First, we derive a formula to characterize the spectrum of the preconditioned operator: a spectral analysis of the preconditioned matrix shows that the eigenvalues are all real and that the LMP class is able to cluster at least *k* eigenvalues at 1. Second, we show that the eigenvalues of the preconditioned matrix enjoy interlacing properties with respect to the eigenvalues of the original matrix, provided that the *k* linearly independent vectors have first been projected onto the invariant subspaces associated with the eigenvalues of the original matrix in the open right and left half-planes, respectively. Third, we focus on theoretical properties of the Ritz-LMP variant, where Ritz information is used to determine the *k* vectors. Finally, we illustrate the numerical behaviour of the Ritz limited memory preconditioners on realistic applications in structural mechanics that require the solution of sequences of large-scale symmetric saddle-point systems. Numerical experiments show the relevance of the proposed preconditioner, leading to a significant decrease in computational operations when solving such sequences of linear systems; a saving of up to 43% in computational effort is obtained in one of these applications. Copyright © 2016 John Wiley & Sons, Ltd.

As the computational power of high-performance computing systems continues to increase through huge numbers of cores or specialized processing units, high-performance computing applications are increasingly prone to faults. In this paper, we present a new class of numerical fault tolerance algorithms to cope with node crashes in parallel distributed environments. This new resilient scheme is designed at the application level and does not require extra resources, that is, extra computational units or computing time, when no fault occurs. In the framework of iterative methods for the solution of sparse linear systems, we present numerical algorithms to extract relevant information from available data after a fault, assuming a separate mechanism ensures fault detection. After data extraction, a well-chosen part of the missing data is regenerated through interpolation strategies to constitute meaningful inputs to restart the iterative scheme. We have developed these methods, referred to as interpolation–restart techniques, for Krylov subspace linear solvers. After a fault, lost entries of the current iterate computed by the solver are interpolated to define a new initial guess from which to restart the Krylov method; this well-suited initial guess is computed using the entries of the faulty iterate available on the surviving nodes. We present two interpolation policies that preserve key numerical properties of well-known linear solvers, namely the monotonic decrease of the A-norm of the error for the conjugate gradient method and the decrease of the residual norm for the generalized minimal residual (GMRES) method. The qualitative numerical behavior of the resulting schemes has been validated with sequential simulations in which the number of faults and the amount of data loss are varied. Finally, the computational costs associated with the recovery mechanism have been evaluated through parallel experiments. Copyright © 2016 John Wiley & Sons, Ltd.
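One natural interpolation policy of the kind described can be sketched as follows (a toy sequential model of a crash with our own naming, not the paper's parallel implementation): the lost entries of the iterate are regenerated by solving a small local system built from the surviving entries.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned system
x_true = rng.standard_normal(n)
b = A @ x_true

lost = [2, 5, 6]                                  # entries owned by the crashed node
kept = [i for i in range(n) if i not in lost]

x = x_true.copy()
x[lost] = np.nan                                  # node crash: those entries vanish

# Linear-interpolation recovery: enforce the `lost` rows of A x = b using
# the surviving entries, i.e. solve
#   A[lost, lost] x_lost = b[lost] - A[lost, kept] x_kept.
x[lost] = np.linalg.solve(
    A[np.ix_(lost, lost)],
    b[lost] - A[np.ix_(lost, kept)] @ x[kept])
```

Here the surviving entries are exact, so the recovery is exact; in the iterative setting the surviving entries are only a current iterate, and the recovered vector serves as the well-suited initial guess from which the Krylov method is restarted.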

Let **n** = (

In this study, we examine accurate matrix multiplication in floating-point arithmetic. We demonstrate error-free transformations of matrix multiplication using high-performance basic linear algebra subprograms. These transformations can be applied to accurately compute the product of two matrices with floating-point entries. A key technique for this calculation is error-free splitting of floating-point matrices. In this study, we improve upon our previous method by a posteriori validation using floating-point exceptions: we exploit the presence of overflow in a positive manner to detect whether rounding errors occur. If overflow occurs, the result contains exceptional values such as ±∞ and NaN, and the method necessarily fails; otherwise, we can confirm that no rounding error occurs in the process. Therefore, reducing the possibility of overflow is important. The numerical results suggest that the proposed algorithm provides more accurate results compared with the original algorithm. Moreover, for the product of *n* × *n* matrices, the new algorithm reduces the computing time for error-free transformation by an average of 20% and up to 30% compared with the original algorithm. Furthermore, the new algorithm can be used when matrix multiplication is performed using divide-and-conquer methods. Copyright © 2016 John Wiley & Sons, Ltd.
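The error-free splitting underlying such transformations can be sketched as follows (a simplified, globally scaled two-term variant with our own names; practical schemes scale per row or column and use more split terms): each factor is split so that the product of the leading parts incurs no rounding at all.

```python
import numpy as np

def ef_split(A, s):
    """Error-free splitting A = A_hi + A_lo (the sum is exact in floating
    point), where every entry of A_hi carries at most s significand bits.
    Uses a single global scaling for brevity."""
    e = np.ceil(np.log2(np.max(np.abs(A))))
    sigma = 2.0 ** (e + 53 - s)          # splitting constant
    A_hi = (A + sigma) - sigma           # rounding chops off the low bits
    return A_hi, A - A_hi                # A_lo is the exact remainder

rng = np.random.default_rng(4)
n = 64
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# With s significand bits per factor and log2(n) carry bits for the dot
# products, 2*s + log2(n) <= 53 guarantees A_hi @ B_hi is exact in doubles.
s = int((53 - np.ceil(np.log2(n))) // 2)
A_hi, A_lo = ef_split(A, s)
B_hi, B_lo = ef_split(B, s)
C_hi = A_hi @ B_hi                       # computed without any rounding error
```

The full product is then accumulated from A_hi·B_hi, A_hi·B_lo, A_lo·B_hi, and A_lo·B_lo; the overflow check described in the abstract monitors exactly these splits and products, since a split pushed out of range produces ±∞ or NaN.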

Many modern approaches of time series analysis belong to the class of methods based on approximating high-dimensional spaces by low-dimensional subspaces. A typical method would embed a given time series into a structured matrix and find a low-dimensional approximation to this structured matrix. The purpose of this paper is twofold: (i) to establish a correspondence between a class of SVD-compatible matrix norms on the space of Hankel matrices and weighted vector norms (and provide methods to construct this correspondence) and (ii) to motivate the importance of this for problems in time series analysis. Examples are provided to demonstrate the merits of judiciously selecting weights on imputing missing data and forecasting in time series. Copyright © 2016 John Wiley & Sons, Ltd.
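A minimal sketch of the structured-matrix viewpoint (one step of basic singular spectrum analysis with unweighted norms and our own names, not the paper's weighted construction): embed the series in a Hankel matrix, truncate its SVD, and average back along antidiagonals.

```python
import numpy as np

def hankel_embed(x, L):
    """Embed a length-N series into its L x (N-L+1) trajectory (Hankel)
    matrix, H[i, j] = x[i + j]."""
    N = len(x)
    return np.array([x[i:i + N - L + 1] for i in range(L)])

def lowrank_hankel_approx(x, L, r):
    """Rank-r approximation of the trajectory matrix, mapped back to a
    series by diagonal averaging (one Cadzow/SSA reconstruction step)."""
    H = hankel_embed(x, L)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Hr = (U[:, :r] * s[:r]) @ Vt[:r]     # best rank-r approximation of H
    N = len(x)
    y = np.zeros(N)
    cnt = np.zeros(N)
    for i in range(L):                   # average Hr over antidiagonals
        for j in range(Hr.shape[1]):
            y[i + j] += Hr[i, j]
            cnt[i + j] += 1
    return y / cnt

t = np.arange(100)
x = np.sin(0.2 * t)                      # a sinusoid has Hankel rank 2
y = lowrank_hankel_approx(x, L=20, r=2)  # reconstructed exactly (up to fp)
```

The unweighted Frobenius norm used implicitly above is the simplest member of the class the paper studies; choosing a weighted (SVD-compatible) norm changes which low-rank approximation is taken, which is what the paper exploits for imputation and forecasting.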