In variational data assimilation a least-squares objective function is minimised to obtain the most likely state of a dynamical system. This objective function combines observation and prior (or background) data weighted by their respective error statistics. In numerical weather prediction, data assimilation is used to estimate the current atmospheric state, which then serves as an initial condition for a forecast. New developments in the treatment of observation uncertainties have recently been shown to cause convergence problems for this least-squares minimisation. This is important for operational numerical weather prediction centres due to the time constraints of producing regular forecasts. The condition number of the Hessian of the objective function can be used as a proxy to investigate the speed of convergence of the least-squares minimisation. In this paper we develop novel theoretical bounds on the condition number of the Hessian. These new bounds depend on the minimum eigenvalue of the observation error covariance matrix and the ratio of background error variance to observation error variance. Numerical tests in a linear setting show that the placement of observations has an important effect on the condition number of the Hessian. We identify that the conditioning of the problem is related to the complex interactions between the observation and background error covariance matrices. Increased understanding of the role of each constituent matrix in the conditioning of the Hessian will prove useful for informing the choice of correlated observation error covariance matrix and observation locations, particularly in practical applications.
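
In the linear, direct-observation setting, the Hessian of the objective function is *S* = *B*^{−1} + *H*^{T}*R*^{−1}*H*. The sketch below (with an illustrative exponential background correlation, uncorrelated observation errors, and hypothetical observation locations, none of which are the paper's configuration) shows how observation placement changes the condition number of *S*:

```python
import numpy as np

# Hessian of linear 3D-Var: S = B^{-1} + H^T R^{-1} H.
# B, R, the grid, and the observation locations are illustrative choices.

n = 50                                    # state-space dimension
grid = np.arange(n)
L = 5.0                                   # background correlation length scale
B = np.exp(-np.abs(grid[:, None] - grid[None, :]) / L)  # exponential correlation B
R = 0.1 * np.eye(4)                       # uncorrelated observation errors

def hessian_cond(obs_idx):
    """Condition number of S = B^{-1} + H^T R^{-1} H for observations at obs_idx."""
    H = np.zeros((len(obs_idx), n))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0   # direct point observations
    S = np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H
    return np.linalg.cond(S)

# Clustered vs. spread-out observation locations change the conditioning.
cond_clustered = hessian_cond([24, 25, 26, 27])
cond_spread = hessian_cond([5, 18, 31, 44])
```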

This paper discusses techniques for computing a few selected eigenvalue–eigenvector pairs of large and sparse symmetric matrices. A recently developed class of techniques to solve this type of problem is based on integrating the matrix resolvent operator along a complex contour that encloses the interval containing the eigenvalues of interest. This paper considers such contour integration techniques from a domain decomposition viewpoint and proposes two schemes. The first scheme can be seen as an extension of domain decomposition linear system solvers in the framework of contour integration methods for eigenvalue problems, such as FEAST. The second scheme focuses on integrating the resolvent operator primarily along the interface region defined by adjacent subdomains. A parallel implementation of the proposed schemes is described, and results on distributed computing environments are reported. These results show that domain decomposition approaches can lead to reduced run times and improved scalability.
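
The contour-integration idea can be sketched in a few lines: the spectral projector *P* = (1/2π*i*) ∮ (*z**I* − *A*)^{−1} d*z* is approximated by a quadrature rule along a circle enclosing the wanted eigenvalues. The matrix, contour, and quadrature order below are illustrative assumptions, not the paper's domain decomposition schemes:

```python
import numpy as np

# Quadrature approximation of the spectral projector onto the eigenspace of a
# symmetric matrix associated with eigenvalues inside a circle in the complex
# plane. Matrix, contour, and quadrature order are illustrative.

n = 60
lam = np.linspace(0.0, 10.0, n)          # known spectrum 0, ..., 10
A = np.diag(lam)
c, r = 2.0, 0.6                          # circle enclosing the eigenvalues in (1.4, 2.6)
nq = 32                                  # midpoint-rule quadrature points

P = np.zeros((n, n))
for k in range(nq):
    t = 2.0 * np.pi * (k + 0.5) / nq
    z = c + r * np.exp(1j * t)
    # weight (1/2*pi*i) * dz with dz = i r e^{it} dtheta, dtheta = 2*pi/nq
    P += np.real((r * np.exp(1j * t) / nq) * np.linalg.inv(z * np.eye(n) - A))

eigs_inside = int(np.sum((lam > c - r) & (lam < c + r)))
trace_P = np.trace(P)                    # should approximate eigs_inside
```

The trace of the approximate projector estimates how many eigenvalues lie inside the contour; FEAST-type methods then apply this filter to a block of vectors and perform a Rayleigh–Ritz step.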

For the discrete linear systems resulting from the discretization of the one-dimensional anisotropic spatial fractional diffusion equations with variable coefficients by the shifted finite-difference formulas of the Grünwald–Letnikov type, we propose a class of respectively scaled Hermitian and skew-Hermitian splitting iteration methods and establish their asymptotic convergence theory. The corresponding induced matrix splitting preconditioner, through further replacement of the involved Toeplitz matrices with certain circulant matrices, leads to an economical variant that can be executed by fast Fourier transforms. Both theoretical analysis and numerical implementations show that this fast respectively scaled Hermitian and skew-Hermitian splitting preconditioner can significantly improve the computational efficiency of the Krylov subspace iteration methods employed as effective linear solvers for the target discrete linear systems.
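
The classical Hermitian and skew-Hermitian splitting (HSS) iteration that such methods build on alternates two shifted half-steps. The following sketch applies plain HSS, not the paper's respectively scaled variant, to a generic positive definite nonsymmetric Toeplitz-like matrix:

```python
import numpy as np

# Classical HSS iteration: split A = H + S with H = (A + A^T)/2 (Hermitian part)
# and S = (A - A^T)/2 (skew-Hermitian part), then alternate
#   (alpha*I + H) x^{k+1/2} = (alpha*I - S) x^k + b,
#   (alpha*I + S) x^{k+1}   = (alpha*I - H) x^{k+1/2} + b.
# The test matrix is an illustrative nonsymmetric Toeplitz matrix, not the
# fractional-diffusion discretization of the paper.

n = 40
A = 2.0 * np.eye(n) - 0.8 * np.eye(n, k=-1) - 0.4 * np.eye(n, k=1)
b = np.ones(n)
H, S = (A + A.T) / 2.0, (A - A.T) / 2.0
alpha = 1.0
I = np.eye(n)

x = np.zeros(n)
for _ in range(200):
    x_half = np.linalg.solve(alpha * I + H, (alpha * I - S) @ x + b)
    x = np.linalg.solve(alpha * I + S, (alpha * I - H) @ x_half + b)

residual = np.linalg.norm(A @ x - b)
```

The iteration converges for any α > 0 whenever *H* is positive definite; in practice the induced splitting matrix is used as a preconditioner rather than iterated to convergence.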

This paper presents some recent advances for parallel-in-time methods applied to linear elasticity. With recent computer architecture changes leading to stagnant clock speeds, but ever increasing numbers of cores, future speedups will be available through increased concurrency. Thus, sequential algorithms, such as time stepping, will become a bottleneck. This paper explores multigrid reduction in time (MGRIT) for an important application area, linear elasticity. Previously, efforts at parallel-in-time for elasticity have experienced difficulties, for example, the beating phenomenon. As a result, practical parallel-in-time algorithms for this application area currently do not exist. This paper proposes some solutions made possible by MGRIT (e.g., slow temporal coarsening and FCF-relaxation) and, more importantly, a different formulation of the problem that is more amenable to parallel-in-time methods. Using a recently developed convergence theory for MGRIT and Parareal, we show that the changed formulation of the problem avoids the instability issues and allows the reduction of the error using two temporal grids. We then extend our approach to the multilevel case, where we demonstrate how slow temporal coarsening improves convergence. The paper ends with supporting numerical results showing a practical algorithm enjoying speedup benefits over the sequential algorithm.
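
On two time grids with F-relaxation, MGRIT reduces to the Parareal iteration, which can be sketched for a scalar model ODE. The propagators, step counts, and decay rate below are illustrative choices, not the elasticity formulation of the paper:

```python
import numpy as np

# Parareal sketch for the scalar test ODE u' = lam*u: a cheap coarse propagator
# G corrects an accurate fine propagator F via
#   U_{n+1}^{k+1} = G(U_n^{k+1}) + F(U_n^k) - G(U_n^k).

lam = -1.0
T, N = 2.0, 10               # time horizon split into N coarse intervals
dT = T / N
m = 20                       # fine backward Euler substeps per coarse interval

def fine(u, dt=dT / m, steps=m):
    for _ in range(steps):   # the "expensive" propagator
        u = u / (1.0 - dt * lam)
    return u

def coarse(u):
    return u / (1.0 - dT * lam)   # one backward Euler step (the cheap propagator)

U = np.zeros(N + 1)
U[0] = 1.0
for n in range(N):           # initial coarse sweep
    U[n + 1] = coarse(U[n])
for k in range(5):           # Parareal corrections
    Uold = U.copy()
    for n in range(N):
        U[n + 1] = coarse(U[n]) + fine(Uold[n]) - coarse(Uold[n])

# Reference: sequential fine integration over the whole interval
u_fine = 1.0
for n in range(N):
    u_fine = fine(u_fine)
err = abs(U[-1] - u_fine)
```

After *k* iterations the iterate is exact on the first *k* coarse intervals, and for diffusive problems like this one the error at the final time decays rapidly with *k*.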

The multigroup neutron diffusion equations (an approximation of the neutron transport equation) are widely used for studying the motion of neutrons and their interactions with stationary background materials. Solving the multigroup neutron diffusion equations is challenging because the unknowns are tightly coupled through scattering and fission events, and solutions with high spatial resolution of full reactor cores in multiphysics environments are frequently required. In this paper, we focus on the development of a scalable, parallel preconditioner for solving the system of equations arising from the finite-element discretization of the multigroup neutron diffusion equations in space and an implicit finite-difference scheme in time. The parallel preconditioner (here referred to as the “fully coupled Schwarz preconditioner”) is constructed by monolithically applying the overlapping domain decomposition method together with a smoothed aggregation-based coarse space to the coupled system. Our approach is different from the traditional block Gauss–Seidel sweep method that applies the preconditioner from the fast group to the thermal group sequentially, and we demonstrate that it provides significant improvements in terms of both the number of iterations required and the total compute time for a system of equations with millions of unknowns on a large supercomputer.

A stochastic chemical system with multiple types of molecules interacting through reaction channels can be modeled as a continuous-time Markov chain with a countably infinite multidimensional state space. Starting from an initial probability distribution, the time evolution of the probability distribution associated with this continuous-time Markov chain is described by a system of ordinary differential equations, known as the chemical master equation (CME). This paper shows how one can solve the CME using backward differentiation. In doing this, a novel approach to truncate the state space at each time step using a prediction vector is proposed. The infinitesimal generator matrix associated with the truncated state space is represented compactly, and exactly, using a sum of Kronecker products of matrices associated with molecules. This exact representation is already compact and does not require a low-rank approximation in the hierarchical Tucker decomposition (HTD) format. During transient analysis, compact solution vectors in HTD format are employed with the exact, compact, and truncated generator matrices in Kronecker form, and the linear systems are solved with the Jacobi method using fixed or adaptive rank control strategies on the compact vectors. Results of simulation on benchmark models are compared with those of the proposed solver and another version, which works with compact vectors and highly accurate low-rank approximations of the truncated generator matrices in quantized tensor train format and solves the linear systems with the density matrix renormalization group method. Results indicate that numerical solution of the CME is a viable alternative to simulation and that adaptive rank control strategies on compact vectors in HTD format improve time and memory requirements significantly.

In this paper, we consider efficient and robust algorithms for computing the recently developed diffusion state distance (DSD) metric on graphs. In order to compute DSD efficiently, we reformulate the problem in terms of graph Laplacians and use unsmoothed aggregation algebraic multigrid to solve the resulting linear system of equations. To further reduce the computational cost, we approximate DSD by using random projections based on the Johnson–Lindenstrauss lemma. Numerical results for real-world protein–protein interaction networks are presented to demonstrate the efficiency and robustness of the proposed new approaches.
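
The Johnson–Lindenstrauss step can be sketched independently of the DSD computation: a scaled Gaussian projection approximately preserves all pairwise distances. Dimensions below are illustrative:

```python
import numpy as np

# Johnson-Lindenstrauss random projection: project n points from R^d to R^k
# with a scaled Gaussian matrix and compare pairwise distances before/after.

rng = np.random.default_rng(1)
n, d, k = 30, 2000, 400
X = rng.standard_normal((n, d))
G = rng.standard_normal((k, d)) / np.sqrt(k)   # JL projection matrix
Y = X @ G.T

def pdists(Z):
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

D, Dp = pdists(X), pdists(Y)
mask = ~np.eye(n, dtype=bool)
rel_err = np.max(np.abs(Dp[mask] - D[mask]) / D[mask])  # worst-case distortion
```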

The goal of this paper is to create a fruitful bridge between the numerical methods for approximating PDEs in fluid dynamics and the (iterative) numerical methods for dealing with the resulting large linear systems. Among the main objectives are the design of new, efficient iterative solvers and a rigorous analysis of their convergence speed. The link we have in mind is either the structure or the hidden structure that the involved coefficient matrices inherit, both from the continuous PDE and from the approximation scheme; in turn, the resulting structure is used for deducing spectral information, crucial for the conditioning and convergence analysis and for the design of more efficient solvers.

As a specific problem, we consider the incompressible Navier–Stokes equations; as a numerical technique, we consider a novel family of high-order, accurate discontinuous Galerkin methods on *staggered* meshes, and as tools, we use the theory of Toeplitz matrices generated by a function (in the most general block, the multilevel form) and the more recent theory of generalized locally Toeplitz matrix sequences. We arrive at a fairly complete picture of the spectral features of the underlying matrices, and this information is employed for giving a forecast of the convergence history of the conjugate gradient method, together with a discussion on new and more advanced techniques (involving preconditioning, multigrid, multi-iterative solvers). Several numerical tests are provided and critically illustrated in order to show the validity and the potential of our analysis.

When the implicit, conservative difference scheme with the fractional centered difference formula is employed to discretize the space fractional coupled nonlinear Schrödinger equations, in each time step, we need to solve a complex symmetric linear system. The real part of the coefficient matrix is a symmetric Toeplitz-plus-diagonal matrix, whereas the imaginary part is the identity matrix. In this paper, a structure preserving preconditioner and a circulant preconditioner are proposed for such Toeplitz-like matrices. Theoretically, tight bounds for eigenvalues of the preconditioned matrices are derived. Numerical implementations show that Krylov subspace iteration methods such as BiCGSTAB, when accelerated by the proposed preconditioners, are efficient solvers for the discretized linear systems.

Multigrid methods that use block-structured relaxation schemes have been successfully applied to several saddle-point problems, including those that arise from the discretization of the Stokes equations. In this paper, we present a local Fourier analysis of block-structured relaxation schemes for the staggered finite-difference discretization of the Stokes equations to analyze their convergence behavior. Three block-structured relaxation schemes are considered: distributive relaxation, Braess–Sarazin relaxation, and Uzawa relaxation. In each case, we consider variants based on weighted-Jacobi relaxation, as is most suitable for parallel implementation on modern architectures. From this analysis, optimal parameters are proposed, and we compare the efficiency of the presented algorithms with these parameters. Finally, some numerical experiments are presented to validate the two-grid and multigrid convergence factors.
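
For a scalar model problem, local Fourier analysis reduces to maximizing the modulus of the error-propagation symbol over the high frequencies. The sketch below computes the classical smoothing factor 1/3 of weighted Jacobi (ω = 2/3) for the 1D Laplacian, a scalar stand-in for the block relaxation analysis of the paper:

```python
import numpy as np

# Local Fourier analysis of weighted Jacobi on the 1D Laplacian stencil
# [-1, 2, -1]: the error-propagation symbol is E(theta) = 1 - omega*(1 - cos theta),
# and the smoothing factor is max |E(theta)| over high frequencies [pi/2, pi].

omega = 2.0 / 3.0
thetas = np.linspace(np.pi / 2, np.pi, 10001)
E = 1.0 - omega * (1.0 - np.cos(thetas))
smoothing_factor = np.max(np.abs(E))   # classical value 1/3 for omega = 2/3
```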

We focus on efficient preconditioning techniques for sequences of Karush-Kuhn-Tucker (KKT) linear systems arising from the interior point (IP) solution of large convex quadratic programming problems. Constraint preconditioners (CPs), although very effective in accelerating Krylov methods in the solution of KKT systems, have a very high computational cost in some instances, because their factorization may be the most time-consuming task at each IP iteration. We overcome this problem by computing the CP from scratch only at selected IP iterations and by updating the last computed CP at the remaining iterations, via suitable low-rank modifications based on a BFGS-like formula. This work extends the limited-memory preconditioners (LMPs) for symmetric positive definite matrices proposed by Gratton, Sartenaer and Tshimanga in 2011, by exploiting specific features of KKT systems and CPs. We prove that the updated preconditioners still belong to the class of exact CPs, thus allowing the use of the conjugate gradient method. Furthermore, they have the property of increasing the number of unit eigenvalues of the preconditioned matrix as compared with the generally used CPs. Numerical experiments are reported, which show the effectiveness of our updating technique when the cost for the factorization of the CP is high.

This paper studies a low-communication algorithm for solving elliptic PDEs on high-performance machines, the nested iteration with range decomposition (NIRD) algorithm. Previous work has shown that NIRD converges to a high level of accuracy within a small, fixed number of iterations (usually one or two) when applied to simple elliptic problems. This paper makes some improvements to the NIRD algorithm (including the addition of adaptivity during preprocessing, wider choice of partitioning functions, and modified error measurement) that enhance the method's accuracy and scalability, especially on more difficult problems. In addition, an updated convergence proof is presented based on heuristic assumptions that are supported by numerical evidence. Furthermore, a new performance model is developed that shows increased performance benefits for NIRD when problems are more expensive to solve using traditional methods. Finally, extensive testing on a variety of elliptic problems provides additional insight into the behavior of NIRD and additional evidence that NIRD achieves excellent convergence on a wide class of elliptic PDEs and, as such, should be a very competitive method for solving PDEs on large parallel computers.

In this paper, we study the nearest stable matrix pair problem: given a square matrix pair (*E*,*A*), minimize the Frobenius norm of (Δ_{E},Δ_{A}) such that (*E*+Δ_{E},*A*+Δ_{A}) is a stable matrix pair. We propose a reformulation of the problem with a simpler feasible set by introducing dissipative Hamiltonian matrix pairs: a matrix pair (*E*,*A*) is dissipative Hamiltonian if *A*=(*J*−*R*)*Q* with skew-symmetric *J*, positive semidefinite *R*, and an invertible *Q* such that *Q*^{T}*E* is positive semidefinite. This reformulation has a convex feasible domain onto which it is easy to project. This allows us to employ a fast gradient method to obtain a nearby stable approximation of a given matrix pair.
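
The computational primitive behind the "easy to project" claim is the Frobenius-norm projection onto the positive semidefinite cone, obtained by zeroing the negative eigenvalues in an eigendecomposition (projection onto skew-symmetric matrices is simply (*J* − *J*^{T})/2). The matrix below is an illustrative example:

```python
import numpy as np

# Frobenius-norm projection of a symmetric matrix onto the PSD cone:
# symmetrize, eigendecompose, clip negative eigenvalues to zero.

def project_psd(M):
    Msym = (M + M.T) / 2.0
    w, V = np.linalg.eigh(Msym)
    return V @ np.diag(np.maximum(w, 0.0)) @ V.T

R = np.array([[1.0, 2.0],
              [2.0, -3.0]])        # eigenvalues -1 +/- 2*sqrt(2)
Rp = project_psd(R)
eigs = np.linalg.eigvalsh(Rp)
# distance to the cone equals the magnitude of the clipped eigenvalue
dist = np.linalg.norm(R - Rp)
```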

We propose nonsymmetric lean algebraic multigrid (NS-LAMG), a new algebraic multigrid algorithm for directed graph Laplacian systems that combines ideas from undirected graph Laplacian multigrid solvers and from multigrid algorithms for Markov chain stationary distribution systems. Low-degree elimination, proposed in LAMG for undirected graphs, is generalized to directed graphs and is a key component of NS-LAMG. In the setup phase, a simple aggregation-based multigrid solver for Markov chain stationary distribution systems, enhanced by low-degree elimination, is used to find the right null-space vector employed in the intergrid transfer operators. Numerical results show that low-degree elimination improves performance and that NS-LAMG outperforms the restarted generalized minimal residual method and the stabilized biconjugate gradient method for real-world, directed graph Laplacian linear systems.

This work describes a domain embedding technique between two nonmatching meshes used for generating realizations of spatially correlated random fields with applications to large-scale sampling-based uncertainty quantification. The goal is to apply the multilevel Monte Carlo (MLMC) method for the quantification of output uncertainties of PDEs with random input coefficients on general and unstructured computational domains. We propose a highly scalable, hierarchical sampling method to generate realizations of a Gaussian random field on a given unstructured mesh by solving a reaction–diffusion PDE with a stochastic right-hand side. The stochastic PDE is discretized using the mixed finite element method on an embedded domain with a structured mesh, and then, the solution is projected onto the unstructured mesh. This work describes implementation details on how to efficiently transfer data between the structured and unstructured meshes at the coarse levels, assuming that this can be done efficiently on the finest level. We investigate the efficiency and parallel scalability of the technique for the scalable generation of Gaussian random fields in three dimensions. An application of the MLMC method is presented for quantifying uncertainties of subsurface flow problems. We demonstrate the scalability of the sampling method with nonmatching mesh embedding, coupled with a parallel forward model problem solver, for large-scale 3D MLMC simulations with up to 1.9·10^{9} unknowns.

It is known that for a tridiagonal Toeplitz matrix, having on the main diagonal the constant *a*_{0} and on the two first off-diagonals the constants *a*_{1}(lower) and *a*_{−1}(upper), which are all complex values, there exist closed form formulas, giving the eigenvalues of the matrix and a set of associated eigenvectors. For example, for the 1D discrete Laplacian, this triple is (*a*_{0},*a*_{1},*a*_{−1})=(2,−1,−1). In the first part of this article, we consider a tridiagonal Toeplitz matrix of the same form (*a*_{0},*a*_{ω},*a*_{−ω}), but where the two off-diagonals are positioned *ω* steps from the main diagonal instead of only one. We show that its eigenvalues and eigenvectors can also be identified in closed form and that interesting connections with the standard Toeplitz symbol emerge. Furthermore, as numerical evidence clearly suggests, it turns out that the eigenvalue behavior of a general banded symmetric Toeplitz matrix with real entries can be described qualitatively in terms of the symmetrically sparse tridiagonal case with real *a*_{0}, *a*_{ω}=*a*_{−ω}, *ω*=2,3,…, and also quantitatively in terms of those having monotone symbols. A discussion on the use of such results and on possible extensions complements the paper.
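
The classical closed form that the article generalizes can be checked numerically: the eigenvalues of the *n* × *n* tridiagonal Toeplitz matrix are λ_{k} = *a*_{0} + 2√(*a*_{1}*a*_{−1}) cos(*k*π/(*n* + 1)), *k* = 1, …, *n*. A sketch for the 1D discrete Laplacian triple:

```python
import numpy as np

# Closed-form eigenvalues of a tridiagonal Toeplitz matrix, checked for the
# 1D discrete Laplacian triple (a0, a1, a_{-1}) = (2, -1, -1).

n = 12
a0, a1, am1 = 2.0, -1.0, -1.0
T = a0 * np.eye(n) + a1 * np.eye(n, k=-1) + am1 * np.eye(n, k=1)

k = np.arange(1, n + 1)
closed_form = np.sort(a0 + 2.0 * np.sqrt(a1 * am1) * np.cos(k * np.pi / (n + 1)))
computed = np.sort(np.linalg.eigvalsh(T))
max_diff = np.max(np.abs(closed_form - computed))
```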

In this paper, a few dual least-squares finite element methods and their application to scalar linear hyperbolic problems are studied. The purpose is to obtain *L*^{2}-norm approximations on finite element spaces of the exact solutions to hyperbolic partial differential equations of interest. This is approached by approximating the generally infeasible quadratic minimization that defines the *L*^{2}-orthogonal projection of the exact solution by feasible least-squares principles, using the ideas of the original method proposed in the context of elliptic equations. All methods in this paper are founded upon and extend this approach, which is rather general and applicable beyond the setting of elliptic problems. Error bounds are shown that point to the factors affecting the convergence and provide conditions that guarantee optimal rates. Furthermore, the preconditioning of the resulting linear systems is discussed. Numerical results are provided to illustrate the behavior of the methods on common finite element spaces.

The restarted block generalized minimum residual method (BGMRES) with deflated restarting (BGMRES-DR) was proposed by Morgan to damp the negative effect of small eigenvalues on the convergence of the BGMRES method. More recently, Wu et al. introduced the shifted BGMRES method (BGMRES-Sh) for solving the sequence of linear systems with multiple shifts and multiple right-hand sides. In this paper, a new shifted block Krylov subspace algorithm that combines the characteristics of both the BGMRES-DR and the BGMRES-Sh methods is proposed. Moreover, our method is enhanced with a seed selection strategy to handle the case of almost linearly dependent right-hand sides. Numerical experiments illustrate the potential of the proposed method to solve efficiently the sequence of linear systems with multiple shifts and multiple right-hand sides, with and without preconditioner, also against other state-of-the-art solvers.

The aim of this work is to compare algebraic multigrid (AMG) preconditioned GMRES methods for solving the nonsymmetric and positive definite linear systems of algebraic equations that arise from a space–time finite-element discretization of the heat equation in 3D and 4D space–time domains. The finite-element discretization is based on a Galerkin–Petrov variational formulation employing piecewise linear finite elements simultaneously in space and time. We focus on a performance comparison of conventional and modern AMG methods for such finite-element equations, as well as robustness with respect to the mesh discretization and the heat capacity constant. We discuss different coarsening and interpolation strategies in the AMG methods for coarse-grid selection and coarse-grid matrix construction. Further, we compare AMG performance for the space–time finite-element discretization on both uniform and adaptive meshes consisting of tetrahedra and pentachora in 3D and 4D, respectively. The mesh adaptivity occurring in space and time is guided by a residual-based a posteriori error estimation.

In this paper, we focus on the solution of shifted quasiseparable systems and of more general parameter-dependent matrix equations with quasiseparable representations. We propose an efficient algorithm exploiting the invariance of the quasiseparable structure under diagonal shifting and inversion. This algorithm is applied to compute various functions of matrices. Numerical experiments show that this approach is fast and numerically robust.

In this paper, we present a geometric multigrid methodology for the solution of matrix systems associated with isogeometric compatible discretizations of the generalized Stokes and Oseen problems. The methodology provably yields a pointwise divergence-free velocity field independent of the number of pre-smoothing steps, post-smoothing steps, grid levels, or cycles in a V-cycle implementation, provided that the initial velocity guess is also divergence free. The methodology relies upon Schwarz-style smoothers in conjunction with specially defined overlapping subdomains that respect the underlying topological structure of the generalized Stokes and Oseen problems. Numerical results in both two and three dimensions demonstrate the robustness of the methodology through the invariance of convergence rates with respect to grid resolution and flow parameters for the generalized Stokes problem, as well as for the generalized Oseen problem provided that it is not advection dominated.

Elliptic optimal control problems with pointwise box constraints on the control are considered. To solve them numerically, an inexact alternating direction method of multipliers (iADMM) is first proposed on the continuous level, with the aim of solving the discretized problems to moderate accuracy. Then, the standard piecewise linear finite element is employed to discretize the subproblems appearing in each iteration of the iADMM algorithm. This approach gives us the freedom to discretize the two inner subproblems of the iADMM algorithm by different schemes. More importantly, it should be emphasized that the discretized version of the iADMM algorithm can be regarded as a modification of the inexact semiproximal ADMM (isPADMM) algorithm. In order to obtain a more accurate solution, the primal-dual active set method is used as a postprocessor of the isPADMM. Numerical results not only show that the isPADMM and the two-phase strategy are highly efficient but also show the mesh independence of the isPADMM.

It is known that in many functions of banded and, more generally, sparse Hermitian positive definite matrices, the entries exhibit a rapid decay away from the sparsity pattern. This is particularly true for the inverse, and based on results for the inverse, bounds for Cauchy–Stieltjes functions of Hermitian positive definite matrices have recently been obtained. We add to the known results by considering certain types of normal matrices, for which fewer and typically less satisfactory results exist so far. Starting from a very general estimate based on approximation properties of Chebyshev polynomials on ellipses, we obtain as special cases insightful decay bounds for various classes of normal matrices, including (shifted) skew-Hermitian and Hermitian indefinite matrices. In addition, some of our results improve over known bounds when applied to the Hermitian positive definite case.
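
The decay phenomenon is easy to observe numerically: for a banded Hermitian positive definite matrix, the entries of the inverse decay exponentially away from the diagonal. The tridiagonal example below is illustrative:

```python
import numpy as np

# Exponential off-diagonal decay in the inverse of a banded SPD matrix.
# For tridiag(1, 3, 1) the entries of the inverse decay like r^{|i-j|} with
# |r| = (3 - sqrt(5))/2 ~ 0.382 (root of r^2 + 3r + 1 = 0).

n = 40
A = 3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # SPD, bandwidth 1
Ainv = np.linalg.inv(A)

i = n // 2
decay = [abs(Ainv[i, i + d]) for d in range(15)]   # magnitudes along row i
ratios = [decay[d + 1] / decay[d] for d in range(14)]
```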

Matrix equations of the kind *A*_{1}*X*^{2}+*A*_{0}*X*+*A*_{−1}=*X*, where both the matrix coefficients and the unknown are semi-infinite matrices belonging to a Banach algebra, are considered. These equations, where coefficients are quasi-Toeplitz matrices, are encountered in certain quasi-birth–death processes as the tandem Jackson queue or in any other processes that can be modeled as a reflecting random walk in the quarter plane. We provide a numerical framework for approximating the minimal nonnegative solution of these equations that relies on semi-infinite quasi-Toeplitz matrix arithmetic. In particular, we show that the algorithm of cyclic reduction can be effectively applied and can approximate the infinite-dimensional solutions with quadratic convergence at a cost that is comparable to that of the finite case. This way, we may compute a finite approximation of the sought solution and of the invariant probability measure of the associated quasi-birth–death process, within a given accuracy. Numerical experiments, performed on a collection of benchmarks, confirm the theoretical analysis.
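
A finite-dimensional toy version of the equation illustrates what "minimal nonnegative solution" means: with nonnegative coefficients of substochastic row sums, the natural fixed-point iteration *X* ← *A*_{1}*X*^{2} + *A*_{0}*X* + *A*_{−1} converges monotonically to it. (The paper instead applies cyclic reduction to semi-infinite quasi-Toeplitz matrices; the matrices below are illustrative.)

```python
import numpy as np

# Fixed-point iteration for A1 X^2 + A0 X + A_{-1} = X with nonnegative,
# substochastic coefficients; starting from X = 0 it converges monotonically
# to the minimal nonnegative solution.

m = 3
A1 = 0.4 * np.eye(m)
A0 = 0.3 * np.eye(m)
Am1 = 0.2 * np.eye(m)

X = np.zeros((m, m))
for _ in range(500):
    X = A1 @ X @ X + A0 @ X + Am1

residual = np.linalg.norm(A1 @ X @ X + A0 @ X + Am1 - X)
# scalar minimal root of 0.4 x^2 - 0.7 x + 0.2 = 0, matched by the diagonal of X
minimal_root = (0.7 - np.sqrt(0.17)) / 0.8
```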

In this paper, we consider an efficient approach to solve a linear ill-posed inverse problem with total variation regularization, where the coefficient matrix has a Kronecker product structure. A numerical scheme is developed by using a variable splitting technique and directly exploiting the Kronecker structure of the problem. Experimental results on image restoration applications illustrate the effectiveness of our proposed method.
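
The key to exploiting such structure is that a Kronecker-structured operator never needs to be formed explicitly, thanks to the identity (*A* ⊗ *B*) vec(*X*) = vec(*B**X**A*^{T}) with column-major vec. A minimal check, with illustrative random matrices:

```python
import numpy as np

# The vec trick: (A kron B) vec(X) = vec(B X A^T) with column-major vec,
# so matrix-vector products with A kron B cost O(n^3) instead of O(n^4).

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((5, 5))
X = rng.standard_normal((5, 4))

vec = lambda M: M.reshape(-1, order="F")     # column-major vectorization
lhs = np.kron(A, B) @ vec(X)                 # explicit Kronecker product
rhs = vec(B @ X @ A.T)                       # structure-exploiting equivalent
diff = np.linalg.norm(lhs - rhs)
```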

For large sparse non-Hermitian positive definite linear systems, we establish exact and inexact quasi-HSS iteration methods and discuss their convergence properties. Numerical experiments show that both iteration methods are effective and robust when they are used either as linear solvers or as matrix splitting preconditioners for the Krylov subspace iteration methods. In addition, these two iteration methods are, respectively, much more powerful than the exact and inexact HSS iteration methods, especially when the linear systems have nearly singular Hermitian parts or strongly dominant skew-Hermitian parts, and they can be employed to solve non-Hermitian indefinite linear systems with only mild indefiniteness.

We suggest and compare different methods for the numerical solution of Lyapunov-like equations with application to control of Markovian jump linear systems. First, we consider fixed-point iterations and associated Krylov subspace formulations. Second, we reformulate the equation as an optimization problem and consider steepest descent, conjugate gradient, and trust-region methods. Numerical experiments illustrate that, for large-scale problems, the trust-region method is more efficient than a direct solution or a standard solver for linear matrix inequalities. The fixed-point approach, however, is superior to the optimization methods. As an application, we consider a networked control system, where the Markov jumps are induced by the wireless communication protocol.

We propose a new inertia-revealing factorization for sparse symmetric matrices. The factorization scheme and the method for extracting the inertia from it were proposed in the 1960s for dense, banded, or tridiagonal matrices, but they have been abandoned in favor of faster methods. We show that this scheme can be applied to any sparse symmetric matrix and that the fill in the factorization is bounded by the fill in the sparse *QR* factorization of the same matrix (but is usually much smaller). We describe our serial proof-of-concept implementation and present experimental results, studying the method's numerical stability and performance.

The finite-difference method applied to the time-fractional subdiffusion equation usually leads to a large-scale linear system with a block lower triangular Toeplitz coefficient matrix. The approximate inversion method is employed to solve this system. A sufficient condition, which is easy to verify in practice and has a wide range of applications, is proved to guarantee the high accuracy of the approximate inversion method for solving the block lower triangular Toeplitz systems. The applications of this sufficient condition to several existing finite-difference schemes are investigated. Numerical experiments are presented to verify the validity of the theoretical results.

Several applied problems may produce large sparse matrices with a small number of dense rows and/or columns, which can adversely affect the performance of commonly used direct solvers. By posing the problem as a saddle point system, an unconventional application of a null space method can be employed to eliminate dense rows and columns. The choice of null space basis is critical in retaining the overall sparse structure of the matrix. A new one-sided application of the null space method is also presented to eliminate either dense rows or columns. These methods can be considered techniques that modify the nonzero structure of the matrix before employing a direct solver and may result in improved direct solver performance.

In this paper, we propose a fast algorithm for computing the spectral radii of symmetric nonnegative tensors. In particular, the proposed algorithm is able to obtain the spectral radii of weakly reducible symmetric nonnegative tensors without requiring a partition of the tensors; determining such a partition is very costly for large weakly reducible tensors. Numerical results are reported to show that the proposed algorithm is efficient and able to compute the spectral radii of large tensors. As an application, we present an algorithm for testing the positive definiteness of Z-tensors, which is guaranteed to determine the positive definiteness of any Z-tensor.

Many scientific and engineering disciplines use multivariate polynomials. Decomposing a multivariate polynomial vector function into a sandwiched structure of univariate polynomials surrounded by linear transformations can provide useful insight into the function while reducing the number of parameters. Such a decoupled representation can be realized with techniques based on tensor decomposition methods, but these techniques have only been studied in the exact case. Generalizing the existing techniques to the noisy case is an important next step for the decoupling problem. For this extension, we have considered a weight factor during the tensor decomposition process, leading to an alternating weighted least squares scheme. In addition, we applied the proposed weighted decoupling algorithm in the area of system identification, and we observed smaller model errors with the weighted decoupling algorithm than those with the unweighted decoupling algorithm.

The goal of this paper is to find a low-rank approximation for a given *n*th-order tensor. Specifically, we give a computable strategy for calculating the rank of a given tensor, based on approximating the solution of an NP-hard problem. We formulate a sparse optimization problem via *l*_{1}-regularization to find a low-rank approximation of tensors. To solve this sparse optimization problem, we propose a rescaled proximal alternating minimization algorithm and study its theoretical convergence. Furthermore, we discuss the probabilistic consistency of the sparsity result and suggest a way to choose the regularization parameter for practical computation. Simulation experiments show that our algorithm provides an efficient estimate of the number of rank-one components in a given tensor. Moreover, the algorithm is also applied to the low-rank approximation of surveillance videos.
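The workhorse of the *l*_{1}-regularized step is the soft-thresholding proximal operator applied to the vector of rank-one component weights; a minimal sketch (the rescaling and the alternating structure of the full algorithm are omitted):

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal operator of lam * ||.||_1: shrinks every entry toward zero
    and sets entries with |z_i| <= lam exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Weights of candidate rank-one components; small ones are pruned, and the
# number of surviving nonzeros serves as the rank estimate.
weights = np.array([3.0, -0.5, 1.2, 0.05, -2.4])
shrunk = soft_threshold(weights, 1.0)    # entries: 2.0, 0.0, 0.2, 0.0, -1.4
rank_estimate = int(np.count_nonzero(shrunk))
print(rank_estimate)                     # -> 3
```

Because the operator zeroes components exactly rather than merely shrinking them, the sparsity pattern itself carries the rank information, which is what the paper's consistency analysis concerns.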

Algorithms and implementations for computing the sign function of a triangular matrix are fundamental building blocks for computing the sign of arbitrary square real or complex matrices. We present novel recursive and cache-efficient algorithms that are based on Higham's stabilized specialization of Parlett's substitution algorithm for computing the sign of a triangular matrix. We show that the new recursive algorithms are asymptotically optimal in terms of the number of cache misses that they generate. One algorithm that we present performs more arithmetic than the nonrecursive version, but this allows it to benefit from calling highly optimized matrix multiplication routines; the other performs the same number of operations as the nonrecursive version, using custom computational kernels instead. We present implementations of both, as well as a cache-efficient implementation of a block version of Parlett's algorithm. Our experiments demonstrate that the blocked and recursive versions are much faster than the previous algorithms and that the inertia strongly influences their relative performance, as predicted by our analysis.
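For reference, the unblocked base recurrence that these algorithms build on can be sketched as follows, a real-arithmetic simplification of Higham's stabilized specialization of Parlett's substitution (the recursive, cache-efficient structure of the paper is not shown):

```python
import numpy as np

def sign_triangular(T):
    """Sign of a real upper triangular matrix T with no zero diagonal
    entries.  Off-diagonal entries come from the relation (S @ S)[i, j] = 0
    when s_ii + s_jj != 0 (the stabilized branch), and otherwise from
    Parlett's substitution based on S @ T = T @ S."""
    n = T.shape[0]
    S = np.diag(np.sign(np.diag(T))).astype(float)
    for d in range(1, n):                    # superdiagonal by superdiagonal
        for i in range(n - d):
            j = i + d
            inner = S[i, i+1:j] @ S[i+1:j, j]
            if S[i, i] + S[j, j] != 0:
                S[i, j] = -inner / (S[i, i] + S[j, j])
            else:
                # here sign(T[i, i]) != sign(T[j, j]), so T[i, i] != T[j, j]
                par = S[i, i+1:j] @ T[i+1:j, j] - T[i, i+1:j] @ S[i+1:j, j]
                S[i, j] = (T[i, j] * (S[i, i] - S[j, j]) + par) / (T[i, i] - T[j, j])
    return S

T = np.array([[1.0, 2.0, 3.0],
              [0.0, -2.0, 1.0],
              [0.0, 0.0, 3.0]])
S = sign_triangular(T)
print(np.allclose(S @ S, np.eye(3)), np.allclose(S @ T, T @ S))   # -> True True
```

The checks `S @ S = I` and `S @ T = T @ S` are the defining properties of the matrix sign on a triangular argument; the mix of positive and negative diagonal entries (the inertia) decides which branch each entry takes, which is one way to see why inertia affects performance.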

An algebraic multigrid (AMG) solution method for saddle-point problems is described. The indefinite nature of the saddle-point matrix makes it unsuitable for the simple smoothing methods normally used in AMG. Moreover, even if presented in a stabilised form, as is done here, poorly conditioned matrices will be generated when constructing the coarse-grid approximation. This is because, with each successive coarsening step, the off-diagonal matrix blocks (of interfield coupling) reduce in size more slowly than the diagonal blocks (of intrafield coupling). Stabilised smoothing operators are therefore considered. The first is based on an incomplete decomposition of the complete system matrix into the product of lower-triangular (**L**) and upper-triangular (**U**) matrices, an **ILU** factorisation. The second is based on an exact block decomposition of an incomplete (simplified) system matrix into lower and upper block-triangular matrices, a **BILU** factorisation. However, the degree of stabilisation thus established in the smoothing operators does not guarantee an efficient smoothing at all grid levels. There can still be inefficiency on the least-stable coarser grids. The breakdown in efficiency begins at a grid level where the ratio of the inter- to intrafield coupling strengths exceeds a critical ratio. Provision is thus made for a further conditioning of coarse-grid operators, a coarse-level conditioning (CLC). This is another block-LU factorisation that is only applied at and beyond the critical grid level. It is not applied directly to the operator for any chosen level but to its fine-grid progenitor. Thus, while ILU and BILU use postcoarsening-step factorisations, CLC uses precoarsening-step factorisations. With CLC so deployed, ILU and BILU become efficient at all grid levels, resulting in an AMG convergence that is independent of the total number of grids.
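The exact block decomposition that underlies the BILU and CLC factorisations can be illustrated on a small stabilised saddle-point matrix (a sketch of the algebra only; the method itself applies incomplete or simplified versions of these factors):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 3
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)        # SPD intrafield block
B = rng.normal(size=(m, n))        # interfield coupling block
C = np.eye(m)                      # stabilisation block

# Stabilised saddle-point matrix and its exact block LU factorisation:
#   K = [[A, B^T], [B, -C]] = L @ U,
# with the Schur complement S = -C - B A^{-1} B^T in the trailing block.
K = np.block([[A, B.T], [B, -C]])
Ainv_Bt = np.linalg.solve(A, B.T)  # A^{-1} B^T (A is symmetric)
Schur = -C - B @ Ainv_Bt
L = np.block([[np.eye(n), np.zeros((n, m))],
              [Ainv_Bt.T, np.eye(m)]])
U = np.block([[A, B.T],
              [np.zeros((m, n)), Schur]])
print(np.allclose(L @ U, K))       # -> True
```

As the coarsening shrinks the diagonal blocks faster than the off-diagonal coupling blocks, the Schur complement term `B @ Ainv_Bt` comes to dominate `C`, which is one way to see why conditioning deteriorates beyond a critical grid level.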