DOI: 10.1145/1837210
PASCO '10: Proceedings of the 4th International Workshop on Parallel and Symbolic Computation
ACM2010 Proceeding
Publisher:
  • Association for Computing Machinery, New York, NY, United States
Conference:
PASCO '10: 4th International Workshop on Parallel and Symbolic Computation Grenoble France July 21 - 23, 2010
ISBN:
978-1-4503-0067-4
Published:
21 July 2010
Sponsors:
Grenoble University, Grenoble INP / ENSIMAG, INRIA
Abstract

The International Workshop on Parallel and Symbolic Computation (PASCO) is a series of workshops dedicated to promoting and advancing parallel algorithms and software in all areas of symbolic mathematical computation. The ubiquity of parallel architectures and deep memory hierarchies has prompted a new quest for parallel mathematical algorithms and software capable of exploiting every level of parallelism: from hardware acceleration technologies (multicore and multiprocessor systems-on-chip, GPGPUs, FPGAs) to cluster and global computing platforms. To push the limits of symbolic and algebraic computation beyond optimization of the application itself, the effective use of a large number of resources (memory and specialized computing units) is expected to improve performance across several criteria: time, energy consumption, resource usage, and reliability. In this context, the design and implementation of mathematical algorithms with provable and adaptive performance is a major challenge.

SESSION: Invited talks
research-article
Techniques and tools for implementing IEEE 754 floating-point arithmetic on VLIW integer processors

Recently, some high-performance IEEE 754 single precision floating-point software has been designed, which aims at best exploiting some features (integer arithmetic, parallelism) of the STMicroelectronics ST200 Very Long Instruction Word (VLIW) ...

research-article
Fifteen years after DSC and WLSS2 what parallel computations I do today: invited lecture at PASCO 2010

A second wave of parallel and distributed computing research is rolling in. Today's multicore/multiprocessor computers facilitate everyone's parallel execution. In the mid 1990s, manufacturers of expensive main-frame parallel computers faltered and ...

research-article
Exploiting multicore systems with Cilk

The increasing prevalence of multicore processors has led to a renewed interest in parallel programming. Cilk is a language extension to C and C++ designed to simplify programming shared-memory multiprocessor systems. The workstealing scheduler in Cilk ...

research-article
Automated performance tuning

This tutorial presents automated techniques for implementing and optimizing numeric and symbolic libraries on modern computing platforms including SSE, multicore, and GPU. Obtaining high performance requires effective use of the memory hierarchy, short ...

research-article
Roomy: a system for space limited computations

There are numerous examples of problems in symbolic algebra in which the required storage grows far beyond the limitations even of the distributed RAM of a cluster. Often this limitation determines how large a problem one can solve in practice. Roomy ...

SESSION: Contributed papers
research-article
Generic design of Chinese remaindering schemes

We propose a generic design for Chinese remainder algorithms. A Chinese remainder computation consists in reconstructing an integer value from its residues modulo coprime integers. We also propose an efficient linear data structure, a radix ladder, for ...
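The reconstruction at the heart of any such scheme can be sketched in a few lines. This is a textbook Chinese remainder computation in Python, not the paper's generic design or its radix-ladder data structure:

```python
from functools import reduce

def crt(residues, moduli):
    """Reconstruct x from x mod m_i for pairwise coprime moduli m_i."""
    M = reduce(lambda a, b: a * b, moduli)  # product of all moduli
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # modular inverse of Mi modulo m (moduli are coprime, so it exists)
        x += r * Mi * pow(Mi, -1, m)
    return x % M

# 23 mod 5 = 3 and 23 mod 7 = 2, so:
print(crt([3, 2], [5, 7]))  # → 23
```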

research-article
A complete modular resultant algorithm targeted for realization on graphics hardware

This paper presents a complete modular approach to computing bivariate polynomial resultants on Graphics Processing Units (GPU). Given two polynomials, the algorithm first maps them to a prime field for sufficiently many primes, and then processes each ...
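For reference, here is a sequential sketch of one per-prime step: the resultant of two univariate images via the Sylvester matrix over GF(p). The paper's bivariate setting, recovery from modular images, and GPU mapping are not shown:

```python
def resultant_mod_p(f, g, p):
    """Resultant of univariate polynomials f, g (coefficient lists,
    highest degree first) over GF(p), via the Sylvester matrix."""
    m, n = len(f) - 1, len(g) - 1
    N = m + n
    # Build the (m+n) x (m+n) Sylvester matrix.
    S = [[0] * N for _ in range(N)]
    for i in range(n):          # n shifted copies of f
        for j, c in enumerate(f):
            S[i][i + j] = c % p
    for i in range(m):          # m shifted copies of g
        for j, c in enumerate(g):
            S[n + i][i + j] = c % p
    # Determinant modulo p by Gaussian elimination.
    det = 1
    for col in range(N):
        piv = next((r for r in range(col, N) if S[r][col] % p), None)
        if piv is None:
            return 0
        if piv != col:
            S[col], S[piv] = S[piv], S[col]
            det = -det
        det = det * S[col][col] % p
        inv = pow(S[col][col], p - 2, p)
        for r in range(col + 1, N):
            factor = S[r][col] * inv % p
            for c in range(col, N):
                S[r][c] = (S[r][c] - factor * S[col][c]) % p
    return det % p

# res(x^2 - 1, x - 2) = f(2) = 3, since g is monic and linear
print(resultant_mod_p([1, 0, -1], [1, -2], 7))  # → 3
```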

research-article
Parallel operations of sparse polynomials on multicores: I. multiplication and Poisson bracket

The multiplication of sparse multivariate polynomials using recursive representations is revisited to take advantage of multicore processors. We take care of memory management and load balancing in order to obtain linear speedup. The ...

research-article
Parallel computation of the minimal elements of a poset

Computing the minimal elements of a partially ordered finite set (poset) is a fundamental problem in combinatorics with numerous applications such as polynomial expression optimization, transversal hypergraph generation and redundant component removal, ...
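For concreteness, a naive sequential computation of minimal elements, quadratic in the number of comparisons; this is a baseline, not the parallel algorithm studied in the paper:

```python
def minimal_elements(elems, leq):
    """Elements with no strictly smaller element under the partial order leq."""
    return [x for x in elems
            if not any(leq(y, x) and not leq(x, y) for y in elems)]

# Divisibility poset on {2, 3, 4, 6, 12}: the minimal elements are 2 and 3.
divides = lambda a, b: b % a == 0
print(minimal_elements([2, 3, 4, 6, 12], divides))  # → [2, 3]
```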

research-article
Parallel disk-based computation for large, monolithic binary decision diagrams

Binary Decision Diagrams (BDDs) are widely used in formal verification. They are also widely known for consuming large amounts of memory. For larger problems, a BDD computation will often start thrashing due to lack of memory within minutes. This work ...

research-article
Parallel arithmetic encryption for high-bandwidth communications on multicore/GPGPU platforms

In this work we study the feasibility of high-bandwidth, secure communications on generic machines equipped with the latest CPUs and General-Purpose Graphical Processing Units (GPGPU). We first analyze the suitability of current Nehalem CPU ...

research-article
Exact sparse matrix-vector multiplication on GPU's and multicore architectures

We propose different implementations of sparse matrix-dense vector multiplication (SpMV) over finite fields and rings Z/mZ. We take advantage of graphics processing units (GPUs) and multicore architectures. Our aim is to improve the speed of SpMV in ...
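A minimal sequential baseline for the kernel in question: SpMV in compressed sparse row (CSR) format with arithmetic in Z/mZ. The paper's GPU and multicore variants start from this operation:

```python
def spmv_mod(values, col_idx, row_ptr, x, m):
    """y = A·x over Z/mZ, with A in compressed sparse row (CSR) form."""
    y = []
    for i in range(len(row_ptr) - 1):
        s = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s % m)
    return y

# A = [[1, 0, 2],
#      [0, 3, 0]]  in CSR form; x = (1, 1, 1); arithmetic mod 5
print(spmv_mod([1, 2, 3], [0, 2, 1], [0, 2, 3], [1, 1, 1], 5))  # → [3, 3]
```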

research-article
Parallel Gaussian elimination for Gröbner bases computations in finite fields

Polynomial system solving is one of the important areas of Computer Algebra, with many applications in Robotics, Cryptology, Computational Geometry, etc. To this end, computing a Gröbner basis is often a crucial step. The most efficient algorithms [6, 7] ...

research-article
A quantitative study of reductions in algebraic libraries

How much of existing computer algebra libraries is amenable to automatic parallelization? This is a difficult topic, yet of practical importance in the era of commodity multicore machines. This paper reports on a quantitative study of reductions in the ...

research-article
Parallel sparse polynomial division using heaps

We present a parallel algorithm for exact division of sparse distributed polynomials on a multicore processor. This is a problem with significant data dependencies, so our solution requires fine-grained parallelism. Our algorithm manages to avoid ...
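For orientation, a naive sequential exact division of sparse univariate polynomials in dictionary form. The paper's algorithm instead merges the products with heaps, extends to the multivariate distributed representation, and runs in parallel; none of that is shown here:

```python
def divide_exact(f, g):
    """Exact quotient f/g of sparse univariate polynomials stored as
    {exponent: coefficient} dicts; raises if the division is not exact."""
    q = {}
    r = dict(f)
    eg = max(g)                     # leading exponent of g
    cg = g[eg]
    while r:
        er = max(r)                 # leading exponent of the remainder
        if er < eg or r[er] % cg:
            raise ValueError("not an exact division")
        e, c = er - eg, r[er] // cg
        q[e] = c
        for eg2, cg2 in g.items():  # r -= (c x^e) * g
            r[e + eg2] = r.get(e + eg2, 0) - c * cg2
            if r[e + eg2] == 0:
                del r[e + eg2]
    return q

# (x^3 - 1) / (x - 1) = x^2 + x + 1
print(divide_exact({3: 1, 0: -1}, {1: 1, 0: -1}))  # → {2: 1, 1: 1, 0: 1}
```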

research-article
A high-performance algorithm for calculating cyclotomic polynomials

The nth cyclotomic polynomial, Φn(z), is the monic polynomial whose ϕ(n) distinct roots are the primitive nth roots of unity. Φn(z) can be computed efficiently as a quotient of terms of the form (1 - z^d) by way of a method the authors call the Sparse ...
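The quotient structure can be illustrated with a small recursive sketch based on the identity z^n - 1 = Π_{d | n} Φ_d(z); this is a textbook method, not the authors' high-performance algorithm:

```python
def polydiv_exact(num, den):
    """Exact quotient of integer polynomials (coeff lists, low degree first)."""
    num = num[:]
    q = [0] * (len(num) - len(den) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = num[i + len(den) - 1] // den[-1]
        for j, d in enumerate(den):
            num[i + j] -= q[i] * d
    return q

def cyclotomic(n):
    """Phi_n as (z^n - 1) divided by Phi_d for every proper divisor d of n."""
    num = [-1] + [0] * (n - 1) + [1]        # z^n - 1
    for d in range(1, n):
        if n % d == 0:
            num = polydiv_exact(num, cyclotomic(d))
    return num

print(cyclotomic(6))  # → [1, -1, 1], i.e. z^2 - z + 1
```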

research-article
Accuracy versus time: a case study with summation algorithms

In this article, we focus on numerical algorithms for which, in practice, parallelism and accuracy do not cohabit well. In order to increase parallelism, expressions are reparsed, implicitly using mathematical laws like associativity, and this reduces ...
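A classic example of the accuracy side of this trade-off is Kahan's compensated summation, which serializes the additions in exchange for a much smaller error bound:

```python
def kahan_sum(xs):
    """Compensated summation: tracks the rounding error of each addition."""
    s = c = 0.0
    for x in xs:
        y = x - c           # reinject previously lost low-order bits
        t = s + y
        c = (t - s) - y     # low-order bits lost in this addition
        s = t
    return s

xs = [1.0, 1e-16, 1e-16, 1e-16, 1e-16]
print(sum(xs))        # naive left-to-right sum drops every small term → 1.0
print(kahan_sum(xs))  # compensated sum recovers their contribution
```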

research-article
Polynomial homotopies on multicore workstations

Homotopy continuation methods to solve polynomial systems scale very well on parallel machines. We examine their parallel implementation on multiprocessor multicore workstations using threads. With more cores we speed up pleasingly parallel path tracking ...

research-article
Parallel computations in modular group algebras

We report on the parallelisation of the algorithm to compute the normalised unit group V(FpG) of a modular group algebra FpG of a finite p-group G over the field of p elements Fp in the computational algebra system GAP. We present its distributed ...

research-article
Cache-oblivious polygon indecomposability testing

We examine a cache-oblivious reformulation of the (iterative) polygon indecomposability test of [19]. We analyse the cache complexity of the iterative version of this test within the ideal-cache model and identify the bottlenecks affecting its memory ...

research-article
Parallel sparse polynomial interpolation over finite fields

We present a probabilistic algorithm to interpolate a sparse multivariate polynomial over a finite field, represented with a black box. Our algorithm modifies the algorithm of Ben-Or and Tiwari from 1988 for interpolating polynomials over rings with ...

SESSION: Contributed extended abstracts
research-article
Spiral-generated modular FFT algorithms

This paper presents an extension of the Spiral system to automatically generate and optimize FFT algorithms for the discrete Fourier transform over finite fields. The generated code is intended to support modular algorithms for multivariate polynomial ...
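The transform being generated is a number-theoretic analogue of the FFT. A minimal hand-written recursive radix-2 version over a prime field (unlike Spiral's generated and optimized code) looks like:

```python
def ntt(a, p, g):
    """Radix-2 number-theoretic transform of a over GF(p); len(a) must be a
    power of two, and g a primitive root of unity of that order mod p."""
    n = len(a)
    if n == 1:
        return a[:]
    even = ntt(a[0::2], p, g * g % p)
    odd = ntt(a[1::2], p, g * g % p)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % p
        out[k] = (even[k] + t) % p
        out[k + n // 2] = (even[k] - t) % p
        w = w * g % p
    return out

# p = 17, n = 4, omega = 4 (4^2 = 16 ≠ 1 and 4^4 ≡ 1 mod 17)
print(ntt([1, 2, 3, 4], 17, 4))  # → [10, 7, 15, 6]
```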

research-article
High performance linear algebra using interval arithmetic

In this paper, we describe implementations of interval matrix multiplication and verified solution to a linear system, using entirely BLAS routines, which are fully optimized and parallelized.
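The basic operation can be sketched with naive interval arithmetic. Note that this toy version ignores the directed rounding that a verified BLAS-based implementation must control, and it makes no attempt at the paper's optimizations:

```python
def imul(a, b):
    """Product of two scalar intervals (lo, hi)."""
    ps = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(ps), max(ps))

def iadd(a, b):
    """Sum of two scalar intervals (lo, hi)."""
    return (a[0] + b[0], a[1] + b[1])

def interval_matmul(A, B):
    """Interval matrix product; every entry is an interval (lo, hi)."""
    n, m, k = len(A), len(B[0]), len(B)
    C = [[(0, 0)] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = (0, 0)
            for l in range(k):
                acc = iadd(acc, imul(A[i][l], B[l][j]))
            C[i][j] = acc
    return C

A = [[(1, 2), (0, 1)]]
B = [[(1, 1)], [(-1, 1)]]
print(interval_matmul(A, B))  # → [[(0, 3)]]
```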

research-article
Parallel computation of determinants of matrices with polynomial entries for robust control design

In this paper we consider computing determinants of polynomial matrices symbolically. Determinant computation of matrices with polynomial entries in a small number of variables is of particular interest since it commonly appears in solving engineering ...

research-article
Cache friendly sparse matrix-vector multiplication

Sparse matrix-vector multiplication or SpMXV is an important kernel in scientific computing. For example, the conjugate gradient method (CG) is an iterative linear system solving process where multiplication of the coefficient matrix A with a dense ...

research-article
Parallelising the computational algebra system GAP

We report on the project of parallelising GAP, a system for computational algebra. Our design aims to make concurrency facilities available for GAP users, while preserving as much of the existing code base (about one million lines of code) with as few ...

Contributors
  • Western University
  • Grenoble Alpes University
