Sciweavers

630 search results - page 57 / 126
» Optimized union of non-disjoint distributed data sets
ICS
2010
Tsinghua U.
15 years 9 months ago
Speeding up Nek5000 with autotuning and specialization
Autotuning technology has emerged recently as a systematic process for evaluating alternative implementations of a computation, in order to select the best-performing solution for...
Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun...
IPPS
2003
IEEE
15 years 12 months ago
Parallel ROLAP Data Cube Construction On Shared-Nothing Multiprocessors
The pre-computation of data cubes is critical to improving the response time of On-Line Analytical Processing (OLAP) systems and can be instrumental in accelerating data mining tas...
Ying Chen, Frank K. H. A. Dehne, Todd Eavis, Andre...
ICS
2009
Tsinghua U.
16 years 1 month ago
MPI-aware compiler optimizations for improving communication-computation overlap
Several existing compiler transformations can help improve communication-computation overlap in MPI applications. However, traditional compilers treat calls to the MPI library as ...
Anthony Danalis, Lori L. Pollock, D. Martin Swany,...
EUROPAR
2010
Springer
15 years 6 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
DAC
1996
ACM
15 years 10 months ago
Optimal Clock Skew Scheduling Tolerant to Process Variations
A methodology is presented in this paper for determining an optimal set of clock path delays for designing high performance VLSI/ULSI-based clock distribution networks. This met...
José Luis Neves, Eby G. Friedman