Sciweavers

558 search results - page 53 / 112
» Programming the FlexRAM parallel intelligent memory system
Sort
View
ICS
2009
Tsinghua U.
16 years 1 months ago
Computer generation of fast fourier transforms for the cell broadband engine
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus P&uum...
IPPS
2009
IEEE
16 years 1 months ago
Scalable RDMA performance in PGAS languages
Partitioned Global Address Space (PGAS) languages provide a unique programming model that can span shared-memory multiprocessor (SMP) architectures, distributed memory machines, o...
Montse Farreras, George Almási, Calin Casca...
IEEEPACT
2007
IEEE
16 years 25 days ago
JudoSTM: A Dynamic Binary-Rewriting Approach to Software Transactional Memory
With the advent of chip-multiprocessors, we are faced with the challenge of parallelizing performance-critical software. Transactional memory (TM) has emerged as a promising progr...
Marek Olszewski, Jeremy Cutler, J. Gregory Steffan
IEEEPACT
1998
IEEE
15 years 10 months ago
Sirocco: Cost-Effective Fine-Grain Distributed Shared Memory
Software fine-grain distributed shared memory (FGDSM) provides a simplified shared-memory programming interface with minimal or no hardware support. Originally software FGDSMs tar...
Ioannis Schoinas, Babak Falsafi, Mark D. Hill, Jam...
PLDI
2012
ACM
13 years 9 months ago
Adaptive input-aware compilation for graphics engines
While graphics processing units (GPUs) provide low-cost and efficient platforms for accelerating high performance computations, the tedious process of performance tuning required...
Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Jan...