HiPC 2014 Accepted Papers
GPU Parallelization of the Stochastic On-time Arrival Problem
On the Suitability of MPI as a PGAS Runtime
Saving Energy by Exploiting Residual Imbalances on Iterative Applications
Balancing Context Switch Penalty and Response Time with Elastic Time Slicing
Relax-Miracle: GPU Parallelization of Semi-Analytic Fourier-Domain solvers for Earthquake Modeling
Mixed-Precision Models for Calculation of High-Order Virial Coefficients on GPUs
Efficient and Robust Allocation Algorithms in Clouds under Memory Constraints
Reducing elimination tree height for parallel LU factorization of sparse unsymmetric matrices
Particle Advection Performance Over Varied Architectures and Workloads
Xevolver: An XML-based Code Translation Framework for Supporting HPC Application Migration
Towards Realizing the Potential of Malleable Jobs
Analysis and Tuning of Libtensor Framework on Multicore Architectures
Optimizing Shared Data Accesses in Distributed-Memory X10 Systems
Distance Threshold Similarity Searches on Spatiotemporal Trajectories using GPGPU
Fine-grained GPU parallelization of Pairwise Local Sequence Alignment
DRIVE: Using Implicit Caching Hints to achieve Disk I/O Reduction in Virtualized Environments
Improving Multi-dimensional Query Processing with Data Migration in Distributed Cache Infrastructure
Interface for Heterogeneous Kernels: A Framework to Enable Hybrid OS Designs targeting High Performance Computing on Manycore Architectures
Optimization of Scan Algorithms on Multi- and Many-core Processors
A Proactive Approach for Coping with Uncertain Resource Availabilities on Desktop Grids
Software Based Ultrasound B-mode/Beamforming Optimization on GPU and its Performance Prediction
GpuTejas: A Parallel Simulator for GPU Architectures
RADIR: Lock-free and Wait-free Bandwidth Allocation Models for Solid State Drives
Trikon: A Hypervisor Aware Manycore Processor
Combining HoL-blocking Avoidance and Differentiated Services in High-Speed Interconnects
CQA: A Code Quality Analyzer tool at binary level
A Multilevel Compressed Sparse Row Format for Efficient Sparse Computations on Multicore Processors
Premonition of Storage Response Class Using Skyline Ranked Ensemble Method
Optimizing the performance of parallel applications on a 5D torus via task mapping
A High Performance Broadcast Design with Hardware Multicast and GPUDirect RDMA for Streaming Applications on InfiniBand Clusters
An Early Experience of Regional Ocean Modelling on Intel Many Integrated Core Architecture
Matrix-Matrix Multiplication on a Large Register File Architecture with Indirection
Parallel AMG Solver for Three Dimensional Unstructured Grids Using GPU
A Flexible Scheduling Framework for Heterogeneous CPU-GPU Clusters
Smart Multi-Task Scheduling for OpenCL Programs on CPU/GPU Heterogeneous Platforms
Queueing-based Storage Performance Modeling and Placement in OpenStack Environments
Optical Overlay NUCA: A High Speed Substrate for Shared L2 Caches
A Fast Implementation of MLR-MCL Algorithm on Multi-core Processors
Coupling-Aware Graph Partitioning Algorithms
An improved recursive graph bipartitioning algorithm for well balanced domain decomposition
Algorithms for Power-Aware Resource Activation
Design and Evaluation of Parallel Hashing over Large-scale Data
Cache-Conscious Scheduling of Streaming Pipelines on Parallel Machines with Private Caches
Scaling Graph Community Detection on the Tilera Many-core Architecture
High Performance MPI Library over SR-IOV Enabled InfiniBand Clusters
Designing Efficient Small Message Transfer Mechanism for Inter-node MPI Communication on InfiniBand GPU Clusters
Online failure prediction for HPC resources using decentralized clustering
Heterogeneous many cores for medical control: Performance, Scalability, and Accuracy