29th PACT 2020: Virtual Event, GA, USA
- Vivek Sarkar, Hyesoon Kim:
PACT '20: International Conference on Parallel Architectures and Compilation Techniques, Virtual Event, GA, USA, October 3-7, 2020. ACM 2020, ISBN 978-1-4503-8075-1
Keynote I
- Rick Stevens:
Overview of HPC and AI Computing for COVID-19 in the US. 1
Session 1: Optimizations for GPUs
- Jiannan Tian, Sheng Di, Kai Zhao, Cody Rivera, Megan Hickman Fulp, Robert Underwood, Sian Jin, Xin Liang, Jon Calhoun, Dingwen Tao, Franck Cappello:
cuSZ: An Efficient GPU-Based Error-Bounded Lossy Compression Framework for Scientific Data. 3-15
- Kishore Punniyamurthy, Andreas Gerstlauer:
TAFE: Thread Address Footprint Estimation for Capturing Data/Thread Locality in GPU Systems. 17-29
- Ziheng Wang:
SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference. 31-42
- Chanyoung Oh, Zhen Zheng, Xipeng Shen, Jidong Zhai, Youngmin Yi:
GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU. 43-54
- Changwan Hong, Laxman Dhulipala, Julian Shun:
Exploring the Design Space of Static and Incremental Graph Connectivity Algorithms on GPUs. 55-69
Session 2: Compiler Optimization and Code Generation
- Bastian Hagedorn, Archibald Samuel Elliott, Henrik Barthels, Rastislav Bodík, Vinod Grover:
Fireiron: A Data-Movement-Aware Scheduling Language for GPUs. 71-82
- Lorenzo Chelini, Tobias Gysi, Tobias Grosser, Martin Kong, Henk Corporaal:
Automatic Generation of Multi-Objective Polyhedral Compiler Transformations. 83-96
- Mingchuan Wu, Ying Liu, Huimin Cui, Qingfu Wei, Quanfeng Li, Limin Li, Fang Lv, Jingling Xue, Xiaobing Feng:
Bandwidth-Aware Loop Tiling for DMA-Supported Scratchpad Memory. 97-109
- Guixin Ye, Zhanyong Tang, Huanting Wang, Dingyi Fang, Jianbin Fang, Songfang Huang, Zheng Wang:
Deep Program Structure Modeling Through Multi-Relational Graph-based Learning. 111-123
- Linjian Ma, Jiayu Ye, Edgar Solomonik:
AutoHOOT: Automatic High-Order Optimization for Tensors. 125-137
- Tanzima Sultana, Blake Allen, Apan Qasem:
Intelligent Data Placement on Discrete GPU Nodes with Unified Memory. 139-151
Poster Session I
- Ruobing Chen, Jinping Wu, Haosen Shi, Yusen Li, Haiyan Yin, Shanjiang Tang, Xiaoguang Liu, Gang Wang:
Deep Learning Assisted Resource Partitioning for Improving Performance on Commodity Servers. 153-154
- Bokyeong Kim, Soojin Hwang, Sanghoon Cha, Chang Hyun Park, Jongse Park, Jaehyuk Huh:
Decoupled Address Translation for Heterogeneous Memory Systems. 155-156
- Jiho Kim, Sanghun Cho, Minsoo Rhu, Ali Bakhoda, Tor M. Aamodt, John Kim:
Bandwidth Bottleneck in Network-on-Chip for High-Throughput Processors. 157-158
Keynote II
- Sarita V. Adve:
Scalable Specialization: Architectures, Interfaces, & Applications. 159
Session 3: Parallel Architectures
- Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Gabriel H. Loh, Adwait Jog:
Analyzing and Leveraging Shared L1 Caches in GPUs. 161-173
- Subhankar Pal, Siying Feng, Dong-Hyeon Park, Sung Kim, Aporva Amarnath, Chi-Sheng Yang, Xin He, Jonathan Beaumont, Kyle May, Yan Xiong, Kuba Kaszyk, John Magnus Morton, Jiawen Sun, Michael F. P. O'Boyle, Murray Cole, Chaitali Chakrabarti, David T. Blaauw, Hun-Seok Kim, Trevor N. Mudge, Ronald G. Dreslinski:
Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration. 175-190
- Xulong Tang, Ziyu Zhang, Weizheng Xu, Mahmut Taylan Kandemir, Rami G. Melhem, Jun Yang:
Enhancing Address Translations in Throughput Processors via Compression. 191-204
- Sawan Singh, Alexandra Jimborean, Alberto Ros:
Regional Out-of-Order Writes in Total Store Order. 205-216
- Stuart Byma, Akash Dhasade, Adrian M. Altenhoff, Christophe Dessimoz, James R. Larus:
Parallel and Scalable Precise Clustering. 217-228
Session 4: Hardware/Software for Security & Machine Learning
- Omais Shafi, Janibul Bashir:
SecSched: Flexible Scheduling in Secure Processors. 229-240
- Kim-Anh Tran, Christos Sakalis, Magnus Själander, Alberto Ros, Stefanos Kaxiras, Alexandra Jimborean:
Clearing the Shadows: Recovering Lost Performance for Invisible Speculative Execution through HW/SW Co-Design. 241-254
- Yulin Zhang, Xiaoming Li:
Fast Convolutional Neural Networks with Fine-Grained FFTs. 255-265
- Masuma Akter Rumi, Xiaolong Ma, Yanzhi Wang, Peng Jiang:
Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning. 267-278
- Zhangxiaowen Gong, Houxiang Ji, Christopher W. Fletcher, Christopher J. Hughes, Josep Torrellas:
SparseTrain: Leveraging Dynamic Sparsity in Software for Training DNNs on General-Purpose SIMD Processors. 279-292
Session 5: Best Paper
- Qian Lou, Sarath Chandra Janga, Lei Jiang:
Helix: Algorithm/Architecture Co-design for Accelerating Nanopore Genome Base-calling. 293-304
- Saurabh Gupta, Niranjan Soundararajan, Ragavendra Natarajan, Sreenivas Subramoney:
Opportunistic Early Pipeline Re-steering for Data-dependent Branches. 305-316
- Abhinav Jangda, Arjun Guha:
Model-Based Warp Overlapped Tiling for Image Processing Programs on GPUs. 317-328
- Yiming Gan, Yuxian Qiu, Lele Chen, Jingwen Leng, Yuhao Zhu:
Low-Latency Proactive Continuous Vision. 329-342
Poster Session II
- Mahmut T. Kandemir, Jihyun Ryoo, Hui Zhao, Myoungsoo Jung, Mustafa Karaköy:
Collective Affinity Aware Computation Mapping. 343-344
- Feng Yu, Jiacheng Zhao, Huimin Cui, Xiaobing Feng, Jingling Xue:
VTensor: Using Virtual Tensors to Build a Layout-oblivious AI Programming Framework. 345-346
- Roberto Castañeda Lozano, Murray Cole, Björn Franke:
Parallelizing Parallel Programs: A Dynamic Pattern Analysis for Modernization of Legacy Parallel Code. 347-348
- Xinglei Dou, Lei Liu:
A New Qubits Mapping Mechanism for Multi-programming Quantum Computing. 349-350
- Matthew Rodriguez, Ahmed Hassan, Michael F. Spear:
Exploiting Locality in Scalable Ordered Maps. 351-352
- Majed Valad Beigi, Bahareh Pourshirazi, Gokhan Memik, Zhichun Zhu:
DeepSwapper: A Deep Learning Based Page Swap Management Scheme for Hybrid Memory Systems. 353-354
- Tiago T. Jost, Yves Durand, Christian Fabre, Albert Cohen, Frédéric Pétrot:
VP Float: First Class Treatment for Variable Precision Floating Point Arithmetic. 355-356
- Vignesh Adhinarayanan, Wu-chun Feng:
Approximate Pattern Matching for On-Chip Interconnect Traffic Prediction. 357-358
Keynote III
- Bradford L. Chamberlain:
Compiling Chapel: Keys to Making Parallel Programming Productive at Scale. 359
Session 6: Domain/Application-Specific Hardware/Software
- Kartik Lakshminarasimhan, Ajeya Naithani, Josué Feliu, Lieven Eeckhout:
The Forward Slice Core Microarchitecture. 361-372
- Yan Pei, Swarnendu Biswas, Donald S. Fussell, Keshav Pingali:
A Methodology for Principled Approximation in Visual SLAM. 373-386
- Jonathan M. Baker, David I. Schuster, Frederic T. Chong:
Memory-Equipped Quantum Architectures: The Power of Random Access. 387-398
- Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Jongse Park, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh:
Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic. 399-411
- Mohammad Alaul Haque Monil, Mehmet E. Belviranli, Seyong Lee, Jeffrey S. Vetter, Allen D. Malony:
MEPHESTO: Modeling Energy-Performance in Heterogeneous SoCs and Their Trade-Offs. 413-425
Session 7: Memory/Storage Systems
- Kai Wu, Ivy Bo Peng, Jie Ren, Dong Li:
Ribbon: High Performance Cache Line Flushing for Persistent Memory. 427-439
- Rachata Ausavarungnirun, Timothy Merrifield, Jayneel Gandhi, Christopher J. Rossbach:
PRISM: Architectural Support for Variable-granularity Memory Metadata. 441-454
- Trinayan Baruah, Yifan Sun, Saiful A. Mojumder, José L. Abellán, Yash Ukidave, Ajay Joshi, Norman Rubin, John Kim, David R. Kaeli:
Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance. 455-466
- Changyeon Jo, Hyunik Kim, Hexiang Geng, Bernhard Egger:
RackMem: A Tailored Caching Layer for Rack Scale Computing. 467-480
- Harsh Gugale, Nagendra Gulur, Yashwant Marathe, Lizy K. John:
ATTC (@C): Addressable-TLB based Translation Coherence. 481-492