57th MICRO 2024: Austin, TX, USA
- 57th IEEE/ACM International Symposium on Microarchitecture, MICRO 2024, Austin, TX, USA, November 2-6, 2024. IEEE 2024, ISBN 979-8-3503-5057-9
- Daniel Johnson:
Message from the MICRO 2024 General Chairs: "Hi, How Are you?" - "Jeremiah The Innocent" Mural. xxiii-xxv
- Daniel A. Jiménez, Alaa R. Alameldeen:
Message from the MICRO 2024 Program Chairs. xxvi-xxvii
- Yuqi Xue, Yiqi Liu, Lifeng Nai, Jian Huang:
Hardware-Assisted Virtualization of Neural Processing Units for Cloud Platforms. 1-16
- Stratos Psomadakis, Chloe Alverti, Vasileios Karakostas, Christos Katsakioris, Dimitrios Siakavaras, Konstantinos Nikas, Georgios I. Goumas, Nectarios Koziris:
Elastic Translations: Fast Virtual Memory with Multiple Translation Sizes. 17-35
- Osang Kwon, Yongho Lee, Junhyeok Park, Sungbin Jang, Byungchul Tak, Seokin Hong:
Distributed Page Table: Harnessing Physical Memory as an Unbounded Hashed Page Table. 36-49
- Dongseok Im, Hoi-Jun Yoo:
CamPU: A Multi-Camera Processing Unit for Deep Learning-based 3D Spatial Computing Systems. 50-63
- Seungjae Yoo, Hangyeol Kim, Joo-Young Kim:
AdapTiV: Sign-Similarity Based Image-Adaptive Token Merging for Vision Transformer Acceleration. 64-77
- Sixu Li, Yang Zhao, Chaojian Li, Bowei Guo, Jingqun Zhang, Wenbo Zhu, Zhifan Ye, Cheng Wan, Yingyan Celine Lin:
Fusion-3D: Integrated Acceleration for Instant 3D Reconstruction and Real-Time Rendering. 78-91
- Sumon Nath, Agustín Navarro-Torres, Alberto Ros, Biswabandan Panda:
Secure Prefetching for Secure Cache Systems. 92-104
- Yunkai Bai, Peinan Li, Yubiao Huang, Michael C. Huang, Shijun Zhao, Lutan Zhao, Fengwei Zhang, Dan Meng, Rui Hou:
HyperTEE: A Decoupled TEE Architecture with Secure Enclave Management. 105-120
- Jaeseok Choi, Hyunwoo Joe, Changhee Jung, Jongouk Choi:
Defending Against EMI Attacks on Just-In-Time Checkpoint for Resilient Intermittent Systems. 121-135
- Pouya Esmaili-Dokht, Francesco Sgherzi, Valéria Soldera Girelli, Isaac Boixaderas, Mariana Carmin, Alireza Monemi, Adrià Armejach, Estanislao Mercadal, Germán Llort, Petar Radojkovic, Miquel Moretó, Judit Giménez, Xavier Martorell, Eduard Ayguadé, Jesús Labarta, Emanuele Confalonieri, Rishabh Dubey, Jason Adlard:
A Mess of Memory System Benchmarking, Simulation and Application Profiling. 136-152
- Jehyeon Bang, Yujeong Choi, Myeongwoo Kim, Yongdeok Kim, Minsoo Rhu:
vTrain: A Simulation Framework for Evaluating Cost-Effective and Compute-Optimal Large Language Model Training. 153-167
- Jianchao Yang, Mei Wen, Dong Chen, Zhaoyun Chen, Zeyu Xue, Yuhang Li, Junzhong Shen, Yang Shi:
HyFiSS: A Hybrid Fidelity Stall-Aware Simulator for GPGPUs. 168-185
- Ruobing Han, Jisheng Zhao, Hyesoon Kim:
Unleashing CPU Potential for Executing GPU Programs Through Compiler/Runtime Optimizations. 186-200
- Yishen Chen, Saman P. Amarasinghe:
A Framework for Fine-Grained Program Versioning. 201-214
- Yuchen Zhou, Jianping Zeng, Changhee Jung:
LightWSP: Whole-System Persistence on the Cheap. 215-230
- Peter W. Deutsch, Vincent Quentin Ulitzsch, Sudhanva Gurumurthi, Vilas Sridharan, Joel S. Emer, Mengjia Yan:
DelayAVF: Calculating Architectural Vulnerability Factors for Delay Faults. 231-245
- Evgeny Manzhosov, Simha Sethumadhavan:
Polymorphic Error Correction. 246-262
- Heng Zhou, Bing Wu, Huan Cheng, Jinpeng Liu, Taoming Lei, Dan Feng, Wei Tong:
DRCTL: A Disorder-Resistant Computation Translation Layer Enhancing the Lifetime and Performance of Memristive CIM Architecture. 263-277
- Junhyeok Park, Osang Kwon, Yongho Lee, Seongwook Kim, Gwangeun Byeon, Jihun Yoon, Prashant J. Nair, Seokin Hong:
A Case for Speculative Address Translation with Rapid Validation for GPUs. 278-292
- Pratheek B, Guilherme Cox, Ján Veselý, Arkaprava Basu:
SUV: Static Analysis Guided Unified Virtual Memory. 293-308
- Bingyao Li, Yueqi Wang, Tianyu Wang, Lieven Eeckhout, Jun Yang, Aamer Jaleel, Xulong Tang:
STAR: Sub-Entry Sharing-Aware TLB for Multi-Instance GPU. 309-323
- Soyoung Park, Hojung Namkoong, Boyeol Choi, Michael B. Sullivan, Jungrae Kim:
CacheCraft: Enhancing GPU Performance under Memory Protection through Reconstructed Caching. 324-337
- Xianglong Deng, Shengyu Fan, Zhicheng Hu, Zhuoyu Tian, Zihao Yang, Jiangrui Yu, Dingyuan Cao, Dan Meng, Rui Hou, Meng Li, Qian Lou, Mingzhe Zhang:
Trinity: A General Purpose FHE Accelerator. 338-351
- Minxuan Zhou, Yujin Nam, Xuan Wang, Youhak Lee, Chris Wilkerson, Raghavan Kumar, Sachin Taneja, Sanu Mathew, Rosario Cammarota, Tajana Rosing:
UFC: A Unified Accelerator for Fully Homomorphic Encryption. 352-365
- Nikola Samardzic, Simon Langowski, Srinivas Devadas, Daniel Sánchez:
Accelerating Zero-Knowledge Proofs Through Hardware-Algorithm Co-Design. 366-379
- Zhuoran Ji, Jianyu Zhao, Zhaorui Zhang, Jiming Xu, Shoumeng Yan, Lei Ju:
A Compiler-Like Framework for Optimizing Cryptographic Big Integer Multiplication on GPUs. 380-392
- Katie Lim, Matthew Giordano, Theano Stavrinos, Irene Zhang, Jacob Nelson, Baris Kasikci, Thomas E. Anderson:
Beehive: A Flexible Network Stack for Direct-Attached Accelerators. 393-408
- Hasan Nazim Genc, Hansung Kim, Prashanth Ganesh, Yakun Sophia Shao:
Stellar: An Automated Design Framework for Dense and Sparse Spatial Accelerators. 409-422
- Zixi Li, David Wentzlaff:
LUCIE: A Universal Chiplet-Interposer Design Framework for Plug-and-Play Integration. 423-436
- Qinggang Wang, Long Zheng, Zhaozeng An, Shuyi Xiong, Runze Wang, Yu Huang, Pengcheng Yao, Xiaofei Liao, Hai Jin, Jingling Xue:
A Scalable, Efficient, and Robust Dynamic Memory Management Library for HLS-based FPGAs. 437-450
- Kevin Weston, Avery Johnson, Vahid Janfaza, Farabi Mahmud, Abdullah Muzahid:
Customizing Cache Indexing Through Entropy Estimation. 451-463
- David Schall, Andreas Sandberg, Boris Grot:
The Last-Level Branch Predictor. 464-479
- Aniket Anand Deshmukh, Lingzhe Chester Cai, Yale N. Patt:
Timely, Efficient, and Accurate Branch Precomputation. 480-492
- Kenichiro Mori, Sota Kosugi, Hiroto Yoshida, Hajime Shimada, Hideki Ando:
Localizing the Tag Comparisons in the Wakeup Logic to Reduce Energy Consumption of the Issue Queue. 493-506
- Yao Hsiao, Nikos Nikoleris, Artem Khyzha, Dominic P. Mulligan, Gustavo Petri, Christopher W. Fletcher, Caroline Trippel:
RTL2MμPATH: Multi-μPATH Synthesis with Applications to Hardware Security Verification. 507-524
- Zhuoran Song, Houshu He, Fangxin Liu, Yifan Hao, Xinkai Song, Li Jiang, Xiaoyao Liang:
SRender: Boosting Neural Radiance Field Efficiency via Sensitivity-Aware Dynamic Precision Rendering. 525-537
- Yi Chen, Yongwei Zhao, Yifan Hao, Yuanbo Wen, Yuntao Dai, Xiaqing Li, Yang Liu, Rui Zhang, Mo Zou, Xinkai Song, Xing Hu, Zidong Du, Huaping Chen, Qi Guo, Tianshi Chen:
Cambricon-C: Efficient 4-Bit Matrix Unit via Primitivization. 538-550
- Yuzong Chen, Jian Meng, Jae-sun Seo, Mohamed S. Abdelfattah:
BBS: Bi-Directional Bit-Level Sparsity for Deep Learning Acceleration. 551-564
- Mohanad Odema, Luke Chen, Hyoukjun Kwon, Mohammad Abdullah Al Faruque:
SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators. 565-579
- Lingxiang Yin, Sanjay Gandham, Mingjie Lin, Hao Zheng:
SCALE: A Structure-Centric Accelerator for Message Passing Graph Neural Networks. 580-593
- Hyungkyu Ham, Jeongmin Hong, Geonwoo Park, Yunseon Shin, Okkyun Woo, Wonhyuk Yang, Jinhoon Bae, Eunhyeok Park, Hyojin Sung, Euicheol Lim, Gwangsun Kim:
Low-Overhead General-Purpose Near-Data Processing in CXL Memory Expanders. 594-611
- Pingyi Huo, Anusha Devulapally, Hasan Al Maruf, Minseo Park, Krishnakumar Nair, Meena Arunachalam, Gulsum Gudukbay Akbulut, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan:
PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences. 612-626
- Dongjae Lee, Bongjoon Hyun, Taehun Kim, Minsoo Rhu:
PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems. 627-642
- Axel Feldmann, Courtney Golden, Yifan Yang, Joel S. Emer, Daniel Sánchez:
Azul: An Accelerator for Sparse Iterative Solvers Leveraging Distributed On-Chip Memory. 643-656
- Kailin Yang, José F. Martínez:
FloatAP: Supporting High-Performance Floating-Point Arithmetic in Associative Processors. 657-670
- Yicong Zhang, Mingyu Wang, Wangguang Wang, Yangzhan Mai, Haiqiu Huang, Zhiyi Yu:
Atomic Cache: Enabling Efficient Fine-Grained Synchronization with Relaxed Memory Consistency on GPGPUs Through In-Cache Atomic Operations. 671-685
- Ni Kang, Ahmad Alawneh, Mengchi Zhang, Timothy G. Rogers:
Concurrency-Aware Register Stacks for Efficient GPU Function Calls. 686-699
- Preyesh Dalmia, Rajesh Shashi Kumar, Matthew D. Sinclair:
CPElide: Efficient Multi-Chiplet GPU Implicit Synchronization. 700-717
- Suhas Vittal, Ali Javadi-Abhari, Andrew W. Cross, Lev S. Bishop, Moinuddin Qureshi:
Flag-Proxy Networks: Overcoming the Architectural, Scheduling and Decoding Obstacles of Quantum LDPC Codes. 718-734
- Meng Wang, Poulami Das, Prashant J. Nair:
Qoncord: A Multi-Device Job Scheduling Framework for Variational Quantum Algorithms. 735-749
- Keyi Yin, Xiang Fang, Travis S. Humble, Ang Li, Yunong Shi, Yufei Ding:
Surf-Deformer: Mitigating Dynamic Defects on Surface Code via Adaptive Deformation. 750-764
- Ruifan Xu, Jin Luo, Yawen Zhang, Yibo Lin, Runsheng Wang, Ru Huang, Yun Liang:
Hestia: An Efficient Cross-Level Debugger for High-Level Synthesis. 765-779
- Ali Mosallaei, Katherine E. Isaacs, Yifan Sun:
Looking into the Black Box: Monitoring Computer Architecture Simulations in Real-Time with AkitaRTM. 780-794
- Ajay Nayak, Arkaprava Basu:
Over-Synchronization in GPU Programs. 795-809
- Juan M. Cebrian, Magnus Jahre, Alberto Ros:
Temporarily Unauthorized Stores: Write First, Ask for Permission Later. 810-822
- Vipin Patel, Swarnendu Biswas, Mainak Chaudhuri:
Leveraging Cache Coherence to Detect and Repair False Sharing On-the-fly. 823-839
- Víctor Nicolás-Conesa, J. Rubén Titos Gil, Ricardo Fernández Pascual, Manuel E. Acacio, Alberto Ros:
Chaining Transactions for Effective Concurrency Management in Hardware Transactional Memory. 840-855
- William Won, Midhilesh Elavazhagan, Sudarshan Srinivasan, Swati Gupta, Tushar Krishna:
TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning. 856-870
- Yinxiao Feng, Wei Li, Kaisheng Ma:
Ring Road: A Scalable Polar-Coordinate-based 2D Network-on-Chip Architecture. 871-884
- Zhixian Jin, Christopher Rocca, Jiho Kim, Hans Kasan, Minsoo Rhu, Ali Bakhoda, Tor M. Aamodt, John Kim:
Uncovering Real GPU NoC Characteristics: Implications on Interconnect Architecture. 885-898
- Moinuddin Qureshi, Salman Qazi, Aamer Jaleel:
MINT: Securely Mitigating Rowhammer with a Minimalist in-DRAM Tracker. 899-914
- Oguzhan Canpolat, A. Giray Yaglikçi, Ataberk Olgun, Ismail Emir Yuksel, Yahya Can Tugrul, Konstantinos Kanellopoulos, Oguz Ergin, Onur Mutlu:
BreakHammer: Enhancing RowHammer Mitigations by Carefully Throttling Suspect Threads. 915-934
- Anish Saxena, Aamer Jaleel, Moinuddin Qureshi:
ImPress: Securing DRAM Against Data-Disturbance Errors via Implicit Row-Press Mitigation. 935-948
- Hasan Hassan, Ataberk Olgun, A. Giray Yaglikçi, Haocong Luo, Onur Mutlu:
Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient DRAM Maintenance Operations. 949-965
- Muhammad Laghari, Yuqing Liu, Gagandeep Panwar, David Bears, Chandler Jearls, Raghavendra Srinivas, Esha Choukse, Kirk W. Cameron, Ali Raza Butt, Xun Jian:
Memory Allocation Under Hardware Compression. 966-982
- Youngin Kim, William J. Song:
Genie Cache: Non-Blocking Miss Handling and Replacement in Page-Table-Based DRAM Cache. 983-996
- Albert Cho, Alexandros Daglis:
StarNUMA: Mitigating NUMA Challenges with Memory Pooling. 997-1012
- Ahmad Alawneh, Ni Kang, Mahmoud Khairy, Timothy G. Rogers:
ThreadFuser: A SIMT Analysis Framework for MIMD Programs. 1013-1026
- Aaron Barnes, Fangjia Shen, Timothy G. Rogers:
Extending GPU Ray-Tracing Units for Hierarchical Search Acceleration. 1027-1040
- Dongho Ha, Lufei Liu, Yuan-Hsi Chou, Seokjin Go, Won Woo Ro, Hung-Wei Tseng, Tor M. Aamodt:
Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs. 1041-1057
- Aurora Tomás, Juan L. Aragón, Joan-Manuel Parcerisa, Antonio González:
LIBRA: Memory Bandwidth- and Locality-Aware Parallel Tile Rendering. 1058-1072
- Hunjun Lee, Yeongwoo Jang, Daye Jung, Seunghyun Song, Jangwoo Kim:
Rearchitecting a Neuromorphic Processor for Spike-Driven Brain-Computer Interfacing. 1073-1089
- Zongwu Wang, Fangxin Liu, Ning Yang, Shiyuan Huang, Haomin Li, Li Jiang:
COMPASS: SRAM-Based Computing-in-Memory SNN Accelerator with Adaptive Spike Speculation. 1090-1106
- Ruokai Yin, Youngeun Kim, Di Wu, Priyadarshini Panda:
LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks. 1107-1121
- Xiaoyi Liu, Zhongzhu Pu, Peng Qu, Weimin Zheng, Youhui Zhang:
ActiveN: A Scalable and Flexibly-Programmable Event-Driven Neuromorphic Processor. 1122-1137
- Zhixian Jin, Jaeguk Ahn, Jiho Kim, Hans Kasan, Jina Song, Wonjun Song, John Kim:
Ghost Arbitration: Mitigating Interconnect Side-Channel Timing Attacks in GPU. 1138-1152
- Md Hafizul Islam Chowdhuryy, Fan Yao:
IvLeague: Side Channel-Resistant Secure Architectures Using Isolated Domains of Dynamic Integrity Trees. 1153-1168
- Yuanqing Miao, Yingtian Zhang, Dinghao Wu, Danfeng Zhang, Gang Tan, Rui Zhang, Mahmut Taylan Kandemir:
Veiled Pathways: Investigating Covert and Side Channels Within GPU Uncore. 1169-1183
- Nikhil Agarwal, Mitchell Fream, Souradip Ghosh, Brian C. Schwedock, Nathan Beckmann:
The TYR Dataflow Architecture: Improving Locality by Taming Parallelism. 1184-1200
- Yunan Zhang, Po-An Tsai, Hung-Wei Tseng:
Sparsepipe: Sparse Inter-operator Dataflow Architecture with Cross-Iteration Reuse. 1201-1216
- Rishabh Jain, Vivek M. Bhasi, Adwait Jog, Anand Sivasubramaniam, Mahmut T. Kandemir, Chita R. Das:
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs. 1217-1232
- Hyun Ryong Lee, Daniel Sánchez:
Terminus: A Programmable Accelerator for Read and Update Operations on Sparse Data Structures. 1233-1246
- Huizheng Wang, Jiahao Fang, Xinru Tang, Zhiheng Yue, Jinxi Li, Yubin Qin, Sihan Guan, Qinze Yang, Yang Wang, Chao Li, Yang Hu, Shouyi Yin:
SOFA: A Compute-Memory Optimized Sparsity Accelerator via Cross-Stage Coordinated Tiling. 1247-1263
- Hui Yu, Yu Zhang, Ligang He, Yingqi Zhao, Xintao Li, Ruida Xin, Jin Zhao, Xiaofei Liao, Haikun Liu, Bingsheng He, Hai Jin:
RAHP: A Redundancy-aware Accelerator for High-performance Hypergraph Neural Network. 1264-1277
- Brian C. Schwedock, Nathan Beckmann:
Leviathan: A Unified System for General-Purpose Near-Data Computing. 1278-1294
- Zerun Li, Xiaoming Chen, Yinhe Han:
TMiner: A Vertex-Based Task Scheduling Architecture for Graph Pattern Mining. 1295-1308
- Xuan-Jun Chen, Han-Ping Chen, Chia-Lin Yang:
PointCIM: A Computing-in-Memory Architecture for Accelerating Deep Point Cloud Analytics. 1309-1322
- Mohammad Bakhshalipour, Hamidreza Zare, Farid Samandi, Fatemeh Golshan, Pejman Lotfi-Kamran, Hamid Sarbazi-Azad:
Blenda: Dynamically-Reconfigurable Stacked DRAM. 1323-1337
- Cheng Tan, Miaomiao Jiang, Deepak Patil, Yanghui Ou, Zhaoying Li, Lei Ju, Tulika Mitra, Hyunchul Park, Antonino Tumeo, Jeff Zhang:
ICED: An Integrated CGRA Framework Enabling DVFS-Aware Acceleration. 1338-1352
- Raghu Prabhakar, Ram Sivaramakrishnan, Darshan Gandhi, Yun Du, Mingran Wang, Xiangyu Song, Kejie Zhang, Tianren Gao, Angela Wang, Xiaoyan Li, Yongning Sheng, Joshua Brot, Denis Sokolov, Apurv Vivek, Calvin Leung, Arjun Sabnis, Jiayu Bai, Tuowen Zhao, Mark Gottscho, David Jackson, Mark Luttrell, Manish K. Shah, Zhengyu Chen, Kaizhao Liang, Swayambhoo Jain, Urmish Thakker, Dawei Huang, Sumti Jairath, Kevin J. Brown, Kunle Olukotun:
SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts. 1353-1366
- Jaime Roelandts, Ajeya Naithani, Sam Ainsworth, Timothy M. Jones, Lieven Eeckhout:
Scalar Vector Runahead. 1367-1381
- Roman Brunner, Rakesh Kumar:
Weeding out Front-End Stalls with Uneven Block Size Instruction Cache. 1382-1396
- Jovan Stojkovic, Esha Choukse, Enrique Saurez, Íñigo Goiri, Josep Torrellas:
Mosaic: Harnessing the Micro-Architectural Resources of Servers in Serverless Environments. 1397-1412
- Peng Gao, Yang Liu, Jun Wang, Wanlin Cai, Guangchong Shen, Zonghui Hong, Jiali Qu, Ning Wang:
SOPHGO BM1684X: A Commercial High Performance Terminal AI Processor with Large Model Support. 1413-1428
- Sungmin Yun, Kwanhee Kyung, Juhwan Cho, Jaewan Choi, Jongmin Kim, Byeongho Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn:
Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching. 1429-1443
- Seung Yul Lee, Hyunseung Lee, Jihoon Hong, SangLyul Cho, Jae W. Lee:
VGA: Hardware Accelerator for Scalable Long Sequence Model Inference. 1444-1457
- Nandeeka Nayak, Xinrui Wu, Toluwanimi O. Odemuyiwa, Michael Pellauer, Joel S. Emer, Christopher W. Fletcher:
FuseMax: Leveraging Extended Einsums to Optimize Attention Accelerator Design. 1458-1473
- Zhongkai Yu, Shengwen Liang, Tianyun Ma, Yunke Cai, Ziyuan Nan, Di Huang, Xinkai Song, Yifan Hao, Jie Zhang, Tian Zhi, Yongwei Zhao, Zidong Du, Xing Hu, Qi Guo, Tianshi Chen:
Cambricon-LLM: A Chiplet-Based Hybrid Architecture for On-Device Inference of 70B LLM. 1474-1488
- Jian Chen, Congming Gao, Youyou Lu, Yuhao Zhang, Jiwu Shu:
Ares-Flash: Efficient Parallel Integer Arithmetic Operations Using NAND Flash Memory. 1489-1503
- Houxiang Ji, Srikar Vanavasam, Yang Zhou, Qirong Xia, Jinghan Huang, Yifan Yuan, Ren Wang, Pekon Gupta, Bhushan Chitlur, Ipoom Jeong, Nam Sung Kim:
Demystifying a CXL Type-2 Device: A Heterogeneous Cooperative Computing Perspective. 1504-1517
- Zhe Zhou, Yiqi Chen, Tao Zhang, Yang Wang, Ran Shu, Shuotao Xu, Peng Cheng, Lei Qu, Yongqiang Xiong, Jie Zhang, Guangyu Sun:
NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering. 1518-1531
- Junhyuk Choi, Ilkwon Byun, Juwon Hong, Dongmoon Min, Junpyo Kim, Jungmin Cho, Hyeonseong Jeong, Masamitsu Tanaka, Koji Inoue, Jangwoo Kim:
SuperCore: An Ultra-Fast Superconducting Processor for Cryogenic Applications. 1532-1547
- Guowei Yang, Sina Karimi, Carlos A. Ríos Ocampo, Ayse K. Coskun, Ajay Joshi:
SOPHIE: A Scalable Recurrent Ising Machine Using Optically Addressed Phase Change Memory. 1548-1561
- Lizhou Wu, Haozhe Zhu, Siqi He, Jiapei Zheng, Chixiao Chen, Xiaoyang Zeng:
GauSPU: 3D Gaussian Splatting Processor for Real-Time SLAM Systems. 1562-1573
- Maolin Wang, Ian McInerney, Bartolomeo Stellato, Fengbin Tu, Stephen P. Boyd, Hayden Kwok-Hay So, Kwang-Ting Cheng:
Multi-Issue Butterfly Architecture for Sparse Convex Quadratic Programming. 1574-1587
- Yiming Gao, Chao Jiang, Wesley Piard, Xiangru Chen, Bhavesh Patel, Herman Lam:
HgPCN: A Heterogeneous Architecture for E2E Embedded Point Cloud Inference. 1588-1600
- Ubaid Bakhtiar, Helya Hosseini, Bahar Asgari:
Acamar: A Dynamically Reconfigurable Scientific Computing Accelerator for Robust Convergence and Minimal Resource Underutilization. 1601-1616
- Pouya Haghi, Chunshu Wu, Zahra Azad, Yanfei Li, Andrew Gui, Yuchen Hao, Ang Li, Tony Tong Geng:
Bridging the Gap Between LLMs and LNS with Dynamic Data Format and Architecture Codesign. 1617-1631
- Orian Leitersdorf, Ronny Ronen, Shahar Kvatinsky:
PyPIM: Integrating Digital Processing-in-Memory from Microarchitectural Design to Python Tensors. 1632-1647
- Yiwei Li, Boyu Tian, Yi Ren, Mingyu Gao:
Stream-Based Data Placement for Near-Data Processing with Extended Memory. 1648-1662
- Hongrui Guo, Mo Zou, Yifan Hao, Zidong Du, Erxiang Ren, Yang Liu, Yongwei Zhao, Tianrui Ma, Rui Zhang, Xing Hu, Fei Qiao, Zhiwei Xu, Qi Guo, Tianshi Chen:
Cambricon-M: A Fibonacci-Coded Charge-Domain SRAM-Based CIM Accelerator for DNN Inference. 1663-1677
- Yihang Zhu, Lei Cai, Lianfeng Yu, Anjunyi Fan, Longhao Yan, Zhaokun Jing, Bonan Yan, Pek Jun Tiw, Yuqi Li, Yaoyu Tao, Yuchao Yang:
MeMCISA: Memristor-Enabled Memory-Centric Instruction-Set Architecture for Database Workloads. 1678-1692