Jiarui Fang
2020 – today
- 2024
- [c12] Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Jiarui Fang, Tian Zheng, Ruidong Wu, Xiwen Zhang, Jian Peng, Yang You:
FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters. PPoPP 2024: 417-430
- [d1] Qin Su, Lei Shu, Qingsong Zhao, Zitian Jiang, Jiarui Fang:
FDI-SIL. IEEE DataPort, 2024
- [i12] Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You:
AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference. CoRR abs/2401.10652 (2024)
- [i11] Jiarui Fang, Shangchun Zhao:
USP: A Unified Sequence Parallelism Approach for Long Context Generative AI. CoRR abs/2405.07719 (2024)
- [i10] Jiannan Wang, Jiarui Fang, Aoyu Li, PengCheng Yang:
PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models. CoRR abs/2405.14430 (2024)
- [i9] Diandian Gu, Peng Sun, Qinghao Hu, Ting Huang, Xun Chen, Yingtong Xiong, Guoteng Wang, Qiaoling Chen, Shangchun Zhao, Jiarui Fang, Yonggang Wen, Tianwei Zhang, Xin Jin, Xuanzhe Liu:
LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism. CoRR abs/2406.18485 (2024)
- [i8] Jiarui Fang, Jinzhe Pan, Xibo Sun, Aoyu Li, Jiannan Wang:
xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism. CoRR abs/2411.01738 (2024)
- 2023
- [j8] Jiarui Fang, Zilin Zhu, Shenggui Li, Hui Su, Yang Yu, Jie Zhou, Yang You:
Parallel Training of Pre-Trained Models via Chunk-Based Dynamic Memory Management. IEEE Trans. Parallel Distributed Syst. 34(1): 304-315 (2023)
- [c11] Shenggui Li, Hongxin Liu, Zhengda Bian, Jiarui Fang, Haichen Huang, Yuliang Liu, Boxiang Wang, Yang You:
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training. ICPP 2023: 766-775
- [i7] Yuliang Liu, Shenggui Li, Jiarui Fang, Yanjun Shao, Boyuan Yao, Yang You:
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models. CoRR abs/2302.02599 (2023)
- 2022
- [c10] Hui Su, Weiwei Shi, Xiaoyu Shen, Xiao Zhou, Tuo Ji, Jiarui Fang, Jie Zhou:
RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining. ACL (1) 2022: 921-931
- [i6] Jiarui Fang, Geng Zhang, Jiatong Han, Shenggui Li, Zhengda Bian, Yongbin Li, Jin Liu, Yang You:
A Frequency-aware Software Cache for Large Recommendation System Embeddings. CoRR abs/2208.05321 (2022)
- [i5] Jiangsu Du, Ziming Liu, Jiarui Fang, Shenggui Li, Yongbin Li, Yutong Lu, Yang You:
EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models. CoRR abs/2209.02341 (2022)
- [i4] Haichen Huang, Jiarui Fang, Hongxin Liu, Shenggui Li, Yang You:
Elixir: Train a Large Language Model on a Small GPU Cluster. CoRR abs/2212.05339 (2022)
- 2021
- [c9] Jiarui Fang, Yang Yu, Chengduo Zhao, Jie Zhou:
TurboTransformers: an efficient GPU serving system for transformer models. PPoPP 2021: 389-402
- [i3] Jiarui Fang, Yang Yu, Shenggui Li, Yang You, Jie Zhou:
PatrickStar: Parallel Training of Pre-trained Models via a Chunk-based Memory Management. CoRR abs/2108.05818 (2021)
- 2020
- [j7] Liandeng Li, Jiarui Fang, Jinlei Jiang, Lin Gan, Weijie Zheng, Haohuan Fu, Guangwen Yang:
Efficient AES implementation on Sunway TaihuLight supercomputer: A systematic approach. J. Parallel Distributed Comput. 138: 178-189 (2020)
- [i2] Jiarui Fang, Yang Yu, Chengduo Zhao, Jie Zhou:
TurboTransformers: An Efficient GPU Serving System For Transformer Models. CoRR abs/2010.05680 (2020)
2010 – 2019
- 2019
- [j6] Jiarui Fang, Haohuan Fu, Guangwen Yang, Cho-Jui Hsieh:
RedSync: Reducing synchronization bandwidth for distributed deep learning training system. J. Parallel Distributed Comput. 133: 30-39 (2019)
- [j5] Weijia Li, Conghui He, Jiarui Fang, Juepeng Zheng, Haohuan Fu, Le Yu:
Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data. Remote. Sens. 11(4): 403 (2019)
- [c8] Wei Gao, Jiarui Fang, Wenlai Zhao, Jinzhe Yang, Long Wang, Lin Gan, Haohuan Fu, Guangwen Yang:
swATOP: Automatically Optimizing Deep Learning Operators on SW26010 Many-Core Processor. ICPP 2019: 89:1-89:10
- [i1] Jiarui Fang, Liandeng Li, Haohuan Fu, Jinlei Jiang, Wenlai Zhao, Conghui He, Xin You, Guangwen Yang:
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight. CoRR abs/1903.06934 (2019)
- 2018
- [j4] Xiao Huang, Chaoqing Yu, Jiarui Fang, Guorui Huang, Shaoqiang Ni, Jim W. Hall, Conrad Zorn, Xiaomeng Huang, Wenyuan Zhang:
A dynamic agricultural prediction system for large-scale drought assessment on the Sunway TaihuLight supercomputer. Comput. Electron. Agric. 154: 400-410 (2018)
- [j3] Wenlai Zhao, Haohuan Fu, Jiarui Fang, Weijie Zheng, Lin Gan, Guangwen Yang:
Optimizing Convolutional Neural Networks on the Sunway TaihuLight Supercomputer. ACM Trans. Archit. Code Optim. 15(1): 13:1-13:26 (2018)
- [c7] Liandeng Li, Jiarui Fang, Haohuan Fu, Jinlei Jiang, Wenlai Zhao, Conghui He, Xin You, Guangwen Yang:
swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight. CLUSTER 2018: 413-422
- [c6] Weijia Li, Conghui He, Jiarui Fang, Haohuan Fu:
Semantic Segmentation Based Building Extraction Method Using Multi-Source GIS Map Datasets and Satellite Imagery. CVPR Workshops 2018: 238-241
- 2017
- [j2] Weijia Li, Haohuan Fu, Yang You, Le Yu, Jiarui Fang:
Parallel Multiclass Support Vector Machine for Remote Sensing Data Classification on Multicore and Many-Core Architectures. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 10(10): 4387-4398 (2017)
- [c5] Jiarui Fang, Haohuan Fu, Wenlai Zhao, Bingwei Chen, Weijie Zheng, Guangwen Yang:
swDNN: A Library for Accelerating Deep Learning Applications on Sunway TaihuLight. IPDPS 2017: 615-624
- [c4] Liandeng Li, Jiarui Fang, Jinlei Jiang, Lin Gan, Weijie Zheng, Haohuan Fu, Guangwen Yang:
SW-AES: Accelerating AES Algorithm on the Sunway TaihuLight. ISPA/IUCC 2017: 1204-1211
- 2016
- [c3] Jiarui Fang, Haohuan Fu, Guangwen Yang:
Cache-Friendly Design for Complex Spatially-Variable Coefficient Stencils on Many-Core Architectures. HiPC 2016: 222-231
- [c2] Haohuan Fu, Junfeng Liao, Wei Xue, Lanning Wang, Dexun Chen, Long Gu, Jinxiu Xu, Nan Ding, Xinliang Wang, Conghui He, Shizhen Xu, Yishuang Liang, Jiarui Fang, Yuanchao Xu, Weijie Zheng, Jingheng Xu, Zhen Zheng, Wanjing Wei, Xu Ji, He Zhang, Bingwei Chen, Kaiwei Li, Xiaomeng Huang, Wenguang Chen, Guangwen Yang:
Refactoring and optimizing the community atmosphere model (CAM) on the Sunway TaihuLight supercomputer. SC 2016: 969-980
- 2015
- [c1] Jiarui Fang, Haohuan Fu, He Zhang, Wei Wu, Nanxun Dai, Lin Gan, Guangwen Yang:
Optimizing Complex Spatially-Variant Coefficient Stencils for Seismic Modeling on GPU. ICPADS 2015: 641-648
- 2013
- [j1] Jiarui Fang, Lei Zhao, Jan C. Fransoo, Tom Van Woensel:
Sourcing strategies in supply risk management: An approximate dynamic programming approach. Comput. Oper. Res. 40(5): 1371-1382 (2013)
last updated on 2024-12-12 21:00 CET by the dblp team
all metadata released as open data under CC0 1.0 license