Keren Zhou
GPU
Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing
Advancements in deep learning are often associated with increasing model sizes. Training and deploying large models require …
Aditya Desai, Keren Zhou, Anshumali Shrivastava
Towards Agile Development of Efficient Deep Learning Operators (Hardware Insights)
Presented a talk about Triton and requested feedback from Intel engineers
Jun 29, 2023 10:56 PM — 10:56 PM
Virtual
Keren Zhou
Towards Agile Development of Efficient Deep Learning Operators (Call for Contributions)
Presented a talk about Triton and called for contributions to improving the language
Jun 19, 2023 10:56 PM — 10:56 PM
Lake Tahoe, California
Keren Zhou
DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications
GPUs are widely used in today’s computing platforms to accelerate applications in various domains. However, scarce GPU memory resources …
Mao Lin, Keren Zhou, Pengfei Su
Towards Agile Development of Efficient Deep Learning Operators (Pre-MLIR)
Presented the Triton programming language and its next steps
Dec 2, 2022 10:03 PM — 10:03 PM
Virtual
Keren Zhou
Practical Performance Optimization for Deep Learning Applications
Presented the Triton programming language and a deep learning profiler
May 18, 2022 10:02 PM — 10:02 PM
Virtual
Keren Zhou
ValueExpert: Exploring Value Patterns in GPU-accelerated Applications
Presented a talk about our value profiling tool at ASPLOS'22
Mar 2, 2022 12:00 AM — 12:00 AM
Virtual
Keren Zhou
Accelerating High-order Stencils on GPUs
Finite-difference methods based on high-order stencils are commonly used for modeling of seismic wave propagation, weather forecasting, …
Ryuichi Sai, John Mellor-Crummey, Xiaozhu Meng, Keren Zhou, Mauricio Araya-Polo, Jie Meng
An Automated Tool for Analysis and Tuning of GPU-Accelerated Code in HPC Applications
The US Department of Energy’s fastest supercomputers and forthcoming exascale systems employ Graphics Processing Units (GPUs) to …
Keren Zhou, Xiaozhu Meng, Ryuichi Sai, Dejan Grubisic, John Mellor-Crummey
Low Overhead and Context Sensitive Profiling of GPU-Accelerated Applications
As we near the end of Moore’s law scaling, the next-generation computing platforms are increasingly exploring heterogeneous …
Keren Zhou, Jonathon Anderson, Xiaozhu Meng, John Mellor-Crummey