
Pranjal Shankhdhar
Member of Technical Staff at xAI, ex-OpenAI and Meta, HPC specialist
San Francisco Bay Area
Summary
Pranjal is a high-performance computing (HPC) and distributed systems expert. This shows in his work at xAI and in his detailed worklog on optimizing CUDA matrix multiplication kernels for H100 GPUs, where his hand-written kernels outperform cuBLAS.
He has extensive experience in large-scale data infrastructure and SQL analytics from his time at Meta (Facebook), where he contributed to the Presto distributed SQL engine and co-authored papers on its analytics capabilities and its history-based query optimizer.
A highly skilled competitive programmer, Pranjal was a two-time ACM ICPC World Finalist during his dual-degree program at IIT Kharagpur, showcasing strong algorithmic and problem-solving abilities.
Work
Education
Writing
Outperforming cuBLAS on H100: a Worklog
November 1, 2024: A personal blog post detailing the process of writing CUDA matmul kernels from scratch and optimizing them for NVIDIA H100 GPUs, achieving significant performance gains.
Presto's History-Based Query Optimizer
January 1, 2024: Co-authored paper presenting a novel history-based query optimizer for Presto, designed to improve query performance and resource utilization in large-scale data warehousing environments.
Presto: A Decade of SQL Analytics at Meta
January 1, 2023: Co-authored paper discussing the evolution, architecture, and impact of Presto, an open-source distributed SQL query engine, highlighting its role in Meta's data analysis infrastructure.