
Yue Dai
Founder and software engineer building multimodal biology datasets
San Francisco, California
Summary
Founder and entrepreneur in biotech AI: Yue co-founded Strand AI (listed on Y Combinator) to build curated multimodal datasets and data-generation and imputation models for biology. The company aims to help life-sciences teams impute missing patient modalities and accelerate biomarker discovery.
Software engineer with diverse industry experience: Public summaries and listings show that Yue has held software engineering roles at Enable Medicine, Microsoft Research, AWS, Element AI, Microsoft, Lightspeed HQ, and Nuance, and more recently worked as a software engineer at Pathos.
Active academic and author in multimodal and NLP research: Yue has authored papers on multimodal persuasiveness (ImageArg) and long-document understanding (ChuLo), and maintains academic profiles on Google Scholar and the ACL Anthology.
Student leader and product-focused contributor during university: While at McGill, Yue co-founded or held a leadership role at Hack4Impact McGill and served as product manager on the MealCare project, reflecting engagement in student-run engineering and social-impact projects.
Writing
ChuLo: Chunk-Level Key Information Representation for Long Document Understanding
January 1, 2025
Presents ChuLo, a chunk representation method for long-document understanding that groups input tokens using unsupervised keyphrase extraction to retain semantically important content while reducing input length for transformer-based models.
ImageArg: A Multi-modal Tweet Dataset for Image Persuasiveness Mining
January 1, 2022
Introduces ImageArg, a multi-modal dataset of tweets annotated for image persuasiveness from an argumentative perspective, and benchmarks multimodal persuasiveness tasks.