I am an AI research scientist at Meta. Before Meta, I was a senior research scientist at Megagon Labs. I have worked on research topics in data management, database theory, and natural language processing. In particular, my recent research has focused on applying machine learning techniques to data preparation and integration tasks, including entity matching, data cleaning, data discovery, and table annotation.
Before joining Megagon, I received a PhD in Computer Science from UC San Diego (UCSD), advised by Alin Deutsch and Victor Vianu. My PhD thesis is on the verification of data-driven workflows, a research direction at the intersection of Database Theory, Software Model Checking, and Business Process Management. Before UCSD, I obtained my undergraduate degree in Computer Science from the Hong Kong University of Science and Technology.
I was an intern at Google Research in Summer 2013, at Microsoft Research in Summer 2014, and at the IBM Thomas J. Watson Research Center in Summer 2017.
- Bryan Wang, Yuliang Li, Zhaoyang Lv, Haijun Xia, Yan Xu, Raj Sodhi, "LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing", IUI 2024 [Link] [Video]
- Wang-Chiew Tan, Jane Dwivedi-Yu, Yuliang Li, Lambert Mathias, Marzieh Saeidi, Jing Nathan Yan, Alon Y. Halevy, "TimelineQA: A Benchmark for Question Answering over Timelines", ACL (Findings) 2023 [Repo]
- Wang-Chiew Tan, Yuliang Li, Pedro Rodriguez, Richard James, Xi Victoria Lin, Alon Y. Halevy, Wen-tau Yih, "Reimagining Retrieval Augmented Language Models for Answering Queries", ACL (Findings) 2023 [Link]
- Yuliang Li, Nitin Kamra, Ruta Desai, Alon Y. Halevy, "Human-Centered Planning", ArXiv 2023 [Link]
- Grace Fan, Jin Wang, Yuliang Li, Dan Zhang, Renée Miller, "Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning", in VLDB 2023 [ArXiv]
- Runhui Wang, Yuliang Li, Jin Wang, "Sudowoodo: Contrastive Self-supervised Learning for Multi-purpose Data Integration and Preparation", in ICDE 2023 [ArXiv]
- Jin Wang, Yuliang Li, "Minun: Evaluating Counterfactual Explanations for Entity Matching", in DEEM 2022 (Best paper award, co-located w. SIGMOD) [Link]
- Jin Wang, Yuliang Li, Wataru Hirota, Eser Kandogan, "Machop: an End-to-End Generalized Entity Matching Framework", in aiDM 2022 (co-located w. SIGMOD) [Link]
- Yu-Ching Hu, Yuliang Li, Hung-Wei Tseng, "TCUDB: Accelerating Database with Tensor Processors", in SIGMOD 2022 [ArXiv]
- Yoshihiko Suhara, Jinfeng Li, Yuliang Li, Dan Zhang, Cagatay Demiralp, Chen Chen, Wang-Chiew Tan, "Annotating Columns with Pre-trained Language Models", in SIGMOD 2022 [ArXiv]
- Jin Wang, Yuliang Li, Wataru Hirota, "Machamp: A Generalized Entity Matching Benchmark", In CIKM 2021 [ArXiv] [Datasets]
- Yuliang Li, Xiaolan Wang, Zhengjie Miao, Wang-Chiew Tan, "Data Augmentation for ML-driven Data Preparation and Integration", In VLDB Tutorial 2021 [Link] [Videos] [Slides]
- Zhengjie Miao, Yuliang Li, Xiaolan Wang, "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond", In SIGMOD 2021 [Link] [Blog] [Demo] [Code]
- Nofar Carmeli, Xiaolan Wang, Yoshihiko Suhara, Stefanos Angelidis, Yuliang Li, Jinfeng Li, Wang-Chiew Tan, "Constructing Explainable Opinion Graphs from Reviews", In theWebConf 2021 [ArXiv]
- Yuliang Li, Jinfeng Li, Yoshihiko Suhara, AnHai Doan, Wang-Chiew Tan, "Deep Entity Matching with Pre-Trained Language Models", In VLDB 2021 [ArXiv] [Code]
- Jinfeng Li, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan, "Deep or Simple Models for Semantic Tagging? It Depends on your Data [Experiments]", In VLDB 2020 [ArXiv]
- Xiaolan Wang, Yoshihiko Suhara, Natalie Nuno, Yuliang Li, Jinfeng Li, Nofar Carmeli, Stefanos Angelidis, Eser Kandogan and Wang-Chiew Tan, "ExtremeReader: An interactive explorer for customizable and explainable review summarization", In theWebConf (WWW) 2020 (Demo track)
- Zhengjie Miao, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan, "Snippext: Semi-supervised Opinion Mining with Augmented Data", In theWebConf (WWW) 2020 [ArXiv] [Slides] [Code]
- Xiong Zhang, Jonathan Engel, Sara Evensen, Yuliang Li, Çagatay Demiralp, Wang-Chiew Tan, "Teddy: A System for Interactive Review Analysis", In CHI 2020 [Paper] [Video] [Code]
- Yuliang Li, Aaron Xixuan Feng, Jinfeng Li, Saran Mumick, Alon Halevy, Vivian Li, Wang-Chiew Tan, "Subjective Databases", In PVLDB 2019 (invited to VLDBJ as "one of the best paper candidates"; Finalist of the Recruit Engine Forum) [ArXiv] [Slides] [Poster]
- Sara Evensen, Aaron Feng, Alon Halevy, Jinfeng Li, Vivian Li, Yuliang Li, Huining Liu, George Mihaila, John Morales, Natalie Nuno, Ekaterina Pavlovic, Wang-Chiew Tan, Xiaolan Wang, "Voyageur: An Experiential Travel Search Engine", In the Web Conference (WWW) 2019 (Demo track) [ArXiv] [Poster] [Demo]
- Yuliang Li, Jianguo Wang, Benjamin Pullman, Nuno Bandeira, Yannis Papakonstantinou, "Index-based High-dimensional Cosine Threshold Querying with Optimality Guarantees", In ICDT 2019 (Invited to the ToCS special issue collecting the best of ICDT 2019) [Link] [Full version] [Slides]
- Tara Astigarraga, Xiaoyan Chen, Yaoliang Chen, Jingxiao Gu, Richard Hull, Limei Jiao, Yuliang Li, and Petr Novotny, "Empowering Business-Level Blockchain Users with a Rules Framework for Smart Contracts", In ICSOC 2018 [Link]
- Yuliang Li, Alin Deutsch, Victor Vianu, "VERIFAS: A Practical Verifier for Artifact Systems", In PVLDB 2018 [Link] [ArXiv] [Slides] [Code]
- Yuliang Li, "Practical Verification of Hierarchical Artifact Systems," In VLDB PhD Workshop 2017 [Link] [Slides]
- Alin Deutsch, Yuliang Li, Victor Vianu, "Verification of Hierarchical Artifact Systems," In PODS 2016 [Link] [ArXiv] [Slides]
- Aaron Traylor, Chen Chen, Behzad Golshan, Xiaolan Wang, Yuliang Li, Yoshihiko Suhara, Jinfeng Li, Çagatay Demiralp, Wang-Chiew Tan, "Enhancing Review Comprehension with Domain-Specific Commonsense.", arXiv preprint, 2020 [ArXiv]
- Aaron Feng, Shuwei Chen, Yuliang Li, Hiroshi Matsuda, Hidekazu Tamaki, Wang-Chiew Tan, "Towards Productionizing Subjective Search Systems", arXiv preprint, 2020 [ArXiv]
- Yuliang Li, Alin Deutsch, Victor Vianu, "SpinArt: A Spin-based Verifier for Artifact Systems", arXiv preprint, 2017 [ArXiv] [Slides]
- Yuliang Li, Jinfeng Li, Yoshihiko Suhara, Jin Wang, Wataru Hirota, Wang-Chiew Tan, "Deep Entity Matching: Challenges and Opportunities", In JDIQ 2020 (invited paper) [Link]
- Yuliang Li, Jianguo Wang, Benjamin Pullman, Nuno Bandeira, Yannis Papakonstantinou, "Index-based High-dimensional Cosine Threshold Querying with Optimality Guarantees", In ToCS 2020 (invited paper, best of ICDT 2019) [Link]
- Yuliang Li, Aaron Xixuan Feng, Jinfeng Li, Shuwei Chen, Saran Mumick, Alon Halevy, Vivian Li, Wang-Chiew Tan, "Querying Subjective Data", In VLDB Journal 2020 (invited paper, best of VLDB 2019) [Paper] [ArXiv]
- Alin Deutsch, Yuliang Li, Victor Vianu, "Verification of Hierarchical Artifact Systems," In TODS 2019 [Link]
- Alin Deutsch, Richard Hull, Yuliang Li, Victor Vianu, “Automatic Verification of Database-Centric Systems”, ACM SIGLOG News 5, 2 (April 2018), 37-56 [Link]
- Qiong Fang, Wilfred Ng, Jianlin Feng, Yuliang Li, “Mining Order-Preserving SubMatrices from Probabilistic Matrices,” ACM Transactions on Database Systems (TODS), volume 39, issue 1, Jan. 2014 [Link]
- Qiong Fang, Wilfred Ng, Jianlin Feng, Yuliang Li, “Mining Bucket Order-Preserving SubMatrices in Gene Expression Data,” IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 24 no. 12, Dec. 2012 [Link]
- Program committee for SIGMOD 2020, SIGMOD 2022, VLDB 2022
- Reviewer for ACL 2021, EMNLP 2021
- UC San Diego ACM-ICPC graduate student coach, 2013 - 2018
- Teaching Assistant of CSE 205A, Logic in Computer Science, 2017 Spring
- Teaching Assistant of CSE 233, Database Theory, 2016 Spring, 2018 Spring
- Teaching Assistant of CSE 132A, Database System Principles, 2014 Spring
- SIGMOD 2016 Travel Award
- UCSD Jacobs PhD Fellowship (2012 - 2015)
- The HKUST Academic Achievement Medal
- May 2012, ACM/ICPC World Finals (Warsaw, Poland), 36th Place
- May 2011, ACM/ICPC World Finals (Orlando, US), Honourable Mention