Bin Bi

About Me

I am an Applied Science Manager at Amazon in Seattle, WA, USA, leading a talented science team to build a foundation model for shopping. Prior to joining Amazon, I was a Principal Research Scientist and Manager at Alibaba DAMO Academy in Bellevue, WA, where I led a research team working on deep understanding and generation of natural language and multi-modal content. Before joining Alibaba, I was a Senior Researcher at Microsoft Research in Redmond, WA, USA. I obtained my Ph.D. degree from the Computer Science Department at the University of California, Los Angeles (UCLA) in 2015.

My interests and experience include natural language processing, text mining, information retrieval, and machine learning in general. I am currently working on various NLP problems, models, algorithms and techniques, including:
◎ Pre-training and fine-tuning of large language models (LLMs) for natural language generation and understanding
◎ Cross-modal learning; Vision-language pre-training; Cross-lingual modeling
◎ Open-domain question answering; Machine reading comprehension
◎ Deep learning NLP for text retrieval, search, and recommendation

Selected Publications

Bin Bi*, Ming Yan*, Haiyang Xu*, Chenliang Li*, Junfeng Tian*, Wei Wang, Weihua Chen, Xianzhe Xu, Fan Wang, Zheng Cao, Zhicheng Zhang, Qiyu Zhang, Ji Zhang, Songfang Huang, Fei Huang, Luo Si, and Rong Jin. Achieving Human Parity on Visual Question Answering. ACM Transactions on Information Systems (TOIS), 2023. (* denotes equal contribution) (Unprecedented superhuman performance & Rank #1 as of 4/15/2023 @ VQA Leaderboard)
Haoyu Wang, Ruirui Li, Haoming Jiang, Zhengyang Wang, Xianfeng Tang, Bin Bi, Monica Cheng, Bing Yin, Yaqing Wang, Tuo Zhao, and Jing Gao. LightToken: A Task and Model-agnostic Lightweight Token Embedding Framework for Pre-trained Language Models. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023.
Haiyang Xu, Qinghao Ye, Ming Yan, Yaya Shi, Jiabo Ye, Yuanhong Xu, Chenliang Li, Bin Bi, Qi Qian, Wei Wang, Guohai Xu, Ji Zhang, Songfang Huang, Fei Huang, and Jingren Zhou. mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video. Proceedings of the 40th International Conference on Machine Learning (ICML), 2023.
Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Ji Zhang, and Fei Huang. COPA: Efficient Vision-Language Pre-training Through Collaborative Object- and Patch-Text Alignment. Proceedings of the 31st ACM International Conference on Multimedia (MM), 2023.
Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, Hehong Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, and Luo Si. mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
Chaoya Jiang, Haiyang Xu, Chenliang Li, Ming Yan, Wei Ye, Shikun Zhang, Bin Bi, and Songfang Huang. TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
Chenliang Li, Bin Bi, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, and Luo Si. StructuralLM: Structural Pre-training for Form Understanding. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021. (Rank #1 as of 12/21/2020 @ DocVQA Leaderboard)
Haiyang Xu, Ming Yan, Chenliang Li, Bin Bi, Songfang Huang, Wenming Xiao, and Fei Huang. E2E-VLP: End-to-End Vision-Language Pre-training Enhanced by Visual Learning. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021.
Fuli Luo, Wei Wang, Jiahao Liu, Yijia Liu, Bin Bi, Songfang Huang, Fei Huang, and Luo Si. VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021. (Rank #1 as of 9/21/2021 @ XTREME Leaderboard)
Chenliang Li, Bin Bi, Ming Yan, Wei Wang, and Songfang Huang. Addressing Semantic Drift in Generative Question Answering with Auxiliary Extraction. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021.
Guohai Xu, Yan Shao, Chenliang Li, Feng-Lin Li, Bin Bi, Ji Zhang, and Haiqing Chen. AliMe DA: a Data Augmentation Framework for Question Answering in Cold-start Scenarios. Proceedings of the 44th International ACM SIGIR Conference (SIGIR), 2021.
Ming Yan, Chenliang Li, Bin Bi, Wei Wang, and Songfang Huang. A Unified Pretraining Framework for Passage Ranking and Expansion. Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI), 2021.
Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, and Luo Si. PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020. (Rank #1 @ MARCO NLG Leaderboard)
Yao Fu, Chuanqi Tan, Bin Bi, Mosha Chen, Yansong Feng, and Alexander Rush. Latent Template Induction with Gumbel-CRFs. Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS), 2020.
Bin Bi, Chen Wu, Ming Yan, Wei Wang, Jiangnan Xia, and Chenliang Li. Generating Well-formed Answers by Machine Reading with Stochastic Selector Networks. Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI), 2020.
Wei Wang, Bin Bi, Ming Yan, Chen Wu, Zuyi Bao, Jiangnan Xia, Liwei Peng, and Luo Si. StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding. Proceedings of the 8th International Conference on Learning Representations (ICLR), 2020. (Rank #1 @ GLUE Leaderboard)
Bin Bi, Chen Wu, Ming Yan, Wei Wang, Jiangnan Xia, and Chenliang Li. Incorporating External Knowledge into Machine Reading for Generative Question Answering. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019.
Ming Yan, Jiangnan Xia, Chen Wu, Bin Bi, Zhongzhou Zhao, Ji Zhang, Luo Si, Rui Wang, Wei Wang, and Haiqing Chen. A Deep Cascade Model for Multi-Document Reading Comprehension. Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI), 2019.
Bin Bi and Junghoo Cho. Modeling a Retweet Network via an Adaptive Bayesian Approach. Proceedings of the 25th International World Wide Web Conference (WWW), 2016.
Bin Bi, Hao Ma, Paul Hsu, Wei Chu, Kuansan Wang, and Junghoo Cho. Learning to Recommend Related Entities to Search Users. Proceedings of the 8th ACM International Conference on Web Search and Data Mining (WSDM), 2015.
Youngchul Cha, Keng-hao Chang, Hari Bommaganti, Ye Chen, Tak Yan, Bin Bi, and Junghoo Cho. A Universal Topic Framework (UniZ) and Its Application in Online Search. Proceedings of the 30th ACM SIGAPP Symposium On Applied Computing (SAC), 2015.
Bin Bi, Yuanyuan Tian, Yannis Sismanis, Andrey Balmin, and Junghoo Cho. Scalable Topic-Specific Influence Analysis on Microblogs. Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM), 2014.
Bin Bi, Ben Kao, Chang Wan, and Junghoo Cho. Who Are Experts Specializing in Landscape Photography? - Analyzing Topic-specific Authority on Content Sharing Services. Proceedings of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2014.
Bin Bi, Milad Shokouhi, Michal Kosinski, and Thore Graepel. Inferring the Demographics of Search Users - Social Data Meets Search Queries. Proceedings of the 22nd International World Wide Web Conference (WWW), 2013.
Bin Bi and Junghoo Cho. Automatically Generating Descriptions for Resources by Tag Modeling. Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM), 2013.
Youngchul Cha, Bin Bi, Chu-Cheng Hsieh, and Junghoo Cho. Incorporating Popularity in Topic Models for Social Network Analysis. Proceedings of the 36th International ACM SIGIR Conference (SIGIR), 2013. (Best Paper Runner-Up)
Ruirui Li, Ben Kao, Bin Bi, Reynold Cheng, and Eric Lo. DQR: A Probabilistic Approach to Diversified Query Recommendation. Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM), 2012.
Bin Bi, Sau Dan Lee, Ben Kao, and Reynold Cheng. CubeLSI: An Effective and Efficient Method for Searching Resources in Social Tagging Systems. Proceedings of the 27th IEEE International Conference on Data Engineering (ICDE), 2011.
Bin Bi, Lifeng Shang, and Ben Kao. Collaborative Resource Discovery in Social Tagging Systems. Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM), 2009.

Academic Achievements

I have been working with my team to advance the state of the art on premier academic competitions and benchmarks in NLP and multi-modal learning:

Rank	Competition/Benchmark	Solution
#1	MARCO (Announcement)	PALM & BayesQA
#1	GLUE	StructBERT
#2	CommonsenseQA	RoBERTa + KE
#1	SQuAD v1.1 (as of 11/1/2018)	SLQA+
#1	DuReader	AliReader
#1	VQA Challenge	mPLUG
#1	DocVQA (as of 12/21/2020)	StructuralLM
#1	XTREME (as of 9/21/2021)	VECO

Research Internships

Microsoft Research Redmond, WA	Ad Click Prediction, Knowledge Graph Mining	12/2012 - 3/2013, 6/2013 - 9/2013
Microsoft Research Cambridge, UK	Machine Learning, Social Network Analysis	10/2012 - 12/2012
IBM Research Almaden, CA	Large-scale Data Mining, Social Media Analysis	7/2012 - 9/2012
Microsoft Research Asia	Image Retrieval, Computer Vision	11/2007 - 6/2008

Patents

Identifying Influencers for Topics in Social Media

Inventors: Bin Bi, Andrey Balmin, John Sismanis, and Yuanyuan Tian
Patent Numbers: US9449096B2, US9864807B2

Media Coverage

Inferring the Demographics of Search Users - Social Data Meets Search Queries, by Bin Bi, Milad Shokouhi, Michal Kosinski, and Thore Graepel. Proceedings of the 22nd International World Wide Web Conference (WWW), 2013.

Facebook knows your sexuality, race and religion through ‘likes’, International Business Times, 2013.
‘Like’ curly fries on Facebook? Then you’re clever, The Telegraph, 2013.
How Facebook ‘likes’ can reveal clues to your sexuality, political beliefs and religion, Daily Mail, 2013.
Facebook ‘likes’ can reveal your secrets, study finds, CNN, 2013.
Facebook ‘likes’ used to predict personality traits, social preferences, eWEEK, 2013.

Selected Talks & Presentations

Generating Well-formed Answers by Machine Reading with Stochastic Selector Networks	AAAI, 2/2020
Open-domain Question Answering with Machine Reading Comprehension	Microsoft Research, 7/2017
Detecting and Typing Entities via CRF and Deep Neural Networks	Microsoft Research, 5/2016
Modeling a Retweet Network via an Adaptive Bayesian Approach	WWW, 4/2016
Learning to Recommend Related Entities to Search Users	WSDM, 2/2015
Bayesian Modeling for Analyzing Online Content and Users	Yahoo Labs, 1/2015
Learning to Discover High-quality Information for Web Users	Symantec Research Labs, 8/2014
Analyzing Topic-specific Authority on Content Sharing Services	KDD, 8/2014
Automatically Generating Descriptions for Resources by Tag Modeling	CIKM, 10/2013
Inferring the Demographics of Search Users - Social Data Meets Search Queries	WWW, 5/2013
Scalable Topic-Specific Influence Analysis on Microblogs	IBM Almaden Research Center, 9/2012
An Effective and Efficient Method for Searching Resources in Social Tagging Systems	ICDE, 4/2011
Collaborative Resource Discovery in Social Tagging Systems	CIKM, 11/2009