WANG, Sheng

Research Scientist

Database and Storage Lab
Alibaba Cloud Intelligence, Alibaba Group

Office: Lazada One, 51 Bras Basah Road, Singapore 189554
Email: sh.wang AT alibaba-inc.com

We are hiring researchers/engineers @ several locations!
Interns, postdocs and visiting professors are welcome!

About Me

I am a Research Scientist/Director in the Database and Storage Lab at Alibaba Cloud. Prior to joining Alibaba, I was a Research Fellow in the DB System Group at the National University of Singapore. I obtained my Ph.D. in Computer Science from the National University of Singapore, and my B.S. in Computer Science and Technology from Harbin Institute of Technology, China.

Our Database and Storage Lab conducts research on the latest challenges for database and storage systems in the cloud era. To support both online services within Alibaba Group and enterprise customers on Alibaba Cloud, we are working towards building reliable, secure, intelligent, performant and globally distributed database systems and services.

Research Interests

I am interested in solving research problems and building practical systems for real-world applications:

  • Cloud-Native and Distributed Database Architectures
  • Encrypted, Verifiable and Trustworthy Databases
  • Confidential and Privacy-Preserving Computing
  • Data Storage and Indexing
  • Data Analytics Systems
  • Blockchain Systems

Selected Publications

  • Object-oriented Unified Encrypted Memory Management for Heterogeneous Memory Architectures, by M. Sha, Y. Cai, S. Wang, L.T.X. Phan, F. Li, K.-L. Tan. In Proceedings of the 43rd ACM SIGMOD International Conference on Management of Data (SIGMOD 2024), pages TBD, Santiago, Chile, June 2024.
  • TEE-based General-purpose Computational Backend for Secure Delegated Data Processing, by M. Sha, J. Li, S. Wang, F. Li, K.-L. Tan. In Proceedings of the 43rd ACM SIGMOD International Conference on Management of Data (SIGMOD 2024), Article No. 263, pages 1-28, Santiago, Chile, June 2024.
  • Secure Sampling for Approximate Multi-party Query Processing, by Q. Luo, Y. Wang, K. Yi, S. Wang, F. Li. In Proceedings of the 43rd ACM SIGMOD International Conference on Management of Data (SIGMOD 2024), Article No. 219, pages 1-27, Santiago, Chile, June 2024.
  • Lindorm TSDB: A Cloud-native Time-series Database for Large-scale Monitoring Systems, by C. Shen, Q. Ouyang, F. Li, Z. Liu, L. Zhu, Y. Zhou, Q. Su, T. Yu, Y. Yi, J. Hu, C. Zheng, B. Wen, H. Zheng, L. Xu, S. Pan, B. Wu, X. He, Y. Li, J. Tan, S. Wang, D. Pei, W. Zhang, F. Li. In Proceedings of the 49th International Conference on Very Large Data Bases (VLDB 2023), pages 3715-3727, Vancouver, Canada, August 2023.
  • HEDA: Multi-Attribute Unbounded Aggregation over Homomorphically Encrypted Database, by X. Ren, L. Su, S. Bian, Z. Gu, S. Wang, F. Li, C. Li, F. Zhang, Y. Xie. In Proceedings of the 49th International Conference on Very Large Data Bases (VLDB 2023), pages 601-614, Vancouver, Canada, August 2023.
  • Encrypted Databases Made Secure Yet Maintainable, by M. Li, X. Zhao, L. Chen, C. Tan, H. Li, S. Wang, Z. Mi, Y. Xia, F. Li, H. Chen. In Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2023), pages 117-133, Boston, USA, July 2023.
  • The RLR-Tree: A Reinforcement Learning Based R-Tree for Spatial Data, by T. Gu, K. Feng, G. Cong, C. Long, Z. Wang, S. Wang. In Proceedings of the 42nd ACM SIGMOD International Conference on Management of Data (SIGMOD 2023), Article No. 63, pages 1-26, Seattle, USA, June 2023.
  • EulerFD: An Efficient Double-Cycle Approximation of Functional Dependencies, by Q. Lin, Y. Gu, J. Sai, J. Liu, K. Ren, L. Xiong, T. Wang, Y. Pang, S. Wang, F. Li. In Proceedings of the 39th IEEE International Conference on Data Engineering (ICDE 2023), pages 2878-2891, Anaheim, USA, April 2023.
  • Scaling Blockchain Consensus via a Robust Shared Mempool, by F. Gai, J. Niu, I. Beschastnikh, C. Feng, S. Wang. In Proceedings of the 39th IEEE International Conference on Data Engineering (ICDE 2023), pages 530-543, Anaheim, USA, April 2023.
  • CloudJump: Optimizing Cloud Databases for Cloud Storages, by Z. Chen, X. Yang, F. Li, X. Cheng, Q. Hu, Z. Miao, R. Xie, X. Wu, K. Wang, Z. Song, H. Sun, Z. Zhuang, Y. Yang, J. Xu, L. Yin, W. Zhou, S. Wang. In Proceedings of the 48th International Conference on Very Large Data Bases (VLDB 2022), pages 3432-3444, Sydney, Australia, September 2022.
  • VRE: A Versatile, Robust, and Economical Trajectory Data System, by H. Lan, J. Xie, Z. Bao, F. Li, W. Tian, F. Wang, S. Wang, A. Zhang. In Proceedings of the 48th International Conference on Very Large Data Bases (VLDB 2022), pages 3398-3410, Sydney, Australia, September 2022.
  • Tair-PMem: A Fully Durable Non-Volatile Memory Database, by C. Gong, C. Tian, Z. Wang, S. Wang, X. Wang, Q. Fu, W. Qin, L. Qian, R. Chen, J. Qi, R. Wang, G. Zhu, C. Yang, W. Zhang, F. Li. In Proceedings of the 48th International Conference on Very Large Data Bases (VLDB 2022), pages 3346-3358, Sydney, Australia, September 2022.
  • Operon: An Encrypted Database for Ownership-Preserving Data Management, by S. Wang, Y. Li, H. Li, F. Li, C. Tian, L. Su, Y. Zhang, Y. Ma, L. Yan, Y. Sun, X. Cheng, X. Xie, Y. Zou. In Proceedings of the 48th International Conference on Very Large Data Bases (VLDB 2022), pages 3332-3345, Sydney, Australia, September 2022.
  • ESDB: Processing Extremely Skewed Workloads in Real-time, by J. Zhang, S. Cheng, Z. Xue, J. Deng, C. Fu, W. Zhou, S. Wang, C. Chen, F. Li. In Proceedings of the 41st ACM SIGMOD International Conference on Management of Data (SIGMOD 2022), pages 2286-2298, Philadelphia, USA, June 2022.
  • Towards Practical Oblivious Join, by Z. Chang, D. Xie, S. Wang, F. Li. In Proceedings of the 41st ACM SIGMOD International Conference on Management of Data (SIGMOD 2022), pages 803-817, Philadelphia, USA, June 2022.
  • PolarDB-X: An Elastic Distributed Relational Database for Cloud-Native Applications, by W. Cao, F. Li, G. Huang, J. Lou, J. Zhao, D. He, M. Sun, Y. Zhang, S. Wang, X. Wu, H. Liao, Z. Chen, X. Fang, M. Chen, C. Liang, Y. Luo, H. Wang, S. Wang, Z. Ma, J. Yang, X. Peng, Y. Ruan, Y. Wang, J. Zhou, J. Wang, Q. Hu, J. Kang. In Proceedings of the 38th IEEE International Conference on Data Engineering (ICDE 2022), pages 2859-2872, Kuala Lumpur, Malaysia, May 2022.
  • Ubiquitous Verification in Centralized Ledger Database, by X. Yang, S. Wang, F. Li, Y. Zhang, W. Yan, F. Gai, B. Yu, L. Feng, Q. Gao, Y. Li. In Proceedings of the 38th IEEE International Conference on Data Engineering (ICDE 2022), pages 1808-1821, Kuala Lumpur, Malaysia, May 2022.
  • ROVEC: Runtime Optimization of Vectorized Expression Evaluation for Column Store, by M. Li, Z. Miao, D. Wu, F. Li, S. Wang, W. Cao, Z. Qiao, Y. Ruan, Y. Liang, J. Yang, H. Dai, G. Chen. IEEE Transactions on Knowledge and Data Engineering (TKDE 2021), pages 3045-3058, November 2021.
  • Learning Multi-context Aware Location Representations from Large-scale Geotagged Images, by Y. Yin, Y. Zhang, Z. Liu, Y. Liang, S. Wang, R. Shah, R. Zimmermann. In Proceedings of the 29th ACM International Conference on Multimedia (ACM MM 2021), pages 899-907, Chengdu, China, October 2021.
  • Building Enclave-Native Storage Engines for Practical Encrypted Databases, by Y. Sun, S. Wang, H. Li, F. Li. In Proceedings of the 47th International Conference on Very Large Data Bases (VLDB 2021), pages 1019-1032, Copenhagen, Denmark, August 2021.
  • PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers, by W. Cao, Y. Zhang, X. Yang, F. Li, S. Wang, Q. Hu, X. Cheng, Z. Chen, Z. Liu, J. Fang, B. Wang, Y. Wang, H. Sun, Z. Yang, Z. Cheng, S. Chen, J. Wu, W. Hu, J. Zhao, Y. Gao, S. Cai, Y. Zhang, J. Tong. In Proceedings of the 40th ACM SIGMOD International Conference on Management of Data (SIGMOD 2021), pages 2477-2489, Xi'an, China, June 2021.
  • VeriDB: An SGX-based Verifiable Database, by W. Zhou, Y. Cai, Y. Peng, S. Wang, K. Ma, F. Li. In Proceedings of the 40th ACM SIGMOD International Conference on Management of Data (SIGMOD 2021), pages 2182-2194, Xi'an, China, June 2021.
  • HybrIDX: New Hybrid Index for Volume-hiding Range Queries in Data Outsourcing Services, by K. Ren, Y. Guo, J. Li, X. Jia, C. Wang, Y. Zhou, S. Wang, N. Cao, F. Li. In Proceedings of the 40th IEEE International Conference on Distributed Computing Systems (ICDCS 2020, Best Paper Award), pages 23-33, Singapore, November 2020.
  • AnalyticDB-V: A Hybrid Analytical Engine Towards Query Fusion for Structured and Unstructured Data, by C. Wei, B. Wu, S. Wang, R. Lou, C. Zhan, F. Li, Y. Cai. In Proceedings of the 46th International Conference on Very Large Data Bases (VLDB 2020), pages 3152-3165, Tokyo, Japan, August 2020.
  • LedgerDB: A Centralized Ledger Database for Universal Audit and Verification, by X. Yang, Y. Zhang, S. Wang, B. Yu, F. Li, Y. Li, W. Yan. In Proceedings of the 46th International Conference on Very Large Data Bases (VLDB 2020), pages 3138-3151, Tokyo, Japan, August 2020.
  • Diagnosing Root Causes of Intermittent Slow Queries in Cloud Databases, by M. Ma, Z. Yin, S. Zhang, S. Wang, C. Zheng, X. Jiang, H. Hu, C. Luo, Y. Li, N. Qiu, F. Li, C. Chen, D. Pei. In Proceedings of the 46th International Conference on Very Large Data Bases (VLDB 2020), pages 1176-1189, Tokyo, Japan, August 2020.
  • Analysis of Indexing Structures for Immutable Data, by C. Yue, Z. Xie, M. Zhang, G. Chen, B.C. Ooi, S. Wang, X. Xiao. In Proceedings of the 39th ACM SIGMOD International Conference on Management of Data (SIGMOD 2020), pages 925-935, Portland, USA, June 2020.
  • Timon: A Timestamped Event Database for Efficient Telemetry Data Processing and Analytics, by W. Cao, Y. Gao, F. Li, S. Wang, B. Lin, K. Xu, X. Feng, Y. Wang, Z. Liu, G. Zhang. In Proceedings of the 39th ACM SIGMOD International Conference on Management of Data (SIGMOD 2020), pages 739-753, Portland, USA, June 2020.
  • Two-Level Data Compression using Machine Learning in Time Series Database, by X. Yu, Y. Peng, F. Li, S. Wang, X. Shen, H. Mai, Y. Xie. In Proceedings of the 36th IEEE International Conference on Data Engineering (ICDE 2020), pages 1333-1344, Dallas, USA, April 2020.
  • HotRing: A Hotspot-Aware In-Memory Key-Value Store, by J. Chen, L. Chen, S. Wang, G. Zhu, Y. Sun, H. Liu, F. Li. In Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST 2020), pages 239-252, Santa Clara, USA, February 2020.
  • AnalyticDB: Real-time OLAP Database System at Alibaba Cloud, by C. Zhan, M. Su, C. Wei, X. Peng, L. Lin, S. Wang, Z. Chen, F. Li, Y. Pan, F. Zheng, C. Chai. In Proceedings of the 45th International Conference on Very Large Data Bases (VLDB 2019), pages 2059-2070, Los Angeles, USA, August 2019.
  • Rafiki: Machine Learning as an Analytics Service System, by W. Wang, J. Gao, M. Zhang, S. Wang, G. Chen, T.K. Ng, B.C. Ooi, J. Shao. In Proceedings of the 45th International Conference on Very Large Data Bases (VLDB 2019), pages 128-140, Los Angeles, USA, August 2019.
  • X-Engine: An Optimized Storage Engine for Large-Scale E-Commerce Transaction Processing, by G. Huang, X. Cheng, J. Wang, Y. Wang, D. He, T. Zhang, F. Li, S. Wang, W. Cao, Q. Li. In Proceedings of the 38th ACM SIGMOD International Conference on Management of Data (SIGMOD 2019), pages 651-665, Amsterdam, Netherlands, June 2019.
  • GPS2Vec: Towards Generating Worldwide GPS Embeddings, by Y. Yin, Z. Liu, Y. Zhang, S. Wang, R.R. Shah, R. Zimmermann. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2019), pages 416-419, Chicago, USA, November 2019.
  • Efficient Distributed Memory Management with RDMA and Caching, by Q. Cai, W. Guo, H. Zhang, D. Agrawal, G. Chen, B.C. Ooi, K.-L. Tan, Y.M. Teo, S. Wang. In Proceedings of the 44th International Conference on Very Large Data Bases (VLDB 2018), pages 1604-1617, Rio de Janeiro, Brazil, August 2018.
  • ForkBase: An Efficient Storage Engine for Blockchain and Forkable Applications, by S. Wang, A. Dinh, Q. Lin, Z. Xie, M. Zhang, Q. Cai, G. Chen, B.C. Ooi, P. Ruan. In Proceedings of the 44th International Conference on Very Large Data Bases (VLDB 2018), pages 1137-1150, Rio de Janeiro, Brazil, August 2018.
  • Fast and Adaptive Indexing of Multi-Dimensional Observational Data, by S. Wang, D. Maier, B.C. Ooi. In Proceedings of the 42nd International Conference on Very Large Data Bases (VLDB 2016), pages 1683-1694, New Delhi, India, September 2016.
  • Deep Learning at Scale and at Ease, by W. Wang, G. Chen, H. Chen, A. Dinh, J. Gao, B.C. Ooi, K.-L. Tan, S. Wang, M. Zhang. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP 2016), pages 69:1-69:25.
  • SINGA: A Distributed Deep Learning Platform, by B.C. Ooi, K.-L. Tan, S. Wang, W. Wang, Q. Cai, G. Chen, J. Gao, Z. Luo, A.K.H. Tung, Y. Wang, Z. Xie, M. Zhang, K. Zheng. In Proceedings of the 23rd ACM International Conference on Multimedia (ACM MM 2015 Open Source Software Competition), pages 685-688, Brisbane, Australia, October 2015.
  • SINGA: Putting Deep Learning in the Hands of Multimedia Users, by W. Wang, G. Chen, A. Dinh, J. Gao, B.C. Ooi, K.-L. Tan, S. Wang. In Proceedings of the 23rd ACM International Conference on Multimedia (ACM MM 2015, Best Paper Award Runner-up), pages 25-34, Brisbane, Australia, October 2015.
  • Selective Hashing: Closing the Gap between Radius Search and k-NN Search, by J. Gao, H.V. Jagadish, B.C. Ooi, S. Wang. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD 2015), pages 349-358, Sydney, Australia, August 2015.
  • Lightweight Indexing of Observational Data in Log Structured Storage, by S. Wang, D. Maier, B.C. Ooi. In Proceedings of the 40th International Conference on Very Large Data Bases (VLDB 2014), pages 529-540, Hangzhou, China, September 2014.
  • K-Anonymity for Crowdsourcing Database, by S. Wu, X. Wang, S. Wang, Z. Zhang, A.K.H. Tung. IEEE Transactions on Knowledge and Data Engineering (TKDE 2013), pages 2207-2221, June 2013.
  • LogBase: A Scalable Log-Structured Database System in the Cloud, by H.T. Vo, S. Wang, D. Agrawal, G. Chen, B.C. Ooi. In Proceedings of the 38th International Conference on Very Large Data Bases (VLDB 2012), pages 1004-1015, Istanbul, Turkey, August 2012.

(Check Google Scholar or DBLP for the full publication list)

Some Honors and Awards

  • (2021) DAMO Award (Individual Award), DAMO Academy, Alibaba Group
  • (2020) IEEE ICDCS 2020 Best Paper Award
  • (2016) Dean's Graduate Research Excellence Award, National University of Singapore
  • (2015 - present) Apache SINGA Project Management Committee Member (Founding Member, Southeast Asia's first Apache TLP)
  • (2015) ACM Multimedia 2015 Best Paper Award Runner-up
  • (2011) ACM/ICPC Programming Contest World Finalist (27th/105, HIT's highest ranking to date)
  • (2010) Gold Medal (5th/139), ACM/ICPC Asian Regional Contest, Fuzhou Site
  • (2010) Champion (1st/125), Northeast China Collegiate Programming Contest
  • (2009) National Scholarship, China Ministry of Education