About Me

Hyunsik Choi

Hyunsik Choi, Ph.D.
Apache Tajo V.P. and Director of Research at Gruter
SF Bay Area, California, USA
Email: hyunsik.choi at gmail dot com, hyunsik at apache dot org
Twitter: @hyunsik_choi
LinkedIn: http://www.linkedin.com/in/hyunsikchoi





He is a director of research at Gruter, a big data platform startup located in Palo Alto, CA. He is a co-founder of the Apache Tajo project, an open source data warehouse system and an Apache top-level project. Since 2013, he has been a full-time contributor to Tajo. He obtained a Ph.D. from Korea University in 2013 under the supervision of Prof. Yon Dohn Chung.

Recent Research Interests

  • Database Systems
  • Distributed and large-scale systems
  • Query processing using modern hardware
  • Just-in-time query compilation techniques
  • Cost-based optimization


  • Member of the Apache Software Foundation, 2014.03 – current.
  • Apache Tajo PMC Chair, 2014.03 – current.
  • Apache PMC member and committer – Apache Tajo: an open source data warehouse system on Hadoop, 2013.03 – current
  • Apache PMC member and committer – Apache Giraph: a fault-tolerant in-memory distributed graph processing system, 2011.08 – current
  • Architect and lead developer – Tajo (the predecessor of Apache Tajo), 2010.07 – 2012.06

Talks and Conferences

  • An Evaluation of Alternative Shared-nothing Architecture for Analytical Processing Systems, 2015 IEEE International Conference on Big Data, Santa Clara, USA, October 29 – November 1, 2015.
  • What’s New Tajo 0.10 and Beyond, Big Data Day LA 2015. 06 (Slide)
  • Efficient In-situ Processing of Various Storages on Apache Tajo, Hadoop Summit North America 2015. 06 (Slide)
  • Apache Tajo: A Big Data Warehouse System on Hadoop, LA Big Data Camp, 2014. 06. (Slide)
  • Query Optimization and JIT-based Vectorized Engine in Apache Tajo, Hadoop Summit North America 2014. 06. (Slide)
  • SQL-on-Hadoop and Tajo, Tech Planet, 2013. 11.14. (Slide in Korean)
  • Introduction to Apache Tajo, Bay Area Hadoop User Group, 2013. 11. 05. (Slide)
  • Introduction to Apache Tajo, Advanced Computing Conference (ACC), 2013. 04. 17.

Publications (Google Scholar Profile)

  • Hyunsik Choi, Yong In Lee, Jongyoung Park, Kangho Roh, Kwanghyun La, An Evaluation of Alternative Shared-nothing Architecture for Analytical Processing Systems, 2015 IEEE International Conference on Big Data, Santa Clara, USA, October 29 – November 1, 2015. (to appear)
  • Hyunsik Choi, Jihoon Son, Haemi Yang, Hyoseok Ryu, and Yon Dohn Chung, Tajo: A Distributed Data Warehouse System on Large Clusters (demo), 29th IEEE International Conference on Data Engineering (ICDE), Brisbane, Australia, April 8-12, 2013.
  • Hyunsik Choi, Ki Yong Lee, and Yon Dohn Chung, Skyline queries on keyword-matched data, Information Sciences, Vol. 232, pp. 449-463, 2013. (Impact Factor: 2.833)
  • Jihoon Son, Hyunsik Choi, and Yon Dohn Chung, Skew-Tolerant Key Distribution for Load Balancing in MapReduce, IEICE Trans. IS, E95-D(2), pp. 677-680, Feb. 2012.
  • Kyong-Ha Lee, Hyunsik Choi, Bongki Moon, Yoon-Joon Lee, and Yon Dohn Chung, Parallel Data Processing with MapReduce: A Survey, ACM SIGMOD Record 40(4): 11-20, 2011.
  • Hyunsik Choi, Jihoon Son, YongHyun Cho, Min Kyoung Sung, and Yon Dohn Chung, SPIDER: A System for Scalable, Parallel / Distributed Evaluation of large-scale RDF Data (demo), the 18th ACM Conference on Information and Knowledge Management (CIKM), Hong Kong, November 2-6, 2009.
  • HaRim Jung, Hyunsik Choi, and Yon Dohn Chung, Generalized Spatial Queries in the Wireless Data Broadcasting System, International Conference on Mobile Data Management Systems, Services and Middleware (MDM), Taipei, Taiwan, May 18-20, 2009.

Other Activities

  • Mentor, Google Summer of Code 2013