Computer Science, 2020, Vol. 47, Issue (7): 1-7. doi: 10.11896/jsjkx.200500088

• Discipline Construction •

Course Design and Redesign for Introduction to Data Science

CHAO Le-men   

  1. Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Beijing 100872, China
    School of Information Resource Management, Renmin University of China, Beijing 100872, China
  • Received: 2020-03-12 Online: 2020-07-15 Published: 2020-07-16
  • About author: CHAO Le-men, born in 1979, Ph.D., associate professor, is a member of the Technical Committee of Information System of the China Computer Federation. His main research interests include data science, big data analytics, and knowledge processing on the semantic Web.
  • Supported by:
    This work was supported by the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (20YJA870003).

Abstract: Introduction to Data Science is an essential course not only for the development of emerging majors (Data Science and Big Data Technology, Big Data Management and Application, and so on), but also for the innovation of traditional ones (Computer Science and Technology, Statistics, Information Resource Management, and so on). Course design issues for this novel course, including its objectives, contents, experiments, assessment methods, reference books, and personalized design, are discussed based on an in-depth study of typical courses offered by Columbia University, New York University, Harvard University and Renmin University of China, as well as on the author's teaching experience. The redesign of existing Introduction to Data Science courses should focus on improving target students' abilities in full-stack data science, data product development, coding for data science, and communicating with non-professional users, as well as on leveraging alternative course construction models, reflecting social needs, and highlighting the course's roadmap role in the curriculum.

Key words: Data Science, Course design, Big data

CLC Number: TP391