Computer Science ›› 2022, Vol. 49 ›› Issue (4): 80-87. doi: 10.11896/jsjkx.211100014

• Special Issue of Social Computing Based Interdisciplinary Integration •

Big Data-driven Based Socioeconomic Status Analysis: A Survey

YAO Xiao-ming1,2, DING Shi-chang3, ZHAO Tao4, HUANG Hong5, LUO Jar-der6, FU Xiao-ming1   

    1 Institute of Computer Science, University of Goettingen, Goettingen 37077, Germany;
    2 Cloud Branch Big Data Department, China Telecom Co. Ltd, Beijing 100033, China;
    3 School of Cyberspace Security, State Key Laboratory of Mathematical Engineering & Advanced Computing, Zhengzhou 276800, China;
    4 College of Advanced Interdisciplinary Studies, National University of Defense Technology, Changsha 410073, China;
    5 College of Computer Science and Technology, Huazhong University of Science & Technology, Wuhan 430074, China;
    6 Department of Sociology, Tsinghua University, Beijing 100084, China
  • Received: 2021-10-29 Revised: 2022-02-16 Published: 2022-04-01
  • About author: YAO Xiao-ming, born in 1970, is technical director of the big data unit at the Cloud Branch, China Telecom. His main research interests include smart cities, mobile big data and data mining. FU Xiao-ming, born in 1973, Ph.D, professor, IEEE fellow, IET fellow, ACM distinguished scientist, is a member of Academia Europaea. His main research interests include networked systems, cloud computing and big data analytics.
  • Supported by:
    This work was supported by the European Union's Horizon 2020 Research and Innovation Programme under the Marie Skłodowska-Curie Grant Agreement (824019) and the Chinese National Key R&D Program (2020YFE0200500).

Abstract: Socioeconomic status (SES), an overall measure of a person's economic and social standing relative to others that combines economic and sociological factors, has received considerable attention from researchers, because assessing it can help organizations make policies and decisions (e.g., government formulation of social policies, advertising of personalized services). In addition, with the development of big data technology and machine learning in recent years, assessing people's socioeconomic attributes (SEAs) and then deriving the corresponding socioeconomic status with a data-driven approach can address the extremely high cost of traditional methods. This paper therefore summarizes recent research progress in applying big data techniques to socioeconomic status analysis. It first introduces the basic concept of socioeconomic status and discusses the challenges that big data methods pose compared with traditional methods. It then systematically summarizes and classifies state-of-the-art methods based on the information used in the learning process, presents them in detail, and discusses the pros and cons of each type of method. Finally, it discusses the challenges and open problems of inferring people's socioeconomic status and provides an outlook on future research directions.

Key words: Data mining, Deep learning, Machine learning, Social media, Socioeconomic status

CLC Number: 

  • TP391