表格问答研究综述

doi:10.11896/jsjkx.250900006

Abstract

Abstract: Tables serve as essential data carriers,capable of efficiently storing large volumes of high-value information.They are widely used across domains such as economics,finance,scientific research,and more.Table question answering(TableQA) aims to automatically derive answers from tabular data in response to natural language queries,representing a key research direction at the intersection of natural language processing and data analysis.Compared to traditional text-based or knowledge-base question answering,TableQA presents greater challenges,as it requires not only natural language understanding but also the interpretation of two-dimensional table structures,numerical computations,and complex logical reasoning.In recent years,the continuous deve-lopment of diverse datasets has driven steady progress in TableQA research.The field has evolved from early rule-based and template-based approaches to statistical learning and neural network models,and more recently,to the integration of pre-trained language models,resulting in consistent performance improvements.Notably,the emergence of large language models(LLMs) has ushered in a new phase of development.Leveraging their strong cross-task generalization and reasoning capabilities,LLMs have accelerated innovation and fostered new research paradigms in TableQA.This paper systematically reviews the evolution and representative methods of TableQA,with a particular emphasis on recent advances enabled by LLMs.It also outlines the key challenges currently facing the field and provides a forward-looking perspective on future research directions.

Key words: Table question answering, Table reasoning, Large language model, Natural language processing

CLC Number:

TP391

WU Xianjie, LI Tongliang, LI Zhoujun. Survey of Table Question Answering Research[J].Computer Science, 2026, 53(3): 295-306.

References

[1]LU W,ZHANG J,FAN J,et al.Large language model for table processing:A survey[J].Frontiers of Computer Science,2025,19(2):192350.
[2]DONG H,ZHAO J,TIAN Y,et al.SpreadsheetLLM:encoding spreadsheets for large language models[J].arXiv:2407.09025,2024.
[3]NAVEED H,KHAN A U,QIU S,et al.A comprehensive overview of large language models[J].ACM Transactions on Intelligent Systems and Technology,2025,16(5):1-72.
[4]LI P,HE Y,YASHAR D,et al.Table-gpt:Table fine-tuned gpt for diverse table tasks[J].Proceedings of the ACM on Management of Data,2024,2(3):1-28.
[5]BADARO G,SAEED M,PAPOTTI P.Transformers for tabular data representation:A survey of models and applications[J].Transactions of the Association for Computational Linguistics,2023,11:227-249.
[6]CHENG Z,DONG H,WANG Z,et al.HiTab:A HierarchicalTable Dataset for Question Answering and Natural Language Generation[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2022:1094-1110.
[7]DONG H,CHENG Z,HE X,et al.Table Pre-training:A Survey on Model Architectures,Pre-training Objectives,and Downstream Tasks[C]//Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence.2022.
[8]GU Y,PAHUJA V,CHENG G,et al.Knowledge Base Question Answering:A Semantic Parsing Perspective[C]//AKBC.2022.
[9]NASSIRI K,AKHLOUFI M.Transformer models used for text-based question answering systems[J].Applied Intelligence,2023,53(9):10602-10635.
[10]LIU X,SHEN S,LI B,et al.A Survey of NL2SQL with Large Language Models:Where are we,and where are we going?[J].arXiv:2408.05109,2024.
[11]KATSIS Y,CHEMMENGATH S,KUMAR V,et al.AIT-QA:Question Answering Dataset over Complex Tables in the Airline Industry[C]//Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies:Industry Track.2022:305-314.
[12]SHIGAROV A.Table understanding:Problem overview[J].Wiley Interdisciplinary Reviews:Data Mining and Knowledge Discovery,2023,13(1):e1482
[13]WU X,YANG J,CHAI L,et al.Tablebench:A comprehensive and complex benchmark for table question answering[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2025:25497-25506.
[14]CHEN Z,CHEN W,SMILEY C,et al.FinQA:A Dataset of Numerical Reasoning over Financial Data[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.2021:3697-3711.
[15]PASUPAT P,LIANG P.Compositional Semantic Parsing onSemi-Structured Tables[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing(Volume 1:Long Papers).2015:1470-1480.
[16]IYYER M,YIH W,CHANG M W.Search-based neural structured learning for sequential question answering[C]//Procee-dings of the 55th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2017:1821-1831.
[17]CHEN W,WANG H,CHEN J,et al.Tabfact:A large-scaledataset for table-based fact verification[J].arXiv:1909.02164,2019.
[18]ZHONG V,XIONG C,SOCHER R.Seq2sql:Generating structured queries from natural language using reinforcement lear-ning[J].arXiv:1709.00103,2017.
[19]YU T,ZHANG R,YANG K,et al.Spider:A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Proces-sing.2018:3911-3921.
[20]LI J,HUI B,QU G,et al.Can llm already serve as a database interface? a big bench for large-scale database grounded text-to-sqls[J].Advances in Neural Information Processing Systems,2023,36:42330-42357.
[21]MIN Q,SHI Y,ZHANG Y.A Pilot Study for Chinese SQL Semantic Parsing[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing(EMNLP-IJCNLP).2019:3652-3658.
[22]PARIKH A,WANG X,GEHRMANN S,et al.ToTTo:A Controlled Table-To-Text Generation Dataset[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing(EMNLP).2020:1173-1186.
[23]CHEN W,CHANG M W,SCHLINGER E,et al.Open question answering over tables and text[J].arXiv:2010.10439,2020.
[24]NAN L,HSIEH C,MAO Z,et al.FeTaQA:Free-form tablequestion answering[J].Transactions of the Association for Computational Linguistics,2022,10:35-49.
[25]KRUMDICK M,KONCEL-KEDZIORSKI R,LAI V D,et al.Bizbench:A quantitative reasoning benchmark for business and finance[C]//Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2024:8309-8332.
[26]CHEN W,ZHA H,CHEN Z,et al.HybridQA:A Dataset of Multi-Hop Question Answering over Tabular and Textual Data[C]//Findings of the Association for Computational Linguistics:EMNLP 2020.2020:1026-1036.
[27]ZHU F,LEI W,HUANG Y,et al.TAT-QA:A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing(Volume 1:Long Papers).2021:3277-3287.
[28]ZHENG M,HAO Y,JIANG W,et al.IM-TQA:A Chinese table question answering dataset with implicit and multi-type table structures[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2023:5074-5094.
[29]ZHAO Y,LI Y,LI C,et al.MultiHiertt:Numerical Reasoningover Multi Hierarchical Tabular and Textual Data[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2022:6588-6600.
[30]ZHOU W,MESGAR M,ADEL H,et al.FREB-TQA:A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering[C]//Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies(Volume 1:Long Papers).2024:2479-2497.
[31]ASHURY-TAHAN S,MAI Y,GERA A,et al.The mightytorr:A benchmark for table reasoning and robustness[J].arXiv:2502.19412,2025.
[32]KIM Y,YIM M,SONG K Y.Tablevqa-bench:A visual question answering benchmark on multiple table domains[J].arXiv:2404.19205,2024.
[33]ZHAO W,FENG H,LIU Q,et al.Tabpedia:Towards comprehensive visual table understanding with concept synergy[J].Advances in Neural Information Processing Systems,2024,37:7185-7212.
[34]ZHENG X,BURDICK D,POPA L,et al.Global table extractor(gte):A framework for joint table identification and cell structure recognition using visual context[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision.2021:697-706.
[35]SMOCK B,PESALA R,ABRAHAM R.PubTables-1M:To-wards comprehensive table extraction from unstructured documents[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2022:4634-4642.
[36]ZHENG M,FENG X,SI Q,et al.Multimodal Table Under-standing[C]//Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2024:9102-9124.
[37]WU P,YANG Y,ZHU G,et al.RealHiTBench:A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis[J].arXiv:2506.13405,2025.
[38]SINGH A,BIEMANN C,STRICH J.MTabVQA:EvaluatingMulti-Tabular Reasoning of Language Models in Visual Space[J].arXiv:2506.11684,2025.
[39]PAPINENI K,ROUKOS S,WARD T,et al.Bleu:a method for automatic evaluation of machine translation[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.2002:311-318.
[40]CHIN-YEW L.Rouge:A package for automatic evaluation ofsummaries[C]//Proceedings of the Workshop on Text Summarization Branches Out,2004.2004.
[41]ZHANG T,KISHORE V,WU F,et al.BERTScore:Evaluating Text Generation with BERT[C]//International Conference on Learning Representations.2020.
[42]COPESTAKE A,JONES K S.Natural language interfaces todatabases[J].The Knowledge Engineering Review,1990,5(4):225-249.
[43]WOODS W A,KAPLAN R M,NASH-WEBBER B.The lunar sciences natural language information system:Final report [R].Cambridge,MA:Bolt,Beranek and Newman,Inc.,1972.
[44] ANDROUTSOPOULOS I,RITCHIE G,THANISCH P.MASQUE/SQL:an efficient and portable natural language query interface for relational databases [C]//Proceedings of the 6th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems(IEA/AIE’93).1993:327-330
[45]POPESCU A M,ARMANASU A,ETZIONI O,et al.Precise on atis:semantic tractability and experimental results[C]//AAAI.2004:1026-1027.
[46]PASUPAT P,LIANG P.Inferring Logical Forms From Denotations[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2016:23-32.
[47]XU X,LIU C,SONG D.Sqlnet:Generating structured queries from natural language without reinforcement learning[J].ar-Xiv:1711.04436,2017.
[48]ZHANG S,BALOG K.Ad hoc table retrieval using semanticsimilarity[C]//Proceedings of the 2018 World Wide Web Conference.2018:1553-1562.
[49]JAUHAR S K,TURNEY P,HOVY E.Tables as semi-structured knowledge for question answering[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2016:474-483.
[50]SUN H,MA H,HE X,et al.Table cell search for question answering[C]//Proceedings of the 25th International Conference on World Wide Web.2016:771-782.
[51]ZHANG S,TUAN L A,ZHAO C.SynTQA:Synergistic Table-based Question Answering via Mixture of Text-to-SQL and E2E TQA[C]//Findings of the Association for Computational Linguistics:EMNLP 2024.2024:2352-2364.
[52]ZHONG V,XIONG C,SOCHER R.Seq2sql:Generating structured queries from natural language using reinforcement learning[J].arXiv:1709.00103,2017.
[53]YIN P,NEUBIG G,YIH W,et al.TaBERT:Pretraining forJoint Understanding of Textual and Tabular Data[C]//Procee-dings of the 58th Annual Meeting of the Association for Computational Linguistics.2020:8413-8426.
[54]HERZIG J,NOWAK P K,MÜLLER T,et al.TaPas:Weakly Supervised Table Parsing via Pre-training[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.Association for Computational Linguistics,2020.
[55]DENG X,SUN H,LEES A,et al.Turl:Table understandingthrough representation learning[J].ACM SIGMOD Record,2022,51(1):33-40.
[56]WANG Z,DONG H,JIA R,et al.Tuta:Tree-based transfor-mers for generally structured table pre-training[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &Data Mining.2021:1780-1790.
[57]IIDA H,THAI D,MANJUNATHA V,et al.TABBIE:Pre-trained Representations of Tabular Data[C]//Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.2021:3446-3456.
[58]YU T,WU C S,LIN X V,et al.GraPPa:Grammar-Augmented Pre-Training for Table Semantic Parsing[C]//International Conference on Learning Representations.2021.
[59]LIU Q,CHEN B,GUO J,et al.TAPEX:Table Pre-training via Learning a Neural SQL Executor[C]//International Conference on Learning Representations.2022.
[60]BROWN T,MANN B,RYDER N,et al.Language models arefew-shot learners[J].Advances in Neural Information Proces-sing Systems,2020,33:1877-1901.
[61]DONG Q,LI L,DAI D,et al.A Survey on In-context Learning[C]//Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.2024:1107-1128.
[62]WANG Y Z,LI Q,DAI Z J,et al.Research status and trends of large language models [J].Journal of Engineering Science,2024,46(8):1411-1425.
[63]SAHOO P,SINGH A K,SAHA S,et al.A systematic survey of prompt engineering in large language models:Techniques and applications[J].arXiv:2402.07927,2024.
[64]CHEN W.Large Language Models are few(1)-shot Table Reasoners[C]//Findings of the Association for Computational Linguistics:EACL 2023.2023:1120-1130.
[65]SINGHA A,CAMBRONERO J,GULWANI S,et al.TabularRepresentation,Noisy Operators,and Impacts on Table Structure Understanding Tasks in LLMs[C]//NeurIPS 2023 Second Table Representation Learning Workshop.2023.
[66]GAO D,WANG H,LI Y,et al.Text-to-SQL Empowered by Large Language Models:A Benchmark Evaluation[J].Proceedings of the VLDB Endowment,2024,17(5):1132-1145.
[67]CHEN W,MA X,WANG X,et al.Program of thoughts prompting:Disentangling computation from reasoning for numerical reasoning tasks[J].arXiv:2211.12588,2022.
[68]YAO S,ZHAO J,YU D,et al.React:Synergizing reasoning and acting in language models[C]//International Conference on Learning Representations(ICLR).2023.
[69]ZHENG M,YANG H,JIANG W,et al.Chain-of-thought rea-soning in tabular language models[C]//Findings of the Association for Computational Linguistics:EMNLP 2023.2023:11006-11019.
[70]WANG Z,ZHANG H,LI C L,et al.Chain-of-Table:Evolving Tables in the Reasoning Chain for Table Understanding[C]//ICLR.2024.
[71]WEI J,WANG X,SCHUURMANS D,et al.Chain-of-thoughtprompting elicits reasoning in large language models[J].Advances in Neural Information Processing Systems,2022,35:24824-24837.
[72]LI Z,WANG Z,YANG F G,et al.Research and prospects of table-based automatic question answering [J].Journal of Computer Engineering and Applications,2021,57(13).
[73]JINZ Q,LU W.Tab-CoT:Zero-shot Tabular Chain of Thought[C]//Findings of the Association for Computational Linguistics:ACL 2023.2023:10259-10277.
[74]ZHANG Y,HENKEL J,FLORATOU A,et al.Reactable:Enhancing react for table question answering[J].arXiv:2310.00815,2023.
[75]CHENG Z,XIE T,SHI P,et al.Binding language models insymbolic languages[J].arXiv:2210.02875,2022.
[76]YE Y,HUI B,YANG M,et al.Large language models are versatile decomposers:Decomposing evidence and questions for table-based reasoning[C]//Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval.2023:174-184.
[77]NGUYEN G,BRUGERE I,SHARMA S,et al.Interpretable LLM-based Table Question Answering[J].arXiv:2412.12386v3,2024.
[78]ZHAO Y,CHEN L,COHAN A,et al.TaPERA:Enhancing faithfulness and interpretability in long-form TableQA by content planning and execution-based reasoning[C]//Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2024:12824-12840.
[79]WANG X,WEI J,SCHUURMANS D,et al.Self-consistencyimproves chain of thought reasoning in language models[J].arXiv:2203.11171,2022.
[80]LUO H,SHEN Y,DENG Y.Unifying text,tables,and images for multimodal question answering[C].Findings of the Association for Computational Linguistics:EMNLP 2023.Association for Computational Linguistics,2023.
[81]KIM G,HONG T,YIM M,et al.Ocr-free document understanding transformer[C]//European Conference on Computer Vision.Cham:Springer Nature Switzerland,2022:498-517.
[82]LEE K,JOSHI M,TURC I R,et al.Pix2struct:Screenshot parsing as pretraining for visual language understanding[C]//International Conference on Machine Learning.PMLR,2023:18893-18912.
[83]WEI X,ZHANG T,LI Y,et al.Multi-modality cross attention network for image and sentence matching[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2020:10941-10950.
[84]ZHANG M,CHANG K,WU Y.Multi-modal Semantic Understanding with Contrastive Cross-modal Feature Alignment[C]//Proceedings of the 2024 Joint International Conference on Computational Linguistics,Language Resources and Evaluation(LREC-COLING 2024).2024:11934-11943.
[85]RADFORD A,KIM J W,HALLACY C,et al.Learning transferable visual models from natural language supervision[C]//International Conference on Machine Learning.PmLR,2021:8748-8763.
[86]LIU H,LI C,WU Q,et al.Visual instruction tuning[J].Advances in Neural Information Processing Systems,2023,36:34892-34916.
[87]ZHANG Z,LIU D,LIU S,et al.Turbo your multi-modal classification with contrastive learning[J].arXiv:2409.09282,2024.
[88]LIU Z,WANG H,LI X,et al.Hippo:Enhancing the table understanding capability of large language models through hybrid-modal preference optimization[J].arXiv:2502.17315,2025.
[89]ZHOU B,GAO Z,WANG Z,et al.SynTab-LLaVA:Enhancing Multimodal Table Understanding with Decoupled Synthesis[C]//Proceedings of the Computer Vision and Pattern Recognition Conference.2025:24796-24806.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Comments

Recommended 0

No Suggested Reading articles found!

Survey of Table Question Answering Research

PDF (PC)

Abstract

Cite this article

share this article

References

Related Articles 15

Metrics

Comments

Recommended 0

[1]	WANG Zhibin, LI Shipeng, ZHOU Yuhang, LI Xue, ZHANG Zhonghui, JIANG Zhiwei, GU Rong, TIAN Chen, CHEN Guihai, ZHONG Sheng. Optimization of Service Level Objectives and System Level Metrics in Large Language ModelServing System [J]. Computer Science, 2026, 53(3): 23-32.
[2]	ZHOU Yueyuan, LU Guanze, XIANG Jiawei, ZHANG Jiawei, SHAO En, HE Xin. Training System for Large Language Models Based on Adaptive Transpose on Hygon DCU [J]. Computer Science, 2026, 53(3): 33-40.
[3]	CHEN Han, XU Zefeng, JIANG Jiu, FAN Fan, ZHANG Junjian, HE Chu, WANG Wenwei. Large Language Model and Deep Network Based Cognitive Assessment Automatic Diagnosis [J]. Computer Science, 2026, 53(3): 41-51.
[4]	QIAN Qing, CHEN Huicheng, CUI Yunhe, TANG Ruixue, FU Jinmei. Joint Entity and Relation Extraction Method with Multi-scale Collaborative Aggregation and Axial-semantic Guidance [J]. Computer Science, 2026, 53(3): 97-106.
[5]	XU Cheng, LIU Yuxuan, WANG Xin, ZHANG Cheng, YAO Dengfeng, YUAN Jiazheng. Review of Speech Disorder Assessment Methods Driven by Large Language Models [J]. Computer Science, 2026, 53(3): 307-320.
[6]	LI Wenli, FENG Xiaonian, QIAN Tieyun. Few-shot Continuous Toxicity Detection Based on Large Language Model Augmentation [J]. Computer Science, 2026, 53(3): 321-330.
[7]	CHEN Yuyin, LI Guanfeng, QIN Jing, XIAO Yuhang. Survey on Complex Logical Query Methods in Knowledge Graphs [J]. Computer Science, 2026, 53(2): 273-288.
[8]	SUN Mingxu, LIANG Gang, WU Yifei, HU Haixin. Chinese Hate Speech Detection Incorporating Hate Object Features and Variant Word Restoration Mechanism [J]. Computer Science, 2026, 53(2): 289-299.
[9]	GUO Luxiang, WANG Yueyu, LI Qianyue, LI Shasha, LIU Xiaodong, JI Bin, YU Jie. Comprehensive Survey of LLM-based Agent Operating Systems [J]. Computer Science, 2026, 53(1): 1-11.
[10]	LIU Lilong, LIU Guoming, QI Baoyuan, DENG Xueshan, XUE Dizhan, QIAN Shengsheng. Efficient Inference Techniques of Large Models in Real-world Applications:A Comprehensive Survey [J]. Computer Science, 2026, 53(1): 12-28.
[11]	SHAO Xinyi, ZHU Jingwei, ZHANG Liang. LLM-based Business Process Adaptation Method to Respond Long-tailed Changes [J]. Computer Science, 2026, 53(1): 29-38.
[12]	LIU Leyuan, CHEN Gege, WU Wei, WANG Yong, ZHOU Fan. Survey of Data Classification and Grading Studies [J]. Computer Science, 2025, 52(9): 195-211.
[13]	CAI Qihang, XU Bin, DONG Xiaodi. Knowledge Graph Completion Model Using Semantically Enhanced Prompts and Structural Information [J]. Computer Science, 2025, 52(9): 282-293.
[14]	ZHONG Boyang, RUAN Tong, ZHANG Weiyan, LIU Jingping. Collaboration of Large and Small Language Models with Iterative Reflection Framework for Clinical Note Summarization [J]. Computer Science, 2025, 52(9): 294-302.
[15]	CHENG Zhangtao, HUANG Haoran, XUE He, LIU Leyuan, ZHONG Ting, ZHOU Fan. Event Causality Identification Model Based on Prompt Learning and Hypergraph [J]. Computer Science, 2025, 52(9): 303-312.