TY - GEN
T1 - Code Soliloquies for Accurate Calculations in Large Language Models
AU - Sonkar, Shashank
AU - Chen, Xinghe
AU - Le, Myco
AU - Liu, Naiming
AU - Basu Mallick, Debshila
AU - Baraniuk, Richard
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/3/18
Y1 - 2024/3/18
N2 - High-quality conversational datasets are crucial for the successful development of Intelligent Tutoring Systems (ITS) that utilize a Large Language Model (LLM) backend. Synthetic student-teacher dialogues, generated using advanced GPT-4 models, are a common strategy for creating these datasets. However, subjects like physics that entail complex calculations pose a challenge. While GPT-4 demonstrates impressive language processing capabilities, its limitations in fundamental mathematical reasoning curtail its efficacy for such subjects. To tackle this limitation, we introduce in this paper an innovative stateful prompt design. Our design orchestrates a mock conversation where both student and tutorbot roles are simulated by GPT-4. Each student response triggers an internal monologue, or 'code soliloquy', in the GPT-tutorbot, which assesses whether its subsequent response would necessitate calculations. If a calculation is deemed necessary, it scripts the relevant Python code and uses the Python output to construct a response to the student. Our approach notably enhances the quality of synthetic conversation datasets, especially for subjects that are calculation-intensive. The preliminary Subject Matter Expert evaluations reveal that our Higgs model, a fine-tuned LLaMA model, effectively uses Python for computations, which significantly enhances the accuracy and computational reliability of Higgs' responses.
AB - High-quality conversational datasets are crucial for the successful development of Intelligent Tutoring Systems (ITS) that utilize a Large Language Model (LLM) backend. Synthetic student-teacher dialogues, generated using advanced GPT-4 models, are a common strategy for creating these datasets. However, subjects like physics that entail complex calculations pose a challenge. While GPT-4 demonstrates impressive language processing capabilities, its limitations in fundamental mathematical reasoning curtail its efficacy for such subjects. To tackle this limitation, we introduce in this paper an innovative stateful prompt design. Our design orchestrates a mock conversation where both student and tutorbot roles are simulated by GPT-4. Each student response triggers an internal monologue, or 'code soliloquy', in the GPT-tutorbot, which assesses whether its subsequent response would necessitate calculations. If a calculation is deemed necessary, it scripts the relevant Python code and uses the Python output to construct a response to the student. Our approach notably enhances the quality of synthetic conversation datasets, especially for subjects that are calculation-intensive. The preliminary Subject Matter Expert evaluations reveal that our Higgs model, a fine-tuned LLaMA model, effectively uses Python for computations, which significantly enhances the accuracy and computational reliability of Higgs' responses.
UR - http://www.scopus.com/inward/record.url?scp=85187555630&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85187555630&partnerID=8YFLogxK
U2 - 10.1145/3636555.3636889
DO - 10.1145/3636555.3636889
M3 - Conference contribution
AN - SCOPUS:85187555630
T3 - ACM International Conference Proceeding Series
SP - 828
EP - 835
BT - LAK 2024 Conference Proceedings - 14th International Conference on Learning Analytics and Knowledge
PB - Association for Computing Machinery
T2 - 14th International Conference on Learning Analytics and Knowledge, LAK 2024
Y2 - 18 March 2024 through 22 March 2024
ER -