The problem statement can be found at https://challenge.zalo.ai/portal/elementary-maths-solving
Our contributions are as follows: 😍
- Creating a rule-based algorithm to pass basic test cases
- Collecting and augmenting Vietnamese elementary mathematics data from the internet
- Training an LLM
- Applying the RAG technique
- Implementing a re-evaluation method for inference
Using regex and numexpr, we were able to calculate and derive answers directly, so inference does not rely solely on the LLM.
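The rule-based step can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function names, the regex, and the `"A. 24"` choice format are assumptions, and Python's standard `ast` module stands in for numexpr so the example has no dependencies.

```python
import ast
import operator
import re

# Binary operators the evaluator accepts; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expr):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval"))


# Hypothetical pattern: a run of digits, operators, and parentheses.
_EXPR = re.compile(r"[\d(][\d\s.+\-*/()×÷]*[\d)]")


def solve_by_rule(question, choices):
    """Return the choice whose number equals the expression in the question.

    Returns None when no rule applies, so the caller can fall back to the LLM.
    """
    for match in _EXPR.finditer(question):
        expr = match.group(0).replace("×", "*").replace("÷", "/")
        if not re.search(r"[+\-*/]", expr):
            continue  # a bare number, not a calculation
        try:
            value = safe_eval(expr)
        except (ValueError, SyntaxError, ZeroDivisionError):
            continue
        for choice in choices:
            # Assumed choice format: "A. 24"; compare the last number found.
            numbers = re.findall(r"-?\d+(?:\.\d+)?", choice)
            if numbers and abs(float(numbers[-1]) - value) < 1e-9:
                return choice
    return None
```

For example, `solve_by_rule("What is 12 + 3 × 4?", ["A. 60", "B. 24", "C. 27", "D. 19"])` picks `"B. 24"`, while a question without an arithmetic expression returns `None` and would be passed on to the LLM.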
The given dataset contains approximately 1,200 training examples, half of which include an explanation field. We therefore collected more multiple-choice data from VietJack. Furthermore, we augmented the data by calling the GPT-4 API to fill in the missing explanations.
We also created data programmatically for some types of math problems (including basic calculation). To diversify our dataset, we translated well-known public datasets from Hugging Face 🤗.
Note that our dataset contains samples not only in multiple-choice format but also in question-answering format.
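As an illustration of programmatic generation for basic calculation, here is a minimal sketch. The field names (`question`, `choices`, `explanation`, `answer`), the number ranges, and the distractor offsets are all assumptions made for this example, not the actual generator.

```python
import random


def make_addition_sample(rng):
    """Build one multiple-choice addition problem with three distractors."""
    a, b = rng.randint(10, 99), rng.randint(10, 99)
    answer = a + b

    # Distractors: distinct wrong values close to the true answer.
    distractors = set()
    while len(distractors) < 3:
        distractors.add(answer + rng.choice([-10, -2, -1, 1, 2, 10]))

    options = [answer, *sorted(distractors)]
    rng.shuffle(options)
    choices = [f"{letter}. {value}" for letter, value in zip("ABCD", options)]

    return {
        "question": f"{a} + {b} = ?",
        "choices": choices,
        "explanation": f"{a} + {b} = {answer}",
        "answer": choices[options.index(answer)],
    }


# A seeded generator keeps the synthetic dataset reproducible.
rng = random.Random(0)
samples = [make_addition_sample(rng) for _ in range(5)]
```

Dropping the `choices` field from such a sample gives the question-answering variant of the same problem.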
We conducted experiments using publicly available 7B and 13B models from WizardLM, meta-math, FelixChao, and EleutherAI. For efficient training, we employed LoRA and DeepSpeed, and used hyperparameter tuning to identify the best model configuration.
We used RAG to improve the accuracy of the LLM. Our model failed frequently on certain problem types, and this is where RAG shined. We appended the retrieved knowledge to the input prompt to give the LLM additional information, thereby improving its reasoning. In total, we had 10-20 knowledge entries and used a simple keyword-based matching algorithm for retrieval.
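A keyword-based retrieval step of this kind could look like the sketch below. The knowledge entries, keywords, prompt layout, and function names are illustrative assumptions (the real entries would be in Vietnamese), not the ones used in the solution.

```python
# Illustrative knowledge base: each entry has trigger keywords and a fact.
KNOWLEDGE_BASE = [
    {
        "keywords": ("perimeter", "rectangle"),
        "text": "Perimeter of a rectangle = 2 x (length + width).",
    },
    {
        "keywords": ("area", "rectangle"),
        "text": "Area of a rectangle = length x width.",
    },
    {
        "keywords": ("hour", "minute"),
        "text": "1 hour = 60 minutes.",
    },
]


def retrieve(question, knowledge_base, top_k=2):
    """Return up to top_k facts, ranked by how many keywords the question contains."""
    text = question.lower()
    scored = []
    for entry in knowledge_base:
        hits = sum(1 for keyword in entry["keywords"] if keyword in text)
        if hits:
            scored.append((hits, entry["text"]))
    # sort() is stable, so ties keep their knowledge-base order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:top_k]]


def build_prompt(question, choices, knowledge_base):
    """Prepend the retrieved facts to the question before sending it to the LLM."""
    parts = []
    facts = retrieve(question, knowledge_base)
    if facts:
        parts.append("Useful knowledge:\n" + "\n".join(f"- {fact}" for fact in facts))
    parts.append("Question: " + question)
    parts.append("Choices:\n" + "\n".join(choices))
    return "\n\n".join(parts)
```

With only a few dozen entries, plain substring matching is cheap and easy to debug, which is presumably why an embedding-based retriever was not needed here.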
Even with these advanced techniques, we observed that the LLM still struggled with certain problems because of its limited calculation ability, even when its reasoning was correct. To address this, we implemented a big loop in which we re-evaluated calculation results using numexpr each time an equation appeared in the output.
Note that, to reduce the complexity of our solution, only basic arithmetic equations are considered.
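The re-evaluation idea can be sketched as below. This is a simplified assumption of how such a check might work: it rewrites a finished piece of text rather than hooking into the generation loop, the regex and function names are hypothetical, and Python's `ast` module stands in for numexpr to keep the example dependency-free.

```python
import ast
import operator
import re

_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expr):
    """Evaluate +, -, *, / over numeric literals without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval"))


_NUMBER = r"\d+(?:\.\d+)?"
# Basic arithmetic only: "<number> (<op> <number>)+ = <number>".
_EQUATION = re.compile(
    rf"({_NUMBER}(?:\s*[+\-*/]\s*{_NUMBER})+)\s*=\s*({_NUMBER})"
)


def _format_number(value):
    """Render whole numbers without a trailing '.0'."""
    value = float(value)
    return str(int(value)) if value.is_integer() else str(round(value, 4))


def recheck_equations(text):
    """Recompute each 'lhs = rhs' equation and replace a wrong right-hand side."""
    def fix(match):
        lhs, rhs = match.group(1), match.group(2)
        try:
            value = safe_eval(lhs)
        except (ValueError, SyntaxError, ZeroDivisionError):
            return match.group(0)  # leave anything we cannot evaluate
        if abs(value - float(rhs)) < 1e-9:
            return match.group(0)  # the model's arithmetic was already right
        return f"{lhs} = {_format_number(value)}"
    return _EQUATION.sub(fix, text)
```

In a generation loop, the same check would be applied to the partial output whenever a new equation appears, so that later reasoning steps build on the corrected number rather than the wrong one.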
Hope you guys love our solution! 🥰 🥰 🥰