Releases: mrigankpawagi/HinglishEval
Results
The results of the evaluation with the various parameters are attached as CSV files in this release.
- The Pass@1 scores of the models for both the English and Hinglish prompts are detailed in Pass@1_data.csv.
- The IRT ability scores of the models, estimated with a two-parameter IRT model (the English and Hinglish versions of each model were included in the same analysis), are present in IRT_results.csv.
- The binary matrix of results (0 for incorrect, 1 for correct) for the evaluated solutions to all problems in the dataset, by each of the models, is contained in binary_matrix.txt.
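For context, the two-parameter IRT model referenced above treats each problem as an item with discrimination $a_i$ and difficulty $b_i$, and each model as an examinee with latent ability $\theta_j$, so that the probability of model $j$ solving problem $i$ is

$$
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + e^{-a_i(\theta_j - b_i)}}
$$

The per-model scores in IRT_results.csv are presumably the fitted ability estimates $\theta_j$; the estimation procedure itself is not described here.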
Note: The zip file containing these files is attached below.
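If a single sample was generated per problem (as the temperature-0 setting suggests), Pass@1 reduces to the fraction of problems solved, so it can be recomputed directly from the binary matrix. A minimal sketch, assuming binary_matrix.txt holds one whitespace-separated row of 0/1 values per model (the actual file layout is not specified here):

```python
import numpy as np

# Assumed layout: one row per model, one column per problem,
# 1 for a correct solution and 0 for an incorrect one.
matrix = np.loadtxt("binary_matrix.txt")

# With one sample per problem, Pass@1 is just the row mean:
# the fraction of problems the model solved.
pass_at_1 = matrix.mean(axis=1)
for row, score in enumerate(pass_at_1):
    print(f"model {row}: Pass@1 = {score:.3f}")
```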
Hinglish Data v0.1
Hinglish Dataset
This dataset contains the code generations produced by the models listed below for HinglishEval.json, the Hinglish-translated version of the original HumanEval dataset, in both unsanitized (raw) and sanitized versions. These generations are also available in the repository.
Models
- codegen2-1B
- codegen-2B-mono
- codegen-2B-multi
- codegen-2B-nl
- codegen-350M-mono
- codegen-350M-multi
- codegen-350M-nl
- codegen-6B-mono
- codegen-6B-multi
- codegen-6B-nl
- gemma-2B
- gemma-7B
- gpt-3.5-turbo
- gpt-4
- llama3-70B
- mistral7B-instruct-v0.1
- mistral7B-instruct-v0.2
- mistral7B-instruct-v0.3
- phi-3-medium-4k-instruct
- polycoder-0.4B
- polycoder-160M
- polycoder-2.7B
- santacoder
Parameter settings
- Temperature: 0
- Max_tokens: 512
NOTE: The binaries attached below contain a zip file for each model; a sketch of how generation with these settings might look follows.
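For the open-weight models, here is a minimal sketch of how a completion might be generated at these settings with the Hugging Face transformers library. The checkpoint name, prompt, and surrounding harness are illustrative assumptions, not the repository's actual evaluation script:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint: one of the listed models, assumed to be its
# Hugging Face repository name.
model_id = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical HumanEval/HinglishEval-style prompt (signature + docstring).
prompt = "def add(a, b):\n    \"\"\"Return the sum of a and b.\"\"\"\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=False,                      # Temperature: 0 -> greedy decoding
    max_new_tokens=512,                   # Max_tokens: 512
    pad_token_id=tokenizer.eos_token_id,  # CodeGen has no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```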
English Data v0.1
English Dataset
This dataset contains the code generations produced by the models listed below on the original HumanEval data in English, in both unsanitized (raw) and sanitized versions. These generations are also available in the repository.
Models
- codegen2-1B
- codegen-2B-mono
- codegen-2B-multi
- codegen-2B-nl
- codegen-350M-mono
- codegen-350M-multi
- codegen-350M-nl
- codegen-6B-mono
- codegen-6B-multi
- codegen-6B-nl
- gemma-2B
- gemma-7B
- gpt-3.5-turbo
- gpt-4
- llama3-70B
- mistral7B-instruct-v0.1
- mistral7B-instruct-v0.2
- mistral7B-instruct-v0.3
- phi-3-medium-4k-instruct
- polycoder-0.4B
- polycoder-160M
- polycoder-2.7B
Parameter settings
- Temperature: 0
- Max_tokens: 512
NOTE: The binaries attached below contain a zip file for each model; a sketch of how the API-based models might be queried with these settings follows.
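For the hosted models (gpt-3.5-turbo and gpt-4), the same settings map directly onto the OpenAI chat completions API. A minimal sketch, where the prompt wording and message framing are assumptions and the repository's actual request code may differ:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical HumanEval-style prompt (signature + docstring).
prompt = "def add(a, b):\n    \"\"\"Return the sum of a and b.\"\"\"\n"

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user",
         "content": "Complete the following Python function:\n\n" + prompt},
    ],
    temperature=0,   # Temperature: 0
    max_tokens=512,  # Max_tokens: 512
)
print(response.choices[0].message.content)
```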