
Releases: mrigankpawagi/HinglishEval

Results

29 Jul 18:32
81bfcd5

Results:

  • The results of the evaluation using various parameters are attached as CSV files in this release.
  • The Pass@1 scores of the models for both the English and Hinglish prompts are detailed in Pass@1_data.csv.
  • The IRT latent scores (2-parameter IRT model) of the models, with the English and Hinglish versions of each model taken in the same analysis, are in IRT_results.csv.
  • The binary result matrix (0 for incorrect, 1 for correct) covering each model's evaluated solutions to all problems in the dataset is in binary_matrix.txt.
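
As a rough illustration of how the three artifacts relate, the sketch below derives Pass@1 from a 0/1 result matrix and evaluates the standard 2-parameter logistic IRT curve. It assumes one generation per problem, in which case Pass@1 reduces to the fraction of problems solved. The in-memory dictionary layout, the model names, and the function names are hypothetical; the actual layout of binary_matrix.txt and IRT_results.csv is not described here and may differ.

```python
from math import exp


def pass_at_1(binary_matrix):
    """Return the fraction of problems solved per model.

    binary_matrix maps a model name to a list of 0/1 outcomes, one per
    problem (hypothetical layout, assuming one generation per problem).
    """
    return {model: sum(row) / len(row) for model, row in binary_matrix.items()}


def two_pl_probability(theta, a, b):
    """Standard 2-parameter logistic IRT curve.

    theta: latent ability of the model
    a:     discrimination of the problem
    b:     difficulty of the problem
    Returns the modelled probability of a correct solution.
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))


# Toy data for illustration only; these are not results from the release.
toy_matrix = {
    "model_a": [1, 0, 1, 1],
    "model_b": [0, 0, 1, 0],
}
scores = pass_at_1(toy_matrix)
print(scores)
print(two_pl_probability(theta=0.0, a=1.0, b=0.0))
```

Fitting the ability, discrimination, and difficulty parameters from the matrix requires an estimation procedure that is not shown here; the function above only evaluates the curve for given parameters.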

Note: The zip file containing these files is attached below.

Hinglish Data v0.1

27 Jul 08:53
81bfcd5

Hinglish Dataset

This dataset contains the code generated by the models listed below for HinglishEval.json, the Hinglish-translated version of the original HumanEval data, in both unsanitized (raw) and sanitized versions. These generations are already available in the repository.

Models

  • codegen2-1B
  • codegen-2B-mono
  • codegen-2B-multi
  • codegen-2B-nl
  • codegen-350M-mono
  • codegen-350M-multi
  • codegen-350M-nl
  • codegen-6B-mono
  • codegen-6B-multi
  • codegen-6B-nl
  • gemma-2B
  • gemma-7B
  • gpt-3.5-turbo
  • gpt-4
  • llama3-70B
  • mistral7B-instruct-v0.1
  • mistral7B-instruct-v0.2
  • mistral7B-instruct-v0.3
  • phi-3-medium-4k-instruct
  • polycoder-0.4B
  • polycoder-160M
  • polycoder-2.7B
  • santacoder

Parameters set

  • Temperature: 0
  • Max_tokens: 512
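
To make the two settings concrete: a temperature of 0 typically corresponds to greedy decoding, where the highest-scoring token is chosen at every step, and Max_tokens caps the length of the generation. The following is a minimal sketch of that loop under those assumptions. The `greedy_decode` and `toy_model` functions and the `<eos>` marker are hypothetical stand-ins, not the inference code used for this release.

```python
def greedy_decode(next_token_scores, prompt, max_tokens=512, eos="<eos>"):
    """Generate tokens by always taking the highest-scoring next token.

    next_token_scores: callable mapping the current token list to a
                       {token: score} dict (hypothetical model interface).
    max_tokens:        hard cap on the number of generated tokens.
    """
    tokens = list(prompt)
    for _ in range(max_tokens):
        scores = next_token_scores(tokens)
        token = max(scores, key=scores.get)  # argmax: no sampling involved
        if token == eos:
            break
        tokens.append(token)
    return tokens[len(prompt):]


def toy_model(tokens):
    """Hypothetical stand-in model: prefers 'pass' until the context has 4 tokens."""
    if len(tokens) < 4:
        return {"pass": 0.9, "<eos>": 0.1}
    return {"pass": 0.2, "<eos>": 0.8}


full = greedy_decode(toy_model, ["def", "f():"])
capped = greedy_decode(toy_model, ["def", "f():"], max_tokens=1)
print(full, capped)
```

Because no sampling is involved, repeated runs on the same prompt give the same completion in this sketch, which is why a single generation per problem is enough to report Pass@1. Hosted models may not be perfectly deterministic even at temperature 0.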

NOTE: The binaries below contain a zip file for each model.

English Data v0.1

27 Jul 08:58
81bfcd5

English Dataset

This dataset contains the code generated by the models listed below on the original HumanEval data in English, in both unsanitized (raw) and sanitized versions. These generations are already available in the repository.

Models

  • codegen2-1B
  • codegen-2B-mono
  • codegen-2B-multi
  • codegen-2B-nl
  • codegen-350M-mono
  • codegen-350M-multi
  • codegen-350M-nl
  • codegen-6B-mono
  • codegen-6B-multi
  • codegen-6B-nl
  • gemma-2B
  • gemma-7B
  • gpt-3.5-turbo
  • gpt-4
  • llama3-70B
  • mistral7B-instruct-v0.1
  • mistral7B-instruct-v0.2
  • mistral7B-instruct-v0.3
  • phi-3-medium-4k-instruct
  • polycoder-0.4B
  • polycoder-160M
  • polycoder-2.7B

Parameters set

  • Temperature: 0
  • Max_tokens: 512

NOTE: The binaries below contain a zip file for each model.