29 Jul 18:32

SidZRed

81bfcd5

Results Latest

Results:

The results of the evaluation using various parameters are attached as files in this release.
The Pass@1 scores of the models for both the English and Hinglish prompts are detailed in Pass@1_data.csv.
The IRT latency scores of the models (two-parameter IRT model, with the English and Hinglish versions of the models taken in the same analysis) are in IRT_results.csv.
The binary result matrix (0 for incorrect, 1 for correct) of each model's evaluated solutions to all problems in the dataset is in binary_matrix.txt.
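
As a rough illustration of how these files relate, the sketch below derives a Pass@1 score per model from 0/1 results and evaluates the standard two-parameter IRT response curve. The in-memory layout (one 0/1 list per model, one entry per problem), the function names, and the sample values are assumptions made for illustration; the actual layout of binary_matrix.txt and the fitting procedure behind IRT_results.csv are not described in this release.

```python
import math

def pass_at_1(binary_rows):
    """Fraction of problems solved per model, given one 0/1 list per model.

    Assumes a single generated solution per problem (hypothetical layout).
    """
    return {model: sum(row) / len(row) for model, row in binary_rows.items()}

def irt_2pl(theta, a, b):
    """Standard 2PL item response function: P(correct | ability theta).

    a is the item discrimination and b is the item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Made-up example values, not taken from the release files.
example = {"model_x": [1, 0, 1, 1], "model_y": [0, 0, 1, 0]}
scores = pass_at_1(example)
```

With one sample per problem, Pass@1 reduces to the mean of a model's row, which is why the binary matrix is enough to recompute it.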

Note: The zip file containing these files is attached below.

Assets 3

27 Jul 08:53

AnirudhG07

Hinglish

81bfcd5

Hinglish Data v0.1

Hinglish Dataset

This dataset contains the code generated by the models listed below for HinglishEval.json, the Hinglish translation of the original HumanEval data, in both unsanitized (raw) and sanitized versions. This code is already included in the repository.

Models

codegen2-1B
codegen-2B-mono
codegen-2B-multi
codegen-2B-nl
codegen-350M-mono
codegen-350M-multi
codegen-350M-nl
codegen-6B-mono
codegen-6B-multi
codegen-6B-nl
gemma-2B
gemma-7B
gpt-3.5-turbo
gpt-4
llama3-70B
mistral7B-instruct-v0.1
mistral7B-instruct-v0.2
mistral7B-instruct-v0.3
phi-3-medium-4k-instruct
polycoder-0.4B
polycoder-160M
polycoder-2.7B
santacoder

Parameters set

Temperature: 0
Max_tokens: 512
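
For readers unfamiliar with these settings, here is a minimal sketch of what they usually mean in a decoding loop. It assumes temperature 0 is treated as greedy (argmax) decoding and that Max_tokens caps the number of generated tokens; `pick_token`, `generate`, and the `next_logits` callback are hypothetical names, not part of this repository or of any model API.

```python
import math
import random

def pick_token(logits, temperature, rng=random):
    """Choose a token index from raw logits.

    Temperature 0 is treated as greedy decoding (argmax), so the
    choice is deterministic. Otherwise sample from the softmax of
    the temperature-scaled logits.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def generate(next_logits, max_tokens=512, temperature=0, eos=None):
    """Generate at most max_tokens token ids, stopping early at eos."""
    tokens = []
    for _ in range(max_tokens):
        token = pick_token(next_logits(tokens), temperature)
        if token == eos:
            break
        tokens.append(token)
    return tokens
```

Under these assumptions, repeated runs on the same prompt give the same completion, which is consistent with reporting a single Pass@1 figure per model.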

NOTE: The binaries below contain zip files for each model.

Assets 4

27 Jul 08:58

AnirudhG07

English

81bfcd5

English Data v0.1

English Dataset

This dataset contains the code generated by the models listed below for the original HumanEval data in English, in both unsanitized (raw) and sanitized versions. This code is already included in the repository.

Models

codegen2-1B
codegen-2B-mono
codegen-2B-multi
codegen-2B-nl
codegen-350M-mono
codegen-350M-multi
codegen-350M-nl
codegen-6B-mono
codegen-6B-multi
codegen-6B-nl
gemma-2B
gemma-7B
gpt-3.5-turbo
gpt-4
llama3-70B
mistral7B-instruct-v0.1
mistral7B-instruct-v0.2
mistral7B-instruct-v0.3
phi-3-medium-4k-instruct
polycoder-0.4B
polycoder-160M
polycoder-2.7B

Parameters set

Temperature: 0
Max_tokens: 512

NOTE: The binaries below contain zip files for each model.

Assets 4

Releases: mrigankpawagi/HinglishEval
