Education Dialogue Dataset

Education Dialogue (ED) is a dataset of conversations generated by prompting Gemini Ultra. Each conversation takes place between a teacher and a student: the teacher is prompted with a specific topic to teach, and the student is prompted with their learning preferences.

For more details on the design and content of the dataset, please see the paper Multi-turn Reinforcement Learning from Preference Human Feedback.

Data Description

ED contains 40,000 training examples and 7,234 examples for testing. Each example is a full conversation between a teacher and a student, including metadata on the topic and the teacher/student preferences. In the paper, we perform experiments in multi-turn reinforcement learning on the dataset.

Data Format

The data is composed of 6 JSON files: 5 for the training data (conversations_train1.json through conversations_train5.json) and 1 for the test data (conversations_eval.json). Each entry represents a conversation with the following fields:

background_info - the contextual information of the conversation, containing the following fields:
- topic - the topic that the teacher needs to teach.
- student_prefrences - the ways in which the student prefers to learn, e.g., lecture-based learning or hands-on activities.
- teacher_prefrences - the ways in which the teacher prefers to teach, e.g., lecture-based learning or hands-on activities.
- student_reactions - how the student reacts when they are not taught in their preferred way, e.g., gets disengaged or might adapt to other methods.
- teacher_reactions - how the teacher reacts when the student does not respond to the teacher's preferred teaching method, e.g., gets frustrated or might adapt to the student.
conversation - a list of turns that make up the conversation between the teacher and the student. Each turn has a role field that identifies the speaker and a text field that holds the content of the turn.
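
As an illustration of the fields above, here is a minimal Python sketch for reading one file and printing a conversation. It rests on several assumptions not stated in this README: that each file's top level is a JSON list of entries, that the keys are spelled exactly as listed above (including student_prefrences and teacher_prefrences), and that role holds values such as "Teacher" and "Student". The sample entry is hypothetical and is not taken from the dataset.

```python
import json
from pathlib import Path


def load_conversations(path):
    """Read one dataset file. Assumes the top level is a JSON list of entries."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def format_conversation(entry):
    """Render the turns of one entry as 'role: text' lines."""
    return "\n".join(
        f"{turn['role']}: {turn['text']}" for turn in entry["conversation"]
    )


# Hypothetical entry that mirrors the documented fields (not real dataset content).
sample_entry = {
    "background_info": {
        "topic": "photosynthesis",
        "student_prefrences": "hands-on activities",
        "teacher_prefrences": "lecture-based learning",
        "student_reactions": "gets disengaged",
        "teacher_reactions": "might adapt to the student",
    },
    "conversation": [
        {"role": "Teacher", "text": "Today we will look at photosynthesis."},
        {"role": "Student", "text": "Can we try an experiment?"},
    ],
}

print(sample_entry["background_info"]["topic"])
print(format_conversation(sample_entry))
```

With the real data, something like load_conversations("conversations_train1.json") would presumably return the list of entries, each of which could then be passed to format_conversation.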

The data was generated by prompting Gemini Ultra with the following prompt:

Simulate a conversation between a teacher in school and a student. There is a small chance that the teacher is successful in teaching the student so he understands the topic. The conversation lasts roughly 10-15 turns but ends when either side says [end of conversation]. The teacher wants to teach the student about {topic}. The student likes {student_pref}. The teacher does not know that beforehand. The student prefers to learn this way, {student_reaction}. The teacher likes {teacher_pref}. He prefers to teach this way, {teacher_reaction}. Output the conversation and the probability that the student understood the material, in the following format.
#
Conversation:
[
Teacher: "...",
Student: "...",
Teacher: "...",
Student: "...",
]
Probability: "...",
#
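
The README does not describe how the raw model output was converted into the JSON entries. Purely as a sketch, output in the format above could be split into turns and a probability string along the following lines. The function name parse_output, the regular expressions, and the sample text are all hypothetical, and the probability is kept as a string because its exact form is not specified here.

```python
import re

# Matches lines such as: Teacher: "some text",
TURN_RE = re.compile(r'^(Teacher|Student):\s*"(.*)",?\s*$')
# Matches lines such as: Probability: "0.4",
PROB_RE = re.compile(r'^Probability:\s*"?([^",]*)"?,?\s*$')


def parse_output(raw):
    """Split raw output into a list of {role, text} turns and a probability string."""
    turns, probability = [], None
    for line in raw.splitlines():
        line = line.strip()
        turn_match = TURN_RE.match(line)
        if turn_match:
            turns.append({"role": turn_match.group(1), "text": turn_match.group(2)})
            continue
        prob_match = PROB_RE.match(line)
        if prob_match:
            probability = prob_match.group(1)
    return turns, probability


# Hypothetical output that follows the requested format (not real model output).
sample_output = """#
Conversation:
[
Teacher: "Today we cover fractions.",
Student: "Can we use pizza slices?",
]
Probability: "0.4",
#"""

turns, probability = parse_output(sample_output)
print(turns)
print(probability)
```

A line-by-line parser like this assumes each turn fits on a single line; real outputs may need more tolerant handling.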
