/!\ PLEASE INCLUDE THE FULL STACKTRACE AND CODE SNIPPET
**Short description**

The Docker Colab runtime crashes when running the Python code below.
**Environment information**

- Operating System: Windows 11 Home Single Language, Build 26100.2605
- Python version: 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0]
- `tensorflow-datasets`/`tfds-nightly` version: tensorflow-datasets 4.9.6
- `tensorflow`/`tf-nightly` version: tensorflow 2.15.0
- Does the issue still exist with the latest `tfds-nightly` package (`pip install --upgrade tfds-nightly`)?
**Reproduction instructions**

A RAM-constrained machine is recommended. Connect to a local Google Colab runtime, install `tensorflow_datasets`, and run:

```python
import tensorflow_datasets as tfds


class Wmt14TranslateFrEn(tfds.translate.wmt14.Wmt14Translate):
    BUILDER_CONFIGS = [
        tfds.translate.wmt.WmtConfig(  # pylint:disable=g-complex-comprehension
            description="WMT 2014 %s-%s translation task dataset." % ("fr", "en"),
            url=tfds.translate.wmt14._URL,
            citation=tfds.translate.wmt14._CITATION,
            language_pair=("fr", "en"),
            version=tfds.core.Version("1.0.0"),
        )
    ]

    @property
    def _subsets(self):
        return {
            tfds.Split.TRAIN: [
                "gigafren",
            ]
        }


wmt14_fr_en_translate = Wmt14TranslateFrEn()
wmt14_fr_en_translate.download_and_prepare()
```

If you share a colab, make sure to update the permissions to share it.

**Link to logs**

Nothing useful in the logs.

**Expected behavior**

The code executes successfully.

**Additional context**

I investigated, and I believe https://github.com/tensorflow/datasets/blob/1a8fed713ed3a58bd459e1a8cccd31eb641d9b58/tensorflow_datasets/translate/wmt.py#L966 tries to load the entire uncompressed gzip file into memory, which causes an out-of-memory (OOM) crash on my machine.
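
To illustrate the suspected cause, here is a minimal standalone sketch of the difference between reading a gzip file eagerly and streaming it line by line. This is not the tfds code: the function names `read_all_lines` and `iter_lines` are hypothetical, and it assumes a UTF-8 text file with one sentence per line.

```python
import gzip
from typing import Iterator


def read_all_lines(path: str) -> list[str]:
    """Eager read: the whole decompressed file is held in memory at once.

    For a multi-gigabyte corpus this can exhaust RAM.
    """
    with gzip.open(path, "rb") as f:
        return f.read().decode("utf-8").split("\n")


def iter_lines(path: str) -> Iterator[str]:
    """Lazy read: decompresses and yields one line at a time.

    Memory use stays roughly constant regardless of the file size.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")
```

With the streaming variant, a caller can consume the file with `for sentence in iter_lines(path): ...` without ever materializing the full list of lines.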