The Wayback Machine - https://web.archive.org/web/20220402064749/https://github.com/huggingface/datasets/commits/master
master

Commits on Apr 1, 2022

  1. Create metric card for seqeval (#4070)

    * Create metric card for seqeval
    
    Proposing metric card for seqeval
    
    * Update README.md
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/seqeval/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    sashavor and albertvillanova committed Apr 1, 2022
  2. Create a metric card for Competition MATH (#4073)

    * Create a metric card for Competition MATH
    
    Proposing metric card for Competition MATH
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/competition_math/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    sashavor and albertvillanova committed Apr 1, 2022
  3. close parquet writer (#4081)

    lhoestq committed Apr 1, 2022
  4. Add MetaShift dataset (#3900)

    * Initial draft for the MetaShift Dataset.
    
    * add dataset preprocessing and yield images.
    
    * use selected classes.
    
    * format code as required.
    
    * add selected_classes as a config parameter.
    
    * update fields in Dataset Card.
    
    * add dataset tagset.
    
    * Update datasets/metashift/README.md
    
    Rename card name.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Naming for links and add point of contact info.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Fix extra whitespace.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Extra full stop removed.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Add bibtex tag.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/metashift.py
    
    Cleaner code changes.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/metashift.py
    
    Use os.path.join instead.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/metashift.py
    
    Use staticmethod, remove print statements.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/metashift.py
    
    Add task template.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/metashift.py
    
    add static method.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * add annotation info.
    
    * use multi-line comment.
    
    * add minor fixes.
    
    * add the generated meta-graphs to the card as images.
    
    * usage of os.path.join for src_image_path.
    
    * add config to generate metashift-attributes dataset.
    
    * add config to expose image metadata.
    
    * add constants as config parameters.
    
    * add Dataset Usage section to cards.
    
    * add name, dataset version to MetashiftConfig.
    
    * add dataset_infos.json
    
    * pass URLs to images and add alt tags.
    
    * set default classes as in original repo.
    
    * add dummy data.
    
    * format code.
    
    * update dataset structure section for config options.
    
    * Update datasets/metashift/README.md
    
    CI fixes.
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Correct task categories.
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * Update datasets/metashift/metashift.py
    
    Add encoding.
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * add contributions section
    
    * Update datasets/metashift/README.md
    
    Add paperswithcode id.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Correct sentence.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    add default classes info.
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * Update datasets/metashift/README.md
    
    Co-authored-by: Mario Šaško <[email protected]>
    
    * indent params list and update with suggestions.
    
    * Apply suggestions from code review
    
    * Update datasets/metashift/metashift.py
    
    Co-authored-by: Mario Šaško <[email protected]>
    Co-authored-by: Quentin Lhoest <[email protected]>
    3 people committed Apr 1, 2022
  5. Fix GithubMetricModuleFactory instantiation with None download_config (#4078)
    
    * Test instantiation of module factories
    
    * Fix GithubMetricModuleFactory instantiation with None download_config
    albertvillanova committed Apr 1, 2022

Commits on Mar 31, 2022

  1. Create metric card for METEOR (#4065)

    * Create metric card for METEOR
    
    Proposing a metric card for METEOR
    
    * Update metrics/meteor/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/meteor/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/meteor/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update README.md
    
    * Update README.md
    
    removing NLTK requirement
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    sashavor and albertvillanova committed Mar 31, 2022
  2. Fix docs on audio feature installation (#4028)

    * Fix docs on audio feature installation
    
    * Remove mention as experimental feature
    albertvillanova committed Mar 31, 2022
  3. Remove old wikipedia leftovers (#3989)

    * Remove old wikipedia leftovers
    
    * Fix some typos in wiki_snippets script
    
    * Update metadata JSON of wiki_snippets dataset
    
    * Update dataset card of wiki_snippets
    
    * Fix tag in dataset card
    
    * Add task tags to dataset card
    
    * Make consistent the conversion to MB
    
    * Add comment warning not to use load_dataset in another script
    albertvillanova committed Mar 31, 2022
  4. Fix CLI dummy data generation (#4045)

    * Fix CLI dummy data generation
    
    * Test CLI dummy data generation
    
    * Fix style
    albertvillanova committed Mar 31, 2022
  5. Increase max retries for GitHub metrics (#4063)

    * Increase max retries for GitHub metrics
    
    * Address requested changes
    albertvillanova committed Mar 31, 2022

Commits on Mar 30, 2022

  1. Docs maintenance (#3999)

    * doc maintenance
    
    * 🖍 apply feedback
    
    * 🖍 use the shorthand syntax
    stevhliu committed Mar 30, 2022

Commits on Mar 29, 2022

  1. Create metric card for CUAD (#4043)

    * Create README.md
    
    Proposing a CUAD metric card
    
    * Update metrics/cuad/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/cuad/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/cuad/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/cuad/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    * Update metrics/cuad/README.md
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    sashavor and albertvillanova committed Mar 29, 2022
  2. BLEU metric card (#3947)

    * Add metric card for bleu metric
    
    * fix formatting for bleu metric card
    
    * add How to Use intro
    
    * add limitation to bleu mc
    
    * fix example format, add original paper values
    
    * fix example in metrics/bleu/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * fix next example in metrics/bleu/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    emibaylor and lhoestq committed Mar 29, 2022
  3. Fix Audio.encode_example() when writing an array (#3998)

    * specify format in soundfile.write() when encoding an array
    
    * test encoding audio from array
    polinaeterna committed Mar 29, 2022
  4. Support float data types in pearsonr and spearmanr metrics (#4054)

    * Support float data types in pearsonr/spearmanr metrics
    
    * Update doctests
    albertvillanova committed Mar 29, 2022
  5. Add TER metric card (#3981)

    * update case_sensitive input to ignore_case for consistency with other inputs and metrics
    
    * update ignore_punct input for consistency
    
    * update input docs
    
    * change input args because zh and ja are not the only asian languages
    
    * add new examples to py file
    
    * variable consistency
    
    * add ter metric card
    
    * fix metric card formatting
    
    * fix ter style
    
    * remove trailing whitespace
    
    * fix example format
    
    * fix example format in py file
    
    * change input ignore_case back to case_sensitive
    
    * remove trailing whitespace
    emibaylor committed Mar 29, 2022
  6. Create metric card for the Code Eval metric (#4049)

    * Create README.md
    
    Code Eval metric card
    
    * Update metrics/code_eval/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * Update metrics/code_eval/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    sashavor and lhoestq committed Mar 29, 2022
  7. Create metric card for XNLI (#4046)

    * Create README.md
    
    Proposing a metric card for XNLI
    
    * Update metrics/xnli/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * Update metrics/xnli/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * Update metrics/xnli/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    sashavor and lhoestq committed Mar 29, 2022
  8. Update main readme (#3927)

    * update readme
    
    * add link to zenodo DOIs
    
    * Apply suggestions from code review
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    
    Co-authored-by: Albert Villanova del Moral <[email protected]>
    lhoestq and albertvillanova committed Mar 29, 2022

Commits on Mar 28, 2022

  1. Support streaming xcopa dataset (#4039)

    * Support streaming xcopa dataset
    
    * Update metadata JSON
    
    * Update dummy data
    albertvillanova committed Mar 28, 2022
  2. Rename wer to cer (#4012)

    Co-authored-by: Pramesh <[email protected]>
    pmgautam and gautampramesh committed Mar 28, 2022
  3. ASSIN 2 dataset: replace broken Google Drive _URLS by links on github (#4004)
    
    * Replace broken Drive _URLS by links on github
    
    * Change github urls.
    
    * update dummy data
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    ruanchaves and lhoestq committed Mar 28, 2022
  4. Add "Adversarial GLUE" dataset to datasets library (#3849)

    * update link in wiki_bio dataset
    
    * run linter and update dummy data
    
    * fix markdown so that test passes (even though I didn't break it)
    
    * init adversarial glue in hf datasets
    
    * format and add card
    
    * update label computation
    
    * remove backslash
    
    * remove print statement
    
    * Update datasets/adv_glue/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * Update datasets/adv_glue/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * flesh out README.md
    
    * Update datasets/adv_glue/README.md
    
    advglue
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * Update datasets/adv_glue/README.md
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    
    * update tags and fields
    
    * update tags
    
    * minor changes to trigger the CI
    
    Co-authored-by: Quentin Lhoest <[email protected]>
    jxmorris12 and lhoestq committed Mar 28, 2022
  5. Fix None issue with Sequence of dict (#4010)

    * fix None issue with Sequence of dict
    
    * update test
    
    * move the None check up a bit
    lhoestq committed Mar 28, 2022
  6. Replace yahoo_answers_topics data url (#4023)

    * replace yahoo_answers_topics data url
    
    * update dummy data
    lhoestq committed Mar 28, 2022

Commits on Mar 25, 2022

  1. Adding Roman Urdu Hate Speech dataset (#3972)

    * Adding Roman Urdu Hate Speech dataset
    
    * Update Readme
    
    * Update data structure sections in README
    
    * Update Additional Information Section
    
    * Update Contributions Section with some contents
    
    * Remove typos in README
    
    * Update Dataset Script
    
    * Update Dummy_Data
    
    * Apply suggestions from code review
    
    Co-authored-by: Bhavish Pahwa <[email protected]>
    Co-authored-by: Quentin Lhoest <[email protected]>
    3 people committed Mar 25, 2022