The Wayback Machine - https://web.archive.org/web/20210629122539/https://github.com/huggingface/datasets/pulls
huggingface/datasets
Star 8.5k • Fork 980
48 Open
1,701 Closed
Inject templates for ASR datasets • #2565 opened Jun 29, 2021 by lewtun • Draft
Minor fix in loading metrics docs • #2562 opened Jun 29, 2021 by albertvillanova
fix Dataset.map when num_procs > num rows • #2560 opened Jun 29, 2021 by connor-mccarthy • 1
Add Parquet loader + from_parquet and to_parquet • #2537 opened Jun 22, 2021 by lhoestq • 2
Add load_dataset_builder • #2500 opened Jun 14, 2021 by mariosasko • Draft • 1 of 2 tasks • 1 • 4
Add Rico Dataset • #2486 opened Jun 11, 2021 by ncoop57 • 1
Add TimeDial • #2476 opened Jun 10, 2021 by bhavitvyamalik • 1
Add Disfl-QA • #2473 opened Jun 10, 2021 by bhavitvyamalik • 3
pretty_name for dataset in YAML tags • #2395 opened May 22, 2021 by bhavitvyamalik • 26
preserve dtype for numpy arrays • #2361 opened May 14, 2021 by bhavitvyamalik • 1 • 10
Create Audio feature • #2324 opened May 5, 2021 by albertvillanova • Draft • 1.9 • 1
Minor refactor prepare_module • #2314 opened May 4, 2021 by albertvillanova
Update README.md • #2310 opened May 4, 2021 by cryoff • 1
Create ExtractManager • refactoring • #2295 opened Apr 30, 2021 by albertvillanova • 1.9
Create CacheManager • refactoring • #2277 opened Apr 28, 2021 by albertvillanova • 1.9
Allow downloading/processing/caching only specific splits • enhancement • #2249 opened Apr 22, 2021 by albertvillanova • 1.9 • 35
Implement Dataset from Parquet • enhancement • #2247 opened Apr 22, 2021 by albertvillanova • 1.9 • 2
Set specific cache directories per test function call • #2244 opened Apr 20, 2021 by albertvillanova • 1.9 • 4
[WIP] Add ArrayXD support for fixed size list. • #2228 opened Apr 16, 2021 by jblemoine • 2
Filtering refactor • #2060 opened Mar 16, 2021 by theo-m • 1 • 14
[WIP] Adding Support for Reading Pandas Category • #1936 opened Feb 23, 2021 by justin-yan • 11
Use arrow ipc file format • #1933 opened Feb 23, 2021 by lhoestq • Draft
Update README.md • #1927 opened Feb 22, 2021 by JieyuZhao • 5
dtype fix when using numpy arrays • #1884 opened Feb 15, 2021 by bhavitvyamalik
Create Remote Manager • #1882 opened Feb 15, 2021 by albertvillanova • 22