Updates

WILDS is under active development. For updates on new releases, please subscribe to our mailing list by clicking on the "Join group" button.

For more details, please read our release notes.

NeurIPS DistShift workshop (December 13, 2021)

We will present the WILDS 2.0 update at the NeurIPS Workshop on Distribution Shifts (DistShift).

Version 2.0 release (December 10, 2021)

Release notes | Paper

We have added unlabeled data to the following datasets:

iwildcam
camelyon17
ogb-molpcba
globalwheat
civilcomments
fmow
poverty
amazon

The labeled training, validation, and test data in all datasets have been kept exactly the same.
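As a rough illustration of how the unlabeled data might be accessed from Python, the sketch below wraps the `wilds` package's `get_dataset` call with `unlabeled=True`. The helper name `load_unlabeled_subset`, the `UNLABELED_DATASETS` tuple, and the default `root_dir` are assumptions made for this example, not part of the package. The names of the unlabeled splits differ between datasets, so the split is left as a required argument; please check the release notes for the splits each dataset provides.

```python
# Dataset names listed above as having unlabeled data in the 2.0 release.
UNLABELED_DATASETS = (
    "iwildcam", "camelyon17", "ogb-molpcba", "globalwheat",
    "civilcomments", "fmow", "poverty", "amazon",
)


def load_unlabeled_subset(name, split, root_dir="data", download=False):
    """Return one unlabeled split of a WILDS dataset (hypothetical helper).

    `split` is deliberately required: the available unlabeled split names
    differ between datasets, so look them up rather than assuming one.
    """
    if name not in UNLABELED_DATASETS:
        raise ValueError(f"{name!r} is not listed as having unlabeled data")
    # Imported lazily so the name check above works without wilds installed.
    from wilds import get_dataset

    dataset = get_dataset(
        dataset=name, unlabeled=True, root_dir=root_dir, download=download
    )
    return dataset.get_subset(split)
```

Calling the helper with a dataset that is not in the list fails fast with a `ValueError`, before any download is attempted.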

We have also updated and/or added new algorithms that make use of the unlabeled data:

CORAL (Sun and Saenko, 2016)
DANN (Ganin et al., 2016)
AFN (Xu et al., 2019)
Pseudo-Label (Lee, 2013)
FixMatch (Sohn et al., 2020)
Noisy Student (Xie et al., 2020)
SwAV pre-training (Caron et al., 2020)
Masked language model pre-training (Devlin et al., 2019)

Other minor changes include:

We updated GlobalWheat v1.0 -> v1.1 to fix some errors in metadata. This should not affect most users and does not change any baseline results.
We added support for the DomainNet dataset (Peng et al., 2019).
We have added support for RandAugment (Cubuk et al.) and other data augmentation techniques.

Version 1.2.2 release (August 4, 2021)

Release notes

Additional input sanity checking and other minor updates.

ICML (July 22, 2021)

We presented WILDS as a long talk at ICML. We also wrote an accompanying blog post.

Version 1.2 release (July 19, 2021)

Release notes | Announcement | Tweet

Added two new benchmark datasets: GlobalWheat-WILDS and RxRx1-WILDS.
Updated the leaderboard submission guidelines and added infrastructure to support submission.
Added a non-benchmark dataset, ENCODE.

Version 1.1 release (March 9, 2021)

Release notes | Announcement | Tweet

Added a new benchmark dataset, Py150-WILDS.
Added a non-benchmark dataset, SQF.
Made major breaking updates to existing WILDS datasets:
- Amazon-WILDS v1.0 -> v2.0, which subsamples the dataset to speed up model training.
- iWildCam-WILDS v1.0 -> v2.0, which introduces a new validation (ID) and test (ID) split.
Made minor, backwards-compatible updates to existing WILDS datasets:
- FMoW v1.0 -> v1.1, which losslessly converts the previous files into individual PNG images.
- PovertyMap v1.0 -> v1.1, which losslessly converts the previous files into individual compressed NPZ files.
Changed the default models for most datasets to make them significantly faster and easier to use.
Several of the above changes are breaking changes that will impact users who are currently running experiments with WILDS. We sincerely apologize for the inconvenience, and we ask all users to update the package to v1.1.0, which will automatically update the datasets.
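For users unsure whether their environment already meets this requirement, the following is a minimal sketch of a version check. It assumes the package is installed under the distribution name `wilds` and uses only the Python standard library; the helper names (`parse_version`, `needs_update`, `installed_wilds_version`) are made up for this example.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a string such as '1.2.2' into a 3-tuple of ints, e.g. (1, 2, 2)."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad with zeros so that '1.1' compares equal to '1.1.0'.
    return tuple(numbers + [0] * (3 - len(numbers)))


def needs_update(installed, minimum="1.1.0"):
    """Return True if the installed version string is older than `minimum`."""
    return parse_version(installed) < parse_version(minimum)


def installed_wilds_version():
    """Return the installed version of the `wilds` package, or None."""
    try:
        return version("wilds")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_wilds_version()
    if found is None:
        print("wilds is not installed")
    elif needs_update(found):
        print(f"wilds {found} predates v1.1.0; consider upgrading")
    else:
        print(f"wilds {found} is at least v1.1.0")
```

The comparison only looks at the leading numeric part of the first three version components, which is enough for release numbers like those on this page but is not a full version parser.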

Initial release! (December 14, 2020)

Announcement | Tweet

We’re excited to announce WILDS, a benchmark of in-the-wild distribution shifts with datasets across diverse data modalities and real-world applications!