Search results

  1. Kaggle is the world’s largest data science community with powerful tools and resources to help you achieve your data science goals.

    • WeatherBench: A benchmark dataset for data-driven weather forecasting
    • Quick start
    • Download the data
    • Baselines and evaluation
    • Data processing

    🚨🚨🚨 WeatherBench 2 has been released. It provides an updated and much improved benchmark including more comprehensive and more easily accessible datasets. 🚨🚨🚨

    If you are using this dataset, please cite the paper.

    This repository contains all the code for downloading and processing the data, as well as the code for the baseline models in the paper.

    Note! The data has been changed from the original release. Here is a list of changes:

    • New vertical levels. The levels used to be [1, 10, 100, 200, 300, 400, 500, 600, 700, 850, 1000] and are now [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000], to be compatible with CMIP output. The new levels include all of the old ones except [1, 10].

    • CMIP data. Regridded CMIP data for some variables was added, taken from the historical simulation of the MPI-ESM-HR model.
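The relationship between the old and new level lists can be verified with a few lines of Python (a quick sanity check, not code from the repository):

```python
# Vertical pressure levels (hPa) before and after the data update.
old_levels = {1, 10, 100, 200, 300, 400, 500, 600, 700, 850, 1000}
new_levels = {50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000}

# Every old level survives except the 1 and 10 hPa levels.
assert old_levels - new_levels == {1, 10}

# Four levels are new in this release.
added = sorted(new_levels - old_levels)
print(added)  # [50, 150, 250, 925]
```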

    You can follow the quickstart guide in this notebook or launch it directly from Binder.

    The data is hosted here with the following directory structure:

    To start out, download either the entire 5.625 degree data (175G) using

    or simply the single level (500 hPa) geopotential data using

    and then unzip the downloaded .zip files with unzip. You can also use ftp or rsync to download the data; for instructions, follow the download link.

    Baselines

    The baselines are created using Jupyter notebooks in notebooks/. In all notebooks, the forecasts are saved as a NetCDF file in the predictions directory of the dataset.

    CNN baselines

    An example of how to load the data and train a CNN using Keras is given in notebooks/3-cnn-example.ipynb. In addition, a command-line script for training CNNs is provided in src/train_nn.py. The config files for the baseline CNNs in the paper are given in src/nn_configs/. To reproduce the results in the paper, run e.g. python -m src.train_nn -c src/nn_configs/fccnn_3d.yml.
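The baseline CNNs operate on a global latitude-longitude grid, which wraps around in longitude. One way to respect that wrap-around is to pad each field periodically along the longitude axis before convolving. The NumPy sketch below illustrates the idea only; it is not the implementation in src/train_nn.py:

```python
import numpy as np

def periodic_pad_lon(x, pad):
    """Pad a (lat, lon) field periodically in longitude and with
    zeros in latitude, so a 'valid' convolution preserves the shape."""
    # Wrap columns around the longitude axis (axis=1).
    x = np.concatenate([x[:, -pad:], x, x[:, :pad]], axis=1)
    # Ordinary zero padding at the poles (axis=0).
    x = np.pad(x, ((pad, pad), (0, 0)))
    return x

field = np.arange(12.0).reshape(3, 4)   # toy 3x4 lat-lon grid
padded = periodic_pad_lon(field, 1)
print(padded.shape)  # (5, 6)
```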

    Evaluation

    Evaluation and comparison of the different baselines is done in notebooks/4-evaluation.ipynb. The scoring is done using the functions in src/score.py. The RMSE values for the baseline models are also saved in the predictions directory of the dataset. This is useful for plotting your own models alongside the baselines.
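The headline metric in the paper is a latitude-weighted RMSE, which down-weights grid cells near the poles where cells cover less area. A minimal NumPy version of that metric (illustrative only; the exact code lives in src/score.py) could look like:

```python
import numpy as np

def weighted_rmse(forecast, truth, lat):
    """Latitude-weighted RMSE over a (lat, lon) field.

    Each latitude row is weighted by cos(latitude),
    normalised so the weights average to 1.
    """
    w = np.cos(np.deg2rad(lat))
    w = w / w.mean()
    err2 = (forecast - truth) ** 2            # squared error, (lat, lon)
    return np.sqrt((w[:, None] * err2).mean())

lat = np.array([-45.0, 0.0, 45.0])
truth = np.zeros((3, 4))
forecast = np.ones((3, 4))                    # constant error of 1
print(weighted_rmse(forecast, truth, lat))    # ~1.0 (weights average to 1)
```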

    Downloading and processing the raw data from the ERA5 archive

    The workflow to get to the processed data that ended up in the data repository above is:

    1. Download monthly files from the ERA5 archive (src/download.py)
    2. Regrid the raw data to the required resolutions (src/regrid.py)

    The raw data comes from the ERA5 reanalysis archive. Information on how to download the data can be found here and here. Because downloading the data can take a long time (several weeks), the workflow is encoded using Snakemake. See Snakefile and the configuration files for each variable in scripts/config_{variable}.yml. These files can be modified if additional variables are required. To execute Snakemake for a particular variable, type: snakemake -p -j 4 all --configfile scripts/config_toa_incident_solar_radiation.yml. In addition to the time-dependent fields, the constant fields were downloaded and processed using scripts/download_and_regrid_constants.sh.
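To give a feel for what regridding does, the sketch below coarsens a fine grid by simple block averaging. This is a crude stand-in for illustration only; src/regrid.py uses proper spherical regridding, not this method:

```python
import numpy as np

def block_average(field, factor):
    """Coarsen a (lat, lon) array by averaging factor x factor blocks.
    A toy illustration of fine-to-coarse regridding."""
    nlat, nlon = field.shape
    assert nlat % factor == 0 and nlon % factor == 0
    return field.reshape(nlat // factor, factor,
                         nlon // factor, factor).mean(axis=(1, 3))

fine = np.arange(16.0).reshape(4, 4)   # toy 4x4 field
coarse = block_average(fine, 2)
print(coarse.shape)  # (2, 2)
```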

    Downloading the TIGGE IFS baseline

    To obtain the operational IFS baseline, we use the TIGGE Archive. Downloading the data for Z500 and T850 is done in scripts/download_tigge.py; regridding is done in scripts/convert_and_regrid_tigge.sh.

    Regridding the T21 IFS baseline

    The T21 baseline was created by Peter Dueben. The raw output can be found in the dataset. To regrid the data, scripts/convert_and_regrid_IFS_TXX.sh was used.

  2. Weather (Max-Planck-Institut Weather Dataset for Long-term Time Series Forecasting) Weather is recorded every 10 minutes for the whole year 2020 and contains 21 meteorological indicators, such as air temperature, humidity, etc. The dataset in CSV format can be downloaded at https://drive.google.com/file/d/1Tc7GeVN7DLEl-RAs-JVwG9yFMf--S8dy ...

  3. Dec 11, 2023 · In this, the model learns the underlying patterns in the relationships between temperature, humidity and wind speed to discern the associated weather conditions. Through this training process, the...

  4. Aug 31, 2023 · WB2 is an update to the original benchmark published in 2020, which was based on initial, lower-resolution ML models. The goal of WB2 is to accelerate the progress of data-driven weather models by providing a trusted, reproducible framework for evaluating and comparing different methodologies.

  5. The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). The GFS dataset consists of selected model outputs (described...

  6. Jul 23, 2021 · We present Daymet V4, a 40-year daily meteorological dataset on a 1 km grid for North America, Hawaii, and Puerto Rico, providing temperature, precipitation, shortwave radiation, vapor pressure,...