This function checks the range of values in the daily weather data provided by the user for METData and implements several tests on these data (persistence tests, internal consistency tests).

qc_raw_weather_data(
  daily_weather_data,
  info_environments,
  path_flagged_values,
  et0 = F
)

Arguments

daily_weather_data

a data.frame which contains the following mandatory columns:

  1. longitude numeric

  2. latitude numeric

  3. year numeric

  4. location character

  5. YYYYMMDD Date Date of the daily observation written as YYYY-MM-DD

  6. IDenv character Environment ID written as Location_Year

  7. T2M numeric Daily mean temperature (degrees Celsius)

  8. T2M_MIN numeric Daily minimum temperature (degrees Celsius)

  9. T2M_MAX numeric Daily maximum temperature (degrees Celsius)

  10. PRECTOTCORR numeric Total daily precipitation (mm)

Additional weather data provided by the user must be a subset of the following weather variables, given as additional columns with exactly these names. Any imputation step should be performed before providing this daily weather dataset to the package.

  1. RH2M numeric Daily mean relative humidity (%)

  2. RH2M_MIN numeric Daily minimum relative humidity (%)

  3. RH2M_MAX numeric Daily maximum relative humidity (%)

  4. daily_solar_radiation numeric Daily solar radiation (MJ/m^2/day)

  5. T2MDEW numeric Dew point (°C)

Default is NULL.
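As an illustration of the expected layout, the following is a minimal sketch that builds a daily_weather_data data.frame with the ten mandatory columns and verifies that none is missing. The location name, coordinates and weather values are invented for the example, and the `mandatory` vector is only a helper introduced here, not part of the package.

```r
# Hypothetical example data: one location ("LocA"), one year, five days.
dates <- seq(as.Date("2020-05-01"), by = "day", length.out = 5)

daily_weather_data <- data.frame(
  longitude   = 9.93,                    # invented coordinates
  latitude    = 51.53,
  year        = 2020,
  location    = "LocA",
  YYYYMMDD    = dates,                   # Date column, despite its name
  IDenv       = "LocA_2020",             # written as Location_Year
  T2M         = c(12.1, 13.4, 11.8, 14.0, 15.2),
  T2M_MIN     = c(7.0, 8.2, 6.5, 9.1, 10.0),
  T2M_MAX     = c(17.3, 18.9, 16.4, 19.5, 21.0),
  PRECTOTCORR = c(0.0, 2.4, 5.1, 0.0, 0.3),
  stringsAsFactors = FALSE
)

# Helper check (not a package function): are all mandatory columns present?
mandatory <- c("longitude", "latitude", "year", "location", "YYYYMMDD",
               "IDenv", "T2M", "T2M_MIN", "T2M_MAX", "PRECTOTCORR")
missing_cols <- setdiff(mandatory, colnames(daily_weather_data))
if (length(missing_cols) > 0) {
  stop("Missing mandatory columns: ", paste(missing_cols, collapse = ", "))
}

# Imputation must happen beforehand, so no missing values should remain.
stopifnot(!anyNA(daily_weather_data))
```

Optional variables such as RH2M or daily_solar_radiation would simply be added as further columns with exactly the names listed above.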

info_environments

a data.frame with at least the first 4 columns listed below:

  1. year: numeric Year label of the environment

  2. location: character Name of the location

  3. longitude: numeric longitude of the environment

  4. latitude: numeric latitude of the environment

  5. planting.date: (optional) Date YYYY-MM-DD

  6. harvest.date: (optional) Date YYYY-MM-DD

  7. elevation: (optional) numeric

  8. IDenv: character ID of the environment (location x year)

The data.frame should contain as many rows as there are Year x Location combinations. Example: if only one location is evaluated across four years, 4 rows should be present.
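The one-location, four-year case described above could be set up as in the sketch below. The location name and coordinates are invented for the example; only the column names come from this page.

```r
# Hypothetical example: one location ("LocA") evaluated in four years,
# which gives four Year x Location combinations and therefore four rows.
info_environments <- data.frame(
  year      = as.numeric(2018:2021),
  location  = "LocA",
  longitude = 9.93,    # invented coordinates
  latitude  = 51.53,
  stringsAsFactors = FALSE
)

# Environment ID written as Location_Year.
info_environments$IDenv <- paste(info_environments$location,
                                 info_environments$year,
                                 sep = "_")

# Sanity check: one row per unique Year x Location combination.
n_combinations <- nrow(unique(info_environments[, c("year", "location")]))
stopifnot(nrow(info_environments) == n_combinations)
```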

path_flagged_values

Path where the file with flagged values to check is saved (flagged values are not removed from the data, only indicated in the output file).

et0

Whether evapotranspiration should be calculated. Default is FALSE.

Value

daily_weather_data: a data.frame after quality check, with the same columns as before the QC.
Vapor pressure deficit is calculated if T2M_MIN, T2M_MAX, and either both RH2M_MIN and RH2M_MAX or only RH2M are provided.
Evapotranspiration (et0) is calculated if requested (et0 = TRUE).
The function checks for multiple daily observations within the same IDenv.
Warning messages are also issued if some observations fail the range test, the persistence test or the internal consistency test. A data.frame is provided as output, in which dubious values are signaled by the column "flagged", with the corresponding explanation in the column "reason". No flagged value is set to missing or transformed; we therefore strongly recommend that the user review the daily weather data provided and correct any dubious values indicated by the output of this function.
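To make the kinds of checks named above more concrete, here is a rough sketch in base R of a range test, an internal consistency test, a persistence test and a duplicate check on toy data. It is not the package's implementation: the thresholds (temperatures outside -60 to 60 °C, three identical consecutive values) and all data values are assumptions chosen for illustration. Only the column names "flagged" and "reason" are taken from the description above.

```r
# Toy data (invented values) for a single environment; the last row repeats
# the date of the first row to mimic a duplicated daily observation.
w <- data.frame(
  IDenv    = "LocA_2020",
  YYYYMMDD = as.Date("2020-05-01") + c(0, 1, 2, 3, 4, 5, 6, 0),
  T2M      = c(12.1, 13.4, 11.0, 75.0, 14.0, 14.0, 14.0, 12.5),
  T2M_MIN  = c( 7.0,  8.2, 12.5,  9.1,  9.0,  9.3,  9.5,  7.5),
  T2M_MAX  = c(17.3, 18.9, 16.4, 19.5, 19.0, 19.2, 19.8, 17.0)
)

# Range test: assumed plausible bounds for daily mean temperature (deg C).
range_fail <- w$T2M < -60 | w$T2M > 60

# Internal consistency test: minimum <= mean <= maximum must hold.
consistency_fail <- w$T2M_MIN > w$T2M | w$T2M > w$T2M_MAX

# Persistence test: the same value repeated on at least 3 consecutive rows
# (assumed threshold). With several environments, this would be applied
# within each IDenv separately.
runs <- rle(w$T2M)
persistence_fail <- rep(runs$lengths >= 3, times = runs$lengths)

# Multiple daily observations for the same environment and date.
duplicate_obs <- duplicated(w[, c("IDenv", "YYYYMMDD")])

# Values are only flagged, never removed or modified.
w$flagged <- range_fail | consistency_fail | persistence_fail | duplicate_obs

# Only the first failing test is recorded as the reason in this sketch.
w$reason <- ifelse(range_fail, "range test",
            ifelse(consistency_fail, "internal consistency test",
            ifelse(persistence_fail, "persistence test",
            ifelse(duplicate_obs, "duplicated daily observation",
                   NA_character_))))

flagged_values <- w[w$flagged, ]
print(flagged_values)
```

In this sketch the original temperature columns are left unchanged, which mirrors the behavior described above: the user decides how to correct the flagged rows.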

If solar radiation or wind data are not provided by the user, or are provided with missing values, they are automatically retrieved from NASA. As for any other weather variable used in this function, these data cannot be only partially provided (no missing values are accepted).
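The vapor pressure deficit mentioned in this section can be derived from temperature and relative humidity in several ways. The sketch below uses the standard FAO-56 equations, the method covered by the reference listed below. Whether the package uses exactly these equations is an assumption, and the function names `sat_vp` and `vpd_fao56` are hypothetical helpers introduced only for this example.

```r
# Saturation vapor pressure (kPa) at air temperature temp_c (deg C), FAO-56.
sat_vp <- function(temp_c) {
  0.6108 * exp(17.27 * temp_c / (temp_c + 237.3))
}

# Vapor pressure deficit (kPa) from daily min/max temperature and either
# min + max relative humidity or mean relative humidity only (all in %).
vpd_fao56 <- function(tmin, tmax, rh_min = NULL, rh_max = NULL,
                      rh_mean = NULL) {
  # Mean saturation vapor pressure over the day.
  es <- (sat_vp(tmax) + sat_vp(tmin)) / 2

  if (!is.null(rh_min) && !is.null(rh_max)) {
    # Actual vapor pressure from RH2M_MIN and RH2M_MAX.
    ea <- (sat_vp(tmin) * rh_max / 100 + sat_vp(tmax) * rh_min / 100) / 2
  } else if (!is.null(rh_mean)) {
    # Fallback when only mean relative humidity (RH2M) is available.
    ea <- rh_mean / 100 * es
  } else {
    stop("Provide either rh_min and rh_max, or rh_mean.")
  }

  es - ea
}

# Usage with invented values for two days.
vpd_minmax <- vpd_fao56(tmin = c(7.0, 8.2), tmax = c(17.3, 18.9),
                        rh_min = c(45, 50), rh_max = c(90, 95))
vpd_mean   <- vpd_fao56(tmin = c(7.0, 8.2), tmax = c(17.3, 18.9),
                        rh_mean = c(68, 72))
```

Both variants are vectorized, so they can be applied directly to the T2M_MIN, T2M_MAX and RH2M columns of a daily weather data.frame.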

References

Zotarelli L, Dukes MD, Romero CC, Migliaccio KW, Morgan KT (2010). “Step by step calculation of the Penman-Monteith Evapotranspiration (FAO-56 Method).” Institute of Food and Agricultural Sciences. University of Florida.

Author

Cathy C. Westhues cathy.jubin@uni-goettingen.de