Codes to reproduce the paper "How unprecedented was the February 2021 Texas cold snap?" by James Doss-Gollin, David J. Farnham, Upmanu Lall, and Vijay Modi
Welcome to the code repository for the paper "How unprecedented was the February 2021 Texas cold snap?" by James Doss-Gollin (Rice), David J. Farnham (Carnegie Institute for Science), Upmanu Lall (Columbia), and Vijay Modi (Columbia).
Note: Some edits to this repository have been made since the paper was published (to add analysis of summer extremes and to run for more years). For the exact version used to generate our published results, a permanent repository is available on Zenodo.
How to cite
This paper is available OPEN ACCESS in the journal Environmental Research Letters. To cite our results and/or methods, please use a citation like:
@article{doss-gollin_txtreme:2021,
title = {How Unprecedented Was the {{February}} 2021 {{Texas}} Cold Snap?},
author = {{Doss-Gollin}, James and Farnham, David J. and Lall, Upmanu and Modi, Vijay},
year = {2021},
issn = {1748-9326},
doi = {10.1088/1748-9326/ac0278},
journal = {Environmental Research Letters},
}
Several summaries of this work are available. For a high-level overview, we suggest this Twitter thread by James Doss-Gollin, a summary by Rice University, or a Columbia Earth Institute blog post by all authors. You can also view the poster summarizing this work included in this repository.
For researchers
We strive to make our work accessible to an inquiring public and to the scientific community. Thus we use only publicly available data sets. All code is posted on this repository.
The following sections outline the steps you can take to examine and reproduce our work.
Repository organization
- `LICENSE` describes the terms of the GNU GENERAL PUBLIC LICENSE under which our code is licensed. This is a "free, copyleft license".
- `README.md` is what you are looking at.
- `Snakefile` implements our workflow (see the minimal rule sketch after this list). From the Snakemake documentation: "The Snakemake workflow management system is a tool to create reproducible and scalable data analyses. Workflows are described via a human readable, Python based language."
- `codebase/` provides a Python package that various scripts use. Modules are provided to read the GHCN data (`read_ghcn`), to keep track of the directory path structure (`path`), to parse the input data sources (`data`), to perform common calculations (`calc`), and to add features for interacting with figures (`fig`).
- `data/` contains only raw inputs, stored in `data/raw`. If you reproduce our codes (see instructions below), approximately 60 GB of data will be downloaded to `data/processed`.
- `doc/` contains LaTeX files for our paper submissions. You shouldn't need to compile these because our paper is available open access, but you're welcome to browse. One useful thing you might do here is to identify which figure (in `fig/`) is used in the text!
- `environment.yml` specifies the conda environment used. This should be sufficient to install all required packages for reproducibility. In case you run into issues, `conda.txt` specifies the exact versions of all packages used.
- `fig/` contains final versions of all our figures in both vector graphic (`.pdf`) and image (`.jpg`) formats. All are generated by our source code except for `EGOVA.pdf`, which is generated from the EGOVA tool produced by Edgar Virgüez. If you want to use these figures, you are responsible for complying with the policies of Environmental Research Letters, but we mainly ask that you cite our paper when you do so.
- `scripts/` contains the Python scripts used to get and parse raw data, process data, and produce outputs. All figures are produced within Jupyter notebooks; they are the last step of the analysis. Please note that many scripts import modules from `codebase`, the internal package described above.
- `setup.py` makes `codebase` available for installation.
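To give a sense of how the workflow is wired together, here is a minimal sketch of a Snakemake rule in the spirit of the `Snakefile`. The rule name and output path are hypothetical (chosen for illustration), but the shell command is taken verbatim from the Code Snippets listed at the end of this README.

```python
# Minimal sketch of a Snakemake rule; the rule name and output path are
# illustrative, not necessarily the exact ones used in the Snakefile.
rule download_ghcn_stations:
    output: "data/raw/ghcnd-stations.txt"
    shell: "wget -O {output} https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt"
```

When you run `snakemake`, it builds a dependency graph from rules like this one and executes only the steps whose outputs are missing or out of date.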
To browse the codes
If you want to browse our code, this section is for you.
You will find four Jupyter notebooks in `scripts/`. You can open them and they will render in GitHub. This will show you how we produced all figures in our paper, along with some additional commentary.
If you want to dig deeper, but not to run our codes, then you may want to look at the Python scripts in `scripts/` and/or the module in `codebase/`.
To run the codes
If you want to reproduce or modify our results, this section is for you.
Please note: running this will require approximately 60 GB of disk space. All commands here assume a standard UNIX terminal; Windows may be subtly different.
First, `git clone` the repository to your machine.
Next, you will need to install conda (we recommend miniconda) and `wget`.
Next, you need to create the conda environment:
conda env create --file environment.yml
If this gives you any trouble, you can use the exact package versions that we did (this worked on an Apple M1 MacBook emulating osx-64, but your mileage may vary on other systems):
conda create --name txtreme --file conda.txt
Once you have created the environment, then activate it:
conda activate txtreme
You will also need to install our custom module in `codebase`:
pip install -e .
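As an optional sanity check (not part of the original instructions), you can confirm that the package is importable:

```bash
# should exit silently if codebase was installed correctly
python -c "import codebase"
```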
In order to run, you will need to do two things to access required data.
1. Download the GPWV4 data. See instructions in `data/raw/gpwv4/README.md`.
2. Register for a CDS API key with the ECMWF. This key is required to access the ERA-5 reanalysis data; if you do not properly install it, you will not be able to download that data. A sketch of the credentials file follows this list.
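For the second step, the `cdsapi` client reads its credentials from a `~/.cdsapirc` file. A minimal sketch is below; the UID/key values are placeholders, and you should copy the exact `url` and `key` lines shown on your CDS (Copernicus Climate Data Store) account page, since the endpoint may change over time.

```
url: https://cds.climate.copernicus.eu/api/v2
key: <UID>:<API-KEY>
```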
Now you can run:
snakemake --cores <some number>
where `<some number>` specifies the number of cores to use (if you have no idea what this means, try 3: `snakemake --cores 3`).
We again remind you that running will use nearly 60 GB of disk space; a fast internet connection will be helpful.
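One suggestion not in the original instructions: before launching the full workflow, you can ask Snakemake for a dry run, which lists the jobs it would execute without downloading or computing anything:

```bash
# preview the workflow; nothing is downloaded or written
snakemake --cores 3 --dry-run
```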
Issues and comments
- If you have issues related to the software, please raise an issue in the Issues tab.
- If you have comments, please contact the corresponding author, James Doss-Gollin, directly.
Code Snippets
These are the shell directives from the rules in the `Snakefile`:

```
shell: "wget -O {output} https://www.eia.gov/electricity/data/eia860m/archive/xls/november_generator2020.xlsx"
shell: "python {input.script} --year {wildcards.year} -o {output}"
shell: "python {input.script} --year {wildcards.year} -o {output}"
shell: "wget -O {output} http://berkeleyearth.lbl.gov/auto/Global/Gridded/Complete_{wildcards.var}_Daily_LatLong1_{wildcards.decade}.nc"
shell: "wget -O {output} https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd-stations.txt"
shell: "wget https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/ghcnd_all.tar.gz -O - | tar -xz -C data/raw"
shell: "python {input.script} -i {input.infiles} -o {output}"
shell: "python {input.script} -i {input.infiles} -o {output}"
shell: "python {input.script} --population {input.population} --temperature {input.temperature} -o {output}"
shell: "python {input.script} --boundary {input.interconnect} --hdd {input.hdd} -o {output}"
shell: "python {input.script} -i {input.infiles} -o {output}"
shell: "python {input.script} -i {input.files} -o {output}"
shell: "python {input.script} --tmin {input.tmin} --tmax {input.tmax} -o {output}"
shell: "python {input.script} -i {input.stations} -o {output}"
shell: "python {input.script} -i {input.stations} -o {output}"
```
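If the placeholders above look unfamiliar: Snakemake fills `{input...}`, `{output}`, and `{wildcards...}` in from each rule's declarations before running the shell command. Purely for illustration (the script and file names below are hypothetical, not the actual paths in this repository), the second snippet might expand to something like:

```bash
# hypothetical expansion of: python {input.script} --year {wildcards.year} -o {output}
python scripts/download_era5.py --year 2020 -o data/raw/era5/temp_2020.nc
```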