Characterizing the neurogenomics of parental care in the rock dove

public public 1yr ago Version: v0.3 0 bookmarks

Note: Binder not working :(

To quickly explore the data in an internet browser, use our R Shiny app .

The aims of the research associated with this GitHub repository reree-fold:

  1. produce a high quality and reproducible transcriptional characterization of the HPG over reproduction and parental care in male and female rock doves. We were specifically interested in understanding how gene activity in the HPG changed as individuals transitioned from a sexually mature, non-reproductive state into reproduction and parenting.

  2. characterize the major patterns of transcriptional changes between the tissues, sexes, and parental stages.

  3. identify the drivers underlying changes in transcription. Specifically, we tested whether external or internal factors were regulating gene activity.

Methods

These data anlyseses are conducted in R and the workflow is automated via Snakemake. The order of operations are descriped in the Snakefile .

The data, scripts, and resutls are oranzied in the following manner:

  • R : contains custom functions and themes that are used by multiple anlaysis files

  • analysis : contains .R scripts that are executed by snakemake live

  • figures : contains images and illustrations

  • metadata : contains files with biological descriptions of samples or genes

  • results : contains outputs

Figures

High quality PDFs of all figures are available here

Figure 1 Experimental design. A) We investigated nine timepoints that spanned the majority of reproductive efforts in this species. These time-points consisted of a "control" or non-parenting state (from MacManes et al 2016 and Calisi et al. 2017), nest-building, clutch initiation or the onset of incubation, clutch completion and early incubation, mid-incubation, late incubation, initiation of nestling care, early nestling care, and mid-nestling care. To teste whether external or internal factors were regulating gene activity, we also conducted a series of offspring removal (B) and replacement (C) manipulations to test whether transcriptional changes were dependent upon offspring presence or were regulated by an internal clock. (D) We used t-Distributed Stochastic Neighbor Embedding (t-SNE) to reduce the dimensionality of the transcriptomes from the hypothalamus, pituitary, and gonads of male and female rock. Circles and triangles represent female and male samples, respectively, and points are colored by treatment.

Figure 2

Figure 2. Hypothesis-driven and data-driven approaches to identifying changes in gene expression. Box-and-whisker plots show changes in gene expression of serotonin receptor ( HTR2C ) in the hypothalamus (A) , prolactin ( PRL ) in the pituitary (B) , and estrogen receptor 1 ( ESR1 ) in the gonads.Bar and statistics are shown for one contrast of interest between treatment groups. Boxes are colored by treatment. Volcano plots show the log-fold change (LFC) and -log10(FDR) for all genes that are significantly different differentially between the aforementioned treatment groups. Boxes are colored by the treatment were gene expression was higher.

Figure 3

Figure 3. Summary of differentially expressed genes. Bar plots show the total number of significantly differentially expressed genes for all two-way comparisons. Contrasts are made relative to the non-breeding controls, the nest-building group, the previous stage, or the internal or external control group. Bars are colored by the treatment where gene expression was higher. A negative number of differentially expressed genes means that many genes had increase expression in the reference group.

Tables

Related documentation

Code Snippets

  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
library(tidyverse)

source("R/themes.R")

## import Kallisto transcript data, make gene info file 

# import count data, set rows at entreziz

print("reading kallisto data")
kallistodata <- read.table("results/kallistocounts.txt", 
                           sep = ",", row.names = NULL) %>%
  dplyr::rename("NCBI" = "entrezid", "gene" = "Name") %>%
  select(-row.names) %>% 
  filter(gene != "") %>%  # rm sample without valid identifier
  as_tibble()
head(kallistodata)

## save gene info
geneinfo <- kallistodata %>%
  select(gene, geneid, NCBI)

# for example, these gene have 2 and 3 isoforms
geneinfo %>% filter(gene == "GRIN1")
geneinfo %>% filter(gene == "CACNA1C")

## check for genes that have multple transcripts expressed

print("creating isoforms df")
isoforms <- kallistodata %>%
  group_by(gene) %>%
  summarize(n = n()) %>%
  filter(n > 1) %>%
  group_by(n) %>% 
  summarize(genes = str_c(gene, collapse = ", ")) %>%
  arrange(desc(n)) %>%
  rename("n(isoforms)" = "n") %>% 
  mutate("n(counts)" = str_count(genes, ",") + 1)
head(isoforms)


# aggregate transcript counts to gene counts
print("aggregating count data")

countData <- kallistodata %>% 
  select(-geneid, -NCBI) %>% 
  pivot_longer(-gene, names_to = "samples", values_to = "counts") %>%  
  pivot_wider(
    names_from = samples, 
    values_from = counts,
    values_fn = list(counts = sum))  %>% 
  arrange(gene) %>% 
  select(-contains("R2XR"))  #remove duplicate samples
countData <- as.data.frame(countData)
row.names(countData) <- countData$gene ## make gene the row name
countData[1] <- NULL ## make gene the row name
countData <- round(countData) #round all value for deseq2
head(countData[1:2])

## remove duplicate bird
names(countData)

# print tolal num of genes and samples
dim(countData)

## wrangle colData 

print("creating colData file")

colData <- read.table(file.path( "metadata/kallistosamples.txt"),
                      header = F, stringsAsFactors = F) %>%
  # use strsplit to cut the filename into meaningful columns
  mutate(bird = sapply(strsplit(V1,'\\_'), "[", 1),
         sex = sapply(strsplit(V1,'\\_'), "[", 2),
         tissue = sapply(strsplit(V1,'\\_'), "[", 3),
         temp = sapply(strsplit(V1,'\\_'), "[", 4)) %>%
  mutate(treatmenttemp = sapply(strsplit(temp,'\\.'), "[", 1),
         NYNO = sapply(strsplit(temp,'\\.'), "[", 2)) %>%

  # rename variables
  mutate(treatment = ifelse(grepl("extend-hatch", treatmenttemp), "extend",
                            ifelse(grepl("inc-prolong", treatmenttemp), "prolong",
                                   ifelse(grepl("m.hatch", treatmenttemp), "m.n2",
                                          ifelse(grepl("m.inc.d8", treatmenttemp), "early",
                                                 treatmenttemp))))) %>%
  select(-temp, -NYNO, -treatmenttemp ) %>%
  # replace dashes with periods (to make deseq happy)
  mutate(bird = gsub("-", ".", bird),
         treatment = gsub("-", ".", treatment),
         V1 = gsub("-", ".", V1)) %>%

  ## fix sample assignments

  mutate(sex = ifelse(grepl("blu119.w84.x", bird), "male", sex),
         treatment = ifelse(grepl("y128.g23.x|blu108.w40.o158|x.blu43.g132|r37.w100.x", 
                                  bird), "m.inc.d9", treatment)) %>%
  filter(bird != "r.r.x.ATLAS.R2XR") %>%

  mutate(bird = tolower(bird),
         sex = factor(sex),
         tissue = factor(tissue, levels = c("hypothalamus", "pituitary", "gonad"))) %>% 
  mutate(tissue = fct_recode(tissue, "gonads" = "gonad"),
         treatment = factor(treatment, levels = alllevels)) %>%

  ## add external and internalhypothesis
  mutate(external = fct_collapse(treatment,
                                 eggs = c("lay", "inc.d3", "inc.d9", "inc.d17", "prolong"),
                                 chicks = c("hatch", "n5", "n9", "extend" , "early" ),
                                 nest = c("bldg", "m.inc.d3",  "m.inc.d9",  "m.inc.d17",  "m.n2"),
                                 control = c("control")),
         internal = fct_collapse(treatment,
                                 early = c("lay", "inc.d3", "bldg", "m.inc.d3",  "m.inc.d9",
                                           "inc.d9",  "early"),
                                 late = c("hatch", "n5", "n9", "extend" ,  "prolong", "inc.d17",
                                          "m.inc.d17",  "m.n2"),
                                 control = c("control"))) %>%
  mutate(group = paste(sex, tissue, treatment, sep = "."),
         study = ifelse(grepl("m.|extend|prolong", treatment), 
                        "manipulation", "charcterization")) %>%
  mutate(study = factor(study, levels = c("charcterization", "manipulation")))
str(colData)
head(colData)

## check that rownames and colnames match for DESeq
ncol(countData) == nrow(colData)

## summarize data

colData %>% select(sex, tissue, treatment, study)  %>%  summary()
table(colData$sex, colData$treatment, colData$tissue)

colData %>%
  count(sex, tissue, treatment, sort = TRUE)

length(unique(colData$bird))
table(colData$sex, colData$tissue)


## save bird data to join with 

print("creating sample file for future hamonoization")

samples <- read_csv("metadata/00_birds_sachecked.csv") %>%
  select(-treatment, -sex) %>%
  mutate(bird = tolower(bird)) %>%
  full_join(., colData, by = c("bird")) %>%
  dplyr::rename("rnaseq.id" = "V1") %>%
  select(bird, horm.id, rnaseq.id, sex, tissue, treatment, external, internal, group) %>%
  distinct()  %>%
  filter(rnaseq.id != "NA")
head(samples)

## save files for downstream use

print("saving files")

write.csv(countData, "results/00_countData.csv")
write.csv(colData, "metadata/00_colData.csv")
write.csv(samples, "metadata/00_samples.csv")
write.csv(geneinfo, "metadata/00_geneinfo.csv", row.names = TRUE)
write.csv(isoforms, "results/00_isoforms.csv", row.names = TRUE)
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
library(tidyverse)
library(cowplot)
library(Rtsne)

source("R/themes.R")
source("R/functions.R")

#  Experimental design, tSNE analysis figure

countData <- read_csv("results/00_countData.csv") %>%
  column_to_rownames(var = "X1")
countData <- as.data.frame(t(countData))
head(countData[1:3])

colData <- read_csv("metadata/00_colData.csv") %>%
  mutate(treatment = factor(treatment, levels = alllevels),
         tissue = factor(tissue, levels = tissuelevels)) %>% 
  column_to_rownames(var = "V1")  %>% select(-X1)
head(colData)

# check ready for analysis
# row.names(countData) == row.names(colData)
head(row.names(countData) == row.names(colData))

## tsne

# prep for tsne useing count data and the custom `subsetmaketsne` function
hyptsne <- subsetmaketsne("hypothalamus", alllevelswithmanip, sexlevels)
pittsne <- subsetmaketsne("pituitary", alllevelswithmanip, sexlevels)
gontsne <- subsetmaketsne("gonads", alllevelswithmanip, sexlevels)

hyptsnechar <- subsetmaketsne("hypothalamus", charlevels, sexlevels)
pittsnechar <- subsetmaketsne("pituitary", charlevels, sexlevels)
gontsnechar <- subsetmaketsne("gonads", charlevels, sexlevels)


## Figure 1


ab <- png::readPNG("figures/images/fig_fig1a.png")
ab <- ggdraw() +  draw_image(ab, scale = 1)


f <- plottsne(hyptsnechar, hyptsnechar$treatment, allcolors) + 
  labs(subtitle = "hypothalamus", y = "Characterization\ntSNE2") 
g <- plottsne(pittsnechar, pittsnechar$treatment, allcolors ) + 
  labs(subtitle = "pituitary") 
theme(strip.text = element_blank())
h <- plottsne(gontsnechar, gontsnechar$treatment, allcolors ) + 
  labs(subtitle = "gonads") 

fgh <- plot_grid(f,g,h, ncol = 3, 
                 label_size = 8,
                 labels = c("C", "D", "E"),
                 rel_widths = c(1.2,1,1))



c <- plottsne(hyptsne, hyptsne$treatment, allcolors) + 
  labs(subtitle = "hypothalamus", y = "Manipulation\ntSNE2") 
d <- plottsne(pittsne, pittsne$treatment, allcolors ) + 
  labs(subtitle = "pituitary") 
theme(strip.text = element_blank())
e <- plottsne(gontsne, gontsne$treatment, allcolors ) + 
  labs(subtitle = "gonads") 

cde <- plot_grid(c,d,e, ncol = 3, 
                 labels = c("F", "G", "H"), 
                label_size = 8,
                rel_widths = c(1.2,1,1))



fig1 <- plot_grid(ab, fgh, cde, nrow = 3, rel_heights = c(2,1,1))
fig1

pdf(file="figures/fig1-1.pdf", width=7, height=6)
plot(fig1)
dev.off()

png("figures/fig1-1.png", width = 7, height = 6, 
    units = 'in', res = 300)
plot(fig1) 
dev.off()


sessionInfo()
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
library(tidyverse)
library(DESeq2)
library(BiocParallel)
register(MulticoreParam(6))

source("R/functions.R")  # load custom functions 
source("R/themes.R")  # load custom themes and color palletes
source("R/genelists.R") # load candidate genes

# DESeq2 was not designed to run on 100 + samples. But, I really like it, so I do it anyways. 
# Some of these commands take like 15 min to run using 6 cores. 

# count data
countData <- read_csv("results/00_countData.csv") %>%
  column_to_rownames(var = "X1")

#### uncomment this to subset the data for quick analysis
#print("subset for quick run")
#countData  <- head(countData, 1550)

# col data or variable informaiton
colData <- read.csv("metadata/00_colData.csv", header = T, row.names = 1) %>%
  mutate(tissue = factor(tissue, levels = tissuelevels),
         treatment <- factor(treatment, levels = alllevels)) %>%
  mutate(sextissue = as.factor(paste(sex, tissue, sep = "_"))) 
colData <- colData %>% drop_na()
head(colData)
dim(colData)
# 984 samples, 11 variables

print(ncol(countData) == nrow(colData))

head(names(countData))
head(row.names(colData))

savealltheDEGs <- function(whichdds, whichgroup){

  createDEGdfs(whichdds, whichgroup, "bldg", "control")
  createDEGdfs(whichdds, whichgroup, references, charlevelsnocontrol)

  # sequential
  createDEGdfs(whichdds, whichgroup, "lay", "inc.d3")
  createDEGdfs(whichdds, whichgroup, "inc.d3", "inc.d9")
  createDEGdfs(whichdds, whichgroup, "inc.d9", "inc.d17")
  createDEGdfs(whichdds, whichgroup, "inc.d17", "hatch")
  createDEGdfs(whichdds, whichgroup, "hatch", "n5")
  createDEGdfs(whichdds, whichgroup, "n5", "n9")

  # removal
  createDEGdfs(whichdds, whichgroup, "inc.d3", "m.inc.d3")
  createDEGdfs(whichdds, whichgroup, "inc.d9", "m.inc.d9")
  createDEGdfs(whichdds, whichgroup, "inc.d17", "m.inc.d17")
  createDEGdfs(whichdds, whichgroup, "hatch", "m.n2")

  # replacement
  createDEGdfs(whichdds, whichgroup, manipcontrols, replacements)
}

######## differential gene expression


ddsHf <- returndds2(c("female_hypothalamus"))
savealltheDEGs(ddsHf, "female_hypothalamus")

ddsHm <- returndds2(c("male_hypothalamus"))
savealltheDEGs(ddsHm, "male_hypothalamus")


ddsGm <- returndds2(c("male_gonads"))
savealltheDEGs(ddsGm, "male_gonads")

ddsGf <- returndds2(c("female_gonads"))
savealltheDEGs(ddsGf, "female_gonads")

ddsPf <- returndds2(c("female_pituitary"))
savealltheDEGs(ddsPf, "female_pituitary")

ddsPm <- returndds2(c("male_pituitary"))
savealltheDEGs(ddsPm, "male_pituitary")
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
library(tidyverse)
library(corrr)

samples <- read_csv("metadata/00_birds_sachecked.csv") %>%
  mutate(id = bird) %>%
  select(id, horm.id)
head(samples)

hormones <- read_csv("results/AllHorm_02042020_nomiss.csv") %>%
  mutate(id = str_replace_all(id, c("-" = ".", "/" = "."))) %>%
  mutate(prl = as.numeric(prl),
         cort = as.numeric(cort),
         p4 = as.numeric(p4),
         e2 = as.numeric(e2),
         t = as.numeric(t)) %>%
  full_join(., samples) %>%
  select(id, treatment, sex, prl, cort, p4, e2, t) %>%
  filter(treatment != "NA", treatment != ".")
tail(hormones)

femalegenesnhomrmones <- function(vsds){
  df <- vsds %>% 
    inner_join(hormones, ., by = "id") %>%
    filter(sex == "f") %>%
    select(-t) 
  print(head(df))
  return(df)
}

malegenesnhomrmones <- function(vsds){
  df <-  vsds %>% 
    inner_join(hormones, ., by = "id") %>%
    filter(sex == "m") %>%
    select(-e2)
  print(head(df))
  return(df)
}

pitvsdAll <- read_csv("results/03_pitvsdAll.csv")
hypvsdAll <- read_csv("results/03_hypvsdAll.csv")
gonvsdAll <- read_csv("results/03_gonvsdAll.csv")

pitf <- femalegenesnhomrmones(pitvsdAll)
gonf <- femalegenesnhomrmones(hypvsdAll)
hypf <- femalegenesnhomrmones(gonvsdAll) 

pitm <- malegenesnhomrmones(pitvsdAll)
gonm <- malegenesnhomrmones(hypvsdAll)
hypm <- malegenesnhomrmones(gonvsdAll) 




plotcorrelation <- function(df, xvar, yvar, xlab, ylab){
  df %>%
    ggplot(aes(x = xvar, y = yvar, color = sex)) +
    geom_point() +
    geom_smooth(method = "lm") +
    scale_y_log10() +
    scale_x_log10() +
    labs(x = xlab, y = ylab) 
}

plotcorrelation(pitm,
                pitm$prl, pitm$PRL,
                "circulating PRL", "PRL expression")

plotcorrelation(pitf,
                pitf$prl, pitf$PRL,
                "circulating PRL", "PRL expression")

calculatecorrs <- function(df){
  df2 <- df %>%
    select(-id, -sex, -treatment) %>%
    correlate(.)
  print(head(df2)[1:6])
  return(df2)
}

pitfcorrs <- calculatecorrs(pitf)
pitmcorrs <- calculatecorrs(pitm)
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
library(tidyverse)
library(readr)

source("R/themes.R")
source("R/genelists.R")

## All DEGs 

DEG_path <- "results/DEseq2/treatment/"   # path to the data
DEG_files <- dir(DEG_path, pattern = "*DEGs") # get file names
DEG_pathfiles <- paste0(DEG_path, DEG_files)
#DEG_files

allDEG <- DEG_pathfiles %>%
  setNames(nm = .) %>% 
  map_df(~read_csv(.x), .id = "file_name") %>% 
  mutate(DEG = sapply(strsplit(as.character(file_name),'results/DEseq2/treatment/'), "[", 2))  %>% 
  mutate(DEG = sapply(strsplit(as.character(DEG),'_diffexp.csv'), "[", 1))  %>% 
  mutate(tissue = sapply(strsplit(as.character(DEG),'\\.'), "[", 1)) %>%
  mutate(down = sapply(strsplit(as.character(DEG),'\\_'), "[", 3)) %>%
  mutate(up = sapply(strsplit(as.character(DEG),'\\_'), "[", 4)) %>%
  mutate(comparison = paste(down,up, sep = "_")) %>%
  mutate(sex = sapply(strsplit(as.character(sextissue),'\\_'), "[", 1)) %>%
  mutate(tissue = sapply(strsplit(as.character(sextissue),'\\_'), "[", 2)) %>%
  dplyr::select(sex,tissue,comparison, gene, lfc, 
                padj, direction, pvalue, direction2,  logpadj)  %>%
  mutate(posneg = ifelse(lfc >= 0, "+", "-")) %>%
  drop_na()
head(allDEG)


## drop pvalues from allDEG

allDEG <- allDEG %>%
  #remove pvalue
  select(-pvalue, -direction2) %>% 
  filter(direction != "NS")

# make df of candidate genes to add NS genes
gene <- candidategenes
candidategenedf <- as.data.frame(gene) 

## subset only candidates
candidateDEGs <- allDEG %>%
  mutate(sex = recode(sex, "female" = "F", "male" = "M" ),
         tissue = recode(tissue, 
                         "hypothalamus" = "H",
                         "pituitary" = "P",
                         "gonad" = "G",
                         "gonads" = "G"),
         group = paste(sex, tissue, sep = "")) %>%
  filter(gene %in% candidategenes) %>%
  mutate(compres = paste(group, posneg, sep = "")) %>%
  group_by(gene, comparison) %>%
  summarize(results = str_c(compres, collapse = "  ")) %>%
  pivot_wider(names_from = comparison, values_from = results) %>%
  full_join(., candidategenedf) %>%
  arrange(gene)
head(candidateDEGs)

# make tables


# helper function to add column of NAs if no DEGs exist
fncols <- function(data, cname) {
  add <-cname[!cname%in%names(data)]
  if(length(add)!=0) data[add] <- NA
  data
}

# make table of candidate pairwise degs
makefinaltables <- function(df, whichlevels){
  table <- fncols(df, whichlevels) %>% 
    select(gene, whichlevels)  
  return(table)
}

table1 <- makefinaltables(candidateDEGs, levelsbldgchar) %>%
  select(gene, bldg_control:bldg_n9)
table2 <- makefinaltables(candidateDEGs, c(levelsrm, levelsreplace))

tableS1 <- makefinaltables(candidateDEGs, levelscontrolcharmanip) %>%
  select(gene, bldg_control:control_n9)
tableS2 <- makefinaltables(candidateDEGs, levelssequential) 
tableS3 <- makefinaltables(candidateDEGs, levelscontrolcharmanip) %>%
  select(gene, control_m.inc.d3:control_extend)


## save files
write.csv(allDEG, "results/04_allDEG.csv", row.names = F)
write.csv(allDEG, "results/04_candidateDEGs.csv", row.names = F)

write_csv(table1, "results/table1.csv")
write_csv(table2, "results/table2.csv")

write_csv(tableS1, "results/tableS1.csv")
write_csv(tableS2, "results/tableS2.csv")
write_csv(tableS3, "results/tableS3.csv")

## sex differences

hypsex <- read_csv("results/DEseq2/sex/pituitary_female_male_DEGs.csv") %>%
  rename(tissue = sextissue)
pitsex <- read_csv("results/DEseq2/sex/hypothalamus_female_male_DEGs.csv") %>%
  rename(tissue = sextissue)
gonsex <- read_csv("results/DEseq2/sex/gonad_female_male_DEGs.csv")

sharedsexdifs <- rbind(hypsex, pitsex) %>%
  mutate(res = paste("higher in the", direction, tissue, sep = " "))  %>%
  filter(gene %in% candidategenes) %>%
  select(gene, res) %>%
  group_by(gene) %>%
  summarize(res = str_c(res, collapse = "; ")) %>%
  group_by(res) %>%
  summarize(genes = str_c(gene, collapse = " ")) %>%
  mutate(n = str_count(genes, " "),
         n = n + 1) %>%
  select(n, res, genes) %>%
  arrange(desc(n))

head(sharedsexdifs) 
write.csv(sharedsexdifs, "results/sharedsexdifs.csv")

head(allDEG)

allDEG %>%
  filter(tissue == "hypothalamus") %>%
  filter(comparison %in% c("bldg_control", "") )
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
library(tidyverse)
library(cowplot)
library(ggsignif)
library(stringr)

source("R/themes.R")
source("R/functions.R")

###### variance statbilized data

hypf <- wranglevsds("results/03_hypvsdf.csv")
hypm <- wranglevsds("results/03_hypvsdm.csv")
pitf <- wranglevsds("results/03_pitvsdf.csv")
pitm <- wranglevsds("results/03_pitvsdm.csv")
gonf <- wranglevsds("results/03_gonvsdf.csv")
gonm <- wranglevsds("results/03_gonvsdm.csv")

hyp <- rbind(hypf, hypm)
pit <- rbind(pitf, pitm)
gon <- rbind(gonf, gonm)

allDEG <- read_csv("results/04_allDEG.csv")

###### DEGs


filterDEGs <- function(whichlevels){  
  df <- allDEG %>% 
    filter(comparison %in% whichlevels) %>%
    droplevels() %>%
    mutate(comparison = factor(comparison, levels = whichlevels)) %>%
    mutate(updown = ifelse(lfc > 0, 1, -1)) %>%
    group_by(sex, tissue, comparison, direction, updown, .drop=FALSE) %>%
    summarise(n = n()) %>%
    mutate(n = n*updown ) %>%
    select(-updown) %>%
    dplyr::mutate(n = replace_na(n, 0))
  print(head(df))
  return(df)
}

##  figure 2 DEGs
fig2Acomps <- c("bldg_inc.d17", "inc.d9_inc.d17" )
fig2Dcomps <- c("inc.d17_m.inc.d17", "inc.d17_prolong" )

fig2Alabels <- c("inc.d9_inc.d17 vs.\ninc.d9", "inc.d9 vs.\ninc.d17" )
fig2Dlabels <- c("inc.d17vs\nm.inc.d17", "inc.d17vs\nprolong" )


fig2Adegs <- filterDEGs(fig2Acomps)
fig2Ddegs <- filterDEGs(fig2Dcomps)


a <- plotcandidatechar(hyp, "AVP") + labs(subtitle = "Hypothalamus", x = " ", title = "Characterization") 
f <- plotcandidatemanip(hyp, "AVP") + labs(subtitle = "Hypothalamus",  x = " ", title = "Manipulation")

b <- plotcandidatechar(pit, "PRL") + labs(subtitle = "Pituitary")
g <- plotcandidatemanip(pit, "PRL") + labs(subtitle = "Pituitary")

c <- plotcandidatechar(gon, "ESR1") + labs(subtitle = "Gonads", x = " ")
h <- plotcandidatemanip(gon, "ESR1") + labs(subtitle = "Gonads", x = " ")


d <- plot.volcano("pituitary", "bldg_inc.d17") + labs(subtitle = "Pituitary")
e <- plot.volcano("pituitary", "inc.d9_inc.d17") + labs(subtitle = " ", y = NULL) 

e2 <- makebargraphv4(fig2Adegs, "pituitary", "No. of DEGs", 
                     fig2Acomps, fig2Alabels)  + 
  labs(x = "Comparison", subtitle = "Pituitary", title = " ")


i <- plot.volcano("pituitary", "inc.d17_m.inc.d17")  + labs(subtitle = "Pituitary")
j <- plot.volcano("pituitary", "inc.d17_prolong") + labs(subtitle = " ", y = NULL)

j2 <- makebargraphv4(fig2Ddegs, "pituitary", "No. of DEGs", 
                     fig2Dcomps, fig2Dlabels) + 
  labs(x = "Comparison", subtitle = "Pituitary", title = " ")

abcde <- plot_grid(a,b,c, nrow = 1, 
                   labels = c("A", "B", "C" ), label_size = 8,
                   rel_widths = c(1,1,1))
fghij <- plot_grid(f,g,h,nrow = 1, 
                   labels = c("D", "E", "F" ), label_size = 8,
                   rel_widths = c(1,1,1))

fig2 <- plot_grid(abcde,fghij, nrow  = 2)
fig2

png(file = "figures/fig2-1.png", width = 7, height = 7, 
    units = 'in', res = 300)
plot(fig2) 
dev.off()

pdf(file = "figures/fig2-1.pdf", width=7, height=7)
plot(fig2)
dev.off()

### fig 3 DEGs

DEGbldg <- filterDEGs(levelsbldgchar)
DEGsequential <- filterDEGs(levelssequential)
DEGinc9 <- filterDEGs(levelsinc9)
DEGinc17 <- filterDEGs(levelsinc17)
DEGhatch <- filterDEGs(levelshatch)
DEGearly <- filterDEGs(levelsearly) %>%
  mutate(tissue = factor(tissue, levels = tissuelevels),
         sex = factor(sex , levels = sexlevels))





makefig3 <- function(tissue, label1){

  tissuetitle <- str_to_sentence(tissue, locale = "en")

  ylab <- paste(tissuetitle, "\nNo. of DEGs", sep = "")

  #ylab <- "Genes with increased expression"


  a <-  plot.volcano(tissue, "bldg_control") + 
    labs(subtitle = tissuetitle) 

  b <- makebargraphv5(DEGbldg, tissue, ylab, 
                      levelsbldgchar, labelsbldgchar) +
    labs(x = "Relative to nest-building controls,", subtitle = " ") 

  c <- makebargraphv5(DEGsequential, tissue, NULL,  
                      levelssequential, labelsssequential) +
    labs(x = "Relative to the previous stage,", subtitle = " ") 

  d <- makebargraphv5(DEGinc9, tissue, NULL,  
                      levelsinc9, labelsinc9) +
    labs(x = "... to inc.d9,", subtitle = " ") 


  e <- makebargraphv5(DEGinc17, tissue, NULL,  
                      levelsinc17, labelsinc17) +
    labs(x = "... to inc.d17,", subtitle = " ") 

  f <- makebargraphv5(DEGhatch, tissue, NULL,  
                      levelshatch, labelshatch) +
    labs(x = "... to hatch,", subtitle = " ") 

   g <- makebargraphv5(DEGearly, tissue, NULL,  
                      levelsearly, labelsearly) +
    labs(x = "Relative to early.", subtitle = " ") 

  fig <- plot_grid(b,c,d, e,f, nrow = 1, rel_widths = c(7,5.5,2.5,2.5,3.5),
                      labels = label1, label_size = 8, align = "h")

  return(fig)
}

ab <- makefig3("hypothalamus", c("A"))
cd <- makefig3("pituitary", c("B"))
ef <- makefig3("gonads", c("C"))

fig3 <- plot_grid(ab,cd,ef, nrow = 3)
fig3


png(file = "figures/fig3-1.png", width = 7, height = 7, 
    units = 'in', res = 300)
plot(fig3) 
dev.off()

pdf(file = "figures/fig3-1.pdf", width=7, height=7)
plot(fig3)
dev.off()
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
library(cowplot)

a <- png::readPNG("figures/fig_S1.png")
a <- ggdraw() +  draw_image(a, scale = 1)


b <- png::readPNG("figures/fig_S2.png")
b <- ggdraw() +  draw_image(b, scale = 1)


fig <- plot_grid(a,b, ncol = 1, rel_heights = c(1,1),
          labels = c("A", "B"), label_size = 8)
fig

pdf(file="figures/figsup1-1.pdf", width=5, height=5)
plot(fig)
dev.off()

png("figures/figsup1-1.png", width = 5, height = 5, 
    units = 'in', res = 300)
plot(fig) 
dev.off()
8
9
shell:
  "Rscript analysis/00_wrangle.R"
17
18
shell:
  "Rscript analysis/01_fig1.R"  
28
29
shell:
  "Rscript analysis/02_DESeq2.R"
39
40
shell:
  "Rscript analysis/03_vsd.R"    
49
50
shell:
  "Rscript analysis/04_DEGs.R"
60
61
shell:
  "Rscript analysis/05_figs23.R" 
68
69
shell:
  "Rscript analysis/06_figsup1.R"     
ShowHide 9 more snippets with no or duplicated tags.

Login to post a comment if you would like to share your experience with this workflow.

Do you know this workflow well? If so, you can request seller status , and start supporting this workflow.

Free

Created: 1yr ago
Updated: 1yr ago
Maitainers: public
URL: https://github.com/macmanes-lab/DoveParentsRNAseq
Name: doveparentsrnaseq
Version: v0.3
Badge:
workflow icon

Insert copied code into your website to add a link to this workflow.

Downloaded: 0
Copyright: Public Domain
License: None
  • Future updates

Related Workflows

cellranger-snakemake-gke
snakemake workflow to run cellranger on a given bucket using gke.
A Snakemake workflow for running cellranger on a given bucket using Google Kubernetes Engine. The usage of this workflow ...