EU-kalat: Difference between revisions

From Opasnet
Jump to navigation Jump to search
(→‎Calculations: code updated but not debugged)
 
(47 intermediate revisions by the same user not shown)
Line 12: Line 12:
== Answer ==
== Answer ==


[[File:Dioxin concentrations in Baltic herring.png|thumb|400px|Dioxin concentrations in Baltic herring.]]
The original sample results can be acquired from [http://en.opasnet.org/w/Special:Opasnet_Base?id=Op_en3104 Opasnet base]. The study showed that levels of PCDD/Fs and PCBs depends especially on the fish species. Highest levels were on salmon and large sized herring. Levels of PCDD/Fs exceeded maximum level of 4 pg TEQ/g fw multiple times. Levels of PCDD/Fs were correlated positively with age of the fish.
The original sample results can be acquired from [http://en.opasnet.org/w/Special:Opasnet_Base?id=Op_en3104 Opasnet base]. The study showed that levels of PCDD/Fs and PCBs depends especially on the fish species. Highest levels were on salmon and large sized herring. Levels of PCDD/Fs exceeded maximum level of 4 pg TEQ/g fw multiple times. Levels of PCDD/Fs were correlated positively with age of the fish.


Mean congener concentrations as WHO2005-TEQ in Baltic herring can be printed out with the Run code below.
Mean congener concentrations as WHO2005-TEQ in Baltic herring can be [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=9Q7skOCFyPfpJbmg printed out with this link] or by running the codel below.


<rcode embed=1 graphics=1>
<rcode graphics=1>
# This code is Op_en on page [[EU-kalat]].
# This code is Op_en on page [[EU-kalat]].


Line 47: Line 48:
ggplot(temp, aes(x = Congener, y = Result, fill = Congener)) + geom_bar(stat = "identity") + facet_wrap(~ Catch_location) +  
ggplot(temp, aes(x = Congener, y = Result, fill = Congener)) + geom_bar(stat = "identity") + facet_wrap(~ Catch_location) +  
labs(x = "Congener", y = "TEQ concentration (pg/g fat)") + theme_gray(base_size = 18) + coord_flip()
labs(x = "Congener", y = "TEQ concentration (pg/g fat)") + theme_gray(base_size = 18) + coord_flip()
temp2 <- temp[temp$Congener %in% levels(temp$Congener)[1:17],]
ggplot(temp2, aes(x = Congener, y = Result, fill = Congener)) + geom_bar(stat = "identity") + facet_wrap(~ Catch_location) +
  labs(
    title="Dioxin concentrations in Baltic herring",
    x = "Congener",
    y = "TEQ concentration (pg/g fat)"
  ) + theme_gray(base_size = 18) + coord_flip()
</rcode>
</rcode>


Line 57: Line 67:


=== Calculations ===
=== Calculations ===
==== Preprocess ====


* Preprocess model 22.2.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=p53h4gxtFfDMOtHz]
* Preprocess model 22.2.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=p53h4gxtFfDMOtHz]
** Objects used in [[Benefit-risk assessment of Baltic herring and salmon intake]]
** Objects used in [[Benefit-risk assessment of Baltic herring and salmon intake]]
* Model run 25.1.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=wzisMQHAqcF30zcl]
* Model run 25.1.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=wzisMQHAqcF30zcl]
* Model run 22.5.2017 with new ovariables euRaw, euAll, euMain, and euRatio [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=7uTqQeaekwRFwA2J]
* Model run 23.5.2017 with adjusted ovariables euRaw, eu, euRatio [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=qkseWM9rmRysGwKM]
* Model run 11.10.2017: Small herring and Large herring added as new species [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=cVl6fTatRsE0DwiQ]
* Model rerun 15.11.2017 because the previous stored run was lost in update [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=njuQREEjuMtY4MlQ]
* Model run 21.3.2018: Small and large herring replaced by actual fish length [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=rzvSZKM7ekjDqEhu]
* Model run 26.3.2018 eu2 moved here [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=XHZrJOwEdDwWGJQm]


<rcode name="preprocess" label="Preprocess" graphics=1>
See an updated version of preprocess code for eu on [[Health effects of Baltic herring and salmon: a benefit-risk assessment#Code for estimating TEQ from chinese PCB7]]
 
<rcode name="preprocess" label="Preprocess (for developers only)">
# This is code Op_en3104/preprocess on page [[EU-kalat]]
# This is code Op_en3104/preprocess on page [[EU-kalat]]
library(rjags)
library(OpasnetUtils)
library(OpasnetUtils)
library(ggplot2)
library(ggplot2)
library(reshape2)
library(reshape2)


# Divide herring into small and large (as new "species")
euRaw <- Ovariable("euRaw", ddata = "Op_en3104", subset = "POPs")
euRaw <- Ovariable("euRaw", ddata = "Op_en3104", subset = "POPs")
#euRaw <- EvalOutput(euRaw)
#temp <- euRaw[euRaw$Fish_species == "Baltic herring" , ]
#temp$Fish_species <- as.factor(
#  ifelse(
#    as.numeric(as.character(temp$Length_mean_mm)) >= 170,
#    "Large herring",
#    "Small herring"
#  )
#)
#euRaw <- combine(euRaw, temp, name = "euRaw")


# levels(TEF$Group)
# levels(TEF$Group)
Line 75: Line 106:
#[3] "Mono-ortho–substituted PCBs"  "Non-ortho–substituted PCBs"   
#[3] "Mono-ortho–substituted PCBs"  "Non-ortho–substituted PCBs"   


TEQgroups <- list(
indexguide <- Ovariable(data=data.frame( # This is needed to make these indices marginals in eu (which is in long format).
   In = c(
   Length = 1,
    "Chlorinated dibenzo-p-dioxins",  
  Year = 1,
    "Chlorinated dibenzofurans",
  Weight = 1,
    "Mono-ortho–substituted PCBs",
   Large = 1,
    "Non-ortho–substituted PCBs"
   Result = 1
   ),
))
   Out = c("TEQdx", "TEQdx", "TEQpcb", "TEQpcb")
)


euAll <- Ovariable(
eu <- Ovariable(
   "euAll",
   "eu",
   dependencies = data.frame(
   dependencies = data.frame(
     Name=c("euRaw", "TEF"),
     Name=c("euRaw", "TEF", "indexguide"),
     Ident=c(NA,"Op_en4017/initiate")
     Ident=c(NA,"Op_en4017/initiate", NA)
   ),
   ),
   formula = function(...) {
   formula = function(...) {
     euAll <- euRaw[,c(1:4, 18, 19)] # THL code, Matrix, Congener, Fish species
     euRaw$Length<-as.numeric(as.character(euRaw$Length_mean_mm))
     colnames(euAll@output)[1:4] <- c("THLcode", "Matrix", "Compound", "Fish")
    euRaw$Year <- as.numeric(substr(euRaw$Catch_date, nchar(as.character(euRaw$Catch_date))-3,100))
    if(exists("TEQgroups")) levels(TEF$Group) <- TEQgroups$Out
    euRaw@marginal[colnames(euRaw@output) %in% c("Length","Year")] <- TRUE
#    euRaw$Weight<-as.numeric(as.character(euRaw$Weight_mean_g))
#    euRaw$Large <- euRaw$Length >= 170
   
    eu <- euRaw[,c(1:4, 10, 20, 21, 18)] # See below + Length, Year, Result
     colnames(eu@output)[1:5] <- c("THLcode", "Matrix", "Compound", "Fish", "N")
      
      
     temp <- oapply(euAll * TEF, cols = "Compound", FUN = "sum")
     temp <- oapply(eu * TEF, cols = "Compound", FUN = "sum")
     colnames(temp@output)[colnames(temp@output)=="Group"] <- "Compound"
     colnames(temp@output)[colnames(temp@output)=="Group"] <- "Compound"
     eu2 <- combine(euAll, temp)
     eu <- combine(eu, temp)
      
      
     return(eu2)
    eu$Compound <- factor( # Compound levels are ordered based on the data table on [[TEF]]
      eu$Compound,
      levels = unique(c(levels(TEF$Compound), levels(eu$Compound)))
    )
    eu$Compound <- eu$Compound[,drop=TRUE]
   
     return(eu)
   }
   }
)
)


euAll <- EvalOutput(euAll)
eu2 <- Ovariable(
  "eu2",
  dependencies=data.frame(Name="eu", Ident = NA),
  formula = function(...) {
 
replaces <- list(
  c("Chlorinated dibenzo-p-dioxins", "PCDDF"),
  c("Chlorinated dibenzofurans", "PCDDF"),
  c("Mono-ortho-substituted PCBs", "PCB"),
  c("Non-ortho-substituted PCBs", "PCB")
)
eu2 <- eu
for(i in 1:length(replaces)) {
  levels(eu2$Compound)[levels(eu2$Compound)==replaces[[i]][1]] <- replaces[[i]][2]
}
 
eu2@marginal[colnames(eu2@output) %in% c("Length","Year")] <- TRUE # Indexguide should take care of this but it doesn't!
eu2 <- unkeep(eu2, prevresults = TRUE, sources = TRUE)
eu2 <- oapply(eu2, cols = "TEFversion", FUN = "sum") # Sums up dioxin+furan and non+monoortho. This goes wrong if > 1 TEFversion.
return(eu2)
}
)
 
euRatio <- Ovariable(
  "euRatio",
  dependencies = data.frame(Name=c("eu")),
  formula = function(...) {
    euRatio <- eu[
      eu$Compound == "2378TCDD" & eu$Matrix == "Muscle" & result(eu) != 0 , ] # Zeros cannot be used in ratio estimates
    euRatio$Compound <- NULL
    euRatio <- log10(eu / euRatio)@output
    euRatio <- euRatio[!euRatio$Compound %in% c("2378TCDD", "2378-TCDD", "TCDD") , ]
    return(euRatio)
  }
)
 
# Analysis: a few rows disappear here, as shown by numbers per fish species. Why?
# aggregate(eut@output, by = eut@output["Fish"], FUN = length)[[2]]
# [1] 1181  141  131  55  158  51  96  30  36  40  102  49  61  180 eu
# [1] 1173  141  131  55  158  51  96  27  36  40  102  49  53  180 eu/eut
# There are samples where TCDD concenctration is zero.
#eu@data[eu@data$POP == "2378TCDD" & eu@data$euResult == 0 , c(1,4)][[1]]
#eu@data[eu@data[[1]] %in% c("09K0585", "09K0698", "09K0740", "09K0748") , c(1,3,4,18)]
# Some rows disappear because there is no TCDD measurement to compare with:
#8 Baltic herrings, 8 sprats, 3 rainbow trouts.
# Conclusion: this is ok. Total 2292 rows.


################## Data for the main congeners and species only
################## Data for the main congeners and species only


#> unique(euAll$Congener)
#> unique(eu$Congener)
#[1] 2378TCDD    12378PeCDD  123478HCDD  123678HCDD  123789HCDD  1234678HpCDD
#[1] 2378TCDD    12378PeCDD  123478HCDD  123678HCDD  123789HCDD  1234678HpCDD
#[7] OCDD        2378TCDF    12378PeCDF  23478PeCDF  123478HCDF  123678HCDF   
#[7] OCDD        2378TCDF    12378PeCDF  23478PeCDF  123478HCDF  123678HCDF   
#[13] 123789HCDF  234678HCDF  1234678HpCDF 1234789HpCDF OCDF        CoPCB77 ...     
#[13] 123789HCDF  234678HCDF  1234678HpCDF 1234789HpCDF OCDF        CoPCB77 ...     


# Remove the four with too little data (>70% BDL) and all non-PCDDF
# Remove the four PCDDFs with too little data (>70% BDL) and all non-PCDDF
# aggregate(eu@data$euResult, by = eu@data["POP"], FUN = function(x) mean(x == 0))
# aggregate(eu@data$euResult, by = eu@data["POP"], FUN = function(x) mean(x == 0))
conl <- unique(euAll@output$Congener)[c(1:12, 14, 15)] # 7 OCDD should be removed
euMain <- euAll[euAll$Congener %in% conl , ]


#[1] Baltic herring Sprat          Salmon        Sea trout      Vendace       
#[1] Baltic herring Sprat          Salmon        Sea trout      Vendace       
#[6] Roach          Perch          Pike          Pike-perch    Burbot         
#[6] Roach          Perch          Pike          Pike-perch    Burbot         
#[11] Whitefish      Flounder      Bream          River lamprey  Cod           
#[11] Whitefish      Flounder      Bream          River lamprey  Cod           
#[16] Trout          Rainbow trout  Arctic char   
#[16] Trout          Rainbow trout  Arctic char  Small herring  Large herring


fisl <- unique(euAll@output$Fish)[c(1:4, 6:14, 17)]  
indices <- list(
euMain <- euMain[euMain$Fish %in% fisl , ] # Remove four with too little data
  Compound.TEQ2 = c("PCDDF", "PCB"),
  Compound.PCDDF14 = as.character(unique(euRaw$POP)[c(1:12, 14, 15)]), # 7 OCDD should be removed
  Fish.Fish16 = as.character(unique(euRaw$Fish_species)[c(1:4, 6:14, 17, 19, 20)])
)


euRatio <- euMain[
#> indices
  euMain$Congener == "2378TCDD" & euMain$Matrix == "Muscle" & result(euMain) != 0 , ] # Zeros cannot be used in ratio estimates
#$Compound.TEQ2
euRatio$Congener <- NULL
#[1] "PCDDF" "PCB"
euRatio <- log10(euMain / euRatio)@output
#
euRatio <- euRatio[euRatio$Congener %in% setdiff(conl, "2378TCDD") , ]
#$Compound.PCDDF14
 
# [1] "2378TCDD"     "12378PeCDD"  "123478HCDD"  "123678HCDD"  "123789HCDD"  "1234678HpCDD"
# Analysis: a few rows disappear here, as shown by numbers per fish species. Why?
# [7] "OCDD"        "2378TCDF"    "12378PeCDF"   "23478PeCDF"   "123478HCDF"   "123678HCDF"  
# aggregate(eut@output, by = eut@output["Fish"], FUN = length)[[2]]
#[13] "234678HCDF"   "1234678HpCDF"
# [1] 1181  141  131   55  158   51   96  30  36  40 102  49  61  180 eu
#
# [1] 1173  141  131   55  158  51  96  27  36  40  102  49  53  180 eu/eut
#$Fish.Fish16
# There are samples where TCDD concenctration is zero.
# [1] "Baltic herring" "Sprat"          "Salmon"        "Sea trout"      "Roach"        
#eu@data[eu@data$POP == "2378TCDD" & eu@data$euResult == 0 , c(1,4)][[1]]
# [6] "Perch"          "Pike"          "Pike-perch"    "Burbot"        "Whitefish"   
#eu@data[eu@data[[1]] %in% c("09K0585", "09K0698", "09K0740", "09K0748") , c(1,3,4,18)]
#[11] "Flounder"      "Bream"         "River lamprey" "Rainbow trout" "Small herring"  
# Some rows disappear because there is no TCDD measurement to compare with:
#[16] "Large herring"
#8 Baltic herrings, 8 sprats, 3 rainbow trouts.
# Conclusion: this is ok. Total 2292 rows.


ggplot(euAll@output, aes(x = euAllResult, colour = Fish))+geom_density()+
#ggplot(euRaw@output, aes(x=Length, y=Weight))+geom_point()+
  facet_wrap(~ Compound) + scale_x_log10()
facet_wrap(~Fish_species, scales="free")


ggplot(euRatio, aes(x = Result, colour = Fish))+geom_density()+
#oprint(summary(
  facet_wrap(~ Compound)  
#  euRaw[euRaw$Fish_species %in% c("Baltic herring","Salmon") & euRaw$POP == "2378TCDD" , ],
#  marginals=c("Catch_location","Large","Fish_species"),
#  fun = "length"
#))


objects.store(euAll, euMain, euRatio)
objects.store(euRaw, eu, eu2, euRatio, indices, indexguide)
cat("Ovariables euAll and euMain, and data.frame eutRatio stored.\n")
cat("Ovariables euRaw, eu, eu2, euRatio, and indexguide and list indices stored.\n")
</rcode>
</rcode>


Line 163: Line 250:
* Model run 19.5.2017 without ovariable concentration [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=bXhdwkBaQQi1LTcu] {{attack|# |The model does not mix well, so the results should not be used for final results.|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 19:37, 19 May 2017 (UTC)}}
* Model run 19.5.2017 without ovariable concentration [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=bXhdwkBaQQi1LTcu] {{attack|# |The model does not mix well, so the results should not be used for final results.|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 19:37, 19 May 2017 (UTC)}}
::{{comment|# |Maybe we should just estimate TEQs until the problem is fixed.|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 19:37, 19 May 2017 (UTC)}}
::{{comment|# |Maybe we should just estimate TEQs until the problem is fixed.|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 19:37, 19 May 2017 (UTC)}}
* Model run 22.5.2017 with TEQdx and TEQpcb as the only Compounds [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2vTgALXXTzLgd4l1]
* Model run 23.5.2017 debugged [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=rMSAZy6PSKzKhHwp] [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=1P7ZPBbghEfisEcH] [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=BcZDhfjpv3fa4IRU]
* Model run 24.5.2017 TEQdx, TECpcb -> PCDDF, PCB [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=kNNzEMTSD4N2f0Yy]
* Model run 11.10.2017 with small and large herring [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ICIWZTUZR6rlNwuD] (removed in update)
* Model run 12.3.2018: bugs fixed with data used in Bayes. In addition, redundant fish species removed and Omega assumed to be the same for herring and salmon. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=k0n2CFnjdGBklm9E]
* Model run 22.3.2018 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2jX2XxWpiIEZPyzJ] Model does not mix well. Thinning gives little help?
* Model run 25.3.2018 with conc.param as ovariable [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=DbcmZJmuZ0h0vaGx]


<rcode name="bayes" label="Sample Bayes model (for developers only)" graphics=1>
<rcode name="bayes" label="Sample Bayes model (for developers only)" graphics=1>
Line 174: Line 268:
library(car) # scatterplotMatrix
library(car) # scatterplotMatrix


objects.latest("Op_en3104", code_name = "preprocess") # [[EU-kalat]]
objects.latest("Op_en3104", code_name = "preprocess") # [[EU-kalat]] eu, eu2, euRatio, indices


# Hierarchical Bayes model.
conl <- indices$Compound.TEQ2
 
# fisl <- indices$Fish.Fish16
# PCDD/F concentrations in fish.
# fisl <- c("Baltic herring","Large herring","Salmon","Small herring")
# It uses the sum of PCDD/F (Pcdsum) as the total concentration of dioxin in fish.
fisl <- c("Baltic herring","Salmon")
# Cong_j is the fraction of a congener from pcdsum.
# pcdsum is log-normally distributed. cong_j follows Dirichlet distribution.
# pcdsum depends on age of fish, fish species and catchment area, but we only have species now so other variables are omitted.
# cong_j depends on fish species.
 
if(FALSE) { # Multicongener model NOT updated to reflect new names: eu -> euMain
conl <- as.character(unique(eu@output$Congener))
fisl <- sort(as.character(unique(eu@output$Fish)))
C <- length(conl)
C <- length(conl)
Fi <- length(fisl)
Fi <- length(fisl)
Line 193: Line 279:
conl
conl
fisl
fisl
fishsamples <- reshape(
 
   eu@output,  
eu2 <- EvalOutput(eu2)
 
# Hierarchical Bayes model.
 
# PCDD/F concentrations in fish.
# It uses the TEQ sum of PCDD/F (PCDDF) as the total concentration
# of dioxin and PCB respectively for PCB in fish.
# PCDDF depends on size of fish, fish species, catchment time, and catchment area,
# but we omit catchment area. In addition, we assume that size of fish has
# zero effect for other fish than Baltic herring.
# Catchment year affects all species similarly.
 
eu2 <- eu2[eu2$Compound %in% conl & eu2$Fish %in% fisl & eu2$Matrix == "Muscle" , ]
eu3 <- reshape(  
   eu2@output,  
   v.names = "euResult",  
   v.names = "euResult",  
   idvar = "THLcode",  
   idvar = c("THLcode", "Fish"),
   timevar = "Congener",  
   timevar = "Compound",  
   drop = c("Matrix", "euSource"),  
   drop = c("Matrix"),  
   direction = "wide"
   direction = "wide"
)
)


# Find the level of quantification for dinterval function
oprint(head(eu3))
LOQ <- unlist(lapply(fishsamples[3:ncol(fishsamples)], FUN = function(x) min(x[x!=0])))
names(LOQ) <- conl
cong <- data.matrix(fishsamples[3:ncol(fishsamples)])
cong <- sapply(1:length(LOQ), FUN = function(x) ifelse(cong[,x]==0, 0.5*LOQ[x], cong[,x]))


mod <- textConnection("
#> colnames(eu3)
#[1] "THLcode"        "Fish"          "N"              "Length"        "Year"         
#[6] "euResult.PCDDF" "euResult.PCB" 
 
conc <- data.matrix(eu3[6:ncol(eu3)])
 
if(FALSE){
  # Find the level of quantification for dinterval function
  LOQ <- unlist(lapply(eu3[6:ncol(eu3)], FUN = function(x) min(x[x!=0])))
  # With TEQ, there are no zeros. So this is useful only if there are congener-specific results.
  names(LOQ) <- conl
 
 
  conc <- sapply(
    1:length(LOQ),
    FUN = function(x) ifelse(conc[,x]==0, 0.5*LOQ[x], conc[,x])
  )
}
# This version of the model looks only at Baltic herring and salmon.
# It assumes that all fish groups have the same Omega but mu varies.
 
mod <- textConnection(
  "
   model{
   model{
    for(i in 1:S) { # s = fish sample
  for(i in 1:S) { # S = fish sample
      #        below.LOQ[i,j] ~ dinterval(-cong[i,j], -LOQ[j])
  #        below.LOQ[i,j] ~ dinterval(-conc[i,j], -LOQ[j])
      cong[i,1:C] ~ dmnorm(mu[fis[i],], Omega[fis[i],,])
  conc[i,1:C] ~ dmnorm(muind[i,], Omega[fis[i],,])
    }
  muind[i,1:C] <- mu[fis[i],1:C] + lenp[fis[i]]*length[i] + timep*year[i]
    for(i in 1:Fi) { # Fi = fish species
  }
      for(j in 1:C) {
 
        mu[i,j] ~ dnorm(mu1[j], tau1[j])
  # Priors for parameters
      }
  # Time trend. Assumed a known constant because at the moment there is little time variation in data.
      Omega[i,1:C,1:C] ~ dwish(Omega0[1:C,1:C],S)
  # https://www.evira.fi/elintarvikkeet/ajankohtaista/2018/itameren-silakoissa-yha-vahemman-ymparistomyrkkyja---paastojen-rajoitukset-vaikuttavat/
      pred[i,1:C] ~ dmnorm(mu[i,], Omega[i,,]) # Model prediction
  # PCDDF/PCB-concentations 2001: 9 pg/g fw, 2016: 3.5 pg/g fw. (3.5/9)^(1/15)-1=-0.06102282
    }
  timep ~ dnorm(-0.0610, 10000)  
    for(i in 1:C) { # C = Compound
  lenp[1] ~ dnorm(0.01,0.01) # length parameter for herring
      mu1[i] ~ dnorm(0, 0.0001)
  lenp[2] ~ dnorm(0,10000) # length parameter for salmon: assumed zero
      tau1[i] ~ dunif(0,10000)
 
      pred1[i] ~ dnorm(mu1[i], tau1[i])
  for(i in 1:Fi) { # Fi = fish species
    }
  Omega[i,1:C,1:C] ~ dwish(Omega0[1:C,1:C],S)
    Omega1[1:C,1:C] ~ dwish(Omega0[1:C,1:C],S)
  pred[i,1:C] ~ dmnorm(mu[i,1:C]+lenp[i]*lenpred+timep*timepred, Omega[i,,]) # Model prediction.
  for(j in 1:C) {  
  mu[i,j] ~ dnorm(0, 0.0001) # mu1[j], tau1[j]) # Congener-specific mean for fishes
  }
  }
   }
   }
")
  ")


jags <- jags.model(
jags <- jags.model(
   mod,
   mod,
   data = list(
   data = list(
     S = nrow(fishsamples),
     S = nrow(eu3),
     C = C,
     C = C,
     Fi = Fi,
     Fi = Fi,
     cong = log(cong),
     conc = log(conc),
     #    LOQ = LOQ,
     length = eu3$Length-170, # Subtract average herring size
     #    below.LOQ = is.na(cong)*1,
     year = eu3$Year-2009, # Substract baseline year
     fis = match(fishsamples$Fish, fisl),
     fis = match(eu3$Fish, fisl),
    lenpred = 233-170,
    timepred = 2009-2009,
     Omega0 = diag(C)/100000
     Omega0 = diag(C)/100000
   ),
   ),
Line 245: Line 369:
   n.adapt = 100
   n.adapt = 100
)
)
} # if(FALSE)


conl <- c("TEQdx", "TEQpcb")
update(jags, 1000)
fisl <- sort(as.character(unique(euMain$Fish))) # Take only the main fish species
 
samps.j <- jags.samples(
  jags,
  c(
    'mu', # mean by fish and compound
    'Omega', # precision matrix by compound
    'lenp',# parameters for length
    'timep', # parameter for Year
    'pred' # predicted concentration for year 2009 and length 17 cm
  ),
  #  thin=1000,
  N
)
dimnames(samps.j$Omega) <- list(Fish = fisl, Compound = conl, Compound2 = conl, Iter=1:N, Chain=1:4)
dimnames(samps.j$mu) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$lenp) <- list(Fish = fisl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$timep) <- list(Dummy = "time", Iter = 1:N, Chain = 1:4)
 
##### conc.param contains expected values of the distribution parameters from the model
 
conc.param <- Ovariable(
  "conc.param",
  dependencies = data.frame(Name = "samps.j", Ident=NA),
  formula = function(...) {
    conc.param <- list(
      Omega = apply(samps.j$Omega, MARGIN = 1:3, FUN = mean),
      lenp = cbind(
        mean = apply(samps.j$lenp, MARGIN = 1, FUN = mean),
        sd = apply(samps.j$lenp, MARGIN = 1, FUN = sd)
      ),
      mu = apply(samps.j$mu, MARGIN = 1:2, FUN = mean),
      timep = cbind(
        mean = apply(samps.j$timep, MARGIN = 1, FUN = mean),
        sd = apply(samps.j$timep, MARGIN = 1, FUN = sd)
      )
    )
    names(dimnames(conc.param$lenp)) <- c("Fish","Metaparam")
    names(dimnames(conc.param$timep)) <- c("Dummy","Metaparam")
    conc.param <- melt(conc.param)
    colnames(conc.param)[colnames(conc.param)=="value"] <- "Result"
    colnames(conc.param)[colnames(conc.param)=="L1"] <- "Parameter"
    conc.param$Dummy <- NULL
    conc.param$Metaparam <- ifelse(is.na(conc.param$Metaparam), conc.param$Parameter, as.character(conc.param$Metaparam))
    return(Ovariable(output=conc.param, marginal=colnames(conc.param)!="Result"))
  }
)
 
objects.store(conc.param, samps.j)
cat("Lists conc.params and samps.j stored.\n")
 
######################3
 
cat("Descriptive statistics:\n")
 
# Leave only the main fish species and congeners and remove others
 
#oprint(summary(
#  eu2[eu2$Compound %in% indices$Compound.PCDDF14 & eu$Fish %in% fisl , ],
#  marginals = c("Fish", "Compound"), # Matrix is always 'Muscle'
#  function_names = c("mean", "sd")
#))
 
ggplot(eu2@output, aes(x = euResult, colour=Compound))+stat_ecdf()+
  facet_wrap( ~ Fish)+scale_x_log10()
 
ggplot(eu2@output,
      aes(x = Length, y = euResult, colour=Compound))+geom_point()+
  facet_grid(Compound ~ Fish, scale="free_x")+scale_y_log10()
 
scatterplotMatrix(t(exp(samps.j$pred[1,,,1])), main = "Predictions for all compounds for Baltic herring")
scatterplotMatrix(t(exp(samps.j$pred[,1,,1])), main = "Predictions for all fish species for PCDDF")
scatterplotMatrix(t(samps.j$Omega[,1,1,,1]))
#scatterplotMatrix(t(cbind(samps.j$Omega[1,1,1,,1],samps.j$mu[1,1,,1])))
 
plot(coda.samples(jags, 'Omega', N))
plot(coda.samples(jags, 'mu', N))
plot(coda.samples(jags, 'lenp', N))
plot(coda.samples(jags, 'timep', N))
plot(coda.samples(jags, 'pred', N))
</rcode>
 
==== Initiate conc_pcddf for PFAS disease burden study ====
 
===== Initiate euw data.frame =====
This code is similar to preprocess but is better and includes PFAS concentrations from [[:op_fi:PFAS-yhdisteiden tautitaakka]]. It produces data.frame euw that is the EU-kalat + PFAS data in wide format and, for PFAS but not EU-kalat, a sampled value for measurements below the level of quantification.
 
<rcode name="preprocess2" label="Preprocess and initiate data.frame euw (for developers only)" embed=1>
# This is code Op_en3104/preprocess2 on page [[EU-kalat]]
library(OpasnetUtils)
library(ggplot2)
library(reshape2)
 
openv.setN(1)
opts = options(stringsAsFactors = FALSE)
euRaw <- Ovariable("euRaw", ddata = "Op_en3104", subset = "POPs") # [[EU-kalat]]
 
eu <- Ovariable(
  "eu",
  dependencies = data.frame(
    Name=c("euRaw", "TEF"),
    Ident=c(NA,"Op_en4017/initiate")
  ),
  formula = function(...) {
    out <- euRaw
    out$Length<-as.numeric(as.character(out$Length_mean_mm))
    out$Year <- as.numeric(substr(out$Catch_date, nchar(as.character(out$Catch_date))-3,100))
    out$Weight<-as.numeric(as.character(out$Weight_mean_g))
 
    out <- out[,c(1:6, 8: 10, 14:17, 19:22, 18)] # See below
   
    #[1] "ﮮTHL_code"            "Matrix"                "POP"                  "Fish_species"       
    #[5] "Catch_site"            "Catch_location"        "Catch_season"          "Catch_square"       
    #[9] "N_individuals"        "Sex"                  "Age"                  "Fat_percentage"     
    #[13] "Dry_matter_percentage" "euRawSource"          "Length"                "Year"               
    #[17] "Weight"                "euRawResult"         
   
    colnames(out@output)[1:13] <- c("THLcode", "Matrix", "Compound", "Fish", "Site", "Location", "Season",
                                  "Square","N","Sex","Age","Fat","Dry_matter")
    out@marginal <- colnames(out)!="euRawResult"
   
    tmp <- oapply(out * TEF, cols = "Compound", FUN = "sum")
    colnames(tmp@output)[colnames(tmp@output)=="Group"] <- "Compound"
    # levels(tmp$Compound)
    # [1] "Chlorinated dibenzo-p-dioxins" "Chlorinated dibenzofurans"    "Mono-ortho-substituted PCBs" 
    # [4] "Non-ortho-substituted PCBs" 
    levels(tmp$Compound) <- c("PCDD","PCDF","moPCB","noPCB")
   
    out <- OpasnetUtils::combine(out, tmp)
   
    out$Compound <- factor( # Compound levels are ordered based on the data table on [[TEF]]
      out$Compound,
      levels = unique(c(levels(TEF$Compound), unique(out$Compound)))
    )
    out$Compound <- out$Compound[,drop=TRUE]
   
    return(out)
  }
)
 
eu <- EvalOutput(eu)
 
euw <- reshape(
  eu@output,
  v.names = "euResult",
  idvar = c("THLcode", "Matrix", "Fish"), # , "Site","Location","Season","Square","N","Sex","Age","Fat", "Dry_matter","Length","Year","Weight"
  timevar = "Compound",
  drop = c("euRawSource","TEFversion","TEFrawSource","TEFSource","Source","euSource"),
  direction = "wide"
)
colnames(euw) <- gsub("euResult\\.","",colnames(euw))
euw$PCDDF <- euw$PCDD + euw$PCDF
euw$PCB <- euw$noPCB + euw$moPCB
euw$TEQ <- euw$PCDDF + euw$PCB
euw$PFOA <- euw$PFOA / 1000 # pg/g --> ng/g
euw$PFOS <- euw$PFOS / 1000 # pg/g --> ng/g
euw$PFAS <- euw$PFOA + euw$PFOS
 
#################### PFAS measurements from Porvoo
 
conc_pfas_raw <- EvalOutput(Ovariable(
  "conc_pfas_raw",
  data=opbase.data("Op_fi5932", subset="PFAS concentrations"), # [[PFAS-yhdisteiden tautitaakka]]
  unit="ng/g f.w.")
)@output
 
conc_pfas_raw <- reshape(conc_pfas_raw,
                        v.names="conc_pfas_rawResult",
                        timevar="Compound",
                        idvar=c("Obs","Fish"),
                        drop="conc_pfas_rawSource",
                        direction="wide")
 
colnames(conc_pfas_raw) <- gsub("conc_pfas_rawResult\\.","",colnames(conc_pfas_raw))
conc_pfas_raw <- within(conc_pfas_raw, PFAS <- PFOS + PFHxS + PFOA + PFNA)
conc_pfas_raw$Obs <- NULL
 
euw <- orbind(euw, conc_pfas_raw)
 
objects.store(euw)
cat("Data.frame euw stored.\n")
</rcode>
 
===== Initiate conc_param using Bayesian approach =====
Bayesian approach for PCDDF, PCB, OT, PFAS.
* Model run 2021-03-08 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ZvJDOo7xL8d7x7EI]
* Model run 2021-03-08 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=VpSUS4pfGavspLG9] with the fish needed in PFAS assessment
* Model run 2021-03-12 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=Lc9KWY7r1tTuGWVD] using euw
* Model run 2021-03-13 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=pTiMHkD4Lq0EdLab] with location parameter for PFAS
* Model run 2021-03-17 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=Mrko9rrynNRELP07] location parameter not plotted because problems with older R version in Opasnet.
 
<rcode name="pollutant_bayes" label="Initiate conc_param with PCDDF, PFAS, OT (for developers only)" embed=0 graphics=1>
# This is code Op_en3104/pollutant_bayes on page [[EU-kalat]]
# The code is also available at https://github.com/jtuomist/pfas/blob/main/conc_pcddf_preprocess.R
 
library(OpasnetUtils)
library(reshape2)
library(rjags) # JAGS
library(ggplot2)
library(MASS) # mvrnorm
library(car) # scatterplotMatrix
 
#' Find the level of quantification for dinterval function
#' @param df data.frame
#' @return data.matrix
add_loq <- function(df) { # This should reflect the fraction of observations below LOQ.
  LOQ <- unlist(lapply(df, FUN = function(x) min(x[x!=0], na.rm=TRUE)))
  out <- sapply(
    1:length(LOQ),
    FUN = function(x) ifelse(df[,x]==0, 0.5*LOQ[x], df[,x])
  )
  out <- data.matrix(out)
  return(out)
}
 
#size <- Ovariable("size", ddata="Op_en7748", subset="Size distribution of fish species")
#time <- Ovariable("time", data = data.frame(Result=2015))
 
objects.latest("Op_en3104", code_name = "preprocess2") # [[EU-kalat]] euw
 
# Hierarchical Bayes model.
 
# PCDD/F concentrations in fish.
# It uses the TEQ sum of PCDD/F (PCDDF) as the total concentration
# of dioxin and PCB respectively for PCB in fish.
# PCDDF depends on size of fish, fish species, catchment time, and catchment area,
# but we omit catchment area. In addition, we assume that size of fish has
# zero effect for other fish than Baltic herring.
# Catchment year affects all species similarly.
 
eu3 <- euw[!colnames(euw) %in% c("MPhT","DOT","BDE138")] # No values > 0
 
eu3 <- eu3[eu3$Matrix == "Muscle" , ]
eu3$Locat <- ifelse(eu3$Location=="Porvoo",2,
                      ifelse(eu3$Location=="Helsinki, Vanhankaupunginlahti Bay",3,1))
locl <- c("Finland","Porvoo","Helsinki")
 
#conl_nd <- c("PFAS","PFOA","PFOS","DBT","MBT","TBT","DPhT","TPhT")
conl_nd <- c("PFAS","PFOS") # TBT would drop Porvoo measurements
fisl <- fisl_nd <- c("Baltic herring","Bream","Flounder","Perch","Roach","Salmon","Whitefish")
 
eu4 <- eu3[rowSums(is.na(eu3[conl_nd]))<length(conl_nd) & eu3$Fish %in% fisl_nd ,
          c(1:5,match(c("Locat",conl_nd),colnames(eu3)))]
 
conc_nd <- add_loq(eu4[conl_nd])
 
conl <- c("TEQ","PCDDF","PCB") # setdiff(colnames(eu3)[-(1:5)], conl_nd)
eu3 <- eu3[!is.na(eu3$PCDDF) & eu3$Fish %in% fisl , c(1:5, match(conl,colnames(eu3)))]
 
oprint(head(eu3))
oprint(head(eu4))
 
C <- length(conl)
C <- length(conl)
Fi <- length(fisl)
Fi <- length(fisl)
N <- 1000
N <- 200
thin <- 100
conl
conl
fisl
fisl
fishsamples <- euAll@output[euAll$Compound %in% conl , ]
conl_nd
fishsamples <- reshape(
fisl_nd
  fishsamples,
 
  v.names = "euAllResult",
eu3 <- eu3[rowSums(is.na(eu3))==0,]
  idvar = "THLcode",
conc <- add_loq(eu3[conl]) # Remove rows with missing data.
  timevar = "Compound",  
  drop = c("Matrix", "euRawSource"),
  direction = "wide"
)


# Find the level of quantification for dinterval function
# The model assumes that all fish groups have the same Omega but mu varies.
LOQ <- unlist(lapply(fishsamples[3:ncol(fishsamples)], FUN = function(x) min(x[x!=0])))
names(LOQ) <- conl
cong <- data.matrix(fishsamples[3:ncol(fishsamples)])
cong <- sapply(1:length(LOQ), FUN = function(x) ifelse(cong[,x]==0, 0.5*LOQ[x], cong[,x]))


mod <- textConnection("
mod <- textConnection(
  "
   model{
   model{
     for(i in 1:S) { # s = fish sample
     for(i in 1:S) { # S = fish sample
       #        below.LOQ[i,j] ~ dinterval(-cong[i,j], -LOQ[j])
       #        below.LOQ[i,j] ~ dinterval(-conc[i,j], -LOQ[j])
       cong[i,1:C] ~ dmnorm(mu[fis[i],], Omega[fis[i],,])
       conc[i,1:C] ~ dmnorm(muind[i,], Omega[fis[i],,])
      muind[i,1:C] <- mu[fis[i],1:C] #+ lenp[fis[i]]*length[i] + timep*year[i]
     }
     }
    for(i in 1:S_nd) {
      for(j in 1:C_nd) {
        conc_nd[i,j] ~ dnorm(muind_nd[i,j], tau_nd[j])
        muind_nd[i,j] <- mu_nd[fis_nd[i],j] + mulocat[locat[i]] #+ lenp[fis[i]]*length[i] + timep*year[i]
      }
    }
   
    # Priors for parameters
    # Time trend. Assumed a known constant because at the moment there is little time variation in data.
    # https://www.evira.fi/elintarvikkeet/ajankohtaista/2018/itameren-silakoissa-yha-vahemman-ymparistomyrkkyja---paastojen-rajoitukset-vaikuttavat/
    # PCDDF/PCB-concentations 2001: 9 pg/g fw, 2016: 3.5 pg/g fw. (3.5/9)^(1/15)-1=-0.06102282
  #  timep ~ dnorm(-0.0610, 10000)
  #  lenp[1] ~ dnorm(0.01,0.01) # length parameter for herring
  #  lenp[2] ~ dnorm(0,10000) # length parameter for salmon: assumed zero
   
     for(i in 1:Fi) { # Fi = fish species
     for(i in 1:Fi) { # Fi = fish species
      Omega[i,1:C,1:C] ~ dwish(Omega0[1:C,1:C],S)
      pred[i,1:C] ~ dmnorm(mu[i,1:C], Omega[i,,]) #+lenp[i]*lenpred+timep*timepred, Omega[i,,]) # Model prediction.
       for(j in 1:C) {  
       for(j in 1:C) {  
         mu[i,j] ~ dnorm(mu1[j], tau1[j])
         mu[i,j] ~ dnorm(0, 0.0001) # mu1[j], tau1[j]) # Congener-specific mean for fishes
       }
       }
      Omega[i,1:C,1:C] ~ dwish(Omega0[1:C,1:C],S)
      pred[i,1:C] ~ dmnorm(mu[i,], Omega[i,,]) # Model prediction
     }
     }
     for(i in 1:C) { # C = Compound
     # Non-dioxins
      mu1[i] ~ dnorm(0, 0.0001)
    mulocat[1] <- 0
       tau1[i] ~ dunif(0,10000)
    mulocat[2] ~ dnorm(0,0.001)
       pred1[i] ~ dnorm(mu1[i], tau1[i])
    mulocat[3] ~ dnorm(0,0.001)
    for(j in 1:C_nd) {
       tau_nd[j] ~ dgamma(0.001,0.001)
       for(i in 1:Fi_nd) { # Fi = fish species
        pred_nd[i,j] ~ dnorm(mu[i,j], tau_nd[j])
        mu_nd[i,j] ~ dnorm(0, 0.0001)
      }
     }
     }
    Omega1[1:C,1:C] ~ dwish(Omega0[1:C,1:C],S)
   }
   }
")
")
Line 295: Line 684:
   mod,
   mod,
   data = list(
   data = list(
     S = nrow(fishsamples),
     S = nrow(conc),
    S_nd = nrow(conc_nd),
     C = C,
     C = C,
    C_nd = ncol(conc_nd),
     Fi = Fi,
     Fi = Fi,
     cong = log(cong),
     Fi_nd = length(fisl_nd),
     #    LOQ = LOQ,
    conc = log(conc),
     #    below.LOQ = is.na(cong)*1,
    conc_nd = log(conc_nd),
     fis = match(fishsamples$Fish, fisl),
    locat = eu4$Locat,
     #    length = eu3$Length-170, # Subtract average herring size
     #    year = eu3$Year-2009, # Substract baseline year
    fis = match(eu3$Fish, fisl),
     fis_nd = match(eu4$Fish, fisl_nd),
    #    lenpred = 233-170,
    #    timepred = 2009-2009,
     Omega0 = diag(C)/100000
     Omega0 = diag(C)/100000
   ),
   ),
   n.chains = 4,
   n.chains = 4,
   n.adapt = 100
   n.adapt = 200
)
)


update(jags, 100)
update(jags, 1000)


samps.j <- jags.samples(
samps.j <- jags.samples(
   jags,  
   jags,  
   c('mu', 'Omega', 'pred', 'mu1', 'Omega1', 'tau1', 'pred1'),  
   c(
   N
    'mu', # mean by fish and compound
    'Omega', # precision matrix by compound
    #    'lenp',# parameters for length
    #    'timep', # parameter for Year
    'pred', # predicted concentration for year 2009 and length 17 cm
    'pred_nd',
    'mu_nd',
    'tau_nd',
    'mulocat'
  ),
  thin=thin,
   N*thin
)
)
dimnames(samps.j$Omega) <- list(Fish = fisl, Compound = conl, Compound2 = conl, Iter=1:N, Chain=1:4)
dimnames(samps.j$mu) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$mu) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$mu1) <- list(Compound = conl, Iter = 1:N, Chain = 1:4)
#dimnames(samps.j$lenp) <- list(Fish = fisl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$mu1) <- list(Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred_nd) <- list(Fish = fisl_nd, Compound = conl_nd, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$tau1) <- list(Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$mu_nd) <- list(Fish = fisl_nd, Compound = conl_nd, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred1) <- list(Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$tau_nd) <- list(Compound = conl_nd, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$Omega) <- list(Fish = fisl, Compound = conl, Compound2 = conl, Iter=1:N, Chain=1:4)
#dimnames(samps.j$timep) <- list(Dummy = "time", Iter = 1:N, Chain = 1:4)
dimnames(samps.j$Omega1) <- list(Compound = conl, Compound2 = conl, Iter=1:N, Chain=1:4)
dimnames(samps.j$mulocat) <- list(Area = locl, Iter = 1:N, Chain = 1:4)
 
##### conc_param contains expected values of the distribution parameters from the model


##### conc.param contains expected values of the distribution parameters from the model
conc_param <- list(
conc.param <- list(
   Omega = apply(samps.j$Omega, MARGIN = 1:3, FUN = mean),
   mu = apply(samps.j$mu[,,,1], MARGIN = 1:2, FUN = mean),
   #      lenp = cbind(
   Omega = apply(samps.j$Omega[,,,,1], MARGIN = 1:3, FUN = mean),
  #        mean = apply(samps.j$lenp, MARGIN = 1, FUN = mean),
   pred.mean = apply(samps.j$pred[,,,1], MARGIN = 1:2, FUN = mean),
   #        sd = apply(samps.j$lenp, MARGIN = 1, FUN = sd)
   pred.sd = apply(samps.j$pred[,,,1], MARGIN = 1:2, FUN = sd),
  #      ),
   mu1 = apply(samps.j$mu1[,,1], MARGIN = 1, FUN = mean),
  mu = apply(samps.j$mu, MARGIN = 1:2, FUN = mean),
   tau1 = apply(samps.j$tau1[,,1], MARGIN = 1, FUN = mean),
   #      timep = cbind(
   pred1.mean = apply(samps.j$pred1[,,1], MARGIN = 1, FUN = mean),
  #        mean = apply(samps.j$timep, MARGIN = 1, FUN = mean),
   pred1.sd = apply(samps.j$pred1[,,1], MARGIN = 1, FUN = sd)
   #        sd = apply(samps.j$timep, MARGIN = 1, FUN = sd)
  #      )
   mu_nd = apply(samps.j$mu_nd, MARGIN = 1:2, FUN = mean),
   tau_nd = apply(samps.j$tau_nd, MARGIN = 1, FUN = mean),
   mulocat = apply(samps.j$mulocat, MARGIN = 1, FUN = mean)
)
)
#    names(dimnames(conc_param$lenp)) <- c("Fish","Metaparam")
#    names(dimnames(conc_param$timep)) <- c("Dummy","Metaparam")
conc_param <- melt(conc_param)
colnames(conc_param)[colnames(conc_param)=="value"] <- "Result"
colnames(conc_param)[colnames(conc_param)=="L1"] <- "Parameter"
conc_param$Compound[conc_param$Parameter =="tau_nd"] <- conl_nd # drops out because one-dimensional
conc_param$Area[conc_param$Parameter =="mulocat"] <- locl # drops out because one-dimensional
conc_param <- fillna(conc_param,c("Fish","Area"))
for(i in 1:ncol(conc_param)) {
  if("factor" %in% class(conc_param[[i]])) conc_param[[i]] <- as.character(conc_param[[i]])
}
conc_param <- Ovariable("conc_param",data=conc_param)
objects.store(conc_param)
cat("Ovariable conc_param stored.\n")
######################3
cat("Descriptive statistics:\n")
# Leave only the main fish species and congeners and remove others
#oprint(summary(
#  eu2[eu2$Compound %in% indices$Compound.PCDDF14 & eu$Fish %in% fisl , ],
#  marginals = c("Fish", "Compound"), # Matrix is always 'Muscle'
#  function_names = c("mean", "sd")
#))
#tmp <- euw[euw$Compound %in% c("PCDDF","PCB","BDE153","PBB153","PFOA","PFOS","DBT","MBT","TBT"),]
#ggplot(tmp, aes(x = eu2Result, colour=Fish))+stat_ecdf()+
#  facet_wrap( ~ Compound, scales="free_x")+scale_x_log10()
dimnames(samps.j$mulocat)
scatterplotMatrix(t(exp(samps.j$pred[2,,,1])), main = paste("Predictions for several compounds for",
                                                            names(samps.j$pred[,1,1,1])[2]))
scatterplotMatrix(t(exp(samps.j$pred[,1,,1])), main = paste("Predictions for all fish species for",
                                                            names(samps.j$pred[1,,1,1])[1]))
scatterplotMatrix(t(samps.j$Omega[2,,1,,1]), main = "Omega for several compounds in Baltic herring")
scatterplotMatrix(t((samps.j$pred_nd[1,,,1])), main = paste("Predictions for several compounds for",
                                                            names(samps.j$pred_nd[,1,1,1])[1]))
#scatterplotMatrix(t((samps.j$mulocat[,,1])), main = paste("Predictions for location average difference",
#                                                            names(samps.j$pred_nd[,1,1,1])[1]))


objects.store(conc.param, samps.j)
#plot(coda.samples(jags, 'Omega', N))
cat("Lists conc.params and samps.j stored.\n")
plot(coda.samples(jags, 'mu', N*thin, thin))
#plot(coda.samples(jags, 'lenp', N))
#plot(coda.samples(jags, 'timep', N))
plot(coda.samples(jags, 'pred', N*thin, thin))
plot(coda.samples(jags, 'mu_nd', N*thin, thin))
plot(coda.samples(jags, 'mulocat', N*thin, thin))
tst <- (coda.samples(jags, 'pred', N))
</rcode>


# Predictions for all congeners of fish1 (Baltic herring)
===== Initiate conc_poll=====
scatterplotMatrix(t(samps.j$pred[1,,,1]))
<rcode name="conc_poll" label="Initiate conc_poll" embed=1>
# Means for all congeners of the generic fish
#This is code Op_en3104/conc_poll on page [[EU-kalat]]
scatterplotMatrix(t(samps.j$mu1[,,1]))
# Prediction for all congeners of the generic fish
scatterplotMatrix(t(samps.j$pred1[,,1]))
# Predictions for all fish species for TCDD
scatterplotMatrix(t(samps.j$pred[,1,,1]))
# Predictions for pike for omegas and PeCDD
scatterplotMatrix(t(samps.j$Omega[6,2,,,1]))


ggplot(melt(exp(samps.j$pred[,,,1])), aes(x=value, colour=Compound))+geom_density()+
library(OpasnetUtils)
  facet_wrap( ~ Fish,scales = "free_y")+scale_x_log10()
ggplot(eu@output, aes(x = euResult, colour=Congener))+geom_density()+
  facet_wrap( ~ Fish, scales = "free_y")+scale_x_log10()
#stat_ellipse()


coda.j <- coda.samples(
conc_poll <- Ovariable(
   jags,  
  "conc_poll",
   c('mu', 'pred', 'omega1', 'pred1'),  
  dependencies = data.frame(
   N
    Name=c("conc_param"), #,"lengt","time"),
    Ident=c("Op_en3104/pollutant_bayes")#,NA,NA)
   ),
   formula=function(...) {
    require(MASS)
    tmp1 <- conc_param + Ovariable(data=data.frame(Result="0-1")) # Ensures Iter #  lengt + time +
    tmp2 <- unique(tmp1@output[setdiff(
      colnames(tmp1@output)[tmp1@marginal],
      c("Compound","Compound2","Metaparam","Parameter")
    )])
    tmp2$Row <- 1:nrow(tmp2)
    tmp3 <- merge(tmp2,tmp1@output)
    out <- data.frame()
    for(i in 1:nrow(tmp2)) {
     
      ############## PCDDF (with multivariate mvnorm)
     
      tmp <- tmp3[tmp3$Row == i , ]
      Omega <- solve(tapply(
        tmp$conc_paramResult[tmp$Parameter=="Omega"],
        tmp[tmp$Parameter=="Omega", c("Compound","Compound2")],
        sum # Equal to identity because only 1 row per cell.
      )) # Precision matrix
      con <- names(Omega[,1])
     
      mu <- tmp$conc_paramResult[tmp$Parameter=="mu"][match(con,tmp$Compound[tmp$Parameter=="mu"])] # + # baseline
#        rnorm(1,
#              tmp$conc_paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="mean"][1],
#              tmp$conc_paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="sd"][1]
#        ) * (tmp$lengtResult[1]-170) + # lengt
#        rnorm(1,
#              tmp$conc_paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="mean"][1],
#              tmp$conc_paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="sd"][1]
#        )* (tmp$timeResult[1]-2009) # time
     
      rnd <- exp(mvrnorm(1, mu, Omega))
      out <- rbind(out, merge(tmp2[i,], data.frame(Compound=con,Result=rnd)))
     
      #################### PFAS etc (with univariate norm)
      con <- tmp$Compound[tmp$Parameter=="mu_nd"]
      mu <- tmp$conc_paramResult[tmp$Parameter=="mu_nd"]
      tau <- tmp$conc_paramResult[tmp$Parameter=="tau_nd"][match(con,tmp$Compound[tmp$Parameter=="tau_nd"])]
      mulocat <- tmp$conc_paramResult[tmp$Parameter=="mulocat"]
      for(j in 1:length(con)) {
        rnd <- exp(rnorm(1 , mu[j] + mulocat , tau[j]))
        out <- rbind(out,
                    data.frame(tmp2[i,],Compound = con[j],Result = rnd)
                    )
      }
    }
    out$Row <- NULL
#    temp <- aggregate(
#      out["Result"],
#      by=out[setdiff(colnames(out), c("Result","Compound"))],
#      FUN=sum
#    )
#    temp$Compound <- "TEQ"
    out <- Ovariable(
      output = out, # rbind(out, temp),
      marginal = colnames(out) != "Result"
    )
    return(out)
   }
)
)


plot(coda.j)
objects.store(conc_poll)
cat("Ovariable conc_poll stored.\n")
</rcode>
</rcode>


'''Initiate concentration
==== Initiate conc_pcddf for Goherr ====


* Model run 19.5.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ystfGN6yfNwWNfnq]
* Model run 19.5.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ystfGN6yfNwWNfnq]
* Model run 23.5.2017 with bugs fixed [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=8iYF4GXFO9bUnld4]
* Model run 12.10.2017: TEQ calculation added [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=3WLbMeiZfbvo10ko]
* Model rerun 15.11.2017 because the previous stored run was lost in update [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=1Pq6y6l1WsKJEmjY]
* 12.3.2018 adjusted to match the same Omega for all fish species [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=KwbJ1pUP98De8Gzc]
* 26.3.2018 includes length and time as parameters, lengt ovariable initiated here [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=MV1xrtP9i6JuN7Tn]


<rcode name="initiate" label="Initiate concentration (for developers only)">
<rcode name="initiate" label="Initiate conc_pcddf (for developers only)" embed=1>
# This is code Op_en3104/initiate on page [[EU-kalat]]
# This is code Op_en3104/initiate on page [[EU-kalat]]


library(OpasnetUtils)
library(OpasnetUtils)


concentration <- Ovariable(
lengt <- Ovariable(
   "concentration",
   "lengt",
   dependencies = data.frame(Name = "conc.param", Ident = "Op_en3104/bayes"),
   dependencies=data.frame(Name="size",Ident=NA),
   formula = function(...) {
   formula = function(...) {
     jsp <- lapply(
     size$Lower <- as.numeric(as.character(size$Lower))
       1:length(conc.param$mu[,1]),
    rep <- unique(size@output[
       FUN = function(x) {
       colnames(size@output)!="Lower" & size@marginal
         temp <- exp(mvrnorm(openv$N, conc.param$mu[x,], conc.param$Omega[x,,]))
      ])
         dimnames(temp) <- c(list(Iter = 1:openv$N), dimnames(conc.param$mu)[2])
    out <- data.frame()
        return(temp)
    name <- paste(size@name, "Result", sep="")
    for(j in 1:nrow(rep)) {
      siz <- numeric()
      tmp <- merge(rep[j,,drop=FALSE], size@output)
      tmp <- tmp[order(tmp$Lower),]
      num <- tmp[[name]]/sum(tmp[[name]])
       for(i in 1:(nrow(tmp)-1)) { # Pick at random from each bin
         siz <- c(siz,runif(
          openv$N,
          tmp$Lower[i],
          tmp$Lower[i+1]
         )[1:ceiling(num[i]*openv$N)])
       }
       }
      out <- rbind(out, cbind(
        rep[j,,drop=FALSE],
        Iter=1:openv$N,
        Result=sample(siz, openv$N) # Take fixed amount and shuffle
      ))
    }
    return(Ovariable(output=out, marginal=!grepl("Result", colnames(out))))
  }
)
conc_pcddf <- Ovariable(
  "conc_pcddf",
  dependencies = data.frame(
    Name=c("conc.param","lengt","time"),
    Ident=c("Op_en3104/bayes",NA,NA)
  ),
  formula=function(...) {
    require(MASS)
    tmp1 <- lengt + time + conc.param + Ovariable(data=data.frame(Result="0-1")) # Ensures Iter
    tmp2 <- unique(tmp1@output[setdiff(
      colnames(tmp1@output)[tmp1@marginal],
      c("Compound","Compound2","Metaparam","Parameter")
    )])
    tmp2$Row <- 1:nrow(tmp2)
    tmp3 <- merge(tmp2,tmp1@output)
    out <- data.frame()
    con <- levels(tmp1$Compound)
    for(i in 1:nrow(tmp2)) {
      tmp <- tmp3[tmp3$Row == i , ]
      mu <- tmp$conc.paramResult[tmp$Parameter=="mu"][match(con,tmp$Compound[tmp$Parameter=="mu"])] + # baseline
        rnorm(1,
          tmp$conc.paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="mean"][1],
          tmp$conc.paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="sd"][1]
        )* (tmp$lengtResult[1]-170) + # lengt
        rnorm(1,
          tmp$conc.paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="mean"][1],
          tmp$conc.paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="sd"][1]
        )* (tmp$timeResult[1]-2009) # time
   
      Omega <- solve(tapply( # Is it sure that PCDDF and PCB are not mixed to wrong order?
        tmp$conc.paramResult[tmp$Parameter=="Omega"],
        tmp[tmp$Parameter=="Omega", c("Compound","Compound2")],
        sum # Equal to identity because only 1 row per cell.
      )) # Precision matrix
   
      rnd <- exp(mvrnorm(1, mu, Omega))
      out <- rbind(out, merge(tmp2[i,], data.frame(Compound=names(rnd),Result=rnd)))
    }
    out$Row <- NULL
    temp <- aggregate(
      out["Result"],
      by=out[setdiff(colnames(out), c("Result","Compound"))],
      FUN=sum
     )
     )
     names(jsp) <- dimnames(conc.param$mu)[[1]]
     temp$Compound <- "TEQ"
    jsp <- melt(jsp, value.name = "Result")
     out <- Ovariable(
    colnames(jsp)[colnames(jsp)=="L1"] <- "Fish" # Convert automatic name to meaningful
      output = rbind(out, temp),
     jsp <- Ovariable(output=jsp, marginal = colnames(jsp) != "Result")
      marginal = colnames(out) != "Result"
     return(jsp)
    )
     return(out)
   }
   }
)
)


objects.store(concentration)
objects.store(conc_pcddf, lengt)
cat("Ovariable concentration stored.\n")
cat("Ovariables conc_pcddf, lengt stored.\n")
</rcode>
</rcode>
{{attack|# |These codes should be coherent with [[POPs in Baltic herring]].|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 12:14, 7 June 2017 (UTC)}}


==See also==
==See also==

Latest revision as of 09:04, 17 March 2021


EU-kalat is a study, where concentrations of PCDD/Fs, PCBs, PBDEs and heavy metals have been measured from fish

Question

The scope of EU-kalat study was to measure concentrations of persistent organic pollutants (POPs) including dioxin (PCDD/F), PCB and BDE in fish from Baltic sea and Finnish inland lakes and rivers. [1] [2] [3].

Answer

Dioxin concentrations in Baltic herring.

The original sample results can be acquired from Opasnet base. The study showed that levels of PCDD/Fs and PCBs depends especially on the fish species. Highest levels were on salmon and large sized herring. Levels of PCDD/Fs exceeded maximum level of 4 pg TEQ/g fw multiple times. Levels of PCDD/Fs were correlated positively with age of the fish.

Mean congener concentrations as WHO2005-TEQ in Baltic herring can be printed out with this link or by running the codel below.

+ Show code

Rationale

Data

Data was collected between 2009-2010. The study contains years, tissue type, fish species, and fat content for each concentration measurement. Number of observations is 285.

There is a new study EU-kalat 3, which will produce results in 2016.

Calculations

Preprocess

  • Preprocess model 22.2.2017 [4]
  • Model run 25.1.2017 [5]
  • Model run 22.5.2017 with new ovariables euRaw, euAll, euMain, and euRatio [6]
  • Model run 23.5.2017 with adjusted ovariables euRaw, eu, euRatio [7]
  • Model run 11.10.2017: Small herring and Large herring added as new species [8]
  • Model rerun 15.11.2017 because the previous stored run was lost in update [9]
  • Model run 21.3.2018: Small and large herring replaced by actual fish length [10]
  • Model run 26.3.2018 eu2 moved here [11]

See an updated version of preprocess code for eu on Health effects of Baltic herring and salmon: a benefit-risk assessment#Code for estimating TEQ from chinese PCB7

+ Show code

Bayes model for dioxin concentrations

  • Model run 28.2.2017 [12]
  • Model run 28.2.2017 with corrected survey model [13]
  • Model run 28.2.2017 with Mu estimates [14]
  • Model run 1.3.2017 [15]
  • Model run 23.4.2017 [16] produces list conc.param and ovariable concentration
  • Model run 24.4.2017 [17]
  • Model run 19.5.2017 without ovariable concentration [18] ⇤--#: . The model does not mix well, so the results should not be used for final results. --Jouni (talk) 19:37, 19 May 2017 (UTC) (type: truth; paradigms: science: attack)
----#: . Maybe we should just estimate TEQs until the problem is fixed. --Jouni (talk) 19:37, 19 May 2017 (UTC) (type: truth; paradigms: science: comment)
  • Model run 22.5.2017 with TEQdx and TEQpcb as the only Compounds [19]
  • Model run 23.5.2017 debugged [20] [21] [22]
  • Model run 24.5.2017 TEQdx, TECpcb -> PCDDF, PCB [23]
  • Model run 11.10.2017 with small and large herring [24] (removed in update)
  • Model run 12.3.2018: bugs fixed with data used in Bayes. In addition, redundant fish species removed and Omega assumed to be the same for herring and salmon. [25]
  • Model run 22.3.2018 [26] Model does not mix well. Thinning gives little help?
  • Model run 25.3.2018 with conc.param as ovariable [27]

+ Show code

Initiate conc_pcddf for PFAS disease burden study

Initiate euw data.frame

This code is similar to preprocess but is better and includes PFAS concentrations from op_fi:PFAS-yhdisteiden tautitaakka. It produces data.frame euw that is the EU-kalat + PFAS data in wide format and, for PFAS but not EU-kalat, a sampled value for measurements below the level of quantification.

+ Show code

Initiate conc_param using Bayesian approach

Bayesian approach for PCDDF, PCB, OT, PFAS.

  • Model run 2021-03-08 [28]
  • Model run 2021-03-08 [29] with the fish needed in PFAS assessment
  • Model run 2021-03-12 [30] using euw
  • Model run 2021-03-13 [31] with location parameter for PFAS
  • Model run 2021-03-17 [32] location parameter not plotted because problems with older R version in Opasnet.

+ Show code

Initiate conc_poll

+ Show code

Initiate conc_pcddf for Goherr

  • Model run 19.5.2017 [33]
  • Model run 23.5.2017 with bugs fixed [34]
  • Model run 12.10.2017: TEQ calculation added [35]
  • Model rerun 15.11.2017 because the previous stored run was lost in update [36]
  • 12.3.2018 adjusted to match the same Omega for all fish species [37]
  • 26.3.2018 includes length and time as parameters, lengt ovariable initiated here [38]

+ Show code

⇤--#: . These codes should be coherent with POPs in Baltic herring. --Jouni (talk) 12:14, 7 June 2017 (UTC) (type: truth; paradigms: science: attack)

See also

References

  1. A. Hallikainen, H. Kiviranta, P. Isosaari, T. Vartiainen, R. Parmanne, P.J. Vuorinen: Kotimaisen järvi- ja merikalan dioksiinien, furaanien, dioksiinien kaltaisten PCB-yhdisteiden ja polybromattujen difenyylieettereiden pitoisuudet. Elintarvikeviraston julkaisuja 1/2004. [1]
  2. E-R.Venäläinen, A. Hallikainen, R. Parmanne, P.J. Vuorinen: Kotimaisen järvi- ja merikalan raskasmetallipitoisuudet. Elintarvikeviraston julkaisuja 3/2004. [2]
  3. Anja Hallikainen, Riikka Airaksinen, Panu Rantakokko, Jani Koponen, Jaakko Mannio, Pekka J. Vuorinen, Timo Jääskeläinen, Hannu Kiviranta. Itämeren kalan ja muun kotimaisen kalan ympäristömyrkyt: PCDD/F-, PCB-, PBDE-, PFC- ja OT-yhdisteet. Eviran tutkimuksia 2/2011. ISSN 1797-2981 ISBN 978-952-225-083-4 [3]