EU-kalat: Difference between revisions

From Opasnet
(→‎Bayes model for dioxin concentrations: length and year in model)
 
(26 intermediate revisions by the same user not shown)
Line 78: Line 78:
* Model rerun 15.11.2017 because the previous stored run was lost in update [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=njuQREEjuMtY4MlQ]
* Model rerun 15.11.2017 because the previous stored run was lost in update [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=njuQREEjuMtY4MlQ]
* Model run 21.3.2018: Small and large herring replaced by actual fish length [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=rzvSZKM7ekjDqEhu]
* Model run 21.3.2018: Small and large herring replaced by actual fish length [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=rzvSZKM7ekjDqEhu]
* Model run 26.3.2018 eu2 moved here [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=XHZrJOwEdDwWGJQm]
See an updated version of preprocess code for eu on [[Health effects of Baltic herring and salmon: a benefit-risk assessment#Code for estimating TEQ from chinese PCB7]]


<rcode name="preprocess" label="Preprocess (for developers only)">
<rcode name="preprocess" label="Preprocess (for developers only)">
Line 139: Line 142:
     return(eu)
     return(eu)
   }
   }
)
eu2 <- Ovariable(
  "eu2",
  dependencies=data.frame(Name="eu", Ident = NA),
  formula = function(...) {
replaces <- list(
  c("Chlorinated dibenzo-p-dioxins", "PCDDF"),
  c("Chlorinated dibenzofurans", "PCDDF"),
  c("Mono-ortho-substituted PCBs", "PCB"),
  c("Non-ortho-substituted PCBs", "PCB")
)
eu2 <- eu
for(i in 1:length(replaces)) {
  levels(eu2$Compound)[levels(eu2$Compound)==replaces[[i]][1]] <- replaces[[i]][2]
}
eu2@marginal[colnames(eu2@output) %in% c("Length","Year")] <- TRUE # Indexguide should take care of this but it doesn't!
eu2 <- unkeep(eu2, prevresults = TRUE, sources = TRUE)
eu2 <- oapply(eu2, cols = "TEFversion", FUN = "sum") # Sums up dioxin+furan and non+monoortho. This goes wrong if > 1 TEFversion.
return(eu2)
}
)
)


Line 210: Line 236:
#))
#))


objects.store(euRaw, eu, euRatio, indices, indexguide)
objects.store(euRaw, eu, eu2, euRatio, indices, indexguide)
cat("Ovariables euRaw, eu, euRatio, and indexguide and list indices stored.\n")
cat("Ovariables euRaw, eu, eu2, euRatio, and indexguide and list indices stored.\n")
</rcode>
</rcode>
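The level-merging step used for eu2 can be illustrated without OpasnetUtils. The following is a minimal sketch in base R, assuming a toy data.frame d with made-up values; aggregate() stands in here for oapply(), and the column name Result is an assumption made for the illustration.

```r
# Toy data: hypothetical TEQ contributions by compound group (values are made up).
d <- data.frame(
  Fish = rep("Baltic herring", 4),
  Compound = factor(c(
    "Chlorinated dibenzo-p-dioxins", "Chlorinated dibenzofurans",
    "Mono-ortho-substituted PCBs", "Non-ortho-substituted PCBs"
  )),
  Result = c(1.0, 2.0, 0.5, 1.5)
)

# Same mapping as on this page: old level name, new level name.
replaces <- list(
  c("Chlorinated dibenzo-p-dioxins", "PCDDF"),
  c("Chlorinated dibenzofurans", "PCDDF"),
  c("Mono-ortho-substituted PCBs", "PCB"),
  c("Non-ortho-substituted PCBs", "PCB")
)

# Giving two levels the same label merges them into a single level.
for (r in replaces) {
  levels(d$Compound)[levels(d$Compound) == r[1]] <- r[2]
}

# Sum the contributions within each merged group.
grouped <- aggregate(Result ~ Fish + Compound, data = d, FUN = sum)
print(grouped)
```

As the comment in the page code notes, summing over groups in this way is only meaningful when the data contain a single TEFversion.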


Line 230: Line 256:
* Model run 12.3.2018: bugs fixed with data used in Bayes. In addition, redundant fish species removed and Omega assumed to be the same for herring and salmon. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=k0n2CFnjdGBklm9E]
* Model run 12.3.2018: bugs fixed with data used in Bayes. In addition, redundant fish species removed and Omega assumed to be the same for herring and salmon. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=k0n2CFnjdGBklm9E]
* Model run 22.3.2018 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2jX2XxWpiIEZPyzJ] Model does not mix well. Thinning gives little help?
* Model run 22.3.2018 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2jX2XxWpiIEZPyzJ] Model does not mix well. Thinning gives little help?
* Model run 25.3.2018 with conc.param as ovariable [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=DbcmZJmuZ0h0vaGx]


<rcode name="bayes" label="Sample Bayes model (for developers only)" graphics=1>
<rcode name="bayes" label="Sample Bayes model (for developers only)" graphics=1>
Line 241: Line 268:
library(car) # scatterplotMatrix
library(car) # scatterplotMatrix


objects.latest("Op_en3104", code_name = "preprocess") # [[EU-kalat]] eu, euRatio, indices
objects.latest("Op_en3104", code_name = "preprocess") # [[EU-kalat]] eu, eu2, euRatio, indices
 
eu2 <- eu <- EvalOutput(eu)


conl <- indices$Compound.TEQ2
conl <- indices$Compound.TEQ2
Line 255: Line 280:
fisl
fisl


replaces <- list(
eu2 <- EvalOutput(eu2)
  c("Chlorinated dibenzo-p-dioxins", "PCDDF"),
  c("Chlorinated dibenzofurans", "PCDDF"),
  c("Mono-ortho-substituted PCBs", "PCB"),
  c("Non-ortho-substituted PCBs", "PCB")
)
 
for(i in 1:length(replaces)) {
  levels(eu2$Compound)[levels(eu2$Compound)==replaces[[i]][1]] <- replaces[[i]][2]
}
 
eu2@marginal[colnames(eu2@output) %in% c("Length","Year")] <- TRUE # Indexguide should take care of this but it doesn't!
eu2 <- unkeep(eu2, prevresults = TRUE, sources = TRUE)
eu2 <- oapply(eu2, cols = "TEFversion", FUN = "sum") # Sums up dioxin+furan and non+monoortho. This goes wrong if > 1 TEFversion.


# Hierarchical Bayes model.
# Hierarchical Bayes model.
Line 367: Line 379:
     'lenp',# parameters for length
     'lenp',# parameters for length
     'timep', # parameter for Year
     'timep', # parameter for Year
     'pred' # predicted concentration for year 2009 and length 17 cm  
     'pred' # predicted concentration for year 2009 and length 17 cm
   ),  
   ),  
   #  thin=1000,
   #  thin=1000,
Line 376: Line 388:
dimnames(samps.j$lenp) <- list(Fish = fisl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$lenp) <- list(Fish = fisl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$timep) <- list(Param = "time", Iter = 1:N, Chain = 1:4)
dimnames(samps.j$timep) <- list(Dummy = "time", Iter = 1:N, Chain = 1:4)


##### conc.param contains expected values of the distribution parameters from the model
##### conc.param contains expected values of the distribution parameters from the model
conc.param <- list(
 
  Omega = apply(samps.j$Omega, MARGIN = 1:3, FUN = mean),
conc.param <- Ovariable(
  lenp.mean = apply(samps.j$lenp, MARGIN = 1, FUN = mean),
  "conc.param",
  lenp.sd = apply(samps.j$lenp, MARGIN = 1, FUN = sd),
  dependencies = data.frame(Name = "samps.j", Ident=NA),
  mu = apply(samps.j$mu, MARGIN = 1:2, FUN = mean),
  formula = function(...) {
  timep.mean = apply(samps.j$timep, MARGIN = 1, FUN = mean),
    conc.param <- list(
  timep.sd = apply(samps.j$timep, MARGIN = 1, FUN = sd)
      Omega = apply(samps.j$Omega, MARGIN = 1:3, FUN = mean),
      lenp = cbind(
        mean = apply(samps.j$lenp, MARGIN = 1, FUN = mean),
        sd = apply(samps.j$lenp, MARGIN = 1, FUN = sd)
      ),
      mu = apply(samps.j$mu, MARGIN = 1:2, FUN = mean),
      timep = cbind(
        mean = apply(samps.j$timep, MARGIN = 1, FUN = mean),
        sd = apply(samps.j$timep, MARGIN = 1, FUN = sd)
      )
    )
    names(dimnames(conc.param$lenp)) <- c("Fish","Metaparam")
    names(dimnames(conc.param$timep)) <- c("Dummy","Metaparam")
    conc.param <- melt(conc.param)
    colnames(conc.param)[colnames(conc.param)=="value"] <- "Result"
    colnames(conc.param)[colnames(conc.param)=="L1"] <- "Parameter"
    conc.param$Dummy <- NULL
    conc.param$Metaparam <- ifelse(is.na(conc.param$Metaparam), conc.param$Parameter, as.character(conc.param$Metaparam))
    return(Ovariable(output=conc.param, marginal=colnames(conc.param)!="Result"))
  }
)
)


if(FALSE){
objects.store(conc.param, samps.j)
objects.store(conc.param, samps.j)
cat("Lists conc.params and samps.j stored.\n")
cat("Lists conc.params and samps.j stored.\n")
Line 406: Line 436:
ggplot(eu2@output, aes(x = euResult, colour=Compound))+stat_ecdf()+
ggplot(eu2@output, aes(x = euResult, colour=Compound))+stat_ecdf()+
   facet_wrap( ~ Fish)+scale_x_log10()
   facet_wrap( ~ Fish)+scale_x_log10()
}
fislen <- c(233, 170)
jsp <- lapply(1:length(conc.param$mu[, 1]), FUN = function(x) {
  temp <- exp(mvrnorm(
    openv$N,
    conc.param$mu[x, ]+conc.param$lenp.mean*(fislen[x]-170)+conc.param$timep.mean*(2009-2009),
    solve(conc.param$Omega[x, , ])
  ))
  dimnames(temp) <- c(list(Iter = 1:openv$N), dimnames(conc.param$mu)[2])
  return(temp)
})
names(jsp) <- dimnames(conc.param$mu)[[1]]
jsp <- melt(jsp, value.name = "Result")
colnames(jsp)[colnames(jsp) == "L1"] <- "Fish"
   
ggplot()+
  stat_ecdf(data=eu2@output, aes(x=euResult, colour="Data"))+
  stat_ecdf(data=melt(exp(samps.j$pred[,,,1])), aes(x=value, colour="Bayes estimate"))+
  stat_ecdf(data=jsp, aes(x=Result, colour="MC estimate"))+
  facet_grid(Compound ~ Fish)+scale_x_log10()


ggplot(eu2@output,  
ggplot(eu2@output,  
Line 436: Line 443:
scatterplotMatrix(t(exp(samps.j$pred[1,,,1])), main = "Predictions for all compounds for Baltic herring")
scatterplotMatrix(t(exp(samps.j$pred[1,,,1])), main = "Predictions for all compounds for Baltic herring")
scatterplotMatrix(t(exp(samps.j$pred[,1,,1])), main = "Predictions for all fish species for PCDDF")
scatterplotMatrix(t(exp(samps.j$pred[,1,,1])), main = "Predictions for all fish species for PCDDF")
scatterplotMatrix(t(samps.j$Omega[,1,1,,1]))
#scatterplotMatrix(t(cbind(samps.j$Omega[1,1,1,,1],samps.j$mu[1,1,,1])))


plot(coda.samples(jags, 'Omega', N))
plot(coda.samples(jags, 'Omega', N))
Line 444: Line 453:
</rcode>
</rcode>


==== Initiate conc_pcddf ====
==== Initiate conc_pcddf for PFAS disease burden study ====
 
===== Initiate euw data.frame =====
This code is an improved version of preprocess that also includes PFAS concentrations from [[:op_fi:PFAS-yhdisteiden tautitaakka]]. It produces the data.frame euw, which contains the EU-kalat and PFAS data in wide format. For PFAS, but not for EU-kalat, measurements below the limit of quantification are given a sampled value.
 
<rcode name="preprocess2" label="Preprocess and initiate data.frame euw (for developers only)" embed=1>
# This is code Op_en3104/preprocess2 on page [[EU-kalat]]
library(OpasnetUtils)
library(ggplot2)
library(reshape2)
 
openv.setN(1)
opts = options(stringsAsFactors = FALSE)
euRaw <- Ovariable("euRaw", ddata = "Op_en3104", subset = "POPs") # [[EU-kalat]]
 
eu <- Ovariable(
  "eu",
  dependencies = data.frame(
    Name=c("euRaw", "TEF"),
    Ident=c(NA,"Op_en4017/initiate")
  ),
  formula = function(...) {
    out <- euRaw
    out$Length<-as.numeric(as.character(out$Length_mean_mm))
    out$Year <- as.numeric(substr(out$Catch_date, nchar(as.character(out$Catch_date))-3,100))
    out$Weight<-as.numeric(as.character(out$Weight_mean_g))
 
    out <- out[,c(1:6, 8: 10, 14:17, 19:22, 18)] # See below
   
    #[1] "THL_code"             "Matrix"                "POP"                   "Fish_species"       
    #[5] "Catch_site"            "Catch_location"        "Catch_season"          "Catch_square"       
    #[9] "N_individuals"        "Sex"                  "Age"                  "Fat_percentage"     
    #[13] "Dry_matter_percentage" "euRawSource"          "Length"                "Year"               
    #[17] "Weight"                "euRawResult"         
   
    colnames(out@output)[1:13] <- c("THLcode", "Matrix", "Compound", "Fish", "Site", "Location", "Season",
                                  "Square","N","Sex","Age","Fat","Dry_matter")
    out@marginal <- colnames(out)!="euRawResult"
   
    tmp <- oapply(out * TEF, cols = "Compound", FUN = "sum")
    colnames(tmp@output)[colnames(tmp@output)=="Group"] <- "Compound"
    # levels(tmp$Compound)
    # [1] "Chlorinated dibenzo-p-dioxins" "Chlorinated dibenzofurans"    "Mono-ortho-substituted PCBs" 
    # [4] "Non-ortho-substituted PCBs" 
    levels(tmp$Compound) <- c("PCDD","PCDF","moPCB","noPCB")
   
    out <- OpasnetUtils::combine(out, tmp)
   
    out$Compound <- factor( # Compound levels are ordered based on the data table on [[TEF]]
      out$Compound,
      levels = unique(c(levels(TEF$Compound), unique(out$Compound)))
    )
    out$Compound <- out$Compound[,drop=TRUE]
   
    return(out)
  }
)
 
eu <- EvalOutput(eu)
 
euw <- reshape(
  eu@output,
  v.names = "euResult",
  idvar = c("THLcode", "Matrix", "Fish"), # , "Site","Location","Season","Square","N","Sex","Age","Fat", "Dry_matter","Length","Year","Weight"
  timevar = "Compound",
  drop = c("euRawSource","TEFversion","TEFrawSource","TEFSource","Source","euSource"),
  direction = "wide"
)
colnames(euw) <- gsub("euResult\\.","",colnames(euw))
euw$PCDDF <- euw$PCDD + euw$PCDF
euw$PCB <- euw$noPCB + euw$moPCB
euw$TEQ <- euw$PCDDF + euw$PCB
euw$PFOA <- euw$PFOA / 1000 # pg/g --> ng/g
euw$PFOS <- euw$PFOS / 1000 # pg/g --> ng/g
euw$PFAS <- euw$PFOA + euw$PFOS
 
#################### PFAS measurements from Porvoo
 
conc_pfas_raw <- EvalOutput(Ovariable(
  "conc_pfas_raw",
  data=opbase.data("Op_fi5932", subset="PFAS concentrations"), # [[PFAS-yhdisteiden tautitaakka]]
  unit="ng/g f.w.")
)@output
 
conc_pfas_raw <- reshape(conc_pfas_raw,
                        v.names="conc_pfas_rawResult",
                        timevar="Compound",
                        idvar=c("Obs","Fish"),
                        drop="conc_pfas_rawSource",
                        direction="wide")
 
colnames(conc_pfas_raw) <- gsub("conc_pfas_rawResult\\.","",colnames(conc_pfas_raw))
conc_pfas_raw <- within(conc_pfas_raw, PFAS <- PFOS + PFHxS + PFOA + PFNA)
conc_pfas_raw$Obs <- NULL
 
euw <- orbind(euw, conc_pfas_raw)
 
objects.store(euw)
cat("Data.frame euw stored.\n")
</rcode>
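The long-to-wide conversion that produces euw can be tried on a small example. This is a sketch in base R under stated assumptions: the data.frame long and its values are hypothetical, and only the columns needed for the reshape are included.

```r
# Hypothetical long-format data: one row per sample and compound group.
long <- data.frame(
  THLcode = rep(c("S1", "S2"), each = 4),
  Fish = rep(c("Baltic herring", "Salmon"), each = 4),
  Compound = rep(c("PCDD", "PCDF", "moPCB", "noPCB"), times = 2),
  euResult = c(1.0, 2.0, 0.5, 1.5, 2.0, 3.0, 1.0, 2.5)
)

# One row per sample, one column per compound group.
wide <- reshape(
  long,
  v.names = "euResult",
  idvar = c("THLcode", "Fish"),
  timevar = "Compound",
  direction = "wide"
)

# reshape() names the new columns euResult.<Compound>; strip the prefix.
colnames(wide) <- gsub("euResult\\.", "", colnames(wide))

# Derived sums, as in the page code.
wide$PCDDF <- wide$PCDD + wide$PCDF
wide$PCB <- wide$noPCB + wide$moPCB
wide$TEQ <- wide$PCDDF + wide$PCB
print(wide)
```

A wide layout makes sums such as PCDDF and TEQ simple column arithmetic, at the cost of NA cells for compounds that were not measured in a given sample.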
 
===== Initiate conc_param using Bayesian approach =====
Bayesian approach for PCDDF, PCB, OT, PFAS.
* Model run 2021-03-08 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ZvJDOo7xL8d7x7EI]
* Model run 2021-03-08 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=VpSUS4pfGavspLG9] with the fish needed in PFAS assessment
* Model run 2021-03-12 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=Lc9KWY7r1tTuGWVD] using euw
* Model run 2021-03-13 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=pTiMHkD4Lq0EdLab] with location parameter for PFAS
* Model run 2021-03-17 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=Mrko9rrynNRELP07] location parameter not plotted because of problems with the older R version in Opasnet.
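The model code on this page quotes a constant time trend derived from two reported concentrations (9 pg/g fw in 2001 and 3.5 pg/g fw in 2016). The arithmetic can be checked with the short sketch below; the variable names are chosen for this illustration only.

```r
# Concentrations quoted in the model comments (pg/g fw).
c_2001 <- 9
c_2016 <- 3.5
years <- 2016 - 2001

# Relative change per year, assuming a constant proportional decline.
annual_change <- (c_2016 / c_2001)^(1 / years) - 1
print(annual_change)  # about -0.0610, the value quoted in the model comments

# The same trend expressed as a slope on the log scale.
log_slope <- log(c_2016 / c_2001) / years
print(log_slope)

# JAGS dnorm() takes a precision as its second argument, so a prior
# dnorm(-0.0610, 10000) corresponds to this standard deviation:
prior_sd <- 1 / sqrt(10000)
print(prior_sd)
```

Because the model is fitted to log concentrations, the log-scale slope is the quantity that enters the linear predictor directly; it is slightly larger in magnitude than the relative annual change.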
 
<rcode name="pollutant_bayes" label="Initiate conc_param with PCDDF, PFAS, OT (for developers only)" embed=0 graphics=1>
# This is code Op_en3104/pollutant_bayes on page [[EU-kalat]]
# The code is also available at https://github.com/jtuomist/pfas/blob/main/conc_pcddf_preprocess.R
 
library(OpasnetUtils)
library(reshape2)
library(rjags) # JAGS
library(ggplot2)
library(MASS) # mvrnorm
library(car) # scatterplotMatrix
 
#' Find the level of quantification for dinterval function
#' @param df data.frame
#' @return data.matrix
add_loq <- function(df) { # This should reflect the fraction of observations below LOQ.
  LOQ <- unlist(lapply(df, FUN = function(x) min(x[x!=0], na.rm=TRUE)))
  out <- sapply(
    1:length(LOQ),
    FUN = function(x) ifelse(df[,x]==0, 0.5*LOQ[x], df[,x])
  )
  out <- data.matrix(out)
  return(out)
}
 
#size <- Ovariable("size", ddata="Op_en7748", subset="Size distribution of fish species")
#time <- Ovariable("time", data = data.frame(Result=2015))
 
objects.latest("Op_en3104", code_name = "preprocess2") # [[EU-kalat]] euw
 
# Hierarchical Bayes model.
 
# PCDD/F concentrations in fish.
# It uses the TEQ sum of PCDD/F (PCDDF) as the total concentration of dioxins
# and, respectively, the TEQ sum of PCBs (PCB) as the total concentration of PCBs in fish.
# PCDDF depends on size of fish, fish species, catchment time, and catchment area,
# but we omit catchment area. In addition, we assume that size of fish has
# zero effect for other fish than Baltic herring.
# Catchment year affects all species similarly.
 
eu3 <- euw[!colnames(euw) %in% c("MPhT","DOT","BDE138")] # No values > 0
 
eu3 <- eu3[eu3$Matrix == "Muscle" , ]
eu3$Locat <- ifelse(eu3$Location=="Porvoo",2,
                      ifelse(eu3$Location=="Helsinki, Vanhankaupunginlahti Bay",3,1))
locl <- c("Finland","Porvoo","Helsinki")
 
#conl_nd <- c("PFAS","PFOA","PFOS","DBT","MBT","TBT","DPhT","TPhT")
conl_nd <- c("PFAS","PFOS") # TBT would drop Porvoo measurements
fisl <- fisl_nd <- c("Baltic herring","Bream","Flounder","Perch","Roach","Salmon","Whitefish")
 
eu4 <- eu3[rowSums(is.na(eu3[conl_nd]))<length(conl_nd) & eu3$Fish %in% fisl_nd ,
          c(1:5,match(c("Locat",conl_nd),colnames(eu3)))]
 
conc_nd <- add_loq(eu4[conl_nd])
 
conl <- c("TEQ","PCDDF","PCB") # setdiff(colnames(eu3)[-(1:5)], conl_nd)
eu3 <- eu3[!is.na(eu3$PCDDF) & eu3$Fish %in% fisl , c(1:5, match(conl,colnames(eu3)))]
 
oprint(head(eu3))
oprint(head(eu4))
 
C <- length(conl)
Fi <- length(fisl)
N <- 200
thin <- 100
conl
fisl
conl_nd
fisl_nd
 
eu3 <- eu3[rowSums(is.na(eu3))==0,] # Remove rows with missing data.
conc <- add_loq(eu3[conl]) # Replace zeros with half of the smallest non-zero value of each compound.
 
# The model assumes that all fish groups have the same Omega but mu varies.
 
mod <- textConnection(
  "
  model{
    for(i in 1:S) { # S = fish sample
      #        below.LOQ[i,j] ~ dinterval(-conc[i,j], -LOQ[j])
      conc[i,1:C] ~ dmnorm(muind[i,], Omega[fis[i],,])
      muind[i,1:C] <- mu[fis[i],1:C] #+ lenp[fis[i]]*length[i] + timep*year[i]
    }
    for(i in 1:S_nd) {
      for(j in 1:C_nd) {
        conc_nd[i,j] ~ dnorm(muind_nd[i,j], tau_nd[j])
        muind_nd[i,j] <- mu_nd[fis_nd[i],j] + mulocat[locat[i]] #+ lenp[fis[i]]*length[i] + timep*year[i]
      }
    }
   
    # Priors for parameters
    # Time trend. Assumed a known constant because at the moment there is little time variation in data.
    # https://www.evira.fi/elintarvikkeet/ajankohtaista/2018/itameren-silakoissa-yha-vahemman-ymparistomyrkkyja---paastojen-rajoitukset-vaikuttavat/
    # PCDDF/PCB-concentations 2001: 9 pg/g fw, 2016: 3.5 pg/g fw. (3.5/9)^(1/15)-1=-0.06102282
  #  timep ~ dnorm(-0.0610, 10000)
  #  lenp[1] ~ dnorm(0.01,0.01) # length parameter for herring
  #  lenp[2] ~ dnorm(0,10000) # length parameter for salmon: assumed zero
   
    for(i in 1:Fi) { # Fi = fish species
      Omega[i,1:C,1:C] ~ dwish(Omega0[1:C,1:C],S)
      pred[i,1:C] ~ dmnorm(mu[i,1:C], Omega[i,,]) #+lenp[i]*lenpred+timep*timepred, Omega[i,,]) # Model prediction.
      for(j in 1:C) {
        mu[i,j] ~ dnorm(0, 0.0001) # mu1[j], tau1[j]) # Congener-specific mean for fishes
      }
    }
    # Non-dioxins
    mulocat[1] <- 0
    mulocat[2] ~ dnorm(0,0.001)
    mulocat[3] ~ dnorm(0,0.001)
    for(j in 1:C_nd) {
      tau_nd[j] ~ dgamma(0.001,0.001)
      for(i in 1:Fi_nd) { # Fi = fish species
        pred_nd[i,j] ~ dnorm(mu_nd[i,j], tau_nd[j]) # Prediction for non-dioxins uses mu_nd, not the dioxin mean mu
        mu_nd[i,j] ~ dnorm(0, 0.0001)
      }
    }
  }
")
 
jags <- jags.model(
  mod,
  data = list(
    S = nrow(conc),
    S_nd = nrow(conc_nd),
    C = C,
    C_nd = ncol(conc_nd),
    Fi = Fi,
    Fi_nd = length(fisl_nd),
    conc = log(conc),
    conc_nd = log(conc_nd),
    locat = eu4$Locat,
    #    length = eu3$Length-170, # Subtract average herring size
    #    year = eu3$Year-2009, # Substract baseline year
    fis = match(eu3$Fish, fisl),
    fis_nd = match(eu4$Fish, fisl_nd),
    #    lenpred = 233-170,
    #    timepred = 2009-2009,
    Omega0 = diag(C)/100000
  ),
  n.chains = 4,
  n.adapt = 200
)
 
update(jags, 1000)
 
samps.j <- jags.samples(
  jags,
  c(
    'mu', # mean by fish and compound
    'Omega', # precision matrix by compound
    #    'lenp',# parameters for length
    #    'timep', # parameter for Year
    'pred', # predicted concentration for year 2009 and length 17 cm
    'pred_nd',
    'mu_nd',
    'tau_nd',
    'mulocat'
  ),
  thin=thin,
  N*thin
)
dimnames(samps.j$Omega) <- list(Fish = fisl, Compound = conl, Compound2 = conl, Iter=1:N, Chain=1:4)
dimnames(samps.j$mu) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
#dimnames(samps.j$lenp) <- list(Fish = fisl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred) <- list(Fish = fisl, Compound = conl, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$pred_nd) <- list(Fish = fisl_nd, Compound = conl_nd, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$mu_nd) <- list(Fish = fisl_nd, Compound = conl_nd, Iter = 1:N, Chain = 1:4)
dimnames(samps.j$tau_nd) <- list(Compound = conl_nd, Iter = 1:N, Chain = 1:4)
#dimnames(samps.j$timep) <- list(Dummy = "time", Iter = 1:N, Chain = 1:4)
dimnames(samps.j$mulocat) <- list(Area = locl, Iter = 1:N, Chain = 1:4)
 
##### conc_param contains expected values of the distribution parameters from the model
 
conc_param <- list(
  Omega = apply(samps.j$Omega, MARGIN = 1:3, FUN = mean),
  #      lenp = cbind(
  #        mean = apply(samps.j$lenp, MARGIN = 1, FUN = mean),
  #        sd = apply(samps.j$lenp, MARGIN = 1, FUN = sd)
  #      ),
  mu = apply(samps.j$mu, MARGIN = 1:2, FUN = mean),
  #      timep = cbind(
  #        mean = apply(samps.j$timep, MARGIN = 1, FUN = mean),
  #        sd = apply(samps.j$timep, MARGIN = 1, FUN = sd)
  #      )
  mu_nd =  apply(samps.j$mu_nd, MARGIN = 1:2, FUN = mean),
  tau_nd =  apply(samps.j$tau_nd, MARGIN = 1, FUN = mean),
  mulocat = apply(samps.j$mulocat, MARGIN = 1, FUN = mean)
)
#    names(dimnames(conc_param$lenp)) <- c("Fish","Metaparam")
#    names(dimnames(conc_param$timep)) <- c("Dummy","Metaparam")
 
conc_param <- melt(conc_param)
colnames(conc_param)[colnames(conc_param)=="value"] <- "Result"
colnames(conc_param)[colnames(conc_param)=="L1"] <- "Parameter"
conc_param$Compound[conc_param$Parameter =="tau_nd"] <- conl_nd # drops out because one-dimensional
conc_param$Area[conc_param$Parameter =="mulocat"] <- locl # drops out because one-dimensional
conc_param <- fillna(conc_param,c("Fish","Area"))
for(i in 1:ncol(conc_param)) {
  if("factor" %in% class(conc_param[[i]])) conc_param[[i]] <- as.character(conc_param[[i]])
}
conc_param <- Ovariable("conc_param",data=conc_param)
 
objects.store(conc_param)
cat("Ovariable conc_param stored.\n")
 
######################3
 
cat("Descriptive statistics:\n")
 
# Leave only the main fish species and congeners and remove others
 
#oprint(summary(
#  eu2[eu2$Compound %in% indices$Compound.PCDDF14 & eu$Fish %in% fisl , ],
#  marginals = c("Fish", "Compound"), # Matrix is always 'Muscle'
#  function_names = c("mean", "sd")
#))
 
#tmp <- euw[euw$Compound %in% c("PCDDF","PCB","BDE153","PBB153","PFOA","PFOS","DBT","MBT","TBT"),]
#ggplot(tmp, aes(x = eu2Result, colour=Fish))+stat_ecdf()+
#  facet_wrap( ~ Compound, scales="free_x")+scale_x_log10()
 
dimnames(samps.j$mulocat)
 
scatterplotMatrix(t(exp(samps.j$pred[2,,,1])), main = paste("Predictions for several compounds for",
                                                            names(samps.j$pred[,1,1,1])[2]))
scatterplotMatrix(t(exp(samps.j$pred[,1,,1])), main = paste("Predictions for all fish species for",
                                                            names(samps.j$pred[1,,1,1])[1]))
scatterplotMatrix(t(samps.j$Omega[2,,1,,1]), main = "Omega for several compounds in Baltic herring")
 
scatterplotMatrix(t((samps.j$pred_nd[1,,,1])), main = paste("Predictions for several compounds for",
                                                            names(samps.j$pred_nd[,1,1,1])[1]))
 
#scatterplotMatrix(t((samps.j$mulocat[,,1])), main = paste("Predictions for location average difference",
#                                                            names(samps.j$pred_nd[,1,1,1])[1]))
 
#plot(coda.samples(jags, 'Omega', N))
plot(coda.samples(jags, 'mu', N*thin, thin))
#plot(coda.samples(jags, 'lenp', N))
#plot(coda.samples(jags, 'timep', N))
plot(coda.samples(jags, 'pred', N*thin, thin))
plot(coda.samples(jags, 'mu_nd', N*thin, thin))
plot(coda.samples(jags, 'mulocat', N*thin, thin))
tst <- (coda.samples(jags, 'pred', N))
</rcode>
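The handling of zero values by add_loq can be checked on a small table. The function below is copied from the code on this page with only cosmetic changes; the data.frame toy and its values are hypothetical. Note that the function uses the smallest non-zero observation of each column as a proxy for the limit of quantification, which is an approximation.

```r
# add_loq as defined on this page: zeros are replaced by half of the
# smallest non-zero value in the same column.
add_loq <- function(df) {
  LOQ <- unlist(lapply(df, FUN = function(x) min(x[x != 0], na.rm = TRUE)))
  out <- sapply(
    seq_along(LOQ),
    FUN = function(i) ifelse(df[, i] == 0, 0.5 * LOQ[i], df[, i])
  )
  data.matrix(out)
}

# Hypothetical concentrations; 0 marks a result below the quantification limit.
toy <- data.frame(PFAS = c(0, 2, 4), PFOS = c(1, 0, 3))

res <- add_loq(toy)
print(res)

# After substitution every value is positive, so log() is safe.
print(log(res))
```

This substitution keeps the data usable on the log scale, which the model requires, but it does not treat the values as censored observations.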
 
===== Initiate conc_poll=====
<rcode name="conc_poll" label="Initiate conc_poll" embed=1>
#This is code Op_en3104/conc_poll on page [[EU-kalat]]
 
library(OpasnetUtils)
 
conc_poll <- Ovariable(
  "conc_poll",
  dependencies = data.frame(
    Name=c("conc_param"), #,"lengt","time"),
    Ident=c("Op_en3104/pollutant_bayes")#,NA,NA)
  ),
  formula=function(...) {
    require(MASS)
    tmp1 <- conc_param + Ovariable(data=data.frame(Result="0-1")) # Ensures Iter #  lengt + time +
    tmp2 <- unique(tmp1@output[setdiff(
      colnames(tmp1@output)[tmp1@marginal],
      c("Compound","Compound2","Metaparam","Parameter")
    )])
    tmp2$Row <- 1:nrow(tmp2)
    tmp3 <- merge(tmp2,tmp1@output)
    out <- data.frame()
    for(i in 1:nrow(tmp2)) {
     
      ############## PCDDF (with multivariate mvnorm)
     
      tmp <- tmp3[tmp3$Row == i , ]
      Omega <- solve(tapply(
        tmp$conc_paramResult[tmp$Parameter=="Omega"],
        tmp[tmp$Parameter=="Omega", c("Compound","Compound2")],
        sum # Equal to identity because only 1 row per cell.
      )) # solve() inverts the stored precision matrix, so Omega now holds the covariance matrix
      con <- names(Omega[,1])
     
      mu <- tmp$conc_paramResult[tmp$Parameter=="mu"][match(con,tmp$Compound[tmp$Parameter=="mu"])] # + # baseline
#        rnorm(1,
#              tmp$conc_paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="mean"][1],
#              tmp$conc_paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="sd"][1]
#        ) * (tmp$lengtResult[1]-170) + # lengt
#        rnorm(1,
#              tmp$conc_paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="mean"][1],
#              tmp$conc_paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="sd"][1]
#        )* (tmp$timeResult[1]-2009) # time
     
      rnd <- exp(mvrnorm(1, mu, Omega))
      out <- rbind(out, merge(tmp2[i,], data.frame(Compound=con,Result=rnd)))
     
      #################### PFAS etc (with univariate norm)
      con <- tmp$Compound[tmp$Parameter=="mu_nd"]
      mu <- tmp$conc_paramResult[tmp$Parameter=="mu_nd"]
      tau <- tmp$conc_paramResult[tmp$Parameter=="tau_nd"][match(con,tmp$Compound[tmp$Parameter=="tau_nd"])]
      mulocat <- tmp$conc_paramResult[tmp$Parameter=="mulocat"]
      for(j in 1:length(con)) {
        rnd <- exp(rnorm(1 , mu[j] + mulocat , 1/sqrt(tau[j]))) # tau is a precision (JAGS); rnorm needs a standard deviation
        out <- rbind(out,
                    data.frame(tmp2[i,],Compound = con[j],Result = rnd)
                    )
      }
    }
    out$Row <- NULL
#    temp <- aggregate(
#      out["Result"],
#      by=out[setdiff(colnames(out), c("Result","Compound"))],
#      FUN=sum
#    )
#    temp$Compound <- "TEQ"
    out <- Ovariable(
      output = out, # rbind(out, temp),
      marginal = colnames(out) != "Result"
    )
    return(out)
  }
)
 
objects.store(conc_poll)
cat("Ovariable conc_poll stored.\n")
</rcode>
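The sampling step in conc_poll relies on the difference between the JAGS and R parameterisations: JAGS dmnorm and dnorm take a precision, whereas MASS::mvrnorm takes a covariance matrix and rnorm takes a standard deviation. The sketch below shows the conversions with hypothetical parameter values; mu, Omega, mu_nd and tau are made-up numbers, not results from the model.

```r
library(MASS)  # mvrnorm

set.seed(1)
n <- 1000

# Hypothetical posterior means on the log scale and a precision matrix.
mu <- c(PCDDF = 0.5, PCB = 0.2)
Omega <- matrix(
  c(4, -1, -1, 2), nrow = 2,
  dimnames = list(names(mu), names(mu))
)

# Precision matrix -> covariance matrix.
Sigma <- solve(Omega)

# Correlated log-normal draws for the dioxin-like compounds.
draws <- exp(mvrnorm(n, mu, Sigma))

# Univariate case: precision tau -> standard deviation.
mu_nd <- -1
tau <- 4
sd_nd <- 1 / sqrt(tau)
draws_nd <- exp(rnorm(n, mean = mu_nd, sd = sd_nd))

print(head(draws))
print(summary(draws_nd))
```

Drawing PCDDF and PCB jointly preserves their correlation, which independent univariate draws would lose.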
 
==== Initiate conc_pcddf for Goherr ====


* Model run 19.5.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ystfGN6yfNwWNfnq]
* Model run 19.5.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ystfGN6yfNwWNfnq]
Line 451: Line 892:
* Model rerun 15.11.2017 because the previous stored run was lost in update [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=1Pq6y6l1WsKJEmjY]
* Model rerun 15.11.2017 because the previous stored run was lost in update [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=1Pq6y6l1WsKJEmjY]
* 12.3.2018 adjusted to match the same Omega for all fish species [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=KwbJ1pUP98De8Gzc]
* 12.3.2018 adjusted to match the same Omega for all fish species [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=KwbJ1pUP98De8Gzc]
* 26.3.2018 includes length and time as parameters, lengt ovariable initiated here [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=MV1xrtP9i6JuN7Tn]


<rcode name="initiate" label="Initiate conc_pcddf (for developers only)">
<rcode name="initiate" label="Initiate conc_pcddf (for developers only)" embed=1>
# This is code Op_en3104/initiate on page [[EU-kalat]]
# This is code Op_en3104/initiate on page [[EU-kalat]]


library(OpasnetUtils)
library(OpasnetUtils)
lengt <- Ovariable(
  "lengt",
  dependencies=data.frame(Name="size",Ident=NA),
  formula = function(...) {
    size$Lower <- as.numeric(as.character(size$Lower))
    rep <- unique(size@output[
      colnames(size@output)!="Lower" & size@marginal
      ])
    out <- data.frame()
    name <- paste(size@name, "Result", sep="")
    for(j in 1:nrow(rep)) {
      siz <- numeric()
      tmp <- merge(rep[j,,drop=FALSE], size@output)
      tmp <- tmp[order(tmp$Lower),]
      num <- tmp[[name]]/sum(tmp[[name]])
      for(i in 1:(nrow(tmp)-1)) { # Pick at random from each bin
        siz <- c(siz,runif(
          openv$N,
          tmp$Lower[i],
          tmp$Lower[i+1]
        )[1:ceiling(num[i]*openv$N)])
      }
      out <- rbind(out, cbind(
        rep[j,,drop=FALSE],
        Iter=1:openv$N,
        Result=sample(siz, openv$N) # Take fixed amount and shuffle
      ))
    }
    return(Ovariable(output=out, marginal=!grepl("Result", colnames(out))))
  }
)


conc_pcddf <- Ovariable(
conc_pcddf <- Ovariable(
   "conc_pcddf",
   "conc_pcddf",
   dependencies = data.frame(Name = "conc.param", Ident = "Op_en3104/bayes"),
   dependencies = data.frame(
   formula = function(...) {
    Name=c("conc.param","lengt","time"),
    Ident=c("Op_en3104/bayes",NA,NA)
  ),
   formula=function(...) {
     require(MASS)
     require(MASS)
     require(reshape2)
     tmp1 <- lengt + time + conc.param + Ovariable(data=data.frame(Result="0-1")) # Ensures Iter
     jsp <- lapply(1:length(conc.param$mu[, 1]), FUN = function(x) {
     tmp2 <- unique(tmp1@output[setdiff(
      temp <- exp(mvrnorm(
      colnames(tmp1@output)[tmp1@marginal],
        openv$N,  
      c("Compound","Compound2","Metaparam","Parameter")
        conc.param$mu[x, ],
    )])
         solve(conc.param$Omega[ , ])
    tmp2$Row <- 1:nrow(tmp2)
      ))
    tmp3 <- merge(tmp2,tmp1@output)
      dimnames(temp) <- c(list(Iter = 1:openv$N), dimnames(conc.param$mu)[2])
    out <- data.frame()
      return(temp)
    con <- levels(tmp1$Compound)
    })
    for(i in 1:nrow(tmp2)) {
    names(jsp) <- dimnames(conc.param$mu)[[1]]
      tmp <- tmp3[tmp3$Row == i , ]
     jsp <- melt(jsp, value.name = "Result")
      mu <- tmp$conc.paramResult[tmp$Parameter=="mu"][match(con,tmp$Compound[tmp$Parameter=="mu"])] + # baseline
    colnames(jsp)[colnames(jsp) == "L1"] <- "Fish"
         rnorm(1,
          tmp$conc.paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="mean"][1],
          tmp$conc.paramResult[tmp$Parameter=="lenp" & tmp$Metaparam=="sd"][1]
        )* (tmp$lengtResult[1]-170) + # lengt
        rnorm(1,
          tmp$conc.paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="mean"][1],
          tmp$conc.paramResult[tmp$Parameter=="timep" & tmp$Metaparam=="sd"][1]
        )* (tmp$timeResult[1]-2009) # time
      
      Omega <- solve(tapply( # Is it sure that PCDDF and PCB are not mixed to wrong order?
        tmp$conc.paramResult[tmp$Parameter=="Omega"],
        tmp[tmp$Parameter=="Omega", c("Compound","Compound2")],
        sum # Equal to identity because only 1 row per cell.
      )) # Precision matrix
      
      
     jsp <- Ovariable(
      rnd <- exp(mvrnorm(1, mu, Omega))
       output = jsp,
      out <- rbind(out, merge(tmp2[i,], data.frame(Compound=names(rnd),Result=rnd)))
       marginal = colnames(jsp) != "Result"
    }
     out$Row <- NULL
    temp <- aggregate(
      out["Result"],
      by=out[setdiff(colnames(out), c("Result","Compound"))],
      FUN=sum
    )
    temp$Compound <- "TEQ"
    out <- Ovariable(
       output = rbind(out, temp),
       marginal = colnames(out) != "Result"
     )
     )
   
     return(out)
    d <- oapply(jsp, cols = "Compound", FUN = sum)
    d$Compound <- "TEQ"
   
     return(combine(jsp,d))
   }
   }
)
)


objects.store(conc_pcddf)
objects.store(conc_pcddf, lengt)
cat("Ovariable conc_pcddf stored.\n")
cat("Ovariables conc_pcddf, lengt stored.\n")
</rcode>
</rcode>
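The idea behind the lengt ovariable, drawing fish lengths from a binned size distribution, can be sketched in base R as follows. The bin limits and frequencies are hypothetical, and the sketch handles a single species only; it follows the same approach as the page code (a fixed number of uniform draws per bin, then shuffling) but is not a replacement for it.

```r
set.seed(1)
N <- 1000

# Hypothetical size distribution: lower limits of the bins (mm) and
# the frequency of fish in each bin. The last limit closes the final bin.
lower <- c(100, 150, 200, 250)
freq <- c(20, 50, 30)
prob <- freq / sum(freq)

# Draw a fixed share of lengths uniformly within each bin.
siz <- numeric()
for (i in seq_len(length(lower) - 1)) {
  n_i <- ceiling(prob[i] * N)
  siz <- c(siz, runif(n_i, min = lower[i], max = lower[i + 1]))
}

# Take exactly N values in random order.
lengths_mm <- sample(siz, N)

print(summary(lengths_mm))
hist(lengths_mm, breaks = lower, main = "Sampled fish lengths", xlab = "Length (mm)")
```

Fixing the number of draws per bin keeps the sampled distribution close to the tabulated frequencies even for a small N, whereas sampling the bin at random would add extra Monte Carlo noise.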



Latest revision as of 09:04, 17 March 2021


EU-kalat is a study in which concentrations of PCDD/Fs, PCBs, PBDEs, and heavy metals were measured in fish.

Question

The aim of the EU-kalat study was to measure concentrations of persistent organic pollutants (POPs), including dioxins (PCDD/F), PCBs, and BDEs, in fish from the Baltic Sea and Finnish inland lakes and rivers [1] [2] [3].

Answer

Dioxin concentrations in Baltic herring.

The original sample results can be retrieved from Opasnet Base. The study showed that the levels of PCDD/Fs and PCBs depend especially on the fish species. The highest levels were found in salmon and large herring. Levels of PCDD/Fs exceeded the maximum level of 4 pg TEQ/g fw multiple times. Levels of PCDD/Fs correlated positively with the age of the fish.

Mean congener concentrations as WHO2005-TEQ in Baltic herring can be printed out with this link or by running the code below.

+ Show code
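
As an illustration of what "concentrations as WHO2005-TEQ" means, the sketch below multiplies congener concentrations by their toxic equivalency factors (TEFs) and sums them by group. The TEF values are WHO 2005 values for a small subset of congeners; the congener labels, the concentrations and the object names are hypothetical and are not taken from the EU-kalat data or from the code on this page.

```r
# WHO 2005 TEFs for a few congeners (a subset only, for illustration).
tef <- c(
  "2378TCDD"   = 1,
  "12378PeCDD" = 1,
  "23478PeCDF" = 0.3,
  "PCB126"     = 0.1,
  "PCB169"     = 0.03
)

# Hypothetical concentrations in pg/g fresh weight (made-up numbers).
conc <- c(
  "2378TCDD"   = 0.2,
  "12378PeCDD" = 0.5,
  "23478PeCDF" = 2.0,
  "PCB126"     = 10,
  "PCB169"     = 2
)

# Group labels follow the PCDDF / PCB naming used on this page.
group <- c("PCDDF", "PCDDF", "PCDDF", "PCB", "PCB")

# TEQ per congener = concentration * TEF; then sum within each group.
teq_by_congener <- conc[names(tef)] * tef
teq_group <- tapply(teq_by_congener, group, sum)

teq_pcddf <- unname(teq_group["PCDDF"])
teq_pcb   <- unname(teq_group["PCB"])

# Compare the PCDD/F TEQ with the 4 pg TEQ/g fw maximum level mentioned above.
exceeds_limit <- teq_pcddf > 4

print(teq_group)
print(exceeds_limit)
```

With real data the same multiplication would be applied to every measured congener, not only to the five shown here.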

Rationale

Data

Data were collected in 2009-2010. The data contain the year, tissue type, fish species, and fat content for each concentration measurement. The number of observations is 285.

There is a new study, EU-kalat 3, which will produce results in 2016.

Calculations

Preprocess

  • Preprocess model 22.2.2017 [4]
  • Model run 25.1.2017 [5]
  • Model run 22.5.2017 with new ovariables euRaw, euAll, euMain, and euRatio [6]
  • Model run 23.5.2017 with adjusted ovariables euRaw, eu, euRatio [7]
  • Model run 11.10.2017: Small herring and Large herring added as new species [8]
  • Model rerun 15.11.2017 because the previous stored run was lost in update [9]
  • Model run 21.3.2018: Small and large herring replaced by actual fish length [10]
  • Model run 26.3.2018 eu2 moved here [11]

See an updated version of the preprocess code for eu at Health effects of Baltic herring and salmon: a benefit-risk assessment#Code for estimating TEQ from chinese PCB7

+ Show code

Bayes model for dioxin concentrations

  • Model run 28.2.2017 [12]
  • Model run 28.2.2017 with corrected survey model [13]
  • Model run 28.2.2017 with Mu estimates [14]
  • Model run 1.3.2017 [15]
  • Model run 23.4.2017 [16] produces list conc.param and ovariable concentration
  • Model run 24.4.2017 [17]
  • Model run 19.5.2017 without ovariable concentration [18]
      Attack: The model does not mix well, so the results should not be used for final results. --Jouni (talk) 19:37, 19 May 2017 (UTC)
      Comment: Maybe we should just estimate TEQs until the problem is fixed. --Jouni (talk) 19:37, 19 May 2017 (UTC)
  • Model run 22.5.2017 with TEQdx and TEQpcb as the only Compounds [19]
  • Model run 23.5.2017 debugged [20] [21] [22]
  • Model run 24.5.2017 TEQdx, TEQpcb -> PCDDF, PCB [23]
  • Model run 11.10.2017 with small and large herring [24] (removed in update)
  • Model run 12.3.2018: bugs fixed with data used in Bayes. In addition, redundant fish species removed and Omega assumed to be the same for herring and salmon. [25]
  • Model run 22.3.2018 [26] The model does not mix well; thinning appears to help little.
  • Model run 25.3.2018 with conc.param as ovariable [27]

+ Show code
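
Several of the runs listed above note that the model does not mix well. One generic way to quantify mixing is the Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance. The sketch below implements the basic version of that statistic in base R; the function name `rhat` and the simulated chains are assumptions made for illustration, and this is not the diagnostic used in the model code on this page.

```r
# Basic Gelman-Rubin statistic for one parameter.
# chains: numeric matrix, one row per iteration and one column per chain.
rhat <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))    # mean within-chain variance
  B <- n * var(colMeans(chains))      # between-chain variance
  var_hat <- (n - 1) / n * W + B / n  # pooled estimate of posterior variance
  sqrt(var_hat / W)
}

set.seed(1)

# Four simulated chains that sample the same distribution (good mixing).
good <- matrix(rnorm(4000), ncol = 4)

# The same chains, but two of them are stuck around a different mean.
bad <- good + rep(c(0, 0, 3, 3), each = 1000)

rhat_good <- rhat(good)
rhat_bad  <- rhat(bad)

print(c(good = rhat_good, bad = rhat_bad))
```

Values close to 1 suggest that the chains agree; values clearly above 1 (a common rule of thumb is 1.1) suggest that the chains have not converged to the same distribution.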

Initiate conc_pcddf for PFAS disease burden study

Initiate euw data.frame

This code is an improved version of the preprocess code and also includes PFAS concentrations from op_fi:PFAS-yhdisteiden tautitaakka. It produces the data.frame euw, which contains the EU-kalat and PFAS data in wide format. For PFAS, but not for EU-kalat, measurements below the level of quantification are replaced by a sampled value.

+ Show code
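
The two steps described above (sampling a value for measurements below the level of quantification, then converting to wide format) can be sketched as follows. This is a minimal example under stated assumptions: the toy data, the column names, the encoding of below-LOQ results as NA, and the choice of a uniform distribution between 0 and the LOQ are all hypothetical and may differ from what the actual euw code does.

```r
set.seed(1)

# Hypothetical long-format measurements; NA marks a result below the LOQ.
long <- data.frame(
  Sample   = c("s1", "s1", "s2", "s2"),
  Compound = c("PFOS", "PFOA", "PFOS", "PFOA"),
  Result   = c(1.2, NA, 0.8, 0.3),
  LOQ      = c(0.1, 0.2, 0.1, 0.2),
  stringsAsFactors = FALSE
)

# Replace each below-LOQ result with a draw from Uniform(0, LOQ).
# The uniform distribution is an assumption made for this sketch.
below <- is.na(long$Result)
long$Result[below] <- runif(sum(below), min = 0, max = long$LOQ[below])

# Long to wide: one row per sample, one column per compound.
euw_sketch <- reshape(
  long[c("Sample", "Compound", "Result")],
  idvar     = "Sample",
  timevar   = "Compound",
  direction = "wide"
)

print(euw_sketch)
```

The resulting data.frame has the columns Sample, Result.PFOS and Result.PFOA.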

Initiate conc_param using Bayesian approach

Bayesian approach for PCDDF, PCB, OT, PFAS.

  • Model run 2021-03-08 [28]
  • Model run 2021-03-08 [29] with the fish needed in PFAS assessment
  • Model run 2021-03-12 [30] using euw
  • Model run 2021-03-13 [31] with location parameter for PFAS
  • Model run 2021-03-17 [32] location parameter not plotted because of problems with the older R version in Opasnet.

+ Show code

Initiate conc_poll

+ Show code

Initiate conc_pcddf for Goherr

  • Model run 19.5.2017 [33]
  • Model run 23.5.2017 with bugs fixed [34]
  • Model run 12.10.2017: TEQ calculation added [35]
  • Model rerun 15.11.2017 because the previous stored run was lost in update [36]
  • Model run 12.3.2018: adjusted to use the same Omega for all fish species [37]
  • Model run 26.3.2018: includes length and time as parameters; the lengt ovariable is initiated here [38]

+ Show code
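
The idea of using length and time as parameters can be sketched as a linear predictor on the log scale, followed by a multivariate normal draw and a sum over compounds to get the TEQ. The example below is a simplified illustration, not the code behind the link above: all parameter values, the reference length of 170 and the names `intercept`, `lenp`, `predict_mu` and `Sigma` are assumptions. Only the reference year 2009, the name `timep`, the use of `mvrnorm()` and the compound names PCDDF and PCB follow the code shown in this revision.

```r
library(MASS)  # provides mvrnorm()

set.seed(1)
compounds <- c("PCDDF", "PCB")

# Hypothetical parameter values on the log scale.
intercept <- c(PCDDF = 1.0,   PCB = 0.8)
lenp      <- c(PCDDF = 0.01,  PCB = 0.01)   # effect of fish length
timep     <- c(PCDDF = -0.05, PCB = -0.03)  # effect of catch year

# Hypothetical precision matrix; invert it to get the covariance matrix.
Omega <- matrix(c(4, -1, -1, 4), nrow = 2,
                dimnames = list(compounds, compounds))
Sigma <- solve(Omega)

# Expected log concentration for a fish of given length and catch year.
# The reference length 170 is an assumed centring value.
predict_mu <- function(len, year) {
  intercept + lenp * (len - 170) + timep * (year - 2009)
}

mu  <- predict_mu(len = 200, year = 2015)
rnd <- exp(mvrnorm(1, mu, Sigma))  # one draw of both compounds
teq <- sum(rnd)                    # TEQ = PCDDF + PCB

print(rnd)
print(teq)
```

Drawing both compounds from one multivariate distribution keeps the correlation between PCDDF and PCB that independent draws would lose.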

Attack: These codes should be coherent with POPs in Baltic herring. --Jouni (talk) 12:14, 7 June 2017 (UTC)

See also

References

  1. A. Hallikainen, H. Kiviranta, P. Isosaari, T. Vartiainen, R. Parmanne, P.J. Vuorinen: Kotimaisen järvi- ja merikalan dioksiinien, furaanien, dioksiinien kaltaisten PCB-yhdisteiden ja polybromattujen difenyylieettereiden pitoisuudet (Concentrations of dioxins, furans, dioxin-like PCBs and polybrominated diphenyl ethers in domestic lake and sea fish). Elintarvikeviraston julkaisuja 1/2004. [1]
  2. E.-R. Venäläinen, A. Hallikainen, R. Parmanne, P.J. Vuorinen: Kotimaisen järvi- ja merikalan raskasmetallipitoisuudet (Heavy metal concentrations in domestic lake and sea fish). Elintarvikeviraston julkaisuja 3/2004. [2]
  3. A. Hallikainen, R. Airaksinen, P. Rantakokko, J. Koponen, J. Mannio, P.J. Vuorinen, T. Jääskeläinen, H. Kiviranta: Itämeren kalan ja muun kotimaisen kalan ympäristömyrkyt: PCDD/F-, PCB-, PBDE-, PFC- ja OT-yhdisteet (Environmental toxins in Baltic fish and other domestic fish: PCDD/F, PCB, PBDE, PFC and OT compounds). Eviran tutkimuksia 2/2011. ISSN 1797-2981, ISBN 978-952-225-083-4. [3]