Skip to content

NA in bhat at initialization - wrong dimension due to additional intercept? #16

@palmjulia

Description

@palmjulia

I'm trying to run a simple distributed poisson GLMM using the DPQL model. However, at initialization my bhat vector always contains NA and if I'm understanding the output correctly it is also too long. I tried to look a bit into the pda() source code and it seems that we end up with 2 Intercept terms somehow, resulting in the NA estimate due to collinearity.

I'm not sure if this is a bug or if I specified the model incorrectly.

Here's the minimal working example:

# ############################  STEP 0: Generate testdata  ###############################
set.seed(42)
n <- 3 * 6 * 20  # 3 sites, 6 wards, 20 obs each

d <- data.frame(
  site    = rep(paste0("site", 1:3), each = 6 * 20),
  ward    = rep(1:18, each = 20),
  treatment   = rbinom(n, 1, 0.5)
)

# Random intercepts per ward
ward_re <- rnorm(18, 0, 0.5)

lambda <- exp(0.3 +  0.4 * d$treatment + ward_re[d$ward])
d$y <- rpois(n, lambda)

ipdata_site1 <- d[d$site=="site1",]
ipdata_site2 <- d[d$site=="site2",]
ipdata_site3 <- d[d$site=="site3",]

# ############################  STEP 1: initialize  ###############################
#pdaCatalog()

control <- list(project_name = 'glmm possion',
                step = 'initialize',
                sites = unique(d$site),
                heterogeneity = TRUE,
                variables_heterogeneity = "Intercept",
                model = 'DPQL',
                family = 'poisson',
                outcome = "y",
                variables = c('treatment'),
                optim_maxit = 100,
                lead_site = 'site1',
                init_method = 'lead',
                optim_method = "BFGS",
                upload_date = as.character(Sys.time()),
                maxround = 3)

#write control file
pda(site_id = 'site1', control = control, dir = getwd(),upload_without_confirm = T)

#initialize
pda(site_id = 'site3', ipdata = ipdata_site3, dir=getwd())

#check
config <- getCloudConfig(site_id = 'site3', dir=getwd())
site3_initial <- pdaGet(name = 'site3_initialize', config = config)

#this should have length 2 right? 
site3_initial$bhat_i

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions