3.3. Part 3: GLMM Analysis

3.3.1. 1. Creating the model predictors using GridFix

In this part of the tutorial, we will use the grid and image features defined in the previous parts to create a GLMM predictor matrix and generate model code for R. First, let's reproduce the earlier analysis and add some features that might influence each grid cell's fixation probability:

In [15]:
%matplotlib inline

import numpy as np
import matplotlib as mp
import matplotlib.pyplot as plt

from gridfix import *

# Load images and define 8x6 grid from part 1
images = ImageSet('images/tutorial_images.csv', label='tutorial')
grid = GridRegionSet(size=images.size, gridsize=(8,6), label='testgrid')

# Define some simple features from part 2
fLum = LuminanceFeature(grid, images)
fEdge = SobelEdgeFeature(grid, images)
fCent = CentralBiasFeature(grid, images, measure='gaussian', sig2=0.23, nu=0.45)

# Load IKN98 feature maps and define a MapFeature
ids = ['112', '67', '6', '52', '37', '106', '129', '9', '107', '97', '58', '111', '85', '149', '150']
maps = ImageSet('maps', imageids=ids, mat_var='IKN98')
fIKN = MapFeature(grid, maps, stat=np.mean)

# Load fixation data
fix = Fixations('tutorial_fixations.csv', imageid='image_id', fixid='CURRENT_FIX_INDEX',
                x='CURRENT_FIX_X', y='CURRENT_FIX_Y', imageset=images)

Now the actual GLMM preprocessing can be performed using a FixationModel object, which combines the fixation data, the RegionSet and all features specified in the features= argument into a common predictor matrix that can then be loaded into R. With a large dataset this could take a while, but for this tutorial, creating the model should only take a few seconds. Note that the chunks= argument specifies the names of columns over which data should not be aggregated (e.g., individual subjects), while the features= argument takes a Python list of the Feature objects defined above.

In our example, the predictor matrix contains one row for each combination of subject, image and grid cell, yielding 8 x 15 x 48 = 5760 individual data points to be entered into the GLMM. We can print the resulting model object to verify this:

In [45]:
model = FixationModel(fix, grid, chunks=['subject_number', 'image_id'], features=[fLum, fCent], dv_type='fixated')
print(model)
<gridfix.FixationModel, 5760 samples, DV=fixated, chunked by: ['subject_number', 'image_id']>
Fixations:
        <gridfix.Fixations data set, 2556 samples, 15 images>
Images:
        <gridfix.ImageSet "tutorial", 15 images, size=(800, 600), normalized>
Regions:
        <gridfix.GridRegionSet (testgrid), size=(800, 600), 8x6 grid, 48 cells, memory=22500.0 kB>

Features:
        fCentr  CentralBiasFeature
        fLumin  LuminanceFeature

The resulting predictor matrix contains one row per combination of image and region (and possibly other variables used for chunking in the model specification, such as the subject id). Within each row, the column dvFix indicates the fixation state of the corresponding cell, i.e., whether it was fixated (1) or not (0), while the remaining columns contain the feature values for that region: here, mean cell luminance (fLumin) and distance from the image center following an anisotropic Gaussian central bias (CB) model (fCentr).
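Conceptually, a cell is coded as fixated (dvFix = 1) if at least one fixation made by the given subject on the given image falls within its boundaries. The following minimal sketch illustrates this idea outside of GridFix (it is not part of the GridFix API; the coordinate lists and cell bounds are hypothetical):

# Illustration only (not part of the GridFix API): a cell counts as
# fixated if at least one fixation lands inside its bounding box
def cell_fixated(fix_x, fix_y, x0, y0, x1, y1):
    return int(any(x0 <= x < x1 and y0 <= y < y1
                   for x, y in zip(fix_x, fix_y)))

# Example: two fixations, cell spanning x = 100..200, y = 75..150
print(cell_fixated([120, 640], [90, 300], 100, 75, 200, 150))   # -> 1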

The predictor matrix can be accessed as a DataFrame using the predictors attribute of the generated model object:

In [46]:
print(model.predictors)
    subject_number image_id  region  dvFix    fCentr    fLumin
0            201.0      106     1.0    0.0  0.971265  0.484337
1            201.0      106     2.0    0.0  0.935023  0.484956
2            201.0      106     3.0    0.0  0.888111  0.567180
3            201.0      106     4.0    0.0  0.853275  0.657081
4            201.0      106     5.0    0.0  0.853474  0.792208
5            201.0      106     6.0    0.0  0.888567  0.800865
6            201.0      106     7.0    0.0  0.935464  0.779572
7            201.0      106     8.0    0.0  0.971537  0.681514
8            201.0      106     9.0    0.0  0.903758  0.489519
9            201.0      106    10.0    1.0  0.782377  0.395110
10           201.0      106    11.0    0.0  0.625257  0.555128
11           201.0      106    12.0    0.0  0.508580  0.538794
12           201.0      106    13.0    0.0  0.509249  0.882058
13           201.0      106    14.0    0.0  0.626785  0.983519
14           201.0      106    15.0    0.0  0.783854  0.958491
15           201.0      106    16.0    0.0  0.904671  0.880573
16           201.0      106    17.0    1.0  0.824134  0.293749
17           201.0      106    18.0    0.0  0.602332  0.349848
18           201.0      106    19.0    1.0  0.315222  0.542880
19           201.0      106    20.0    1.0  0.102016  0.561462
20           201.0      106    21.0    0.0  0.103238  0.574169
21           201.0      106    22.0    1.0  0.318015  0.756059
22           201.0      106    23.0    1.0  0.605031  0.922472
23           201.0      106    24.0    0.0  0.825803  0.944817
24           201.0      106    25.0    1.0  0.824666  0.224936
25           201.0      106    26.0    0.0  0.603535  0.431133
26           201.0      106    27.0    1.0  0.317293  0.583595
27           201.0      106    28.0    1.0  0.104732  0.436877
28           201.0      106    29.0    1.0  0.105951  0.428582
29           201.0      106    30.0    0.0  0.320077  0.625334
..             ...      ...     ...    ...       ...       ...
18           209.0       97    19.0    1.0  0.315222  0.507149
19           209.0       97    20.0    1.0  0.102016  0.517995
20           209.0       97    21.0    1.0  0.103238  0.446762
21           209.0       97    22.0    1.0  0.318015  0.482895
22           209.0       97    23.0    1.0  0.605031  0.414101
23           209.0       97    24.0    0.0  0.825803  0.408590
24           209.0       97    25.0    0.0  0.824666  0.397340
25           209.0       97    26.0    1.0  0.603535  0.358494
26           209.0       97    27.0    0.0  0.317293  0.398970
27           209.0       97    28.0    0.0  0.104732  0.451527
28           209.0       97    29.0    0.0  0.105951  0.602446
29           209.0       97    30.0    0.0  0.320077  0.470265
30           209.0       97    31.0    0.0  0.606226  0.427404
31           209.0       97    32.0    1.0  0.826330  0.343827
32           209.0       97    33.0    0.0  0.904629  0.360975
33           209.0       97    34.0    0.0  0.784346  0.354982
34           209.0       97    35.0    0.0  0.628647  0.551488
35           209.0       97    36.0    0.0  0.513026  0.635785
36           209.0       97    37.0    0.0  0.513689  0.661605
37           209.0       97    38.0    0.0  0.630161  0.611297
38           209.0       97    39.0    0.0  0.785810  0.546509
39           209.0       97    40.0    1.0  0.905534  0.480870
40           209.0       97    41.0    0.0  0.971697  0.359889
41           209.0       97    42.0    0.0  0.936000  0.428950
42           209.0       97    43.0    0.0  0.889793  0.546199
43           209.0       97    44.0    0.0  0.855480  0.628933
44           209.0       97    45.0    0.0  0.855677  0.662128
45           209.0       97    46.0    0.0  0.890243  0.647953
46           209.0       97    47.0    0.0  0.936435  0.627720
47           209.0       97    48.0    0.0  0.971965  0.506024

[5760 rows x 6 columns]
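As a quick sanity check (this step is not part of the original tutorial), the predictor matrix can be inspected with standard pandas operations, since model.predictors is a regular DataFrame, e.g., to confirm the expected number of rows and to look at fixation probabilities per image:

# model.predictors is a pandas DataFrame, so standard pandas operations apply
pred = model.predictors

print(pred.shape)                                 # expected: (5760, 6)
print(pred['dvFix'].mean())                       # overall proportion of fixated cells
print(pred.groupby('image_id')['dvFix'].mean())   # fixation probability per image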

To facilitate analysis of the generated predictor matrix in R, GridFix also generates R source code containing the necessary commands to import the generated data file, define factors, standardize predictors and set up a model formula. This can serve as a starting point for analysis but should be adapted to the individual factor structure of each hypothesis and dataset; as a reminder, the actual call to glmer() is commented out.

The model.save() function saves both the predictor matrix and the R code to files that can then be read into R. Note that the file name argument specifies the base name of the generated files, e.g., model.save('tutorial') will generate tutorial.csv and tutorial.R.

In [49]:
# Print the source code here as an example
print(model.r_source())

model.save('tutorial')
# GridFix GLMM R source, generated on 04.05.17, 14:45:09
# input file:   gridfix.csv
# RegionSet:    <gridfix.GridRegionSet (testgrid), size=(800, 600), 8x6 grid, 48 cells, memory=22500.0 kB>
# DV type:      fixated

library(lme4)

data  <- read.table("gridfix.csv", header=T, sep="\t", row.names=NULL)

# Define R factors for all chunking variables and group dummy codes
data$subject_number <- as.factor(data$subject_number)
data$image_id <- as.factor(data$image_id)

# Center and scale predictors
data$fCentr_C <- scale(data$fCentr, center=TRUE, scale=TRUE)
data$fLumin_C <- scale(data$fLumin, center=TRUE, scale=TRUE)

# NOTE: this source code can only serve as a scaffolding for your own analysis!
# You MUST adapt the GLMM model formula below to your model, then uncomment the corresponding line!
#model <- glmer(dvFix ~ 1 + fCentr_C  + fLumin_C  + (1 | image_id), control=glmerControl(optimizer="bobyqa"), data=data, family=binomial)

save(file="gridfix_GLMM.Rdata", list = c("model"))

print(summary(model))
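Before switching to R, the exported data file can also be checked from Python; a minimal sketch using pandas, assuming the file is tab-separated as implied by the sep="\t" argument in the generated read.table() call:

import pandas as pd

# Read the exported predictor matrix back in to verify its contents
# (assumed to be tab-separated, matching the generated R import code)
check = pd.read_csv('tutorial.csv', sep='\t')
print(check.columns.tolist())
print(len(check))   # should again be 5760 rows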

3.3.2. 2. Evaluating the model using R

Note: The following cells will only work interactively if you have a working R environment and the rpy2 Python module installed. The next cell sets up the %%R cell magic, which allows R code to be run within this notebook.

In [50]:
# Initialize R environment for Jupyter notebook using rpy2
%load_ext rpy2.ipython
The rpy2.ipython extension is already loaded. To reload it, use:
  %reload_ext rpy2.ipython

The Python code cell above saved the tutorial model (central bias and luminance) to "tutorial.csv", which we will now load into R. The following code example fits a model with central bias and mean cell luminance as fixed effects and includes random intercepts (but not slopes) for individual images. The resulting R model object can then be manipulated using the usual R commands and/or saved to an Rdata file.

In [51]:
%%R
# GridFix GLMM R source, generated on 04.05.17, 14:22:46
# input file:       gridfix.csv
# RegionSet:        <gridfix.GridRegionSet (testgrid), size=(800, 600), 8x6 grid, 48 cells, memory=22500.0 kB>
# DV type:  fixated

library(lme4)

data  <- read.table("tutorial.csv", header=T, sep="\t", row.names=NULL)

# Define R factors for all chunking variables and group dummy codes
data$subject_number <- as.factor(data$subject_number)
data$image_id <- as.factor(data$image_id)

# Center and scale predictors
data$fCentr_C <- scale(data$fCentr, center=TRUE, scale=TRUE)
data$fLumin_C <- scale(data$fLumin, center=TRUE, scale=TRUE)

# NOTE: this source code can only serve as a scaffolding for your own analysis!
# You MUST adapt the GLMM model formula below to your model, then uncomment the corresponding line!
model <- glmer(dvFix ~ 1 + fCentr_C  + fLumin_C  + (1 | image_id), control=glmerControl(optimizer="bobyqa"), data=data, family=binomial)

print(summary(model))
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: dvFix ~ 1 + fCentr_C + fLumin_C + (1 | image_id)
   Data: data
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid
  6318.4   6345.0  -3155.2   6310.4     5756

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.3908 -0.5507 -0.4629  0.7890  2.5595

Random effects:
 Groups   Name        Variance Std.Dev.
 image_id (Intercept) 0.007605 0.0872
Number of obs: 5760, groups:  image_id, 15

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.01691    0.03874 -26.250   <2e-16 ***
fCentr_C    -0.65888    0.02978 -22.126   <2e-16 ***
fLumin_C    -0.07915    0.03374  -2.346    0.019 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
         (Intr) fCnt_C
fCentr_C  0.157
fLumin_C  0.027 -0.090
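The fixed-effect estimates reported above are on the logit (log-odds) scale. As an optional interpretation step (not part of the generated R code), they can be converted to odds ratios, for example directly in Python using the estimates printed above:

import numpy as np

# Fixed-effect estimates from the summary above (log-odds scale)
intercept = -1.01691
betas = {'fCentr_C': -0.65888, 'fLumin_C': -0.07915}

# Baseline fixation probability for a cell with average predictor values
print('baseline p(fixated):', 1 / (1 + np.exp(-intercept)))   # ~0.27

# exp(beta) = multiplicative change in fixation odds per 1 SD increase
# of the centered and scaled predictor
for name, b in betas.items():
    print(name, 'odds ratio:', np.exp(b))

Since fCentr encodes distance from the image center, the negative fCentr_C estimate indicates that cells further from the center are less likely to be fixated, in line with the central fixation bias.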

3.3.3. 3. Concluding Remarks

This concludes the GridFix tutorial. We hope it proves helpful in setting up your own preprocessing script and GLMM-based fixation analysis. For more details on supported image features and other components of the GridFix toolbox, you can use the navigation bar on the left to browse the module documentation or look at some example scripts for common analyses. Thank you for your interest in this method!