3.3. Part 3: GLMM Analysis¶
3.3.1. 1. Creating the model predictors using GridFix¶
In this part of the tutorial, we will use the grid and image features defined in the previous chapters to create a GLMM predictor matrix and output some model code for R. First, let’s reproduce the analysis from previous chapters and add some features that might influence each grid cell’s fixation probability:
In [15]:
%matplotlib inline
import numpy as np
import matplotlib as mp
import matplotlib.pyplot as plt
from gridfix import *
# Load images and define 8x6 grid from part 1
images = ImageSet('images/tutorial_images.csv', label='tutorial')
grid = GridRegionSet(size=images.size, gridsize=(8,6), label='testgrid')
# Define some simple features from part 2
fLum = LuminanceFeature(grid, images)
fEdge = SobelEdgeFeature(grid, images)
fCent = CentralBiasFeature(grid, images, measure='gaussian', sig2=0.23, nu=0.45)
# Load IKN98 feature maps and define a MapFeature
ids = ['112', '67', '6', '52', '37', '106', '129', '9', '107', '97', '58', '111', '85', '149', '150']
maps = ImageSet('maps', imageids=ids, mat_var='IKN98')
fIKN = MapFeature(grid, maps, stat=np.mean)
# Load fixation data
fix = Fixations('tutorial_fixations.csv', imageid='image_id', fixid='CURRENT_FIX_INDEX',
x='CURRENT_FIX_X', y='CURRENT_FIX_Y', imageset=images)
Now the actual GLMM preprocessing can be performed using a FixationModel object which combines fixation data, RegionSet and all features specified in the features= argument into a common predictor matrix, which can then be loaded into R. With a big dataset, this could take a while, but for this tutorial, updating the model should be a matter of a few seconds. Note that the chunks= argument specifies the names of columns over which data should not be aggregated, e.g. individual subjects, while the features= argument contains a Python list of the actual Feature objects defined above.
In our example, this predictor matrix contains one line for each subject, image and grid cell, which will yield 8 x 15 x 48 = 5760 individual data points to be entered into GLMM. We can print the resulting model object to verify this:
In [45]:
model = FixationModel(fix, grid, chunks=['subject_number', 'image_id'], features=[fLum, fCent], dv_type='fixated')
print(model)
<gridfix.FixationModel, 5760 samples, DV=fixated, chunked by: ['subject_number', 'image_id']>
Fixations:
<gridfix.Fixations data set, 2556 samples, 15 images>
Images:
<gridfix.ImageSet "tutorial", 15 images, size=(800, 600), normalized>
Regions:
<gridfix.GridRegionSet (testgrid), size=(800, 600), 8x6 grid, 48 cells, memory=22500.0 kB>
Features:
fCentr CentralBiasFeature
fLumin LuminanceFeature
The resulting predictor matrix now contains one row per combination of image and region (and possible other variables used for chunking in the model specification, such as the subject id). Within each row, the column dvFix indicates the fixation state of each cell, i.e., whether it was fixated (1) or not (0), while the remaining columns contain the feature values for the corresponding regions - here, mean cell luminance (fLumin) and distance from image center following an anisotropic Gaussian CB model (fCentr).
The predictor matrix can be accessed as a DataFrame using the predictors attribute of the generated model object:
In [46]:
print(model.predictors)
subject_number image_id region dvFix fCentr fLumin
0 201.0 106 1.0 0.0 0.971265 0.484337
1 201.0 106 2.0 0.0 0.935023 0.484956
2 201.0 106 3.0 0.0 0.888111 0.567180
3 201.0 106 4.0 0.0 0.853275 0.657081
4 201.0 106 5.0 0.0 0.853474 0.792208
5 201.0 106 6.0 0.0 0.888567 0.800865
6 201.0 106 7.0 0.0 0.935464 0.779572
7 201.0 106 8.0 0.0 0.971537 0.681514
8 201.0 106 9.0 0.0 0.903758 0.489519
9 201.0 106 10.0 1.0 0.782377 0.395110
10 201.0 106 11.0 0.0 0.625257 0.555128
11 201.0 106 12.0 0.0 0.508580 0.538794
12 201.0 106 13.0 0.0 0.509249 0.882058
13 201.0 106 14.0 0.0 0.626785 0.983519
14 201.0 106 15.0 0.0 0.783854 0.958491
15 201.0 106 16.0 0.0 0.904671 0.880573
16 201.0 106 17.0 1.0 0.824134 0.293749
17 201.0 106 18.0 0.0 0.602332 0.349848
18 201.0 106 19.0 1.0 0.315222 0.542880
19 201.0 106 20.0 1.0 0.102016 0.561462
20 201.0 106 21.0 0.0 0.103238 0.574169
21 201.0 106 22.0 1.0 0.318015 0.756059
22 201.0 106 23.0 1.0 0.605031 0.922472
23 201.0 106 24.0 0.0 0.825803 0.944817
24 201.0 106 25.0 1.0 0.824666 0.224936
25 201.0 106 26.0 0.0 0.603535 0.431133
26 201.0 106 27.0 1.0 0.317293 0.583595
27 201.0 106 28.0 1.0 0.104732 0.436877
28 201.0 106 29.0 1.0 0.105951 0.428582
29 201.0 106 30.0 0.0 0.320077 0.625334
.. ... ... ... ... ... ...
18 209.0 97 19.0 1.0 0.315222 0.507149
19 209.0 97 20.0 1.0 0.102016 0.517995
20 209.0 97 21.0 1.0 0.103238 0.446762
21 209.0 97 22.0 1.0 0.318015 0.482895
22 209.0 97 23.0 1.0 0.605031 0.414101
23 209.0 97 24.0 0.0 0.825803 0.408590
24 209.0 97 25.0 0.0 0.824666 0.397340
25 209.0 97 26.0 1.0 0.603535 0.358494
26 209.0 97 27.0 0.0 0.317293 0.398970
27 209.0 97 28.0 0.0 0.104732 0.451527
28 209.0 97 29.0 0.0 0.105951 0.602446
29 209.0 97 30.0 0.0 0.320077 0.470265
30 209.0 97 31.0 0.0 0.606226 0.427404
31 209.0 97 32.0 1.0 0.826330 0.343827
32 209.0 97 33.0 0.0 0.904629 0.360975
33 209.0 97 34.0 0.0 0.784346 0.354982
34 209.0 97 35.0 0.0 0.628647 0.551488
35 209.0 97 36.0 0.0 0.513026 0.635785
36 209.0 97 37.0 0.0 0.513689 0.661605
37 209.0 97 38.0 0.0 0.630161 0.611297
38 209.0 97 39.0 0.0 0.785810 0.546509
39 209.0 97 40.0 1.0 0.905534 0.480870
40 209.0 97 41.0 0.0 0.971697 0.359889
41 209.0 97 42.0 0.0 0.936000 0.428950
42 209.0 97 43.0 0.0 0.889793 0.546199
43 209.0 97 44.0 0.0 0.855480 0.628933
44 209.0 97 45.0 0.0 0.855677 0.662128
45 209.0 97 46.0 0.0 0.890243 0.647953
46 209.0 97 47.0 0.0 0.936435 0.627720
47 209.0 97 48.0 0.0 0.971965 0.506024
[5760 rows x 6 columns]
To facilitate analysis of the generated predictor matrix in R, GridFix also generates R source code that contains the necessary commands to import the generated data file, define factors, standardize predictors and set up a model formula. This can serve as a starting point for analysis but should be adapted to the individual factor structure of each hypothesis and dataset - as a reminder, the actual call to glmer is commented out.
The model.save() function saves both the predictor matrix and the R code to files to be read into R. Note that the file name argument should be the base name of generated files, e.g., model.save(“tutorial”) will generate tutorial.csv and tutorial.R.
In [49]:
# Print the source code here as an example
print(model.r_source())
model.save('tutorial')
# GridFix GLMM R source, generated on 04.05.17, 14:45:09
# input file: gridfix.csv
# RegionSet: <gridfix.GridRegionSet (testgrid), size=(800, 600), 8x6 grid, 48 cells, memory=22500.0 kB>
# DV type: fixated
library(lme4)
data <- read.table("gridfix.csv", header=T, sep="\t", row.names=NULL)
# Define R factors for all chunking variables and group dummy codes
data$subject_number <- as.factor(data$subject_number)
data$image_id <- as.factor(data$image_id)
# Center and scale predictors
data$fCentr_C <- scale(data$fCentr, center=TRUE, scale=TRUE)
data$fLumin_C <- scale(data$fLumin, center=TRUE, scale=TRUE)
# NOTE: this source code can only serve as a scaffolding for your own analysis!
# You MUST adapt the GLMM model formula below to your model, then uncomment the corresponding line!
#model <- glmer(dvFix ~ 1 + fCentr_C + fLumin_C + (1 | image_id), control=glmerControl(optimizer="bobyqa"), data=data, family=binomial)
save(file="gridfix_GLMM.Rdata", list = c("model"))
print(summary(model))
3.3.2. 2. Evaluating the model using R¶
Note: The following cells will only work interactively if you have a working R environment and the rpy2 Python module installed. The next cell will set up the %%R magic code to run R code within this notebook
In [50]:
# Initialize R environment for Jupyter notebook using rpy2
%load_ext rpy2.ipython
The rpy2.ipython extension is already loaded. To reload it, use:
%reload_ext rpy2.ipython
The Python code cell above saved the tutorial model (central bias and luminance) to “tutorial.csv”, which we will now load into R. The following code example fits a model containing central bias and mean cell luminance as fixed factors and includes random intercepts (but not slopes) for individual images. The resulting R model object can then be manipulated using R commands as usual and/or saved to an Rdata file.
In [51]:
%%R
# GridFix GLMM R source, generated on 04.05.17, 14:22:46
# input file: gridfix.csv
# RegionSet: <gridfix.GridRegionSet (testgrid), size=(800, 600), 8x6 grid, 48 cells, memory=22500.0 kB>
# DV type: fixated
library(lme4)
data <- read.table("tutorial.csv", header=T, sep="\t", row.names=NULL)
# Define R factors for all chunking variables and group dummy codes
data$subject_number <- as.factor(data$subject_number)
data$image_id <- as.factor(data$image_id)
# Center and scale predictors
data$fCentr_C <- scale(data$fCentr, center=TRUE, scale=TRUE)
data$fLumin_C <- scale(data$fLumin, center=TRUE, scale=TRUE)
# NOTE: this source code can only serve as a scaffolding for your own analysis!
# You MUST adapt the GLMM model formula below to your model, then uncomment the corresponding line!
model <- glmer(dvFix ~ 1 + fCentr_C + fLumin_C + (1 | image_id), control=glmerControl(optimizer="bobyqa"), data=data, family=binomial)
print(summary(model))
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: dvFix ~ 1 + fCentr_C + fLumin_C + (1 | image_id)
Data: data
Control: glmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
6318.4 6345.0 -3155.2 6310.4 5756
Scaled residuals:
Min 1Q Median 3Q Max
-1.3908 -0.5507 -0.4629 0.7890 2.5595
Random effects:
Groups Name Variance Std.Dev.
image_id (Intercept) 0.007605 0.0872
Number of obs: 5760, groups: image_id, 15
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.01691 0.03874 -26.250 <2e-16 ***
fCentr_C -0.65888 0.02978 -22.126 <2e-16 ***
fLumin_C -0.07915 0.03374 -2.346 0.019 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) fCnt_C
fCentr_C 0.157
fLumin_C 0.027 -0.090
3.3.3. 4. Concluding Remarks¶
This concludes the GridFix tutorial - we hope that it can prove helpful in setting up a preprocessing script and GLMM-based fixation analysis. For more details of supported image features and other attributes of the GridFix toolbox, you can use the navigation bar on the left to browse the module documentation or look at some example scripts for common analyses. Thank you for your interest in this method!