Here we are going to have a look at the raw image data using the nibabel package.
Windows installers for nibabel are here: https://pypi.python.org/pypi/nibabel
First we need to set up inline plotting
%pylab inline
This command did lots of imports for us, but to be explicit, I'm going to import some useful namespaces again
import os # Module with commands to interact with the operating system
import numpy as np # this is the Python package for handling arrays
import matplotlib.pyplot as plt # This is for making 2D plots like MATLAB
Next we are going to load an image into memory and do some math on it.
We first load the anatomical image for the first subject
anat_fname = 'ds107_sub001_highres.nii.gz'
anat_fname
What is an image? We have already seen that it is an array, with some metadata. Nibabel does the work of reading the metadata into a nice format, and getting the image data as an array.
We first import the nibabel package to allow us to load the image.
import nibabel as nib
Remember you can check what's in the package by doing tab completion. Try doing tab complete on "nib."
anat_img = nib.load(anat_fname)
anat_img
Try tab completing on anat_img. (remember the dot at the end).
An image has a shape
anat_img.shape
As we saw before, our image is 256 by 256 by 192. This means that the image consists of an array with dimensions (256, 256, 192), where the elements of the array are the voxel (3D pixel) intensity values.
anat_arr = anat_img.get_data() # get the data array
type(anat_arr)
An array has a shape. You can check the other attribute it has by tab completing on anat_arr.. Remember the trailing dot.
anat_arr.shape
We can pick out a slice using slicing syntax rather similar to MATLAB. Remember the 0-based indexing though.
middle_slice = anat_arr[:, :, 96]
middle_slice.shape
This is just a 2D array of numbers, where the numbers are intensity values. We can look at it using matplotlib image display
plt.imshow(middle_slice)
This gave us an axial slice. It looks like this image is stored with left to right changing over the first axis of the array, front to back changing over the second axis of the array, and bottom to top changing over the third axis. How would I get a coronal slice? A sagittal slice?
The slice, like the whole of the array, give the intensity values at each location in the image:
middle_slice
We have lots of zeros at the outside, but in the centre... :
middle_slice[122:133, 122:133]
What data type does the image have?
Now we have an array, we can do things like find the maximum, and minimum and mean value in the volume:
anat_arr.max()
anat_arr.min()
anat_arr.mean()
We can also do useful things like looking at the histogram of values in the image. We first flatten the array to be a 1D vector, then take the histogram of that:
anat_arr_1d = anat_arr.ravel() # as 1D vector
tmp = plt.hist(anat_arr_1d, 100) # 100 histogram bins for nice detail
It looks like there are some outlier values. We can can knock of the top 5 percent and redo the histogram
percentile_95 = np.percentile(anat_arr_1d, 95)
clipped_vector = anat_arr_1d[anat_arr_1d <= percentile_95]
tmp = plt.hist(clipped_vector, 100)
An image also has meta-data. This is a NIfTI image, and it has NIfTI metadata, in the form of a header:
hdr = anat_img.get_header()
print hdr
An image also has an affine. The affine is a 4 by 4 matrix that gives the relationship of the array data to real space (in millimeters, relative to the scanner).
aff = anat_img.get_affine()
aff
In this case you can see that this affine came from the srow fields in the header.
Specifically the affine specifies the relationship between the array coordinates from our (256, 256, 192) array, and millimeters relative to the center of the scanner. Let's say we want to know the position in millimeter space corresponding to position [0, 0, 0] in array coordinates. To get the mm coordinate, we matrix multiply the affine with the array coordinate.
But wait, the array coordinate has 3 values ([0, 0, 0]) and the affine has 4 columns - that won't work.
This is because the 4 by 4 array operates in homogenous coordinates - see http://en.wikipedia.org/wiki/Homogeneous_coordinates. Homogenous coordinates are a trick to allow the affine matrix to contain translations as well as rotations and zooms and shears. In practice, before we multiply our array coordinate, we have to append a 1 at the end of the 3 coordinate values, to make it work with our 4 by 4 affine.
So, we get the position like this:
array_coords = [0, 0, 0]
homogenous_array_coords = [0, 0, 0, 1]
mm_coordinates = aff.dot(homogenous_array_coords)
mm_coordinates
That means that array position [0, 0, 0] corresponds to millimeter position 125 mm to the right of the scanner center, 144 mm behind the scanner center, and 99 mm below the scanner center.
Another example - the mm location of center of the array:
homogenous_array_coords = [127, 127, 46, 1] # note the extra 1 again for homogenous coordinates
mm_coordinates = aff.dot(homogenous_array_coords)
mm_coordinates
Images can also be four dimensional, where the fourth dimension is usually time. The EPI images are usually 4D images.
bold_fname = 'ds107_sub001_run01.nii.gz'
bold_fname
bold_img = nib.load(bold_fname)
bold_img
bold_img.shape
This EPI run contains 164 volumes, each of size (64, 64, 35). We can look at individual volumes by slicing:
bold_arr = bold_img.get_data()
bold_vol0_arr = bold_arr[:, :, :, 0]
bold_vol0_arr.shape
bold_middle_slice = bold_vol0_arr[:, :, 17]
plt.imshow(bold_middle_slice)
The 4D array allows us to do interesting things like making an image which is the mean across time:
time_course_mean = bold_arr.mean(axis=3) # the last axis (0-based numbering)
time_course_mean.shape
plt.imshow(time_course_mean[:, :, 17])
Or - mabye of more interest - the standard deviation across time:
time_course_std = bold_arr.std(axis=3) # the last axis (0-based numbering)
middle_slice_std = time_course_std[:, :, 17]
plt.imshow(middle_slice_std)
Aha - there's lot's of variation at the edge of the brain and in the ventricles.
What's the location of the voxel in this slice with the highest variance?
max_index = np.argmax(middle_slice_std) # flattens the array first
max_coord_2d = np.unravel_index(max_index, middle_slice_std.shape) # get 2D coordinate equivalent
max_coord_2d
max_coord_3d = max_coord_2d + (17,) # add the slice number as third coordinate
max_coord_3d
time_course_std[max_coord_3d]
middle_slice_std.max()
We can plot voxel time courses:
max_std_over_time = bold_arr[max_coord_3d]
plt.plot(max_std_over_time)
If you have nipy installed, you can try running the diagnostics. They use calculations that are almost identical to the ones we've already run here.
import nipy.algorithms.diagnostics as nads
tdiff = nads.time_slice_diffs(bold_arr, time_axis=-1, slice_axis=-2)
nads.plot_tsdiffs(tdiff)