Wednesday 26 March 2014

gdalinfo and statistics

This is a response to a query: what do you do about a file with no statistics? Running gdalinfo on this SPOT4 scene gives the following:
$ gdalinfo SPOT4_HRVIR2_046-209_2_080911_110453.tif
Driver: GTiff/GeoTIFF
Files: SPOT4_HRVIR2_046-209_2_080911_110453.tif
       SPOT4_HRVIR2_046-209_2_080911_110453.tfw
Size is 4440, 4180
Coordinate System is:
PROJCS["SWEREF99 TM",
    GEOGCS["SWEREF99",
        DATUM["SWEREF99",
            SPHEROID["GRS 1980",6378137,298.2572221010002,
                AUTHORITY["EPSG","7019"]],
            AUTHORITY["EPSG","6619"]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433],
        AUTHORITY["EPSG","4619"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",15],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AUTHORITY["EPSG","3006"]]
Origin = (610600.000000000000000,7560400.000000000000000)
Pixel Size = (20.000000000000000,-20.000000000000000)
Metadata:
  AREA_OR_POINT=Point
  TIFFTAG_COPYRIGHT=Lantmäteriet
  TIFFTAG_RESOLUTIONUNIT=1 (unitless)
  TIFFTAG_XRESOLUTION=1
  TIFFTAG_YRESOLUTION=1
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  610600.000, 7560400.000) ( 17d39'42.97"E, 68d 8' 9.21"N)
Lower Left  (  610600.000, 7476800.000) ( 17d34'41.45"E, 67d23'12.54"N)
Upper Right (  699400.000, 7560400.000) ( 19d47'31.12"E, 68d 5'16.24"N)
Lower Right (  699400.000, 7476800.000) ( 19d38'30.00"E, 67d20'25.87"N)
Center      (  655000.000, 7518600.000) ( 18d40' 6.86"E, 67d44'28.15"N)
Band 1 Block=4440x128 Type=Byte, ColorInterp=Red
Band 2 Block=4440x128 Type=Byte, ColorInterp=Green
Band 3 Block=4440x128 Type=Byte, ColorInterp=Blue
Band 4 Block=4440x128 Type=Byte, ColorInterp=Undefined
Pretty good, but no stats. We can force minimum and maximum calculation with the -mm flag:
$ gdalinfo -mm SPOT4_HRVIR2_046-209_2_080911_110453.tif
Driver: GTiff/GeoTIFF
Files: SPOT4_HRVIR2_046-209_2_080911_110453.tif
       SPOT4_HRVIR2_046-209_2_080911_110453.tfw
Size is 4440, 4180
Coordinate System is:
PROJCS["SWEREF99 TM",
    GEOGCS["SWEREF99",
        DATUM["SWEREF99",
            SPHEROID["GRS 1980",6378137,298.2572221010002,
                AUTHORITY["EPSG","7019"]],
            AUTHORITY["EPSG","6619"]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433],
        AUTHORITY["EPSG","4619"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",15],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AUTHORITY["EPSG","3006"]]
Origin = (610600.000000000000000,7560400.000000000000000)
Pixel Size = (20.000000000000000,-20.000000000000000)
Metadata:
  AREA_OR_POINT=Point
  TIFFTAG_COPYRIGHT=Lantmäteriet
  TIFFTAG_RESOLUTIONUNIT=1 (unitless)
  TIFFTAG_XRESOLUTION=1
  TIFFTAG_YRESOLUTION=1
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  610600.000, 7560400.000) ( 17d39'42.97"E, 68d 8' 9.21"N)
Lower Left  (  610600.000, 7476800.000) ( 17d34'41.45"E, 67d23'12.54"N)
Upper Right (  699400.000, 7560400.000) ( 19d47'31.12"E, 68d 5'16.24"N)
Lower Right (  699400.000, 7476800.000) ( 19d38'30.00"E, 67d20'25.87"N)
Center      (  655000.000, 7518600.000) ( 18d40' 6.86"E, 67d44'28.15"N)
Band 1 Block=4440x128 Type=Byte, ColorInterp=Red
    Computed Min/Max=0.000,255.000
Band 2 Block=4440x128 Type=Byte, ColorInterp=Green
    Computed Min/Max=0.000,255.000
Band 3 Block=4440x128 Type=Byte, ColorInterp=Blue
    Computed Min/Max=0.000,255.000
Band 4 Block=4440x128 Type=Byte, ColorInterp=Undefined
    Computed Min/Max=0.000,255.000
But that is computed on the fly every time and isn't available to any other programme. To get permanently attached statistics we can create a new file with gdal_translate and the -stats flag (alternatively, gdalinfo -stats computes statistics and saves them in an .aux.xml sidecar file next to the original).
$ gdal_translate -stats SPOT4_HRVIR2_046-209_2_080911_110453.tif output.tif
Input file size is 4440, 4180
0...10...20...30...40...50...60...70...80...90...100 - done.
Then we get this from gdalinfo.
$ gdalinfo output.tif
Driver: GTiff/GeoTIFF
Files: output.tif
Size is 4440, 4180
Coordinate System is:
PROJCS["SWEREF99 TM",
    GEOGCS["SWEREF99",
        DATUM["SWEREF99",
            SPHEROID["GRS 1980",6378137,298.2572221010002,
                AUTHORITY["EPSG","7019"]],
            AUTHORITY["EPSG","6619"]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433],
        AUTHORITY["EPSG","4619"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",15],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AUTHORITY["EPSG","3006"]]
Origin = (610600.000000000000000,7560400.000000000000000)
Pixel Size = (20.000000000000000,-20.000000000000000)
Metadata:
  AREA_OR_POINT=Point
  TIFFTAG_COPYRIGHT=Lantmäteriet
  TIFFTAG_RESOLUTIONUNIT=1 (unitless)
  TIFFTAG_XRESOLUTION=1
  TIFFTAG_YRESOLUTION=1
Image Structure Metadata:
  INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left  (  610600.000, 7560400.000) ( 17d39'42.97"E, 68d 8' 9.21"N)
Lower Left  (  610600.000, 7476800.000) ( 17d34'41.45"E, 67d23'12.54"N)
Upper Right (  699400.000, 7560400.000) ( 19d47'31.12"E, 68d 5'16.24"N)
Lower Right (  699400.000, 7476800.000) ( 19d38'30.00"E, 67d20'25.87"N)
Center      (  655000.000, 7518600.000) ( 18d40' 6.86"E, 67d44'28.15"N)
Band 1 Block=4440x1 Type=Byte, ColorInterp=Red
  Min=0.000 Max=255.000 
  Minimum=0.000, Maximum=255.000, Mean=34.493, StdDev=40.536
  Mask Flags: PER_DATASET ALPHA 
  Metadata:
    STATISTICS_MAXIMUM=255
    STATISTICS_MEAN=34.493035637312
    STATISTICS_MINIMUM=0
    STATISTICS_STDDEV=40.53643151195
Band 2 Block=4440x1 Type=Byte, ColorInterp=Green
  Min=0.000 Max=255.000 
  Minimum=0.000, Maximum=255.000, Mean=34.832, StdDev=42.388
  Mask Flags: PER_DATASET ALPHA 
  Metadata:
    STATISTICS_MAXIMUM=255
    STATISTICS_MEAN=34.832346814518
    STATISTICS_MINIMUM=0
    STATISTICS_STDDEV=42.388441124655
Band 3 Block=4440x1 Type=Byte, ColorInterp=Blue
  Min=0.000 Max=255.000 
  Minimum=0.000, Maximum=255.000, Mean=44.688, StdDev=49.705
  Mask Flags: PER_DATASET ALPHA 
  Metadata:
    STATISTICS_MAXIMUM=255
    STATISTICS_MEAN=44.688375576533
    STATISTICS_MINIMUM=0
    STATISTICS_STDDEV=49.70528306057
Band 4 Block=4440x1 Type=Byte, ColorInterp=Alpha
  Min=0.000 Max=255.000 
  Minimum=0.000, Maximum=255.000, Mean=54.075, StdDev=60.230
  Metadata:
    STATISTICS_MAXIMUM=255
    STATISTICS_MEAN=54.075123227294
    STATISTICS_MINIMUM=0
    STATISTICS_STDDEV=60.229994798131
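The STATISTICS_* values written by -stats are nothing exotic: the minimum, maximum, mean and population standard deviation of the band values. A quick Numpy sketch of the same calculation on a made-up byte array (not the SPOT scene):

```python
import numpy as np

# A tiny stand-in for one band of Byte data
band = np.array([[0, 10, 20],
                 [30, 40, 255]], dtype=np.uint8)

stats = {
    'STATISTICS_MINIMUM': float(band.min()),
    'STATISTICS_MAXIMUM': float(band.max()),
    'STATISTICS_MEAN': float(band.mean()),
    'STATISTICS_STDDEV': float(band.std()),  # population std (divide by N), as GDAL reports
}
```

On a real file with a NoData value you would mask those pixels out first, which GDAL does for you.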

Filtering rasters

I recently needed/wanted to write a filter function to use with GDAL in Python: something that could be passed a kernel such as a spatial average (blur) or edge detection (it was actually the latter I was mostly after). This is the code I wrote for it.
Let's start with the selection of a kernel (filter). If I already know which one I want I can pass its name to the selection function; otherwise the function starts with the choice set to an empty string.
def kernelChoice(ans = ''):
Then there is a dictionary containing all the kernels as Numpy arrays (see code below), after which the function asks for a choice if none has been passed in, and returns the array and its name. Note also that Numpy has been imported as np (import numpy as np).
def kernelChoice(ans = ''):
    '''Returns a kernel to use with mustard as well as the name of the kernel.
    To use: kernel, ans = kernelChoice(ans = '') '''
    # Create dictionary of kernels
    kernelDict = {
    'Average':np.array([[0.111,0.111,0.111],[0.111,0.111,0.111],[0.111,0.111,0.111]]),
    'HighPass1':np.array([[-0.7,-1.0,-0.7],[-1.0,6.8,-1.0],[-0.7,-1.0,-0.7]]),
    'HighPass2':np.array([[-1.0,-1.0,-1.0],[-1.0,9.0,-1.0],[-1.0,-1.0,-1.0]]),
    'PrewittX':np.array([[-1.0,0.0,1.0],[-1.0,0.0,1.0],[-1.0,0.0,1.0]]),
    'PrewittY':np.array([[1.0,1.0,1.0],[0.0,0.0,0.0],[-1.0,-1.0,-1.0]]),
    'SobelX':np.array([[-1.0,0.0,1.0],[-2.0,0.0,2.0],[-1.0,0.0,1.0]]),
    'SobelY':np.array([[1.0,2.0,1.0],[0.0,0.0,0.0],[-1.0,-2.0,-1.0]])
    }
    # Create list of kernel names
    KerList = kernelDict.keys()
    KerList.sort()
    # Get choice from user
    while ans not in KerList:
        print KerList
        ans = raw_input('Enter name of kernel: ')
    kernel = kernelDict[ans]
    print ans,':\n', kernel
    return kernel, ans
The next step is to open the raster file, read in the data and run the kernel over it. This function calls rasterImport, which we will also define further down; it is similar to raster importers I have messed around with before. Tidying that code up should be a priority, because as it stands it is pretty messy and requires a lot of manipulation. Well, back to this code.
The mustard function sets NoData values in the imported data grid to Numpy nan. Then there is a whole load of getting shapes and sizes in order, after which an array based on the kernel is created that contains locations relative to the central pixel; left and up are negative, right and down are positive (remember that coordinates here start at the upper left).
Then we loop, starting top left and going right, then down to the next row, left to right each time. At last the filtered raster is returned.
def mustard(file,kernel):
    '''Run a kernel (window,matrix) over a single layer raster. Uses rasterImport and rasterExport.
    Kernel must be numpy array of odd number dimensions.
    To use: outdata = mustard(file,kernel) '''
    # Import data file
    data, meta, metadata = rasterImport(file)
    # Set no data values to numpy nan
    data[data==meta['dats'][0]]=np.nan
    # Get size of indata array
    inrows =  meta['dimension']['rows']
    incols =  meta['dimension']['cols']
    # Get size of kernel
    krows = np.shape(kernel)[0]
    kcols = np.shape(kernel)[1]
    # Check kernel is smaller than data grid.
    if krows >= inrows or kcols >= incols: sys.exit('Bad kernel. Too large.')
    # Check kernel has a central pixel
    if krows % 2 == 0: sys.exit('Bad kernel. Even number rows.')
    if kcols % 2 == 0: sys.exit('Bad kernel. Even number columns.')
    # Get central pixel location in kernel
    kmidrow = int(krows)/2
    kmidcol = int(kcols)/2
    # Create relative extent of kernel
    rowminext = -1*((krows-1)/2)
    rowmaxext = (krows-1)/2
    rowrange = range(rowminext,rowmaxext+1,1)
    colminext = -1*((kcols-1)/2)
    colmaxext = (kcols-1)/2
    colrange = range(colminext,colmaxext+1,1)
    # Set initial starting location of kernel on grid
    dati = kmidrow
    datj = kmidcol
    # Get number of rows to run kernel over
    gridrows = range(inrows + rowminext*2)
    # Get number of columns to run kernel over
    gridcols = range(incols + colminext*2)
    # Create output array filled with nan
    outdata = np.empty((inrows,incols))
    outdata[:] = np.nan
    # Start loop
    for row in gridrows:
        datj = kmidcol
        rowvec = np.ones((1,krows))*dati + rowrange
        for col in gridcols:
            if np.isnan(data[dati,datj]):
                # Leave NoData pixels as nan, then move on
                outdata[dati,datj] = np.nan
                datj = datj + 1
            else:
                colvec = np.ones((1,kcols))*datj + colrange
                extract = np.empty((krows,kcols))
                for i in range(krows):
                    for j in range(kcols):
                        extract[i,j] = data[int(rowvec[0,i]),int(colvec[0,j])]
                pixval = np.nansum(extract * kernel)
                outdata[dati,datj] = pixval
                datj = datj + 1
        dati = dati + 1
    return outdata
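As an aside, the same windowed sum can be done without the pixel-by-pixel Python loops by accumulating shifted Numpy slices, one per kernel cell. This is only a sketch of the idea (the function name slideKernel is mine, and it skips the NoData bookkeeping that mustard does):

```python
import numpy as np

def slideKernel(data, kernel):
    '''Same windowed sum as mustard, but built from shifted Numpy slices
    instead of inner Python loops. Edge pixels stay nan; no NoData handling.'''
    krows, kcols = kernel.shape
    kr, kc = krows // 2, kcols // 2
    rows, cols = data.shape
    out = np.full((rows, cols), np.nan)
    # One shifted view of the grid per kernel cell, weighted and summed
    acc = np.zeros((rows - krows + 1, cols - kcols + 1))
    for i in range(krows):
        for j in range(kcols):
            acc += kernel[i, j] * data[i:i + rows - krows + 1,
                                       j:j + cols - kcols + 1]
    out[kr:rows - kr, kc:cols - kc] = acc
    return out
```

With the 'Average' kernel this gives the 3x3 mean of each interior pixel, the same result the loop version produces away from NoData.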
To finish off, here are the other functions that make this work, as well as a few lines to get everything running. Don't forget the imports: numpy (as np), os, sys, math, datetime, and gdal and osr from the GDAL bindings.
def namer(pathstring):
    '''Get name and extension of a file
    To use: name,ext,path,namefull = namer(pathstring) '''
    ## Called by: kriging
    namefull = os.path.basename(pathstring)
    path = os.path.dirname(pathstring)
    ext = namefull.split('.')[1]
    name = namefull.split('.')[0]
    return name,ext,path,namefull
#
def rasterImport(file):
    '''Import a raster file and return grid plus meta data.
    To use: data, meta, metadata = rasterImport(file)'''
    time_one = datetime.now()
    print '\nImporting ',file
    #print time_one.strftime('at day:%j %H:%M:%S')
    # register all of the GDAL drivers
    gdal.AllRegister()
    # Open file
    raster = gdal.Open(file)
    if raster is None:
        print 'Could not open ',file,'\n'
        sys.exit(1)
    # Get coordinate system parameters
    projec = raster.GetProjection()
    srs=osr.SpatialReference(wkt=projec)
    transf = raster.GetGeoTransform()
    ul_x = transf[0]
    ul_y = transf[3]
    xres = transf[1]
    yres = transf[5]
    # get image size
    rows = raster.RasterYSize
    cols = raster.RasterXSize
    dims = {'xres':xres,'yres':yres,'rows':rows,'cols':cols}
    # Calculate corners
    ll_x = ul_x
    ll_y = ul_y + (rows * yres)
    ur_x = ul_x + (cols * xres)
    ur_y = ul_y
    lr_x = ur_x
    lr_y = ll_y
    corners = {'ll':(ll_x,ll_y),'ul':(ul_x,ul_y),'ur':(ur_x,ur_y),'lr':(lr_x,lr_y)}
    #
    driveshrt = raster.GetDriver().ShortName
    driver = gdal.GetDriverByName(driveshrt)
    metadata = driver.GetMetadata()
    metakeys = metadata.keys()
    # Read the file band to a matrix called band_1
    band = raster.GetRasterBand(1)
    # Access data in rastern band as array
    data = band.ReadAsArray(0,0,cols,rows)
    #
    # gdal interpolation creates "upside down files, hence this
    Yres=yres
    if Yres/np.fabs(Yres)==-1:
        Yres=-1*Yres
        data = np.flipud(data)
    #
    # get nodata value
    nandat = band.GetNoDataValue()
    # get minimum and maximum value
    mindat = band.GetMinimum()
    maxdat = band.GetMaximum()
    if mindat is None or maxdat is None:
            (mindat,maxdat) = band.ComputeRasterMinMax(1)
    dats = [nandat,mindat,maxdat]
    sumvals = data[np.where(np.logical_not(data == nandat))]
    sumval = sumvals.sum()
    numvals = len(sumvals)
    avval = sumval/numvals
    volvals = sumval*(math.fabs((transf[1]*transf[5])))
    meta = {'transform':transf,'projection':projec,'corners':corners,'dimension':dims,'dats':dats,'sumval':sumval,'numvals':numvals,'avval':avval,'volvals':volvals}
    if srs.IsProjected():
        print 'projcs 1: ',srs.GetAttrValue('projcs')
        print 'projcs 2: ',srs.GetAuthorityCode('projcs')
        print 'projcs 3: ',srs.GetAuthorityName('projcs')
    print 'geogcs 1:: ',srs.GetAttrValue('geogcs')
    print 'Sum of pixel values = ',sumval,' in range ',mindat,' to ',maxdat
    print 'From ',numvals,' of ',rows,' X ',cols,' = ',rows*cols,' pixels, or ',((1.0*numvals)/(rows*cols))*100,'%'
    print 'Average then is ',avval
    print 'Pixels size, x:',transf[1],' y:',transf[5]
    print '"Volume" is then the sum * 1 pixel area: ',volvals
    return data, meta, metadata
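The corner calculation in rasterImport is worth seeing on its own: it is just the geotransform tuple from GetGeoTransform() applied at the grid edges. A standalone sketch using the numbers from the SPOT scene above (no GDAL needed, so the values are typed in by hand):

```python
# (ul_x, xres, rotation, ul_y, rotation, yres), as GetGeoTransform() returns it
transf = (610600.0, 20.0, 0.0, 7560400.0, 0.0, -20.0)
rows, cols = 4180, 4440

ul = (transf[0], transf[3])
ll = (transf[0], transf[3] + rows * transf[5])   # yres is negative, so this moves down
ur = (transf[0] + cols * transf[1], transf[3])
lr = (ur[0], ll[1])
# ll = (610600.0, 7476800.0) and ur = (699400.0, 7560400.0), the same
# corner coordinates gdalinfo printed for the scene
```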
#
def rasterExport(outdata,meta,filnm):
    '''Export array as single layer GeoTiff. Uses metadata from rasterImport and numpy nan as mask.
    To use: rasterExport(outdata,meta,filnm)'''
    inrows =  meta['dimension']['rows']
    incols =  meta['dimension']['cols']
    datdriver = gdal.GetDriverByName( "GTiff" )
    datout = datdriver.Create(filnm,incols,inrows,1,gdal.GDT_Float32)
    datout.SetGeoTransform(meta['transform'])
    datout.SetProjection(meta['projection'])
    outdata_m = np.flipud(outdata)
    outdata_m = np.where(np.isnan(outdata_m),meta['dats'][0],outdata_m)
    datout.GetRasterBand(1).WriteArray(outdata_m)
    datout.GetRasterBand(1).SetNoDataValue(meta['dats'][0])
    datout.GetRasterBand(1).ComputeStatistics(True)
    datout = None
    return 0
#
# Here is a little starter, just set the variable 'file' first
file = './file1.tif'
kernel, ans = kernelChoice()
name,ext,path,namefull = namer(file)
filnm = os.path.join(path,(name + '_' + ans + '.tif'))
outdata = mustard(file,kernel)
data, meta, metadata = rasterImport(file)
rasterExport(outdata,meta,filnm)

Friday 14 March 2014

Quick script to batch crop images

Batch cropping a folder full of image files sounds simple enough. Apple Automator can do it, sort of, if you want the crop centred. But if you want to select a non-centred area to crop in each image (say, a set of maps generated by a script) then you can use PIL, a Python module.
Here goes:
#!/usr/bin/env python
from __future__ import division
import sys
import os
from PIL import Image
# ANDREW MERCER
# 12/04/2014
# V.1
#
## FUNCTION DEFINITIONS
#
## MAIN PROGRAM
# Set file folder name to current folder
flistfolder = './'
# Get file type from user. PIL can handle other types but checking all that is more
# than I am willing to do for this quick script.
filetype = ''
while filetype not in ['jpg','png','tif','gif']:
    filetype = raw_input('Enter file type: ')
# Create file list containing all and only image files in folder
FileList = filter(lambda x: x.endswith(filetype),os.listdir(flistfolder))
# Get crop area
left = ''
right = ''
upper = ''
lower = ''
print "Enter coordinates for crop area. 0,0 is upper left of image\n"
# Use isinstance as you may want to set to 0. This checks that the variable is set to
# the integer value of 0 and not just empty. Also checks against non-numerical input.
while not isinstance(left, int):
    leftin = raw_input('\nEnter left coordinate: ')
    try:
        left = int(leftin)
        print 'Left set at: ',left
    except ValueError:
        left = ''
while not isinstance(right, int):
    rightin = raw_input('\nEnter right coordinate: ')
    try:
        right = int(rightin)
        print 'Right set at: ',right
    except ValueError:
        right = ''
while not isinstance(upper, int):
    upperin = raw_input('\nEnter upper coordinate: ')
    try:
        upper = int(upperin)
        print 'Upper set at: ',upper
    except ValueError:
        upper = ''
while not isinstance(lower, int):
    lowerin = raw_input('\nEnter lower coordinate: ')
    try:
        lower = int(lowerin)
        print 'Lower set at: ',lower
    except ValueError:
        lower = ''
# Create folder for new images
if not os.path.exists('./Crop'):
    os.makedirs('./Crop')
# Loop through file list
for f in FileList:
    # Create new file name for edited file
    file, ext = os.path.splitext(f)
    fed = file+'_e'+ext
    fout = os.path.join('./Crop',fed)
    # Open image with PIL
    try:
        im = Image.open(f)
    except IOError:
        continue
    # Crop and save image
    im.crop((left,upper,right,lower)).save(fout)
Either put the script in your bash path or in the folder with the image files, and call it from the command line: $ Croper.py or $ ./Croper.py. Only one problem (well, probably many more): how do you get the coordinates in the first place?
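As for those coordinates: PIL can at least tell you the image size, and the crop box is just (left, upper, right, lower) measured in pixels from the top-left corner. A minimal sketch on a synthetic image, so it runs without any files (the sizes here are made up):

```python
from PIL import Image

# A dummy 100 x 80 pixel image standing in for one read from disk
im = Image.new('RGB', (100, 80))

# (left, upper, right, lower) - the same box tuple the script builds
box = (10, 5, 60, 45)
cropped = im.crop(box)
# The cropped width is right - left and the height is lower - upper
```

In practice you could open one representative map in an image viewer, note the pixel coordinates of the area you want, and feed those to the script.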

Sunday 9 March 2014

A few bash one liners I have found useful

Once you start working via the terminal, using GDAL, Python etc., you start wondering whether bash might be capable of a lot more than just launching those nice programmes. Of course it is, but is it easy? No. Well, sort of. Sometimes the syntax (especially regular expressions) can seem arcane and impenetrable. I'm not here to rescue you; I am a duffer at this, but I have learnt how to do a few things that I think are useful and, above all, fast.

Let's say you want to reorder the columns of a csv file: completely drop the first column and switch the 4th and 5th columns around. The following does just that and exports the rearranged columns to a new file. The $ signs indicate column numbers, and their order in the list determines their order in the output.

awk -v OFS="," -F"," '{print $2,$3,$5,$4}' file1.csv > file1_e.csv
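If quoting or embedded commas ever trip up awk, the same reordering can be sketched with Python's csv module (the function name and the file-object interface here are my own, not part of the awk workflow):

```python
import csv

def reorder_columns(fin, fout, order=(1, 2, 4, 3)):
    '''Drop column 0 and swap the 4th and 5th columns (0-based indices),
    mirroring the awk one-liner above. fin/fout are open file objects.'''
    writer = csv.writer(fout)
    for row in csv.reader(fin):
        writer.writerow([row[i] for i in order])
```

Use it with something like reorder_columns(open('file1.csv'), open('file1_e.csv', 'w', newline='')) in Python 3.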

What if we want to do some more editing of the csv file? Let's say you want to remove the rows where the last entry is empty.

sed -i ".bak" '/,$/d' file1_e.csv

Or remove rows containing "nan".

sed -i ".bak" '/nan/d' file1_e.csv

Or remove both at once. (Note the syntax here is BSD/macOS sed; with GNU sed on Linux write the backup suffix as -i.bak, and older versions use -r in place of -E.)

sed -i ".bak" -E '/,$|nan/d' file1_e.csv

How about renaming a folder full of files, removing some part of the name and adding something else. The following code removes the string "bs04" from all tiff file names, no matter where in the name and prefixes all with "trial".

for i in *.tif; do mv "$i" trial"${i/bs04/}"; done

Alternatively remove "bs04" and append "trial" to all png files.

for i in *.png; do mv "$i" "${i/bs04/trial}"; done

Now that the files have been renamed, let's make a folder called "results" and move them to it.

mkdir results; mv *trial* results

There are many other blogs and pages dedicated to bash code and one liners (e.g. bashoneliners.com) but you will have to wade through a great deal of chat and geek nonsense. I hope these lines were to the point and generic enough to be useful. I shall try and update this list with useful (for me) code as I use it but lots of stuff is so specific and/or hard for me to understand, let alone explain, that it won't make it on to here.

Thursday 6 March 2014

Positive and negative masks

I'm working on kriging at the moment and need to split my beloved glacier into two regions: the interpolated region and the extrapolated region.

  1. Firstly, create a Delaunay polygon shape file in QGIS from point data. This represents the area of the glacier where values can be interpolated (with some editing).
  2. From this and a mask of the entire glacier (a raster with 1 = glacier, nan = not a glacier) I can use gdalwarp to create a masked mask, i.e. mask the glacier to show only those parts covered by interpolation.
  3. gdalwarp refuses to respect the size of the original raster when clipping, so that has to be fixed by running gdalwarp again and forcing the new raster to the right size.
  4. Now, to get the reverse mask, the bits not covered by interpolation. Here I hit a bit of a problem: gdal_calc.py won't do it, at least not with anything I tried.
  5. So write a script.
Maybe I'm missing something, but by this point writing something myself seemed quicker and infinitely less frustrating.

The function below uses a couple of other functions already encountered on this blog, DemImport and namer, although you can easily get rid of namer. The code is messy and probably not efficient, but I'm bolting this stuff together on the fly as needs arise. But then, it's free.

def maskPosNeg(vector,Afile,name,outfolder):
    '''Create a mask raster from a vector over a raster, then invert the masked area to create a mask of the remaining area.
    Created for masks of interpolated and extrapolated areas of a glacier.
    To use: maskPosNeg(vector,Afile,name,outfolder) '''
    # Get background raster (glacier dem) and calculate dimensions etc.
    Adata,AXg,AYg,AXg1,AYg1,Arx,Ary,Ademmeta = DemImport(Afile)
    transf = Ademmeta[2]
    ul_x = transf[0]
    ul_y = transf[3]
    xres = transf[1]
    yres = transf[5]
    # get image size
    demrows = Ademmeta[4]
    demcols = Ademmeta[5]
    # Calculate corners
    ll_x = ul_x
    ll_y = ul_y + (demrows * yres)
    ur_x = ll_x + (demcols * xres)
    ur_y = ul_y
    # Create names for output data
    Aname,Aext,Apath,Anamefull = namer(Afile)
    outFile1 = os.path.join(outfolder,(name + '_posMask_Temp.tif'))
    outFile2 = os.path.join(outfolder,(name + '_posMask.tif'))
    outFile3 = name + '_negMask'
    # Start by clipping to vector
    gdalMess1 = 'gdalwarp -te ' + str(ll_x) + ' ' + str(ll_y) + ' ' + str(ur_x) + ' ' + str(ur_y) + ' -tr ' +str(xres) + ' ' + str(yres) + ' -dstnodata -9999 -q -cutline ' + vector + ' -crop_to_cutline -of GTiff ' + Afile + ' ' + outFile1
    os.system(gdalMess1)
    # Then resize to original raster, I kid you not.
    gdalMess2 = 'gdalwarp -te ' + str(ll_x) + ' ' + str(ll_y) + ' ' + str(ur_x) + ' ' + str(ur_y) + ' -tr ' +str(xres) + ' ' + str(yres) + ' -dstnodata -9999 ' + outFile1 + ' ' + outFile2
    os.system(gdalMess2)
    print outFile2,' created'
    # Remove temp file
    command = 'rm ' + outFile1
    os.system(command)
    #
    Bfile = outFile2
    Bdata,BXg,BYg,BXg1,BYg1,Brx,Bry,Bdemmeta = DemImport(Bfile)
    print Bfile,' data type: ',type(Bdata)
    print 'Rows (A,B): ',Arx,Brx,' Columns (A,B): ',Ary,Bry
    print 'xres (A,B): ',Ademmeta[2][1],Bdemmeta[2][1],' yres (A,B): ',Ademmeta[2][5],Bdemmeta[2][5]
    print 'No data (A): ',Ademmeta[6],' No data (B): ',Bdemmeta[6]
    # Check matching resolution
    if Arx != Brx or Ary != Bry:
        print Bfile,' resolution mismatch with ', Afile
    elif Ademmeta[4] != Bdemmeta[4] or Ademmeta[5] != Bdemmeta[5]:
        print 'Size mismatch between ',Bfile,' and ',Afile
    # Do the masking and subtracting
    Bdata[Bdata==Bdemmeta[6]]=np.nan
    AdataMasked = np.ma.masked_array(np.nan_to_num(Adata), mask=np.isnan(Adata))# & np.isnan(Bdata))
    BdataMasked = np.ma.masked_array(np.nan_to_num(Bdata), mask=AdataMasked.mask)
    outdata = (AdataMasked - BdataMasked).filled(np.nan)
    outdata = np.ma.masked_array(np.nan_to_num(outdata), mask=np.isnan(outdata))
    filnm = datawrite(outdata,Adata,Ademmeta,outFile3,outfolder)
    print 'maskPosNeg done.'
    return 0
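Stripped of all the gdalwarp plumbing, the heart of the positive/negative split is mask arithmetic on two grids: the negative mask is 1 wherever the glacier mask has a value but the interpolated mask does not. A toy sketch with hypothetical 1/nan grids (not the real glacier data):

```python
import numpy as np

# 1 = glacier, nan = not glacier (the full mask)
glacier = np.array([[1.0, 1.0, np.nan],
                    [1.0, 1.0, 1.0]])
# 1 = covered by interpolation, nan = not covered (the positive mask)
interp = np.array([[1.0, np.nan, np.nan],
                   [1.0, 1.0, np.nan]])

# Negative mask: on the glacier but outside the interpolated area
neg = np.where(np.logical_and(~np.isnan(glacier), np.isnan(interp)),
               1.0, np.nan)
```

This is the same subtraction the masked arrays in maskPosNeg perform, minus the NoData round-tripping through files.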