Thursday 29 January 2015

awk saves the day (again)

A quick post on using awk to thin data.
I have a file containing thousands of coordinates recorded using dGPS every 5 seconds. The spatial density of the data is often far too high for practical purposes, and it made kriging a nightmare.
So I wrote the following awk command to thin the data by keeping a point only when it lies at least 10 coordinate units from the last point kept (note that Eastings are in column 3, Northings in column 2 and Elevation in column 4):
awk -v OFS="," -F"," 'NR == 1{xo = $3; yo = $2; zo = $4; print xo,yo,zo; next} 
    {x = $3; y = $2; z = $4; xd = xo - x; yd = yo - y; zd = zo - z; xyzd = sqrt(xd^2 + yd^2 + zd^2); if (xyzd < 10) next; print x,y,z; xo = x; yo = y; zo = z}' Points_e1.csv > Points_e2.csv
Breaking it down:
  1. firstly, set the output delimiter to “,” using -v OFS=","
    • then the input delimiter to “,” using -F","
    • at line one set the initial values for x,y,z and print them
  2. set new values for x,y,z to those from current line
    • calculate differences to the old values
    • calculate 3D distance
    • if the distance is less than 10 then skip to the next line without doing anything more
    • otherwise (the distance is 10 or more) print the current values of x,y,z, store them as the “old” values and move on
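To see the steps above in action, here is a minimal sketch run on a handful of made-up points. The sample file, its values, the temporary directory and the min_dist variable are all assumptions for illustration, not part of the original data or command. The difference in z is included in the distance, so a point can be kept because of an elevation change alone.

```shell
# Hypothetical sample data in the same layout: id,northing,easting,elevation
tmpdir=$(mktemp -d)
printf '%s\n' \
  'p1,100,200,50' \
  'p2,103,204,50' \
  'p3,100,212,50' \
  'p4,100,215,50' \
  'p5,104,218,58' > "$tmpdir/sample_points.csv"

# min_dist is an assumed name for the thinning threshold (same units as the coordinates)
thinned=$(awk -v OFS="," -F"," -v min_dist=10 '
  # first line: remember it as the "old" point and always print it
  NR == 1 { xo = $3; yo = $2; zo = $4; print xo, yo, zo; next }
  {
    x = $3; y = $2; z = $4
    xd = xo - x; yd = yo - y; zd = zo - z
    # closer than min_dist to the last kept point: skip the line
    if (sqrt(xd^2 + yd^2 + zd^2) < min_dist) next
    # otherwise keep it and make it the new "old" point
    print x, y, z
    xo = x; yo = y; zo = z
  }' "$tmpdir/sample_points.csv")

echo "$thinned"
rm -r "$tmpdir"
```

With these made-up values p2 and p4 are dropped. p5 is only about 7.2 units from p3 in plan view, but the 8 unit change in elevation brings the 3D distance to about 10.8, so it is kept.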
Running this in the command line was so quick that at first I thought it must have failed.
