For many years I’ve been working with R, a computer language that caught my interest around 2011. It was not until DNR collected bathymetric data of Deep Creek Lake in 2012, and people were expressing interest in how water depth varied over the season at their boat slip locations, that I became interested in analyzing the DNR data. I had explored enough of R to understand that it could be used to generate bathymetric maps.
So, off I went, trying to create bathymetric maps from the DNR data. It was a trial-and-error process, as all development is, but I finally succeeded in producing bathymetric maps for many coves of the lake, and one for the whole lake. The whole-lake map clearly lacks resolution because of the size of the lake and the limits of what can usefully be shown on a printed page.
In this note I’m going to explain how I created the cove maps.
To generate a bathymetric map required three capabilities:
The starting point of all of this consisted of two events:
This report is written in R Markdown using RStudio. RStudio is an integrated development environment, while R Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. Go here for more details on using R Markdown. R Markdown is oriented towards the R language and contains some additional functionality for working with R itself. The system allows one to execute scripts as part of the documentation by means of “knitr,” an R package.
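For example, a code chunk in the R Markdown source is written like this (the chunk label “example” is arbitrary); when the document is knit, the code runs and its output appears below the chunk:

```{r example}
# knitr executes this code when the document is rendered
summary(1:10)
```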
All the code reported here is the result of trial and error in a separate editor; working that way makes it easier to find the best route to a final product. All of the code shown works and produces the results listed.
This report is about the work that I’ve done to develop bathymetric maps of Deep Creek Lake and its many coves using the measurements made by DNR in April of 2012. This is going to be a lengthy adventure, since I want to document every step I took.
The 2012 dataset was collected by DNR over seven days. Measurements were made on April 12 and 13, April 16 and 17, and April 19 through 21, all of 2012. As of this writing (March 2018), this is the most recent bathymetric data for Deep Creek Lake. The data was made available in Excel files and can be found here. DNR processed the raw data and produced a set of seven files of corrected data.
The following table provides a summary of the contents of the seven files that contain the bathymetric measurements. The total number of measurements is 617,419.
Some points have been deleted from the files as originally received, but only a very small number, mostly in the very shallow areas, as agreed upon with DNR. We will not go into that analysis because it has no impact on the final result.
So here is a table of the original DNR files formatted using Markdown.
# simple table creation here
Date | File Name | No. of Points | Min Lon | Max Lon | Min Lat | Max Lat | Min Depth (ft) | Max Depth (ft) |
---|---|---|---|---|---|---|---|---|
12-Apr | April12xyz.txt | 48,445 | -79.32150017 | -79.25799367 | 39.484192 | 39.50689817 | 2.61 | 45.75 |
13-Apr | April13xyz.txt | 62,194 | -79.31166183 | -79.275382 | 39.492046 | 39.51567983 | 2.61 | 51.85 |
16-Apr | April16xyz.txt | 106,138 | -79.3207975 | -79.2765555 | 39.448169 | 39.49623683 | 2.51 | 46.73 |
17-Apr | April17xyz.txt | 116,506 | -79.32140633 | -79.24972417 | 39.46307867 | 39.5068495 | 2.54 | 36.65 |
19-Apr | April19xyz.txt | 90,057 | -79.34453133 | -79.28951433 | 39.46386267 | 39.53688867 | 2.51 | 59.28 |
20-Apr | April20xyz.txt | 97,236 | -79.39220833 | 39.55918633 | 39.44820533 | 39.55918633 | 3.49 | 59.61 |
21-Apr | April21xyz.txt | 96,843 | -79.39202017 | -79.344315 | 39.49837067 | 39.53027483 | 3.82 | 75.85 |
Something similar can be done using the R Markdown and “knitr” methodology, which will be used throughout this report.
For example, when you click the Knit button in RStudio, a document is generated that includes both content and the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
# Read the list of data file names (tab-separated, no header row)
file1 <- "./data/DNR2011MeasurementFileNames.txt"
data1 <- read.table(file=file1, header=FALSE, sep="\t")
# Number of data files to process
n <- length(data1$V1)
print(n)
## [1] 7
What we’ve done here is to read in the list of names of the files that contain the corrected data from the DNR bathymetric data acquisition campaign of 2012. The original data was reduced and corrected into a more readable format. The value of n is the number of data files that we have to process.
In the following we compute some summary information for each of the files mentioned above: the number of data points, the minimum and maximum longitude and latitude of the data in each file, and the minimum and maximum depth measured. The results are then displayed in a table.
data.length <- vector()
data.minlon <- vector()
data.maxlon <- vector()
data.minlat <- vector()
data.maxlat <- vector()
data.mindepth <- vector()
data.maxdepth <- vector()
The section above defines the names of the vectors that will hold the corresponding values for each of the files.
for(i in 1:n){
# Build the path to the i-th data file and read it (tab-separated, with a header row)
filex <- paste("./data/", as.character(data1$V1[i]), sep="")
data2 <- read.table(file=filex, header=TRUE, sep="\t")
# Record the number of points and the coordinate and depth ranges
data.length[i] <- length(data2$Longitude)
data.minlon[i] <- min(data2$Longitude)
data.maxlon[i] <- max(data2$Longitude)
data.minlat[i] <- min(data2$Latitude)
data.maxlat[i] <- max(data2$Latitude)
data.mindepth[i] <- min(data2$Depth)
data.maxdepth[i] <- max(data2$Depth)
}
The above section examines each file and determines its length, the minimum and maximum values of the longitude and latitude coordinates, and the minimum and maximum water depth encountered.
In the following code section a histogram is plotted of the number of data points to process in the various files. First the vectors from above are placed in a data frame, a fundamental data structure in R.
Here is an example of a histogram:
# Combine the summary vectors into a data frame and give the columns readable names
data3 <- data.frame(data1$V1, data.length, data.minlon, data.maxlon, data.minlat, data.maxlat, data.mindepth, data.maxdepth)
names(data3) <- c("File.Name", "No..of.Points", "Min.Lon", "Max.Lon", "Min.Lat", "Max.Lat", "Min.Depth", "Max.Depth")
# Render the summary as an interactive HTML table
library(DT)
datatable(data3)
# Histogram of the number of points collected per day
x <- data3$No..of.Points
# x
hist(x)
What I’ve done here is to read a file with the names of the files that contain the valid data obtained for each of the days on which measurements were made, summarize it, present the results as a table in HTML, and plot a histogram of the number of data points collected. The data was read into a data frame and the column names were changed to give them a more obvious meaning.
The graph can be seen above, where x is the number of data points collected on a given day. This is not of much use in itself, except to show how data can be read from a file, put in the right format, and given a graphical representation.
This describes the basic elements of looking at data and what to do with it. Many more complex transformations and renderings are available within R.
One thing to note from the table is that a filename is listed for the data collected on each of the seven days of measurements. Let’s read one of the files. They are text files and can be read with any text editor.
The smallest file is the one from the first day of data taking; the table shows that it has 48,220 entries. Let’s read this file and explore its content by plotting the data. We have looked at the data and determined that it has column headers and that the values are tab-separated. The plot symbol is made smaller than the default using “cex” and the graph is titled with the name of the data file via “main.” A blue grid is added to assist in identifying the coordinates of points. Here is the code:
# Read the first day's file (tab-separated, with a header row)
file2 <- paste("./data/", as.character(data1$V1[1]), sep="")
data4 <- read.table(file=file2, header=TRUE, sep="\t")
# Plot the measurement locations with small symbols, titled with the file name
plot(data4$Latitude, data4$Longitude, cex=0.2, main=data1$V1[1], asp=0.5)
grid(col="blue")
The plot shows all 48,220 points; because there are so many, they overlap and make it look as if there are lines.
All we can say from this exercise is that it appears that we have a set of valid points.
The next step is to show that the points are indeed associated with depths and are contained within the boundary of the lake. Here are the steps we are going to take:
Along the way we’ll introduce various niceties for plots and process and display all of the data.
First, the outline of the lake. The data for the outline resides in one file and consists of Northing/Easting pairs, which are assigned a depth of zero. The original file that we worked with was obtained from the Garrett County Planning and Zoning office. The history of this file could not be traced, although it looked good. A more refined data set, representing the coordinates at an elevation of 2462 ft, was obtained from the “bydown” effort, a surveying task to identify the property that the State offered to lake-adjacent property owners when it purchased the lake from Penelec. The name of this file, for our purposes, is “century2462.txt”. We shall read this file and display the first few lines.
shoreline <- read.table(file="./data/century2462.txt", header=TRUE, sep="\t")
head(shoreline)
## Easting Northing
## 1 637381.1 680304.8
## 2 637399.5 680327.6
## 3 637418.8 680348.6
## 4 637443.9 680375.9
## 5 637465.8 680399.8
## 6 637486.6 680422.7
names(shoreline)
## [1] "Easting" "Northing"
As one can see, the latitude and longitude have already been converted to Northings and Eastings, and I’ve also called for a listing of the names of the data columns. Let’s see what a plot of the data looks like. I drew a thin line through all the points, colored it red, and added a blue grid to assist in identifying locations.
plot(shoreline$Easting, shoreline$Northing, cex=0.1, type="l", col="red", main="DCL Shoreline", asp=1)
grid(col="blue")
Note “asp”. This is the aspect ratio of the plot. I set it to 1 because the x and y values are in the same units. If I don’t specify it, the image is stretched over the width of the page, which does not look right. The plot limits and the aspect ratio can also be set explicitly with the “plot.window” function.
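As a small sketch of that alternative (an aside added here, not part of the original script), one can set up the coordinate system explicitly and then add the shoreline and the axes:

plot.new()
# Set the world coordinates and a 1:1 aspect ratio before drawing anything
plot.window(xlim=range(shoreline$Easting), ylim=range(shoreline$Northing), asp=1)
lines(shoreline$Easting, shoreline$Northing, col="red")
axis(1); axis(2); box()
title(main="DCL Shoreline")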
The next thing is to add the bathymetric points. First they have to be converted to Northing/Easting coordinates. This is actually pretty simple, although it required learning about projections and how to implement them with R. The complete code to transform from lon/lat to Easting/Northing is as follows:
library(rgdal)
# Convert the existing lon/lat data to a SpatialData format
cord.dec <- SpatialPoints(cbind(data4$Longitude, data4$Latitude), proj4string = CRS("+proj=longlat"))
# Transform the data to Easting/Northing in ft
data5<- spTransform(cord.dec, CRS("+init=epsg:2893")) # Units in ft
# Replot our shoreline plot
plot(shoreline$Easting, shoreline$Northing, cex=0.1, type="p", col="red", main="DCL Shoreline", asp=1)
grid(col="blue")
# Add the data to our previous plot with the shoreline
points(data5, col="green", pch=20, cex=0.1)
The next item is to plot the measured depths, color-coded with a sequence of colors varying between red and blue. That gives us some idea of the depth range without having to go via contours, something we’ll do later on.
The way this is done is to treat each group of points of a single color as if it were a separate data set. That also allows us to invoke the “legend” function. So this is how it gets done:
# Simplify the names of the variables
depths <- data4$Depth
eastings <- data5$coords.x1
northings <- data5$coords.x2
# Create a function to generate a continuous color palette
rbPal <- colorRampPalette(c('red','blue'))
#
ncols <- 10
data5$Col <- rbPal(ncols)[as.numeric(cut(depths, breaks = ncols))]
# First plot the shoreline
plot(shoreline$Easting, shoreline$Northing, cex=0.1, type="p", col="green", main="DCL Shoreline", asp=1)
grid(col="blue")
# Add the points and color code them according to depth
points(data5, pch = 20, cex=0.05, col = data5$Col)
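# A possible way to add the legend mentioned above (a sketch added here, not
# part of the original script): label each of the ten color bins with the depth
# range that cut() produced, using the same red-to-blue palette.
depth.bins <- levels(cut(depths, breaks = ncols))
legend("topright", legend = depth.bins, col = rbPal(ncols), pch = 20,
       cex = 0.6, title = "Depth (ft)")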
There are many different ways that one can show data. Since 2011, when I first started this work, I’ve learned about a lot of ways to do the same thing. Here is a script that improves on the above. Let us first show the result and then walk through the components of the R script. Here is the R script and the results that are calculated with it:
# Plotting the raw measurements
library(plot3D)
#========1=========2=========3=========4=========5=========6=========7=========8=========9=========0
# Functions
source("./functions/credit.R")
#========1=========2=========3=========4=========5=========6=========7=========8=========9=========0
Day <- "April12"
file1 <- paste("./data/", Day, "lonlat_clean.txt", sep="")
bath <- read.table(file=file1, header=TRUE, sep="\t")
x <- bath$Longitude
y <- bath$Latitude
z <- -bath$Depth
N <- length(x)
main.text <- paste("Measured Water Depth (ft)\n",
" On Test Day ", Day, ", 2012.\n",
N, " Observations", sep="")
scatter2D(x, y, colvar = z, pch = 16, bty ="n", type ="p", cex=0.2,
xlab="Longitude, degrees West",
ylab="Latitude, degrees North",
clab = c("Water Depth", "(ft)"),
main=main.text)
grid(col="grey", lty=1)
# To put text on the lower right hand side of the plot (like to script name and date)
credit("plot3d_bathymetry_data.R")
#========1=========2=========3=========4=========5=========6=========7=========8=========9=========0
The first thing is to load the needed library, here “plot3D”. Then I load a function, “credit.R”, that is run at the end of the script. I write many scripts and generate many graphs. This function writes the name of the script and the date the script was executed somewhere on the graph. That gives me a clue on where to look for it on my computer in case I want to update it.
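The source of “credit.R” is not shown in this report; a minimal sketch of what such a helper might look like (my assumption, using “mtext” to write in the bottom margin of the current plot) is:

# A guess at ./functions/credit.R (the real file is not shown in this report)
credit <- function(script.name){
# Stamp the script name and today's date in the lower right of the plot margin
label <- paste(script.name, format(Sys.Date(), "%m/%d/%Y"), sep="   ")
mtext(label, side=1, line=4, adj=1, cex=0.6, col="grey40")
}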
Earlier I read file names from a file. Here I specify the date and construct the filename.
I read the data the same way as before, but then I copy the vectors to names that are easier to understand.
“main.text” is the title of the graph. I use the “scatter2D” function to plot the depth measurements. “colvar” automatically assigns colors from a color palette shown in the legend.
The net effect is a similar, if not better, visual display of the depths at the various points.
Let’s do this for all of the data files. We’re going to take a slightly different approach to assigning colors to data points, so that the color scale is the same for the points in all of the files. We will use the “RColorBrewer” library, which defines a number of different palettes.
library(RColorBrewer)
# This adds a column of color values with 10 equally spaced breaks
ncols <- 16
colors <- list(color = colorRampPalette(brewer.pal(11,"Spectral"))(ncols))
# The maximum depth of the lake is about 75 ft so we'll use 80 ft. Create depth/color vector
maxdepth <- 80
depth.color <- seq(from=0, to=maxdepth, by=(maxdepth/ncols))
# The following is a function that allows us to assign a color index to a given depth
vector_index <- function(x, y){
# Given a vector of values "x" and a vector of increasing break points "y",
# return a vector "z", the same length as "x", where z[i] is the index of the
# largest break in "y" that lies below x[i].
z <- vector()
for(i in 1:length(x)){
for(j in 1:length(y)){
if(x[i] <= y[j]) next
z[i] <- j
}
}
return(z)
}
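# Note (an aside added here, not part of the original script): base R's
# findInterval() does essentially the same index lookup in a single vectorized
# call; apart from how exact ties with a break value are handled, the line
#   z <- findInterval(depths, depth.color)
# would give the same result.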
# First plot the shoreline
plot(shoreline$Easting, shoreline$Northing, cex=0.1, type="p", col="green", main="DCL Shoreline", asp=1)
grid(col="blue")
# Add the depth points from each of the files
for(i in 1:n){
filex <- paste("./data/", as.character(data1$V1[i]), sep="")
data6 <- read.table(file=filex, header=TRUE, sep="\t")
# Setting existing coordinate as lat-long system
cord.dec <- SpatialPoints(cbind(data6$Longitude, data6$Latitude), proj4string = CRS("+proj=longlat"))
data7 <- spTransform(cord.dec, CRS("+init=epsg:2893")) # Units in ft
# Copy to simpler names
depths <- data6$Depth
eastings <- data7$coords.x1
northings <- data7$coords.x2
# We now compute the color vector index for each depth, z
z <- vector_index(depths, depth.color)
# We then assign the color code to each depth index
colors <- unlist(colors)
cl <- vector()
for(j in 1:length(z)){
cl[j] <- colors[z[j]]
}
data6$Col <- cl
# Add the points and color code them according to depth
points(data7, pch = 20, cex=0.05, col = data6$Col)
}
One can now clearly see that there is a variation in depth along the length of the lake; the deepest area is near the dam. All of the data is also projected properly and falls within the shoreline as defined by the 2462 ft contour line. By the way, the 2462 ft elevation is the height of the spillway and hence the maximum height of the lake level.
Another observation one can make is that the southern end of the lake is shallower overall, as indicated by the prevalence of red points, something that is generally not much appreciated.
There are all kinds of refinements that can be made to the plots, but we’ll save those for later, although we’ll show here the visual impact of choosing a different color scheme.
Color palettes are very powerful in creating an impression, and the same holds here for showing how depth varies over the lake. The exact same process as above is performed with a different color palette. The code has been simplified: certain vectors are no longer needed, the assignment of the color index has been made into a function, and the result is used directly.
# Map a vector of color indexes "z" to the corresponding color strings in "colors"
vector_transform <- function(colors, z){
colors <- unlist(colors)
cl <- vector()
for(i in 1:length(z)){
cl[i] <- colors[z[i]]
}
return(cl)
}
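# Note (an aside added here): the same mapping can be written without a loop,
# since indexing a character vector by an integer vector is vectorized in R:
#   cl <- unlist(colors)[z]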
# Define the color palette
colors <- list(color = colorRampPalette(brewer.pal(9,"Set1"))(ncols)) # "Set1" has at most 9 base colors
# First plot the shoreline
plot(shoreline$Easting, shoreline$Northing, cex=0.1, type="p", col="green", main="DCL Shoreline and Depth Measurements", asp=1)
grid(col="blue")
# Add the depth points from each of the files
for(i in 1:n){
filex <- paste("./data/", as.character(data1$V1[i]), sep="")
data6 <- read.table(file=filex, header=TRUE, sep="\t")
# Setting existing coordinate as lat-long system
cord.dec <- SpatialPoints(cbind(data6$Longitude, data6$Latitude), proj4string = CRS("+proj=longlat"))
data7 <- spTransform(cord.dec, CRS("+init=epsg:2893")) # Units in ft
# Extract the depth values into a separate vector
depths <- data6$Depth
z <- vector_index(depths, depth.color)
cl <- vector_transform(colors, z)
# Add the points and color code them according to depth
points(data7, pch = 20, cex=0.05, col = cl)
}
As can be seen, the resolution of the map is limited. This is because it is a large lake with many coves. To get a better feel for the level of detail that can be achieved, we would have to examine a specific cove.
What has been shown here is that, with simple data manipulations in R and the bathymetric measurements made by DNR in 2012, one can get a first approximation of the bathymetry of the whole lake. Unfortunately, the lake is so large that the details of a cove have to be worked out on a cove-by-cove basis. That is the subject of another article.
PLV
First Published: 12/9/2016
Last Updated: 4/10/2018