Archive | November, 2014

Trends in Atmospheric NO and NO2 Concentrations

As part of my Data Analysis in R course on Udacity, I’m publishing the results of an exploratory data analysis (EDA) I did on atmospheric nitric oxide and nitrogen dioxide concentrations somewhere in Cambridge (UK).

You can find the datasets here: http://www.airqualityengland.co.uk/local-authority/data?la_id=51

I plotted the levels of NO in the atmosphere over the span of several days and noticed that they tend to follow a daily cycle. The levels probably go down during the night and go back up when it warms up during the day:

[Figure Rplot02: NO concentration against hours since start]
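One way to check the daily cycle beyond eyeballing the plot is to average NO by hour of day. The sketch below assumes a `timestamp` column in seconds since the epoch, like the one built in the script at the end of this post. The helper `hourly_profile` and the `toy` data frame are made up for illustration and are not part of the original analysis.

```r
# Hypothetical helper: mean NO for each hour of the day (0-23).
# Assumes `timestamp` holds seconds since the epoch, so the hour
# obtained here is the UTC hour, not necessarily local clock time.
hourly_profile <- function(df) {
  df$hour_of_day <- (df$timestamp %/% 3600) %% 24
  # The formula interface drops rows with a missing NO value
  aggregate(NO ~ hour_of_day, data = df, FUN = mean)
}

# Synthetic three-day series with a built-in 24-hour cycle
# (illustration only, NOT the Cambridge measurements)
toy <- data.frame(timestamp = seq(0, by = 3600, length.out = 72))
toy$NO <- 20 + 10 * sin(2 * pi * (toy$timestamp / 3600) / 24)

profile <- hourly_profile(toy)
print(profile)
```

If the real data follow a daily cycle, the profile from `hourly_profile(dataset)` should show a clear rise and fall across the 24 rows rather than a flat line.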

Viewed on a larger scale, the daily mean levels of NO show no noticeable trend:

[Figure Rplot01: daily mean NO concentration against days since start]
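A rough numeric check on the "no noticeable trend" reading is the slope of a straight line fitted to the daily means. The sketch below assumes a data frame with `day` and `mean_NO` columns, like `dataset.by_day` in the script at the end of this post. The helper `trend_slope` and the two toy data frames are hypothetical and only there to show the idea.

```r
# Hypothetical helper: slope of a least-squares line through the daily
# means, in concentration units per day. lm() drops rows with NA values.
trend_slope <- function(by_day) {
  fit <- lm(mean_NO ~ day, data = by_day)
  unname(coef(fit)["day"])
}

# Made-up series for illustration (NOT the Cambridge measurements):
# one that wobbles around a constant level, one that rises steadily
flat   <- data.frame(day = 0:9, mean_NO = 20 + rep(c(1, -1), 5))
rising <- data.frame(day = 0:9, mean_NO = 20 + 2 * (0:9))

flat_slope   <- trend_slope(flat)
rising_slope <- trend_slope(rising)
print(c(flat = flat_slope, rising = rising_slope))
```

A slope close to zero relative to the day-to-day scatter would be consistent with the plot; `summary()` on the fitted model gives a standard error for judging that.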


Plotting the levels of NO against the levels of NO2 reveals a noticeable positive correlation:

[Figure Rplot: NO2 concentration against NO concentration, with a smoothed fit]

The Pearson product-moment correlation between NO and NO2 concentrations is 0.6968, which supports this observation.
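For readers unfamiliar with the statistic, Pearson's r is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it from that definition and compares it with R's built-in `cor()`. The function `pearson_r` and the toy vectors are made up for illustration; the values are not taken from the dataset.

```r
# Pearson's r from its definition, using only pairs where both values
# are present (hypothetical helper, for illustration)
pearson_r <- function(x, y) {
  ok <- complete.cases(x, y)
  x <- x[ok]
  y <- y[ok]
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

# Made-up concentrations (NOT the Cambridge measurements)
no_toy  <- c(5, 12, 30, 44, 8, NA, 21)
no2_toy <- c(18, 25, 35, 52, 30, 40, 28)

r_manual  <- pearson_r(no_toy, no2_toy)
r_builtin <- cor(no_toy, no2_toy, use = "complete.obs")
print(c(manual = r_manual, builtin = r_builtin))
```

The two numbers should agree; r ranges from -1 to 1, with positive values meaning the two concentrations tend to rise together.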

Here’s the R code that produced these plots, for those of you who are interested:


# Data source: http://www.airqualityengland.co.uk/local-authority/data?la_id=51

library(ggplot2)
library(dplyr)
library(grid)
library(gridExtra)

# Load the three CSV exports and stack them into one data frame
ds1 <- read.csv("2014-05-07-141107012512.csv")
ds2 <- read.csv("2014-08-05-141107012512.csv")
ds3 <- read.csv("2014-11-04-141107012512.csv")
dataset <- rbind(ds1, ds2, ds3)


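# Parse the end date and time of each reading into seconds since the epoch;
# rows that fail to parse become NA and are dropped
dataset$timestamp <- as.numeric(strptime(paste(dataset$End.Date, dataset$End.Time), format = "%d/%m/%Y %H:00:00"))
dataset <- dataset[!is.na(dataset$timestamp), ]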

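# Hours elapsed since the first observation
dataset$hour <- dataset$timestamp / 3600
dataset$hour <- dataset$hour - min(dataset$hour)

# Whole days elapsed since the first observation's day; floor() keeps each
# day's readings together, whereas round() would split days at noon
dataset$day <- floor(dataset$timestamp / 86400)
dataset$day <- dataset$day - min(dataset$day)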

# NO by hour: first 200 hours (top panel) and first 500 hours (bottom panel)
sp1 <- ggplot(aes(x = hour, y = NO), data = dataset) +
 ylim(c(0, 150)) +
 geom_line(color = "#334455") +
 scale_x_continuous(breaks = seq(0, 200, 24), limits = c(0, 200)) +
 labs(x = "Hour Since Start", y = "Nitric Oxide Concentration")

sp2 <- ggplot(aes(x = hour, y = NO), data = dataset) +
 ylim(c(0, 150)) +
 geom_line(color = "#334455") +
 scale_x_continuous(breaks = seq(0, 500, 24), limits = c(0, 500)) +
 labs(x = "Hour Since Start", y = "Nitric Oxide Concentration")

grid.arrange(sp1, sp2)



# Mean NO per day, ignoring missing readings
dataset.by_day <- dataset %>%
 group_by(day) %>%
 summarise(mean_NO = mean(NO, na.rm = TRUE))

sp1 <- ggplot(aes(x = day, y = mean_NO), data = dataset.by_day) +
 ylim(c(0, 100)) +
 geom_line(color = "#334455") +
 scale_x_continuous(breaks = seq(0, 200, 7), limits = c(0, 50)) +
 labs(x = "Days Since Start", y = "Mean NO Concentration")

sp2 <- ggplot(aes(x = day, y = mean_NO), data = dataset.by_day) +
 ylim(c(0, 100)) +
 geom_line(color = "#334455") +
 scale_x_continuous(breaks = seq(0, 200, 7), limits = c(0, 200)) +
 labs(x = "Days Since Start", y = "Mean NO Concentration")

grid.arrange(sp1, sp2)




sp1 <- ggplot(aes(x = NO, y = NO2), data = dataset) + 
 xlim(c(0, 150)) + 
 ylim(c(0, 100)) +
 geom_point(alpha = 1/5, position = position_jitter(width = 0.8, height = 0.8)) +
 geom_smooth() +
 labs(x = "Nitric oxide concentration", y = "Nitrogen dioxide concentration")

grid.arrange(sp1)

with(dataset, cor.test(x = NO, y = NO2)) # 0.6968017