Quality of Sensor Data: A Study Of Webcams

noisy GPS data

Fig 1: Noisy GPS data: allegedly running across a lake.

For a while, I’ve been wondering what the data quality of sensor data is. Naively – and many conversations that I had on this went along that route – it can be assumed that sensors always send correct data unless they fail completely. A first counter-example that many of us can relate to is GPS, e.g. integrated into a smartphone. See the figure to the right which visualises part of a running route and shows me allegedly running across a lake.

Now, sensor does not equal sensor, i.e. it is not appropriate to generalise “sensors”. Quality of measurements and data varies a lot on the actual measure (e.g. temperature), the environment, the connectivity of the sensor, the assumed precision and many more effects.

In this blog, I analyse a fairly simple, yet real-world setup, namely that of 3 webcams that take images every 30 minutes and send them via the FTP protocol to an FTP server. The setup is documented in the following figure that you can read from right to left in the following way:

  1. There are 3 webcams, each connected to a router via WLAN.
  2. The router is linked to an IP provider via a long-range WIFI connection based on microwave technology.
  3. Then there is a standard link via the internet from IP provider to IP provider.
  4. A router connects to the second IP provider.
  5. The FTP server is connected to that router.
The connection between the webcams on th right to the FTP server on the left.

Fig 2.: The connection between the webcams on the right to the FTP server on the left.

So, once an image is captured, it travels from 1. to 5. I have been running this setup for a number of years now. During that time, I’ve incorporated a number of reliability options like rebooting the cameras and the (right-hand) router once per day. From experience, steps 1. and 2. are the most vulnerable in this setup: both, long-range WIFI and WLAN, are liable to a number failure options. In my sepcific setup, there is no physical obstacle or frequency polluted environment. However, weather conditions are most likely to be the source of distortion, like widely varying humidity and temperature.

So, what is the experiment and what are the results? I’ve been looking at the image data sent over the course of approx. 3 months. In total, around 8000 images were transmitted. I counted the successful (fig 3) vs the unsuccessful (fig 4) transmissions. I did not track the images that completely failed to be transmitted, i.e. that did not reach the FTP server at all and therefore did not leave any trace. 5.3% of the images were distorted (as in fig 4) or every 19-th image failed to be transmitted correctly. In addition, that rate was no constant (e.g. per week) but there were times of heavy failures and times of no failures.

Successfully transmitted image.

Fig 3: Successfully transmitted image.

Distorted image.

Fig 4: Distorted image.

This is an initial and simple analysis but one that matches real-world conditions and setups pretty well and is therefore no artificial simulation. In the future, I might refine the analysis like counting non-transmissions too or correlating the quality with temperature, humidity or other potential influencing factors.

You can follow me on Twitter via @tfxz.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s