You’ve collected your measurement data, but how do you easily make sense of it? How can you simplify the post-processing of your data and start analyzing it sooner? In this article, I’ll explain how adding some code to your CRBasic program can save you from a post-processing headache later.
As a graduate student, I remember my colleagues often talking about the headache of post-processing their data before they could even begin analyzing it.
As you might imagine, my colleagues and I found all of these post-processing methods cumbersome.
For our experiments in graduate school, we were encouraged to keep detailed notes on dates and times, as well as when an experiment moved from one treatment to the next. I understood the importance of keeping good notes, but I wondered if there was a way to save the information in the data table defined in CRBasic.
Most statistical software packages rely on data frames or input files that have a minimum number of columns and large quantities of rows or records. Too often, however, we format our raw data tables without considering the analysis we hope to complete after the data is collected.
Data tables typically include the default time stamp plus record number, and then we just start adding sensor measurements and system voltages. But an experiment normally includes more: more treatments, more repetitions with multiples of the same sensor, and more information used to group the data for exploratory data analysis as well as t-tests, ANOVAs (analyses of variance), or regressions.
So, how can you store information like this in your data table for easy reference? I’ll step you through an example of how to create and manipulate categorical variables in CRBasic. These variables can be edited while data is being collected, and their values are stored in the data table alongside the measurements.
Note: Even if you’re not familiar with water quality and turbidity measurements, I think you’ll see in this example how easy it is to make your data table more useful.
In this example, the following question was posed: How does optical turbidity vary at six known concentrations (0, 10, 50, 100, 250, and 1000 mg/L)? To answer this question, data were collected using three OBS500 turbidity meters. The details of the experiment follow:
The first round of data collection produced data tables similar to the one shown in Table 1. The reported values included the time stamp, record number, raw backscatter and sidescatter in millivolts, and backscatter and sidescatter readings in FBU (Formazin Backscatter Unit) and FTU (Formazin Turbidity Unit), respectively.
Table 1. Raw data collected during the sediment study.
For this experiment, the time stamp was not important because the measurements were not changing with time but with the known sediment treatments (concentration and mineral). With the original data table setup in the CRBasic program running on my datalogger, I had to rely on my detailed notes to post-process the data and figure out which sensor was used, what concentration was measured, and what mineral was used to create the suspension.
After making a few changes to the CRBasic program, I was able to generate a more detailed data table (Table 2) that includes categorical information defined as the following:
Table 2. Data collected for the natural sediment from sensor A at a concentration of 0 mg/L.
To transform my data table from Table 1 to Table 2, I only needed to add a few lines of code to the CRBasic program. Here’s how you can do it too:
Dim SensorA As String * 16
Dim SensorB As String * 16
Dim SensorC As String * 16
Public Concentration As Long

DataTable (SamplesA,1,10000)
  Sample (9,obs500_measA(),IEEE4)
  Sample (1,SensorA,String)
  Sample (1,Concentration,FP2)
EndTable

DataTable (SamplesB,1,10000)
  Sample (9,obs500_measB(),IEEE4)
  Sample (1,SensorB,String)
  Sample (1,Concentration,FP2)
EndTable

DataTable (SamplesC,1,10000)
  Sample (9,obs500_measC(),IEEE4)
  Sample (1,SensorC,String)
  Sample (1,Concentration,FP2)
EndTable
Define the sensor names within the program. Because we declared the sensor variables as “Dim” in our program, they will not show up in the public table. So, after your “BeginProg” statement, you will need to assign the name you want to each of the three sensors.
Tip: Remember that each sensor variable was declared as a 16-character string, so each name must be shorter than 16 characters.
In our example, A, B, and C were used. However, you could use the serial numbers or model numbers of your sensors instead. These variables do not need to be public because, unlike Concentration, their values do NOT change with treatment. I collected 200 measurements per sensor in each treatment.
BeginProg
  SensorA="A"
  SensorB="B"
  SensorC="C"
Now, every time a data table is called, the sensor string and concentration are added to the record as independent columns in the data table. It’s that easy!
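To see how the pieces fit together, here is a minimal sketch of a complete program for sensor A only. It is not the full program from the study: the declaration of obs500_measA() as a nine-element array, the 5-second scan interval, and the placement of the measurement step are my assumptions for illustration, and the actual OBS500 measurement instruction is left as a comment for you to fill in from your sensor's manual.

```crbasic
'Minimal sketch for one sensor; SamplesB and SamplesC follow the same pattern.
Public obs500_measA(9)          'assumed: nine values, matching Sample (9,...) below
Public Concentration As Long    'edit this value when the treatment changes
Dim SensorA As String * 16      'fixed label, hidden from the public table

DataTable (SamplesA,1,10000)
  Sample (9,obs500_measA(),IEEE4)
  Sample (1,SensorA,String)
  Sample (1,Concentration,FP2)
EndTable

BeginProg
  SensorA = "A"
  Scan (5,Sec,0,0)              'assumed 5-second scan interval
    'The OBS500 measurement instruction that fills obs500_measA() goes here.
    CallTable SamplesA          'stores the measurements plus both labels
  NextScan
EndProg
```

With a program like this running, you would change Concentration in the public table (for example, from 0 to 10) each time you move to the next treatment, and every record stored afterward carries the new value.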
I hope this example helps you understand how some simple additions to your CRBasic program code can save you from a post-processing headache later. Feel free to post your questions or comments below.