Manuel Suarez was kind enough to provide us with some more training data, most of which comes from the Milwaukee area. We specifically asked for data that would be harder to classify than the cut-and-dried sweeps in our original training dataset.
The results of the first experiment confirm that this data is harder to classify. The Suarez dataset consists of 22 sweeps, and a neural network trained only on the original Diehl dataset classified them with an accuracy of just 35%. As this is worse than guessing, it would seem there are elements in this dataset that the classifier must be exposed to during training.
Training and validating on the Suarez dataset alone produced better results, with an accuracy of 64%. Training and testing on both datasets combined absorbed some of this variability and produced a mean classification accuracy of 84.5% with a 95% confidence interval of (79%, 90%).
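For readers who want to report a figure of the same kind, here is a minimal sketch of how a mean accuracy and its 95% confidence interval can be computed from repeated train/test runs. It is not the procedure used for the numbers above, which is not described here. It assumes a normal approximation, the function name `mean_confidence_interval` is our own, and the accuracies in the example are made-up placeholders rather than results from these experiments.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(accuracies, confidence=0.95):
    """Return (mean, lower, upper) for a list of per-run accuracies.

    Uses a normal approximation: mean +/- z * (sample std dev / sqrt(n)).
    With only a handful of runs, a Student's t interval would be wider.
    """
    n = len(accuracies)
    if n < 2:
        raise ValueError("need at least two runs to estimate a spread")
    m = mean(accuracies)
    # Standard error of the mean, from the sample standard deviation.
    sem = stdev(accuracies) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return m, m - z * sem, m + z * sem


if __name__ == "__main__":
    # Hypothetical accuracies from repeated runs (placeholders, not real results).
    runs = [0.80, 0.90, 0.85, 0.82, 0.88]
    m, low, high = mean_confidence_interval(runs)
    print(f"mean accuracy {m:.1%}, 95% CI ({low:.1%}, {high:.1%})")
```

The interval narrows as the number of runs grows, so a spread as wide as (79%, 90%) would be consistent with either a small number of runs or a lot of run-to-run variation.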