June 1, 2005 | Analytics
Davis, M., R. Smith, B. Dixon, A. Parrish and D. Cordes,
Software: Practice and Experience, Vol. 35, No. 7, June 2005.
This research examined the use of commodity computing hardware, motivated by a dramatic increase in its performance-to-price ratio. The research evaluated the performance of a statistical analysis application on a ten-node off-the-shelf computing cluster. The study had two thrusts: (1) evaluating various network topologies, and (2) minimizing the software modifications required to distribute the application. The general conclusion was that when reuse of existing code is feasible, performance can be dramatically increased by the combined use of parallel computing and commodity components.
April 1, 2005 | Analytics, Motor Vehicles, Traffic Safety
Wang, H., A. Parrish, R. Smith and S. Vrbsky,
Proceedings of the 2005 ACM Symposium on Applied Computing (AI Track), April 2005, pp. 36-41.
This paper explores a data mining process in which the original dataset is first transformed through a variable subset selection process, followed by the application of a machine learning algorithm. A variable ranking technique, called the Sum of Maximum Gain Ratio (SMGR), is applied. This technique computes a score based on the over-representation of attribute values. Essentially, SMGR is the ratio of the number of cases that could potentially be reduced by an effective countermeasure to the total number of cases associated with the over-represented value. SMGR was shown empirically to provide results comparable to those of alternative techniques, with significantly improved runtime performance.
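Based only on the ratio described above, the scoring step might be sketched as follows. This is an illustrative reading of SMGR, not the paper's exact formula; the attribute values and the expected-frequency baseline are assumptions introduced for the example:

```python
from collections import Counter

def smgr_scores(values, expected_freq):
    """For each attribute value, score over-representation as the ratio of
    potentially reducible cases (observed count minus expected count) to the
    total cases carrying that value. Illustrative sketch only."""
    counts = Counter(values)
    total = len(values)
    scores = {}
    for value, observed in counts.items():
        expected = expected_freq.get(value, 0.0) * total
        excess = max(0.0, observed - expected)  # cases a countermeasure could target
        scores[value] = excess / observed       # ratio to all cases with this value
    return scores

# Example: "wet" crashes occur in 30% of cases but only 10% would be
# expected, so roughly two thirds of them are potentially reducible.
crashes = ["wet"] * 30 + ["dry"] * 70
scores = smgr_scores(crashes, {"wet": 0.1, "dry": 0.9})
```

Values that occur no more often than expected score zero, so ranking by this score surfaces only the over-represented attribute values.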
January 1, 2005 | Analytics
Parrish, A., S. Vrbsky, B. Dixon and W. Ni.
Database techniques generally require reading complete rows of data (traditionally called "records") in order to get at a single attribute of interest. Further, if filtering is required (not all records are of interest), an additional computational step is needed on each record to determine whether it qualifies. Transposition of the data enables this to be accomplished with a single read operation, followed by a single filter-pointer operation, producing essentially instantaneous results. This method has proven successful in delivering real-time results when applied to datasets containing millions of records.
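The transposed (column-oriented) layout described above can be sketched as follows. The table, attribute names, and query are illustrative assumptions, not taken from the paper:

```python
# Row-oriented storage: every query must touch whole records.
rows = [
    {"year": 2003, "county": "Walker", "severity": "injury"},
    {"year": 2004, "county": "Shelby", "severity": "pdo"},
    {"year": 2004, "county": "Walker", "severity": "fatal"},
]

# Transposed storage: one contiguous array per attribute, so a query on a
# single attribute reads only that column.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# One pass over one column yields a pointer (index) list; subsequent
# attribute lookups simply follow the pointers into other columns.
matches = [i for i, year in enumerate(columns["year"]) if year == 2004]
severities = [columns["severity"][i] for i in matches]
```

The filter never deserializes full records, which is why a single column scan plus pointer dereferencing can feel instantaneous even on very large tables.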
April 1, 2004 | Motor Vehicles, Traffic Safety
Wang, H., H. C. Chen and A. Parrish.
Proceedings of the 42nd ACM Southeast Regional Conference, 2004, pp. 375-378.
This research built on the foundation of the Critical Analysis Reporting Environment (CARE), which was developed at the University of Alabama to mine crash reports submitted by investigating officers in the field. The research extended CARE's capabilities by developing neural network algorithms to automatically learn potentially problematic attributes over time. The system was piloted and tested using records from Walker County, Alabama.
This paper presents an early (2003) review of CARE that was published in IEEE Computer, the flagship publication of the IEEE Computer Society. The major points made in the paper include:
That CARE's early success was due to two factors: (1) its simplicity of use, enabling safety practitioners with basic computer literacy to obtain information from it with minimal training; and (2) its efficiency, providing virtually instantaneous presentation of results for even the largest of databases (several hundred thousand records).
That CARE had been implemented in a number of states.
That CARE had received the 1995 NHTSA Administrator’s Award for innovation.
That CARE was being applied beyond highway safety, notably to mine databases at the Federal Aviation Administration (FAA) and NASA.