Cloud computing is a great thing! One area where the cloud is especially useful is benchmarking: manufacturers can run their hardware, software, or workloads on a variety of platforms and compare the performance. At Nimbix, we have many customers who do this quite successfully.
Here are some lessons we’ve learned from our customers.
Have an experimental design
- Have a well-articulated hypothesis
- Know what you are testing for: speed, accuracy, cost per solution?
- Design your experiment (that’s what benchmarking is) to be optimized for what you are testing and only for what you are testing. Remove all other extraneous and confounding elements.
- Have a precise method for handling your data and for summarizing your data.
- Know, before you start, what statistics or mathematical methods you will need to use to make and support your arguments.
- Know what success and failure (acceptance and refutation of your hypothesis) look like from your statistical tests.
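As a rough illustration of deciding on the statistical test and the success criterion before running anything, here is a minimal Python sketch of a two-sided permutation test on run times. The timing values, the names `platform_x` and `platform_y`, and the 0.05 threshold are all invented for the example. A permutation test is only one reasonable choice; your own design may call for a different method.

```python
import random
import statistics

def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)  # fixed seed so the analysis is repeatable
    observed = statistics.fmean(a) - statistics.fmean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign measurements to the two groups.
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value

# Hypothetical wall-clock times in seconds, invented for illustration only.
platform_x = [12.1, 11.9, 12.3, 12.0, 12.2, 11.8, 12.1, 12.0]
platform_y = [10.4, 10.6, 10.3, 10.5, 10.7, 10.4, 10.5, 10.6]

ALPHA = 0.05  # acceptance threshold, chosen before the benchmark is run

diff, p = permutation_test(platform_x, platform_y)
print(f"mean difference: {diff:.2f} s, p = {p:.4f}")
print("reject null hypothesis" if p < ALPHA else "fail to reject null hypothesis")
```

Fixing `ALPHA` and the test in code before collecting data makes it harder to adjust the criteria after seeing the results.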
Control for differences
- Drivers, software versions, storage IOPS, and many other factors can all contribute to the performance of a particular application.
- Where these elements cannot be set to parity between platforms, they need to be documented.
- If you are testing hardware, it is imperative that you compare ‘apples’ to ‘apples’ regarding software versions, drivers and accelerators.
- If you have a situation where you have a version or driver limitation, first do the ‘apples’ to ‘apples’ test with the limited driver, then run with the modern driver and report your findings.
- If X and Y are equal using a limited driver, but Y is superior with an upgraded driver that X cannot support, these findings need to be included in your report.
- If you are comparing drivers, you must control for the hardware on which you are running.
- If you are comparing software, you have some choices. The choice you make depends on the argument you are trying to make, but be conscious of your choice.
  - Run optimized for the hardware
  - Run a basic installation
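One way to document these elements is to record an environment description alongside every benchmark run and compare the records between platforms. The Python sketch below assumes hypothetical keys such as `gpu_driver` and `solver` with made-up version strings. In practice you would fill them in from whatever your own stack reports.

```python
import json
import platform

def describe_environment(extra=None):
    """Collect basic facts about the machine running the benchmark."""
    env = {
        "os": platform.platform(),
        "machine": platform.machine(),
        "python": platform.python_version(),
    }
    # Driver, library, and storage details are site-specific, so the caller
    # supplies them (the keys used below are hypothetical examples).
    env.update(extra or {})
    return env

def diff_environments(a, b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Illustrative records; the version strings are invented.
platform_x = describe_environment({"gpu_driver": "1.0", "solver": "2.3"})
platform_y = describe_environment({"gpu_driver": "1.1", "solver": "2.3"})

# Anything printed here is a difference that must be removed or reported.
print(json.dumps(diff_environments(platform_x, platform_y), indent=2))
```

An empty result means the recorded settings are at parity. Anything else belongs in the report next to the measurements.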
Make sure your data is representative or appropriately biased
- In almost all benchmark cases, you run a set of data through some process and take measurements (time, size, etc.) while the process is running. The contents of the data need to be appropriately biased for the point being supported. For example, if you want to say that product X classifies pictures of horses better than product Y, then your data set(s) should contain a variety of horses (Quarter Horses, Clydesdales, Tennessee Walking Horses, etc.). Your test data should be representative of real-world usage of the tool you are testing.
- If you are testing a classification-type application, include material that exercises the full discriminatory power of the tool, and track the different result and error types: true positives, false positives, true negatives, and false negatives.
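Tracking the four result types can be as simple as the Python sketch below. The labels and predictions are invented for illustration and continue the horse example. Note that the test set deliberately includes non-horses, so that false positives and true negatives can occur at all.

```python
def confusion_counts(actual, predicted, positive="horse"):
    """Count true/false positives and negatives for one target class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

# Hypothetical ground truth and classifier output, invented for illustration.
actual    = ["horse", "horse", "zebra", "donkey", "horse", "zebra"]
predicted = ["horse", "zebra", "horse", "donkey", "horse", "zebra"]

counts = confusion_counts(actual, predicted)
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Reporting all four counts, instead of a single accuracy figure, lets a reader see which kind of error each product tends to make.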
Make sure your data can support your claims
- This goes back to proper experimental design. Make sure the claims you make can be substantiated by your data and by the measurements and summary statistics you have collected. Avoid the temptation to over-interpret your data; let your data lead your conclusions, not the other way around.
Contact us about benchmarking on the Nimbix Cloud.