How accurate are body composition analyzers?
A body composition analyzer can provide you with results for body water, fat, muscle, and more! But how do you know the results are accurate?
For weight scales, it's relatively straightforward! When professionally trained technicians calibrate or inspect a scale for weight accuracy, they use "known weights" that have been certified by accredited laboratories, and confirm that the scale returns the weight it should. For example, if you place a known weight of 20 kg on the scale, then it should display a weight of 20 kg. If the displayed weight differs noticeably from the known weight, then you know the scale requires calibration.
For body composition, it's a bit more complicated! First of all, unlike for weight, there is no easily referenced "truth" you can compare your results to in order to confirm accuracy. The various methods of assessing body composition, such as Dual-Energy X-Ray Absorptiometry (DXA), Underwater Weighing, Air Displacement Plethysmography, Magnetic Resonance Imaging (MRI), Computed Tomography (CT), and Bioelectrical Impedance Analysis (BIA), are all forms of calculation or estimation.
Among these methods, several have been accepted as "gold standards". While methods such as MRI or CT are also used, DXA is one of the most common methods used for comparison, which is why the vast majority of validation studies will compare BIA with DXA to determine how similar the results are.
By comparing a device's results to those measured by a "gold standard", you can determine how "accurate" your device is - the more similar the results, the better. This is done via statistical analysis - generally, via (1) Level of Correlation and (2) Limit of Agreement.
Level of Correlation
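This tells you how closely related the results from two devices are. It is expressed as an "r" value, and in this case a higher value (closer to 1) is desired, since you're trying to replicate the "gold standard". So for example, in a validation study, you may see that a BIA device's fat percentage results had an r=0.96 correlation with DXA's fat percentage results, which means the two sets of results are highly correlated. Note that correlation only shows that the two sets of results rise and fall together; it does not by itself show that the values match, which is why it is considered together with the Limit of Agreement.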
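To make the "r" value concrete, here is a minimal sketch of how a Pearson correlation could be calculated for paired results. The function name pearson_r and the fat percentage values are made up for illustration; they are not taken from any validation study or from any device's software.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient (r) of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up fat percentage results for six subjects (illustration only).
dxa_fat = [18.2, 25.1, 31.4, 22.8, 35.0, 28.3]
bia_fat = [17.5, 24.0, 32.1, 21.9, 33.8, 29.0]

r = pearson_r(dxa_fat, bia_fat)
print(f"r = {r:.2f}")

# A device that always reads 5 points too high is still perfectly correlated.
offset_fat = [value + 5.0 for value in dxa_fat]
r_offset = pearson_r(dxa_fat, offset_fat)
print(f"r with constant +5 offset = {r_offset:.2f}")
```

The second calculation shows why a high r value is not enough on its own: a constant offset leaves the correlation untouched, even though every individual result is wrong by the same amount.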
Limit of Agreement (LoA)
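This is based on the Standard Deviation of the differences between a device's results and the "gold standard" results, and the purpose is to show how far an individual result is likely to fall from the "gold standard". In the commonly used Bland-Altman approach, the limits are the mean difference plus and minus 1.96 standard deviations, a range expected to contain about 95% of the differences. A wider Limit of Agreement means that results are more likely to deviate farther from the "gold standard", so the narrower the better.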
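As a rough illustration, the sketch below calculates limits of agreement in the Bland-Altman style (mean difference plus and minus 1.96 standard deviations of the differences). This convention is an assumption here, since individual studies may report agreement differently, and the function name limits_of_agreement and the sample values are made up for the example.

```python
import statistics


def limits_of_agreement(device, reference):
    """Return (bias, lower limit, upper limit) for paired results.

    Assumes the Bland-Altman convention: mean difference +/- 1.96 SD.
    """
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    # Difference for each subject: device result minus reference result.
    diffs = [d - r for d, r in zip(device, reference)]
    bias = statistics.mean(diffs)   # average over- or under-estimation
    sd = statistics.stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up fat percentage results for six subjects (illustration only).
dxa_fat = [18.2, 25.1, 31.4, 22.8, 35.0, 28.3]
bia_fat = [17.5, 24.0, 32.1, 21.9, 33.8, 29.0]

bias, lower, upper = limits_of_agreement(bia_fat, dxa_fat)
print(f"bias = {bias:+.2f}, limits = {lower:+.2f} to {upper:+.2f} (percentage points)")
```

The bias shows whether the device tends to read higher or lower than the reference on average, while the distance between the lower and upper limits shows how much individual results scatter around that average.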
These are the key points to look for in a validation study. That said, although the natural inclination may be to find a simple answer, such as "Charder's devices have an r=0.96 correlation with DXA", this isn't necessarily the complete story. A device may show good results in one study, with high correlation and low LoA. But if all the subjects in the study were men, do we know for certain that women would receive equally accurate results? If the subjects all had normal BMI values, is it a given that results would be equally accurate for subjects with high BMI?
Unlike weight measurement, which is more or less "settled" with a "100% correct answer", body composition is an ongoing science in which researchers are still improving algorithms and accuracy. While it's important to validate the accuracy of a device by comparison with a "gold standard", it's perhaps just as important for scientists to keep improving calculation algorithms through continued research, so that results become increasingly accurate.