
Elephant Shaped Dynamometers (Part One)

Dec 8, 2024

11 min read





[This is the picture ChatGPT gave me of an elephant-shaped dynamometer. It is completely ridiculous and not what I wanted, but I actually like it... why on earth is it wielding a sword?! Anyway, this blog itself is not AI generated and recaps some of the realisations I have had with dynamometry over the years. Hope it is helpful.]


I have a complicated relationship with dynamometry.


It's a little bit similar to how I feel about Thor: Love and Thunder... On the one hand, I love the concept and overarching storyline, but at the same time I feel a little let down. Maybe I was expecting it to do something magnificent like Thor: Ragnarok and leave me in awe at just how impressive it was, but it missed on more than a few occasions, which left just a little bit of a sour taste in my mouth. It was the forced jokes that probably did it, and I feel like Zeus could have been a more powerful or threatening presence to feed into future storylines... BUT ANYWAY.


This echoes some of the feelings I have about dynamometry: it's great, but also not, at the same time. Shamefully, it took about five-ish years into my career to actually know what it was (i.e. more than the grip strength test that every NHS department seems to have... just 'cause). For years I would just stare at this massive machine in the corner of the department and not even care what it was. Isokinetic dynamometer? Never heard of it, mate.


Picture: Me showing off how much f****** force we can measure (I'm still proud)


When I finally figured out that an isokinetic dynamometer is considered the gold standard for strength testing (the reasons given for this are a bit ambiguous), I was more than a bit confused as to how we had one in an NHS department, considering they cost about £50,000, and then absolutely buzzing that we had something that could provide actual, objective data! Finally, something I could hang my hat on and actually use to guide my rehabilitation, rather than judging how wonky someone's knee was on a single-leg squat or juggling in my head whether their strength was a 4+ or 4- on the Oxford grading scale (it's never a 3+, ever).


Fast forward to my lower limb rotation, and I was asked to complete the testing for the department. People would book their patients onto my list, most commonly soft tissue knee reconstructions, trauma, and the odd person with osteoarthritis or PFP. I loved it, patients loved it, and I became a huge advocate for it. Why wouldn't you want a number to show you how far off, or how close to, your end goal you are?


I was so convinced that dynamometry was the best thing ever. And I was wrong. Don't misunderstand me, I still love dynamometry (extend that to any force measurement) for the right reasons, at the right time, for the right person. But I don't think that is how it is commonly being used, and its popularity is rising possibly without full consideration of how it can best be applied.


If anyone reading this is rattled by these thoughts so far (except if it's by my take on Thor: Love and Thunder), then I think this is especially important for you. Take a second and think about why you are taking the position you are. As with many things, I was never taught any dynamometry in my formal education, and I think it is the same for a lot of people. Therefore we get a lot of our information from social media, and overwhelmingly this has been 'If you aren't testing, you're guessing'... which is just how dogma starts if we don't analyse our stance.


[Dogma: a principle or set of principles laid down by an authority as incontrovertibly true. Think knee valgus being akin to the devil, or mistiming of the transversus abdominis being the holy grail of low back pain, if you're looking for examples.]


Let's take a step back. The aim of this blog isn't to denounce dynamometry or measuring force; I actually have a fairly strong stance on this, and it is very pro-testing.


My aim is to help people at the start of their journey of clinical reasoning with measuring force, and maybe give you a few shortcuts. For this blog we will focus only on the bare essentials, without which we cannot hope to proceed in any meaningful way.


And I am not claiming to know everything about it, because I certainly don't, but I do have a lot of lived experience, and I completed my Masters by Research measuring quads strength after ACL reconstruction in NHS patients. Right, I think it's about time for another anecdote...





'I want to look at normative data for shoulder testing, as there isn't much out there,' I said to Jo Gibson, who was my supervisor at the time. It's only on reflection now that I realise how naive I must have sounded to someone as smart as Jo (of course she didn't let this show, as she is a great teacher and leader). I had no clue about what was important when it came to testing, and it showed. 'That's great, but maybe we need to look at the clinical utility of the machine as a first line?'


I didn't know what that meant, in all honesty. It took a number of conversations to nail down what our actual research question would be, and initially I was a little disappointed: utility of the isokinetic dynamometer meant going back to fundamentals like reliability and validity, not changing the world forever by giving the people the normative data they need to track progress! Yawn. But despite my naivety, I have such huge respect for Jo and her knowledge and intuition that I was more than happy to go down this road; it must be important stuff for her to suggest it. (We got a small research bursary for this but ultimately didn't end up proceeding with the project.)


In the time waiting for the grant to clear, I spent a fair number of hours playing around with the machine we were going to use (somehow we also had another machine that acted as an isokinetic dynamometer, the PrimusRS; seriously, how have we afforded these?!). Have a look at me ripping into my podcast co-host James (@thenerdyphysio) for his effort in internal/external rotation testing at 90 degrees of abduction. (Ignore the mountain of walking aids; our gym was a storage closet during COVID.)




Hopefully, you will be able to see where I am going with this. But even if you are, stick with me, because this is such an important point that I didn't realise at the time.


Yes, James was weak (no surprises there) but actually, I'm the laughing stock of this video. It's my job as the tester to make sure I get clean data that can actually tell me something, and if this video is anything to go by, then I spectacularly failed at this.


Reliability is a core concept in measurement. Essentially, it is the extent to which a measurement is free from error. In clinical practice, this means that if you tested the same construct (e.g. quads peak force/torque) a few times, under the same conditions and not that far apart, high reliability would mean the score not changing an awful lot. Of course it will change a bit; nobody is going to hit the same scores all of the time, but they should be similar.


If they aren't similar, then there is no f****** point in doing it.


That's because I wouldn't know whether an improvement or decrease in score actually reflects anything to do with my patient, or just an incredibly noisy measurement. One of the key things about reliability is that the test needs to be completed in the same way every time. This makes sense, because otherwise you are comparing apples and oranges: testing a muscle in mid-range is likely to give different measurements than at end of range. Trunk angle, thigh lift, where and how the strap is attached will all play a role in the result you get at the end. If you aren't being consistent with that, you may as well guess the score.
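One practical way to force that consistency is to write the set-up down as structured data rather than relying on memory. As a minimal sketch (the field names and values here are my own invention, not taken from any machine's software), something like this saved at the first session and reloaded at every retest would go a long way:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class KneeTestSetup:
    """Everything needed to reproduce an isokinetic knee test position (illustrative fields)."""
    seat_back_angle_deg: float      # trunk angle against the backrest
    seat_depth_notch: int           # machine-specific seat position index
    dynamometer_height_notch: int   # axis aligned with the lateral femoral condyle
    shin_pad_notch: int             # resistance pad position on the shank
    straps: tuple                   # e.g. ("chest", "pelvis", "thigh")
    range_of_motion_deg: tuple      # tested arc, e.g. (90, 0) flexion to extension
    angular_velocity_deg_s: int     # e.g. 60 degrees per second

# Record the set-up at the first session...
setup = KneeTestSetup(85.0, 4, 7, 3, ("chest", "pelvis", "thigh"), (90, 0), 60)
with open("patient_123_knee_setup.json", "w") as f:
    json.dump(asdict(setup), f, indent=2)

# ...and reload it before every retest so the position is identical.
with open("patient_123_knee_setup.json") as f:
    print(json.load(f))
```

Isokinetic machines store this electronically for you (more on that later); with pull or push dynamometry you have to do it yourself, and a paper form works just as well as code.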


Knowing how reliable a measure is also gives us important information on the minimal detectable change of a construct (let's go with quadriceps again). How do I know if the improvement in someone's peak force output is because they are actually able to produce more force, as opposed to the expected variation in scores?


'Great, you have improved by X amount!' If it's the actual testing we are bothered about, and not contextual effects, then we should be able to say whether that improvement reflects true change or not. This is where papers like this come into play, and it is exactly the train of thought Jo was trying to get me onto. See the table below to see what I mean.



This is the table from the paper referenced above. If I am testing quadriceps LSI at 60 degrees per second isokinetically, then I know that the difference between testing sessions needs to be more than 8.9% to reflect true change. And this is in a healthy cohort, not people with a history of knee injury!
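If you're wondering where a number like that 8.9% comes from, it drops straight out of the reliability statistics. The standard error of measurement is SEM = SD × √(1 − ICC), and the minimal detectable change at 95% confidence is MDC95 = 1.96 × √2 × SEM. A minimal sketch in Python (the SD and ICC values below are made up for illustration, not taken from the paper):

```python
import math

def mdc95(sd: float, icc: float) -> float:
    """Minimal detectable change at 95% confidence.

    sd:  between-subject standard deviation of the measure
    icc: test-retest reliability (intraclass correlation, 0-1)
    """
    sem = sd * math.sqrt(1 - icc)      # standard error of measurement
    return 1.96 * math.sqrt(2) * sem   # sqrt(2) because both tests carry error

# Illustrative numbers only: say LSI has SD = 10% and ICC = 0.90.
threshold = mdc95(sd=10.0, icc=0.90)
print(f"MDC95 = {threshold:.1f}% LSI")  # ~8.8% with these made-up inputs

# So a patient improving from 72% to 78% LSI (+6%) is within measurement
# noise; an improvement to 83% (+11%) would exceed the MDC.
```

The point isn't the arithmetic; it's that without a published ICC and SD for your exact test and set-up, you have no threshold to judge change against at all.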


Go back now and look at that video of the shoulder testing. Just how devastating could your feedback be if you were marking how I messed it up?


The fact that he is standing, not using any straps to isolate the shoulder, and physically leaning into the movements still haunts me when I think back to it. I'd like to say that this example was just me messing about, and I actually think it was, but I doubt my proper attempts were much better (all I did differently was bring in a chair to have him sit down).


This presents me with a conundrum... was I that bad at testing all the knees on my lower limb rotation, too?



The answer is... probably.


The reason I think isokinetic machines (in the picture above, the HUMAC Norm) could be gold standard is the ability to accurately measure and electronically store the positions someone was tested in, plus the copious straps and comfortable things to push into. With that, it should be very easy to replicate the position in future tests. The actual dynamometer part is either accurate or it isn't, just like a pull or push dynamometer.


It is the set-up that matters, not the equipment.


I didn't quite grasp this on my lower limb rotation, and I distinctly remember changing the testing position between time points because I thought the knee lined up better with the axis of the dynamometer. Which means that a lot of the time the patient would be in a different position compared to their previous test.


Did this matter? It's actually a really interesting question that I am going to answer in two parts.


Firstly, I am going to say: I don't know. It probably really mattered for some people, if they were in a completely different position, and maybe didn't matter that much when the adjustments I decided to make were only relatively small.


I might have told people they had no improvement when, tested in the same position, they might have; and I probably told people they had improved when actually it was just measurement error and a different set-up. If we had rock-solid data that quadriceps strength has a strong causal role in future injury (we don't), then it could be grounds for disciplinary action.


The concept is a bit like scanning a tumour from different coordinates, seeing that it looks smaller, and then writing the clinical report that the mass is reducing in size and responding to treatment compared to previous scans. It might just be that the different view doesn't capture the entirety of the tumour, rather than the tumour actually being smaller...


Okay, it's nowhere near as drastic as that. But in principle it's similar, and you could apply it to any type of measurement: radiographers always take jabs at how X-rays get shot at different angles or projections, which makes it very difficult to tell if there are any interval changes. It's hard to say whether there are or aren't without reliable measures, especially for fine-tuning on subtler things like degenerative changes or lesion features, etc., rather than gross things like displaced fractures.


Secondly, I will say that it didn't matter for a lot of people, because the role strength testing played for them was contextual: a communication aid. It's much easier to see a deficit than to have it explained in complicated terms by a healthcare professional.


This isn't to be downplayed. At the end of the day, if all dynamometry gives us is a clear communication aid for patients, then it can still definitely be worth it with the right context and clinical reasoning. But if that's what we are using it for, then let's be clear on that and not try to twist it into something it's not!


Wrapping it up


With my retrospectoscope, I can see exactly what Jo was talking about when I was trying to plan a research question with her. It might seem boring, it might be confusing at first, but if we skip this step of reliability and validity then we are in serious danger of another faulty clinical reasoning fiasco, of the kind we have been subject to many times in the past.


This is one of the elephants in the room when it comes to dynamometry. Before we go off clinically applying it and basing communication and treatment decisions on it, let's recognise the fundamentals of measurement and start on solid ground.


Here is the minimum we need to do before we can even go on to discuss clinical applications.


Do it properly, and don't get too attached:


  1. Make sure you read up on the reliability and validity of the specific type of test, and the metric (e.g. peak force), you are choosing to use.

  2. The set-up is the be-all and end-all. Do the test that way, every single time. Find a way to standardise it if you are doing pull or push dynamometry and aren't as spoilt as me with an isokinetic dynamometer.

  3. Don't accept messy repetitions or messy data; if you're actually interested in force production, you are only cheating yourself and the patient.

  4. If you have access to data stating the minimal detectable change, great: use it to inform whether you think someone has improved. But be aware that it is a ballpark figure, not an absolute, and it is often derived from testing people with no history of injury.

  5. If you don't have data stating reliability or SDC values, then take everything with a pinch of salt. You are looking for obscene differences that are there consistently, to be more confident your measure is picking up something clinically relevant.


When looking up research around reliability, we need to look at the ICC values, which give us an idea of how reliable a measure is: the higher the better (measured from 0 to 1).


< 0.5: Poor reliability

0.5–0.75: Moderate reliability

0.75–0.9: Good reliability

> 0.9: Excellent reliability
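If you want to see what sits behind those bands, an ICC can be computed from a simple subjects-by-sessions table of scores. Here is a minimal sketch of ICC(2,1) — two-way random effects, absolute agreement, single measure, one common choice for test-retest data — with invented peak torque values:

```python
import numpy as np

def icc_2_1(scores: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: array of shape (n_subjects, k_sessions).
    """
    n, k = scores.shape
    grand = scores.mean()
    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * np.sum((scores.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((scores.mean(axis=0) - grand) ** 2) / (k - 1)
    ss_error = (np.sum((scores - grand) ** 2)
                - (n - 1) * ms_rows - (k - 1) * ms_cols)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)

# Invented quads peak torque (Nm) for 5 patients over 2 sessions.
data = np.array([[180, 185], [220, 212], [150, 158], [200, 204], [170, 166]])
print(f"ICC(2,1) = {icc_2_1(data):.2f}")  # ~0.97: 'excellent' by the bands above
```

In practice you'd read this value out of a paper rather than compute it yourself, but knowing what it summarises (score variation between people versus noise between sessions) makes the bands above much less abstract.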


We also need to know whether that is inter- or intra-rater reliability. Unless you have someone you trust to be as pernickety as you, you really should do the tests yourself every time to make sure.


It takes a lot of practice, and you will hate yourself for how picky you become with it. But let's get this elephant out of the room before we start trying to herd the other elephants out as well (clinical applications).


It's near the end of the year, so you have a couple of months before I write the next blog to clean up your testing technique.


Thanks for reading and please comment with any questions,


Jeff


(Confession: I actually love everything about Thor: Love and Thunder, I just borrowed some common critiques to give a quirky opening. Take it for what it is, a comic book movie! It's quite entertaining to see a superhero go through a midlife crisis, it's different!)
