## Median of medians?


Assume I have the median of 11 sets of data. Is the median of the medians the same as the median of the 11 combined data sets? I might be a better programmer than mathematician.

Imagecraft compiler user

{1 2 3} median = 2
{3 4 5} median = 4
{1 2 5} median = 2
All: {1 1 2 2 3 3 4 5 5} median = 3
M of M: {2 2 4} median = 2

Proven NOT true.
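For anyone who wants to check the counterexample by machine, here is a minimal sketch using Python's standard `statistics` module. The variable names are my own; the data are the three sets from the example above.

```python
from statistics import median

# The three data sets from the example above.
groups = [[1, 2, 3], [3, 4, 5], [1, 2, 5]]

# Median of each set, then the median of those medians.
group_medians = [median(g) for g in groups]
median_of_medians = median(group_medians)

# Median of all the values pooled together.
combined = sorted(x for g in groups for x in g)
median_of_all = median(combined)

print(group_medians)      # [2, 4, 2]
print(median_of_medians)  # 2
print(median_of_all)      # 3
```

The two results differ because each set's median discards how the rest of that set is distributed, so the pooled ordering cannot be recovered from the medians alone.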

Seems obvious when one sees a simple example. Thanks.

Imagecraft compiler user

Just TRY convincing somebody who THINKS they're an accountant that the sum of the quotients is not equal to the quotient of the sums! Or a lawyer that 33% and 1/3 are two different numbers.
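Both claims are easy to check with exact arithmetic. This is only a sketch with made-up numbers (1/2 and 1/3 are my choice, not from the post), using `fractions.Fraction` so that no rounding gets in the way.

```python
from fractions import Fraction

# Hypothetical figures chosen only for illustration.
numerators = [1, 1]
denominators = [2, 3]

# Sum of the quotients: 1/2 + 1/3
sum_of_quotients = sum(Fraction(n, d) for n, d in zip(numerators, denominators))

# Quotient of the sums: (1 + 1) / (2 + 3)
quotient_of_sums = Fraction(sum(numerators), sum(denominators))

print(sum_of_quotients)   # 5/6
print(quotient_of_sums)   # 2/5

# 33% versus 1/3: the gap is small but not zero.
gap = Fraction(1, 3) - Fraction(33, 100)
print(gap)                # 1/300
```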

If you don't know my whole story, keep your mouth shut.

If you know my whole story, you're an accomplice. Keep your mouth shut.

Torby,

33% and 1/3 are not only two different numbers, they are also two different numerical values. 1/3 = 33.333...%, with the 3s repeating forever.

As for westfw's "median" example, I must ask: What's the definition of "median" in Bob's original question, and in the example presented?

The "median" is the "middle number" - half the numbers in the set are above it, and half are below.
(so says wikipedia; I checked.)
There are complications for sets of data with an even number of members, and perhaps with sets containing duplicate values, but I'm pretty sure my trivial example is correct.

Note that "Half of students score below the median grade on the XXX test!" is therefore always true.

[hair-splitting mode on]

Quote:
Note that "Half of students score below the median grade on the XXX test!" is therefore always true.

Except for the cases with an even number of students having done the test... And of course for the totally real-life exception of all students having the same grade!
[hair-splitting-mode off]
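The hair-splitting can be made concrete. Below is a rough sketch, with invented test scores, that counts the share of students scoring strictly below the median. It assumes the convention used by Python's `statistics.median`, which averages the two middle values when the count is even; `median_low` and `median_high` pick one of the two instead. The helper `share_below` is my own name.

```python
from statistics import median, median_low, median_high

def share_below(scores, cutoff):
    """Fraction of scores strictly below the cutoff."""
    return sum(s < cutoff for s in scores) / len(scores)

# Invented scores, for illustration only.
odd_scores = [55, 60, 70, 80, 95]
even_scores = [55, 60, 70, 80]
tied_scores = [70, 70, 70, 70]

for scores in (odd_scores, even_scores, tied_scores):
    m = median(scores)
    print(scores, "median:", m, "share below:", share_below(scores, m))

# For an even count the "middle" is a choice between two values.
print(median_low(even_scores), median_high(even_scores))
```

With these numbers the share strictly below the median is 0.4 for the odd-sized set, 0.5 for the even-sized set, and 0.0 when everyone has the same grade, so "half score below the median" holds only under particular conventions and data.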

Einstein was right: "Two things are unlimited: the universe and the human stupidity. But i'm not quite sure about the former..."

Quote:

I must ask: What's the definition of "median" in Bob's original question, and in the example presented?

http://en.wikipedia.org/wiki/Median

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

I see, it is a simple mathematical mechanism meant to deceive the unwary.

I would guess salesmen and managers are well-versed in its use to prove their factual opinions.

Wages are often quoted as "the median salary in city or county or state XX". Evidently one or two millionaires bias the average up too much.
If I wanted to compare two populations of salaries for example, would the std deviation from the mean of each set give any better information?

Imagecraft compiler user

bobgardner wrote:
Wages are often quoted as "the median salary in city or county or state XX". Evidently one or two millionaires bias the average up too much.
If I wanted to compare two populations of salaries for example, would the std deviation from the mean of each set give any better information?

Yes.
Anything is better than the median... the median gives you nothing. Just a simple average salary is better than the median. But it depends on what you want to find out, as in where you want to use the info.

new to mirco electronics

The median is one statistic, just like the arithmetic mean, the geometric mean or others. Which one is best depends on the application.
The special property of the median is that it is rather insensitive to a few values that are way off.
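A quick way to see that insensitivity is to add one extreme value to a small data set and compare how far the mean and the median move. The salary figures below are invented for illustration; only `statistics.mean` and `statistics.median` from the standard library are used.

```python
from statistics import mean, median

# Invented salaries, for illustration only.
salaries = [30_000, 35_000, 40_000, 45_000, 50_000]
with_millionaire = salaries + [5_000_000]

print("mean before:  ", mean(salaries))
print("median before:", median(salaries))

# One outlier drags the mean far more than the median.
print("mean after:   ", round(mean(with_millionaire)))
print("median after: ", median(with_millionaire))
```

In this toy case the mean jumps from 40,000 to well over 800,000, while the median only moves from 40,000 to 42,500.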

I want 'a number' to compare public and private salary and benefits. A metric that says 'the private S&B in cityx is \$40K but the public S&B is \$46K', which can then be a basis for a polite discussion of why the public salaries just can't all be \$100K with lifetime pension after one term of service, but step one is finding a meaningful metric. If it's not the mean, and not the median, and the rms of a bunch of salaries doesn't seem to add anything to me, what is that metric? I mean a big salary gives you power, but calculating the power of a salary doesn't seem useful. So how to boil down a city/county/state compensation into a composite metric?

Imagecraft compiler user

bobgardner wrote:
I want 'a number' to compare public and private salary and benefits. A metric that says 'the private S&B in cityx is \$40K but the public S&B is \$46K', which can then be a basis for a polite discussion of why the public salaries just can't all be \$100K with lifetime pension after one term of service, but step one is finding a meaningful metric. If it's not the mean, and not the median, and the rms of a bunch of salaries doesn't seem to add anything to me, what is that metric? I mean a big salary gives you power, but calculating the power of a salary doesn't seem useful. So how to boil down a city/county/state compensation into a composite metric?

You need a graph.

new to mirco electronics

Quote:

If it's not the mean, and not the median, ...

Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: "There are three kinds of lies: lies, damned lies and statistics."
- Mark Twain's Own Autobiography: The Chapters from the North American Review

The old saying is that “figures will not lie,” but a new saying is “liars will figure.” It is our duty, as practical statisticians, to prevent the liar from figuring; in other words, to prevent him from perverting the truth, in the interest of some theory he wishes to establish.
-- Carroll D. Wright was a prominent statistician employed by the U.S. government, and he did use the expression in 1889 while addressing the Convention of Commissioners of Bureaus of Statistics of Labor. [But Wright did not claim that he coined the expression ...]

Indeed, it is hard to devise metrics for such situations. For our own work, say ADC sampling, we can use whichever metric best represents the parameter(s) we are trying to measure. But the data can often [always?] be represented in a different manner for a different result.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Quote:

a basis for a polite discussion of why the public salaries just can't all be \$100K with lifetime pension after one term of service

IOW, you want a metric that will fit your opinion. There should be plenty to choose from.


I found a couple of web articles that cite studies that show public salaries are LESS than the private, and another that is the opposite. Seems odd there is not a mathematical concept of 'average salary'.

Imagecraft compiler user

Quote:

Seems odd there is not a mathematical concept of 'average salary'.

I'd say there are. Several. What is likely happening is that people tweak it: which salaries get into the calculation, how part-timers are "translated" to full-time equivalents, etc. As Lee hinted, you can always twist a measurement. Not odd at all - they are doing exactly what you asked for above.

Quote:
I want 'a number' to [your preferred input and calculation]. A metric that [your preferred numerical result] which can then be a basis for a polite discussion of why [your preferred political goal]


If you really want an odd metric - I have one for the weirdness of words, or rather, how differently they are spelt in the context of the document in which they live.

1) you start with a really big list of probabilities of all the letter and space triads in a larger body of work within a genre (the corpus) - I used over a million words - by splitting each word, including leading and trailing spaces and capitalisation but not punctuation except for apostrophes (so ' the ' gives ' th', 'the', and 'he ').

2) now you total the count for each triad and divide each count by the total number of triads in the corpus, giving a probability for each triad existing.

3) when faced with a word which is not in your spelling dictionary, disassemble it into triads - include spaces before and after. Take the negative logarithm of each triad's probability (actually, that's what you store in the triad list - saves time) and add them together, then divide the result by the number of triads in the word. If a triad does not appear in the list, assign an arbitrary value of 10 to it.

4) that gives you a number usually somewhere between 1.5 and 4.5. If the number is below 4.5, you can be pretty certain, absent a few edge cases which you have previously noted and excluded, that the word you have is correctly spelt in the context of the document.

5) Ideally you look for the word spelt identically more than once in the document. If it does occur more than once, you can allow the weirdness value to be higher before declaring it a misspelt word.
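Steps 1 to 4 can be sketched roughly as below. This is not the original program, just my reading of the description. The function names, the tiny stand-in corpus, and the choice of a base-10 logarithm are all assumptions (the post does not say which base was used), and step 5's repeated-word check is left out.

```python
import math
from collections import Counter

UNSEEN_COST = 10.0  # arbitrary value for triads missing from the list (step 3)

def triads(word):
    """Split a non-empty word into 3-character windows, padded with one space each side."""
    padded = f" {word} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def build_costs(corpus_words):
    """Steps 1-2: count every triad, convert to probability, store -log10(p)."""
    counts = Counter(t for w in corpus_words for t in triads(w))
    total = sum(counts.values())
    return {t: -math.log10(c / total) for t, c in counts.items()}

def weirdness(word, costs):
    """Steps 3-4: average stored cost over the word's triads."""
    parts = triads(word)
    return sum(costs.get(t, UNSEEN_COST) for t in parts) / len(parts)

# Toy corpus, purely hypothetical; the post used over a million words.
corpus = ["the", "then", "there", "other", "them", "these"]
costs = build_costs(corpus)

print(triads("the"))
print(weirdness("the", costs))
print(weirdness("xqz", costs))
```

With a corpus this small the absolute numbers mean nothing and will not land in the 1.5 to 4.5 range quoted above; the point is only that a word built from common triads scores lower than one built from triads never seen in the corpus.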

I played with this quite a lot. It's absolutely useless in the context of a spelling corrector for a word processor, since most people insert spelling mistakes because they either can't type, and hit the wrong keys consistently, or they can't spell, and hit the same wrong spelling - which usually follows generic English rules and so gets caught in the multiple words trap. However, it's *incredibly* good at looking at an optical character recognised text and finding words which have been mistranscribed.

There's something really odd about proof-reading: when you're chugging through a document you tend to miss words which look the same (e.g. 'ain' for 'am') but when you read the work for pleasure they leap out and poke you in the eye.

What this does is allow correctly spelt words (places, character names, made up words, and even some foreign languages) to be removed from the list of words which need consideration as to whether they are correctly spelt or not - and it does it extremely well.

It's an average of averages of probabilities, I think, but I'm blowed if I know what it's called!