Assume I have the median of 11 sets of data. Is the median of the medians the same as the median of the 11 combined data sets? I might be a better programmer than mathematician.

## Median of medians?

{1 2 3} median = 2

{3 4 5} median = 4

{1 2 5} median = 2

All: {1 1 2 2 3 3 4 5 5} median = 3

M of M: {2 2 4} median = 2

Proven NOT true.
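For anyone who would rather check it in code, here is a minimal Python sketch of the same counterexample. It uses only the three sets listed above; the variable names are mine.

```python
from statistics import median

# The three sets from the example above.
groups = [[1, 2, 3], [3, 4, 5], [1, 2, 5]]

# Median of each set, then the median of those medians.
medians = [median(g) for g in groups]
median_of_medians = median(medians)

# Pool every value into one set and take its median.
combined = sorted(x for g in groups for x in g)
median_of_all = median(combined)

print(medians, median_of_medians)   # [2, 4, 2] 2
print(combined, median_of_all)      # [1, 1, 2, 2, 3, 3, 4, 5, 5] 3
```

One counterexample is enough to show the two are not equal in general, although they can coincide for particular data.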

Seems obvious when one sees a simple example. Thanks.

Just TRY convincing somebody who THINKS they're an accountant that the sum of the quotients is not equal to the quotient of the sums! Or a lawyer that 33% and 1/3 are two different numbers.
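Both claims are easy to check with exact fractions. This is only a sketch: the number pairs are made up for illustration, not taken from any real ledger.

```python
from fractions import Fraction

# Hypothetical (numerator, denominator) pairs, chosen only for illustration.
pairs = [(1, 2), (3, 4)]

# Sum of the quotients: 1/2 + 3/4
sum_of_quotients = sum(Fraction(a, b) for a, b in pairs)

# Quotient of the sums: (1 + 3) / (2 + 4)
quotient_of_sums = Fraction(sum(a for a, _ in pairs),
                            sum(b for _, b in pairs))

print(sum_of_quotients)   # 5/4
print(quotient_of_sums)   # 2/3

# 33% is exactly 33/100, which differs from 1/3 by 1/300.
gap = Fraction(1, 3) - Fraction(33, 100)
print(gap)                # 1/300
```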

Torby,

33% and 1/3 are not only two different numbers, they are also two different numerical values. 1/3 = 33.3333333333333333...%, or thereabouts.

As for westfw's "median" example, I must ask: What's the definition of "median" in Bob's original question, and in the example presented?

The "median" is the "middle number" - half the numbers in the set are above it, and half are below.

(so says wikipedia; I checked.)

There are complications for sets of data with an even number of members, and perhaps with sets containing duplicate values, but I'm pretty sure my trivial example is correct.

Note that "Half of students score below the median grade on the XXX test!" is therefore always true.

[hair-splitting mode on]

Note that "Half of students score below the median grade on the XXX test!" is therefore always true.

Except for the cases with an even number of students having done the test... And of course for the totally real-life exception of all students having the same grade!

[hair-splitting-mode off]
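The hair-splitting can be tested directly. A small sketch with made-up scores, counting only scores strictly below the median (that reading of "below" is my assumption):

```python
from statistics import median

def share_below_median(scores):
    """Return the fraction of scores strictly below the median."""
    m = median(scores)
    return sum(s < m for s in scores) / len(scores)

# Hypothetical test scores, not real data.
even_distinct = share_below_median([50, 60, 70, 80])      # median is 65.0
odd_distinct = share_below_median([50, 60, 70, 80, 90])   # median is 70
all_same = share_below_median([70, 70, 70, 70])           # median is 70.0

print(even_distinct)   # 0.5
print(odd_distinct)    # 0.4
print(all_same)        # 0.0
```

With this strict reading, an even number of distinct scores gives exactly half below; an odd count or tied scores give less than half.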

I must ask: What's the definition of "median" in Bob's original question, and in the example presented?

http://en.wikipedia.org/wiki/Median

I see, it is a simple mathematical mechanism meant to deceive the unwary.

I would guess salesmen and managers are well-versed in its use to prove their factual opinions.

Wages are often quoted as "the median salary in city or county or state XX". Evidently one or two millionaires bias the average up too much.

If I wanted to compare two populations of salaries for example, would the std deviation from the mean of each set give any better information?

**bobgardner wrote:**

Wages are often quoted as "the median salary in city or county or state XX". Evidently one or two millionaires bias the average up too much.

If I wanted to compare two populations of salaries for example, would the std deviation from the mean of each set give any better information?

yes.

anything is better than median...median gives you nothing. just a simple average salary is better than median. but it depends on what you want to find out as in where you want to use the info.

The median is one statistic, just like the arithmetic mean, the geometric mean or others. Which one is best depends on the application.

The special property of the median is that it is rather insensitive to a few values that are way off.
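A rough illustration of that insensitivity, using invented salary figures in $K (not real data): adding one extreme value drags the mean a long way but barely moves the median.

```python
from statistics import mean, median

# Hypothetical salaries in $K, invented for illustration.
salaries = [30, 35, 40, 45, 50]
with_millionaire = salaries + [5000]

mean_before, median_before = mean(salaries), median(salaries)
mean_after, median_after = mean(with_millionaire), median(with_millionaire)

print(mean_before, median_before)   # mean 40, median 40
print(mean_after, median_after)     # mean about 866.7, median 42.5
```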

I want 'a number' to compare public and private salary and benefits. A metric that says 'the private S&B in cityx is $40K but the public S&B is $46K', which can then be a basis for a polite discussion of why the public salaries just can't all be $100K with lifetime pension after one term of service, but step one is finding a meaningful metric. If it's not the mean, and not the median, and the rms of a bunch of salaries doesn't seem to add anything to me, what is that metric? I mean a big salary gives you power, but calculating the power of a salary doesn't seem useful. So how to boil down a city/county/state compensation into a composite metric?

**bobgardner wrote:**

I want 'a number' to compare public and private salary and benefits. A metric that says 'the private S&B in cityx is $40K but the public S&B is $46K', which can then be a basis for a polite discussion of why the public salaries just can't all be $100K with lifetime pension after one term of service, but step one is finding a meaningful metric. If it's not the mean, and not the median, and the rms of a bunch of salaries doesn't seem to add anything to me, what is that metric? I mean a big salary gives you power, but calculating the power of a salary doesn't seem useful. So how to boil down a city/county/state compensation into a composite metric?

you need a graph.

If it's not the mean, and not the median, ...

Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: "There are three kinds of lies: lies, damned lies and statistics."

- Mark Twain's Own Autobiography: The Chapters from the North American Review

The old saying is that "figures will not lie," but a new saying is "liars will figure." It is our duty, as practical statisticians, to prevent the liar from figuring; in other words, to prevent him from perverting the truth, in the interest of some theory he wishes to establish.

-- Carroll D. Wright was a prominent statistician employed by the U.S. government, and he did use the expression in 1889 while addressing the Convention of Commissioners of Bureaus of Statistics of Labor. [But Wright did not claim that he coined the expression ...]

Indeed, it is hard to devise metrics for such situations. For our own work, say ADC sampling, we can use whichever metric best represents the parameter(s) we are trying to measure. But the data can often [always?] be represented in a different manner for a different result.

a basis for a polite discussion of why the public salaries just can't all be $100K with lifetime pension after one term of service

IOW, you want a metric that will fit your opinion. There should be plenty to choose from..

I found a couple of web articles that cite studies that show public salaries are LESS than the private, and another that is the opposite. Seems odd there is not a mathematical concept of 'average salary'.

Seems odd there is not a mathematical concept of 'average salary'.

I'd say there are. Several. What is likely happening is that people tweak it: which salaries get into the calc, how part-timers are "translated" to full-time equivalents, etc. As Lee hinted, you can always twist a measurement. Not odd at all - they are doing exactly what you asked for above..

I want 'a number' to [your preferred input and calculation]. A metric that [your preferred numerical result] which can then be a basis for a polite discussion of why [your preferred political goal]

If you really want an odd metric - I have one for the weirdness of words, or rather, how differently they are spelt in the context of the document in which they live.

1) you start with a really big list of probabilities of all the letter and space triads in a larger body of work within a genre (the corpus) - I used over a million words - by splitting each word, including leading and trailing spaces and capitalisation but not punctuation except for apostrophes (so ' the ' gives ' th', 'the', and 'he ').

2) now you total the count for each triad and divide each count by the total number of triads in the corpus, giving a probability for each triad existing.

3) when faced with a word which is not in your spelling dictionary, disassemble it into triads - include spaces before and after. Take the negative logarithm of each triad's probability (actually, that's what you store in the triad list - saves time) and add them together, then divide the result by the number of triads in the word. If a triad does not appear in the list, assign an arbitrary value of 10 to it.

4) that gives you a number usually somewhere between 1.5 and 4.5. If the number is below 4.5, you can be pretty certain, absent a few edge cases which you have previously noted and excluded, that the word you have is correctly spelt in the context of the document.

5) Ideally you look for the word spelt identically more than once in the document. If it does occur more than once, you can allow the weirdness value to be higher before declaring it a misspelt word.
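Steps 1 to 3 above can be sketched in a few lines of Python. This is only my reading of the description, not the original program: the function names are invented, the natural logarithm is an assumption (the base is not stated), the toy corpus is far too small to reproduce the 1.5 to 4.5 range, and words are assumed to arrive already stripped of punctuation.

```python
import math
from collections import Counter

UNSEEN_PENALTY = 10.0  # arbitrary value for triads missing from the table (step 3)

def triads(word):
    """Pad the word with one space on each side and return its 3-character windows."""
    padded = f" {word} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def build_table(corpus_words):
    """Steps 1-2: count every triad in the corpus, store -log(probability) per triad."""
    counts = Counter(t for w in corpus_words for t in triads(w))
    total = sum(counts.values())
    # Natural log is an assumption; the description does not name the base.
    return {t: -math.log(c / total) for t, c in counts.items()}

def weirdness(word, table):
    """Step 3: average negative log-probability over the word's triads."""
    ts = triads(word)
    return sum(table.get(t, UNSEEN_PENALTY) for t in ts) / len(ts)

# Toy corpus for illustration only; the description used over a million words.
corpus = "the cat sat on the mat and then the other cat sat there".split()
table = build_table(corpus)

print(triads("the"))                                       # [' th', 'the', 'he ']
print(weirdness("the", table) < weirdness("xqz", table))   # True
```

Steps 4 and 5 would then compare the score against a threshold tuned on the real corpus, and relax that threshold when the same spelling occurs more than once in the document.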

I played with this quite a lot. It's absolutely useless in the context of a spelling corrector for a word processor, since most people insert spelling mistakes because they either can't type, and hit the wrong keys consistently, or they can't spell, and hit the same wrong spelling - which usually follows generic English rules and so gets caught in the multiple words trap. However, it's *incredibly* good at looking at an optical character recognised text and finding words which have been mistranscribed.

There's something really odd about proof-reading: when you're chugging through a document you tend to miss words which look the same (e.g. 'ain' for 'am') but when you read the work for pleasure they leap out and poke you in the eye.

What this does is allow correctly spelt words (places, character names, made up words, and even some foreign languages) to be removed from the list of words which need consideration as to whether they are correctly spelt or not - and it does it extremely well.

It's an average of averages of probabilities, I think, but I'm blowed if I know what it's called!

argh bloody captcha

^tooo long , didnt read. :mrgreen: