I have tried reading about this in several sources, but I am still not sure which test is appropriate in my case. There are three different questions I am asking of my dataset:
Subjects are tested for infection with X at different times. I want to know whether the proportions positive for X afterwards are related to the proportions positive for X beforehand:
            After
           |no   |yes|
Before|No  |1157 |35 |
      |Yes |220  |13 |

results of chi-squared test:
Chi^2 = 4.183   d.f. = 1   p = 0.04082

results of McNemar's test:
Chi^2 = 134.2   d.f. = 1   p = 4.901e-31
As I understand it, because the data are repeated measures I have to use McNemar's test, which tests whether the proportion positive for X has changed.
But my question seems to call for the chi-squared test: testing whether the proportion positive for X afterwards is related to the proportion positive for X beforehand.
I am not even sure I understand the difference between McNemar's test and the chi-squared test correctly. What would be the right test if my question were, "Is the proportion of subjects infected with X afterwards different from before?"
A similar case, where instead of before and after I measure two different infections at one point in time:

       Y
      |no   |yes|
X|No  |1157 |35 |
 |Yes |220  |13 |

Which test would be right here if the question is, "Do higher proportions of one infection relate to higher proportions of Y?"
If my question were, "Does infection Y at time t2 relate to infection X at time t1?", which test would be appropriate?

             Y at t2
            |no   |yes|
X at t1|No  |1157 |35 |
       |Yes |220  |13 |
I used McNemar's test in all of these cases, but I have my doubts about whether it is the right test to answer my questions. I am using R. Could I use a binomial `glm` instead? Would that be analogous to the chi-squared test?
Answers:
It is very unfortunate that McNemar's test is so difficult for people to understand. I even notice that the top of its Wikipedia page states that the explanation on the page is difficult for people to understand. The typical short explanation for McNemar's test is either that it is a "within-subjects chi-squared test" or that it is "a test of the marginal homogeneity of a contingency table". I find neither of these very helpful. First, it is not clear what is meant by "within-subjects chi-squared", because you are always measuring your subjects twice (once on each variable) and trying to determine the relationship between those variables. In addition, "marginal homogeneity" is not a self-explanatory phrase. (Tragically, even this answer may be confusing. If it is, it may help to read my second attempt below.)
Let's see if we can work through a process of reasoning about your top example to figure out whether (and if so, why) McNemar's test is appropriate. You have put:

            After
           |no   |yes|
Before|No  |1157 |35 |
      |Yes |220  |13 |
This is a contingency table, so it connotes a chi-squared analysis. Moreover, you want to understand the relationship between Before and After, and the chi-squared test checks for a relationship between variables, so at first glance it seems that the chi-squared test must be the analysis that answers your question.
However, it is worth pointing out that we can also present these data in long form, with one row per subject and one column each for Before and After.
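As an illustration of that long layout, here is a minimal R sketch that expands the question's 2x2 counts into one row per subject. The object names (`counts`, `cells`, `long`) are my own and purely illustrative.

```r
# Counts from the question's table (rows = Before, columns = After).
counts <- matrix(c(1157, 35,
                    220, 13),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Before = c("No", "Yes"),
                                 After  = c("No", "Yes")))

# One row per cell (Before, After, Freq), then repeat each row Freq times
# so that every subject gets its own row.
cells <- as.data.frame(as.table(counts))
long  <- cells[rep(seq_len(nrow(cells)), times = cells$Freq),
               c("Before", "After")]
rownames(long) <- NULL

head(long)   # first few subjects
nrow(long)   # number of subjects
```

Each row now holds two binary measurements taken on the same subject, which is what makes a within-subjects comparison the natural one.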
When you look at the data this way, you might think you could run a regular old $t$-test. But a $t$-test isn't quite right. There are two issues. First, because each row lists data measured on the same subject, we wouldn't want a between-subjects $t$-test; we would want a within-subjects $t$-test. Second, since these data are distributed as a binomial, the variance is a function of the mean. This means that there is no additional uncertainty to worry about once the sample mean has been estimated (i.e., you don't have to estimate the variance afterwards), so you don't have to refer to the $t$ distribution; you can use the $z$ distribution. (For more on this, it may help to read my answer here: The $z$-test vs. the $\chi^2$ test.) Thus, we would need a within-subjects $z$-test. That is, we need a within-subjects test of equality of proportions.
So, having realized that we want to conduct McNemar's test, how does it work? Running a between-subjects $z$-test is easy, but how do we run a within-subjects version? The key to understanding how to do a within-subjects test of proportions is to examine the contingency table, which decomposes the proportions:
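A hedged sketch of that decomposition in R, using the counts from the question's table (the variable names are mine): the two marginal proportions share the Yes/Yes cell, so their difference depends only on the two discordant cells.

```r
# Cell counts from the question's table (first label = Before, second = After).
n_nn <- 1157  # No  -> No
n_ny <- 35    # No  -> Yes
n_yn <- 220   # Yes -> No
n_yy <- 13    # Yes -> Yes
N    <- n_nn + n_ny + n_yn + n_yy

p_before <- (n_yn + n_yy) / N   # proportion positive before
p_after  <- (n_ny + n_yy) / N   # proportion positive after

# The 13 Yes/Yes subjects appear in both proportions and cancel in the
# difference, which therefore depends only on the discordant cells.
diff_margins    <- p_before - p_after
diff_discordant <- (n_yn - n_ny) / N
all.equal(diff_margins, diff_discordant)
```

This is why McNemar's test only needs the 220 and the 35: the concordant cells carry no information about whether the two proportions differ.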
(... `R` outputs.)

There is another discussion of McNemar's test, with extensions to contingency tables larger than 2x2, here.
Here is an `R` demo with your data:

If we didn't take the within-subjects nature of your data into account, we would have a slightly less powerful test of the equality of proportions:
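A sketch of both analyses in R: first McNemar's test, then the test of two proportions that ignores the pairing. The settings are assumptions on my part (`correct = FALSE` for `mcnemar.test()`, the default continuity correction for `prop.test()`), chosen because they are the ones that reproduce the two statistics compared in this answer.

```r
# Question's data: rows = Before, columns = After.
mat <- matrix(c(1157, 35,
                 220, 13),
              nrow = 2, byrow = TRUE,
              dimnames = list(Before = c("No", "Yes"),
                              After  = c("No", "Yes")))

# Within-subjects analysis: McNemar's test without continuity correction.
mc <- mcnemar.test(mat, correct = FALSE)
mc

# Ignoring the pairing: treat the two marginal "Yes" counts as if they
# came from two independent samples of size N each.
n  <- sum(mat)
pt <- prop.test(x = c(sum(mat["Yes", ]), sum(mat[, "Yes"])),
                n = c(n, n))
pt
```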
That is, `X-squared = 133.6627` instead of `chi-squared = 134.2157`. In this case, these differ very little, because you have a lot of data and only 13 cases are overlapping as discussed above. (Another, and more important, problem here is that this counts your data twice, i.e., N=2850 instead of N=1425.)

Here are the answers to your concrete questions:
This version is trickier, and the phrasing "does higher proportions of one infections relate to higher proportions of Y" is ambiguous. There are two possible questions:
Since this is once again the same infection, of course they will be related. I gather that this version is not before and after a treatment, but just at some later point in time. Thus, you are asking if the background infection rates are changing organically, which is again a perfectly reasonable question. At any rate, the correct analysis is McNemar's test.

Edit: It would seem I misinterpreted your third question, perhaps due to a typo. I now interpret it as two different infections at two separate timepoints. Under this interpretation, the chi-squared test would be appropriate.
Well, it seems I've made a hash of this. Let me try to explain this again, in a different way and we'll see if it might help clear things up.
The traditional way to explain McNemar's test vs. the chi-squared test is to ask if the data are "paired" and to recommend McNemar's test if the data are paired and the chi-squared test if the data are "unpaired". I have found that this leads to a lot of confusion (this thread being an example!). In place of this, I have found that it is most helpful to focus on the question you are trying to ask, and to use the test that matches your question. To make this more concrete, let's look at a made-up scenario:
Here are the data:
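As a stand-in that is consistent with the rest of this answer (statisticians cross-classified by blood pressure and by US/UK nationality, 400 people in total, both margins at exactly 50%), the table could be set up in R as follows. The cell counts and level labels are assumptions chosen to match those stated properties, not values taken from the text.

```r
# Hypothetical counts: rows = blood pressure, columns = nationality.
mat <- as.table(rbind(c(195,   5),
                      c(  5, 195)))
dimnames(mat) <- list(BP          = c("Hi", "Normal"),
                      Nationality = c("US", "UK"))
mat
```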
At this point, it is important to figure out what question we want to ask of our data. There are three different questions we could ask here:
We might want to know if the variables `BP` and `Nationality` are associated or independent;
We might wonder if the proportion with high blood pressure differs between US and UK statisticians;
Finally, we might wonder if the proportion of statisticians with high blood pressure is equal to the proportion of US statisticians that we talked to. This refers to the marginal proportions of the table. These are not printed by default in R, but we can get them thusly (notice that, in this case, they are exactly the same):
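One way to obtain the marginal proportions, sketched with `margin.table()`. The counts are hypothetical, chosen only so that both margins come out at 50% as described.

```r
# Hypothetical counts: rows = blood pressure, columns = nationality.
mat <- as.table(rbind(c(195, 5), c(5, 195)))
dimnames(mat) <- list(BP          = c("Hi", "Normal"),
                      Nationality = c("US", "UK"))

# Row margin: proportion with high vs. normal blood pressure.
row_props <- margin.table(mat, 1) / sum(mat)
# Column margin: proportion US vs. UK.
col_props <- margin.table(mat, 2) / sum(mat)

row_props
col_props
```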
As I said, the traditional approach, discussed in many textbooks, is to determine which test to use based on whether the data are "paired" or not. But this is very confusing: is this contingency table "paired"? If you compare the proportion with high blood pressure between US and UK statisticians, you are comparing two proportions (albeit of the same variable) measured on different sets of people. On the other hand, if you want to compare the proportion with high blood pressure to the proportion US, you are comparing two proportions (albeit of different variables) measured on the same set of people. These data are both "paired" and "unpaired" at the same time (albeit with respect to different aspects of the data). This leads to confusion. To try to avoid this confusion, I argue that you should think in terms of which question you are asking. Specifically, if you want to know:
Someone might disagree with me here, arguing that because the contingency table is not "paired", McNemar's test cannot be used to test the equality of the marginal proportions and that the chi-squared test should be used instead. Since this is the point of contention, let's try both to see if the results make sense:
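A sketch of running both tests on the same table. The counts are hypothetical (chosen so that both margins equal 50%), and `correct = FALSE` is my choice to keep the two statistics free of continuity corrections.

```r
# Hypothetical counts: rows = blood pressure, columns = nationality.
mat <- as.table(rbind(c(195, 5), c(5, 195)))
dimnames(mat) <- list(BP          = c("Hi", "Normal"),
                      Nationality = c("US", "UK"))

chi <- chisq.test(mat, correct = FALSE)    # tests association of BP and Nationality
mc  <- mcnemar.test(mat, correct = FALSE)  # tests equality of the two margins

chi$p.value
mc$p.value
```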
The chi-squared test yields a p-value of approximately 0. That is, it says that the probability of getting data as far or further from equal marginal proportions, if the marginal proportions actually were equal, is essentially 0. But the marginal proportions are exactly the same, 50% = 50%, as we saw above! The results of the chi-squared test just don't make any sense in light of the data. On the other hand, McNemar's test yields a p-value of 1. That is, it says that you will have a 100% chance of finding marginal proportions this close to equality or further from equality, if the true marginal proportions are equal. Since the observed marginal proportions cannot be closer to equal than they are, this result makes sense.
Let's try another example:
In this case, the marginal proportions are very different, 97.5% ≫ 50%. Let's try the two tests again to see how their results compare to the observed large difference in marginal proportions:
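A sketch for this second example. The counts are an assumption on my part, chosen so that 97.5% have high blood pressure while exactly 50% are from the US, with blood pressure unrelated to nationality.

```r
# Hypothetical counts: rows = blood pressure, columns = nationality.
mat2 <- as.table(rbind(c(195, 195),
                       c(  5,   5)))
dimnames(mat2) <- list(BP          = c("Hi", "Normal"),
                       Nationality = c("US", "UK"))

margin.table(mat2, 1) / sum(mat2)   # blood pressure margin
margin.table(mat2, 2) / sum(mat2)   # nationality margin

chi2 <- chisq.test(mat2, correct = FALSE)    # association
mc2  <- mcnemar.test(mat2, correct = FALSE)  # equality of margins

chi2$p.value
mc2$p.value
```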
This time, the chi-squared test gives a p-value of 1, meaning that the marginal proportions are as equal as they can be. But we saw that the marginal proportions are very obviously not equal, so this result doesn't make any sense in light of our data. On the other hand, McNemar's test yields a p-value of approximately 0. In other words, it is extremely unlikely to get data with marginal proportions as far from equality as these, if they truly are equal in the population. Since our observed marginal proportions are far from equal, this result makes sense.
The fact that the chi-squared test yields results that make no sense given our data suggests there is something wrong with using the chi-squared test here. Of course, the fact that McNemar's test provided sensible results doesn't prove that it is valid, it may just have been a coincidence, but the chi-squared test is clearly wrong.
Let's see if we can work through the argument for why McNemar's test might be the right one. I will use a third dataset:
This time we want to compare 51.25% to 62.5% and wonder if in the population the true marginal proportions might have been the same. Because we are comparing two proportions, the most intuitive option would be to use a z-test for the equality of two proportions. We can try that here:
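A sketch of that naive comparison with `prop.test()`. The cell counts are an assumption, chosen so that the margins are 205/400 = 51.25% and 250/400 = 62.5% as stated.

```r
# Hypothetical counts: rows = blood pressure, columns = nationality.
mat3 <- as.table(rbind(c(190,  15),
                       c( 60, 135)))
dimnames(mat3) <- list(BP          = c("Hi", "Normal"),
                       Nationality = c("US", "UK"))

n       <- sum(mat3)
n_hi_bp <- sum(mat3["Hi", ])   # number with high blood pressure
n_us    <- sum(mat3[, "US"])   # number from the US

# Treats the two margins as if they came from two independent samples.
pt3 <- prop.test(x = c(n_hi_bp, n_us), n = c(n, n))
pt3
```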
(To use `prop.test()` to test the marginal proportions, I had to enter the numbers of 'successes' and the total number of 'trials' manually, but you can see from the last line of the output that the proportions are correct.) This suggests that it is unlikely to get marginal proportions this far from equality if they were actually equal, given the amount of data we have.

Is this test valid? There are two problems here: The test believes we have 800 data points, when we actually have only 400. This test also does not take into account that these two proportions are not independent, in the sense that they were measured on the same people.
Let's see if we can take this apart and find another way. From the contingency table, we can see that the marginal proportions are:
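A hedged sketch of this step: write out the two margins, note that they share the Hi/US cell, and test the two off-diagonal counts against a 50/50 split with an exact binomial test. The counts are assumed, chosen to be consistent with the margins and the p-value quoted in this answer.

```r
# Hypothetical counts: rows = blood pressure, columns = nationality.
mat3 <- as.table(rbind(c(190,  15),
                       c( 60, 135)))
dimnames(mat3) <- list(BP          = c("Hi", "Normal"),
                       Nationality = c("US", "UK"))
n <- sum(mat3)

# Both margins contain the Hi/US cell, so it cannot tell them apart.
p_hi_bp <- (mat3["Hi", "US"] + mat3["Hi", "UK"])     / n
p_us    <- (mat3["Hi", "US"] + mat3["Normal", "US"]) / n

# Only the off-diagonal cells carry information about the difference.
hi_uk     <- mat3["Hi", "UK"]
normal_us <- mat3["Normal", "US"]

# If the margins are equal, the off-diagonal cells should split 50/50.
bt <- binom.test(x = hi_uk, n = hi_uk + normal_us, p = 0.5)
bt$p.value
```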
In this version, only the informative observations are used and they are not counted twice. The p-value here is much smaller, 0.0000001588, which is often the case when the dependency in the data is taken into account. That is, this test is more powerful than the z-test of difference of proportions. We can further see that the above version is essentially the same as McNemar's test:
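For comparison, the same hypothetical table can be passed to `mcnemar.test()`. Turning off the continuity correction is my assumption here, so that the result lines up with the uncorrected statistic.

```r
# Hypothetical counts: rows = blood pressure, columns = nationality.
mat3 <- as.table(rbind(c(190,  15),
                       c( 60, 135)))
dimnames(mat3) <- list(BP          = c("Hi", "Normal"),
                       Nationality = c("US", "UK"))

mc3 <- mcnemar.test(mat3, correct = FALSE)
mc3$statistic   # (15 - 60)^2 / (15 + 60)
mc3$p.value
```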
If the non-identicality is confusing: McNemar's test typically, and in R, squares the result and compares it to the chi-squared distribution, which is not an exact test like the binomial above:
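A sketch of that relationship computed by hand, assuming off-diagonal counts of 15 and 60 (values I have chosen to match the margins discussed in this answer): the $z$ statistic on the discordant counts, squared, is the uncorrected McNemar statistic, and the two reference distributions give the same p-value.

```r
# Assumed off-diagonal (discordant) counts.
n_12 <- 15
n_21 <- 60

# z statistic for H0: a discordant observation is equally likely in either cell.
z <- (n_12 - n_21) / sqrt(n_12 + n_21)

# Squaring z gives the uncorrected McNemar chi-squared statistic (1 df).
chi_sq <- z^2

p_from_z   <- 2 * pnorm(-abs(z))
p_from_chi <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

c(z = z, chi_sq = chi_sq, p_from_z = p_from_z, p_from_chi = p_from_chi)
```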
Thus, when you want to check whether the marginal proportions of a contingency table are equal, McNemar's test (or the exact binomial test computed manually) is correct. It uses only the relevant information without illegally using any data twice. It does not just 'happen' to yield results that make sense of the data.
I continue to believe that trying to figure out whether a contingency table is "paired" is unhelpful. I suggest using the test that matches the question you are asking of the data.
The question of which test to use, the contingency table $\chi^2$ test versus McNemar's $\chi^2$ test, for a null hypothesis of no association between two binary variables is simply a question of whether your data are paired/dependent or unpaired/independent:
Binary Data in Two Independent Samples
In this case, you would use a contingency table $\chi^2$ test.
For example, you might have a sample of 20 statisticians from the USA, and a separate independent sample of 37 statisticians from the UK, and have a measure of whether these statisticians are hypertensive or normotensive. Your null hypothesis is that both UK and US statisticians have the same underlying probability of being hypertensive (i.e. that knowing whether one is from the USA or from the UK tells one nothing about the probability of hypertension). Of course it is possible that you could have the same sample size in each group, but that does not change the fact of the samples being independent (i.e. unpaired).
Binary Data in Paired Samples
In this case you would use McNemar's $\chi^2$ test.
For example, you might have individually-matched case-control study data sampled from an international statistician conference, where 30 statisticians with hypertension (cases) and 30 statisticians without hypertension (controls, who are individually matched by age, sex, BMI & smoking status to particular cases) are retrospectively assessed for professional residency in the UK versus residency elsewhere. The null is that the probability of residing in the UK among cases is the same as the probability of residing in the UK among controls (i.e. that knowing about one's hypertensive status tells one nothing about one's UK residence history).
In fact, McNemar's test analyzes pairs of data. Specifically, it analyzes discordant pairs. So the $r$ and $s$ in $\chi^2 = \frac{(|r - s| - 1)^2}{r + s}$ are counts of discordant pairs.
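As a small check of the continuity-corrected formula, here is a sketch that applies it by hand to the discordant counts in the question's table (220 and 35) and compares it with `mcnemar.test()`, whose default is `correct = TRUE`. Which count is called `r` and which `s` is arbitrary.

```r
# Question's data: rows = Before, columns = After.
mat <- matrix(c(1157, 35,
                 220, 13),
              nrow = 2, byrow = TRUE,
              dimnames = list(Before = c("No", "Yes"),
                              After  = c("No", "Yes")))

r <- mat["Yes", "No"]   # discordant pairs: Yes before, No after
s <- mat["No", "Yes"]   # discordant pairs: No before, Yes after

# Continuity-corrected McNemar statistic computed by hand.
chi_sq_manual <- (abs(r - s) - 1)^2 / (r + s)

# Built-in version; the continuity correction is on by default.
mc <- mcnemar.test(mat)

c(manual = chi_sq_manual, builtin = unname(mc$statistic))
```

The concordant cells (1157 and 13) never enter the calculation.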
Anto, in your example, your data are paired (same variable measured twice in same subject) and therefore McNemar's test is the appropriate choice of test for association.
[gung and I disagreed for a time about an earlier answer.]
Quoted References
"Assuming that we are still interested in comparing proportions, what can we do if our data are paired, rather than independent?... In this situation, we use McNemar's test."–Pagano and Gauvreau, Principles of Biostatistics, 2nd edition, page 349. [Emphasis added]
"The expression is better known as the McNemar matched-pair test statistic (McNemar, 1949), and has been a mainstay of matched-pair analysis."—Rothman, Greenland, & Lash. Modern Epidemiology, page 286. [Emphasis added]
"The paired t test and repeated measures of analysis of variance can be used to analyze experiments in which the variable being studied can be measured on an interval scale (and satisfies other assumptions required of parametric methods). What about experiments, analogous to the ones in Chapter 5, where the outcome is measured on a nominal scale? This problem often arises when asking whether or not an individual responded to a treatment or when comparing the results of two different diagnostic tests that are classified positive or negative in the same individuals. We will develop a procedure to analyze such experiments, McNemar's test for changes, in the context of one such study."—Glantz, Primer of Biostatistics, 7th edition, page 200. [Emphasis added. Glantz works through an example of a misapplication of the contingency table $\chi^2$ test to paired data on page 201.]
"For matched case-control data with one control per case, the resultant analysis is simple, and the appropriate statistical test is McNemar's chi-squared test... note that for the calculation of both the odds ratio and the statistic, the only contributors are the pairs which are disparate in exposure, that is the pairs where the case was exposed but the control was not, and those where the control was exposed but the case was not."—Elwood. Critical Appraisal of Epidemiological Studies and Clinical Trials, 1st edition, pages 189–190. [Emphasis added]
My understanding of McNemar's test is as follows: It is used to see whether an intervention has made a significant difference to a binary outcome. In your example, a group of subjects is checked for infection and the response is recorded as yes or no. All subjects are then given some intervention, say an antibiotic drug. They are then checked again for infection and the response is recorded as yes/no again. The (pairs of) responses can be put in the contingency table:
And McNemar's test would be appropriate for this.
It is clear from the table that many more have converted from 'yes' to 'no' (220/(220+13) or 94.4%) than from 'no' to 'yes' (35/(1157+35) or 2.9%). Considering these proportions, McNemar's P value (4.901e-31) appears more correct than the chi-square P value (0.04082).
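The two conversion rates can be read off the row-wise proportions of the table. A minimal sketch using `prop.table()` on the question's counts (object names are illustrative):

```r
# Question's data: rows = Before, columns = After.
mat <- matrix(c(1157, 35,
                 220, 13),
              nrow = 2, byrow = TRUE,
              dimnames = list(Before = c("No", "Yes"),
                              After  = c("No", "Yes")))

# Proportions within each Before group (each row sums to 1).
row_props <- prop.table(mat, margin = 1)

yes_to_no <- row_props["Yes", "No"]   # 220 / (220 + 13)
no_to_yes <- row_props["No", "Yes"]   # 35 / (1157 + 35)

round(100 * c(yes_to_no = yes_to_no, no_to_yes = no_to_yes), 1)
```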
If the contingency table represents 2 different infections (question 2), then the chi-square test would be more appropriate.
Your 3rd question is ambiguous: you first state relating Y at t2 with Y at t1, but in the table you write 'X' at t1 vs Y at t2. Y at t2 vs Y at t1 is the same as your first question and hence McNemar's test is needed, while X at t1 and Y at t2 indicates that different events are being compared and hence the chi-square test will be more appropriate.
Edit: As mentioned by Alexis in the comment, matched case-control data are also analyzed by McNemar's test. For example, 1425 cancer patients are recruited for a study and for each patient a matched control is also recruited. All of these (1425*2) are checked for infection. The results of each pair can be shown in a similar table:
More clearly:
It shows that pairs in which the cancer patient had the infection and the control did not are much more common than the reverse. The significance of this can be tested by McNemar's test.
If these patients and controls were not matched but independent, one could only make the following table and do a chi-square test:
More clearly:
Note that these numbers are the same as the margins of the first table:
That must be the reason for use of terms like 'marginal frequencies' and 'marginal homogeneity' in McNemar's test.
Interestingly, the `addmargins` function can also help decide which test to use. If the grand total is half the number of subjects observed (indicating that pairing has been done), then McNemar's test is applicable; otherwise the chi-square test is appropriate:
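A sketch of that check with `addmargins()`, using the counts from the case-control illustration. The row and column labels, and the orientation of the unpaired table, are assumptions of mine.

```r
# Paired layout: each of the 1425 matched pairs is counted once.
paired <- as.table(rbind(c(1157, 35),
                         c( 220, 13)))
dimnames(paired) <- list(Patient = c("No", "Yes"),
                         Control = c("No", "Yes"))

# Unpaired layout: each of the 2850 individuals is counted once;
# the rows are the margins of the paired table.
unpaired <- as.table(rbind(c(1192, 233),
                           c(1377,  48)))
dimnames(unpaired) <- list(Group     = c("Patient", "Control"),
                           Infection = c("No", "Yes"))

addmargins(paired)     # grand total = number of pairs
addmargins(unpaired)   # grand total = number of individuals
```

With 2850 people examined, a grand total of 1425 signals that the cells count pairs, which is the situation McNemar's test is designed for.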
The R code for the above tables follows the answers above:
The following pseudocode may also help in seeing the difference:
Edit: The mid-p variation of performing McNemar's test ( https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3716987/ ) is interesting. It compares `b` and `c` of the contingency table, i.e. the number who changed from yes to no versus the number who changed from no to yes (ignoring those who remained yes or no throughout the study). It can be performed using a binomial test in Python, as shown at https://gist.github.com/kylebgorman/c8b3fb31c1552ecbaafb

It could be equivalent to `binom.test(b, b+c, 0.5)`, since in a random change, one would expect `b` to be equal to `c`.
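A sketch of both versions in R for the question's discordant counts. The exact test is the `binom.test()` call mentioned above. The mid-p line reflects my understanding of the mid-p definition (the two-sided exact p-value minus the probability of the observed count) and should be read as an assumption rather than a transcription of the linked code.

```r
# Discordant counts from the question's table.
n_b <- 35    # No before, Yes after
n_c <- 220   # Yes before, No after
n   <- n_b + n_c

# Exact (binomial) McNemar test: under H0 each discordant pair is
# equally likely to fall in either cell.
p_exact <- binom.test(n_b, n, p = 0.5)$p.value

# Mid-p variant: two-sided exact p-value minus the point probability
# of the observed (smaller) discordant count.
k      <- min(n_b, n_c)
p_midp <- 2 * pbinom(k, n, 0.5) - dbinom(k, n, 0.5)

c(exact = p_exact, midp = p_midp)
```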