Is it ever valid to include a two-way interaction in a model without including the main effects? What if your hypothesis is only about the interaction: do you still need to include the main effects?
Answers:
In my experience, not only is it necessary to have all lower-order effects in the model when they are connected with higher-order effects, but it is also important to model properly (e.g., allowing for nonlinearity) the main effects that are seemingly unconnected to the factors in the interactions of interest. That is because interactions between x1 and x2 can be stand-ins for main effects of x3 and x4. Interactions sometimes seem to be needed because they are collinear with omitted variables or omitted nonlinear (e.g., spline) terms.
source
You ask whether it is ever valid. Let me provide a common example, whose elucidation may suggest some additional analytical approaches for you.
The simplest example of an interaction is a model with one dependent variable Z and two independent variables X and Y, of the form

Z = α + β′X + γ′Y + δ′XY + ε,

with ε a random variable of zero expectation, and with parameters α, β′, γ′, and δ′. It is often worth checking whether δ′ approximates β′γ′/α, because an algebraically equivalent expression of the same model is

Z = α(1 + βX)(1 + γY) + (δ′ − β′γ′/α)XY + ε

(where β′ = αβ, etc.).

Whence, if there's a reason to suppose (δ′ − β′γ′/α) ∼ 0, we can absorb it in the error term ε. Not only does this give a "pure interaction", it does so without a constant term. This in turn strongly suggests taking logarithms. Some heteroscedasticity in the residuals (that is, a tendency for residuals associated with larger values of Z to be larger in absolute value than average) would also point in this direction. We would then want to explore an alternative formulation

log(Z) = log(α) + log(1 + βX) + log(1 + γY) + τ

with iid random error τ. Furthermore, if we expect βX and γY to be large compared to 1, we would instead just propose the model

log(Z) = η + log(X) + log(Y) + τ.

This new model has just a single parameter η instead of four parameters (α, β′, etc.) subject to a quadratic relation (δ′ = β′γ′/α), a considerable simplification.
I am not saying that this is a necessary or even the only step to take, but I am suggesting that this kind of algebraic rearrangement of the model is usually worth considering whenever interactions alone appear to be significant.
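To make this concrete, here is a small simulation (my own sketch, not part of the original answer; all numbers are made up) in which the data really are generated by the factored form Z = α(1 + βX)(1 + γY) + noise. Fitting the usual four-parameter model then yields estimates satisfying the quadratic relation δ′ ≈ β′γ′/α, the diagnostic that makes the pure-interaction reformulation worth exploring:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
X = rng.uniform(1.0, 2.0, n)
Y = rng.uniform(1.0, 2.0, n)

# Hypothetical truth: Z = alpha*(1 + beta*X)*(1 + gamma*Y) + noise,
# i.e. the (delta' - beta'gamma'/alpha) term is exactly zero.
alpha, beta, gamma = 1.0, 0.5, 0.8
Z = alpha * (1 + beta * X) * (1 + gamma * Y) + rng.normal(0, 0.1, n)

# Fit the ordinary four-parameter model Z = a + b'X + c'Y + d'XY
design = np.column_stack([np.ones(n), X, Y, X * Y])
a, bp, cp, dp = np.linalg.lstsq(design, Z, rcond=None)[0]

# The estimated interaction coefficient d' is close to b'c'/a,
# which is the clue that the multiplicative reformulation suffices.
print(dp, bp * cp / a)
```

When the relation fails badly, the extra XY structure is genuine and the factored simplification is not available.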
Some excellent ways to explore models with interaction, especially with just two and three independent variables, appear in chapters 10 - 13 of Tukey's EDA.
source
While it is often stated in textbooks that one should never include an interaction in a model without the corresponding main effects, there are certainly examples where this would make perfect sense. I'll give you the simplest example I can imagine.
Suppose subjects randomly assigned to two groups are measured twice, once at baseline (i.e., right after the randomization) and once after group T received some kind of treatment, while group C did not. Then a repeated-measures model for these data would include a main effect for measurement occasion (a dummy variable that is 0 for baseline and 1 for the follow-up) and an interaction term between the group dummy (0 for C, 1 for T) and the time dummy.
The model intercept then estimates the average score of the subjects at baseline (regardless of the group they are in). The coefficient for the measurement occasion dummy indicates the change in the control group between baseline and the follow-up. And the coefficient for the interaction term indicates how much bigger/smaller the change was in the treatment group compared to the control group.
Here, it is not necessary to include the main effect for group, because at baseline, the groups are equivalent by definition due to the randomization.
One could of course argue that the main effect for group should still be included, so that, in case the randomization failed, this will be revealed by the analysis. However, that is equivalent to testing the baseline means of the two groups against each other. And there are plenty of people who frown upon testing for baseline differences in randomized studies (of course, there are also plenty who find it useful, but this is another issue).
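Here is a minimal simulation of that design (my own sketch with made-up effect sizes; plain OLS is used rather than a proper repeated-measures fit, which is enough to see the parameterization):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # subjects per group

# Hypothetical truth: baseline mean 10 in both groups (randomization),
# the control group drifts by +1 at follow-up, treatment adds a further +2.
base, drift, effect = 10.0, 1.0, 2.0

group = np.repeat([0, 1], n)                 # 0 = C, 1 = T
subj_mean = base + rng.normal(0, 1, 2 * n)   # subject-level random effect
pre = subj_mean + rng.normal(0, 1, 2 * n)
post = subj_mean + drift + effect * group + rng.normal(0, 1, 2 * n)

# Long format: one row per measurement occasion
y = np.concatenate([pre, post])
time = np.repeat([0.0, 1.0], 2 * n)
grp = np.concatenate([group, group]).astype(float)

# Model: y ~ 1 + time + time:group  -- no main effect for group
design = np.column_stack([np.ones(4 * n), time, time * grp])
intercept, time_coef, interaction = np.linalg.lstsq(design, y, rcond=None)[0]
print(intercept, time_coef, interaction)
```

The intercept recovers the common baseline mean, the time coefficient the control-group change, and the interaction the extra change in the treatment group, exactly as described above.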
source
The reason to keep the main effects in the model is identifiability. Hence, if the purpose is statistical inference about each of the effects, you should keep the main effects in the model. However, if your modeling purpose is solely to predict new values, then it is perfectly legitimate to include only the interaction if that improves predictive accuracy.
source
This is implicit in many of the answers others have given, but the simple point is that models w/ a product term but w/ & w/o the moderator & predictor are just different models. Figure out what each means given the process you are modeling, and whether a model w/o the moderator & predictor makes more sense given your theory or hypothesis. The observation that the product term is significant only when the moderator & predictor are not included doesn't tell you anything (except maybe that you are fishing around for "significance") w/o a cogent explanation of why it makes sense to leave them out.
source
Arguably, it depends on what you're using your model for. But I've never seen a reason not to run and describe models with main effects, even in cases where the hypothesis is only about the interaction.
source
I will borrow a paragraph from the book An Introduction to Survival Analysis Using Stata by M. Cleves, R. Gutierrez, W. Gould, and Y. Marchenko, published by Stata Press, to answer your question.
source
Both x and y will be correlated with xy (unless you have taken a specific measure to prevent this by using centering). Thus if you obtain a substantial interaction effect with your approach, it will likely amount to one or more main effects masquerading as an interaction. This is not going to produce clear, interpretable results. What is desirable is instead to see how much the interaction can explain over and above what the main effects do, by including x, y, and (preferably in a subsequent step) xy.
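A quick numeric illustration of that collinearity point (my own sketch; the distributions are made up): when x and y live away from zero, the raw product xy is strongly correlated with x, while centering before forming the product removes most of that correlation:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10000
x = rng.normal(5, 1, n)  # positive-valued predictors, as is common in practice
y = rng.normal(5, 1, n)

# Without centering, the product term largely duplicates the main effect of x
corr_raw = np.corrcoef(x, x * y)[0, 1]

# After centering, the product is nearly orthogonal to the main effects
xc, yc = x - x.mean(), y - y.mean()
corr_centered = np.corrcoef(xc, xc * yc)[0, 1]

print(corr_raw, corr_centered)
```

This is why an apparently "significant" interaction fit without main effects can be a main effect in disguise.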
As to terminology: yes, β0 is called the "constant." On the other hand, "partial" has specific meanings in regression, so I wouldn't use that term to describe your strategy here.
Some interesting examples that arise once in a blue moon are described in this thread.
source
I would suggest it is simply a special case of model uncertainty. From a Bayesian perspective, you treat this in exactly the same way you would treat any other kind of uncertainty: either by conditioning on one model when it clearly dominates, or by averaging your conclusions over the candidate models.
This is exactly what people do when testing for "significant effects" by using t-quantiles instead of normal quantiles. Because you have uncertainty about the "true noise level", you take this into account by using a more spread-out distribution in testing. So from your perspective the "main effect" is actually a "nuisance parameter" in relation to the question you are asking, and you simply average over the cases (or, more generally, over the models you are considering):

P(Hint | D I) = Σm P(Mm | D I) P(Hint | D Mm I)

Here P(Hint | D Mm I) is the "conditional conclusion" about the interaction hypothesis Hint under the mth model (this is usually all that is considered, for a chosen "best" model). Note that this standard analysis is justified whenever P(Mm | D I) ≈ 1 (an "obviously best" model) or whenever P(Hint | D Mj I) ≈ P(Hint | D Mk I) (all models give the same or similar conclusions). However, if neither holds, then Bayes' theorem says the best procedure is to average out the results, placing higher weight on the models most supported by the data and prior information.
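As a minimal numeric sketch of that averaging step (all numbers are hypothetical, and BIC weights are used as a common rough stand-in for the posterior model probabilities P(Mm | D I)):

```python
import numpy as np

# Hypothetical: two candidate models, with and without the main effects.
# Approximate P(M|D) by BIC weights, then average the conditional
# conclusions P(H_int | D, M_m) over the models.
bic = np.array([1000.0, 1002.5])      # made-up BIC values
p_h_given_m = np.array([0.90, 0.60])  # made-up conditional probabilities

w = np.exp(-0.5 * (bic - bic.min()))
w /= w.sum()                          # model weights, sum to 1

p_h = float(w @ p_h_given_m)          # model-averaged P(H_int | D)
print(w, p_h)
```

Because neither model dominates here, the averaged conclusion sits between the two conditional ones, weighted toward the better-supported model.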
source
It is very rarely a good idea to include an interaction term without the main effects involved in it. David Rindskopf of CCNY has written some papers about those rare instances.
source
There are various processes in nature that involve only an interaction effect, and laws that describe them. For instance, Ohm's law. In psychology there is, for instance, Vroom's (1964) performance model: Performance = Ability x Motivation. Now, you might expect to find a significant interaction effect when this law is true. Regretfully, this is not the case. You might easily end up finding two main effects and an insignificant interaction effect (for a demonstration and further explanation, see Landsheer, van den Wittenboer and Maassen (2006), Social Science Research 35, 274-294). The linear model is not well suited for detecting interaction effects; Ohm might never have found his law had he used linear models.
As a result, interpreting interaction effects in linear models is difficult. If you have a theory that predicts an interaction effect, you should include it even when it is insignificant. You may want to omit the main effects if your theory excludes them, but you will find that difficult, since significant main effects are often found even when the true data-generating mechanism has only a multiplicative effect.
My answer is: yes, it can be valid to include a two-way interaction in a model without including the main effects. Linear models are excellent tools for approximating the outcomes of a large variety of data-generating mechanisms, but their formulas cannot easily be interpreted as a valid description of the data-generating mechanism.
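A quick simulation of that phenomenon (my own sketch): data generated by a purely multiplicative mechanism y = x1*x2, with no additive main effects at all, are nevertheless fit almost perfectly by main effects alone, so an added interaction term can easily come out insignificant:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
x1 = rng.uniform(0.5, 1.5, n)
x2 = rng.uniform(0.5, 1.5, n)
y = x1 * x2  # purely multiplicative "law", no additive main effects at all

def r_squared(design, y):
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1 - resid.var() / y.var()

# Main effects only: already explains almost all of the variance,
# leaving very little for the interaction term to pick up.
main_only = np.column_stack([np.ones(n), x1, x2])
r2 = r_squared(main_only, y)
print(r2)
```

Away from zero, x1*x2 is locally close to the plane x1 + x2 - 1, which is why the additive fit masquerades so well.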
source
This one is tricky, and it happened to me in my last project. I would explain it this way: let's say you had variables A and B which came out significant independently, and from business sense you thought that an interaction of A and B seemed reasonable. You included the interaction, which came out significant, but B lost its significance. You would explain your model by showing both results. The results would show that B was significant initially, but when seen in light of A it lost its sheen. So B is a good variable, but only when seen in light of the various levels of A (if A is a categorical variable). It's like saying Obama is a good leader when seen in light of his SEAL army, so Obama*SEAL will be a significant variable; but Obama seen alone might not be as important. (No offense to Obama, just an example.)
source
F = m*a, force equals mass times acceleration.
It is not represented as F = m + a + ma, or some other linear combination of those parameters. Indeed, only the interaction between mass and acceleration would make sense physically.
source
Yes, it can be valid and even necessary. If, for example, in case 2 you were to include a factor for the main effect (the average difference between the blue and red conditions), this would make the model worse.
Your hypothesis might be true independently of whether there is a main effect. But the model might need that term to describe the underlying process best. So yes, you should try it with and without.
Note: you need to center the coding of the "continuous" independent variable (measurement, in the example). Otherwise the interaction coefficients in the model will not be symmetrically distributed (there would be no coefficient for the first measurement in the example).
source
If the variables in question are categorical, then including interactions without the main effects is just a reparameterization of the model, and the choice of parameterization depends on what you are trying to accomplish with your model. Interacting continuous variables with other continuous variables or with categorical variables is a whole different story. See this FAQ from UCLA's Institute for Digital Research and Education.
source
Yes this can be valid, although it is rare. But in this case you still need to model the main effects, which you will afterward regress out.
Indeed, in some models, only the interaction is interesting, such as drug testing/clinical models. This is for example the basis of the Generalized PsychoPhysiological Interactions (gPPI) model:
y = ax + bxh + ch

where x and y are voxels/regions of interest and h is the block/event design. In this model, both a and c will be regressed out; only b will be kept for inference (the beta coefficients). Indeed, both a and c represent spurious activity in our case, and only b represents what cannot be explained by spurious activity: the interaction with the task.
source
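A toy version of that inference scheme (my own sketch with made-up coefficients; real gPPI is estimated on fMRI time series): fit all three regressors, then keep only the interaction coefficient b:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
x = rng.normal(0, 1, n)                  # "seed region" signal
h = (rng.uniform(0, 1, n) < 0.5) * 1.0   # block/event design (on/off)

# Hypothetical truth: a and c are spurious couplings; b is the
# task-modulated (interaction) coupling we actually care about.
a_true, b_true, c_true = 0.3, 1.2, 0.5
y = a_true * x + b_true * x * h + c_true * h + rng.normal(0, 0.5, n)

# Fit y = ax + b(xh) + ch; a and c are nuisance terms kept in the
# model only so that b is not contaminated by them.
design = np.column_stack([x, x * h, h])
a_hat, b_hat, c_hat = np.linalg.lstsq(design, y, rcond=None)[0]
print(b_hat)  # only b is kept for inference
```

Note the main effects are still in the design matrix; they are simply not the target of inference.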
The short answer: If you include interaction in the fixed effects, then the main effects are automatically included whether or not you specifically include them in your code. The only difference is your parametrization, i.e., what the parameters in your model mean (e.g., are they group means or are they differences from reference levels).
Assumptions: I assume we are working in the general linear model and are asking when we can use the fixed-effects specification AB instead of A + B + AB, where A and B are (categorical) factors.
Mathematical clarification: We assume that the response vector Y ∼ N(ξ, σ²In).
If XA, XB and XAB are the design matrices for the three factors, then a model with "main effects and interaction" corresponds to the restriction ξ ∈ span{XA, XB, XAB}.
A model with "only interaction" corresponds to the restriction ξ ∈ span{XAB}.
However, span{XAB} = span{XA, XB, XAB}, because each column of XA (and of XB) is a sum of columns of XAB. So these are two different parametrizations of the same model (or of the same family of distributions, if you are more comfortable with that terminology).
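The span equality is easy to verify numerically (my own sketch, using full indicator coding for two factors with two levels each):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 12
A = rng.integers(0, 2, n)   # factor A with levels {0, 1}
B = rng.integers(0, 2, n)   # factor B with levels {0, 1}

# Full dummy (indicator) coding: one column per level / per cell
XA = np.column_stack([A == 0, A == 1]).astype(float)
XB = np.column_stack([B == 0, B == 1]).astype(float)
XAB = np.column_stack(
    [(A == i) & (B == j) for i in (0, 1) for j in (0, 1)]
).astype(float)

# Each XA column equals the sum of the two XAB cell columns sharing
# that A level, so adding XA and XB to XAB cannot enlarge the span.
rank_inter = np.linalg.matrix_rank(XAB)
rank_full = np.linalg.matrix_rank(np.column_stack([XA, XB, XAB]))
print(rank_inter, rank_full)  # equal ranks: same model space
```

The two specifications therefore fit identically; only the meaning of the coefficients changes.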
I just saw that David Beede provided a very similar answer (apologies), but I thought I would leave this up for those who respond well to a linear algebra perspective.
source