Abstract:
Based on the observation that scale-level methods are sometimes used exclusively to investigate measurement invariance in test translation, this article reports a simulation study investigating whether item-level differential item functioning (DIF) manifests itself in scale-level analyses such as single-group and multigroup factor analyses and per-group coefficient alpha. The simulation factors were two levels of DIF magnitude (moderate and large) and four levels of the percentage of items with DIF (ranging from approximately 3% to 41% of the items). The results indicate that item-level DIF did not manifest itself in the scale-level results. Translation efforts in language testing should therefore establish measurement equivalence by investigating item-level translation DIF; relying only on scale-level results as evidence of translation equivalence may be misleading.