Statistical analysis techniques based on Cross-Cultural research methods: cross-cultural paradigms and intra-country comparisons

Accumulated cross-cultural research has shown that its methods can also apply within countries, especially as increasingly diverse immigrants and sojourners flow into host countries and the need to deal at least with acculturation issues becomes pressing. Cross-cultural methodology approximates research on intra-country issues, since comparing groups with different characteristics within countries may also reflect different “cultures” represented by each of the groups. A question of bias elimination is raised when such comparisons are attempted under either a cross-cultural or an intra-country scope. Taking the van de Vijver and Leung and the Poortinga and van de Vijver theories on bias in terms of culture as a starting point, a threefold paradigm employing factor analysis and other techniques is presented on: (a) the application of simple congruence coefficients in estimating factor similarity (that is, basic factor equivalence testing), along with a proposed method of taking advantage of the Tucker coefficient matrix for a set of two or more factor structures, (b) the within-country application of multilevel covariance structure analysis and Procrustean rotations for a set of between-groups and pooled-within correlation matrices, and (c) the reduction of “bias in terms of culture” by eliminating variance components through multivariate methods. By incorporating some of these methods in standard within-country psychological research, we should be able to gain on theoretical and psychometric grounds, and we may finally question the degree of construct similarity among groups within a country, which cannot necessarily be taken for granted. These considerations are closely related to the use of multilevel analyses, as these stem from Cross-Cultural Psychology through most forms of intra-country and/or inter-country comparisons.


Introduction
This study addresses some of the methodological gains we have accumulated during the last 40 years or so, through the application of Cross-Cultural research methods and statistical techniques; we were quite fortunate to be reminded of such profits by Gustav Jahoda in his Walter J. Lonner Distinguished Lecture Series Inaugural Speech during the 18th International Congress of the International Association for Cross-Cultural Psychology (Jahoda, 2006).
Cross-Cultural Psychology has balanced itself for a long time between the similarities and the differences (Georgas, Berry, & Kagitcibasi, 2006; Georgas & Mylonas, 2006; Poortinga & van de Vijver, 1987) across two or more countries, under the hypothesis that these countries reflect two or more separate cultures (the verification of such a claim can be easily traced along most of the last 35 years of scientific literature published in journals such as the Journal of Cross-Cultural Psychology, Cross-Cultural Research, and international psychological journals hosting cross-cultural studies). A "culture" issue has also been raised with respect to its conceptual and operational definition: "culture" is difficult to define, and its operationalization when comparing "cultures" (Segall et al., 1990; Stigler, Shweder, & Herdt, 1990) is quite often a dangerous field. Georgas and Berry (1995) have specifically argued that the operationalization of culture is "mistakenly" equated to country. Such a Cross-Cultural Psychology "balancing" has produced several other theoretical concerns as to how to test for similarities or for differences, or for both (Poortinga, 1989; van de Vijver & Leung, 1997). The "definition of culture" issue has been largely debated (Hofstede, 1980; Kagitcibasi & Poortinga, 2000; Kim, Park, & Park, 2000; Segall et al., 1990); a culture may not necessarily be synonymous with a country (Georgas & Berry, 1995) but can be dependent on different cultural groups of any kind (e.g., immigrants or sojourners vs. natives, different generations, the two sexes, etc.). In this sense, Cross-Cultural Psychology's statistical methods do not reflect methodology in cross-cultural research only, but also the feedback of such methods into intra-country research and any kind of within-groups statistical computing and inference procedures. Additionally, many research attempts in the last decades have raised methodological concerns and have produced other, more specifically focused but equally important questions, such as extremity scoring interference (e.g., Welkenhuysen-Gybels, Billiet, & Cambré, 2003), social desirability effects (e.g., Johnson & van de Vijver, 2003), metric, scalar and construct equivalence (e.g., van de Vijver & Tanzer, 1997), bias in terms of culture (e.g., Byrne & Watkins, 2003), and of course multilevel research in cross-cultural psychology (van de Vijver, van Hemert, & Poortinga, 2008).
From all the above, it seems clear that an important lesson we have learned from cross-cultural methodology is that its methods do not necessarily have to cross national borders to be useful in terms of statistical comparison. A large proportion of cross-cultural research has stemmed from the issue of acculturation (e.g., Berry, 1990, 1992; Furnham & Bochner, 1990; Georgas et al., 1996). Acculturation processes take place within the narrow borders of one nation, and this may concurrently hold true for many nations. As acculturation was studied in terms of the interaction of the host culture with the immigrant, sojourner, etc. culture (e.g., Furnham & Bochner, 1990), it produced a vast literature reflecting the mechanisms pertaining to the acculturation process, followed by acculturation consequences within the specific culture (e.g., Georgas & Papastylianou, 1996). Such paradigms can be especially important for countries such as Greece, as the incoming flow of immigrants has been rising (Emke-Poulopoulou, 2007) for the last 10 years or so but also keeps "coming through in waves". This last parameter can be of extreme importance, as immigrants may or may not get acculturated in the usually limited amount of time for which they bear a 'sojourner' identity (Ward, Leong, & Low, 2004) within a host country, as enculturation, the process of "learning of the knowledge and beliefs of the social group in the society of origin" (Nauck, 2008, p. 379), precedes acculturation. In brief, acculturation processes, as studied within a country, are the closest example of how we can conduct research using cross-cultural methodology without comparing countries, but "cultures" in contact. Such contacting cultures could also be minority groups within the country, separate religious denominations, overseas students, etc. However, it is not the aim of the present study to follow all cross-cultural leads; this would be impossible. Its aim is to "translate" some of the cross-cultural research and statistical methods into a comprehensible "country-bound" perspective, in such a way that interested researchers can apply the methods to intra-country groups bearing identities sufficient to suggest culturally diverse personalities, attitudes, beliefs, values, temperament, motives, development, adjustment, abilities, and so forth. Gender roles or differences between the two sexes can accommodate such diversities, since they represent the standard "different cultures" question asked within the same country or national identity (e.g., different norms for males and females, testing for sex differences through analysis of variance designs, methodologically retaining just one sex in some sample, etc.). Under this rationale, this "country-bound" perspective is obviously cross-cultural, but in an indirect way.
Three statistical methods, stemming from and/or related to cross-cultural research, will be presented in the following pages. (a) The first refers to the use of factor analysis in the quest for possible factor equivalence among two or more factor solutions for presumably different cultural groups; Tucker's Phi (φ) coefficients are employed for this basic procedure. This method will be expanded through the use of the "hit" matrix (Georgas & Mylonas, 2006; Mylonas, Pavlopoulos, & Georgas, 2008). (b) The second statistical method presented also stems from factor analysis, although it is associated with estimable functions within and between the cultural groups and the attempt to arrive at universal factor solutions through Procrustean rotations, achieving the maximum level of equivalence so that further comparisons across groups are justified. In this sense, this second method is another expansion of the basic Tucker's φ quest for equivalence. (c) Finally, the third statistical method refers to the use of other multivariate techniques, specifically Multidimensional Scaling, in the attempt to reduce error variance caused by the very fact of cultural grouping, otherwise called "bias in terms of culture".
The three selected "paradigm" cases should easily demonstrate their potential use while comparing intra-country "cultural" groups or even under any case of differential group analysis in the sense that in order to enhance such a comparison, we first need to ascertain, at least to some extent, that the constructs are indeed comparable across those within-country groups.
Separate, if brief, attention should be drawn here to the "Equivalence" and "Bias in terms of culture" concepts. The latter has been systematically addressed by theorists and researchers in Cross-Cultural Psychology, with Poortinga setting the scene back in 1989, arguing on several ways of dealing with the artifacts caused by this specific type of bias. In a satisfactory cross-cultural study there is no variance left to be explained in terms of culture (Poortinga & van de Vijver, 1987), and cultural variance should be reduced to zero to derive comparable measures and cross-culturally meaningful structures. To further clarify this, a "comparison scale" vs. "measurement scale" differentiation was also described by Poortinga (1989). In a cross-cultural comparison with respect to some variable, differences in scores between cultural groups can reflect valid differences in the construct measured, or they can result from measurement artifacts or bias. In a comparison affected by bias, the relationship of the measurement scale with the comparison scale is not the same for the different groups. This issue is tightly connected to the concept of equivalence. Construct equivalence refers to the equivalence of a construct across cultures. If constructs are not identical across cultures, comparisons of items or scores on a questionnaire are not possible. A combination of exploratory factor analysis and target rotations, along with an estimation of the degree of factorial agreement, most often Tucker's φ index, is commonly employed to describe levels of construct equivalence. Thus, the quest for construct equivalence is usually operationalized through factor structure testing, resulting in factor equivalence testing.

Paradigm #1. Tucker's φ coefficients and their application in basic factor equivalence testing
Tucker introduced his congruence coefficient in 1951, a method for computing the correlation between two vectors, as these represent the loadings of a set of variables in two separate factor solutions. This is not the only method for such a comparison, as Cattell has also introduced (Cattell & Baggaley, 1960) his Salience (S) coefficient to address the same question. The Salience coefficient (Tabachnick & Fidell, 1989) is accompanied by probability estimates and is not a correlation coefficient. Additionally, this coefficient is much more stringent with the data, while its computation is somewhat laborious. It is therefore not surprising that the literature (especially the cross-cultural one) generally follows the easier (and much more tolerant) Tucker's φ. For this congruence (or proportionality) index, a |φ| ≥ 0.90 level of congruence is needed to consider two vectors identical (the 0.95 level is preferable, though). Absolute indices less than 0.90 but close to it denote similarity but not identity, whereas low absolute indices, such as 0.75 or lower, indicate dissimilarities between the vectors. The computation formula for this coefficient is given in (1):

$$\varphi = \frac{\sum_{i=1}^{k} X_i Y_i}{\sqrt{\sum_{i=1}^{k} X_i^{2} \, \sum_{i=1}^{k} Y_i^{2}}} \qquad (1)$$

where $k$ corresponds to the number of items in each vector under comparison (same for both vectors), and $X_i$ and $Y_i$ are the elements within each vector.
A brief example for a set of 15 family values drawn from the European Value Study (Arts & Halman, 2004) is presented next. For these 15 values regarding "important issues within marriage" we selected just two countries out of the 32 involved, for illustration purposes. Separate three-factor structures were computed (via the principal components method followed by orthogonal rotation of the axes) and are presented in Table 1. For these factors we computed all nine (3×3) Tucker's φ indices. These did not reach identity levels in any case; however, some similarity was present. Specifically, the first Dutch factor as compared to the first Greek factor resulted in a Tucker's φ of 0.46; for the first Dutch factor as compared to the second Greek factor, φ=0.47; for the first Dutch factor and the third Greek factor, φ=0.53. Respectively, for the second Dutch factor and the first to third Greek factors, Tucker's φ reached 0.84, -0.19, and 0.52. For the last Dutch factor and the first to third Greek factors, Tucker's φ reached 0.21, 0.85 and 0.28, respectively.
One might wonder what is new in all this. Obviously, Tucker's congruence coefficients are not a novel issue, but equivalence-level comparisons such as the ones described above assume that the factor structure for each country-culture is the product of homogeneous population subsets, with their structural components harmonized towards the factor structure tested in the initial country comparison. However, several conditions have to be met before such an assumption can be made, such as the condition that the two sexes do not differ in their factor structure within each culture. A sex comparison should be psychometrically and theoretically appropriate, but is often considered unnecessary. Previous research may have accounted for such testing, but if not, how can one compare countries when the intra-country data may be plagued by unaccounted error variance? One needs to test for sex (and possibly other) structural differences within each culture separately, and this can of course be done through Tucker's φ indices. A brief example is employed here (Table 2) to illustrate the possible structural dissimilarities between the two sexes within the same country (the Netherlands in this example). For these intra-country factor structures, Tucker's φ did not exceed 0.88 in any case, revealing only some similarity between the first Male and the second Female factors. Specifically, the first Male factor as compared to the first Female factor resulted in a Tucker's φ of 0.16; for the first Male factor as compared to the second Female factor, φ=0.88; for the first Male factor as compared to the third Female factor, φ=0.72. Respectively, for the second Male factor and the first to third Female factors, Tucker's φ reached 0.73, -0.23, and 0.27. For the last Male factor and the first to third Female factors, Tucker's φ reached only 0.26, 0.05 and 0.65.
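The congruence computation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function names `tucker_phi` and `phi_matrix` and the loading values are hypothetical and are not taken from Tables 1 or 2.

```python
import numpy as np

def tucker_phi(x, y):
    """Congruence coefficient between two loading vectors of equal length k."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))

def phi_matrix(loadings_a, loadings_b):
    """All pairwise phi indices between the factors (columns) of two solutions."""
    a = np.asarray(loadings_a, dtype=float)
    b = np.asarray(loadings_b, dtype=float)
    cross = a.T @ b  # sums of cross-products for every pair of factors
    norms = np.sqrt(np.outer((a ** 2).sum(axis=0), (b ** 2).sum(axis=0)))
    return cross / norms

# Hypothetical loadings for 4 items on 2 factors (illustrative numbers only).
group_a = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.6]])
group_b = 2.0 * group_a  # proportional loadings, so matching factors are congruent

print(phi_matrix(group_a, group_b))
```

For two three-factor solutions, `phi_matrix` would return the 3×3 matrix of indices of the kind reported above; absolute values of at least 0.90 would then be read as factor identity.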
Thus, separate factor structures can be computed for each differential group within each country, and then formula (1) can be applied to test for factor equivalence within countries before proceeding to the across-countries comparison of structures. Such an approach delays cross-cultural analyses, but if such fundamental knowledge of intra-country homogeneity is not given (for a few but possibly strong correlates, such as sex), it does not seem very safe to proceed with cross-cultural analyses, since we cannot be certain that the differences themselves are "equivalent" across the cultures. Additionally, and regardless of the implications for cross-cultural psychology, it is self-evident that Tucker's φ allows for the comparison of any differential groups with respect to their factor structures within a country (or culture). Tucker's φ indices become very useful when a large number of countries, cultures, or groups is available. An attempt to account for the overall level of similarity or identity among the 32 countries involved in the European Value Study has revealed both identities and dissimilarities across countries (Georgas & Mylonas, 2006), meaning that some countries can be considered identical in their factor structure for a set of variables and dissimilar to another set of countries, which in turn are congruent in their factor structures within the set. This method can of course be applied to any collection of multiple data sets (differential groups). Examples would be intra-country comparisons of occupational groups (Mylonas & Georgiadi, 2004), age groups (Georgas et al., 2003), socioeconomic status groups (Mylonas & Xanthopoulou, 2007), permanent residence locations, religious denomination groups, educational level groups, language groups, etc., literally any differential grouping based on any ecosocial indicator (Georgas & Berry, 1995; Georgas, van de Vijver, & Berry, 2004). All factor analysis assumptions have of course to be met for such a strategy of analyzing each group separately to be statistically justified (Kline, 1993).

The following intra-country illustration is based on the study of values (Allport, Vernon, & Lindzey, 1951; Spranger, 1928). The questionnaire employed (in its Greek adaptation, Mylonas, 1994) consists of 54 items (nine items per scale) distributed in six value scales (Theoretical, Economic, Aesthetic, Social, Political, and Religious values) and has provided information on the value systems of student samples (Gari, Mylonas, & Karagianni, 2005; Mylonas, 1994). For this paradigm, we analyzed only the Religious values scale (a scale addressing human existence and introspective issues) and we did so for the six student subsamples involved in the 1994 study: Departments of Mathematics, Economics, Literature, Medicine, Sports Science (Fine Arts), and Theology. These subsamples were of unequal N, but complied with the necessary factor analysis assumptions.
For two-factor solutions, the nine items of the scale resulted in a Tucker φ matrix of 15 two-by-two paired comparison matrices accompanied by a diagonal of six identity matrices. From this matrix it was easy to arrive at a "hit" matrix (Georgas & Mylonas, 2006; Gari, Panagiotopoulou, & Mylonas, 2008; Mylonas, Pavlopoulos, & Georgas, 2008) for which the diagonal contains zeros, as dissimilarities rather than similarities are coded, and on either side of the diagonal it contains instances of inequivalence for each of the 15 two-by-two paired comparison matrices (in this case a minimum of 0 holds, when both factors are identical in both solutions, and a maximum of 2, when no factors are identical). Only 27% of the 30 (=2×15) maximum possible hits (equivalence instances) were present in the Tucker φ "hit" matrix for these six samples, indicating possible structural differences to be further described.
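The construction of such a matrix can be sketched as follows. This is only one possible reading of the counting rule, offered as an assumption: for each pair of groups, factors are paired one-to-one in the most favorable way, and every pair whose |φ| falls below 0.90 counts as one instance of inequivalence. The function names and the φ values are hypothetical.

```python
import itertools
import numpy as np

def inequivalence_count(phi, threshold=0.90):
    """Fewest non-identical factor pairs over all one-to-one factor pairings."""
    phi = np.abs(np.asarray(phi, dtype=float))
    m = phi.shape[0]
    best = m
    for perm in itertools.permutations(range(m)):
        misses = sum(1 for i, j in enumerate(perm) if phi[i, j] < threshold)
        best = min(best, misses)
    return best

def hit_matrix(phi_blocks, n_groups, threshold=0.90):
    """phi_blocks maps a pair (g, h), g < h, to the phi matrix of that pair."""
    hits = np.zeros((n_groups, n_groups), dtype=int)  # zero diagonal
    for (g, h), phi in phi_blocks.items():
        hits[g, h] = hits[h, g] = inequivalence_count(phi, threshold)
    return hits

# Hypothetical 2x2 phi matrices for three groups (illustrative numbers only).
blocks = {
    (0, 1): [[0.95, 0.10], [0.05, 0.93]],  # both factors identical
    (0, 2): [[0.20, 0.96], [0.50, 0.30]],  # one factor identical (crossed order)
    (1, 2): [[0.40, 0.30], [0.20, 0.50]],  # no factor identical
}
print(hit_matrix(blocks, n_groups=3))
```

With six groups and two factors per solution, `blocks` would hold the 15 paired comparison matrices and each off-diagonal entry would range from 0 to 2, as described above.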
To describe the equivalence levels in terms of differential groups, we employed this same hit matrix as the input to a Multidimensional Scaling solution, in order to test for possible homogeneous sets of differential groups ("Departments") as defined by the levels of equivalence of their factor structures. For the analysis we employed the Chebychev definition of distances, based on the maximum absolute difference between the values for the items, in order to statistically maximize the possibility of non-homogeneity and differentiation among the groups. The ordinal level of measurement was adopted, as the raw scores represented levels of equivalence (calculated as instances of inequivalence); these instances of inequivalence are not of the same magnitude in the Tucker φ matrix, nor are they of the same importance or meaning, thus the ordinal level of measurement is more appropriate. A two-dimensional solution was computed and the two sets of coordinates were plotted on the circumference through trigonometric transformations.
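The distance definition mentioned above can be illustrated with a short sketch. It assumes that each row of the hit matrix is treated as the profile of one group and that the resulting distance matrix is then passed to an ordinal (non-metric) multidimensional scaling routine, which is not shown here; the hit matrix below is hypothetical.

```python
import numpy as np

def chebyshev_distances(rows):
    """Pairwise Chebyshev distance: the maximum absolute difference between rows."""
    rows = np.asarray(rows, dtype=float)
    diff = np.abs(rows[:, None, :] - rows[None, :, :])
    return diff.max(axis=2)

# Hypothetical hit matrix for three groups (instances of inequivalence).
hits = np.array([[0, 0, 1],
                 [0, 0, 2],
                 [1, 2, 0]])

print(chebyshev_distances(hits))
```

Because only the largest single discrepancy between two profiles is retained, this definition tends to accentuate differences among the groups, which is the stated reason for its choice.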
A separate note should be made on these transformations; they aim at simplifying the patterns present in the two-dimensional area, in order to constrain the plotted items upon a circular continuum by calculating the arctangent values for coordinate inputs (Gari, Panagiotopoulou, & Mylonas, 2008; Sidiropoulou, Mylonas, & Argyropoulou, 2008; Veligekas, Mylonas, & Zervas, 2007). Information on the linear departure of each item from the origin is not used, but the information on the clustering of these items upon the same or neighboring radius points on a circumference can be of great explanatory power, at least at an initial interpretation level. The computational procedure to transform each set of coordinates to the quadrant-specific arctangent value (floating point partial arctangent) expressed in radians (−π, π) is described through formulae (2) to (8) in Table 3.
To avoid laborious calculations, the function "ATAN2" was implemented some time ago in programming languages such as C++, BASIC or FORTRAN-like ones and in mathematical function software libraries, and is also incorporated in popular packages such as Microsoft Excel. Through this function, the user can easily compute the quadrant-specific arctangent in radians for a set of two coordinates and then transform it to degrees on the circumference using formula (4). The outcomes for the current paradigm are presented in Figure 1 along with the respective trigonometric transformation plot.
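As an illustration of this shortcut, the transformation can be sketched with Python's standard `math.atan2`. The formulae of Table 3 are not reproduced here, so the mapping of the result onto a 0 to 360 degree range is an assumption of this sketch, and the function name and coordinates are hypothetical.

```python
import math

def circumference_angle(x, y):
    """Quadrant-specific arctangent of point (x, y), in degrees on [0, 360)."""
    radians = math.atan2(y, x)  # lies between -pi and pi
    return math.degrees(radians) % 360.0

# Hypothetical two-dimensional scaling coordinates for three groups.
coordinates = {"Group A": (1.2, 0.4), "Group B": (-0.8, 0.9), "Group C": (-0.3, -1.1)}
for name, (x, y) in coordinates.items():
    print(name, round(circumference_angle(x, y), 1))
```

Groups whose angles fall close together on the circumference would be read as candidates for the same cluster, while the distance of each point from the origin is deliberately discarded.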
The circumference plot is quite revealing with respect to the homogeneous groups as defined by their factor equivalence levels. Although the Sports Science (Fine Arts) subsample unexpectedly matched the Medicine structure, the overall clustering of departmental groups can be clearly considered interesting and certainly useful, with its further implementation in computing the factor structure for each of the two clusters, namely the Literature and Economics cluster and the Mathematics, Theology, Medicine and Sports Science (Fine Arts) cluster of groups. Such clustering of countries has yielded interesting and theoretically sound results in previous research (Gari, Panagiotopoulou, & Mylonas, 2008). The overall method of trigonometric transformation of the multidimensional scaling coordinates has also been employed for clustering of variables, with exceptionally interesting and sound findings (Georgas et al., 2004; Gari & Mylonas, 2006).
This first paradigm can be summarized in two main points: (a) Exploring for factor equivalence across within-country groups is by itself a "prerequisite" for further comparison to follow across these groups. In order to be able to interpret possible mean similarities or differences, it would be very supportive, if not necessary, to have already shown that the groups under comparison are indeed comparable on the construct level. (b) Employing the hit matrix method along with the trigonometrically transformed multidimensional scaling solution for the information on the levels of factor equivalence across a number of within-country groups can be either of metric importance (same constructs assessed), of theoretical importance (clusters of groups), or both.

Table 3. Arctangent transformation for point (y, x) on a (−π, π) range

Figure 1. Multidimensional scaling overview for the Religious Value Scale (N=462)

Paradigm #2. Achievement Goal Orientation theory and Multilevel Covariance Structure analysis
Another way to test for construct equivalence in Cross-Cultural Psychology is to employ the Muthén method (1994, 2000) as extended to factor analysis and presented in detail by van de Vijver and Poortinga (2002). This method takes advantage of the multilevel covariance structure in the data to compute estimated between-groups and pooled-within-groups correlation matrices as the best estimates of correlation matrices to factor analyze, if the conditions are met. The outcomes of this analysis are then target-rotated (Procrustean rotation of one matrix on the other) and result in a final rotated matrix of loadings showing the overall factor structure for both groups, if the intra-class correlation coefficients allow for universality assumptions to be stated. For reasons of brevity, and since our aim is to show that the method is useful for construct equivalence testing in within-culture differential groups, we will only present here an intra-country paradigm, considering the cross-cultural paradigm given (e.g., Poortinga & van de Vijver, 2004). Our intra-country paradigm refers to a set of 12 goal-setting items as stated by the Achievement Goal Orientation theory (Nicholls, 1984; Roberts, Treasure, & Balague, 1995, 1998). This theory describes two goal orientations, namely "task" and "ego" goal orientation. For the current paradigm, the sample consisted of 483 Greek track and field athletes (386 males and 177 females), with a median age of 19 (age range 15-33). These athletes responded to the Perception of Success Questionnaire (POSQ; Roberts, Treasure, & Balague, 1995, 1998), a 12-item Likert-type scale questionnaire. The results for this sample (Veligekas, Mylonas, & Zervas, 2007) refer to the overall sample, but what if we questioned the homogeneity of possible subsamples (i.e., types of events, like running events vs. jumping and throwing events)? We separated the overall sample into two "event" subsamples to answer this question and then followed the Muthén method as extended to factor analysis by van de Vijver and Poortinga. The results for both subsamples and for the target-rotated solution after the implementation of the multilevel covariance structure analysis method are presented in Table 4.
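The two matrices on which this method operates can be sketched as follows. This is a simplified illustration that computes the sample pooled-within and between-groups covariance matrices and converts them to correlations; it does not reproduce the full estimation procedure of the Muthén method, and the function names and simulated data are hypothetical.

```python
import numpy as np

def pooled_within_and_between(data, groups):
    """Sample pooled-within and between-groups covariance matrices."""
    data = np.asarray(data, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, p = data.shape
    grand_mean = data.mean(axis=0)
    s_within = np.zeros((p, p))
    s_between = np.zeros((p, p))
    for g in labels:
        block = data[groups == g]
        centered = block - block.mean(axis=0)  # deviations from the group mean
        s_within += centered.T @ centered
        d = (block.mean(axis=0) - grand_mean)[:, None]
        s_between += len(block) * (d @ d.T)    # weighted by group size
    return s_within / (n - len(labels)), s_between / (len(labels) - 1)

def to_correlation(cov):
    """Rescale a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Simulated scores: 40 respondents, 3 items, two groups of unequal size.
rng = np.random.default_rng(0)
scores = rng.normal(size=(40, 3))
membership = np.array([0] * 15 + [1] * 25)
s_pw, s_b = pooled_within_and_between(scores, membership)
print(to_correlation(s_pw))
```

The pooled-within correlation matrix is the one that would be factor analyzed for the individual-level structure; with only two groups the between-groups matrix has rank one, so in that case it mainly serves as a check on how much variance lies between groups.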
The intra-class correlation coefficients resulted in an absolute average of 0.003, which was extremely promising in terms of universality (with >95% invariance, there is no non-invariance to be modeled, according to Muthén). Thus, universality would be expected, as was also evident from the factor solutions for each group. Indeed, after the Procrustean rotation the respective Tucker's φ indices reached 1.00 for both factors, and the rotated solution is the one presented in Table 4. The interesting issue here was to see what the final loadings would be for this Procrustean solution, since the two separate-group solutions revealed differences in the relative power of each of the items in the solutions. The main difference from the throwing events structure is the 0.10 loading increase for the last item, whereas all other items seem to generally compose both solutions into an overall one; actually, the target-rotated solution is very close to the overall solution computed directly on the raw data for the total sample. Indeed, this is a clear-cut case of within-country universality, at least for the groups at hand, so a cross-cultural comparison of such groups would be enhanced by the homogeneity of the specific group factor structures, and the validity of construct assessment across groups is also a major within-country comparison advantage. However, this is not always the case, and caution should be exercised regarding intra-country equivalence before testing for cross-cultural structural, metric or scalar equivalence (van de Vijver & Leung, 1997). Testing within a country for possibly different cultures can of course be an autonomous procedure, and the method as described here can be of valuable assistance in such testing.
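The rotation-and-congruence step can be sketched as follows. This is a minimal illustration that uses an orthogonal Procrustes rotation computed through a singular value decomposition; the target rotation actually employed in the study may differ in its details, and the function names and loadings are hypothetical.

```python
import numpy as np

def procrustes_rotate(source, target):
    """Orthogonally rotate `source` loadings to best match `target` (least squares)."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)

def column_phi(a, b):
    """Tucker's phi between corresponding factors (columns) of two solutions."""
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))

# Hypothetical target loadings and a source solution that differs only by a rotation.
target = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.6]])
theta = 0.5
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
source = target @ rotation

rotated = procrustes_rotate(source, target)
print(column_phi(rotated, target))
```

In this constructed case the two solutions span the same factor space, so the indices after rotation reach identity levels; with real subsamples, values below 0.90 after rotation would point to inequivalent factors.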

Paradigm #3. Reducing bias in terms of culture, or in terms of intra-country groups suspect of bias inflation
The final paradigm-illustration is not a purely factor-analytic one but is mostly based on multidimensional scaling methods instead. Factor analysis only provides the initial solution to be processed through the methods proposed within this paradigm, and is then employed again only at the final stages, to compare the results to the ones computed at the initial stage. The current paradigm is presented in two parts, the first one computed on data from different countries, and the second part computed on intra-country group data, as it can perfectly fit intra-country group comparisons. The starting point is a factor structure computed through methods similar to those of paradigm #2, in order to arrive at an overall factor structure for a number of countries or cultures with as high levels of universality as possible. Still, discrepant items are most of the time present in the solutions and are often due to metric errors (such as social desirability or acquiescence effects) or to bias in terms of culture (both sources of bias can of course be active, along with other sources).
Construct bias may confound factor structures, as these are computed for more than one culture. Inequivalence in cross-cultural studies can be mainly attributed to variance in terms of culture, which has to be reduced to null for the factor structures to be meaningful (Poortinga & van de Vijver, 1987). Removing bias at the item level does not necessarily remove construct bias, since it is wrong to interpret all types of bias as item bias only (Poortinga, 1989; van de Vijver & Leung, 1997). A biased item, though, can be treated as a disturbance at the item level that has to be removed in ways such as the computational method of van Hemert, Baerveldt, and Vermande (2001). A drastic way would be to remove such items altogether, but with obvious counter-effects. Yet another method would be to circumvent the cultural bias effect by controlling for external ecosocial indicators (Georgas & Berry, 1995; Georgas et al., 2004).

For the present paradigm and the method proposed through it, a factor structure has been computed (as also presented by the author in the 2003 6th IACCP European Regional Congress in Budapest) for a set of 20 paternal roles within the family (van de Vijver et al., 2006) for a set of six countries (Greece, Georgia, U.S.A., Germany, Indonesia, and Pakistan).The loadings for the overall sample (N=1,655), as computed through multilevel covariance structure analysis followed by Procrustean rotation (Stage 1), are presented in Table 5.
It is clear in this solution (Stage 1) that only the "takes care of grandparents" item does not load on any of the three factors. Additionally, a number of items are cross-loading on more than one factor. These findings might be partly due to bias in terms of culture, which could be inflating the error terms or might be introducing unwanted metric or construct inequivalence. The question is, as Poortinga and van de Vijver might have put it, "Is there still variance left to be explained in terms of culture?" (1987) and, as they might have continued, "Cultural variance should be reduced to null to derive comparable measures and cross-culturally meaningful factor structures." However, as Poortinga has discussed (6th IACCP European Regional Congress, July 2003, Budapest), any method employed to reduce bias in terms of culture should not discard too much of the error variance, because there would be no variance left to explain. Proceeding with minor adjustments in order to cautiously account for bias, at least to some extent, sounds wiser; however, the outcomes themselves can show whether such a cautious approach needs to be bolder or not.
In cross-cultural comparisons, if a factor-equivalent structure for a set of countries is the target, then the items to be factor analyzed can themselves serve as a source for estimating bias. That is, for a set of items across countries, indices can be computed that contain information about the variance explained only in terms of culture. Thus, accounting for cultural variance can be achieved by estimating, for a set of items, the amount of variance caused by "culture", using the information provided by these same items. Such estimates can be computed in the way described below, through Multidimensional Scaling models.
Multidimensional scaling has been widely used to model cross-cultural similarities and differences (as in the Schwartz studies on values through Smallest Space Analysis, a variant of multidimensional scaling, e.g., Schwartz & Sagie, 2000). Furthermore, sophisticated ways to model similarities and differences simultaneously are available, such as the individual differences Euclidean distance model, through which we can compute the underlying dimensions for a set of countries and at the same time compute the relative importance of these dimensions for each country, in terms of dimension weights. A "weirdness index" computed for each country is also available, which corresponds to the proportionality of the individual dimension weights to the overall average weights, thus depicting the eccentricity of each country's similarity matrix with respect to the overall dimensions in the data. This index can be considered an $r^2$ index since it accounts for variance explained by the eccentricity of the similarity matrix, thus depicting the covariance of the cultural elements with the measures of interest (see footnote 3). Following the computation of this "r-square" estimate for each of the countries involved, we can adjust for the bias estimated by this index through the procedure described in formulae (10) to (12). Thus, the "weirdness index" effect can be removed from the original raw scores by adjusting the standard deviation of each item within each country, taking this "cultural effect size" out. The adjustment stage is initiated by computing the z-scores for the raw data (formula 10) within each separate group, with the final aim being the recalculation of raw scores for each item based on adjusted standard deviations.
All computations are performed within each country separately and for each item separately:

$z_i = (X_i - \bar{X}) / s$, for $i = 1$ to $n$ participants (10)
$s' = \sqrt{s^2 - s^2 r^2}$, where $r^2$ = "weirdness index" for each group (11)
$X'_i = z_i s' + \bar{X}$, where $s'$ is the adjusted standard deviation and $X'_i$ is the adjusted raw score (12)

All participants' scores are thus adjusted, having removed some of the bias in terms of culture as depicted by the "cultural effect size" through the weirdness index. Having computed these adjusted raw scores for our paradigm data, the factor analysis procedures were performed afresh (Stage 2) and the resulting factor solution is presented in Table 5 (in the table notes, * marks the structure for the raw scores before adjustment for bias in terms of culture and ** marks the structure for the raw scores adjusted for the "weirdness index" bias information). What is of specific importance is that for this adjusted solution, the item "takes care of grandparents" now loads on the second factor, whereas it did not load on any factor before adjustment. Additionally, for many items in the first factor (underlined figures), loadings have increased, and in only one instance is there a loading decrease (marked in italics). The opposite holds for the second and third factors, with some marker variables increasing their loadings but most items of these two factors losing loading magnitude (e.g., three out of five items in the third factor). These changes in loadings seem to emphasize the importance of the first factor and, by doing so, the power of the second and mostly of the third factor becomes less than the respective power present in Stage 1.
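The adjustment in formulae (10) to (12) can be sketched in a few lines of code. This is a minimal illustration and not the author's own implementation: the function name `adjust_scores`, the example scores and the use of the sample standard deviation (`ddof=1`) are assumptions, and formula (11) is read here with a square root so that the adjusted value stays on the standard deviation scale. The function is meant to be called once per item within each group, with that group's weirdness index as `r2`.

```python
import numpy as np

def adjust_scores(x, r2):
    """Adjust one item's raw scores within one group, following formulae (10)-(12).

    x  : raw scores of one item for the participants of one group
    r2 : the group's weirdness index, treated as an r-square estimate (0 <= r2 < 1)
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    s = x.std(ddof=1)                 # sample SD (assumption: ddof=1)
    if s == 0:                        # constant item: nothing to adjust
        return x.copy()
    z = (x - mean) / s                # formula (10): within-group z-scores
    s_adj = np.sqrt(s**2 - s**2 * r2) # formula (11): adjusted standard deviation
    return z * s_adj + mean           # formula (12): adjusted raw scores

# Hypothetical scores for one item in one group, with a weirdness index of 0.23
scores = np.array([2.0, 4.0, 4.0, 5.0, 7.0])
adjusted = adjust_scores(scores, r2=0.23)
```

Under these assumptions the group mean of the item is left unchanged, while its standard deviation shrinks by a factor of the square root of (1 - r2), so groups with a larger weirdness index are pulled more strongly towards their own mean.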
If this can apply to cross-cultural data, there is no reason for not applying it to any intra-country comparison of groups of any kind, such as sex groups, age groups, occupational groups, student status groups (undergraduate vs. postgraduate), etc., as these groups may represent different "cultures" within the country; the method can be carbon-copied to any two or more groups when attempting to reduce the bias caused by different group identities. An attempt was made for the current paradigm to apply the method to the Hellenic WISC-III normative data (Wechsler, 1997). The WISC-III has been standardized for the Greek population by Georgas, Paraskevopoulos, Besevegis and Giannitsas and has also been tested for its factorial structure (Georgas et al., 2003; Giannitsas & Mylonas, 2004). For the current exercise we selected only two age groups, namely 9-year-olds and 14-year-olds, to allow for ample age differences, and applied the method proposed in this paradigm. It has to be stressed that the results reported here are not the structure for the Hellenic WISC-III, which has been shown to consist of three factors (Giannitsas & Mylonas, 2004); this is merely a 2-factor structure for only two age groups, specifically devised for the needs of this exercise. Out of the 13 scales of the WISC-III, only 11 are employed for the exercise, excluding the complementary scales; the reasons for this selection are not elaborated in the current paper for the sake of brevity, but have been fully explicated by Giannitsas and Mylonas (2004) and during the Joint European Conference of the IACCP and ITC in Graz (Mylonas, 1999). The results of applying the proposed method are summarized in Table 6. The first stage showed very high levels of identity for the two factor structures as computed separately for the age of nine and the age of 14 years. The average absolute intra-class correlation coefficient reached only .01 and the multilevel covariance structure analysis resulted in the matrix denoted "Fbf_i" in Table 6.
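The ingredients of a multilevel covariance structure analysis of this kind, namely the pooled-within matrix, the between-group matrix and the intra-class correlations, can be sketched as follows. This is an illustration under assumptions rather than the procedure used for Table 6: the estimators follow the commonly cited Muthén-type formulas, the function names are hypothetical, and the data are simulated (two groups with identical within-group scatter that differ only in the mean of the first variable).

```python
import numpy as np

def pooled_within_and_between(groups):
    """groups: list of (n_g x p) score arrays, one array per group."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    sizes = np.array([g.shape[0] for g in groups])
    N, G = sizes.sum(), len(groups)
    grand_mean = np.vstack(groups).mean(axis=0)

    # Pooled-within covariance: deviations taken from each group's own means
    ss_within = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
    s_pw = ss_within / (N - G)

    # Between-group covariance of the group means, weighted by group size
    dev = np.array([g.mean(axis=0) - grand_mean for g in groups])
    s_b = (dev * sizes[:, None]).T @ dev / (G - 1)

    # Scaling constant (close to the average group size for unbalanced groups)
    c = (N**2 - (sizes**2).sum()) / (N * (G - 1))
    sigma_b = (s_b - s_pw) / c        # estimated between-level covariance
    # Intra-class correlation per variable; estimates can be slightly negative,
    # which is one reason to summarize them by their absolute values
    icc = np.diag(sigma_b) / (np.diag(sigma_b) + np.diag(s_pw))
    return s_pw, s_b, icc

def cov_to_corr(cov):
    """Rescale a covariance matrix to a correlation matrix."""
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

# Simulated data: same within-group scatter, group means differ on variable 0 only
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 3))
groups = [base, base + np.array([5.0, 0.0, 0.0])]
s_pw, s_b, icc = pooled_within_and_between(groups)
r_pw = cov_to_corr(s_pw)              # pooled-within correlation matrix
```

Intra-class correlations near zero, as reported for the two age groups, indicate that almost none of the score variance lies between the groups, so the pooled-within matrix carries nearly all of the structural information.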
The main discrepancies for the "Fbf" loadings were the cross-loadings for the Arithmetic and the Picture Arrangement scales. Adjustments were performed on a 0.23 weirdness index for the 9-year-olds and a 0.19 weirdness index for the 14-year-old children, with these indices computed through an individual differences Euclidean distance model (2-dimensional, ordinal level of measurement). Our question was of course whether we might reduce the strength of these cross-loadings by removing some bias in terms of different age groups; it would be very useful to avoid those cross-loadings altogether, as this would enhance factor interpretation, but this seemed highly improbable due to the high invariance levels and also due to the normative and theoretical strength of the WISC-III data, which do not allow for error sources to act in uncontrollable ways. Indeed, the differences in loadings after the weirdness index raw-score adjustments (denoted "Faf_i" in Table 6) were very small. There was some gain, though: a small drop in the loading of Picture Arrangement on the first "Verbal Intelligence" factor, nearly taking one cross-loading out of the factor structure; this was followed by very small drops for most of the loadings on the second "Performance Intelligence" factor. The "Verbal" part remained exactly the same in both solutions, verifying once more the statistical strength of the exercise data.
A final note is that this method can of course be applied to any two groups and can be generalized to three or more samples under comparison; even further, interactions (combined groups) can be tested, provided that statistical assumptions are met. Multidimensional scaling can be performed on a very limited N, but factor analysis is less accommodating: it requires at least three times as many participants as items and at least 10 items to analyze (Kline, 1993).
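One way to quantify how close two loading matrices are, such as the solutions before and after adjustment, is Tucker's congruence coefficient computed after an orthogonal Procrustean rotation towards a target. The sketch below is illustrative only: the function names and the loading values are hypothetical and are not the figures of Table 6.

```python
import numpy as np

def procrustes_rotate(loadings, target):
    """Orthogonally rotate `loadings` towards `target` (least-squares fit)."""
    u, _, vt = np.linalg.svd(loadings.T @ target)
    return loadings @ (u @ vt)

def tucker_phi(a, b):
    """Tucker's congruence coefficient for each pair of matching factors (columns)."""
    num = (a * b).sum(axis=0)
    den = np.sqrt((a**2).sum(axis=0) * (b**2).sum(axis=0))
    return num / den

# Hypothetical two-factor loadings for six scales, before and after adjustment
before = np.array([[0.80, 0.10], [0.75, 0.20], [0.70, 0.35],
                   [0.15, 0.70], [0.30, 0.65], [0.10, 0.80]])
after = np.array([[0.80, 0.10], [0.75, 0.20], [0.70, 0.35],
                  [0.12, 0.68], [0.25, 0.63], [0.10, 0.78]])

phi = tucker_phi(procrustes_rotate(after, before), before)
```

Each element of `phi` refers to one factor; values close to 1 indicate that the corresponding factors are practically identical across the two solutions, which matches the reading of very small changes in loadings.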

Discussion and Conclusions
The aim of this study was to introduce a variety of cross-cultural research methods, mainly to non-cross-cultural psychologists, and then to clarify the application of the methods beyond the borders of cross-cultural psychology research. In doing so, we also aimed at stressing the methodological properties of cross-cultural psychology, which embrace nearly every psychological research domain. Two original methods were also introduced in their broader statistical and methodological sense: (a) the hit matrix method as followed by the trigonometric transformation of multidimensional scaling solutions, and (b) the method of adjusting raw scores through the use of the weirdness index as computed through individual scaling Euclidean distance solutions.
One central axis of all three paradigms presented in this study is the multilevel characteristic of the methods. Multilevel analysis in Cross-Cultural Psychology merits special attention and is a very important topic in current research. There are two aggregate levels in multilevel analysis for cross-cultural research and these are usually the individual level and the country level (van de Vijver, van Hemert, & Poortinga, 2008); the individual level refers to the raw scores as collected from each individual from each country, whereas the aggregated country level refers to overall statistics, such as the means or the standard deviations, as computed and analyzed for the countries involved. Aggregation levels have been explained in detail by van de Vijver & Poortinga (2002) and have been linked to equivalence testing.

The paradigms presented in this study were generated by cross-cultural methods but are applicable to intra-country comparisons as well; they also share the multilevel element, for the reasons explained below. For paradigm #1, the multilevel element is the "hit" matrix itself; this is an aggregate of "hits", that is, the number of instances of factor inequivalence among the factor structures for a set of groups (computed at the individual level). Furthermore, the aggregation of countries (cross-culturally) or of differential groups (intra-country approach) into clusters or homogeneous groups, as stemming from the "hit" matrix evaluation, is the second step within the same multilevel approach. Therefore, the individual level as defined for cross-cultural research still holds for the intra-country paradigm, but the cross-cultural country level of aggregation now becomes a group level of aggregation, without violating any of the assumptions in handling such an aggregate level or the properties involved.
For paradigm #2, the method proposed by Muthén is by itself the multilevel element, since the pooled-within correlation matrix employed is generated by individual group matrices, whereas the estimated between-group correlation matrices correspond to aggregated group information. This may be considered a "second-order" multilevel aggregation technique for intra-country comparisons, since not the country level, but only sets and subsets within the country are employed; still, these correspond to aggregates of information to be compared in the next stages within the same method.
Finally, for paradigm #3, the multilevel approach is based on the computation of the overall multidimensional scaling solution followed by the individual "subject weights", which in turn define the weirdness index levels for each group involved. In testing for equivalence among the groups considered, the individual distance matrix solutions are the lower level of aggregation and the overall solution is the higher level of aggregation. These levels are then evaluated for their similarity so as to adjust (or not) the individual raw scores for the weirdness indices according to their group membership.

Overall conclusion
The paradigms presented in this study served as a first attempt to link cross-cultural research methods to "everyday psychological research". In this way, some of the methods specifically devised for cross-cultural research have been explicated in brief, for the interested reader to follow either in his/her own cross-cultural studies or in intra-country group comparisons. Two novel, original statistical methods have also been formally presented and explicated towards the same end. Overall, it is not a matter of just employing all the specific methods described; it is a matter of paying attention to the intra-country equivalence issues among groups under comparison and possibly enhancing the validity of such comparisons at an intra-culture level, as is, or should be, the case at the cross-cultural level.

Table 6. Results for 11 of the Hellenic WISC-III scales: factor structures before and after metric adjustment