M ODEL E VALUATION 709

M ODEL E VALUATION 709

tistical learning: Data mining, inference, and prediction. New York: Wixted, J. T., & Ebbesen, E. B. (1991). On the form of forgetting. Springer.

Psychological Science, 2, 409-415.

Hayes, K. J. (1953). The backward curve: A method for the study of learning. Psychological Review, 60, 269-275.

NOTES

Jeffreys, H. (1935). Some tests of significance, treated by the theory of probability. Proceedings of the Cambridge Philosophical Society,

1. Recent work, such as Bayesian hierarchical modeling (e.g., Lee & 31, 203-222.

Webb, 2005), allows model fitting with parameters that vary within or Jeffreys, H. (1961). Theory of probability (3rd ed.). Oxford: Oxford

across individuals.

University Press. 2. Of course, even in the case of well-justified individual analyses, Karabatsos, G. (2006). Bayesian nonparametric model selection and

there is still the possibility of other potential distortions, such as the model testing. Journal of Mathematical Psychology, 50, 123-148.

switching of processes or changing of parameters across trials (Estes, Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the

1956). Those possibilities go beyond the present research. American Statistical Association, 90, 773-795.

3. We recognize that quantitative modeling is carried out for a vari- Lagarias, J. C., Reeds, J. A., Wright, M. H., & Wright, P. E. (1998).

ety of purposes, such as describing the functional form of data, test- Convergence properties of the Nelder–Mead simplex method in low

ing a hypothesis, testing the truth of a formal model, and estimating dimensions. SIAM Journal of Optimization, 9, 112-147.

model parameters. Although it is possible that the preferred method Lee, M. D., & Webb, M. R. (2005). Modeling individual differences in

of analysis could depend on the purpose, we believe that the lessons cognition. Psychonomic Bulletin & Review, 12, 605-621.

learned here will also be of use when other goals of data analysis are Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA:

pursued.

MIT Press. 4. This assumption underlies other common model selection tech- Macmillan, N. A., & Creelman, C. D. (2005). Detection theory: A

niques, such as BIC (Burnham & Anderson, 1998). user’s guide (2nd ed.). Mahwah, NJ: Erlbaum.

5. It may be best to use a common set of parameters to fit the data from Massaro, D. W. (1998). Perceiving talking faces: From speech percep-

each individual. That is, rather than fitting the models either to a single tion to a behavioral principle. Cambridge, MA: MIT Press.

set of averaged data or to individuals, using different parameters for each Medin, D. L., & Schaffer, M. M. (1978). Context theory of classifica-

individual, it is possible to find the single set of model parameters that tion learning. Psychological Review, 85, 207-238.

maximizes the goodness of fit for each of the individuals. This analysis Minda, J. P., & Smith, J. D. (2002). Comparing prototype-based and

was repeated for each pair of models discussed below, but the results exemplar-based accounts of category learning and attentional allo-

from this third type of analysis were almost identical to those when aver- cation. Journal of Experimental Psychology: Learning, Memory, &

age data were used. They will not be discussed further. Cognition, 28, 275-292.

6. It is unclear how to combine measures such as sum of squared error Myung, I. J., Kim, C., & Pitt, M. A. (2000). Toward an explanation

across individuals in a principled manner. of the power law artifact: Insights from response surface analysis.

7. Note that this is a very simple form of uninformed prior on the Memory & Cognition, 28, 832-840.

parameters. To be consistent with past work (Wagenmakers et al., 2004), Navarro, D. J., Griffiths, T. L., Steyvers, M., & Lee, M. D. (2006).

the priors would have to be selected from a more complex prior, such as Modeling individual differences using Dirichlet processes. Journal of

the Jeffreys’s prior. Because of its ease of use, we opted for the uniform Mathematical Psychology, 50, 101-122.

prior. Furthermore, as will be seen, the results, for the most part, seem to Navarro, D. J., Pitt, M. A., & Myung, I. J. (2004). Assessing the dis-

be unchanged by the choice of prior.

tinguishability of models and the informativeness of data. Cognitive 8. 13 individuals/experiment 3 9 trials/condition 3 2 parameter se- Psychology, 49, 47-84.

lection methods 3 500 simulations.

Nosofsky, R. M. (1986). Attention, similarity, and the identification– 9. Note that we use the term optimal guardedly. The optimality de- categorization relationship. Journal of Experimental Psychology:

pends not only on the details of our simulation procedure, but also, more General, 115, 39-57.

critically, on the goal of selecting the model that actually generated the Nosofsky, R. M., & Zaki, S. R. (2002). Exemplar and prototype models

data. There are many goals that can be used as targets for model selec- revisited: Response strategies, selective attention, and stimulus gen-

tion, and these often conflict with each other. If FIA, for example, does eralization. Journal of Experimental Psychology: Learning, Memory,

not produce a decision criterion at the “optimal” point, one could argue & Cognition, 28, 924-940.

that FIA is nonetheless a better goal for model selection. The field has Oden, G. C., & Massaro, D. W. (1978). Integration of featural informa-

not yet converged on a best method for model selection, or even on a best tion in speech perception. Psychological Review, 85, 172-191.

approximation. In our present judgment, the existence of many mutually Reed, S. K. (1972). Pattern recognition and categorization. Cognitive

inconsistent goals for model selection makes it unlikely that a single Psychology, 3, 382-407.

best method exists.

Rissanen, J. (1978). Modeling by the shortest data description. Auto- matica, 14, 465-471.

ARCHIVED MATERIALS

Rissanen, J. (1987). Stochastic complexity. Journal of the Royal Statis- tical Society B, 49, 223-239, 252-265.

The following materials associated with this article may be accessed Rissanen, J. (1996). Fisher information and stochastic complexity.

through the Psychonomic Society’s Norms, Stimuli, and Data archive, IEEE Transactions on Information Theory, 42, 40-47.

www.psychonomic.org/archive.

Rissanen, J. (2001). Strong optimality of the normalized ML models as To access these files, search the archive for this article using the universal codes and information in data. IEEE Transactions on Infor-

journal name (Psychonomic Bulletin & Review), the first author’s name mation Theory, 47, 1712-1717.

(Cohen), and the publication year (2008). Schwarz, G. (1978). Estimating the dimension of a model. Annals of

F ILE : Cohen-PB&R-2008.doc.

Statistics, 6, 461-464. D ESCRIPTION : Microsoft Word document, containing Tables A1–A16 Shepard, R. N. (1987). Toward a universal law of generalization for

and Figures A1–A12.

psychological science. Science, 237, 1317-1323.

F ILE : Cohen-PB&R-2008.rtf.

Sidman, M. (1952). A note on functional relations obtained from group D ESCRIPTION data. Psychological Bulletin, 49, 263-269. : .rtf file, containing Tables A1–A16 and Figures A1–A12

in .rtf format.

Siegler, R. S. (1987). The perils of averaging data over strategies: An example from children’s addition. Journal of Experimental Psychol-

F ILE : Cohen-PB&R-2008.pdf.

ogy: General, 116, 250-264. D ESCRIPTION : Acrobat .pdf file, containing Figures A1–A12. Wagenmakers, E.-J., Ratcliff, R., Gomez, P., & Iverson, G. J. (2004).

F ILE : Cohen2-PB&R-2008.pdf.

Assessing model mimicry using the parametric bootstrap. Journal of D ESCRIPTION : Acrobat .pdf file, containing Tables A1–A16. Mathematical Psychology, 48, 28-50.

A UTHOR ’ S E - MAIL ADDRESS : [email protected]. (Continued on next page)