The following list highlights the ways that CARS faculty, staff, and graduate assistants contribute to the higher education assessment community. Some of these are milestones we have celebrated; others represent ongoing work. We feel that each of these elements has added value and helped elevate the standards of assessment practice in higher education.

  1. Connection to the Assessment and Measurement PhD Program  
  2. Learning Improvement (Use of Results)  
  3. Assessment Day 
  4. Motivation Research in Low-Stakes Testing Environments 
  5. Implementation Fidelity Applied to Student Affairs Programs  
  6. Advanced Measurement Techniques Applied to Assessment Practice  
  7. Defining and Evaluating Assessment Quality (Meta-Assessment)  
  8. Partnerships with University Content Experts to Develop Assessments and Communicate Findings  
  9. Professional Development in Assessment (Assessment 101)  
  10. Competency-Based Testing  

Connection to the Assessment and Measurement PhD Program

Initially seeded by the Fund for the Improvement of Postsecondary Education, the assessment and measurement doctoral program at JMU was launched in 1998. At the time, few colleges possessed the expertise to conduct useful assessment. In fact, Dr. Peter Ewell characterized the early campus assessment initiators as “happy amateurs.” The Center for Assessment and Research Studies (CARS) at JMU and the subsequent assessment and measurement PhD program responded to the national need for innovative, high-quality assessment. JMU added a Master’s degree in psychological science with a concentration in quantitative psychology in 2001. Many of these students matriculate into the doctoral program.

In addition to teaching in these programs and recruiting students into the fields of assessment and measurement, our faculty serve in various assessment specialist roles in CARS, where they provide assessment consultation services to faculty across campus. Most graduate students in the Ph.D. and M.A. programs are awarded graduate assistantships in the Center as well, providing an opportunity for students to apply the skills they are learning in real-time situations. This unique structure makes the Center one of the largest of its kind. By working closely with faculty mentors to assist academic and student affairs programs at JMU, students learn advanced skills in measurement, statistics, consultation, and higher education policy.

Graduates of the program are sought after by universities, testing companies, and other businesses for their blend of technical skills and practical know-how. Moreover, many of the publications cited on this webpage are co-authored by our graduate students.

Leventhal, B. C. & Thompson, K. N. (2021). Surveying the measurement profession to assist recruiting in the United States. Educational Measurement: Issues and Practice, 40(3), 83-95.

Erwin, T. D. & Wise, S. L. (2002). A scholar-practitioner model for assessment. In T. W. Banta (Ed.), Building a Scholarship of Assessment. San Francisco: Jossey-Bass.  

Wise, S. L. (2002). The assessment professional: Making a difference in the 21st century. Eye on Psi Chi, 6(3), 16-17.

More information on JMU’s Assessment and Measurement Ph.D. program.  

More information on JMU’s Master’s in psychological sciences (quantitative concentration).  

(Back to Top) 

Learning Improvement (Use of Results)

While student learning improvement is championed on many campuses, few universities have evidenced such improvement. In 2014, CARS faculty and students published a simple model for learning improvement through NILOA (the National Institute for Learning Outcomes Assessment). The “weigh pig, feed pig, weigh pig” model demonstrates how universities can evidence student learning improvement through a process of assessment, intervention, and re-assessment.

Internally, CARS collaborated with JMU’s administration and the Center for Faculty Innovation to pilot the model on campus. Externally, CARS is partnering with prominent organizations and institutions to shift academe from a culture of assessment to a culture of improvement.  

Finney, S.J. & Buchanan, H.A. (2021). A more efficient path to learning improvement: Using repositories of effectiveness studies to guide evidence-informed programming. Research & Practice in Assessment, 16(1), 36-48. 

Fulcher, K. H., & Prendergast, C. O. (2021). Learning improvement at scale: A how-to guide for higher education. Sterling, VA: Stylus.

Fulcher, K. H., & Prendergast, C. O. (2020). Equity-related outcomes, equity in outcomes, and learning improvement. Assessment Update, 32(5), 10-11.

Fulcher, K. H., & Prendergast, C. O. (2019). Lots of assessment, little improvement? How to fix the broken system. In S. P. Hundley & S. Kahn (Eds.), Trends in assessment: Ideas, opportunities, and issues for higher education. Sterling, VA: Stylus.

Smith, K.L., Good, M.R., & Jankowski, N. (2018). Considerations and resources for the learning improvement facilitator. Research & Practice in Assessment, 13, 20-26.

Smith, K.L., Good, M.R., Sanchez, L.H., & Fulcher, K.H. (2015). Communication is key: Unpacking “use of assessment results to improve student learning.” Research & Practice in Assessment, 10, 15-27.

Fulcher, K. H., Good, M. R., Coleman, C. M., & *Smith, K. L. (2014, December). A simple model for learning improvement: Weigh pig, feed pig, weigh pig (NILOA Occasional Paper). Urbana, IL: University of Illinois and Indiana University, National Institute for Learning Outcomes Assessment.

(Back to Top) 

Assessment Day

How does a university with more than 20,000 undergraduates assess general education learning outcomes and other large-scale initiatives? JMU conducts what is known as Assessment Day (A-Day, for short). Students participate in two assessment days—once as incoming first-year students in August and again after earning 45-70 credit hours (typically in the spring semester of their second year). Students complete the same tests at both times. This pre- and post-test design allows the university to gauge how much students have learned as a function of their general education coursework.

Assessment Day enables the university to answer important questions asked increasingly by students, parents, employers, and legislators about what a college degree is worth. Student learning data also helps the university improve its educational offerings.
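
As a rough illustration of how pre/post scores like these can be summarized, the sketch below is a minimal Python example with invented scores, not JMU's actual analysis; the paired records and the use of a standardized gain are assumptions for demonstration.

```python
import statistics

# Hypothetical matched records: (pre_score, post_score) for students who tested
# as incoming first-year students and again after earning 45-70 credit hours.
paired_scores = [(52, 61), (47, 55), (60, 63), (39, 50), (55, 58)]

gains = [post - pre for pre, post in paired_scores]

mean_gain = statistics.mean(gains)
sd_gain = statistics.stdev(gains)

# Standardized mean gain: average gain divided by the SD of the gains.
effect_size = mean_gain / sd_gain

print(f"Mean gain: {mean_gain:.2f} points")
print(f"Standardized gain: {effect_size:.2f}")
```

In practice, analyses of this kind also account for test form equating, sampling, and examinee motivation, but the core pre/post comparison follows this logic.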

CARS is responsible for coordinating JMU’s A-Day—setting up the test conditions, proctoring the assessments, and analyzing the data collected. This data collection process is widely recognized as one of the most successful and longstanding in the nation, having been in place for over 30 years. 

Alahmadi, S., & DeMars, C. E. (2022). Large-scale assessment during a pandemic: Results from James Madison University’s remote assessment day. Research and Practice in Assessment, 17(3), 4-15.

Pastor, D. A., & Love, P. (2020, Fall). University-wide Assessment during Covid-19: An opportunity for innovation. Intersection: A Journal at the Intersection of Assessment and Learning, 2(1).

Pastor, D.A., Foelber, K.J., Jacovidis, J.N., Fulcher, K.H., Sauder, D.C., & Love, P.D. (2019). University-wide Assessment Days: The James Madison University Model. The AIR Professional File, Spring 2019, Article 144, 1-13.

Mathers, C.E., Finney, S.J., & Hathcoat, J.D. (2018). Student learning in higher education: A longitudinal analysis and faculty discussion. Assessment and Evaluation in Higher Education, 43(8), 1211-1227. 

Hathcoat, J. D., Sundre, D. L., & *Johnston, M. M. (2015). Assessing college students’ quantitative and scientific reasoning: The James Madison University story. Numeracy, 8(1), Article 2.  

Lau, A. R., Swerdzewski, P. J., Jones, A. T., Anderson, R. D., & Markle, R. E. (2009). Proctors matter: Strategies for increasing examinee effort on general education program assessments. Journal of General Education, 58(3), 196-217.  

Pieper, S. L., Fulcher, K. H., Sundre, D. L., & Erwin, T. D. (2008). “What do I do with the data now?”: Analyzing assessment information for accountability and improvement. Research and Practice in Assessment, 3, 4-10.  

(Back to Top) 

Motivation Research in Low-Stakes Testing Environments

Imagine that a university carefully selected tests and had a representative sample of students: a good start to assessment, no doubt. However, if performance on the test matters very little to students personally, they may not be motivated to do well. In this situation, despite otherwise robust methodology, test scores would not reflect what students actually know, think, or are able to do. This all-too-common situation is why JMU has studied motivation and how to improve it in low-stakes situations.

A few practical procedures that allow CARS to examine validity issues related to motivation include:

  • Training test proctors to keep students motivated.  
  • Customizing test instructions to increase relevance to students.  
  • Evaluating how much effort students give during test taking.  
  • Removing (i.e., filtering) data from students who expended little to no test-taking effort (see the sketch after this list).  
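
The last procedure, often called motivation filtering, is sketched below. This is a minimal Python illustration, not CARS's operational code; the 1-5 effort scale, the 3.0 cutoff, and the record layout are assumptions chosen for demonstration (in practice, reported effort might come from an instrument such as the Student Opinion Scale).

```python
# Motivation filtering sketch: drop examinees whose reported effort falls
# below a chosen threshold before summarizing test performance.
records = [
    {"student": "A", "score": 71, "effort": 4.5},
    {"student": "B", "score": 38, "effort": 1.5},  # low effort -> filtered out
    {"student": "C", "score": 64, "effort": 3.8},
    {"student": "D", "score": 55, "effort": 2.0},  # low effort -> filtered out
]

EFFORT_THRESHOLD = 3.0  # hypothetical cutoff on a 1-5 self-report scale

motivated = [r for r in records if r["effort"] >= EFFORT_THRESHOLD]

mean_all = sum(r["score"] for r in records) / len(records)
mean_motivated = sum(r["score"] for r in motivated) / len(motivated)

print(f"Removed {len(records) - len(motivated)} low-effort examinees")
print(f"Mean score, all examinees: {mean_all:.1f}")
print(f"Mean score, motivated examinees only: {mean_motivated:.1f}")
```

Comparing the two means shows how unmotivated examinees can depress aggregate scores and distort inferences about student learning.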

Myers, A.J. & Finney, S.J. (2021). Change in self-reported motivation before to after test completion: Relation with performance. Journal of Experimental Education, 89, 74-94. 

Myers, A.J. & Finney, S.J. (2021). Does it matter if examinee motivation is measured before or after a low-stakes test? A moderated mediation analysis. Educational Assessment, 26, 1-19. 

Finney, S.J., Perkins, B.A., & Satkus, P. (2020). Examining the simultaneous change in emotions during a test: Relations with expended effort and test performance. International Journal of Testing, 20, 274-298. 

Finney, S.J., Satkus, P., & Perkins, B.A. (2020). The effect of perceived test importance and examinee emotions on expended effort during a low-stakes test: A longitudinal panel model. Educational Assessment, 25, 159-177. 

Pastor, D. A., Ong, T. Q., & Strickman, S. N. (2019). Patterns of solution behavior across items in low-stakes assessments. Educational Assessment, 24(3), 189-212. 

Finney, S. J., Sundre, D. L., *Swain, M.S., & *Williams, L. M. (2016). The validity of value-added estimates from low-stakes testing contexts: The impact of change in test-taking motivation and test consequences. Educational Assessment.  

Swerdzewski, P. J., Harmes, J. C., & Finney, S. J. (2011). Two approaches for identifying low-motivated students in a low-stakes assessment context. Applied Measurement in Education, 24(2), 162-188.  

Barry, C. L., Horst, S. J., Finney, S. J., Brown, A. R., & Kopp, J. (2010). Do examinees have similar test-taking effort? A high-stakes question for low-stakes testing. International Journal of Testing, 10(4), 342-363.  

Thelk, A., Sundre, D. L., Horst, S. J., & Finney, S. J. (2009). Motivation matters: Using the Student Opinion Scale (SOS) to make valid inferences about student performance. Journal of General Education, 58(3), 131-151.  

Wise, S. L., & DeMars, C. E. (2005). Low examinee effort in low-stakes assessment: Problems and potential solutions. Educational Assessment, 10(1), 1-17.  

(Back to Top) 

Implementation Fidelity Applied to Student Affairs Programs

Think for a moment about a doctor prescribing a drug to a patient experiencing an illness. Two weeks later, the patient returns to the doctor, describing persistent medical issues. During the consultation, the doctor would ask how much and how often the patient took the prescribed drug.

This situation is analogous to JMU’s application of implementation fidelity to student affairs programs. The success of a planned program, such as Orientation, relies not only on how the program was designed (prescribed) but also on whether the program was implemented as planned (the medicine was taken in the appropriate dosage for the appropriate time).

CARS takes creative approaches to measuring implementation fidelity. For example, graduate students have posed as first-year students experiencing Orientation. These auditors take notes about the quality and duration of the program, the topics covered, students’ level of engagement, and more.

This check to ensure the program is implemented as prescribed allows program leaders to identify areas for improvement. Instead of changing the designed program because students are not learning, implementation fidelity assessment ensures that programmatic changes are based on an accurate assessment of what is actually being taught (i.e., the delivered program).  
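
As a loose sketch of how audit notes might be summarized, the Python example below computes two common fidelity indices: adherence (were the prescribed components delivered?) and dosage (was the planned time actually spent?). The components, minutes, and field names are invented for illustration and do not represent an actual CARS instrument.

```python
# Hypothetical fidelity audit for one Orientation session.
components = [
    {"name": "Welcome and outcomes overview", "delivered": True,  "planned_min": 10, "observed_min": 8},
    {"name": "Academic integrity discussion", "delivered": True,  "planned_min": 20, "observed_min": 12},
    {"name": "Small-group case activity",     "delivered": False, "planned_min": 25, "observed_min": 0},
    {"name": "Campus resources walkthrough",  "delivered": True,  "planned_min": 15, "observed_min": 15},
]

# Adherence: proportion of prescribed components that were actually delivered.
adherence = sum(c["delivered"] for c in components) / len(components)

# Dosage: observed delivery time as a proportion of planned time.
dosage = sum(c["observed_min"] for c in components) / sum(c["planned_min"] for c in components)

print(f"Adherence: {adherence:.0%}")
print(f"Dosage: {dosage:.0%}")
```

Low adherence or dosage signals that observed outcomes reflect the delivered program rather than the designed one, which is exactly the distinction described above.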

Finney, S.J., Wells, J.B., & Henning, G.W. (2021). The need for program theory and implementation fidelity in assessment practice and standards (Occasional Paper No. 51). Urbana, IL: University of Illinois and Indiana University, National Institute for Learning Outcomes Assessment (NILOA). 

Smith, K.L. & Finney, S.J. (2020). Elevating program theory and implementation fidelity in higher education: Modeling the process via an ethical reasoning curriculum. Research & Practice in Assessment, 15(2), 5-17. 

Smith, K.L., Finney, S.J., & Fulcher, K.H. (2019). Connecting assessment practices with curricula and pedagogy via implementation fidelity data. Assessment and Evaluation in Higher Education, 44, 263-282. 

Smith, K.L., Finney, S.J., & Fulcher, K.H. (2017). Actionable steps for engaging assessment practitioners and faculty in implementation fidelity research. Research & Practice in Assessment, 12, 71-86. 

Finney, S.J. & Smith, K.L. (2016). Ignorance is not bliss: Implementation fidelity and learning improvement. Urbana, IL: University of Illinois and Indiana University, National Institute for Learning Outcomes Assessment (NILOA). 

*Fisher, R., *Smith, K. L., Finney, S. J., & *Pinder, K. E. (2014). The importance of implementation fidelity data for evaluating program effectiveness. About Campus, 19, 28-32.  

*Gerstner, J. J., & Finney, S. J. (2013). Measuring the implementation fidelity of student affairs programs: A critical component of the outcomes assessment cycle. Research and Practice in Assessment, 8, 15-28.  

*Swain, M. S., Finney, S. J., & *Gerstner, J. J. (2013). A practical approach to assessing implementation fidelity. Assessment Update, 25(1), 5-7, 13.  

(Back to Top)  

Advanced Measurement Techniques Applied to Assessment Practice

The methodological tools available to assessment practitioners today go beyond ANOVA and coefficient alpha and include such techniques as structural equation modeling, item response theory, hierarchical linear modeling, generalizability theory, and mixture modeling. CARS faculty and students contribute to the use, development, and study of such advanced methodologies not only in higher education assessment but also in educational research more broadly. 

Advanced methodologies can be challenging to understand. CARS faculty and students are known for their ability to explain complicated techniques and concepts in understandable ways. CARS’ union of technical expertise and effective communication has resulted not only in award-winning teaching and highly sought-after workshops but also in several publications that serve as “go-to” resources for applied methodologists.

Ames, A.J. & Leventhal, B. C. (2022). Modeling changes in response style with longitudinal IRTree models. Multivariate Behavioral Research, 57(5), 859-878. 

Kush, J. M., Pas, E. T., Musci, R. J., & Bradshaw, C. P. (2022). Covariate balance for observational effectiveness studies: A comparison of matching and weighting. Journal of Research on Educational Effectiveness, 1-24. 

Kush, J. M., Masyn, K. E., Amin-Esmaeili, M., Susukida, R., Wilcox, H. C., & Musci, R. J. (2022). Utilizing moderated nonlinear factor analysis models for integrative data analysis: A tutorial. Structural Equation Modeling: A Multidisciplinary Journal, 1-16. 

Bao, Y., Shen, Y., Wang, S., & Bradshaw, L. (2021). Flexible Computerized Adaptive Tests to Detect Misconceptions and Estimate Ability Simultaneously. Applied Psychological Measurement, 45(1), 3-21.

DeMars, C. E. (2021). Violation of conditional independence in the many-facets Rasch model. Applied Measurement in Education, 34(2), 122-138. 

Leventhal, B. C. & Grabovsky, I. (2020). Adding objectivity to Standard Setting: Evaluating consequence using the conscious and subconscious weight methods. Educational Measurement: Issues and Practice, 39(1), 30-36. 

Leventhal, B.C. (2019). Extreme response style: A simulation study comparison of three multidimensional item response models. Applied Psychological Measurement, 43(4), 322-335.

Pastor, D. A., & Erbacher, M. K. (2019). Cluster analysis. In G. R. Hancock, L. M. Stapleton, & R. O. Mueller (Eds.), The Reviewer's Guide to Quantitative Methods in the Social Sciences (2nd ed.). New York, NY: Routledge.

Bao, Y., & Bradshaw, L. (2018). Attribute-level Item Selection Method for DCM-CAT. Measurement: Interdisciplinary Research and Perspectives, 16(4), 209-225.

DeMars, C. E. (2018). Classical test theory and item response theory. In P. Irwing, T. Booth & D. J. Hughes (Eds.). The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development (pp. 49-73). London: John Wiley & Sons.

Hathcoat, J.D., & Naumenko, O. (2018). Generalizability theory. In B. Frey (Ed.), The SAGE Encyclopedia of Educational Research, Measurement, and Evaluation (pp. 724-730). Washington DC: Sage Publications.

Hathcoat, J.D. & Meixner, C. (2017). Pragmatism, factor analysis, and the conditional incompatibility thesis in mixed methods research. Journal of Mixed Methods Research, 11(4), 433-449.  

Finney, S.J., DiStefano, C., & *Kopp, J.P. (2016). Overview of estimation methods and preconditions for their application with structural equation modeling. In K. Schweizer & C. DiStefano (Eds.), Principles and methods of test construction: Standards and recent advancements (pp. 135-165). Boston, MA: Hogrefe.

Finney, S. J., & DiStefano, C. (2013). Nonnormal and categorical data in structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed.) (pp. 439-492). Charlotte, NC: Information Age Publishing, Inc.  

Pastor, D. A., & Gagné, P. (2013). Mean and covariance structure mixture models. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed.) (pp. 343-294). Charlotte, NC: Information Age Publishing, Inc.  

Taylor, M. A., & Pastor, D. A. (2013). An application of generalizability theory to evaluate the technical quality of an alternate assessment. Applied Measurement in Education, 26(4), 279-297.  

DeMars, C. (2010). Item Response Theory. New York, NY: Oxford University Press.  

Pastor, D. A., Kaliski, P. K., & Weiss, B. A. (2007). Examining college students’ gains in general education. Research and Practice in Assessment, 2, 1-20. 

(Back to Top) 

Defining and Evaluating Assessment Quality (Meta-Assessment)

Many institutions struggle to convince accreditors of their assessment quality. To address this issue, CARS worked with university stakeholders to articulate various levels of assessment quality via a rubric. For example, the rubric helps faculty and administrators distinguish among beginning, developing, good, and exemplary statements of learning objectives.  

At JMU, all academic degree programs submit assessment reports. Subsequently, trained raters provide feedback via the rubric. This feedback is shared with faculty assessment coordinators, department heads, and upper administration. It is also aggregated across all programs, providing a university-level index of assessment quality.
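
As a loose illustration of what that aggregation might look like, the sketch below (Python, with invented ratings rather than actual JMU meta-assessment data) tallies the share of programs at each rubric level for a single rubric element.

```python
from collections import Counter

# Hypothetical meta-assessment ratings for one rubric element
# (e.g., quality of stated learning objectives), one rating per program.
LEVELS = ["Beginning", "Developing", "Good", "Exemplary"]
program_ratings = ["Good", "Developing", "Exemplary", "Good", "Beginning",
                   "Good", "Developing", "Exemplary", "Good", "Good"]

counts = Counter(program_ratings)
total = len(program_ratings)

# University-level index: percentage of programs at each rubric level.
for level in LEVELS:
    print(f"{level:<11} {counts[level] / total:.0%}")
```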

Eubanks, D., Fulcher, K., & Good, M. (2021). The Next Ten Years: The Future of Assessment Practice? Research & Practice in Assessment, 16(1).

Fulcher, K. H., & *Bashkov, B. M. (2012, November/December). Do we practice what we preach? The accountability of an assessment office. Assessment Update, 24(6), 5-7, 14.  

Fulcher, K. H., & *Orem, C. D. (2010). Evolving from quantity to quality: A new yardstick for assessment. Research and Practice in Assessment, 5, 13-17.  

*Rodgers, M., Grays, M. P., Fulcher, K. H., & Jurich, D. P. (2012). Improving academic program assessment: A mixed methods study. Innovative Higher Education, 38(5), 383-395.  

(Back to Top) 

Partnerships with University Content Experts to Develop Assessments and Communicate Findings

Building a good assessment takes a partnership between content experts (faculty and staff) and assessment experts. Developing student learning outcomes and creating test items and rubrics to assess student learning is an iterative process. Content experts articulate what students should know, think, or be able to do. Assessment experts help design assessments. 

By partnering, these teams ensure that assessment instruments fit program learning and development objectives. In fact, 90% of the assessments used on campus are designed by faculty and staff at JMU (e.g., the Ethical Reasoning Rubric). Content experts and assessment practitioners also work together to ensure findings are clearly communicated.

Merrell, L.K., Henry, D.S., Baller, S., Burnett, A.J., Peachey, A.A., & Bao, Y. (in press). Developing an assessment of a course-based undergraduate research experience (CURE). Research & Practice in Assessment.

Horst, S. J., Leventhal, B. C., Clarke, K. E., & Hazard, G. A. (2021). Bringing together two ends of the spectrum. Assessment Update, 33(5), 4-5.

Kopp, J. P., Zinn, T. E., Finney, S. J., & Jurich, D. P. (2011). The development and evaluation of the Academic Entitlement Questionnaire. Measurement and Evaluation in Counseling and Development, 44(2), 105-129.  

Cameron, L., Wise, S. L., & *Lottridge, S. M. (2007). The development and validation of the Information Literacy Test. College & Research Libraries, 68(3), 229-237.  

Halonen, J., Harris, C. M., Pastor, D. A., Abrahamson, C. E., & Huffman, C. J. (2005). Assessing general education outcomes in introductory psychology. In D. S. Dunn and S. Chew (Eds.), Best Practices in Teaching Introduction to Psychology (pp. 195-210). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.  

Finney, S. J., Pieper, S. L., & Barron, K. E. (2004). Examining the psychometric properties of the Achievement Goal Questionnaire in a general academic context. Educational and Psychological Measurement, 64(2), 365-382.  

(Back to Top) 

Professional Development in Assessment (Assessment 101)

CARS has been offering professional development in assessment for faculty, in one form or another, for over a decade. Our flagship offering is a workshop that covers the fundamentals of the assessment cycle. This content was originally delivered as a three-week residency called Assessment Fellows, which was successful for many years. Over time, demand for the programming grew beyond what that model could accommodate, and it became clear that we needed a new delivery format to reach more than a handful of faculty each summer. After a transition year in 2016, we launched Assessment 101 in the summer of 2017. In its current form, Assessment 101 is a week-long workshop designed to align with the Assessment Skills Framework, which delineates the knowledge, skills, and attitudes necessary to conduct assessment work.

With this redesigned delivery model, Assessment 101 can now be offered multiple times each year, accommodating roughly 30 faculty, staff, and students in each session. Although originally intended as a professional development opportunity for JMU faculty alone, Assessment 101 now presents its content in a way that transfers to other organizations that assess student learning outcomes. In 2019, we opened the workshop to the public and welcomed participants from other institutions in the United States and internationally. 

Prendergast, C. O., & Horst, S. J. (2021). Assessment professional development competencies: Applications of the Assessment Skills Framework (Occasional Paper No. 54). National Institute for Learning Outcomes Assessment (NILOA).

Horst, S. J., & Prendergast, C. O. (2020). The Assessment Skills Framework: A taxonomy of assessment knowledge, skills, and attitudes. Research & Practice in Assessment, 15(1).

(Back to Top) 

Competency-Based Testing

Recently, higher education has focused its attention on competency-based testing, a practice that has been a part of JMU’s culture for years. For example, all students must pass JMU’s Madison Research Essentials Test (MREST)—a test of information literacy skills—within their first academic year.  

Since JMU has deemed information literacy skills fundamental to the success and maturation of an engaged and enlightened citizen, determining what score qualifies as passing was an important task. CARS faculty are experts in standard setting methods such as the Bookmark and Angoff procedures and have assisted in setting cut scores for the MREST, along with many other assessments at JMU.

For example, faculty in the Department of Social Work have collaborated with CARS faculty on an assessment that social work seniors must pass to graduate. Other programs at JMU have set standards to help interpret student performance: how many students meet faculty expectations on the assessment?
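
For illustration only, the sketch below shows the arithmetic behind a simple (unmodified) Angoff procedure: each panelist judges, for every item, the probability that a minimally competent student would answer it correctly, and the recommended cut score is each panelist's summed ratings averaged across the panel. The ratings are invented and are not drawn from the MREST or any other JMU standard setting study.

```python
import statistics

# Hypothetical Angoff ratings: rows are panelists, columns are items.
# Each value is the judged probability that a minimally competent
# student answers the item correctly.
ratings = [
    [0.70, 0.55, 0.80, 0.40, 0.65],  # panelist 1
    [0.60, 0.50, 0.85, 0.45, 0.70],  # panelist 2
    [0.75, 0.60, 0.75, 0.35, 0.60],  # panelist 3
]

# Each panelist's recommended cut score is the sum of their item ratings.
panelist_cuts = [sum(panelist) for panelist in ratings]

# The panel's recommended cut score is the mean of those sums.
cut_score = statistics.mean(panelist_cuts)

print(f"Panelist cut scores: {[round(c, 2) for c in panelist_cuts]}")
print(f"Recommended cut score: {cut_score:.2f} out of {len(ratings[0])} items")
```

In operational standard setting, this initial round is typically followed by panelist discussion, impact data, and revised ratings before a final cut score is adopted.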

Leventhal, B. C. & Grabovsky, I. (2020). Adding objectivity to Standard Setting: Evaluating consequence using the conscious and subconscious weight methods. Educational Measurement: Issues and Practice, 39(1), 30-36. 

Horst, S. J. & DeMars, C. E. (2016). Higher education faculty engagement in a modified Mapmark standard setting. Research & Practice in Assessment, 11, 29-41.

DeMars, C. E., Sundre, D. L., & Wise, S. L. (2002). Standard setting: A systematic approach to interpreting student learning. Journal of General Education, 51(1), 1-20.  

(Back to Top) 
