
Thursday, February 25, 2010

The Measurement Problem

When Jonathan Swift wrote "A Modest Proposal" almost 300 years ago, he had a few targets in mind: absentee landlords, English politicians, an indulgent upper class. He was none too flattering about some of his fellow citizens, either, calling them out on domestic violence and drunkenness. But his famous satire had an additional target: social reformers who turned their pseudo-science on the problems of poverty and over-population and, in trying to measure them, created more misery. These are the people he is referring to when he writes (at the bottom of the first page of the handout), "having turned my thoughts for many years upon this important subject, and maturely weighed the several schemes of other projectors, I have always found them grossly mistaken in their computation." One of Swift's points, even as he poses in the essay as a "projector" with a "scheme," is that it is almost as ghastly to see a starving infant and measure that suffering in terms of facts, figures and shillings as it is to propose to turn it into a "fricassee." Both practices turn a living being into a commodity to be consumed.

Over 100 years later, Dickens wrote Oliver Twist, which begins with a satire of England's Poor Laws and the pseudo-science used to address problems of English poverty. (Swift was perhaps smiling in his grave.) The novel's nightmarish "workhouse," where the orphaned, penniless Oliver spends his childhood, was designed not by sadists but by social reformers (though reading the novel, it is hard to tell the difference between sadists and social reformers as the scientists and politicians have figured out just how much gruel the starving children can get at each meal and Oliver, still starving, is severely punished for boldly asking for "more").

How do we measure social problems and what do our measurement tools say about us as a civilization?

This question was brought home to us today in class as Professor Stein, Lisa, Manny and Danielle recounted a recent panel discussion at the College and talked about the trend toward "evidence-based practice," which emphasizes measuring the effectiveness of current programs in order to support good ones and eliminate bad ones. This seems as unobjectionable as wanting to measure issues like poverty and hunger in order to relieve them. But as we discussed in class, measuring is a complicated business: What do we measure? When do we start and stop measuring it? What gets left out of the process?

A story making the news today involves a high school in Rhode Island (link in the "Check It Out" section of the blog). The school was failing, and the superintendent was given four "models" to choose from in her task of turning the school around. One of the models involved firing all faculty and staff. Evidence to support this dramatic decision came in the form of studies which measured things like student proficiency at math (only 7% by 11th grade), retention rates (almost half the students dropped out), and academic performance (many students were failing at least two classes). Even as a teacher, I gotta admit this is some pretty gnarly data. But what these studies don't measure is that over half the students are living below the poverty line, and that many are non-native English speakers. One student, living with a single mom who works long hours in a factory, says that her teachers were the only constants in her life. How do you measure that?

These are questions for our own school as well, as all departments are gearing up for review in a few years and need to perform and document elaborate "outcomes assessments." That is jargon for figuring out how well we do what it is we think we're doing. Just like "evidence-based practice," that seems pretty common sense, pretty unobjectionable. And I can imagine some measurement tools for what I do: essay exams, a certain kind of paper assignment, good attendance and solid student grades. But when I really think about why I do what I do -- to make students happy and engaged for a couple of hours a week, to turn a student into the kind of person who can't put a book down, who is capable of greater empathy because he/she has learned through literature to understand how other people feel, who understands the power and satisfaction that come with being able to express oneself clearly and intelligently, and who maybe doesn't fully understand the impact of a class until, 15 years later, he/she is reading to his/her own child -- how do I measure this? In answer, my own modest proposal: that we measure how well a student retains a piece of literature by weighing the student, turning the book into a "fricassee," feeding it to the student and then weighing the student afterwards. (I profess, in the sincerity of my heart, that I have not the least personal interest in endeavoring to promote this necessary work, as I have already eaten all of my own books.)

11 comments:

M. Patino said...

The truth is that most of what truly matters is not quantitative. There's no scientific way to measure the impact a teacher may have on a student, a social worker on a client, or even a parent on a child. I use this last example because the way a parent gauges his or her effectiveness in raising a child is the prime example all of us can relate to (always in the role of the child, some of us in both roles). This is an unmeasurable value which does not fit into a scientific "evidence-based" schema whatsoever.

Also, in "measuring" people there is no way to ever come up with a single and conscise conclusion. Being human, being a person, living a life, can never be an exact science. Our society is so geared towards instant gratification and efficiency that even the most benevolent of reformers and advocates overlook the human element in a mad dash to measure, measure, measure. It's also detrimental that resource allocation plays the biggest role and that funders of social programs, schools, etc are looking for measurable signs of success and failure. When money is to be made/saved/cut/requested the issue of measurable signs of success and failure will come up first in any conversation. The obvious solution would be to find a statistic (I hate that word) that correlates as closely as possible with what we ultimately wish to "measure". But I have no idea what it would be. Evidently, it eludes the professionals as well. This "show-me-the-stuff-and-I'll-show-you-the-money" approach sounds more like a drug deal about to go wrong in some cop drama than it does a model for social reform.

Though evidence could complement or support a social program, using it as a starting point just doesn't sound like the best policy. I think an empathetic person has an idea of what could help a person, an empathetic professional even more so. The discord comes from taking what we feel could help a person and trying to make it fit into a "proven," measurable schema. "I just feel..." and "this person needs..." are just not enough to greenlight a social program or to move a person past intake. How many times do people in a position to help others say to themselves, "If only I/we could..."? These are the reasons why they can't.

Lisa Chan said...

Evidence-based practice will always be around to measure programs in order to validate their effectiveness. New programs should be measured against programs that have been successful in order to gauge their efficacy. For example, Vera is constantly spinning off new programs, and all sorts of measurements are used to figure out their effectiveness and cost. If they were not measured, how would we ever know whether they are cost-effective and successful in serving the populations they are meant to serve?

I am a believer in evidence-based practice. Measurements should be made to record the effectiveness of a program and to identify any areas of improvement that could strengthen it. However, those doing the measuring should also be wary of other factors that may skew the report.

Professor Reitz mentioned the news story about the high school in Rhode Island where the entire faculty was fired based on a report on the school's performance, which I must say was very shocking. But looking into other factors that may have contributed to this result (e.g., poverty) shows that those factors had an impact on the initial report. As Professor Reitz asked, regarding the student's reliance on her teachers, "how do you measure that?" It's unmeasurable, and that is why these reports must look at other aspects. But is that even possible?

Danielle said...

I attended a "performance measurement meeting" last Thursday & the attendees discussed the same questions Professor Reitz asks: how do you determine what outcome to measure & how do we know when to stop measuring it? "The Checklist Manifesto" by Atul Gawande was mentioned by one woman in the group who didn't believe that all the variables of the world (which make us as individuals so complex) are actually within the control of her organization. Her organization itself was complex, with multiple programs to fit individual needs (shelter, vocational counseling, domestic abuse courses, etc.).

Why is it that if we don't acknowledge multiple human needs at once we seem ignorant? I invoked an experience at Phoenix House Foundation, when I was so sure that they were onto the best treatment modality - the therapeutic community (TC). Their clients had a nurse, a dentist, VESID, child care services, mental health counseling, drug treatment, exercise classes, and GED courses all at their disposal for 9 months. I thought that this would decrease the clients' downtime (time that would have been spent thinking about, copping, or using drugs) and increase their motivation to show up to their referrals. Theoretically, this was treating the whole person & thus, for me, was the perfect model. Realistically, I found the clients got frustrated because they were bombarded with too much, too soon, losing motivation to focus on any one thing in particular. My experience with the service provided by CEO is much more straightforward. They focus on delivery of vocational services only. They know exactly what outcomes they're interested in obtaining when it comes to delivery. They purposely do not deliver mental health counseling or hire a GED tutor; there are "job coaches" and "job developers." CEO learned this strategy by researching other organizations that try to work on the individual holistically, but as I noticed in the TC, too many things at once (for the client and for the service provider) end up being overwhelming. Furthermore, when you have too much to measure & the outcomes are not concrete, it does become a very messy business.
To measure the success of CEO, for example, you align their mission (a la their "reason for being") with the recidivism rates of those in the program versus those who don't participate. I understand that the students of the RI school miss their teachers in the present, but the teachers apparently weren't engaging in the mission of the school: helping students move forward academically.

You've heard the sayings "don't let your emotions cloud your rationality" and "don't let perfection be the enemy of improvement." I'm emotional and a perfectionist, but I say we learn how to draw the line between THE BEST (i.e., anything holistic) and the best we can do. If we focus on too many things at once, we sacrifice getting anything done at all.

Prof. Stein said...

I was disheartened to read in this morning's NY Times that President Obama has directly given a shout-out to the Rhode Island school district cited by Professor Reitz for its mass faculty firing (http://www.nytimes.com/2010/03/02/us/02obama.html?ref=todayspaper). Obama has rhetorically exchanged "No Child Left Behind" for "Race to the Top," believing that competition and free-market-style incentives are the best way to increase performance.

Performance-based measurement is most easily made a numbers game: when the salesperson doesn't meet the mark, there is no commission; when a hypothetical student's grades falter, maybe she flunks out; the ballplayer without decent stats is released from his contract. There is, superficially at least, a justice to this, and a certain kind of efficiency. But I worry that it distracts us from more critical issues. I was taught when listening to clients/patients to always think about what they weren't saying, indeed to ask them "what would we be talking about if we weren't talking about this (your horrible mother, your fat behind, your lack of time) right now?" Often, what we are most obsessed with just obfuscates more important issues which we feel powerless to control. And so, too, with neat rows of statistics that mask larger, more intractable social problems.

However, given this as a backdrop, Danielle and Lisa are absolutely right that when we focus exclusively on grand holistic solutions, we may end up trying to command too big a ship, and not even notice the little leaks that are sinking us. It is interesting that Danielle references "The Checklist Manifesto" by Atul Gawande; this very interesting book is about a surgeon's experience instituting a basic checklist for surgical procedures that everyone in the OR had to utilize (Did you wash your hands? Did you put a fresh line in the IV tube? etc.). It turns out to have dramatically reduced hospital infections and deaths around particular surgical procedures. Gawande's point is that the most complex systems often go awry because we are ignoring the simplest solutions. His checklist indeed is ultimately measured by its performance: the drop in deaths, how much money the hospital saves. If it doesn't perform, we throw it out.

Gawande, however, wants to shine a light on the simplicity of success, not the importance of outcomes assessment (although, without the assessment, the endeavor would be meaningless). Unfortunately, in so much of what we do in social services, criminal justice, and education, the measurement becomes the whole point of the exercise, skewing drastically which exercises we will even undertake (i.e. only those that are measurable). Can a failing student be measured in the same way as a pathogen in a central catheter? In some ways, yes. In others, absolutely not. But when we are too focused on measurable outcomes we, as Swift implied, might as well just eat the baby to show off our empty plate.

Ana Rojas said...

I believe that statistics are made of superficial and malleable data. They represent what the observer or analyst wants them to represent. After all, they choose what numbers to use, which to discard, and how to collect the information. Numbers are only numbers, and by using them to represent people we dehumanize individuals.

I understand that "evidence-based practice" is useful in weeding out good programs from bad ones, but I don't think it should be the final word in deciding whether something works or not. It is a necessary but not sufficient tool for evaluating programs. I feel we often get caught up in the all or nothing mentality. Why can't we have a bit of everything? After all humans aren't simplistic beings. Why should the test we used be any different? I know statistics aren't done quite easily, but perhaps we could keep in mind the emotional and the very present but invisible data that many times goes unaccounted for.

I am a tutor, and I must give out tests at the beginning and at the end of the tutoring cycle. At the beginning I used to feel anxious about the tests because they reflect my capacity as a teacher. Most of the time my kids improve, but that data really doesn't mean much to me now. I understand the grades could be affected by tons of factors that have nothing to do with my performance as a teacher. Sometimes kids are tired, they are cranky, they are sad, and that takes their focus away. The grade I get as a tutor comes from the smiles that spread from ear to ear after they complete a problem alone, and from the confidence that the kids develop. How do I measure that? Do I keep a record of the smiles and "aha" moments? Should I include that in the paperwork I submit?

Neethu said...

This is an interesting topic because I've seen my mentor, Sophia, struggle to put in place outcome measures at CASES for funding purposes. I'd have to agree with most everyone and say that outcome measures can be very useful in gauging how well a service or program is leading to the desired result. Coming up with instruments to effectively measure results can help manage and improve services at an organization. However, when Sophia was discussing outcome measures, I remember asking, "how do you measure the immense impact this program has on these kids' lives?" Outcomes, though important, can miss the immeasurable effects organizations have on their clients that can't be translated into numbers. Also, when we find results that are lacking, outcome measures do not tell us what the problem is exactly. Why are we not getting the desired result? The fact that the students in the school were doing poorly does not necessarily mean that the teachers were doing a poor job. There can be so many different factors that lead to a poor outcome.

Katiria said...

I have to agree with Neethu: simply because some students do poorly in school does not mean a teacher or professor is not doing their job correctly. Some students simply do not want to do their work or couldn't care less. While the teacher struggles to get them engaged, they sit there disrupting the class. But I also must say that some teachers do not know how to teach, which in many cases affects students, who end up not learning anything at all. Some stop teaching once they become tenured, and others limit their teaching to certain methods that only benefit some students. And as Manny pointed out, it is impossible to measure a teacher's impact on a student.

Teachers are often a gateway to a better tomorrow. Being a teacher is not an easy job; it takes a lot of energy, patience, hard work and dedication, qualities which many people lack. Some teachers lose their passion for teaching after negative experiences with students affect their egos and the way they view their job. Others simply become more determined to overcome the obstacles they have faced and endure. I believe that measuring different information to conclude whether a program is effective or not is important, but only if it is done the right way. In this particular situation some things may just be impossible or more difficult to measure, and researchers must be very careful not to skew the information, as Lisa mentioned.

Unknown said...

This is an interesting, thoughtful discussion.

Teasing out from the comments that have been made, it seems there are a few themes:

1) Statistics and numbers do have value for providing us a picture of the world, as an indicator of something or even a predictor. For example, maternal mortality and infant mortality statistics are useful indicators of the state of health care, and of access to health care, in a place. These numbers don't tell the whole story, but they are suggestive and provide direction. Likewise with probability: taking into consideration certain specific variables, we can use statistics to predict the likelihood of something happening (e.g., the likelihood of who will suffer the effects of an earthquake severely, moderately, or only a little);

2) Any effort must start with trying to identify the actual, real needs of a "target" population. Danielle's portrait of the therapeutic community is interesting in that regard. The first step must be to ask: what do people need (needs assessment)? Only from that starting point can specific ways of addressing those needs be developed; some programs might address a slice of that "need" while other programs, hopefully, address other aspects of those "needs";

3) We do need to be wary of outcome measures if the goal is only the kind of "cost-effectiveness" that rests on unstated assumptions, leaves out important contextual variables, and does not have the actual needs of the "target" population in mind. It does seem that in today's world, there is a distorted over-emphasis on this kind of "cost-effectiveness." This current state of affairs may be why we react so negatively at the mention of numbers, stats, outcome measures, etc. We need to examine the specifics (not generalize): the who, what, where, how, and why of any particular assessment. In doing so, I think we may be better able to see how "measuring" can be a useful tool.

Prof. Stein said...

I'm weighing in again just to alert you to two developments relevant to the Rhode Island case. In today's paper, the unions (some of which represent teachers) took a hard line against Obama for his Rhode Island comments, threatening to withdraw support for Democrats in the mid-term elections. It will be interesting to see the effect of one powerful entity on the other when it comes to this measurement war.

The second development of note is the weighing in of Diane Ravitch, one of the most influential policy players in education, long an outspoken advocate of accountability measures. In her new book, she reverses her previous position, saying that increased measurement has not raised standards but dumbed them down (so that institutions will look good). "Accountability (in education)," Ravitch now says, "has turned into test cramming and bean counting." She goes on to say that if we chose top college graduates, prepared them to teach, gave them respect, and paid them well (as in Finland and Japan), we might begin to see a real change in performance.

marling.montenegro said...

I do agree with what you guys have said, but essentially, in what other ways can we "measure" whether a program works or not? Let's not forget the meaning of cost-effective rather than just treating it as a stigma. Essentially, money is a strained and scarce resource, so it must be used appropriately (however much is available for an individual program). If time and money are spent assessing on an individual level, only to realize that the program might not be as beneficial, doesn't that mean that money was lost? These measurements are meant to be as holistic as possible and to help people on a larger scale.
I personally would not like to invest in or fund a program that doesn't show some assurance of working at a satisfactory level.

Emile Lokenauth said...

While statistics are one way to measure social problems within our society, I must say that I agree with Manuel that "measuring" people is not an exact science.

Statistics do give our society an idea of how severe a problem is, but only to a certain extent. For example, the article that Professor Reitz mentioned does paint a picture of how poorly the students were doing academically. But as many of you mentioned before, there are other factors that play a role in why these students were performing as they were. Statistics give a general idea of a situation; however, they do not provide us with an exact measurement of the problem. This is the case because these statistics do not apply to everyone. Sure, many students were failing two classes, but not ALL students. Clearly there had to be other factors, besides the teachers, that played a role in the students' academic performance, and yet again these factors do not apply to every student. Many were living below the poverty line; however, not ALL were. Therefore, measuring the severity of a problem is not exact and concise.