Ask the Expert: Evaluation Planning with Wendy Erisman, PhD

We recently sat down with Wendy Erisman, PhD, HEI’s senior advisor. Wendy shares her deep wisdom about the field of evaluation, how it has evolved over the years, and the immense promise it holds for organizational partners.

Can you tell us about how you got started in the field of evaluation and how you see the field evolving?

Wendy Erisman: Like many evaluators, I stumbled into it. I was working at a think tank in Washington, DC, which had primarily been doing customized research for funders. When they started to get requests for evaluations, I ended up doing several and really enjoyed them, because compared to research, you really get to spend time with clients and see them use the data, use the findings. Sometimes research feels like you’re throwing it out there into the void, and you never really know whether people are reading it or working with it.

Since then, the field has gotten more public awareness. I think people are more familiar with the term, “evaluation,” and there are more educational programs for evaluators. But it’s still kind of hidden in some ways. For example, when I fill out the forms that ask you what your profession is, there’s never any obvious category to fill out. I put management consultant most of the time. And that doesn’t feel right, really. I do think that the press from the federal government to have more evaluation of federal grants has helped, and states have done the same. And I think organizations are starting to realize that it is really important to collect data and understand whether or not you’re actually achieving the outcomes–which I think initially came from pressure from funders. Organizations are starting to realize that this is valuable and that it can help them not only meet the requirements of their funders but also really learn and make improvements to their programs internally. So it’s a growing field. And the American Evaluation Association has done a great job of providing a lot of training and opportunity for folks. We’re seeing new methods and new areas like the evaluation of public policy efforts, which initially seemed really difficult to evaluate. But people are coming up with interesting and new and thoughtful approaches to understand how advocacy works, how policy works. So I think it has a booming future.

How have you seen organizations use evaluation for their own learning and strategic advantage?

Erisman: First and foremost, organizations who do evaluation regularly have a better grasp on whether or not the work they’re doing is actually achieving the outcomes they’re hoping for, which from a cost-efficiency perspective is pretty important. But it also can really help with understanding sometimes why you may or may not be getting the outcomes that you hoped, especially if you’re getting them for some participants and not for others. It really becomes a question of understanding your own work and what’s happening with it–and so disaggregating data is really important to know whether or not you’re reaching all of your participants equally. Then programs can tweak their work, figure out where it’s not working quite as well, add new things, or expand on things that are working really well, as opposed to just throwing their programming out into the void and hoping that it lands. On a more strategic level, you can start looking beyond individual programs and try to understand the impact that an organization has across all of its programs. There, it’s clear usually that the sum is greater than the parts; most organizations do more in the world than just their programs. That can be really helpful when folks feel like we have these programs, we’re limited by funding, and we’re maybe not getting where we should be. But in fact, they’re really making a difference through being thought leaders and engaging in the field in important ways. All of those are things that can be measured as you start to think about organizational evaluation and learning–and that are valuable when presenting yourself to funders and saying, you know, we’re more than just our programs, we should be able to get some general operations funding because we are doing things outside of programs that are also contributing ultimately to the outcomes for our participants or our target population.

What is evaluation planning and why is it so important?

Erisman: I think that evaluation often lands in people’s laps, just as evaluators often land in evaluation. People get a grant, it says they have to evaluate, or they realize they need to evaluate. And then it’s very easy to jump from we have to evaluate to, Okay, we’re gonna have a survey of our participants after they participate, and then we’ll know everything we need to know. And of course, the real world doesn’t work that way. In fact, it’s very important to think through the program and what you want to know about it before you start.

So this is one of the reasons why evaluators talk a lot about things like logic models, or theories of change, which are just a way of putting on paper how the people who are running a program believe that it’s going to work. For example, why are the activities that you’re doing going to actually lead to change for your participants or your target population? It’s all knowledge that people already have, but it needs to be articulated and put into writing so that they can think about: what do we need to know in order to know if we’re getting to where we ultimately want to be? That’s hugely important because the reality is that in social change efforts, the outcomes don’t happen right away. In many cases, it can be years before you start to see the outcomes that you ultimately are hoping to achieve with your work. We see this in college access and success work all the time. If you work with a high school freshman, it’s gonna be a while before they go to college, and even longer before they complete college.

So setting out your program and understanding the intermediate outcomes needed for your ultimate outcome can enable you to measure things that are going to happen much sooner. In the example of college access and success, we know that there are a lot of actions associated with college going–completing a FAFSA, taking college entrance exams, taking certain levels of math, applying to colleges, researching colleges–all these different kinds of things that you can measure. And that’s not the same as going to college. But at least it gives you a sense that maybe students are increasing their interest or their engagement in activities that will ultimately lead to college, and you can measure those a lot sooner.

In evaluation planning, it’s also important to think about exactly how the activities that you’re doing are going to lead to outcomes. Because in many cases, the data that’s collected on just process–how did the program roll out–can turn out to be even more valuable than the outcomes data in the short run. I always joke that the number-one reason why programs don’t have any impact is that they actually didn’t ever get implemented. And there’s some truth to that–that there could be something preventing the program from working that has nothing to do with whether or not the participants can benefit from it. Learning about those roadblocks, learning about what works well and what doesn’t, then being able to adapt the program quickly as you’re implementing it, and eventually moving, say, from a pilot stage into a fully implemented stage is really, really helpful. And then that same data can often be useful in understanding why you get the outcomes you do. People tend to focus on the outcomes themselves, which are certainly very important, but they only tell you what happened; they don’t usually tell you why it happened. As a result, process data along the way can be hugely valuable. But if you don’t plan to collect it before you get started, then you probably won’t have it. So again, part of the planning is to say, what do we need to know? Well, we need to know if our participants are benefiting, but we also need to know if people are actually doing the program the way we intended.

Once you have an evaluation plan, how do you maximize organizational learning?

Erisman: You need to plan for [structured evaluation learning] too, because otherwise, it’s very easy for the data to just sit and never get analyzed or looked at. I always would encourage organizations to include regular check-ins with the data, at a time when you know you’ve got data coming in; make sure it’s clear whose responsibility it is to analyze that data; and present it in a way that can be looked at by everybody involved with the program. That might be an outside consultant, or it might be someone internal. Sitting down and having a conversation about it is the most valuable way of doing it. Because everybody who participates in implementing and administering a program has perspectives on what’s happening and can add that perspective to the data that you’ve collected in terms of processes and outcomes. Don’t leave that meeting without action steps. That’s the other piece, sit down and talk about it, but make sure you leave with action steps so that folks know, it’s not just a, oh, well, we should change this; it’s okay, we’re going to change X, Y, and Z, and here’s who’s responsible for it. And if you don’t plan for the meeting, it won’t happen.

How do you help clients get comfortable with engaging with data?

Erisman: It’s one thing having someone who’s collecting data, but you also need someone who’s comfortable taking raw data and putting it into a format that can be understandable by anyone, often as a visual depiction. That requires somebody with specific training and expertise–and is often a place where you can bring in a consultant if you don’t have anyone on staff with that expertise. This is particularly true if you have a lot of data that’s being collected as part of your programmatic work, but you haven’t had a chance to use it in any meaningful way. Having someone come in and analyze the data and reflect it back to you in easy-to-understand ways so that your staff can start to talk about why it is that you’re getting the results you’re getting can be really valuable. That’s also one of the reasons why the process can get stopped–especially with administrative data that’s being collected pretty much automatically. There may not be anyone with the expertise to do even the basic level of analysis that needs to be done before you start talking about data. And you can’t just look at raw data–that’s not beneficial for anybody.

What’s the most common issue clients encounter with regard to data security and storage?

Erisman: Data security and data storage issues are a challenge and vary widely, depending on how large an organization is or the population that it’s serving. Do you have minors versus only adults? There’s just so many different pieces to that. Your data processes have to be tailored to whatever your organizational circumstances are. I don’t think that a tiny nonprofit needs to pay thousands and thousands of dollars for an elaborate data system when they might actually be better off with something pretty straightforward that can be secured by not having it on the internet if you don’t need it to be. There are easy solutions, sometimes. And larger organizations may need the larger solutions, which usually means purchasing them.

How can evaluation planning be utilized in service of organizational goals around diversity, equity, and inclusion?

Erisman: This is where disaggregating the data comes in, really thinking about, first and foremost, is the population you’re serving reflective of the broader target population that they fall into? Before you even get to outcomes, you want to make sure that you’re actually reaching a diverse audience, if that’s what you’re aiming at. Outcomes data should always be disaggregated by any relevant variable–age, gender, race, ethnicity, family income, for example. All of those are important factors to understand whether or not there’s a differential outcome. And that’s really crucial to diversity, equity, and inclusion. Because if you’re getting differential outcomes for different groups, you have to think about whether or not that’s a problem and how you might address it. Particularly if you have underserved groups who are not doing as well as others, you want to say, maybe we’re missing something, maybe we really need additional support for that particular part of the population. That’s one place where data can be exceptionally valuable, because in many cases, it’s not possible to see that disparity by looking at people, you have to actually look at the numbers in order to understand that the disparity is there.
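
As a minimal sketch of what disaggregating outcomes data can look like in practice, the Python example below compares an overall outcome rate with the rate for each group. The participant records, the field names (`income`, `enrolled`), and the helper `outcome_rate_by_group` are all hypothetical, invented for illustration only; they are not drawn from the interview or from any real program.

```python
from collections import defaultdict

# Hypothetical participant records (invented for illustration).
# "income" is the variable we disaggregate by; "enrolled" is the outcome.
records = [
    {"income": "low", "enrolled": True},
    {"income": "low", "enrolled": False},
    {"income": "low", "enrolled": False},
    {"income": "low", "enrolled": False},
    {"income": "high", "enrolled": True},
    {"income": "high", "enrolled": True},
    {"income": "high", "enrolled": True},
    {"income": "high", "enrolled": False},
]


def outcome_rate_by_group(rows, group_key, outcome_key):
    """Return {group: share of rows in that group where the outcome is true}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            successes[group] += 1
    return {group: successes[group] / totals[group] for group in totals}


# The overall rate can hide a gap between groups.
overall = sum(1 for row in records if row["enrolled"]) / len(records)
rates = outcome_rate_by_group(records, "income", "enrolled")

print(f"overall: {overall:.0%}")
for group, rate in rates.items():
    print(f"{group}: {rate:.0%}")
```

In this made-up data the overall rate looks moderate, while the per-group rates differ sharply. That is the kind of disparity that only shows up once the numbers are split by a relevant variable.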

How can organizations use evaluation to better tell their impact stories?

Erisman: Qualitative data is your friend. Quantitative data is great, it’s easy to analyze, it’s often quite easy to understand, and funders love it. But qualitative data is where those stories lie. Interviews, focus groups, any type of opportunity to actually talk with the folks who have benefited from the program is where those stories start to come to life. It’s certainly possible to turn quantitative data into qualitative data if you need to. For example, when administering a survey, you can offer people the option to opt in to having a follow-up interview. Then if you have folks who have interesting answers, you can follow up and dig into those answers and start to understand where that person is coming from. I’m an anthropologist by training, so I’m always going to love qualitative data. But I do think that you need both; I don’t think you can truly be doing an effective evaluation with only quantitative or only qualitative. They both contribute. The qualitative contributes particularly with those stories that illustrate what you see in the quantitative patterns.

Funders focus heavily on methodologies such as randomized controlled trials, quasi-experimental, etc. Why is that not always the most appropriate path?

Erisman: This is one of my true pet peeves in the world of evaluation. It’s true that randomized controlled trials are the gold standard for being able to say with certainty that a particular intervention of some sort caused the outcome. But it has to be randomized, and it has to be controlled. And the real world is neither randomized nor controlled. So right there, you have a problem. Now, this kind of testing comes out of the natural sciences, where folks are primarily working in laboratories where you really can randomize and control for a lot of things. That’s just not true in the world. The larger the unit of analysis that you have, the harder it is to control. If you’re talking about comparing cities to each other, there is no hope of doing a meaningful randomized controlled trial. Even comparing two schools to each other can be challenging because those schools are not going to be identical in their culture on campus, in their teaching approaches, in their student population. You could maybe have randomized control work at a classroom level, like if you have two classrooms, and you can randomize which students end up in each classroom, one classroom gets the intervention and the other doesn’t. But even then I wonder, what if the students talk to each other? I mean, what if they, the ones who are getting the intervention, find the content interesting and tell it to the students who are in the other classroom? Then you no longer have a controlled experiment. That’s a very, very simple example; the more complicated the setting we’re talking about, the more complicated it gets. It can be very difficult to randomize and extraordinarily difficult to control.

One time I was talking to someone about an idea that perhaps, if you were giving grants to cities, you could give grants to some cities and not to others and compare them. And I said, but what’s stopping those other cities from going and getting grants from other funders? Again, you can’t control the world, it just is not possible.

I do think that some quasi-experimental methods and some natural experiments are appropriate, but they’re appropriate because of specific situations. When it comes to anything that’s not quasi-experimental, when it comes down to true randomized control, you have to deal with ethical issues. I mean, is it truly ethical to give an intervention to only some people? Or is that a problem? Some folks have dealt with that by introducing the intervention in one school one semester and in the control school the following semester. But you’re still introducing it earlier for one. The reality is we’re dealing with people and their lives–not just the kinds of things that scientists deal with in a lab. So I don’t completely oppose the use of randomized controlled trials. But I don’t think that, in general, it’s something that a nonprofit can do in a meaningful way. They’re also expensive and time-consuming. I’m not convinced that the data you get from them is so much better than data you can get from more traditional evaluation, which may only be able to show correlation, rather than actual causation. But correlation is a good thing. If you look at it long enough, over time, you may well be able to feel quite confident that you’re having an impact. I also may have a broader definition of impact, because in many cases, there are all kinds of unintended consequences for interventions, and some of them can be good. But if you’re not measuring for them in a randomized controlled trial, you may not even know that they’ve happened.

If you were talking to a nonprofit leader who doesn’t have a big budget for evaluation planning, what advice would you give them?

Erisman: There’s quite a lot of good information available out on the internet, about evaluation planning, the steps to go through, how to design a logic model, how to think about developing evaluation questions, and identifying appropriate data collection methods. The problem that most will run into is that there’s not nearly as good information about how then to analyze the data and use it to make meaning, use it to make change. So that’s a challenge. But if you can at least get going on the data collection, and make sure that you’re getting the data that you need, while you are in the process of doing the program, then it becomes more possible to invest, say, any extra funds that you have into getting some support on the back end. It’s often the end of the process where things tend to fall down the most… It’s a gap, I think, in the way that we have taught folks about evaluation.

What are HEI’s “data parties”?

Erisman: Data parties are data meetings, where you’re getting everybody together who works on a particular program, or in some cases, everybody who works in an organization, to talk about the data from a period of time–could be six months, could be a year. The way we do it is that we’re typically analyzing the data. In many cases, our clients are actually collecting the data themselves, either administratively or through doing things like surveys. Sometimes we are collecting the data, but we do the analysis of it because we have more expertise in terms of taking raw data and turning it into something digestible. So we create data placemats, or data slides that have about one data point per page to give folks an opportunity to really think about it. At the data party, we break out into groups and get people to talk about questions like, what happened? Why did it happen? What do we need to learn from this? Then we come back together into a larger group and share insights. Typically, when we do these kinds of data parties, we find that people have a lot of insight to share and they often have a very good understanding of why certain results happened. They can talk about various contextual factors that may have affected it in a positive or negative way. And then we move into trying to think about, now that we know what happened and why it happened, what are you going to do about it? We encourage people to really try and lead with action steps and how we’re going to try X, Y, or Z in the next part of the process. I think our clients sometimes laugh at us for calling these data parties, but I also think that they do actually enjoy them. I’ve seen a lot of reduction of fear of data after people have done this a few times. Because it isn’t scary. It’s very much, okay, here’s something to look at and think about; it’s not, here, do difficult math or something like that. When I do them in person, I try to take chocolate.