Recently I have been talking with a handful of people, ranging from passionate tree huggers to seasoned EHS managers. All work for large organizations and have one thing in common: responsibility for their organizations’ CDP reporting. Our conversations gave us a view into the real challenges of data management, challenges that certainly aren’t unique to the CDP, greenhouse gas reporting, or sustainability reporting in general (though these specific exercises exemplify them). As we serve an increasing variety of customers, we see the same challenges emerging again and again, in many different contexts and subject areas.
In order to perhaps make you, dear reader, feel less alone at this CDP time of year, I thought I’d share the story of Brian at XYZ Corp. Neither Brian nor XYZ Corp is a real name, but the story is real, a composite drawn from several customer interviews. We used Brian’s story to guide our work on the Scope 5 application.
Brian is a manager at XYZ Corp. His organization operates some 200 facilities around the world. He is responsible for reporting XYZ Corp’s greenhouse gas emissions each year to the CDP. For this, he needs to find XYZ Corp’s usage of electricity and various fuels at each facility and then multiply each of those numbers by the appropriate emissions factor, which may be location-specific.
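For readers who like to see the arithmetic, here is a minimal sketch of the calculation Brian is after: usage at each facility multiplied by an emissions factor, with a location-specific factor preferred where one exists. Everything in it is an assumption made for illustration, including the record layout, the region and source names, and the factor values, which are placeholders rather than published factors.

```python
from dataclasses import dataclass

# Hypothetical emissions factors in kg CO2e per unit of usage, keyed by
# (region, energy source). Placeholder values for illustration only.
EMISSION_FACTORS = {
    ("region_a", "electricity_kwh"): 0.5,
    ("region_b", "electricity_kwh"): 0.25,
    ("any", "natural_gas_therm"): 5.0,  # generic fallback, not location-specific
}


@dataclass
class UsageRecord:
    facility: str
    region: str
    source: str      # energy source and its unit, e.g. "electricity_kwh"
    quantity: float  # amount used, in the unit named by `source`


def factor_for(region: str, source: str) -> float:
    """Prefer a location-specific factor; fall back to the generic one."""
    if (region, source) in EMISSION_FACTORS:
        return EMISSION_FACTORS[(region, source)]
    return EMISSION_FACTORS[("any", source)]


def emissions_kg(record: UsageRecord) -> float:
    """Emissions for one record: usage multiplied by its factor."""
    return record.quantity * factor_for(record.region, record.source)


def total_emissions_kg(records) -> float:
    """Sum emissions across all facilities and sources."""
    return sum(emissions_kg(r) for r in records)


records = [
    UsageRecord("Plant 1", "region_a", "electricity_kwh", 1000.0),
    UsageRecord("Plant 2", "region_b", "electricity_kwh", 2000.0),
    UsageRecord("Plant 2", "region_b", "natural_gas_therm", 10.0),
]
print(total_emissions_kg(records))  # 500 + 500 + 50 = 1050.0
```

The arithmetic is the easy part. As Brian is about to find out, the hard part is getting 200 facilities’ worth of trustworthy numbers into it.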
Brian composes a spreadsheet, posts it on a shared server, and sends an email to all of his facility managers. In that email, he sends a pointer to the spreadsheet and asks the managers to please fill out the data for their facility in the cells indicated. For extra credit, Brian asks for cost data too.
Brian eagerly anticipates the return of his spreadsheets. As they start to trickle in, he tries to combine the numbers. A few weeks and many hours of Excel column copying later, Brian is struggling. Here’s what’s been happening:
- Several respondents have sent revised spreadsheets after the first, with highlighted cells that contain corrected (or previously missing) data.
- Many of the respondents have written comments into cells – Brian needs to go through these and delete the text.
- Some are reporting back in units that Brian hadn’t anticipated.
- Some of the responses are so out of range that Brian feels there must have been a mistake.
- Brian finds duplicate entries. He’s sure one of them is correct and one erroneous. But – which one?
- Some respondents provide cost data, some don’t. Some provide it in USD, others in various foreign currencies.
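Several of the problems in this list (unexpected units, out-of-range values, duplicates) are the kind of thing a few automated checks can flag before a human ever opens a spreadsheet. The sketch below is one hedged way to do that, not a description of any particular product. The row layout, the set of accepted units, and the plausibility ceiling are all assumptions chosen for the example.

```python
# Conversion factors to kWh for units a respondent might plausibly use.
# Which units to accept is an assumption of this sketch.
TO_KWH = {"kwh": 1.0, "mwh": 1000.0, "gj": 1000.0 / 3.6}


def check_submissions(rows, plausible_max_kwh=10_000_000):
    """Normalize units and flag suspicious rows.

    `rows` is a list of dicts with keys facility, period, quantity, unit.
    Returns (clean, issues): `clean` maps (facility, period) to kWh, and
    `issues` lists (key, reason) pairs that need a human to look at them.
    The ceiling `plausible_max_kwh` is an arbitrary illustrative value.
    """
    clean = {}
    issues = []
    for row in rows:
        key = (row["facility"], row["period"])
        unit = row["unit"].strip().lower()
        if unit not in TO_KWH:
            issues.append((key, f"unknown unit {row['unit']!r}"))
            continue
        kwh = row["quantity"] * TO_KWH[unit]
        if not 0 <= kwh <= plausible_max_kwh:
            issues.append((key, f"out of range: {kwh:.0f} kWh"))
            continue
        if key in clean and clean[key] != kwh:
            # Keep the first value but flag the conflict for review;
            # the code cannot know which entry is the correct one.
            issues.append((key, "conflicting duplicate entry"))
            continue
        clean[key] = kwh
    return clean, issues


sample_rows = [
    {"facility": "A", "period": "2023", "quantity": 5, "unit": "MWh"},
    {"facility": "A", "period": "2023", "quantity": 6000, "unit": "kWh"},
    {"facility": "B", "period": "2023", "quantity": 100, "unit": "therms"},
    {"facility": "C", "period": "2023", "quantity": -1, "unit": "kWh"},
]
clean, issues = check_submissions(sample_rows)
for key, reason in issues:
    print(key, reason)
```

Note that the checks only flag problems. Deciding which of two conflicting entries is right still takes a conversation with the facility manager, but at least Brian would know exactly which conversations to have.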
In short, the blending of spreadsheets is turning out to be a nightmare, requiring endless version management and email correspondence. Several weeks later, Brian has in his hands a pile of spreadsheets, corrected spreadsheets, and comments, and the prize, the single blended spreadsheet with the final results, is almost ready for that CDP report.
Brian meets with his boss, Jane, to share his results. Jane is pleased that Brian has finally completed his data collection exercise. She has some questions for him:
- That seems like a lot of money spent on fuel. How is the cost distributed across the facilities? Is it evenly distributed?
- Are there facilities at which the cost per unit of fuel is very high or very low?
- Is there an audit trail? We can get this verified by an auditor but we’ll need the data behind the numbers.
- How reliable is this data? Did everyone provide the numbers requested? Or were some missing?
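None of Jane’s questions is hard to answer once the data sits in one consistent structure rather than a pile of files. As a rough sketch, assuming each facility’s row carries a fuel quantity and a cost already converted to USD (or `None` where no cost was reported), the cost distribution, unit cost, and list of missing responses fall out of a few lines. The field names and sample figures are invented for illustration.

```python
def cost_summary(rows):
    """Summarize fuel cost per facility.

    `rows` is a list of dicts with keys facility, quantity, cost_usd,
    where cost_usd is None if the facility did not report a cost.
    Returns (summary, missing): `summary` maps facility to its share of
    total reported cost and its cost per unit of fuel, and `missing`
    lists the facilities that reported no cost.
    """
    reported = [r for r in rows if r["cost_usd"] is not None]
    missing = sorted(r["facility"] for r in rows if r["cost_usd"] is None)
    total = sum(r["cost_usd"] for r in reported)
    summary = {}
    for r in reported:
        if total <= 0 or r["quantity"] <= 0:
            continue  # avoid dividing by zero on empty or bad rows
        summary[r["facility"]] = {
            "share_of_cost": r["cost_usd"] / total,
            "cost_per_unit": r["cost_usd"] / r["quantity"],
        }
    return summary, missing


# Invented sample figures, for illustration only.
sample = [
    {"facility": "A", "quantity": 100.0, "cost_usd": 300.0},
    {"facility": "B", "quantity": 50.0, "cost_usd": 100.0},
    {"facility": "C", "quantity": 80.0, "cost_usd": None},
]
summary, missing = cost_summary(sample)
print(summary)
print("No cost reported:", missing)
```

The same structure, kept together with the original submissions and their revisions, is also the beginning of the audit trail Jane is asking about.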
Back in his office, Brian sits in quiet desperation. He digs through his pile of spreadsheets, trying to combine subsets and chart the results. He’s keeping lots of notes in lots of margins and he feels his sanity beginning to slip away.
Whether in CDP reporting or other data collection exercises, we see this general pattern and frustration frequently: far too many hours invested to produce one specific report. While the final report (probably) answers the specific question asked, there’s little ability to glean further insights – to ask the next set of questions that beg to be asked. And too many people repeat this tedious exercise year after year, with little ability to compare the results over time or to analyze the wealth of data collected.