ESG data – mind the gaps
The $40 trillion environmental, social and governance investment industry is built on a bedrock of data from an ever-widening range of sources. Is that data fit for purpose?
|In this story|
|• Public or private: Should data providers engage with companies directly?|
|• Checks and balances: Independent data sources provide a useful corrective to company disclosures|
|• The real-time tech revolution: AI is empowering a new generation of data providers|
|• Who has the black box? Start-ups take on traditionalists in the big transparency debate|
|• Industry consolidation: Will a wave of acquisitions push up the price of data?|
|• Pipe dream or reality: Is cross-border consensus on standardization a viable proposition?|
|• Social focus: Pandemic and racial protests turn the spotlight on deficient data|
It has become a truism to say that environmental, social and governance investing has gone mainstream. Certainly, it has never been more popular.
Last year, nearly 400 new ESG funds were launched. In the first half of this year, more than $100 billion of new money was allocated to the industry.
According to Opimas, assets under management at funds incorporating ESG – whether for risk management, compliance or impact investment – now total more than $40 trillion, up from $12 trillion as recently as 2012.
This spectacular growth has shone a spotlight on a part of the market that has often been overlooked – namely, the data on which it is all based.
Until recently, much of this was inaccessible to investors, who had to make do with ESG scores provided by a diverse and proliferating group of rating agencies. With ever more cash flowing into ESG-related products, however, asset managers have begun demanding access to the raw data underpinning these ratings.
Meanwhile, advances in technology have made it possible for newer providers, and even investors themselves, to collect vast sets of data from company disclosures, news, social media and a host of other sources. Perhaps unsurprisingly, given the relative youth of the ESG industry, there is still little consensus on what data is relevant and material, or even how to collect it.
Historically, the majority of data used by ESG ratings providers has been derived from companies themselves, with the main debate among key players in the industry revolving around whether to engage directly with companies or stick to publicly disclosed information.
Uniquely among the big ESG rating providers, market leader MSCI – which accounts for around a third of the ESG research and data market, according to UBS – has a strict policy of only using publicly available data.
However, for most of the big providers, working with companies – whether through questionnaires or as feedback on research reports – is central to their business model.
RobecoSAM’s Corporate Sustainability Assessment, for example, which is based on an extensive annual questionnaire sent out to companies, has been running for 20 years and is one of the most widely used reference points for ESG investors, as well as the basis for the Dow Jones Sustainability Indices.
It is now under the aegis of S&P Global, which bought RobecoSAM’s ESG ratings and benchmarking business in January.
Manjit Jus, formerly of RobecoSAM and now global head of ESG research and data at S&P Global, says the direct engagement model was born of necessity.
“When we started there wasn’t a lot of ESG information in the public domain, so the only way to get it was to knock on company doors and ask uncomfortable questions about what they were doing on water, climate, environmental issues, human rights and governance,” he says.
Other leading industry players – including Sustainalytics, FTSE Russell and ISS ESG, as well as environmental specialists such as Trucost – offer companies the chance to provide feedback on their ratings and, in some cases, additional information.
Arne Staal, head of product and research at FTSE Russell, says this type of engagement can improve the quality of ESG ratings: “The advantage of using only public data is that it’s uncontroversial, objective and can be replicated.
“But we are trying to answer difficult questions here, and more data often gives you better insight. So going beyond public data is often necessary, but you have to be very clear about how you do it and disclose it.”
Reinhilde Weidacher, global head of ESG data strategy at ISS ESG, notes that engaging directly with companies can also provide a useful indicator of their commitment to sustainability.
“We offer companies not just the ability to verify the data we’ve collected but also to review our analysis and provide additional input,” she says. “This is an important component of our research because it allows us to measure a company’s responsiveness and, by extension, its ability and willingness to amend any adverse impacts.”
Checks and balances
Alongside companies’ self-disclosed information, rating agencies and research providers pull in data from other sources to a greater or lesser extent. Nearly half of the data that underpins MSCI’s ESG ratings comes from independent sources, including regulatory databases.
“When you rely only on companies’ self-disclosed information then by its nature it will only show what is good,” says Remy Briand, head of ESG at MSCI. “You will rarely see a list of the fines or product recalls that a company has gone through in their CSR [corporate social responsibility] report, for example.”
Similarly, ISS monitors local and global non-governmental organizations (NGOs), as well as government databases and intergovernmental agencies.
“For us, it’s fundamentally important to cross-reference – and make it possible for our clients to cross-reference – corporate disclosure with data provided by other sources,” says Weidacher.
The problem with all of these data sources is that they are not very timely. Most companies only publish ESG information once a year, and then with a substantial lag.
“We need to close the gap in terms of timeliness between annual reports, which are disclosed in the first quarter, and real-time information, which is available in the media,” says Thomas Roulland, head of ESG tools and models at Axa Investment Managers. “The more fresh data we have the better.”
The main ESG information providers try to fill this gap with a ‘controversy assessment’, which involves monitoring news sources for risk signals. Today, this is often done using artificial intelligence (AI).
Daniel Wild, global head of ESG strategy at Credit Suisse, says this can provide a useful complement to company self-disclosed data.
“What companies report annually is mostly about what policies and risk management strategies are in place and what metrics are measured,” he says. “These are very important for making long-term predictions about a company’s trajectory, but using AI on unstructured public data can give you a better idea of what is going on at the company right now and provide early warnings of risks.
“For example, if suddenly there is a high frequency of small environmental accidents at a company that seems to have a good environmental policy, that might indicate that it will soon have a larger environmental incident.”
At traditional ESG providers, however, the information gleaned from real-time sources is passed to an army of analysts for assessment.
Thomas Kuh, head of index at Truvalue Labs, says this means the process is still too slow.
“Ratings providers track what they call controversies, which are event-driven situations at companies, but that typically happens with a multi-week or longer lag,” he says.
“There’s a difference between putting information into the hands of portfolio managers on the day it happens and collecting the same information, giving it to your analysts, having them process it and then distribute it with a lag to your clients.”
The real-time tech revolution
Started in 2013, Truvalue Labs is the oldest of a new generation of ESG information providers that make much more extensive use of technology.
In the case of Truvalue Labs, this means using machine learning and natural language processing (NLP) to both harvest and process data from a vast array of sources in more than a dozen languages, including every type of media publication, NGO report, academic paper, regulatory and legal disclosures, and social media.
Unusually, even by the standards of ESG tech firms, Truvalue Labs eschews all data provided by companies themselves.
“Information disclosed by companies has been shown to be partial,” says Kuh. “Good news is over-reported and bad news is under-reported.”
One advantage of this approach is that the firm is able to cover a much larger range of companies than traditional providers despite being a fraction of the size, including those that don’t disclose ESG information.
“It also means we don’t have to take a position on whether to penalize companies for not disclosing or penalize companies that are disclosing while the ones that don’t get a free pass,” says Kuh.
Another thing that sets Truvalue Labs – and other new-generation providers – apart from traditional ESG raters is a focus on positive real-time news as well as controversies. Indeed, Kuh notes that more than half of the firm’s data consists of positive signals.
“We live in a world where things are changing very quickly, and while it seems like there’s a lot of bad news out there, there are also very significant positive transitions going on,” he says. “We give investors an opportunity to outperform by looking at the upside as well as the risks.”
Other ESG information startups use big data and AI to produce innovative measures of impact. Arabesque S-Ray, for example, provides a temperature score that allows investors to assess the contribution of their portfolio to the rise in global temperature, measured in degrees centigrade.
Launched in 2017, S-Ray was created to provide input for Arabesque, a quantitative, sustainability-focused asset manager. Again, the firm harvests data from multiple sources, including company disclosures, using NLP and machine learning.
“We wanted to add more data sources and combine them in a smart way,” says Andreas Feiner, chief executive of Arabesque S-Ray. “Since we then had billions of data points, we needed technology to make sense of it. You can’t process that volume of data with humans.”
S-Ray became a subsidiary of Arabesque Group in 2018. Last year, four big German investors – Allianz, DWS, Commerzbank and the State of Hessen – took stakes in the Frankfurt-based firm.
Another up-and-coming ESG data provider, Impact-Cubed, also started life as an in-house tool for a sustainable asset manager, in this case London-based Auriel Investors.
The firm’s flagship product, the Portfolio Impact Footprint, reports portfolio impact according to the UN’s Sustainable Development Goals (SDGs) and quantifies it in basis points of tracking error, to be considered as a third dimension alongside risk and return.
This is calculated using data harvested from companies’ own disclosures and public sources such as the World Bank and the World Resources Institute.
Arleta Majoch, a partner at Auriel Equity Investors, says company data is particularly challenging to work with.
“We use raw data on topics such as carbon, waste and water that is scraped from annual reports by crawler software,” she says. “We get that warts-and-all from different aggregators, then we scrub it, triangulate the different sources and estimate the gaps.”
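The scrub, triangulate and estimate steps Majoch describes can be pictured with a small sketch. It is illustrative only and does not reproduce Impact-Cubed's method: the company names, figures and function names are hypothetical, and a simple sector median stands in for whatever estimation model a provider actually uses.

```python
from statistics import median

# Hypothetical carbon figures (tonnes CO2e) for the same companies from three
# made-up aggregators; None marks a value an aggregator did not supply.
reports = {
    "AlphaCo": {"sector": "utilities", "values": [1200.0, 1180.0, 1250.0]},
    "BetaCo": {"sector": "utilities", "values": [900.0, None, 940.0]},
    "GammaCo": {"sector": "utilities", "values": [None, None, None]},
}


def triangulate(values):
    """Return the median of the values present, or None if all are missing."""
    present = [v for v in values if v is not None]
    return median(present) if present else None


def fill_gaps(reports):
    """Triangulate each company, then estimate gaps with the sector median."""
    combined = {name: triangulate(r["values"]) for name, r in reports.items()}
    result = {}
    for name, r in reports.items():
        if combined[name] is not None:
            result[name] = (combined[name], "reported")
            continue
        # Assumption: peers in the same sector are a fair proxy for a gap.
        peers = [
            combined[other]
            for other, o in reports.items()
            if o["sector"] == r["sector"] and combined[other] is not None
        ]
        result[name] = (median(peers), "estimated") if peers else (None, "missing")
    return result


estimates = fill_gaps(reports)
```

Tagging each figure as reported or estimated keeps the provenance visible, which matters when clients want to trace a number back to its source.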
At the same time, she adds, sourcing only data that is widely available and processing it with technology not only removes biases against non-disclosure but also allows for a much broader coverage range.
“With more traditional methods, if you want to cover 100 more stocks you have to hire another 20 to 30 analysts, and then the process of rating and rerating the company takes time,” she says. “The automation element of our model and the generic nature of the data removes all those limitations.”
This approach also makes it easier for investors to customize data to reflect their own priorities – something asset managers have become increasingly focused on in recent years.
“If a user thinks our interpretation of the tax gap doesn’t belong in an impact model, they can remove it,” says Majoch. “The benefit of automation is that you don’t have to tell 400 analysts to rerate 3,000 companies for a single client. You just write a few lines of code and it’s done.”
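As a rough illustration of what those "few lines of code" might look like, the sketch below rescores companies after a client drops one factor. The factor names, weights and scores are invented for the example and are not taken from any provider's model.

```python
# Hypothetical factor weights and standardized company scores (0 to 1).
DEFAULT_WEIGHTS = {"carbon": 0.4, "water": 0.2, "waste": 0.2, "tax_gap": 0.2}

companies = {
    "AlphaCo": {"carbon": 0.8, "water": 0.6, "waste": 0.5, "tax_gap": 0.1},
    "BetaCo": {"carbon": 0.5, "water": 0.7, "waste": 0.6, "tax_gap": 0.9},
}


def rescore(companies, weights, exclude=()):
    """Weighted average score per company, ignoring any excluded factors."""
    kept = {factor: w for factor, w in weights.items() if factor not in exclude}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("at least one factor must remain in the model")
    # Dividing by the remaining weight renormalizes the model after a removal.
    return {
        name: sum(scores[factor] * w for factor, w in kept.items()) / total
        for name, scores in companies.items()
    }


default_scores = rescore(companies, DEFAULT_WEIGHTS)
custom_scores = rescore(companies, DEFAULT_WEIGHTS, exclude=("tax_gap",))
```

In this made-up data, removing the tax-gap factor reverses the ranking of the two companies, which shows why investors with different priorities can reach different conclusions from the same raw inputs.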
Another London-based startup, Datamaran, takes this approach even further. The firm uses NLP across a wide range of media, company and other sources – including an extensive regulatory database – to assess companies’ positioning on ESG issues.
Ian van der Vlugt, director of product at Datamaran, says this approach avoids the problems faced by other ESG data providers, such as boundary definition, gaps in indicators and consistency of reporting.
“For instance, we assess whether a company is reporting on the risks posed to it by climate change or whether it is adequately addressing biodiversity in its services,” he says.
He adds that this approach can help to filter out meaningless or misleading corporate verbiage.
“We are increasingly looking at how we can identify fluffier disclosures or what could be considered green-washing,” he says. “We recently updated our lexicon of terms and term co-locations and relationships that we use to analyze data in order to address this issue and provide deeper insight into specific risk action areas.”
Originally designed for companies themselves, Datamaran is now starting to follow the other ESG startups in offering its data to investors.
Who has the black box?
While traditional ESG ratings and information providers also use some elements of AI to gather data, most are dismissive of the fully automated models used by new-generation firms.
“The state we are in with AI can often be somewhat overstated in the debate,” says Weidacher. “I strongly believe in the need for human intelligence in the research process.”
MSCI’s Briand agrees.
“You can get a lot of interesting insights with technology, but you still need the human element to ensure that whatever the machine is highlighting actually makes sense,” he says. “A good ratio might be 80% machine learning and 20% quality control.”
The other criticism levelled at tech-driven firms is that their models are too opaque.
“If you are collecting data through AI-driven methodologies with complex algorithms that derive data points from all types of sources, it is much harder to make the process fully transparent,” says Staal at FTSE Russell.
The tech firms retort that it is traditional providers that are not transparent for refusing to release details of their company questionnaires and how they derive their ratings.
“At the end of the day, what rankers and raters are bringing to the market and what’s most consumed by the market are aggregate ESG scores, which are based on completely black-box methodologies with inherent biases,” says van der Vlugt.
Indeed, the startups argue that transparency is central to their business models.
“If our clients want, they can get the most granular level of data,” says Majoch. “They can see every number that went into their portfolio calculation and where it came from, which allows them to reverse engineer our assessment. We’ve even shared our regression model for estimating carbon data with clients.”
Leaving aside the question of who is right, what this debate illustrates is the increasing emphasis on transparency in what traditionally has been a fairly opaque industry. Unlike the ESG startups, the big rating agencies were until recently highly protective of their data.
“For a long time all investors had was ESG ratings,” says Amy O’Brien, head of responsible investing at Nuveen, the investment arm of TIAA. “Today, through negotiation, we have agreements with large vendors to pull in the underlying data points, but that has only happened in the last couple of years.”
This shift by industry leaders such as MSCI and Sustainalytics has been driven primarily by pressure from large investors, who have become increasingly reluctant to rely on top-line ratings as demand for more nuanced ESG and impact strategies has grown.
“Investors increasingly want to be able to test different scenarios with granular ESG data and identify trends, and shifts in trends, at an early stage,” says Weidacher at ISS.
The likes of Nuveen, Schroders and UBS have been ramping up their data processing and analytics capabilities and now only buy in raw ESG data, usually from multiple new and traditional providers.
“We don’t want to take someone else’s perspective on sustainability,” says Hannah Simons, head of sustainability strategy at Schroders.
As O’Brien at Nuveen notes, the process has not been easy.
“Linking up the systems of ESG vendors and feeding it into the systems our investment people are using is challenging,” she says. “Even though vendors have expanded coverage, we still have a lot of work to do internally to make this meaningful.
“Sometimes when we pull in data from different vendors, we get different values, and then we have to spend time reconciling these differences. As such, we’ve had to put in a lot of checks and balances, which is telling in terms of the state of the market.”
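A reconciliation check of the kind O’Brien alludes to might, in its simplest form, flag data points where vendors disagree by more than a set tolerance. The sketch below is a guess at the general idea, not a description of Nuveen's systems; the vendor names, metric key, figures and 10% threshold are all assumptions.

```python
# Hypothetical feeds: each vendor maps (company, metric) to a reported value.
vendor_data = {
    "vendor_a": {
        ("AlphaCo", "scope1_tco2e"): 1200.0,
        ("BetaCo", "scope1_tco2e"): 900.0,
    },
    "vendor_b": {
        ("AlphaCo", "scope1_tco2e"): 1230.0,
        ("BetaCo", "scope1_tco2e"): 1400.0,
    },
}


def find_discrepancies(vendor_data, tolerance=0.10):
    """Flag data points whose relative spread across vendors exceeds tolerance."""
    flagged = []
    keys = set().union(*(feed.keys() for feed in vendor_data.values()))
    for key in sorted(keys):
        values = {v: feed[key] for v, feed in vendor_data.items() if key in feed}
        if len(values) < 2:
            continue  # nothing to compare against
        low, high = min(values.values()), max(values.values())
        scale = max(abs(low), abs(high))
        spread = (high - low) / scale if scale > 0 else 0.0
        if spread > tolerance:
            flagged.append((key, values, spread))
    return flagged


flagged = find_discrepancies(vendor_data)
```

With these invented figures, only the BetaCo data point is flagged for manual review; the two AlphaCo values differ by less than 3% and pass.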
The in-house option also doesn’t come cheap. In addition to the cost of developing data-processing capabilities, the raw data itself is currently very pricey.
Wild, who joined Credit Suisse from RobecoSAM last year, says this reflects the effort and expense involved in collecting ESG data.
“In most countries reporting isn’t standardized,” he says. “You have to reach out to firms or devise your own models to create, standardize and possibly interpolate data.”
At the same time, some fear the price of data could rise further following a recent wave of consolidation in the ESG information industry that has seen smaller firms subsumed into four or five increasingly dominant global players.
As well as S&P Global’s purchase of SAM, the last 18 months have seen Moody’s acquire stakes in Vigeo Eiris, one of the oldest ESG research providers, and climate risk specialist Four Twenty Seven, and Morningstar complete its acquisition of Sustainalytics.
London Stock Exchange Group, which includes FTSE Russell, bought ESG fixed income data provider Beyond Ratings in June 2019 and is awaiting approval of its $27 billion takeover of Refinitiv, another ESG data leader. ISS also continued a series of ESG acquisitions with the purchase of Australia’s CAER last year.
Andrew Lee, head of sustainable and impact investing at UBS Global Wealth Management, says such consolidation can be helpful from the perspective of delivery and standardization of data.
“At the same time, it could enhance the market power of a smaller number of players, which is not always positive from an investor perspective,” he adds. “As an investor, you want best-in-class data from providers that are specialists in their focus areas.”
The high price of raw data has also fanned concerns that a gap is opening up between investors that can afford to buy massive amounts of data and develop the capabilities to process it, and those that are reliant on top-line ESG ratings. According to Arabesque S-Ray’s Feiner, around 80% of investors are currently using ratings alone.
Fiona Reynolds, chief executive of the Principles for Responsible Investment (PRI), says: “We are hearing concerns about asymmetry of information on ESG, particularly from some of the smaller asset owners and managers.”
These concerns have been recognized by European authorities, which are considering the creation of an open-source ESG database as part of the Renewed Sustainable Finance Strategy, partly in order to ensure a level playing field for investors.
Pipe dream or reality
The European Commission has also led the way in pushing for the standardization of ESG data, from the Non-Financial Reporting Directive, which entered into force in 2017, to the EU Action Plan on Sustainable Finance, which will impose ESG reporting obligations on European investors from 2021.
The EC’s most recent project is the development of a Sustainable Finance Taxonomy, the first section of which – covering climate change adaptation and mitigation – is due to be finalized this year.
The initiative has been warmly welcomed by ESG investors, who regularly cite lack of consistency in data as one of the main factors holding back the development of the industry.
“We have a fundamental problem in not having comparable, reliable and meaningful data that covers 100% of the activities of companies,” says Helena Viñes Fiestas, global head of stewardship and policy at BNP Paribas Asset Management.
Perhaps surprisingly, given the emphasis they place on proprietary data-sourcing models and the price they can charge for raw data, ESG information providers also claim to be unanimously in favour of standardization.
“If we had one or two standards for ESG disclosure, like GAAP and IFRS, this would be a blessing,” says Feiner. “At the moment the inefficiencies of data reporting, analytics and collection are costing investors and corporates a lot of money and hindering the mainstreaming of sustainability on both sides.”
Simon MacMahon, head of ESG research at Sustainalytics, adds that, without standardization of company disclosures, ESG ratings will struggle to achieve their full potential.
“There is value that is created by turning unstructured data into insightful signals,” he says, “but we all benefit from higher-quality disclosures that will allow us to take more reliable pieces of information and turn them into even better insights.”
The announcement in July of a collaboration between two of the leading ESG standards organizations in the US and Europe, the San Francisco-based Sustainability Accounting Standards Board (SASB) and the Global Reporting Initiative, was also hailed as a promising step on the road to data standardization.
Whether or not the various standardization pushes will eventually achieve alignment, however, is open to question.
“There are early signs that people are starting to converge and think about being more coherent, but practically we need one system to gain traction and others to start aligning with it,” says FTSE Russell’s Staal. “That will happen in different ways for different types of data sources and different topics. The EU taxonomy is the type of development that could drive real standardization in ESG.”
Feiner is more pessimistic.
“Standardization that people can agree on on both sides of the Atlantic, on ways of measuring sustainability? I don’t see that coming in the near future,” he says. “There are too many different powerful forces that are trying to achieve that goal, so the market in terms of ESG reporting will likely remain fragmented.”
Along with consistency, the other big challenge for the ESG industry remains a shortage of information. Despite a surge in corporate disclosures over the past decade, particularly on the environmental side, there are still crucial gaps in the data.
For example, five years after the Paris Agreement, only around 4,000 companies worldwide report their greenhouse-gas emissions.
Some ESG information providers attempt to bridge the gap with modelling, particularly on the climate side.
“We invest heavily in technology on the modelling side because climate is an area which is more about data science and models rather than just raw data,” says MSCI’s Briand. “If you combine industrial data with climate modelling, you can get something which is quite powerful.”
He cites the example of the catastrophic failure of Vale’s Brumadinho dam in southeastern Brazil in January 2019.
“We had clients asking if there were similar configurations where there were industrial processes in locations with heavy rainfall,” he says.
MSCI was able to combine industrial data with climate modelling to produce a list of companies with similar set-ups, mostly in China.
There are also hopes that combining technology with new data sources, such as satellite imagery, might help to fill in some disclosure gaps – for example, by allowing emissions to be geolocated to specific industrial facilities.
Trucost is one of the firms exploring ways to use this type of input.
“There are lots of areas where satellite data could provide an additional layer to our analysis, from sea-level rise to information on droughts and heatwaves,” says James Salo, head of data strategy.
Many in the industry, however, are sceptical about the broader commercial use case for satellite data – particularly given the lack of information on the location of many companies’ operational facilities.
“When it comes to deriving ESG metrics from satellite data, the theory is there, but it’s very hard to build a viable product in scale for big institutional investors with thousands of companies globally in their portfolios,” says Feiner.
Sustainalytics’ MacMahon agrees.
“We’ve looked into satellite data,” he says. “There may be some opportunities there, but at the moment they are probably a bit overblown.”
In the meantime, the focus has recently shifted to another big gap in the ESG data landscape. Social issues have historically attracted less attention than environmental ones – partly, according to PRI’s Reynolds, due to lack of information.
“A lot of investors will say that not being able to get the right data is one of the things that holds them back from really integrating social issues in a more consistent way,” she says.
Companies have traditionally been reluctant to divulge information on social metrics such as diversity, pay equality or employee retention, while in some countries there are also legal or regulatory hurdles. In France, for example, firms are prohibited from collecting data on the race or ethnicity of employees.
This year, however, both the coronavirus pandemic and the Black Lives Matter protests have put social issues centre stage in the ESG debate.
“Looking at the health, safety and human governance context linked to Covid-19 and at the implications linked to the social justice movements, ‘S’ metrics are becoming a prerequisite for issuers and investors,” says Martina Macpherson, head of ESG and engagement strategy at Moody’s.
Companies are expected to step up disclosure on social issues in response to investor demand, but this will not show up in corporate reports until next spring at the earliest.
In the meantime, ESG information providers are finding various ways to bridge the gap. Newer-generation firms have used their technological capabilities to collect and process timely information on social issues from news, social media, regulatory and other sources.
Datamaran, for example, created a new topic in its NLP lexicon around public-health risks and business continuity, while Truvalue Labs – which uses a SASB framework for data collection – was able to capture the surge of data in areas such as health and safety and labour practices after the start of the pandemic.
Traditional providers have not been behind the curve, however. Jus says the pandemic has highlighted the advantages of SAM’s model of direct engagement with companies.
“Inevitably, emerging trends won’t be fully captured in companies’ public reporting as quickly as if you can ask them questions directly,” he says.
Other ESG rating firms have long depended on alternative sources to fill in the gaps in social data. MSCI monitors job websites to gauge companies’ employee turnover and uses harassment cases as a proxy for diversity issues.
Meanwhile, ISS stresses the importance of stakeholder sources for picking up early warning signals on social metrics.
Weidacher notes, for example, that issues with Boohoo, the fashion retailer that saw its share price collapse in July following allegations of labour violations in its supply chain in the UK, were flagged by a local NGO in early 2017.
This in turn points to another big gap in the data and one that is increasingly attracting attention. While companies are working to understand the sustainability risks in their own supply chains, with the help of data providers such as EcoVadis, little of that information is currently reaching investors.
With the spotlight now on the social side of ESG, however, Reynolds at the PRI is confident that the quality of data will improve.
“As recently as five years ago it was very difficult to get environmental data, but things have changed very quickly,” she says. “It’s still not perfect, but there’s now a range of different tools that investors can use to look at their portfolio and do climate scenario analysis.
“It shows that if there’s a will there’s a way – so we can get social data happening just as we have with environmental data.”
Nevertheless, everyone in the industry agrees that there is still a long way to go on ESG data. Even on the governance side, market participants note, there is still no agreement across jurisdictions on the definition of an independent director.
“You can liken ESG data to financial data 80 years ago, before the SEC was established and IFRS was conceived,” says Feiner.
The question is whether or not real progress can be made while reporting remains largely voluntary. Viñes Fiestas is sceptical.
“We need to start harmonizing the calculation of raw data for a broad range of ESG metrics such as water emissions, waste ratios, gender pay ratios etc.,” she says.
“To start measuring at that scale, and in a way that is useful for investors, we need consistency across the board and we need reporting to be mandatory.”
“The industry working together is important, but at the end of the day you need the regulators to come along and mop up the tail,” she says. “You’ll always have leading organizations that do all the right things and others who won’t until the regulator says they have to.”
At the same time, she warns asset owners and managers against getting hung up on the data, or lack of it.
“Obviously having the data is important, but you can’t let perfect be the enemy of good,” she says. “Sometimes you just have to get on with things.
“I think sometimes people use it as a bit of an excuse – there isn’t the data so I can’t do anything. Well pick up the phone, think outside the box, do some research.”