Cutting Through the Noise: NLP’s Role in Streamlining Corporate Report Analysis

In today’s business landscape, corporate reports and regulatory filings are becoming increasingly complex. The growing complexity of businesses, coupled with heightened disclosure requirements imposed by accounting regulators and market watchdogs, is among the primary contributors to this trend. The notion that increased disclosure benefits investors has been a driving force behind these demands, especially after market crashes and corporate scandals. Many experts, however, now caution against the never-ending increase in disclosure volume. They argue that too much information can create “noise,” making it difficult to extract meaningful insights from the data.

The concern is that too much information can overwhelm investors and bury what matters in a sea of data. This can lead to confusion, decision paralysis, and missed opportunities. In some cases, it can even lead to unintended consequences, such as increased litigation or regulatory scrutiny. When the volume of disclosure is emphasized at the expense of its clarity and relevance, investors may miss important signals that could help them make better investment decisions.

To address this issue, recent advances in Natural Language Processing (NLP) and AI can help investors cut through the noise and identify the information most relevant to their investment decisions. NLP techniques such as summarization, sentiment analysis, section classification, and change detection can pinpoint the key topics and sentiments expressed in the text, allowing investors to stay up-to-date on the latest developments in the companies they are interested in. This can help investors gain a comprehensive understanding of a company’s financial condition and performance while filtering out irrelevant information. With these techniques, investors can also quickly identify important changes and trends, saving time and effort when analyzing reports and making more informed investment decisions.

Increased demand for information and a growing focus on transparency and accountability. 

In recent years, regulators around the world have been implementing stricter requirements for publicly traded firms to provide more detailed and accurate information to their shareholders. As a result, companies are under increased scrutiny to maintain high standards of transparency and accountability. This has led to a greater focus on the quality of financial reporting and the reliability of the information being provided to shareholders. These regulations have not only helped to improve the quality of financial reporting but also increased the trust of investors in the information being provided.

From the Great Depression to ESG Reporting: A Historical Overview of Public Companies’ Evolving Reporting Requirements

Public companies had an inconsistent approach to reporting prior to the 20th century. The level of disclosure varied greatly between firms and was heavily influenced by managers’ preferences, with smaller, privately held companies being particularly secretive. The Great Depression played a crucial role in driving the New York Stock Exchange and the government to take measures to improve disclosure standards. The Securities Act of 1933 aimed to ensure transparency and accuracy in the reporting of securities sold across state and national borders. The Securities Exchange Act of 1934 then established the Securities and Exchange Commission (SEC) and governed the secondary trading of securities. During this period, the accounting profession also worked to establish consistent reporting standards for publicly traded companies, laying the groundwork for what became Generally Accepted Accounting Principles (GAAP).

Fast forward to 1998: the Plain English Handbook, published by the SEC that year, was a pioneering effort to promote clear communication in SEC disclosure documents. The handbook aimed to help public companies create documents that were more accessible to investors and the general public by providing guidelines on how to use plain language and avoid jargon, legalistic terms, and obfuscating language.

As the 21st century began, the Enron and WorldCom accounting scandals highlighted weaknesses in the accounting and auditing practices of publicly traded companies, as well as in the oversight mechanisms of regulators. In response, regulators around the world began to implement stricter reporting requirements. In the United States, for example, the Sarbanes-Oxley Act of 2002 was passed to bring greater transparency and accountability to financial reporting. Under the Act, publicly traded companies are required to maintain accurate and complete records of their financial transactions, and senior executives must certify the financial reports provided to shareholders. Companies are also required to have strong internal controls in place to prevent fraud and to ensure the accuracy of their financial reporting.

A more recent driver of this increased focus on financial reporting is the global financial crisis that began in 2008. The crisis highlighted the importance of accurate and transparent financial reporting, as many of the companies involved were found to have provided inaccurate or misleading information to their shareholders.

In recent years, there have also been a number of significant developments related to ESG reporting by public companies. For example, in 2018, the SEC issued an interpretive release providing guidance on how public companies should disclose material ESG risks and opportunities in their SEC filings. In 2020, the SEC proposed amendments to its disclosure rules that would require public companies to disclose more information about human capital management, including workforce demographics, employee compensation, and training and development. The SEC has also signaled its intention to consider new rules for ESG disclosure by public companies in the coming years.

Reading Between the Lines: The Other Side of Financial Reporting Transparency and Accountability

Despite the efforts to promote transparency and accountability in financial reporting, a growing number of regulators, industry professionals, and academic experts are concerned about the readability of financial disclosures. They believe that the language used in these disclosures is often too complex for the average person to understand, which can lead to miscommunication and misinterpretation of important financial information. According to Lesmy et al., 10-K reports have grown substantially longer, more complex, and less readable. This highlights the ongoing need for public companies and regulators to improve the readability of financial disclosures, so that investors and the general public can make informed decisions based on accurate and understandable information.

“The sheer quantity of financial disclosures has become so excessive that we’ve diminished the overall value of these disclosures”. 

Ray Groves, former partner at Ernst and Young, 1994

The following graph shows the increasing word count in 10-K reports over time.

The average length of Form 10-K reports. The plot shows the average number of words per 10-K report over time. The error bands represent 95% confidence intervals produced by 1000 bootstrap samples. (Lesmy et al.)

Aswath Damodaran, a professor of finance at the Stern School of Business at New York University who is also known as the “Dean of Valuation,” discusses in a blog post how the increased amount of information and reporting has led to more “noise” for investors, making relevant information much harder to find. The post is titled “Disclosure Dilemma: When more (data) leads to less (information)”, and in it he explains that the perverse effect of this overload is that people start relying on mental shortcuts.

“10-Ks are written to confuse not inform”

Aswath Damodaran

While the graph of increasing document length clearly adds insight, Dyer, Lang, and Stice-Lawrence count not just the words in 10-K filings but also what they term redundant, boilerplate, and sticky words. Such language diverts attention away from the issues investors actually care about, such as the quality of management, the strength of the business model, and the long-term prospects of the company. First, these words create noise and distraction, making it difficult for investors to identify the key takeaways and important details of the report. Second, they can be used to obscure or downplay potential issues, risks, or weaknesses in the company: by using overly general language, companies can gloss over details that could be a cause for concern for investors.

Trends in Textual Attributes Over Time (Dyer, Lang, and Stice-Lawrence, 2017). Boilerplate words are the words in sentences that include at least one 4-word phrase that is shared by at least 75% of all firms in a given fiscal year. Redundant words are the words in sentences that are repeated verbatim in other portions of the 10-K. Sticky words are the words in sentences that include at least one 8-word phrase that is identical to a phrase used in the prior year’s 10-K.
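The boilerplate and sticky definitions above are concrete enough to sketch in code. The following is a minimal illustration, not the authors' implementation: the naive sentence splitter, the tokenizer, and the function names (sticky_word_count, boilerplate_phrases) are all assumptions made for this example, and phrases are only matched within a single sentence.

```python
import re
from collections import Counter


def sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    # Lowercased word tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def ngrams(words, n):
    # All n-word phrases in a token list (empty set if the list is shorter than n).
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def sticky_word_count(current, prior, n=8):
    """Count words in sentences of `current` that share an n-word phrase with `prior`."""
    prior_phrases = set()
    for s in sentences(prior):
        prior_phrases |= ngrams(tokens(s), n)
    count = 0
    for s in sentences(current):
        words = tokens(s)
        if ngrams(words, n) & prior_phrases:
            count += len(words)
    return count


def boilerplate_phrases(filings, n=4, share=0.75):
    """Return n-word phrases that appear in at least `share` of the given filings."""
    counts = Counter()
    for text in filings:
        phrases = set()
        for s in sentences(text):
            phrases |= ngrams(tokens(s), n)
        counts.update(phrases)  # each filing counts a phrase at most once
    return {p for p, c in counts.items() if c / len(filings) >= share}
```

Dividing sticky_word_count by the total word count of the current filing would give a share comparable across documents of different lengths.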

More recently, Guest and Yan (2022) studied semantic progression in 10-K reports in order to capture how fast a narrative moves (speed), how much ground it covers (volume), and whether it goes in circles (circuitousness). Their findings suggest that it is a good sign when a firm’s 10-K moves quickly from topic to topic and covers many topics. In contrast, it appears to be a bad sign when a company meanders among concepts as opposed to taking a more direct route.

NLP’s Role in Streamlining 10-K Report Analysis: Unlocking Valuable Insights

But could the recent development of natural language processing (NLP) help us unravel this complexity and provide a solution to the challenges of corporate reporting?

Let us break down some well-known NLP techniques and see how each can make reading through 10-K reports much easier and more efficient.

NLP Summarization

One of the key benefits of using NLP technology in reading through 10-K reports is the ability to automatically summarize key information. The management’s discussion and analysis (MD&A) section, for example, can be quite lengthy and contain a lot of technical jargon. However, an NLP algorithm can quickly identify the most important topics and provide a summary that is much easier to read and understand. Similarly, an NLP algorithm can be used to summarize other sections of the 10-K report, such as the business overview, risk factors, and financial statements. By summarizing the key information in these sections, NLP can help investors quickly identify the most important information and make more informed investment decisions.
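To make the idea concrete, here is a minimal sketch of extractive summarization, one of the simplest approaches: score each sentence by how frequent its words are in the whole text and keep the top few. The stopword list, the sentence splitter, and the function name summarize are assumptions for illustration only; production systems typically rely on far more capable models.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); a real one would be much longer.
STOPWORDS = {
    "the", "a", "an", "of", "and", "to", "in", "is", "are", "was", "were",
    "for", "on", "that", "this", "with", "as", "by", "it", "its", "our",
    "we", "be", "or", "at", "from",
}


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are most frequent in the text overall."""
    sents = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sents)), key=lambda i: score(sents[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original order
    return " ".join(sents[i] for i in keep)
```

Applied to an MD&A section, a sketch like this tends to surface sentences about recurring themes such as revenue or margins and drop one-off remarks.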

Sentiment Analysis

Another way that NLP technology can benefit investors reading through 10-K reports is through sentiment analysis. Sentiment analysis is the process of analyzing the tone and sentiment of text, and it can be used to identify the positive or negative language in a report. By analyzing the sentiment of a company’s 10-K report, investors can get a better sense of the company’s overall financial health and prospects for the future. For example, if a company’s 10-K report contains a lot of negative language and risk factors, it may be a sign that the company is struggling and that there are significant risks to investing in the company. On the other hand, if a company’s 10-K report contains a lot of positive language and growth opportunities, it may be a sign that the company is doing well and that there are good prospects for future growth.
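A simple way to approximate this is a lexicon-based score that counts positive and negative words. The sketch below uses two tiny, hypothetical word lists invented for illustration; in practice one would substitute a finance-specific dictionary (the Loughran-McDonald word lists are a well-known example) or a trained model. The function name sentiment_score is also an assumption.

```python
import re

# Hypothetical mini-lexicons for illustration only, not a real dictionary.
POSITIVE = {"growth", "improved", "strong", "gain", "gains",
            "profitable", "opportunity", "opportunities"}
NEGATIVE = {"loss", "losses", "decline", "declined", "impairment",
            "litigation", "adverse", "risk", "risks"}


def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / (positive + negative)."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    # No sentiment-bearing words found: treat the text as neutral.
    return 0.0 if total == 0 else (pos - neg) / total
```

Word counting of this kind ignores negation and context ("no material losses" still counts as negative), and sections such as Risk Factors are negative by design, so scores are more informative when compared across years or against peers than when read in isolation.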

Change Detection

Most of the information in a 10-K report does not change significantly from year to year, which makes the parts that do change especially informative. By analyzing changes in the 10-K report, investors can quickly identify new information, trends, or developments that may be important to their investment decisions. For example, if a company has significantly increased its investment in research and development, this could be a positive sign that the company is focused on growth and innovation. NLP can help with this process by automatically identifying changes between two 10-K reports. By comparing the language and content of the previous year’s report to the current report, an NLP algorithm can identify changes and summarize the most important differences. This can save investors a significant amount of time and effort, as they do not need to manually read through the entire report to identify changes. Instead, they can quickly review a summary of the changes and focus on the most important new information. In addition, NLP can help to identify trends and patterns in a company’s 10-K reports over time. By analyzing multiple years of 10-K reports, NLP can identify trends in the company’s financial performance, changes in management focus, and other important developments that may impact the company’s future prospects.
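As a rough sketch of the comparison step, Python's standard difflib module can align the sentences of two reports and list what was added or removed. The sentence splitter and the function name detect_changes are assumptions for this example; the splitter is naive and will mishandle abbreviations such as "U.S.".

```python
import difflib
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def detect_changes(previous, current):
    """Return sentences added to and removed from `current` relative to `previous`."""
    prev = split_sentences(previous)
    curr = split_sentences(current)
    # Align the two sentence lists; autojunk is disabled so that frequently
    # repeated sentences are not ignored in long documents.
    matcher = difflib.SequenceMatcher(a=prev, b=curr, autojunk=False)
    added, removed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(prev[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(curr[j1:j2])
    return {"added": added, "removed": removed}
```

Because this compares sentences verbatim, a sentence with a single changed figure appears once under "removed" and once under "added", which is often exactly the kind of change a reader wants to see.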

Section Classification

The SEC facilitates non-sequential reading by requiring firms to follow standardized section naming and ordering conventions, as well as by providing guidelines to help users quickly locate key information. Section segmentation techniques can exploit this structure to give investors more specific, section-level insight. Section classification involves analyzing a document to identify its sections or subsections and assigning a label to each based on its content. For example, in a 10-K report, section classification can be used to identify and label sections such as “Business Overview”, “Risk Factors”, “Management’s Discussion and Analysis”, “Financial Statements”, and so on. It can also be used to identify and label more specific parts of the report, such as “Market Risk”, “Legal Proceedings”, or “Executive Compensation”.
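Because 10-K filings use standardized item headings, a rule-based splitter is a reasonable first sketch before reaching for a trained classifier. The example below assumes the filing is already plain text with headings such as "Item 1A." at the start of a line; the regular expression, the partial ITEM_NAMES mapping, and the function name split_into_items are illustrative assumptions, and real filings vary enough in formatting that this will miss cases.

```python
import re

# Partial mapping of standard 10-K item numbers to labels.
ITEM_NAMES = {
    "1": "Business",
    "1A": "Risk Factors",
    "3": "Legal Proceedings",
    "7": "Management's Discussion and Analysis",
    "7A": "Quantitative and Qualitative Disclosures About Market Risk",
    "8": "Financial Statements and Supplementary Data",
    "11": "Executive Compensation",
}

# Matches headings like "Item 1.", "ITEM 1A." or "Item 7:" at the start of a line.
ITEM_HEADING = re.compile(
    r"^\s*item\s+(\d{1,2}[ab]?)\s*[.:]", re.IGNORECASE | re.MULTILINE
)


def split_into_items(text):
    """Split plain-text 10-K content into {label: body} using item headings."""
    matches = list(ITEM_HEADING.finditer(text))
    sections = {}
    for k, m in enumerate(matches):
        item = m.group(1).upper()
        end = matches[k + 1].start() if k + 1 < len(matches) else len(text)
        label = ITEM_NAMES.get(item, "Item " + item)
        # The body starts with the heading's own title text. If an item heading
        # appears twice (e.g. in a table of contents), the later one wins.
        sections[label] = text[m.end():end].strip()
    return sections
```

Once a report is split this way, each labeled section can be passed on to summarization, sentiment scoring, or change detection separately.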

Examples of 10-K items:

The Management’s Discussion and Analysis (MD&A): This section provides an overview of the company’s financial performance, market trends, and risks. NLP can be useful for summarizing this section, as it can help investors quickly understand the key points of the company’s financial performance.

The Business Overview: This section provides a summary of the company’s business, including its products and services, customers, and competitors. NLP can be useful for summarizing this section, as it can help investors quickly understand the key aspects of the company’s business.

Why is this important? Summarizing the Management’s Discussion and Analysis or Business Overview section of a 10-K is likely to be more useful than summarizing the Properties section. Section-level sentiment scores, in turn, help investors navigate quickly to the sections that matter!

These techniques can also be used in combination!

For example, change detection can happen at the section level. Section classification can help investors quickly navigate the report and find the sections that are most relevant to their analysis. By identifying and labeling the sections and subsections of the report, investors can focus their attention on the areas that are most important to their investment decisions. Once the relevant sections have been identified, change detection can be used to identify important changes between the current report and previous reports. For example, if a company has significantly increased its investment in a particular area of its business, change detection can identify the specific section where this change is discussed and summarize the key differences between the current report and the previous report.

By combining these techniques, investors can quickly and efficiently identify important changes and trends in the areas of the 10-K report that are most relevant to their analysis. This can help investors to make more informed investment decisions and stay up-to-date on the latest developments in the companies they are interested in.
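One way to picture the combination is to score how much each labeled section changed between two years and rank the sections accordingly. The sketch below assumes each report has already been split into a dictionary mapping section labels to text; the function name section_change_scores and the word-level similarity measure (from Python's difflib) are illustrative choices, not a prescribed method.

```python
import difflib


def section_change_scores(previous, current):
    """Rank sections by how much their text changed between two reports.

    `previous` and `current` map section labels to section text.
    A score of 0.0 means identical text; 1.0 means nothing in common
    (including sections that are new or were dropped).
    """
    scores = {}
    for name in set(previous) | set(current):
        old = previous.get(name, "").split()
        new = current.get(name, "").split()
        similarity = difflib.SequenceMatcher(a=old, b=new, autojunk=False).ratio()
        scores[name] = round(1.0 - similarity, 3)
    # Most-changed first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

An investor could then read only the top-ranked sections, or feed them to a sentence-level diff or a summarizer, rather than rereading the entire filing.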

Stakeholders may not read the 10-K from start to finish. Experienced readers, such as sophisticated investors, are likely adept at skipping around filings to collect information. This is where NLP can streamline the process of finding the information an investor needs. Moreover, when NLP is applied en masse to many documents, it allows fast and reliable comparison of a company with its peers. This is how NLP can help investors understand the information provided in annual reports and regulatory filings, leading to better investment decisions and greater transparency in the financial market.

Conclusion

In conclusion, while the demand for increased disclosure and transparency in financial reporting is intended to benefit investors, it also makes it harder for them to extract meaningful insights from the data. However, recent advances in NLP and AI can help to streamline the process of reading through 10-K reports and provide a solution to the challenges of corporate reporting. NLP techniques such as summarization, sentiment analysis, change detection, and section classification can be used in combination to help investors quickly identify important changes and trends in the areas of the 10-K report that are most relevant to their analysis. By leveraging these techniques, investors can make more informed investment decisions and stay up-to-date on the latest developments in the companies they are interested in. As the demand for transparency and accountability continues to increase, NLP technology can play a significant role in unlocking valuable insights from the data and improving the financial market’s transparency.