
Forecasting Bankruptcy for Financial Institutions via Machine Learning Models

Scientists at the University of Parma, Italy, and the University of Florida, USA, have compiled a comprehensive database encompassing 8,262 publicly-traded firms on the New York Stock Exchange or NASDAQ, two significant American stock markets, spanning the years 1999 to 2018. The database...

Predictive Model Training for Bankruptcy Scenarios

Accessing a Comprehensive Financial Dataset for Bankruptcy Prediction Models

For researchers seeking to train bankruptcy prediction models, a widely-used dataset of 8,262 public companies' financial variables from 1999 to 2018 is typically accessed through academic or financial research databases. This dataset, which includes companies listed on the New York Stock Exchange and NASDAQ, offers 18 annual accounting and financial variables that characterize the financial health of each company.

One common avenue to access this dataset is through commercial databases such as Compustat via Wharton Research Data Services (WRDS). These databases maintain comprehensive company financial data, including balance sheet, income statement, and cash flow variables, over many years. Researchers often use these databases to assemble datasets on public firms from 1999 to 2018. However, access usually requires institutional or paid subscriptions.

Another approach is to explore university research repositories or supplementary data from published papers. Authors of bankruptcy prediction studies sometimes publish their cleaned datasets or provide access upon request. Checking the literature that uses this exact dataset (8,262 firms, 1999-2018) and contacting the corresponding authors is a practical starting point.

It's worth noting that the U.S. Securities and Exchange Commission’s EDGAR system offers raw financial filings, but assembling and cleaning such a large panel dataset is complex and time-consuming.
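To give a sense of the cleaning work involved, here is a minimal Python sketch that pulls one annual variable out of a company-facts style JSON payload. It assumes the payload shape of the SEC's XBRL company-facts endpoint (facts, then "us-gaap", then a tag, then units, then a list of facts with "fy", "fp", "form", and "val" fields); verify that shape against the SEC's own documentation before relying on it. The function names and the sample values are hypothetical, and no network request is made.

```python
def company_facts_url(cik: int) -> str:
    """Build a company-facts URL; the CIK is zero-padded to 10 digits."""
    return f"https://data.sec.gov/api/xbrl/companyfacts/CIK{cik:010d}.json"


def annual_values(payload: dict, tag: str, unit: str = "USD") -> dict:
    """Map fiscal year -> reported value for one us-gaap tag, keeping 10-K facts only."""
    facts = payload.get("facts", {}).get("us-gaap", {}).get(tag, {})
    by_year = {}
    for fact in facts.get("units", {}).get(unit, []):
        # Keep full-year figures from annual reports; skip quarterly filings.
        if fact.get("form") == "10-K" and fact.get("fp") == "FY":
            by_year[fact["fy"]] = fact["val"]  # a later entry overwrites an earlier one
    return by_year


# Hand-written sample mimicking the assumed payload shape (values are invented).
sample = {
    "facts": {
        "us-gaap": {
            "AssetsCurrent": {
                "units": {
                    "USD": [
                        {"fy": 2017, "fp": "FY", "form": "10-K", "val": 100.0},
                        {"fy": 2018, "fp": "Q1", "form": "10-Q", "val": 110.0},
                        {"fy": 2018, "fp": "FY", "form": "10-K", "val": 120.0},
                    ]
                }
            }
        }
    }
}

current_assets = annual_values(sample, "AssetsCurrent")
print(current_assets)
```

Even this toy version has to decide which filings count and how to resolve repeated values for the same year. Real filings also raise restatements, prior-year comparatives, and inconsistent tags, which is why assembling a multi-thousand-firm panel by hand is so laborious.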

If you don't have institutional access to platforms such as WRDS, Bloomberg Terminal, or Capital IQ, consider other commercial providers of financial data that serve academic and industry users.

Note that the dataset includes companies that filed for bankruptcy during the period from 1999 to 2018, which is what makes it usable for bankruptcy prediction. If you need to assemble similar financial data yourself, database subscriptions typically offer the richest, most widely used sources.

Researchers at the University of Parma in Italy and the University of Florida in the United States have compiled this dataset. Each company in the dataset has 18 annual accounting and financial variables listed, such as current assets or cost of goods sold. The dataset lists whether each company filed for bankruptcy the following year.
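The labeling scheme described above (each firm-year flagged by whether the company filed for bankruptcy the following year) can be sketched in a few lines of Python. This is an illustration only: the record layout, the field names, and the toy values are hypothetical and do not reflect the dataset's actual file format. The chronological train/test split is a common precaution added here as an assumption, not something the article specifies.

```python
from dataclasses import dataclass


@dataclass
class FirmYear:
    company: str               # hypothetical firm identifier
    year: int                  # fiscal year of the accounting variables
    current_assets: float      # one of the variables named in the article
    cost_of_goods_sold: float  # another variable named in the article


def label_next_year_bankruptcy(rows, bankruptcy_year):
    """Return (row, label) pairs; label is 1 if the firm filed in row.year + 1."""
    labeled = []
    for row in rows:
        filed = bankruptcy_year.get(row.company)  # None if the firm never filed
        labeled.append((row, 1 if filed == row.year + 1 else 0))
    return labeled


def split_by_year(labeled, last_train_year):
    """Chronological split: train on earlier years, test on later ones."""
    train = [(r, y) for r, y in labeled if r.year <= last_train_year]
    test = [(r, y) for r, y in labeled if r.year > last_train_year]
    return train, test


# Toy records invented for illustration (not taken from the dataset).
rows = [
    FirmYear("A", 2015, 120.0, 80.0),
    FirmYear("A", 2016, 90.0, 85.0),
    FirmYear("B", 2015, 300.0, 150.0),
    FirmYear("B", 2016, 310.0, 155.0),
]
bankruptcy_year = {"A": 2017}  # hypothetical: firm A filed in 2017

labeled = label_next_year_bankruptcy(rows, bankruptcy_year)
train, test = split_by_year(labeled, last_train_year=2015)
labels = [y for _, y in labeled]
print(labels)
```

Splitting by year rather than at random keeps information from later years out of the training set, which matters when the goal is to forecast a future event.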

In short, researchers can obtain data for bankruptcy prediction through subscription platforms such as WRDS, Bloomberg Terminal, or Capital IQ, which usually require institutional or paid access. Alternatively, they may find the datasets used in published bankruptcy prediction studies in university research repositories, or request them from the study's authors directly.
