Asset Quality Misrepresentation by Financial Intermediaries: Evidence from the RMBS Market


TOMASZ PISKORSKI, AMIT SERU, and JAMES WITKIN ∗

ABSTRACT

We document that contractual disclosures by intermediaries during the sale of mortgages contained false information about the borrower’s housing equity in 7–14% of loans. The default rate of misrepresented loans was 70% higher than that of similar loans. These misrepresentations likely occurred late in the intermediation process and exist among securities sold by all reputable intermediaries. Investors, including large institutions, holding securities with misrepresented collateral suffered severe losses due to loan defaults, price declines, and ratings downgrades. Pools with misrepresentations were not issued at a discount. Misrepresentation on another easy-to-quantify dimension shows that these effects are a conservative lower bound.

Market rules and regulations that require disclosure of information and prohibit misleading statements on the financial products being manufactured by intermediaries play an important role in the functioning of capital markets (Akerlof (1970)). However, the nature of intermediation has changed dramatically over the past decade, with the introduction of more agents in the supply chain of credit (Loutskina and Strahan (2009), Keys et al. (2013), Nadauld and Sherlund (2013)) potentially weakening the ability of existing market arrangements and regulatory oversight to ensure truthful disclosure of asset quality. This concern has gained momentum in the aftermath of the recent crisis, which witnessed a precipitous decline in the value of supposedly safe

∗ Tomasz Piskorski is at Columbia Business School, Amit Seru is at the University of Chicago and the National Bureau of Economic Research, and James Witkin is at Columbia Business School.

Piskorski thanks the Paul Milstein Center for Real Estate at Columbia Business School and the NSF (Grant 1124188) for financial support. Seru thanks the Initiative on Global Markets at Booth for financial support. Witkin thanks the Paul Milstein Center for Real Estate at Columbia Business School for financial support. We thank Gene Amromin, Charlie Calomiris, John Cochrane, Gene Fama, John Griffin, Chris Mayer, Lasse Pedersen, Tyler Shumway, Ken Singleton, Richard Stanton, Phil Strahan, Amir Sufi, and Luigi Zingales, two anonymous referees, as well as seminar and conference participants at AQR, Columbia Business School, Purdue, Rice, Rutgers, Stockholm University, the Securities and Exchange Commission, the University of Illinois, Global Justice Forum, National Bank of Poland, NBER Economics of Real Estate summer meeting, AFA, AREUEA, NUS-IRES Symposium, Red Rock Finance conference, Summer Real Estate Symposium, and UC Berkeley Conference on Fraud and Misconduct for valuable comments. We are grateful to Equifax and BlackBox Logic for their data. We also thank Ing-Haw Cheng, Andrew Ellul, and Taylor Nadauld for sharing their data. We are indebted to Vivek Sampathkumar and Zach Wade for outstanding research assistance.

DOI: 10.1111/jofi.12271

The Journal of Finance R

securities as well as large investor losses (Acharya, Schnabl, and Suarez (2013)). 1 This paper adds to the debate by quantifying the extent to which buyers may have received false information about the true quality of assets by the sellers of the securities, investigating where in the supply chain of credit these misrepresentations likely occurred, and examining the economic consequences of such misrepresentations.

We focus on misrepresentation of the asset quality of securities collateralized by residential mortgages originated without government guarantees, that is, nonagency residential mortgage-backed securities (RMBS), a $2 trillion market in 2007 (Keys et al. (2013)). These misrepresentations are not instances of the usual asymmetric information problem in which buyers know less than the seller. Rather, we argue that they are instances in which sellers provided buyers false information on asset characteristics during the contractual disclosure process. In the first part of the paper we focus on detecting such instances and where in the supply chain of credit these might have occurred. In the second part, we focus on understanding whether these misrepresentations were costly for buyers. The null hypothesis is that these misrepresentations, even if present, did not have a meaningful impact, either because they did not matter for performance or because investors were able to differentiate pools with versus without misrepresented assets.

As we discuss in Section I, the RMBS securitization process involves aggregating mortgages into loan trusts, either through direct origination or indirect acquisition, and using their underlying cash flows to issue securities. The sale of these securities is organized by underwriters who, as part of this process, collect, verify, and certify information regarding the quality of the underlying collateral backing these securities. The underwriters in this market are large, reputable financial intermediaries, which are considered more sophisticated than the buyers in this market, which are typically institutional investors such as pension funds, mutual funds, and insurance companies. The underlying collateral data collected during mortgage origination are made available to investors of these securities, both in aggregated form in prospectuses as well as in the form of detailed loan files obtained from originators (lenders).

Our analysis focuses on an easy-to-quantify dimension of asset quality misrepresentation during the sale of mortgages: loans that are reported as having no other lien when in fact the properties backing the first (senior) securitized mortgage were also financed with a simultaneously originated second (junior) mortgage. The consequence of this type of misrepresentation is that the reported combined loan-to-value ratio (CLTV) at origination is materially lower than the actual CLTV of the loan. Since the true equity stake of the borrower on such a loan is lower than the reported one, these loans carry significantly higher default risk. As we show, a loan’s CLTV is one of the most fundamental

1 Critics of imposing more regulation argue that reputational concerns of large, well-established financial intermediaries would prevent such violations of investors’ rights. In contrast, proponents of increased regulation argue that intermediaries were able to exploit investors despite their reputation (and existing regulation).

metrics that market participants take into account when assessing the value and risk of mortgage securities. Thus, misrepresentation on this dimension implies that RMBS investors took on more risk than was implied by the contractual disclosure.
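To make the mechanism concrete, the following minimal sketch computes the reported and actual CLTV for a single loan whose second lien is omitted from the disclosure. The property value and lien balances are hypothetical numbers chosen only for illustration, and the function name `cltv` is an assumption, not something taken from the paper or its data.

```python
def cltv(lien_balances, property_value):
    """Combined loan-to-value ratio: total balance of all liens over property value."""
    return sum(lien_balances) / property_value

# Hypothetical loan: a 160,000 first lien and a 40,000 second lien
# on a property valued at 200,000.
property_value = 200_000
first_lien = 160_000
second_lien = 40_000

# Disclosure that omits the second lien reports CLTV equal to the first-lien LTV.
reported_cltv = cltv([first_lien], property_value)
# The actual CLTV includes every lien on the property.
actual_cltv = cltv([first_lien, second_lien], property_value)

# Understatement of CLTV in percentage points.
understatement_pp = (actual_cltv - reported_cltv) * 100

print(f"reported CLTV: {reported_cltv:.0%}")
print(f"actual CLTV:   {actual_cltv:.0%}")
print(f"understated by about {understatement_pp:.0f} percentage points")
```

In this illustration the borrower appears to hold 20% equity when the true equity stake is zero.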

We identify second-lien misrepresentation by comparing the characteristics of mortgages disclosed to investors at the time of sale with those in a data set provided by a credit bureau. Specifically, as we discuss in Section II, this data set matches loan-level data on mortgages disclosed to investors (BlackBox) with highly accurate data on these loans from consumer credit files at the same bureau (Equifax). A high-quality matched data set is critical for constructing measures of misrepresentation. Several pieces of evidence reported in Section III convincingly demonstrate the quality of the match.

In Section IV, we show that a significant degree of misrepresentation of collateral quality exists across nonagency RMBS pools. More than 7% of loans (13.6% using a broader definition) reported to investors as not having a junior lien did have a second lien. On average these loan files understate the true CLTV by about 20 percentage points. The misrepresentations are concentrated among loans used to purchase properties, where more than 14% of loans reported to investors as not having a junior lien are misrepresented. Moreover, misrepresentations are not confined to loans with reported low documentation status.

The second-lien misrepresentation captures economically meaningful information about asset quality. Loans with a misrepresented higher lien, which we find are often fully documented, have about 10 percentage points higher likelihood of default compared to loans with similar characteristics but no higher lien. This estimate is large, implying about 70% higher mean default rate of misrepresented loans relative to loans without higher liens. Lenders charged somewhat higher interest rates on loans with misrepresented second liens relative to similar loans with no such lien. However, the interest rate markups on the misrepresented loans were much smaller relative to loans with similar or lower default risk, that is, those that truthfully disclosed a higher lien.
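As a back-of-the-envelope check on how the two figures above relate, the sketch below backs out the baseline default rate that would make a 10 percentage point gap equal to a 70% relative increase. This is arithmetic on the rounded numbers quoted in the text, not an estimate reported by the paper, and the variable names are illustrative.

```python
# Rounded figures quoted in the text.
excess_default_pp = 10.0   # misrepresented loans default about 10 pp more often
relative_increase = 0.70   # which is described as about 70% higher than the baseline

# If excess = relative_increase * baseline, the implied baseline default rate is:
implied_baseline_pct = excess_default_pp / relative_increase
# And the implied default rate of misrepresented loans is baseline plus the excess.
implied_misrepresented_pct = implied_baseline_pct + excess_default_pp

print(f"implied baseline default rate:       {implied_baseline_pct:.1f}%")
print(f"implied misrepresented default rate: {implied_misrepresented_pct:.1f}%")
```

Under these rounded inputs the comparison group defaults at roughly 14% and the misrepresented loans at roughly 24%.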

In Section V, we investigate which entities (the borrower, lender, and underwriter) might have been aware of misrepresentation more systematically. To do so, we exploit an internal database of a large subprime lender, which allows us to observe the data that the lender collected during origination. Comparing this information with that disclosed to the investors, we find that the lender knew second liens were present in almost all cases of second-lien misrepresentation, which implies that the misreporting occurred later in the supply chain of credit (i.e., within the boundaries of the financial industry). This evidence is not particular to one large lender; we confirm these findings in a broader sample of lenders using the registry of deeds that records the legal titles to the properties. Almost all mortgages that we identify as having a misreported second lien have such liens recorded in the registry of deeds. Notably, virtually all of the misreported second-lien loans are originated by the same bank responsible for the first-lien loans sold to investors. Moreover, in a significant number of cases of second-lien misrepresentation, the lender and the underwriter associated with the loan are the same institution. Thus, lenders were aware of second-lien loans that were misrepresented and at least in some cases the underwriters could have easily accessed this information, as their own lending arms originated the loans. Further, though there is heterogeneity in the extent of misrepresentations among securities sold by reputable underwriters, a significant amount exists across all of them.

In Section VI, we investigate the impact of second-lien misrepresentation on the RMBS market and present several important findings. First, confirming our loan-level evidence, we show that pools with a larger share of loans with misrepresented second liens suffer significantly higher losses.

Second, using tranche-level information, we show that these losses were not confined to only the most junior tranches (the riskiest securities that are hit first when the pool backing these securities suffers a loss) in misrepresented pools. Rather, even the most senior and safest tranches appear to suffer significant losses if the fraction of misrepresented loans in a pool is large. The evidence on tranche-level losses is also reflected in larger subsequent price declines and ratings downgrades of tranches backed by misrepresented pools.

Third, using measures of pricing employed by prior literature, we show that there is no relation between the share of misreported assets in a pool and its pricing. Specifically, there is no evidence that pools with a larger share of misrepresentations (those carrying significantly higher default risk and suffering large subsequent losses and price declines) were sold at a discount at issuance, relative to pools with few misrepresentations. Further, the rating of pools at issuance did not reflect the variation in the share of misrepresented loans across pools. More importantly, however, we show that pool pricing did reflect information disclosed to investors: pools with higher disclosed cumulative debt (i.e., second lien) were sold at a discount. Thus, securities backed by misrepresented collateral were sold for more than what their price would have been had their characteristics been truthfully reported to investors.

Finally, we investigate what types of investors may have been affected due to RMBS holdings backed by pools with second-lien misrepresentations. We use information on holdings of one class of investors, insurance companies, from their disclosure filings to shed light on this issue. We show that these investors suffered significant losses, large subsequent rating downgrades, and price declines on tranches backed by misrepresented collateral. Moreover, we show that these investors were unable to distinguish misrepresented securities from those that accurately represented collateral quality. This evidence suggests that the largest institutional buyers, such as insurance companies, suffered meaningful losses due to asset quality misrepresentation. Investigating which other classes of investors held RMBS securities is difficult due to the lack of available data on holdings. Thus, we cannot directly test if losses on RMBS backed by misrepresented loans were borne entirely by large institutional investors, such as insurance companies, and hedge funds that invested in junior tranches, or if some losses were also borne by sellers of these securities that may have kept some of the unsold tranches backed by misrepresented collateral on their balance sheet. However, anecdotes and public testimonies indicate that prior to the crisis, sellers of mortgage securities succeeded in selling most of the RMBS tranches, including the most junior tranches. Thus, it is likely that investors and not sellers bore the losses due to misrepresentation of collateral that backed RMBS.

Our analysis rejects the hypothesis that these misrepresentations did not have large material economic consequences. Investors bought misrepresented assets that proved to be ex post significantly more risky relative to what would be assessed based on the contractual disclosure. Moreover, investors were unable to distinguish pools with a large amount of misrepresented assets from those that had few misrepresentations. 2 These assets, at least in part, seem to have been misrepresented by financial intermediaries, and could meaningfully impact investors’ short-term profits. Notably, as we show in Section VII, there are other types of easy-to-quantify misrepresentation beyond the second-lien one. Our estimates are thus a conservative lower bound on the extent and consequences of misrepresentation in the RMBS market.

Overall, as we discuss in Section VII, our results suggest that the current market arrangements, based on reputational concerns and explicit incentives, may have been insufficient to prevent misrepresentations of asset quality in a large capital market. Our findings are in line with studies that suggest the existing regulatory framework may have been insufficient to prevent such behavior (see Keys et al. (2009)).

The remainder of this paper is organized as follows. Section I describes information disclosure in the nonagency RMBS market. Section II describes our data. In Section III we construct and validate the second-lien misrepresentation measure. In Section IV we use this measure to quantify the extent of asset quality misrepresentation and study its impact on loan performance and pricing. In Section V we investigate at which stage of the financial intermediation chain the misrepresentations likely occurred. In Section VI we investigate the impact of second-lien misrepresentation on losses and pricing of mortgage securities. Finally, in Section VII we discuss broader implications of our findings and their relation to the relevant literature.

2 It is important to note that we are not taking a stand on the extent to which the average level of asset misrepresentation was anticipated by investors nor on the overall welfare consequences of misrepresentation. Theoretical literature on markets with asymmetric information shows that misrepresentation can arise in equilibrium even if investors correctly anticipate its average level (Akerlof (1970)). In the context of this literature, our paper provides novel and comprehensive evidence on the nature of equilibrium in a large debt market. Our results indicate that investors were unable to distinguish mortgage pools with a large amount of misrepresented assets from those that had very few misrepresentations. Consequently, even if the average level of misrepresentation was anticipated and priced, misrepresentations resulting in such pooling equilibrium across good and bad mortgage pools could have adversely affected the efficiency of capital allocation (see De Meza and Webb (1987)).


Figure 1. Key players involved in the creation and sale of RMBS. This figure presents a simple schematic of key players involved in the creation and sale of assets in this market. We also discuss the associated information flow.

I. Nonagency RMBS Market and Information Disclosure

The vast majority of mortgages originated in the United States are not held by the banks that originated them, but are instead securitized and sold as securities to investors. In this paper we focus on RMBS collateralized by mortgage loans originated without government guarantees, that is, nonagency RMBS. This sector was a significant portion of the overall mortgage market, reaching more than $2 trillion of securities outstanding in 2007 (Keys et al. (2013)). We discuss two aspects of this market below. First, we discuss the key players involved in loan origination and selling. Second, we discuss the information sets of the different players, which are critical in identifying misrepresentations at the time of sale.

Figure 1 presents a simple schematic of the key players involved in the creation and sale of assets in this market and the associated information flow. Banks originate mortgages either through their own branches or with the help of brokers. During this process, lenders collect detailed information regarding the borrowers’ characteristics, such as their credit scores and income, whether the property is backed by other debt (liens), and whether the acquired property is intended to be used as the borrower’s primary residence. In the case of a limited or no-documentation loan, some of this information, such as borrower income, can be self-reported with no additional documentation. Subsequent to loan origination, lenders pool groups of loans to create mortgage-backed securities or sell the loans to financial intermediaries who do the pooling and creation of such securities.

The structuring and sale of these securities are organized by the underwriters, which are typically large and reputable financial institutions that often also originate mortgages. The underwriters disclose and certify information regarding the loans underlying the mortgage-backed securities that was collected during the origination process preceding their sale. The pool characteristics are commonly disclosed in the prospectus in the section related to “representations and warranties.” This disclosure summarizes information about variables relevant for assessing the risk-return of the pool, such as loan-to-value ratios of loans in the pool, their interest rates, borrowers’ credit scores, and the occupancy status and location of the properties backing the mortgages. Each pool is also accompanied by a loan-level data file that contains detailed information regarding each mortgage in a given pool. In addition to prospective investors, this information can be used by rating agencies to rate the pool. Prior to the crisis, underwriters managed to sell the vast majority of mortgage securities. This includes the riskiest (equity-like) tranches, as according to the 2010 report of the Board of Governors to Congress, financial intermediaries “... generally sold the equity tranche, but they sometimes retained it temporarily after closing because of the difficulty in selling this tranche. However, securitizers often ultimately succeeded in selling this piece to other market participants.”

The buyers of mortgage-backed securities are typically institutions, such as pension funds, that are generally considered less financially sophisticated than underwriters, especially in the case of highly rated senior tranches, which comprise more than 80% of the value of the pool. These buyers rely to a large extent on the certification of these securities’ quality by the underwriters and ratings agencies. There are also some sophisticated buyers, such as hedge funds, that demand risky junior tranches. These institutions (and rating agencies) often use the detailed loan characteristics disclosed by the sellers as an input in their pricing models.

Since the information disclosed to investors by the underwriters plays an important role in assessing the value of mortgage-backed securities, underwriters are contractually responsible for guaranteeing that the underlying collateral in a pool is accurately represented. Of particular relevance are contractual obligations that force the lender or underwriter to repurchase the loan from the securitization trust. 3 These obligations may lead buyers of these securities to neglect performing an independent review of the accuracy of disclosed information. 4

As we explain in detail in Section III, we identify instances of misrepresentation by comparing characteristics of loans that were disclosed to investors in

3 For instance, the prospectus supplement for Series OOMC 2005-HE6 states that “If the seller or the originator fails to cure a material breach of its representations and warranties with respect to any mortgage loan in a timely manner, then the seller or the originator would be required to repurchase or substitute the defective mortgage loan.”

4 While repurchase clauses were common during 2005 to 2006, their enforcement was not as strict (Piskorski, Seru, and Vig (2010)).


the detailed loan-level data at the time of sale with characteristics of the same loans at the same time in a proprietary matched data set by a credit bureau. As we discuss in detail below, the latter data set contains highly accurate information on the characteristics of these loans. We therefore identify a loan as misrepresented if the characteristic of the loan at the time of sale that is disclosed to investors differs from that available in the credit bureau–generated data set. In running this analysis we take the perspective of an investor who used the detailed loan-level information available from the underwriter at the time of sale.

We focus on misrepresentation of the collateral backing the RMBS that our data enable us to identify. The loan characteristic of interest is the total debt backing the property, which serves as collateral for the mortgage sold to investors. In particular, we identify a loan as misreported if the loan is reported to investors as backed by property that has no associated higher liens, when the credit bureau data show that the property backing the first (senior) mortgage is also financed with a simultaneously originated second (junior) mortgage. Prospectus supplements commonly make statements regarding the total value of all liens on the collateralized property. For instance, the prospectus supplement of Series 2006-FF15 underwritten by Lehman Brothers states that the “Original full Combined Loan-to-Value Ratio reflects the original Loan-to-Value Ratio, including any subordinate liens, whether or not such subordinate liens are owned by the Trust Fund.”

It is worth reiterating that the information contractually disclosed to investors by underwriters allows investors to assess the risk of the security. In particular, previous research shows that mortgages with a higher CLTV are often associated with greater default risk (see Mayer, Pence, and Sherlund (2009)). Thus, the misrepresentations we focus on (omitting information on the junior mortgage on a property) can understate the true risk associated with the pool collateral. This may imply that RMBS investors took on more risk than indicated by the contractual disclosure. These assertions can be tested against the null hypothesis that these misrepresentations, if present, did not have a meaningful impact, either because they did not matter for performance or because investors were able to differentiate pools that misrepresented assets from those that did not.

II. Data

Our primary data set links two databases that allow us to construct our measures of asset collateral misrepresentation: (i) loan-level mortgage data collected by BlackBox Logic and (ii) borrower-level credit report information collected by Equifax.

BlackBox is a private company that provides a comprehensive, dynamic data set with information about 21 million privately securitized Subprime, Alt-A, and Prime loans originated after 1999. These loans account for about 90% of all privately securitized mortgages from that period. The BlackBox data, which are obtained from mortgage servicers and securitization trustees, include static

information taken at the time of origination, such as the mortgage origination date and amount, borrower FICO credit score, 5 servicer name, interest rate, term, interest rate type, CLTV, and borrower occupancy status. The BlackBox data also include dynamic data on monthly payments, mortgage balances, and delinquency status. Importantly, this database collects information from trustees of mortgage pools concerning characteristics of mortgages in the pool that were disclosed to investors at the time the pool was sold. In other words, these data contain the loan-level data file, that is, detailed information on origination characteristics of each mortgage in a given pool, provided to investors.

The other data set that we use is from Equifax, a major credit reporting agency that collects information from various sources and provides monthly data on borrowers’ current credit scores, payments, and balances on mortgage and installment debt as well as revolving debt (such as credit cards and home-equity lines of credit (HELOCs)). The banking industry critically relies on data from credit bureaus for assessing the creditworthiness of borrowers. A large body of research concludes that such data have strong predictive power in making such an assessment (e.g., Piskorski, Seru, and Vig (2010)). Consequently, lenders have strong incentives to correctly report the characteristics of borrowers and their loans to credit bureaus. In addition, credit bureaus often cross-validate the records using other sources, limiting the occurrence of reporting errors. Thus, the credit bureau data are generally believed to contain highly accurate information. Importantly, the RMBS investors did not have access to mortgage records in these data at the time of sale.

Equifax recently linked its credit information data to the BlackBox data using a proprietary match algorithm that merges on more than 25 variables (see the Internet Appendix for more details). 6 We use the merged data provided by Equifax to identify whether information in credit bureau data at the time of sale differs from that disclosed to investors as captured through the BlackBox sample.

Two comments are in order about the merged data. First, Equifax reports a merge confidence indicator that ranges from low to high confidence. Low confidence implies that a given record could have multiple matches, while high confidence implies that the match is close to perfect. Not surprisingly, as the degree of confidence increases, the sample available for analysis becomes smaller. In our analysis we restrict the sample to loans that have the highest Equifax merge confidence level. While this restriction reduces the sample size, it ensures that the misrepresentations we identify are not due to merging errors. We note that our analysis is limited to loans originated between 2005 and 2007, years for which Equifax provides us this high-confidence merge sample. Second, we limit attention to those loans disclosed to investors as first liens. After imposing these restrictions, we obtain a base sample of 1.9 million

5 The FICO credit score ranges from 350 to 850, with a higher score indicating a more creditworthy borrower.

6 The Internet Appendix may be found in the online version of this article on the Journal of Finance website.


loans. The Internet Appendix shows that this base sample has slightly higher quality observables than those in the overall data.
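The sample restrictions described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the record layout, the field names, and the confidence labels ("low", "medium", "high") are hypothetical and do not reflect the actual BlackBox or Equifax schema.

```python
from dataclasses import dataclass

@dataclass
class LoanRecord:
    loan_id: str
    origination_year: int
    merge_confidence: str  # hypothetical labels: "low", "medium", "high"
    lien_position: int     # 1 means the loan was disclosed to investors as a first lien

def in_base_sample(loan: LoanRecord) -> bool:
    """Keep highest-confidence merges, 2005 to 2007 originations, and disclosed first liens."""
    return (
        loan.merge_confidence == "high"
        and 2005 <= loan.origination_year <= 2007
        and loan.lien_position == 1
    )

# Hypothetical records for illustration only.
loans = [
    LoanRecord("A", 2006, "high", 1),    # kept
    LoanRecord("B", 2006, "medium", 1),  # dropped: merge confidence below the highest level
    LoanRecord("C", 2004, "high", 1),    # dropped: originated outside 2005 to 2007
    LoanRecord("D", 2007, "high", 2),    # dropped: not disclosed as a first lien
]

base_sample = [loan for loan in loans if in_base_sample(loan)]
print([loan.loan_id for loan in base_sample])
```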

III. Constructing and Validating the Second-Lien Misrepresentation Measure

A. Constructing the Second-Lien Misrepresentation Measure

Our measure of misrepresentation identifies instances in which a first-lien loan is reported to investors as backed by property that has no associated higher liens when the credit bureau data show that the property backing the first mortgage is also financed with a simultaneously originated second mortgage. To construct this measure, we use the CLTV for all liens on the property at the time of loan origination that was disclosed to investors as part of the loan-level information (a majority of loan files do report CLTV). If the lender is unable to ascertain this value or is not willing to disclose this information, the loan is usually given a missing CLTV. While it is possible that some missing CLTV values could be due to asset misrepresentation, it is hard to know this with certainty. We thus take a conservative approach and do not consider loans with missing CLTV values as having been misrepresented. Instead, we restrict our sample to loans with nonmissing CLTV, where the reported CLTV is within 1% of the loan’s loan-to-value ratio (LTV), 7 as we can be confident that the securitized first mortgage (senior loan) for these loans was reported to the trustee of RMBS as having only one lien on the property. This yields a sample of 854,959 loan files that report no second liens to investors.
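The screen for loan files that report no second lien can be sketched as follows. The function name is hypothetical, and the tolerance is interpreted here as an absolute difference of one percentage point between reported CLTV and LTV; the text says only "within 1%", so that reading is an assumption.

```python
from typing import Optional

def reports_no_second_lien(
    ltv_pct: float,
    cltv_pct: Optional[float],
    tolerance_pp: float = 1.0,  # assumed reading of "within 1%": one percentage point
) -> bool:
    """True when the disclosed CLTV is nonmissing and essentially equal to the LTV."""
    if cltv_pct is None:
        # Conservative treatment: loans with missing CLTV are excluded from the
        # sample rather than counted as misrepresented.
        return False
    return abs(cltv_pct - ltv_pct) <= tolerance_pp

# Hypothetical disclosed values, in percent.
print(reports_no_second_lien(80.0, 80.0))   # CLTV equals LTV: no second lien reported
print(reports_no_second_lien(80.0, 100.0))  # CLTV above LTV: a second lien is disclosed
print(reports_no_second_lien(80.0, None))   # missing CLTV: excluded
```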

Using this sample, we construct the misreported second-lien measure based on the observation that the credit bureau (Equifax) data include information about other mortgages held by each borrower. In particular, we can examine whether loans reported to RMBS investors as having no simultaneous second liens do in fact have a second lien reported in the credit bureau data. 8 If this is the case, we classify the loan file as having a misreported second lien. Because we focus on data with the highest merge quality, we are confident that such instances represent second liens that were misreported to RMBS investors.

7 We have also verified that loans reported to investors as having second liens do indeed have such mortgages reported in the credit bureau data. In the vast majority of cases (more than 90%), loans that report origination CLTV greater than LTV have a simultaneous second-lien loan present in the Equifax database. The remainder likely represents cases in which lenders have not promptly reported their records to the credit bureau, which we confirmed with the data vendor. If anything, this makes our measure of misrepresentation a conservative estimate since we may not detect some misreported second liens.

8 We take a conservative approach and, throughout most of our analysis, do not consider loans as misrepresented if the CLTV reported by the trustee does not include a HELOC balance in its calculation. Despite the fact that prospectuses often include such loans in the definition of CLTV, one could argue that such loans are harder to categorize as misrepresented mortgages and instead could be perceived as revolving debt similar to credit card debt. The level of misrepresentation is much higher if we consider HELOCs as second liens (see Section IV.A), and our other results are robust to such extension of our misrepresentation measure.

In applying this measure we make a judgment call in identifying second liens as either simultaneously or subsequently originated. We classify simultaneous second liens as those that appear in the credit bureau data with an origination date within 45 days of that of the first lien. A loan is classified as having a misreported second lien if it is reported to investors as having no second lien but shows such a lien in the credit bureau data within this time window. The 45-day window allows for small differences in the recording of dates. 9 It is quite unlikely that a borrower would obtain a subsequent second lien on a mortgage within this time period without the lender of the first lien having that information when reporting to investors. Our results are not sensitive, however, to changes in the length of this window.
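The classification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`FirstLien`), the function names, and the reading of "within 1%" as one percentage point are all hypothetical and are not taken from the paper's data files. The sketch also ignores the HELOC exclusion discussed in footnote 8.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence

CLTV_TOLERANCE = 1.0  # assumption: "within 1%" read as one percentage point
WINDOW_DAYS = 45      # simultaneity window stated in the text


@dataclass
class FirstLien:
    """Hypothetical record for a first mortgage as reported to investors."""
    ltv: float             # reported loan-to-value ratio, in percent
    cltv: Optional[float]  # reported combined LTV, in percent; None if missing
    origination: date      # first-lien origination date


def reports_no_second_lien(loan: FirstLien) -> bool:
    """True if the loan file implies a single lien (nonmissing CLTV close to LTV)."""
    if loan.cltv is None:
        return False  # loans with missing CLTV are left out of the sample
    return abs(loan.cltv - loan.ltv) <= CLTV_TOLERANCE


def has_misreported_second(loan: FirstLien,
                           bureau_second_liens: Sequence[date]) -> bool:
    """True if the loan reports no second lien, yet the credit bureau shows a
    second lien originated within WINDOW_DAYS of the first lien."""
    if not reports_no_second_lien(loan):
        return False
    return any(abs((d - loan.origination).days) <= WINDOW_DAYS
               for d in bureau_second_liens)


# Illustrative use with made-up dates.
example = FirstLien(ltv=80.0, cltv=80.0, origination=date(2006, 3, 1))
print(has_misreported_second(example, [date(2006, 3, 1)]))
print(has_misreported_second(example, [date(2006, 9, 1)]))
```

Changing `WINDOW_DAYS` is the natural way to probe the sensitivity to the window length mentioned in the text.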

B. Validating the Quality of the Second-Lien Misrepresentation Measure

Our misrepresentation measure relies on Equifax data being correctly matched with BlackBox data. In this section we consider whether the match between the two data sets, done by the credit bureau, is of high quality.

As we discuss above, we restrict our sample to loans that have the highest merge confidence level assigned by the credit bureau. Independent analysis of the quality of these data confirms that the merge is of very high quality. In particular, we cross-check fields such as dynamic payment history, origination balance, and origination date across the two data sets. The Internet Appendix shows that the vast majority of these fields are the same across the two databases for the matched sample. For example, the match between payment histories increases monotonically with Equifax’s confidence measure, with only 0.3% of the loans with the highest merge confidence level having a different payment status.
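A field-by-field cross-check of this kind can be sketched as follows. The function name and the field names are hypothetical placeholders, and the two toy record pairs are made up; the sketch only shows how a per-field agreement rate over matched pairs might be computed.

```python
from typing import Dict, Mapping, Sequence, Tuple

Record = Mapping[str, object]


def field_agreement_rates(pairs: Sequence[Tuple[Record, Record]],
                          fields: Sequence[str]) -> Dict[str, float]:
    """Share of matched record pairs whose values agree, computed per field."""
    if not pairs:
        raise ValueError("need at least one matched pair")
    return {
        f: sum(1 for a, b in pairs if a.get(f) == b.get(f)) / len(pairs)
        for f in fields
    }


# Two made-up matched pairs (securitization record, credit bureau record).
toy_pairs = [
    ({"origination_balance": 200_000, "origination_date": "2006-03-01"},
     {"origination_balance": 200_000, "origination_date": "2006-03-01"}),
    ({"origination_balance": 150_000, "origination_date": "2006-05-15"},
     {"origination_balance": 151_000, "origination_date": "2006-05-15"}),
]
print(field_agreement_rates(toy_pairs, ["origination_balance", "origination_date"]))
```

Computing these rates separately for each merge confidence level would give the kind of monotonicity check described in the text.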

Further validation of the match quality comes from analysis that assesses delinquency rates on misrepresented loans. Evidence in the literature indicates that loans with second liens have higher delinquency rates (see Mayer, Pence, and Sherlund (2009)). Hence, if our measures correctly identify misrepresentations, we would expect such loans to have a higher default pattern compared with loans that are not misrepresented. Analysis in Section IV shows that ex post delinquencies of loans that we identify as misrepresented are significantly higher compared with otherwise similar loans. This evidence supports the view that our identification procedure captures actual misrepresentations of asset quality instead of incorrectly merged records.

This argument finds additional support in a placebo test: incorrectly merged records should not have a strong directional relationship with subsequent loan performance. In particular, in the Internet Appendix we focus on the few records in our database for which the loan balance of the first mortgage does not exactly match across the two databases. The balance of the

9 Allowing for such a window is quantitatively unimportant as the vast majority of loans that we classify as having misreported second liens have the exact same second-lien origination date (in the Equifax database) as the first-lien origination date (in the BlackBox database).

The Journal of Finance®

securitized first mortgage is unlikely to be misreported to investors because servicers verify and report the outstanding loan amount and payments on a monthly basis to the securitization trust. Hence, differences in such records between the two data sets likely indicate incorrectly merged loans across the two databases. We find that these incorrectly matched loans are not associated with subsequent adverse performance, in contrast to misrepresented loans.

Finally, if our sample consists of correctly merged records, we should be able to directly identify whether a given loan has an undisclosed second lien. As we discuss in Section V, we are able to cross-validate the merge quality of our databases using an internal database from a large subprime lender in which almost all loans (more than 93%) that we identify as having misreported second liens do indeed have such liens reported in the bank’s internal data. In addition, we verify that almost all loans that we identify as having misreported second liens do have such liens in a sample of our data merged with the registry of deeds (more than 93%). This exercise provides an independent verification of our methodology and of the accuracy of Equifax data. 10

C. Lower Bound

It is important to note that our estimate of misrepresentation in the RMBS market is likely to be a lower bound, which should bias against finding that misrepresented loans have higher ex post delinquencies relative to otherwise identical loans for two reasons. First, we consider only one dimension of misrepresentation due to lack of data that would allow us to construct other objective measures of misrepresentation. Of course, many other types of misrepresentation are possible, such as manipulation of owner occupancy status of a home, credit scores, income, assets, value of the home, etc. Second, we do not identify misrepresentations on loans for which second liens exist but are not reported to Equifax by lenders. We return to this issue in Section VII when we discuss the broader implications of our findings.

10 Our methodology treats the records in Equifax as an accurate representation of the characteristics of borrowers and their loans. As discussed earlier, credit bureau data can be thought of as an internal database of the banking industry. Lenders have strong incentives to correctly report the characteristics of borrowers and their loans since consumer credit records are critical for assessing borrowers’ creditworthiness. In addition, since the information collected by the credit bureau feeds into the credit scores of borrowers, the borrowers themselves have strong incentives to correct any discrepancy. Not surprisingly, a wide body of research shows that the information collected by credit bureaus, summarized in their credit scores, has a strong predictive power in explaining borrower risk (see Keys et al. (2010)). While there could be discrepancies in credit bureau records, most concern borrowers’ delinquency status, which our methodology does not rely upon (see, e.g., December 2012 Federal Trade Commission Report to Congress). Whether the borrower has a simultaneous second lien is based on a loan reported under a borrower’s name, and such records are commonly reviewed by borrowers. The strongest evidence for the accuracy of Equifax data comes from the analysis that shows significantly higher ex post delinquencies of misrepresented loans relative to otherwise identical loans.

IV. Extent of Second-Lien Misrepresentation, Loan Performance, and Pricing

In this section we use the measure discussed above to quantify the extent of second-lien misrepresentation, and its relation with both ex post delinquency on loans and mortgage pricing by lenders.

A. Overall Level of Second-Lien Misrepresentation

We start our analysis by quantifying the extent of second-lien misrepresentation. Table I reports statistics for loans that we identify as misreporting second liens to RMBS investors. To facilitate interpretation of our results, Panel A reports the summary statistics for the overall sample of loans that report no second liens to investors as well as for the subset of these loans that are misrepresented on the dimension of second lien. Panel B reports the percentage of misrepresented loans in various subsamples of our data.

As we observe from Panel B, about 7.1% of mortgages that do not report second liens to investors are misrepresented, since these loans have a simultaneous second lien in the credit bureau data. The misrepresented loans overstate the borrower’s housing equity significantly. In particular, misrepresented loans report an initial CLTV of about 80% on average, which is about 19.5 percentage points lower than the true value (with a standard deviation of about eight percentage points). This implies that, contrary to the information given to investors, borrowers with misreported second liens had very little equity left in their homes. It is also worth noting that, if we treat the presence of HELOCs as equivalent to the presence of second liens, as in many prospectuses disclosed to investors, the overall level of misreported second liens is more than twice as large: 16.64% of loans that report no second lien actually have a closed-end second mortgage or a HELOC with positive outstanding balance originated at the same time as a first mortgage.
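As a rough check on these magnitudes, using only the rounded averages quoted above (the symbols are introduced here for illustration), the average true CLTV and the implied average equity of borrowers with misreported second liens are approximately

$$\overline{\text{CLTV}}_{\text{true}} \approx \overline{\text{CLTV}}_{\text{reported}} + \overline{\Delta} \approx 80 + 19.5 = 99.5, \qquad \overline{\text{Equity}} \approx 100 - 99.5 = 0.5,$$

with all quantities in percent of the property value. Investors were thus told that these borrowers held roughly 20% equity on average, when the implied figure is close to zero.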

One may expect that asset misrepresentation, and, more broadly, lower reported data quality, was signaled to investors through the low- or no-documentation status of loans. However, we find a significant extent of misrepresentation even when we focus on fully documented loans. About 7.93% of fully documented loans stating that a second lien is not present actually had such a second lien. We return to this fact in Section V.

We note that the prevalence of misrepresentation is higher for loans used to purchase properties as opposed to mortgages used to refinance existing loans. Among purchase loans reported as having no second liens, about 13.75% misrepresent second-lien status compared with only 3.20% for refinance mortgages. This estimate implies that, among homes financed with simultaneous second liens, about 25% involved misreporting the presence of a second lien to investors. There are a number of reasons why misrepresentation could be more prevalent for purchase loans. Chief among these is the fact that second-lien loans are less common in refinance transactions because often a lender

Table I

Descriptive Statistics and Average Second-Lien Misrepresentation Level

Panel A presents summary statistics of key variables for mortgages reported as having no simultaneous second liens to the RMBS trustee (loans reported as such in the BlackBox data set). The sample comprises these loans merged with a high level of confidence with the credit bureau data. Panel A also shows summary statistics for the subsample of these mortgages that consists of loans with misreported second liens. FICO is the borrower’s FICO credit score at loan origination. Balance is the initial loan balance (in thousands of dollars). CLTV is the loan’s origination combined loan-to-value ratio in percentage terms reported to investors. No Cash Out Refi and Cash Out Refi are dummies that take a value of one if the loan purpose was a no-cash-out refinancing or cash-out refinancing, respectively, and zero otherwise. Low or No Doc. is a dummy that takes a value of one if the loan was originated with no or limited documentation, and zero otherwise. ARM and Option ARM are dummies that take a value of one if the loan type was an ARM or option ARM, respectively, and zero otherwise. Panel B presents the percent of loans reported as having no simultaneous second liens that are misrepresented (loans identified as having simultaneous second liens based on our method). Column (1) shows results for the overall sample, column (2) for the subsample of fully documented loans, column (3) for the subsample of loans used to purchase homes, and column (4) for the subsample of loans used to refinance existing mortgages, while column (5) shows results when we expand the definition of simultaneous second liens to HELOCs originated simultaneously with the first-lien mortgage.

Panel A: Descriptive Statistics for Mortgages Reported as Having No Simultaneous Second Liens

                       Overall sample       Loans with misreported second liens
                       Mean      SD         Mean      SD
FICO                   [values missing]
[label missing]        0.36      0.48       0.71      0.45
No Cash Out Refi       0.12      0.32       0.08      0.28
Cash Out Refi          0.50      0.50       0.19      0.39
Low or no doc.         0.51      0.50       0.45      0.49
ARM                    0.47      0.49       0.70      0.45
Option ARM             0.16      0.37       0.07      0.25
Number of loans        [values missing]

Panel B: Level of Misrepresented Second Liens

                       Refinance loans      Overall sample (with HELOCs)
                       (4)                  (5)
Percent misreported    3.20%                16.64%

Difference between actual CLTV and reported CLTV for loans with misreported seconds: Mean = 19.57% (SD = 8.00%).

will refinance outstanding first and second liens with one loan, eliminating the possibility of second-lien misrepresentation in refinancing. Indeed, in our data about 37.4% of purchases are financed with simultaneous second liens compared with only 14.2% of refinance transactions.

Finally, we note that loan misrepresentation does not concentrate in specific geographic areas. In particular, while there is significant regional variation in the prevalence of misrepresentation across the United States, a sizable degree of misrepresentation exists in loans originated in the vast majority of U.S. states. We return to this issue in Section VII.E.

Next, we investigate the degree to which asset misrepresentation varies with observable characteristics over time. To do so, we estimate loan-level regressions of the following form:

$$Y_i = \alpha + \beta X_i + \gamma \times \text{Misreported Second}_i + \varepsilon_i, \qquad (1)$$

where the key dependent variable, Misreported Second, is a dummy variable that takes a value of one if loan $i$ states to investors that it has no simultaneous second lien but our procedure indicates that there is such a lien, and zero otherwise. Because we are interested in how asset misrepresentation is related to observable loan characteristics, we include a vector $X_i$ that consists of loan-level observable characteristics such as credit score and CLTV.
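To make the estimation concrete, the following is a minimal sketch of a linear probability model in which the Misreported Second indicator is regressed by OLS on loan characteristics, which is how Table II describes its dependent variable. The helper `ols` and the toy data are hypothetical illustrations written in plain Python; they do not reproduce the paper's sample, controls, fixed effects, or clustered standard errors.

```python
from typing import List, Sequence


def ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return OLS coefficients (intercept first) by solving the normal equations."""
    rows = [[1.0, *map(float, r)] for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            raise ValueError("regressors are collinear")
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Toy data: one regressor (a purchase-loan dummy) and a 0/1 misreporting flag.
purchase = [[1], [1], [1], [1], [0], [0], [0], [0]]
misreported = [1, 0, 0, 0, 0, 0, 0, 0]
intercept, slope = ols(purchase, misreported)
print(intercept, slope)
```

With a single dummy regressor, the slope is simply the difference in misreporting rates between the two groups, which is why a linear probability model is easy to read alongside the subsample percentages in Table I.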

We present the results in Table II. Column (1) presents results including a set of basic controls. 11 Column (2) adds half-year fixed effects capturing the loan origination date, which allows us to track the evolution of asset misrepresentation over time. Column (3) adds controls for the location of the property backing the loan, column (4) adds fixed effects capturing the identity of the underwriter that sold a loan, and column (5) reports standard errors clustered at the state level.

Our results reveal that loans with misreported second liens tend to have higher credit scores, smaller origination balances, and lower reported CLTV ratios. Moreover, these loans are more likely to be fully documented as indicated by the negative and significant coefficients for low- and no-documentation status. Consistent with our earlier results (Table I), misrepresentation of second liens is more common among purchase loans, as evidenced by significant and negative coefficients on refinance dummies. In addition, these loans are less likely to be option adjustable-rate mortgages (ARMs). 12 These features broadly depict patterns expected of typical loans with second liens. This evidence also suggests that the lenders originating the first-lien loans may have been aware of the presence of second liens—given that they were more likely

11 Throughout the paper we estimate our specifications using OLS despite the binary nature of several of the dependent variables. Our OLS specification with flexible controls to capture nonlinearity allows us to estimate our coefficients consistently even with multiple fixed effects (Johnston and DiNardo (1997)). As we illustrate in the Internet Appendix, our inferences are very similar using a nonlinear specification (probit) instead.

12 See Piskorski and Tchistyi (2010) and Amromin et al. (2011) for a discussion of option adjustable-rate mortgages.

Table II

Second-Lien Misrepresentation across Borrower and Loan Characteristics

This table presents OLS estimates from regressions in which the dependent variable takes a value of one if the loan is Misreported Second Lien and zero otherwise. The controls represent the values reported to investors (the corresponding values in the BlackBox database). Column (2) adds fixed effects corresponding to the loan origination time, with 2005 being the omitted category. Column (3) adds fixed effects corresponding to the location of the property backing the loan. Column (4) adds fixed effects capturing the identity of the underwriter that sold a loan. Column (5) reports standard errors clustered at the state level. All estimates are in percentage terms; standard errors are in parentheses. *p < 0.10, **p < 0.05, and ***p < 0.01. Variables are defined in Table I.

Standard errors for columns (4) and (5); coefficient estimates and columns (1) to (3) are missing.

                            (4)          (5)
[label missing]             (0.00307)    (0.0643)
No Cash Out Refi            (0.0919)     (0.713)
Cash Out Refi               (0.0651)     (0.755)
Low or no doc.              (0.0670)     (0.337)
Option ARM                  (0.128)      (1.114)
Originated in 2006H1        (0.0764)     (0.233)
Originated in 2006H2        (0.0810)     (0.321)
Originated in 2007H1        (0.101)      (0.418)
Originated in 2007H2        (0.262)      (0.406)
State fixed effects         Yes          Yes
Underwriter fixed effects   Yes          Yes
SEs clustered by state      No           Yes
Number of loans             854,959      854,959
Percent misrepresented      [values missing]

to collect fully documented information on assets, income, and employment of borrowers when granting the first mortgage—and that information regarding the second lien was indeed misrepresented to investors. We investigate this hypothesis further in Section V.

Although the addition of origination cohort controls in column (2) does not seriously impact the sign or significance of the regression coefficients, the fixed effects themselves point to an interesting trend. These coefficients should be