Cross Validation of my Lending Club model

Several months ago, one of my assignments was to perform a multivariate analysis of Lending Club data to see if I could come up with a model that would accurately predict the interest rate offered to a potential borrower, given several indicators such as FICO score, home ownership, term length of the loan, loan amount, etc.

I recently revisited this assignment as part of learning how to perform cross validation in Python, and in doing so I was able to increase the performance of my model slightly, from an R² of 0.657 to 0.689. My analysis is up on my GitHub if you’d like to check it out, but the TL;DR version is that the most reliable indicators of the interest rate offered by Lending Club are FICO score and the term length of the loan.
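For anyone curious what cross validation looks like in practice, here is a minimal sketch of k-fold cross validation in plain Python. It is not the code from my analysis: the data is synthetic (made-up FICO scores and rates), the model uses a single predictor rather than the full multivariate fit, and the helper names (`fit_line`, `r_squared`, `k_fold_r2`) are just ones I picked for illustration. The idea is to split the data into k folds, hold each fold out in turn, fit on the rest, and score R² on the held-out fold.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def k_fold_r2(xs, ys, k=5, seed=0):
    """Return the held-out R^2 for each of k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)           # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]      # k disjoint folds covering all rows
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit only on the training rows, then score on the held-out rows.
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in held_out]
        scores.append(r_squared([ys[i] for i in held_out], preds))
    return scores


# Synthetic data (an assumption for illustration, not Lending Club data):
# the rate falls as the FICO score rises, plus some noise.
rng = random.Random(42)
fico = [rng.uniform(640, 820) for _ in range(200)]
rate = [70.0 - 0.08 * f + rng.gauss(0, 1.5) for f in fico]

scores = k_fold_r2(fico, rate, k=5)
mean_r2 = sum(scores) / len(scores)
print("per-fold R^2:", [round(s, 3) for s in scores])
print("mean R^2:", round(mean_r2, 3))
```

Averaging the per-fold scores gives a more honest estimate of how the model does on data it has not seen than a single R² computed on the training set. In a real project a library such as scikit-learn (for example its `cross_val_score` helper) would do the splitting and scoring for you.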

Written on December 1, 2015