Student Debt Story

https://public.tableau.com/views/TableauWorkbook_11/StudentDebtStory?:embed=yes&:display_count=no&:showVizhome=no

The issue of student debt is one of the most pressing for the millennial generation. Millions of young people every year leave college with an ever increasing burden of debt, threatening their purchasing power and financial security. The data bears these facts out, as the tableau story below will demonstrate. The purpose of this story is to demonstrate that student debt continues to be an unaddressed issue and is affecting the viability and ultimate value of the current higher education model.

The story type I used was change over time. Using the college scorecard data set, I was able to pull a decade of tuition, debt and income data for Non-Profit and Public institutions. The story looks at change in tuition, total debt and debt-to-income from 2003 to 2013. I provided three story points in order to tell the story. Each story point should logically build on the previous one Рsetting the stage with the increase in tuition leads to the consequential increase in debt burden. Because tuition exceeds the increase in income over the period Рstudent debt-to-income naturally increases as well.

Counting Illinois Colleges

To dive deeper into my dataset, I used Google BigQuery to run SQL queries on my data table. Because of the size of my dataset (over 10MB) I had to first upload the file to Google Cloud.

In order to sample the use of BigQuery on my dataset, I decided to query the number of accredited undergraduate degree granting publc and non-profit institutions in the state of Illinois. In order to run this query I constructed the following in SQL:

SELECT (*)
FROM [neat-sunspot-164714:CollegeScoreCard.Dataset_2013]
WHERE STABBR = ‘IL’
AND PREDDEG = 3
AND CONTROL != 3

The structure of the query is to select all the results (not excluding any columns) from my table where the state is Illinois, the predominate degrees granted are for undergraduate, and control of the institution is not for-profit.

A count expression provides: 71 results

I’m actually somewhat surprised by the number of results, however they’re confirmed by viewing the institution names, id’s and cities from the table. The query took about 5 seconds to run, despite needing to run through tens of thousands of rows.

UNITID INSTNM CITY STABBR
145691 Illinois College Jacksonville IL
146481 Lake Forest College Lake Forest IL
143288 Blackburn College Carlinville IL
146825 MacMurray College Jacksonville IL
148405 Rockford University Rockford IL
144351 Concordia University-Chicago River Forest IL
147341 Monmouth College Monmouth IL
148131 Quincy University Quincy IL
143084 Augustana College Rock Island IL
148496 Dominican University River Forest IL
149505 Trinity Christian College Palos Heights IL
148584 University of St Francis Joliet IL
147660 North Central College Naperville IL
145725 Illinois Institute of Technology Chicago IL
149514 Trinity International University-Illinois Deerfield IL
145372 Greenville College Greenville IL
147679 North Park University Chicago IL
147244 Millikin University Decatur IL
147013 McKendree University Lebanon IL
144962 Elmhurst College Elmhurst IL
143118 Aurora University Aurora IL
145619 Benedictine University Lisle IL
147536 National Louis University Chicago IL
147828 Olivet Nazarene University Bourbonnais IL
143358 Bradley University Peoria IL
148654 University of Illinois at Springfield Springfield IL
146612 Lewis University Romeoville IL
145336 Governors State University University Park IL
148627 Saint Xavier University Chicago IL
144883 East-West University Chicago IL
148487 Roosevelt University Chicago IL
146719 Loyola University Chicago Chicago IL
144892 Eastern Illinois University Charleston IL
144005 Chicago State University Chicago IL
149231 Southern Illinois University-Edwardsville Edwardsville IL
149772 Western Illinois University Macomb IL
147776 Northeastern Illinois University Chicago IL
145813 Illinois State University Normal IL
148335 Robert Morris University Illinois Chicago IL
144281 Columbia College-Chicago Chicago IL
144740 DePaul University Chicago IL
145637 University of Illinois at Urbana-Champaign Champaign IL
145600 University of Illinois at Chicago Chicago IL
147703 Northern Illinois University Dekalb IL
149222 Southern Illinois University-Carbondale Carbondale IL
147369 Moody Bible Institute Chicago IL
149639 VanderCook College of Music Chicago IL
143297 Blessing Rieman College of Nursing Quincy IL
148511 Rush University Chicago IL
147590 National University of Health Sciences Lombard IL
149028 Saint Anthony College of Nursing Rockford IL
149763 Resurrection University Chicago IL
148575 Saint Francis Medical Center College of Nursing Peoria IL
146533 Lakeview College of Nursing Danville IL
147129 Methodist College Peoria IL
144971 Eureka College Eureka IL
146427 Knox College Galesburg IL
145646 Illinois Wesleyan University Bloomington IL
146667 Lincoln Christian University Lincoln IL
149781 Wheaton College Wheaton IL
146339 Judson University Elgin IL
143048 School of the Art Institute of Chicago Chicago IL
144050 University of Chicago Chicago IL
147767 Northwestern University Evanston IL
148593 St. John’s College-Department of Nursing Springfield IL
145558 Rosalind Franklin University of Medicine and Science North Chicago IL
148849 Shimer College Chicago IL
149329 Telshe Yeshiva-Chicago Chicago IL
145497 Hebrew Theological College Skokie IL
146621 Lexington College Chicago IL
260947 Christian Life College Mount Prospect IL

Finding The Right Tools

 

powerbi-desktop

When it comes to transforming data into information, the tool used is often as important as the data itself. The College Scorecard data-set will require data cleansing and any tool used to create visualizations will need to be able to handle large data volumes. The first step to analyzing data is to understand the structure of your data-set.

To begin, I will use Microsoft Excel to open the .CSV files in the raw data dump and better understand how the data is structured. This will give me an idea of what is relevant and usable in the data-set. It will also give me the beginning of an understanding of what data needs to be cleansed and what columns or rows will be irrelevant to the analysis. Excel can provide quick and easy formulas, sums and averages for simple data calculations.

Second, I will look to use a tool such as Microsoft PowerBI to attempt to analyze and visualize the data. PowerBI allows powerful visualization and the adding of fields and filters to present information in the most relevant manner. Finally, it allows the establishment of relationships between fields, tables and even data-sets. Understanding a data-set, its components and the relationships between fields and tables is important to utilizing it as effective information.