When it comes to transforming data into information, the tool used is often as important as the data itself. The College Scorecard data-set will require data cleansing, and any tool used to create visualizations will need to handle large data volumes. The first step in analyzing data is to understand the structure of the data-set.
To begin, I will use Microsoft Excel to open the .CSV files in the raw data dump and see how the data is structured. This will show me what is relevant and usable in the data-set, and give me a first sense of which data needs to be cleansed and which columns or rows are irrelevant to the analysis. Excel also provides quick formulas, such as sums and averages, for simple calculations.
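The same first-pass inspection can also be sketched in code, which may help when a file is awkward to scroll through by hand. The following is a minimal sketch, not the method used in this analysis: it reports, for each column, the share of rows with no usable value, which is one rough signal of which columns may need cleansing or may be irrelevant. The function name `profile_columns`, the sample rows, and the set of placeholder values treated as missing are all assumptions made for illustration.

```python
import csv
import io

# Placeholder values treated as missing. This set is an assumption for the
# sketch; check the actual files and adjust it as needed.
MISSING_MARKERS = {"", "NULL", "PrivacySuppressed"}


def profile_columns(csv_file):
    """Return {column name: fraction of rows where the value is missing}."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    missing_counts = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            value = row.get(name)
            # Short rows yield None; otherwise compare against the markers.
            if value is None or value.strip() in MISSING_MARKERS:
                missing_counts[name] += 1
    if row_count == 0:
        return {name: 0.0 for name in columns}
    return {name: missing_counts[name] / row_count for name in columns}


# Tiny made-up sample standing in for one of the raw .CSV files.
sample = io.StringIO(
    "UNITID,INSTNM,ADM_RATE\n"
    "1,Example College,0.5\n"
    "2,Sample University,NULL\n"
    "3,Demo Institute,PrivacySuppressed\n"
    "4,Test State,0.75\n"
)
missing_share = profile_columns(sample)
```

To run this against a real file, the `io.StringIO` sample could be replaced with `open(path, newline="")`, where `path` points at one of the downloaded .CSV files. Columns with a very high missing share are candidates for exclusion, though that judgment still depends on the question being asked.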
Second, I will use a tool such as Microsoft Power BI to analyze and visualize the data. Power BI supports powerful visualizations and lets me add fields and filters to present information in the most relevant manner. It also allows relationships to be established between fields, tables and even data-sets. Understanding a data-set, its components and the relationships between its fields and tables is essential to turning it into useful information.
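To make the idea of a relationship between tables concrete, here is a small conceptual sketch. It is not how Power BI implements relationships; it only illustrates the underlying idea that two tables are linked through a shared key field, so that each row on the "many" side can pick up attributes from the matching row on the "one" side. The table contents, the field names and the helper `relate` are hypothetical.

```python
# Hypothetical tables: one row per institution, and several rows per
# institution (one per year) in a second table.
institutions = [
    {"UNITID": "1", "INSTNM": "Example College"},
    {"UNITID": "2", "INSTNM": "Sample University"},
]
yearly_costs = [
    {"UNITID": "1", "YEAR": "2013", "COST": 20000},
    {"UNITID": "1", "YEAR": "2014", "COST": 21000},
    {"UNITID": "2", "YEAR": "2014", "COST": 15000},
]


def relate(one_side, many_side, key):
    """Attach the matching one-side row to each many-side row via a shared key."""
    # Index the "one" side by key so each lookup is a single dictionary access.
    lookup = {row[key]: row for row in one_side}
    related = []
    for row in many_side:
        match = lookup.get(row[key])
        if match is not None:  # rows without a match are skipped
            related.append({**match, **row})
    return related


joined = relate(institutions, yearly_costs, "UNITID")
```

Each row in `joined` now carries both the institution name and the yearly figure, which is the kind of combined view a relationship makes available to a visualization.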