Finding The Right Tools

 


When it comes to transforming data into information, the tool used is often as important as the data itself. The College Scorecard data-set will require data cleansing, and any tool used to create visualizations will need to handle large data volumes. The first step in analyzing data is to understand the structure of the data-set.

To begin, I will use Microsoft Excel to open the .CSV files in the raw data dump and better understand how the data is structured. This will show me what is relevant and usable in the data-set, and give me a first sense of which data needs to be cleansed and which columns or rows are irrelevant to the analysis. Excel also provides quick formulas, such as sums and averages, for simple calculations.
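The same first look at structure can also be sketched in a few lines of Python, which may help if a file turns out to be too large to open comfortably in Excel. This is only a minimal sketch under stated assumptions: the column names (INSTNM, STABBR, CONTROL, UGDS), the placeholder strings treated as missing, and the sample rows are all illustrative stand-ins, not values taken from this post.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for one raw .CSV file.
# Column names and rows are assumptions made for illustration only.
SAMPLE_CSV = """INSTNM,STABBR,CONTROL,UGDS
Example College A,IL,2,8200
Example College B,IL,1,NULL
Example College C,WI,2,PrivacySuppressed
"""

# Assumed placeholder strings that mark a missing value.
MISSING_MARKERS = {"", "NULL", "PrivacySuppressed"}


def summarize(handle):
    """Return (column names, row count, missing-value count per column)."""
    reader = csv.DictReader(handle)
    missing = Counter()
    row_count = 0
    for row in reader:
        row_count += 1
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header row; not counted here.
                continue
            if value is None or value.strip() in MISSING_MARKERS:
                missing[column] += 1
    return reader.fieldnames, row_count, dict(missing)


columns, row_count, missing = summarize(io.StringIO(SAMPLE_CSV))
print("Columns:", columns)
print("Rows:", row_count)
print("Missing values per column:", missing)
```

To run it against a real file, `io.StringIO(SAMPLE_CSV)` would be swapped for `open(path, newline="")`, where `path` points at one of the downloaded .CSV files. Columns with many missing values are candidates for cleansing or for dropping from the analysis.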

Second, I will use Microsoft Power BI to analyze and visualize the data. Power BI supports powerful visualizations, and fields and filters can be added to present information in the most relevant manner. It also allows relationships to be established between fields, tables and even data-sets. Understanding a data-set, its components and the relationships between its fields and tables is important to turning it into useful information.


Bringing Data Down to Size

As discussed in class, data can be noisy, messy and at times difficult to understand. However, with data analysis tools we can transform data into information. The key distinction lies in usability: information contains relevant and easily digested facts and figures. Part of the process of converting data to information involves reducing the volume being presented. In this post, I’ll explore ways to decrease the size and scope of my data-set to make my analysis more manageable.


Because my data-set centers on U.S. higher education, five key categories stand out as relevant for narrowing down the data: State, Private vs Public, Size, Institution Type, and Financial Aid. For example, I might want to focus on private universities in Illinois with 5,000 to 15,000 students. To narrow things further, I would pull only the key figures relevant to my analysis, such as cost, employment rate, and debt and earnings levels after graduation. Not only will this reduce the scope of my data-set, it will also allow for more relevant comparisons between similar institutions.
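The Illinois example above can be sketched as a simple row filter. This is a rough illustration rather than a finished method: the column names (INSTNM, STABBR, CONTROL, UGDS), the coding of CONTROL as 1 for public and 2 or 3 for private, and the sample rows are assumptions made for the sketch, and `narrow` is a hypothetical helper.

```python
import csv
import io

# Hypothetical rows for illustration; these are not real institutions.
SAMPLE_CSV = """INSTNM,STABBR,CONTROL,UGDS
Example College A,IL,2,8200
Example College B,IL,1,12000
Example College C,IL,2,3100
Example College D,WI,2,9000
Example College E,IL,3,NULL
"""

# Assumed coding of the CONTROL column: 1 = public, 2 and 3 = private.
PRIVATE_CODES = {"2", "3"}


def parse_enrollment(text):
    """Convert an enrollment cell to a number, or None if it is not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def narrow(rows, state, min_size, max_size):
    """Return names of private institutions in `state` within the size range."""
    selected = []
    for row in rows:
        size = parse_enrollment(row.get("UGDS"))
        if (
            row.get("STABBR") == state
            and row.get("CONTROL") in PRIVATE_CODES
            and size is not None
            and min_size <= size <= max_size
        ):
            selected.append(row["INSTNM"])
    return selected


matches = narrow(csv.DictReader(io.StringIO(SAMPLE_CSV)), "IL", 5000, 15000)
print(matches)
```

Rows with a non-numeric enrollment value are skipped rather than guessed at, which keeps the comparison group limited to institutions whose size is actually known.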

In my next post, I will investigate data analysis tools I can use to visualize and present the information extracted from the data.