The purpose of this project is to understand what, where, and how each user listens to songs, using metadata generated from the Million Song Dataset. The analytical goal is to find out what makes free-tier users switch to the paid tier and why...
[Read More]
Data Modeling with Cassandra
Query data with partition key
Using Python to create an ETL pipeline for data modeling with Apache Cassandra.
[Read More]
Consumer Complaints Data Transformation
Using a Python script to transform data
Using only built-in Python libraries, this project computes, for each financial product and year, the total number of complaints, the number of companies receiving a complaint, the company with the most complaints, and the highest percentage of complaints directed at a single company.
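The aggregation described above might be sketched roughly as follows, using only the standard library. This is an illustrative sketch, not the project's actual script: the function name `summarize_complaints`, the column names `product`, `date`, and `company`, the lowercasing of names, the rounding of the percentage, and the sample rows are all assumptions made for the example.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize_complaints(rows):
    """Aggregate complaints per (product, year).

    `rows` is an iterable of dicts with the hypothetical keys
    'product', 'date' (YYYY-MM-DD), and 'company'.
    """
    # (product, year) -> Counter mapping company -> number of complaints
    counts = defaultdict(Counter)
    for row in rows:
        # Assumption: names are compared case-insensitively.
        key = (row["product"].strip().lower(), int(row["date"][:4]))
        counts[key][row["company"].strip().lower()] += 1

    summary = []
    for (product, year), per_company in sorted(counts.items()):
        total = sum(per_company.values())
        top_company, top_count = per_company.most_common(1)[0]
        summary.append({
            "product": product,
            "year": year,
            "total_complaints": total,
            "companies": len(per_company),
            "top_company": top_company,
            # Share of complaints filed against the single top company.
            "highest_percent": round(100 * top_count / total),
        })
    return summary


# Made-up sample data, for illustration only.
SAMPLE_CSV = """product,date,company
Credit card,2019-01-05,Acme Bank
Credit card,2019-03-17,Acme Bank
Credit card,2019-06-02,Other Bank
Mortgage,2020-02-11,Acme Bank
"""

result = summarize_complaints(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for line in result:
    print(line)
```

Grouping with a `Counter` per (product, year) pair keeps the whole computation to a single pass over the input, and each of the reported figures can be read directly off that counter.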
[Read More]
Exploratory Analysis of Apple Mobile App Reviews
From extracting data with the iTunes API to data visualization
Analyzing Yelp Dataset with Scattertext
Exploratory data analysis and visualization for text data using NLP
One of the most crucial tasks in text mining is presenting the content of text data visually. Using natural language processing (NLP), a data scientist can summarize documents, create topics, and explore the storylines of the content from different angles and at different levels of detail.
[Read More]