Data Visualizations 3
Data Cleaning and Visualization:
1. Before cleaning a data set, inspect it with the shape and dtypes attributes and the head() and describe() methods.
2. First, deal with the missing data, for example by dropping incomplete rows with dropna() or by selecting the rows to keep with loc[].
3. Second, normalize/vectorize the data.
4. Convert special data types to float: strip the unwanted trailing characters with str.rstrip(""), then cast the column with astype("").
5. Change the index of each DataFrame with the set_index() method.
6. Create a new DataFrame that contains only the necessary data. A new DataFrame built from the columns of an original DataFrame keeps the same index.
#critics_reviews =pd.DataFrame({"RT Score":pixar_movies["RT Score"],"IMDB Score":pixar_movies["IMDB Score"],"Metacritic Score":pixar_movies["Metacritic Score"]})
7. Plot the data set. Adjust the figure size with the figsize parameter. #critics_reviews.plot(figsize = (9,5),kind = 'box')
8. To compare two values that add up to the same total (such as percentages), use a stacked bar plot.
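The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the original notebook: the `movies` DataFrame, its titles, the "Domestic %" column, and all values are made-up assumptions; only the pandas calls themselves (dropna, str.rstrip, astype, set_index) come from the steps listed.

```python
import pandas as pd

# Hypothetical sample data: titles, column names, and values are invented for illustration.
movies = pd.DataFrame({
    "Movie": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "RT Score": [100.0, 92.0, None, 74.0],
    "Domestic %": ["52.98%", "44.89%", "42.08%", "36.25%"],
})

# Step 1: inspect. shape and dtypes are attributes; head() and describe() are methods.
print(movies.shape)
print(movies.head())
print(movies.dtypes)
print(movies.describe())

# Step 2: drop every row that has a missing value.
cleaned = movies.dropna().copy()

# Step 4: strip the trailing "%" and cast the strings to float.
cleaned["Domestic %"] = cleaned["Domestic %"].str.rstrip("%").astype(float)

# Step 5: use the movie title as the index.
cleaned = cleaned.set_index("Movie")

# Step 6: keep only the needed column; the new DataFrame inherits the index.
scores = cleaned[["RT Score"]]
print(scores)
```

Note that dropna() here removes any row with a missing value in any column; passing `subset=[...]` restricts the check to chosen columns when that is too aggressive.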
Conclusion:
Before analyzing the data, we first want a clean data set. Ideally the data set contains only floats or strings, with the numeric values in the same range. Then we plot the data set to create a compelling chart.
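As a rough sketch of the plotting stage, the two chart types mentioned in these notes (box plot and stacked bar plot) might look like this on already cleaned data. The DataFrames, column names, and numbers below are hypothetical; the Agg backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so the sketch runs headless
import pandas as pd

titles = ["Movie A", "Movie B", "Movie C"]  # hypothetical titles

# Hypothetical review scores, both on a 0-100 scale so the boxes are comparable.
critics_reviews = pd.DataFrame(
    {"RT Score": [100.0, 92.0, 74.0], "Metacritic Score": [95.0, 88.0, 57.0]},
    index=titles,
)

# Hypothetical revenue shares; each row adds up to 100 percent.
revenue_share = pd.DataFrame(
    {"Domestic %": [52.98, 44.89, 42.08], "International %": [47.02, 55.11, 57.92]},
    index=titles,
)

# Box plot: one box per column, figure size set through the figsize parameter.
box_ax = critics_reviews.plot(kind="box", figsize=(9, 5))

# Stacked bar plot: one bar per movie, split into the two shares.
bar_ax = revenue_share.plot(kind="bar", stacked=True, figsize=(9, 5))
bar_ax.set_ylabel("Share of revenue (%)")
```

In an interactive session or notebook the default backend can be left in place, and the figures are displayed with `matplotlib.pyplot.show()` or inline.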