
Data Visualizations 3

Data cleaning and visualization:

  1. Before cleaning a data set, inspect it with the shape and dtypes attributes and the head() and describe() methods.

  2. First, deal with the missing data, either by dropping incomplete rows with dropna() or by selecting the affected rows with loc[].

  3. Second, normalize the data so that the columns share a common scale, using vectorized operations.

  4. Convert columns stored as specially formatted strings to float, using str.rstrip() to strip trailing characters and astype() to change the type.

  5. Change the index of each DataFrame with the set_index() method.

  6. Create a new DataFrame that contains only the necessary data. A new DataFrame built from the columns of an original DataFrame keeps the same index.

  # critics_reviews = pd.DataFrame({"RT Score": pixar_movies["RT Score"], "IMDB Score": pixar_movies["IMDB Score"], "Metacritic Score": pixar_movies["Metacritic Score"]})

  7. Plot the data set. Adjust the figure size with the figsize parameter. # critics_reviews.plot(figsize=(9, 5), kind='box')

  8. To compare two values that add up to the same total (such as percentages), use a stacked bar plot.
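The cleaning steps (1 through 6) can be sketched roughly as follows. This is only a minimal illustration, not the original notebook: the rows are made-up placeholder values, the "Movie" and "Domestic %" columns are assumptions, and rescaling the IMDB score to a 0-100 range is just one possible way to normalize.

```python
import pandas as pd

# Hypothetical placeholder data; only the three score column names come from the notes.
pixar_movies = pd.DataFrame({
    "Movie": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "RT Score": [100, 92, 85, 96],
    "IMDB Score": [8.0, 7.5, 6.0, 9.0],
    "Metacritic Score": [92, 77, 88, None],               # one missing value
    "Domestic %": ["40.5%", "55.0%", "48.25%", "60.0%"],  # assumed string column
})

# Step 1: inspect the data (shape and dtypes are attributes, not methods).
print(pixar_movies.shape)
print(pixar_movies.head())
print(pixar_movies.dtypes)
print(pixar_movies.describe())

# Step 2: drop rows that contain missing values.
cleaned = pixar_movies.dropna().copy()

# Step 3: normalize with a vectorized operation (assumption: put the
# 0-10 IMDB score on the same 0-100 scale as the other two scores).
cleaned["IMDB Score"] = cleaned["IMDB Score"] * 10

# Step 4: strip the trailing "%" and convert the strings to float.
cleaned["Domestic %"] = cleaned["Domestic %"].str.rstrip("%").astype("float")

# Step 5: use the movie title as the index.
cleaned = cleaned.set_index("Movie")

# Step 6: keep only the needed columns; the index is carried over unchanged.
critics_reviews = cleaned[["RT Score", "IMDB Score", "Metacritic Score"]]
print(critics_reviews)
```

Selecting the columns with a list, as in step 6, gives the same result as building the DataFrame from a dictionary of columns, and in both cases the index of the original DataFrame is preserved.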

Conclusion:

  Before analyzing the data, we first want a clean data set. It is best if the data set contains only floats or strings, with the numeric values in the same range. Then we plot the data set to create a compelling chart.
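  The two plots mentioned in the notes, a box plot and a stacked bar plot, can be sketched as below once the data is clean. The values are made-up placeholders, and the "International %" column and the revenue_share name are assumptions introduced only for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

movies = ["Movie A", "Movie B", "Movie C"]

# Hypothetical, already cleaned scores on a common 0-100 scale.
critics_reviews = pd.DataFrame({
    "RT Score": [100, 92, 85],
    "IMDB Score": [80.0, 75.0, 60.0],
    "Metacritic Score": [92.0, 77.0, 88.0],
}, index=movies)

# Hypothetical percentages that add up to 100 for each movie.
revenue_share = pd.DataFrame({
    "Domestic %": [40.5, 55.0, 48.25],
    "International %": [59.5, 45.0, 51.75],
}, index=movies)

# Box plot of the three score columns; figsize sets the figure size in inches.
box_ax = critics_reviews.plot(kind="box", figsize=(9, 5))

# Stacked bar plot: one bar per movie, split into the two shares.
bar_ax = revenue_share.plot(kind="bar", stacked=True, figsize=(9, 5))

# Save the charts to files (the file names are arbitrary).
box_ax.figure.savefig("critics_reviews_box.png")
bar_ax.figure.savefig("revenue_share_stacked.png")
```

  Because every row of revenue_share sums to the same total, all stacked bars have equal height, which makes the split between the two parts easy to compare across movies.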
