Data manipulation in Python (module 6)
1. Pandas plotting
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

%matplotlib notebook
plt.style.use("seaborn-colorblind")  # named "seaborn-v0_8-colorblind" in matplotlib 3.6+

np.random.seed(123)

# cumsum: running total, so element i is the sum of the first i+1 random draws (a random walk)
df = pd.DataFrame({'A': np.random.randn(365).cumsum(0),
                   'B': np.random.randn(365).cumsum(0) + 20,
                   'C': np.random.randn(365).cumsum(0) - 20},
                  index=pd.date_range('1/1/2017', periods=365))

# create a scatter plot of columns 'A' and 'C', with color (c) and size (s) driven by column 'B'
df.plot.scatter('A', 'C', c='B', s=df['B'], colormap='viridis')

# other plot types
# df.plot.box();
# df.plot.hist(alpha=0.7);
# df.plot.kde();

# scatter plots between the different variables, with histograms along the diagonal,
# to reveal any obvious patterns (pd.tools.plotting was renamed pd.plotting in pandas 0.20)
# pd.plotting.scatter_matrix(iris);

# parallel coordinates, for visualizing high-dimensional multivariate data:
# each variable in the data set corresponds to an equally spaced parallel vertical line
# pd.plotting.parallel_coordinates(iris, 'Name');
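The scatter-matrix and parallel-coordinates calls above need an `iris` DataFrame that this section never loads. As a rough sketch, they can be tried on a small synthetic stand-in instead. The `demo` frame, its column names (`x1`, `x2`, `x3`) and its values are invented for illustration and are not the real iris data. The sketch also assumes a pandas version where both functions live under `pd.plotting`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)

# Hypothetical stand-in for the iris data: two made-up groups, three numeric columns.
demo = pd.DataFrame({
    "Name": ["group_a"] * 30 + ["group_b"] * 30,
    "x1": np.concatenate([rng.normal(0, 1, 30), rng.normal(3, 1, 30)]),
    "x2": np.concatenate([rng.normal(5, 1, 30), rng.normal(2, 1, 30)]),
    "x3": rng.normal(1, 0.5, 60),
})

# Grid of pairwise scatter plots over the numeric columns, histograms on the diagonal.
axes = pd.plotting.scatter_matrix(demo[["x1", "x2", "x3"]], diagonal="hist")

# One vertical axis per numeric column; each row becomes a line colored by its "Name".
plt.figure()
ax = pd.plotting.parallel_coordinates(demo, "Name")
```

With well-separated groups like these, the parallel-coordinates lines for each group should bundle together, which is the kind of pattern the plot is meant to expose.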
2. Seaborn
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib notebook

np.random.seed(1234)
v1 = pd.Series(np.random.normal(0, 10, 1000), name='v1')
v2 = pd.Series(2*v1 + np.random.normal(60, 15, 1000), name='v2')

# plot a kernel density estimate over a stacked bar chart
plt.figure()
plt.hist([v1, v2], histtype='barstacked', density=True);  # `normed=True` was removed in matplotlib 3.1
v3 = np.concatenate((v1, v2))
sns.kdeplot(v3);

plt.figure()
# we can pass keyword arguments for each individual component of the plot
# (distplot is deprecated as of seaborn 0.11)
sns.distplot(v3, hist_kws={'color': 'Teal'}, kde_kws={'color': 'Navy'});

plt.figure()
# sns.jointplot(x=v1, y=v2, alpha=0.4);

# grid = sns.jointplot(x=v1, y=v2, alpha=0.4);
# grid.ax_joint.set_aspect('equal')

# sns.jointplot(x=v1, y=v2, kind='hex');

# set the seaborn style for all the following plots
# sns.set_style('white')
# sns.jointplot(x=v1, y=v2, kind='kde', space=0);  # space sets the gap between the joint and marginal axes
[Figure: joint plots]
Second example
iris = pd.read_csv('iris.csv')
sns.pairplot(iris, hue='Name', diag_kind='kde', height=2);  # `height` was called `size` before seaborn 0.9
Third example
iris = pd.read_csv('iris.csv')
plt.figure(figsize=(8, 6))
plt.subplot(121)
sns.swarmplot(x='Name', y='PetalLength', data=iris);
plt.subplot(122)
sns.violinplot(x='Name', y='PetalLength', data=iris);