Data Analysis with Pandas 1

1. NumPy: NumPy is a Python module that is used to create and manipulate multidimensional arrays.

2. genfromtxt(): NumPy function for reading a dataset from a text file.

  numpy.genfromtxt

  numpy.genfromtxt(fname, dtype=<type 'float'>, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None)

  Example:

  import numpy  
  nfl = numpy.genfromtxt("nfl.csv", delimiter=",")
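To try genfromtxt() without the nfl.csv file, a minimal sketch can read from an in-memory buffer instead. The CSV contents and column names below are invented for illustration and are not the real dataset.

```python
import io
import numpy

# Hypothetical stand-in for nfl.csv: a header row, then one text and two numeric columns.
csv_text = "team,wins,losses\nRaiders,7,9\nChiefs,11,5\n"

# genfromtxt() accepts a file-like object as well as a filename.
sample = numpy.genfromtxt(io.StringIO(csv_text), delimiter=",")

print(sample.shape)  # 3 rows (header included) by 3 columns
print(sample)        # fields that cannot be parsed as floats come back as nan
```

With the default float dtype, the header row and the team names cannot be converted, so they show up as nan. Passing skip_header=1 would drop the header row.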

3. array(): Function for creating an array, for example from a list or a list of lists.

  matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

4. We can use the shape property of an array to find out how many elements it has along each dimension:

  vector = numpy.array([1, 2, 3, 4])

  print(vector.shape)  # or: print(numpy.shape(vector))
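As a small sketch of what shape reports, the snippet below compares a one-dimensional vector with a two-dimensional matrix. The 2x3 matrix is made up for this illustration.

```python
import numpy

vector = numpy.array([1, 2, 3, 4])
matrix = numpy.array([[5, 10, 15], [20, 25, 30]])

print(vector.shape)  # (4,): one dimension with 4 elements
print(matrix.shape)  # (2, 3): 2 rows and 3 columns
print(matrix.size)   # 6: total number of elements
```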

5. NumPy will automatically figure out an appropriate data type when reading in data or converting lists to arrays. You can check the data type of a NumPy array using the dtype property.

  numbers = numpy.array([1, 2, 3, 4])

  numbers.dtype
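The following sketch shows how the inferred dtype depends on the values passed in. The arrays are invented for illustration, and the exact integer width can differ between platforms.

```python
import numpy

ints = numpy.array([1, 2, 3, 4])
floats = numpy.array([1, 2.5, 3])        # one float promotes the whole array to float
strings = numpy.array(["a", "b", "1"])   # all strings give a Unicode string dtype

print(ints.dtype)     # an integer type; the width depends on the platform
print(floats.dtype)   # float64
print(strings.dtype)  # a Unicode string type
```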

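6. nan = not a number. NumPy uses nan to mark values that are missing or that could not be converted to a number, for example text fields read by genfromtxt() as floats.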
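A short sketch of how nan behaves in practice, using a made-up array: nan never compares equal to anything, including itself, so numpy.isnan() is the usual way to detect it.

```python
import numpy

values = numpy.array([1.0, numpy.nan, 3.0])

print(numpy.nan == numpy.nan)  # False: nan is not equal to itself
print(numpy.isnan(values))     # True only at the position holding nan
print(values.sum())            # nan: any nan propagates through the sum
print(numpy.nansum(values))    # 4.0: nansum() treats nan as zero
```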

7. Selecting elements:

  vector = numpy.array([5, 10, 15, 20])
  equal_to_ten = (vector == 10)  # this will return [False, True, False, False]

  print(vector[equal_to_ten])  # Equivalent to indexing vector with [False, True, False, False], which gives [10]. By the same principle, in a matrix we can select the rows for which the comparison is True.
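The same idea applied to a matrix can be sketched as follows: compare one column against a value to get one boolean per row, then use that mask to pick rows. The choice of column and value is only for illustration.

```python
import numpy

matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# Compare the second column with 25: one boolean per row.
second_column_25 = (matrix[:, 1] == 25)
print(second_column_25)  # True only for the middle row

# Use the mask to keep the matching rows, with all of their columns.
selected_rows = matrix[second_column_25, :]
print(selected_rows)     # the single row 20, 25, 30
```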

8. Use astype(type) to convert the data in an array to another type. It returns a new array and leaves the original unchanged.
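A minimal sketch of astype(), assuming an array of numeric strings such as might come from a text file. The values are invented.

```python
import numpy

strings = numpy.array(["1", "2", "3"])

# astype() returns a converted copy; strings itself keeps its string dtype.
floats = strings.astype(float)

print(floats)        # the values 1.0, 2.0, 3.0
print(floats.dtype)  # float64
```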

9. With a matrix, when we want to use a built-in method such as sum() per row or per column, we have to specify an additional keyword argument, axis. The axis dictates which dimension the operation is performed on: axis=1 computes one result for each row, and axis=0 computes one result for each column.
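The effect of axis can be sketched with the 3x3 matrix from item 3, summing across rows, down columns, and over the whole array.

```python
import numpy

matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

row_sums = matrix.sum(axis=1)     # one sum per row: 30, 75, 120
column_sums = matrix.sum(axis=0)  # one sum per column: 60, 75, 90
total = matrix.sum()              # no axis: sum of every element, 225

print(row_sums)
print(column_sums)
print(total)
```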