Data Analysis with Pandas 1
1. NumPy: NumPy is a Python module that is used to create and manipulate multidimensional arrays.
2. genfromtxt(): NumPy function for reading a dataset from a text file into an array.
numpy.genfromtxt
- numpy.genfromtxt(fname, dtype=<type 'float'>, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None)
Example:
import numpy
nfl = numpy.genfromtxt("nfl.csv", delimiter=",")
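The call above assumes that nfl.csv exists on disk. As a minimal self-contained sketch, the same function can read from an in-memory text buffer instead. The column names and values below are made up for illustration, and skip_header=1 is used to drop the non-numeric header row.

```python
import io
import numpy

# Hypothetical CSV content standing in for a file on disk.
csv_text = "year,wins,losses\n2009,10,6\n2010,12,4\n"

# skip_header=1 skips the column-name row, which cannot be parsed as a number.
data = numpy.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(data.shape)  # (2, 3)
print(data.dtype)  # float64, the default dtype for genfromtxt
```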
3. array(): Function for creating an array from a list. A list of lists produces a matrix (a two-dimensional array).
matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])
4. We can use the shape property to find out how many elements an array has along each dimension:
vector = numpy.array([1, 2, 3, 4])
print(vector.shape)  # or: print(numpy.shape(vector))
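To make the difference between one and two dimensions concrete, here is a small sketch that prints the shape of a vector and of a matrix. The 2x3 matrix is an illustrative choice, not part of the original notes.

```python
import numpy

vector = numpy.array([1, 2, 3, 4])
matrix = numpy.array([[5, 10, 15], [20, 25, 30]])

print(vector.shape)  # (4,)   one dimension with 4 elements
print(matrix.shape)  # (2, 3) 2 rows and 3 columns
print(matrix.size)   # 6      total number of elements
```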
5. NumPy will automatically figure out an appropriate data type when reading in data or converting lists to arrays. You can check the data type of a NumPy array using the dtype property.
numbers = numpy.array([1, 2, 3, 4])
numbers.dtype
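As a rough illustration of how the data type is inferred, the sketch below builds arrays from three different kinds of lists. The exact integer width depends on the platform, so the comments describe the kind of type rather than a fixed name.

```python
import numpy

integers = numpy.array([1, 2, 3, 4])
mixed = numpy.array([1, 2.5, 3])   # one float makes the whole array float
words = numpy.array(["a", "b"])

print(integers.dtype)  # an integer type; the width depends on the platform
print(mixed.dtype)     # float64
print(words.dtype)     # a Unicode string type
```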
6. nan = not a number. NumPy uses it to mark missing or invalid values in a float array.
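One thing worth knowing is that nan does not compare equal to anything, including itself, so numpy.isnan() is the usual way to find it. A minimal sketch with made-up values:

```python
import numpy

values = numpy.array([1.0, numpy.nan, 3.0])

print(numpy.nan == numpy.nan)  # False: nan never equals itself

# isnan() returns a boolean array marking the nan positions.
missing = numpy.isnan(values)
print(missing)

# ~missing inverts the mask, keeping only the valid numbers.
print(values[~missing])
```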
7. Selecting elements:
vector = numpy.array([5, 10, 15, 20])
equal_to_ten = (vector == 10)  # this returns [False, True, False, False]
print(vector[equal_to_ten])  # Equivalent to vector[[False, True, False, False]], which is [10]. By the same principle, in a matrix we can select the rows for which the condition is True.
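The row selection mentioned above can be sketched as follows: compare one column against a value, then use the resulting boolean array as the row index. The matrix is the one from item 3; the choice of column and value is only an example.

```python
import numpy

matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# Compare every value in the second column (index 1) with 25.
second_column_is_25 = (matrix[:, 1] == 25)
print(second_column_is_25)  # [False  True False]

# Keep only the rows where the comparison is True, with all columns.
rows = matrix[second_column_is_25, :]
print(rows)  # [[20 25 30]]
```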
8. Use astype(type) to convert the data in an array to another type. It returns a new array and leaves the original unchanged.
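A short sketch of such a conversion, assuming an array of numeric strings (the values are illustrative):

```python
import numpy

strings = numpy.array(["1", "2", "3"])

# astype() returns a converted copy; the original array is not modified.
floats = strings.astype(float)

print(floats)         # [1. 2. 3.]
print(strings.dtype)  # still a string type
```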
9. With a matrix, a built-in method such as sum() adds up every element unless we pass the additional keyword argument axis. The axis dictates which dimension we perform the operation on: 1 means that we want to perform the operation on each row, and 0 means on each column. For example, matrix.sum(axis=1) returns one sum per row.
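A minimal sketch of the axis argument, reusing the matrix from item 3 and comparing both axis values with the default of no axis at all:

```python
import numpy

matrix = numpy.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

row_sums = matrix.sum(axis=1)     # one sum per row
column_sums = matrix.sum(axis=0)  # one sum per column
total = matrix.sum()              # no axis: sum of every element

print(row_sums)     # [ 30  75 120]
print(column_sums)  # [60 75 90]
print(total)        # 225
```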