Getting Started with NumPy

The basics you need to know

NumPy, short for 'Numerical Python', is a Python library that provides multidimensional array objects and a collection of routines for processing those arrays.

Using NumPy, a developer can perform the following operations:

- Mathematical and logical operations on arrays.
- Fourier transforms and routines for shape manipulation.
- Linear algebra operations.
- Random number generation.
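
As a rough illustration of the categories listed above, the following sketch touches each one briefly. The variable names and the seed value are arbitrary choices made for this example.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

# Mathematical and logical operations on arrays
squared = a ** 2               # element-wise square
is_large = a > 2               # element-wise comparison, gives a boolean array

# Fourier transform and shape manipulation
spectrum = np.fft.fft(a)       # discrete Fourier transform of a
grid = a.reshape(2, 2)         # the same four values arranged as a 2x2 array

# Linear algebra
det = np.linalg.det(grid)      # determinant of [[1, 2], [3, 4]], approximately -2

# Random number generation (seed 0 is an arbitrary choice for reproducibility)
rng = np.random.default_rng(0)
samples = rng.random(3)        # three floats drawn from [0, 1)
```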

To get started with data analysis using NumPy, you first need to import the NumPy library by using the import statement as follows:


import numpy as np

The next step is to create NumPy arrays, which can hold a collection of values. You can create 1D and 2D arrays using the np.array() function, as shown below:

# Creating a 1D array
arr1 = np.array([1, 2, 3, 4])

# Creating a 2D array
arr2 = np.array([[1, 2], [3, 4]])
print(arr2)
# Output: 
# [[1 2]
#  [3 4]]

You can also generate arrays using NumPy functions. For example, to create an array of zeros or ones, you can use the np.zeros() or np.ones() function, respectively, as follows:

# Generate an array of zeros
zeros = np.zeros((3, 3))
print(zeros)
# Output:
# [[0. 0. 0.]
#  [0. 0. 0.]
#  [0. 0. 0.]]

# Generate an array of ones
ones = np.ones((2, 2))
print(ones)
# Output:
# [[1. 1.]
#  [1. 1.]]

To generate an array of evenly spaced values, you can use the np.linspace() function, as shown below:

# Generate an array of evenly spaced values
linspace = np.linspace(0, 10, 5)

print(linspace)
# Output: [ 0.   2.5  5.   7.5 10. ]

This generates an array of 5 evenly spaced values from 0 to 10, with both endpoints included.
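
If you want to confirm the spacing or leave out the stop value, np.linspace() accepts the retstep and endpoint parameters. A minimal sketch, with variable names chosen only for this example:

```python
import numpy as np

# retstep=True also returns the spacing between consecutive values
values, step = np.linspace(0, 10, 5, retstep=True)
print(step)
# Output: 2.5

# endpoint=False excludes the stop value, so the spacing becomes 10 / 5 = 2
half_open = np.linspace(0, 10, 5, endpoint=False)
print(half_open.tolist())
# Output: [0.0, 2.0, 4.0, 6.0, 8.0]
```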

Lastly, you can generate an identity matrix, which is a square 2D array with ones on the diagonal and zeros elsewhere, using the np.eye() function:

# Generate an identity matrix
identity = np.eye(3)
print(identity)
# Output:
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]

After creating NumPy arrays, you can manipulate and analyze them in various ways. Here are some basic techniques you will need:

You can obtain array attributes, such as the shape (dimensions), size (number of elements), and data type, using the following code:

import numpy as np

# Create a 2D array
arr2 = np.array([[1, 2], [3, 4]])

# Get the shape (dimensions) of an array
shape = arr2.shape
print(shape)
# Output: (2, 2)

# Get the number of elements in an array
size = arr2.size
print(size)
# Output: 4

# Get the data type of an array
dtype = arr2.dtype
print(dtype)
# Output: int64 (the default integer type is platform-dependent and may be int32 on some systems)
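
Because the default integer type can vary between platforms, it can help to set the data type explicitly when it matters. The sketch below assumes you want floating-point storage. The names arr_float and arr_int32 are illustrative.

```python
import numpy as np

# Request a specific data type when creating the array
arr_float = np.array([[1, 2], [3, 4]], dtype=np.float64)
print(arr_float.dtype)
# Output: float64

# Convert an existing array to another type; astype() returns a new array
arr_int32 = arr_float.astype(np.int32)
print(arr_int32.dtype)
# Output: int32

# ndim gives the number of dimensions (axes)
print(arr_float.ndim)
# Output: 2
```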

To access elements in an array, you can use indexing and slicing. In a 1D array, you can access elements using their indices, as shown below:

# Access elements in a 1D array
element = arr1[0]   # First element
element = arr1[-1]  # Last element

In a 2D array, you need to provide both the row and column indices to access an element, as shown below:

# Access elements in a 2D array
element = arr2[0, 1]  # Element in 1st row, 2nd column

You can also slice arrays to extract a subarray using the : operator. In a 1D array, you can slice as follows:

# Slicing a 1D array
subarray = arr1[1:3]  # Elements from index 1 (inclusive) to 3 (exclusive)

In a 2D array, you can slice using the : operator for both rows and columns, as shown below:

# Slicing a 2D array
subarray = arr2[:, 0]  # All rows, 1st column
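
Row and column ranges can be combined to pull out a rectangular block. The sketch below uses a hypothetical 3x3 array named grid and also shows that a basic slice is a view onto the original data, not a copy.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

first_row = grid[0, :]   # 1st row, all columns: values 1, 2, 3
block = grid[:2, 1:]     # rows 0-1, columns 1-2: values [[2, 3], [5, 6]]

# A basic slice shares memory with the original array,
# so writing to the slice also changes grid
block[0, 0] = 20
print(grid[0, 1])
# Output: 20

# Use .copy() when you need an independent subarray
independent = grid[:2, 1:].copy()
independent[0, 0] = -1   # grid is unaffected by this assignment
```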

You can perform basic array operations, such as element-wise addition, subtraction, multiplication, and division, by applying the standard +, -, *, and / operators to arrays of the same shape.
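
A minimal sketch of these element-wise operations, using two small arrays (x and y) made up for this example:

```python
import numpy as np

x = np.array([10, 20, 30, 40])
y = np.array([1, 2, 5, 8])

total = x + y        # values: 11, 22, 35, 48
difference = x - y   # values: 9, 18, 25, 32
product = x * y      # values: 10, 40, 150, 320
quotient = x / y     # values: 10.0, 10.0, 6.0, 5.0 (true division gives floats)
```

Each operation pairs up elements at the same position, so x * y is not a matrix product.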

Broadcasting

Broadcasting is a NumPy feature that allows arithmetic operations to be performed on arrays with different but compatible shapes. The smaller array is "broadcast" to match the shape of the larger array, so that the operation can be performed element-wise.

# Example of broadcasting
arr1 = np.array([1, 2, 3])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

result = arr1 + arr2

# The smaller array arr1 is broadcast to match the shape of arr2, resulting in:
# [[2, 4, 6], [5, 7, 9]]
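
Broadcasting also applies to scalars and column vectors, and it fails when the shapes are incompatible. The following sketch illustrates all three cases. The names matrix, col, and broadcast_failed are assumptions introduced for this example.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

# A scalar is broadcast to every element
scaled = matrix * 10                         # [[10, 20, 30], [40, 50, 60]]

# A column of shape (2, 1) is broadcast across the 3 columns
col = np.array([[100], [200]])
shifted = matrix + col                       # [[101, 102, 103], [204, 205, 206]]

# Shapes (2, 3) and (2,) are incompatible: the trailing dimensions
# (3 and 2) differ and neither is 1, so NumPy raises a ValueError
broadcast_failed = False
try:
    matrix + np.array([1, 2])
except ValueError:
    broadcast_failed = True
```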

Concatenation

NumPy arrays can be concatenated using the np.concatenate() function. The arrays being concatenated must have the same shape, except along the axis on which they are being joined.

# Example of concatenation
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Concatenate along the first axis (rows)
result = np.concatenate((arr1, arr2), axis=0)

# Result:
# [[1, 2],
#  [3, 4],
#  [5, 6],
#  [7, 8]]

# Concatenate along the second axis (columns)
result = np.concatenate((arr1, arr2), axis=1)

# Result:
# [[1, 2, 5, 6],
#  [3, 4, 7, 8]]
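
To make the shape rule concrete, the sketch below joins a (2, 2) array with a (1, 2) array. This works along axis 0, where only the row counts differ, but not along axis 1. The names top, extra_row, and concat_failed are hypothetical.

```python
import numpy as np

top = np.array([[1, 2], [3, 4]])   # shape (2, 2)
extra_row = np.array([[5, 6]])     # shape (1, 2)

# Along axis 0 only the number of rows may differ, so this works
stacked = np.concatenate((top, extra_row), axis=0)
print(stacked.shape)
# Output: (3, 2)

# Along axis 1 the row counts (2 and 1) would have to match, so this fails
concat_failed = False
try:
    np.concatenate((top, extra_row), axis=1)
except ValueError:
    concat_failed = True
```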

In summary,

NumPy provides a powerful set of tools for data analysis, including array creation, indexing and slicing, element-wise arithmetic, broadcasting, and concatenation. By mastering these fundamental techniques, you'll be able to perform a wide range of data analysis tasks using NumPy. As you continue to gain experience, you can explore further features and techniques, such as reshaping and transposing, statistical functions, advanced indexing, and masking.