
NumPy Basics

Arrays, operations, and broadcasting for fast computation.

NumPy is the foundation of numerical computing in Python. Many data science libraries, including pandas and scikit-learn, are built on top of it, and others such as TensorFlow interoperate closely with its arrays. If you understand NumPy, everything else becomes easier.

Why NumPy?

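Python lists are slow for numerical work because every element is a separate Python object. NumPy arrays store elements of a single type in one contiguous block of memory, and operations on them are implemented in compiled C under the hood. For large arrays, NumPy can be one to two orders of magnitude faster than Python lists, although the exact speedup depends on the operation and the machine.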


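import numpy as np
import time

py_list = list(range(1_000_000))
# Explicit 64-bit integers so the sum cannot overflow on platforms
# where the default integer type is 32 bits wide
np_arr = np.arange(1_000_000, dtype=np.int64)

# perf_counter() is a high-resolution clock intended for timing code
start = time.perf_counter()
sum(py_list)
print(f"Python list: {time.perf_counter() - start:.4f}s")

start = time.perf_counter()
np.sum(np_arr)
print(f"NumPy array: {time.perf_counter() - start:.4f}s")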
    
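A single timing run can be noisy, so the comparison above is only indicative. The sketch below is one way to make it more repeatable with the standard `timeit` module, and it also inspects the memory layout the paragraph describes. The variable names and the repeat counts are illustrative choices, not part of the original example.

```python
import timeit

import numpy as np

n = 1_000_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # fixed-width 64-bit integers

# One contiguous buffer: 8 bytes per element, no per-element Python objects.
print(np_arr.dtype, np_arr.itemsize, np_arr.nbytes)
print(np_arr.flags["C_CONTIGUOUS"])

# Both approaches compute the same total.
assert sum(py_list) == int(np_arr.sum())

# Repeat each measurement and keep the best time to reduce noise.
list_time = min(timeit.repeat(lambda: sum(py_list), number=5, repeat=3))
numpy_time = min(timeit.repeat(lambda: np_arr.sum(), number=5, repeat=3))
print(f"list: {list_time:.4f}s  numpy: {numpy_time:.4f}s")
```

The measured ratio will vary between machines and NumPy versions, so treat any specific speedup figure as approximate.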

Creating Arrays

There are many ways to create NumPy arrays. You can convert a Python list, use built-in functions, or read from files.


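import numpy as np

a = np.array([1, 2, 3, 4, 5])      # from a Python list
zeros = np.zeros((3, 4))           # 3 rows, 4 columns, all 0.0
ones = np.ones((2, 3))             # 2 rows, 3 columns, all 1.0
random_arr = np.random.randn(5)    # 5 samples from a standard normal distribution

print(a.shape)   # (5,)
print(zeros)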
    

The .shape attribute is a tuple giving the size of the array along each dimension. For 2D data (like a spreadsheet), shape gives you (rows, columns); a 1D array of five elements has shape (5,).
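
To make the shape rule concrete, here is a small sketch that checks the shape of a 1D array, a 2D array, and an array read from file-like input. The CSV text is made-up sample data held in memory with `io.StringIO`; passing a real file path to `np.loadtxt` should work the same way.

```python
import io

import numpy as np

vector = np.array([1, 2, 3, 4, 5])
print(vector.shape)   # (5,)  one dimension with 5 elements

grid = np.arange(12).reshape(3, 4)
print(grid.shape)     # (3, 4)  3 rows, 4 columns
print(grid.ndim)      # 2

# Hypothetical CSV content; a path to a file on disk could be used instead.
csv_text = io.StringIO("1.0,2.0,3.0\n4.0,5.0,6.0")
table = np.loadtxt(csv_text, delimiter=",")
print(table.shape)    # (2, 3)
```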


Array Operations

NumPy lets you perform operations on entire arrays without writing loops. This is called vectorization and it's one of NumPy's greatest features.


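import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print(arr * 2)        # every element doubled
print(arr ** 2)       # every element squared
print(np.sqrt(arr))   # element-wise square root (float result)

matrix = np.array([[1, 2], [3, 4]])
print(matrix.T)       # transpose: rows become columns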
    

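Notice how we multiplied every element by 2 without a loop? That's vectorization. The loop still runs, but inside NumPy's compiled C code rather than in the Python interpreter, which is why the result is both cleaner and much faster.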
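
As a rough illustration of what vectorization replaces, the sketch below computes the same result with an explicit Python loop and with a single array expression. It then shows element-wise arithmetic between two arrays and a simple case of broadcasting, where a 1D row is combined with each row of a 2D matrix. The variable names are chosen for this example only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Loop version: one Python-level multiplication per element.
doubled_loop = [x * 2 for x in arr.tolist()]

# Vectorized version: a single expression over the whole array.
doubled_vec = arr * 2
print(doubled_vec.tolist() == doubled_loop)  # True

# Element-wise arithmetic between two arrays of the same shape.
other = np.array([10, 20, 30, 40, 50])
print(arr + other)  # [11 22 33 44 55]

# Broadcasting: the 1D row is applied to every row of the 2D matrix,
# so the rows become [10, 200] and [30, 400].
matrix = np.array([[1, 2], [3, 4]])
row = np.array([10, 100])
print(matrix * row)
```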


Indexing and Slicing

Elements are accessed with the same zero-based indexing and slicing as Python lists, and NumPy extends this to multiple dimensions with comma-separated indices such as matrix[row, column].


import numpy as np

arr = np.array([10, 20, 30, 40, 50])
print(arr[1:4])        # elements at indices 1, 2, 3: 20, 30, 40

matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix[0, 1])    # row 0, column 1: 2
print(matrix[:, 1])    # every row, column 1: 2 and 5
    

A bare colon : selects all elements along a dimension, so matrix[:, 1] means "every row, column 1". This is the usual way to pull a whole column or row out of a dataset.
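
The sketch below applies the colon syntax to a small made-up dataset, where each row is a sample and the columns are assumed to be height, weight, and age. It also checks one detail worth knowing: basic slices are views onto the original array rather than copies.

```python
import numpy as np

# Hypothetical dataset: rows are samples, columns are (height, weight, age).
data = np.array([
    [170, 65, 30],
    [180, 80, 45],
    [160, 55, 25],
])

first_row = data[0, :]     # all columns of row 0
ages = data[:, 2]          # all rows of column 2
body = data[:, :2]         # all rows, first two columns
last_two = data[-2:, :]    # last two rows, all columns

print(first_row.tolist())  # [170, 65, 30]
print(ages.tolist())       # [30, 45, 25]
print(body.shape)          # (3, 2)

# A basic slice shares memory with the array it came from.
print(np.shares_memory(ages, data))  # True
```

Because a slice is a view, assigning into `ages` would also change `data`; call `.copy()` on the slice when an independent array is needed.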


Key Takeaways

  • NumPy arrays are much faster than Python lists for numerical operations
  • Vectorization eliminates the need for explicit loops
  • Array shape describes dimensions — essential for debugging
  • Multi-dimensional indexing makes data selection intuitive