Working with Image Data
Images are just arrays of pixels! Think of an image as a digital mosaic where each tile (pixel) has a color value. Because NumPy operates on whole arrays at once, many image manipulations become one-line array operations.
Images as Arrays
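A grayscale image is a 2D array of shape (height, width). A color image is a 3D array of shape (height, width, 3), where the last axis holds the red, green, and blue values.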
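import numpy as np
# Random values stand in for real image data; uint8 stores exactly 0-255
grayscale = np.random.randint(0, 256, (100, 100), dtype=np.uint8)  # (height, width)
color = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)  # (height, width, RGB)
print(f"Grayscale shape: {grayscale.shape}")
print(f"Color shape: {color.shape}")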
Each value ranges from 0 to 255, the range of an 8-bit unsigned integer. In a grayscale image, 0 is black and 255 is white, with shades of gray in between.
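To make the array layout concrete, here is a minimal sketch that indexes a single pixel and pulls out one color channel. The 2x2 image is built by hand purely for illustration, so every value is known in advance.

```python
import numpy as np

# Tiny 2x2 color image, written out by hand (illustrative values only)
color = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

# Index as [row, column]: this is the pixel in row 0, column 1
pixel = color[0, 1]

# Slice the last axis to get a 2D array holding only the red values
red_channel = color[:, :, 0]

print(f"Pixel at row 0, column 1: {pixel}")
print(f"Red channel shape: {red_channel.shape}")
```

Note that the first index is the row (vertical position) and the second is the column (horizontal position), which matches the (height, width) order of the shape.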
Basic Image Operations
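Because an image is an array, arithmetic on the array changes every pixel at once: adding a constant brightens the image, and subtracting each value from 255 inverts it.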
import numpy as np
image = np.random.randint(0, 256, (100, 100))
# Add 50 to every pixel, then cap the result at 255
brighter = np.clip(image + 50, 0, 255)
# Flip dark and light: 0 becomes 255 and 255 becomes 0
inverted = 255 - image
print(f"Original mean: {image.mean():.1f}")
print(f"Brighter mean: {brighter.mean():.1f}")
Adding 50 can push bright pixels above 255, so np.clip() caps the result and keeps every value in the valid range (0-255).
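One caveat worth sketching: the example above uses NumPy's default integer type, which has plenty of room for image + 50. Image data is often stored as uint8 instead, and uint8 arithmetic wraps around before np.clip() ever sees the result. The snippet below, using a small hand-written array as an assumption, shows the wraparound and one possible way to avoid it by widening the type first.

```python
import numpy as np

# Small uint8 "image" with hand-picked values near the top of the range
image = np.array([[10, 200], [250, 255]], dtype=np.uint8)

# uint8 arithmetic is modular: 250 + 50 wraps around to 44
wrapped = image + np.uint8(50)

# Widen to int16, add, clip to 0-255, then cast back to uint8
brighter = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)

print("Wrapped:")
print(wrapped)
print("Clipped:")
print(brighter)
```

In the wrapped result the brightest pixels turn dark, which is why the clip has to happen on a wider type.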
Cropping and Resizing
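Use slicing to crop. Note that reshape() does not resize an image: it only rearranges the same pixels into a new shape, which scrambles the picture. A simple way to shrink an image with plain NumPy is a strided slice that keeps every n-th pixel.
import numpy as np
image = np.random.randint(0, 256, (100, 100, 3))
# Keep rows 20-79 and columns 30-69; the channel axis is untouched
cropped = image[20:80, 30:70]
# Keep every second row and column to halve the height and width
resized = image[::2, ::2]
print(f"Original shape: {image.shape}")
print(f"Cropped shape: {cropped.shape}")
print(f"Resized shape: {resized.shape}")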
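A detail that is easy to miss: a crop such as image[20:80, 30:70] is a view into the original array, not a copy. The sketch below, on a small all-zero array chosen for illustration, shows that writing to the view changes the original, while a crop made with .copy() is independent.

```python
import numpy as np

# 4x4 all-black image, small enough to inspect by eye
image = np.zeros((4, 4), dtype=np.uint8)

# Basic slicing returns a view that shares memory with image
view = image[1:3, 1:3]

# .copy() makes a crop with its own memory
independent = image[1:3, 1:3].copy()

view[:] = 255        # also modifies the center of image
independent[:] = 7   # leaves image untouched

print(image)
print(f"View shares memory: {np.shares_memory(view, image)}")
print(f"Copy shares memory: {np.shares_memory(independent, image)}")
```

Call .copy() on a crop whenever you plan to modify it and want the source image left alone.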
One thing that confused me at first was the order of dimensions. OpenCV uses (height, width, channels), while some libraries, such as PyTorch, use (channels, height, width). Check which layout a library expects before passing it an array.
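Converting between the two layouts is a matter of reordering axes. Here is a minimal sketch using np.transpose; the array names hwc and chw are just labels chosen for this example.

```python
import numpy as np

# Image in (height, width, channels) layout: 100 rows, 200 columns, RGB
hwc = np.zeros((100, 200, 3), dtype=np.uint8)

# Move the channel axis to the front: (channels, height, width)
chw = np.transpose(hwc, (2, 0, 1))

# Move it back to the end: (height, width, channels)
back = np.transpose(chw, (1, 2, 0))

print(f"HWC shape: {hwc.shape}")
print(f"CHW shape: {chw.shape}")
print(f"Round trip shape: {back.shape}")
```

Using a non-square image (100 by 200) makes it easy to see from the shapes alone that height and width kept their relative order.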
Key Takeaways
- Grayscale images are 2D arrays, color images are 3D
- Pixel values in 8-bit images range from 0 (black) to 255 (white)
- Use np.clip() to keep values in the valid range
- Slicing crops an image without copying its pixel data