Array Shapes and Axis Operations

What This Is

This topic is about keeping rows, columns, and axis reductions straight. Many later mistakes in modeling come from losing track of which dimension represents examples and which represents features.

The practical skill is not "knowing NumPy exists." It is knowing which NumPy operation preserves row alignment, which one reduces a dimension, and which one silently changes the meaning of your data.

When You Use It

  • building feature matrices with NumPy
  • reducing across rows or columns
  • adding derived features to an existing matrix

Most Useful NumPy Moves

The functions below do most of the work in this topic:

  • np.array to create a matrix from Python lists
  • shape and np.ndim to inspect the current layout
  • mean and sum to reduce across rows or columns
  • reshape and np.expand_dims to turn 1D results into column vectors
  • np.squeeze to remove size-1 dimensions when you intentionally added them
  • np.column_stack to append 1D derived features as columns
  • np.stack to create a brand-new axis when you want to keep separate pieces distinct
  • np.concatenate to join arrays along an existing axis when the shapes already match

That list is enough for most tabular feature-matrix work.

Tooling

  • numpy.array
  • shape
  • mean(axis=...)
  • sum(axis=...)
  • boolean masks
  • np.column_stack
  • np.stack
  • np.concatenate
  • reshape
  • np.expand_dims
  • np.squeeze
  • astype(float)
  • np.ndim

Minimal Example

import numpy as np

# 3 examples (rows) x 3 features (columns)
X = np.array([[5.0, 0.0, 86.0], [0.0, 2.0, 64.0], [2.0, 1.0, 78.0]])
row_means = X.mean(axis=1)       # shape (3,): one mean per row
column_means = X.mean(axis=0)    # shape (3,): one mean per column
column_totals = X.sum(axis=0)    # shape (3,): one total per column

Axis Mental Model

Think of axis as the direction you are collapsing:

  • axis=0 reduces rows and returns one value per column
  • axis=1 reduces columns and returns one value per row

For a 3 x 4 matrix:

  • X.mean(axis=0) gives 4 values
  • X.mean(axis=1) gives 3 values

That one distinction explains many shape bugs.
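A quick sketch to confirm the 3 x 4 case above (the matrix values are arbitrary placeholders):

```python
import numpy as np

# A 3 x 4 matrix: 3 rows (examples), 4 columns (features).
X = np.arange(12, dtype=float).reshape(3, 4)

print(X.mean(axis=0).shape)  # (4,) — one value per column
print(X.mean(axis=1).shape)  # (3,) — one value per row
```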

Worked Pattern

late_flag = (X[:, 0] <= 0).astype(float)    # one value per row
repeat_flag = (X[:, 1] >= 2).astype(float)  # one value per row
X_augmented = np.column_stack([X, late_flag, repeat_flag])
print(X.shape, X_augmented.shape)           # (3, 3) (3, 5): rows fixed, columns grew

What to notice:

  • axis=1 reduces across columns and returns one value per row
  • axis=0 reduces across rows and returns one value per column
  • every derived feature must have the same row count as the base matrix
  • column_stack is safe only when row alignment is still intact

Two more useful patterns:

col_means = X.mean(axis=0, keepdims=True)         # shape (1, 3): still 2D
X_centered = X - col_means                        # broadcasts cleanly over rows
row_means = X.mean(axis=1)                        # shape (3,)
row_means_2d = np.expand_dims(row_means, axis=1)  # shape (3, 1): column vector

keepdims=True is useful when you want broadcasting to keep working without manually reshaping the result. np.expand_dims is useful when a 1D result needs to become a column vector before stacking.
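To see why the row-wise case needs the extra dimension, here is a sketch with a deliberately non-square matrix (values are arbitrary; only the shapes matter):

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(3, 4)

# Column-wise centering: keepdims=True leaves shape (1, 4), which broadcasts.
X_centered = X - X.mean(axis=0, keepdims=True)

# Row-wise centering: a bare (3,) result does NOT broadcast against (3, 4).
try:
    X - X.mean(axis=1)
except ValueError:
    pass  # shapes (3, 4) and (3,) are incompatible

# Turning the (3,) result into (3, 1) fixes the alignment.
X_row_centered = X - np.expand_dims(X.mean(axis=1), axis=1)
print(X_row_centered.shape)  # (3, 4)
```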

Two debugging habits help here:

  • print the shape after every transformation that changes the matrix
  • compare the derived vector length to X.shape[0] before stacking anything

If a feature came from a filtered subset, rebuild it from the original row order instead of trying to force it into place.
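One way to sketch that mistake: a flag computed on a sorted copy no longer lines up with the original rows, so its 0/1 values describe the wrong examples. The safe version computes directly on X:

```python
import numpy as np

X = np.array([[5.0, 0.0, 86.0], [0.0, 2.0, 64.0], [2.0, 1.0, 78.0]])

# Wrong: the flag is aligned to the sorted copy, not to X's row order.
sorted_copy = X[X[:, 2].argsort()]
misaligned_flag = (sorted_copy[:, 2] > 70).astype(float)

# Right: compute the flag on X itself so row i of the flag describes row i of X.
aligned_flag = (X[:, 2] > 70).astype(float)
print(misaligned_flag, aligned_flag)  # [0. 1. 1.] vs [1. 0. 1.]
```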

Failure Pattern

Creating a derived feature with the wrong row count and stacking it anyway. If the base matrix has n rows, every derived feature must also have n rows.
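A minimal sketch of what that failure looks like in practice — column_stack rejects the mismatch loudly, which is the behavior you want:

```python
import numpy as np

X = np.zeros((3, 4))      # base matrix with 3 rows
bad_feature = np.ones(2)  # derived feature with only 2 values

try:
    np.column_stack([X, bad_feature])
except ValueError as e:
    print("rejected:", e)  # row counts 3 and 2 do not match
```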

Other warning signs:

  • a boolean mask was computed on a different table
  • a derived vector was sorted independently from X
  • a transpose changed the interpretation of rows and columns
  • a matrix was stacked before the row counts were checked
  • squeeze removed a dimension you still needed for later stacking
  • reshape was used to silence a shape mismatch instead of fixing the upstream logic

np.stack is also a common place to make a mistake. It creates a new axis, so it is not the same operation as appending a column. If you want to add features to an existing matrix, column_stack or concatenate(..., axis=1) is usually the right choice.

np.concatenate joins along an existing axis. That makes it useful when you already have compatible 2D arrays and want to widen the feature matrix without creating a new axis.
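The shape differences between the three joiners can be checked directly (the arrays here are zero/one placeholders):

```python
import numpy as np

X = np.zeros((3, 4))
extra = np.ones((3, 2))
flag = np.ones(3)

print(np.column_stack([X, flag]).shape)          # (3, 5): 1D flag becomes a column
print(np.concatenate([X, extra], axis=1).shape)  # (3, 6): widen along existing axis
print(np.stack([X, np.zeros((3, 4))]).shape)     # (2, 3, 4): brand-new outer axis
```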

np.squeeze removes size-1 dimensions. That is convenient after an expand_dims or a reshape, but dangerous if you are not sure whether the singleton dimension is still needed.

np.ndim is a fast sanity check when you are not sure whether something is a vector, a matrix, or a higher-dimensional tensor.
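A short sanity-check sketch for squeeze and np.ndim:

```python
import numpy as np

v = np.arange(3, dtype=float)
col = np.expand_dims(v, axis=1)  # (3, 1) column vector

print(np.ndim(v), np.ndim(col))  # 1 2
print(np.squeeze(col).shape)     # (3,) — the size-1 axis is gone

# Careful: squeezing too early throws away the axis that made broadcasting work.
X = np.zeros((3, 4))
print((X - col).shape)           # (3, 4) — (3, 1) broadcasts against (3, 4)
```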

Practice

  1. Create a 4 x 3 array and compute row means and column means.
  2. Add one derived flag and verify the new shape.
  3. Build one wrong-length feature vector and explain why it must be rejected.
  4. Use reshape to turn a 1D vector into a column vector and explain why that matters for stacking.
  5. Compare np.column_stack with np.stack and describe when each one is the more natural choice.
  6. Explain what happens if you compute a mask on sorted data and then attach it to unsorted data.
  7. Show how keepdims=True changes the result of a reduction.
  8. Explain the difference between concatenate, stack, and column_stack in one sentence each.

Runnable Example

Open the matching example in AI Academy and run it from the platform.

While reading the output, ask:

  • did the row count stay fixed
  • do the row means and column means answer different questions
  • did the augmented shape change only in the feature dimension
  • would a later model still know which values came from which row

Inspect the row means, column means, and final stacked matrix shape. If the shape changes in a way you did not expect, the bug is probably in how the new feature was built, not in the model.

Quick Checks

  • If a vector should be one value per row, its length must match X.shape[0].
  • If a value came from a reduction across columns, it should be a row-level summary.
  • If a value came from a reduction across rows, it should be a column-level summary.
  • If the output is going into column_stack, test the length first and the meaning second.
  • If a reduction will be used in subtraction or division later, consider keepdims=True.
  • If you are about to call squeeze, ask whether the size-1 dimension was useful for broadcasting.

Common Tricks

  • Use X.mean(axis=0, keepdims=True) when you want a column-wise summary that still broadcasts cleanly back onto X.
  • Use X.mean(axis=1) when you want a row-level signal and then turn it into a column with reshape(-1, 1) or np.expand_dims(..., axis=1).
  • Use np.column_stack for one-dimensional derived features that should become new columns.
  • Use np.concatenate([X, extra], axis=1) when both inputs are already 2D and share the same row count.
  • Use np.stack([...], axis=0) only when you really want to create a new outer axis and keep the inputs separate.
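The tricks above, collected into one runnable sketch (matrix values are arbitrary):

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(4, 3)

col_summary = X.mean(axis=0, keepdims=True)  # (1, 3): broadcasts back onto X
row_signal = X.mean(axis=1).reshape(-1, 1)   # (4, 1): row signal as a column

widened = np.column_stack([X, X.mean(axis=1)])    # 1D feature becomes a new column
joined = np.concatenate([X, row_signal], axis=1)  # both inputs already 2D
stacked = np.stack([X, X], axis=0)                # new outer axis, inputs kept separate

print(widened.shape, joined.shape, stacked.shape)  # (4, 4) (4, 4) (2, 4, 3)
```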

Questions To Ask

  1. Does this operation reduce rows, reduce columns, or add a new axis?
  2. If I print the shape, what exact tuple do I expect?
  3. Is this derived value one per row or one per column?
  4. Would broadcasting still work if I changed this to keepdims=True?
  5. Am I appending a feature or building a new dimension?
  6. What would break first if I shuffled the rows before stacking?

Longer Connection

Continue with Python, NumPy, Pandas, Visualization for a fuller data-to-matrix workflow.