Grouped Summaries and Slice Checks¶
What This Is¶
Grouped summaries help you move beyond one overall average. They show whether a model or dataset behaves differently across categories or slices, and they are one of the fastest ways to find hidden imbalance.
The main job is not to build a fancy table. The main job is to answer a simple question: which slice is actually different, and is it large enough to care about?
When You Use It¶
- checking target rates across categories
- finding suspicious subgroups
- comparing aggregates with slice-level behavior
- comparing more than one metric for the same slice
- deciding whether a subgroup is large enough to trust
- turning a raw table into a reportable summary
Tooling¶
`groupby`, `agg`, `sort_values`, `reset_index`, named aggregation
`size`, `count`, `nunique`, `pivot_table`, `crosstab`, `transform`, `filter`, `value_counts`
Most Useful Functions¶
`groupby` is the starting point. It splits the table by a key or keys and gives you a grouped object that can summarize each slice.
`agg` is the workhorse. It lets you compute means, counts, unique counts, and other summaries in one pass. Named aggregation is usually the cleanest form when you want readable column names.
`size` counts rows in each group, including missing values in the measured columns. `count` counts non-null values in a specific column. That difference matters when missingness is part of the story.
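A minimal sketch of that difference, using hypothetical column names (`channel`, `score`) with one missing value:

```python
import pandas as pd

# Hypothetical toy data: one missing score in the "chat" slice.
df = pd.DataFrame({
    "channel": ["chat", "chat", "email"],
    "score": [0.9, None, 0.7],
})

# "size" counts all rows per group; "count" counts only non-null scores.
counts = df.groupby("channel")["score"].agg(rows="size", non_null="count")
print(counts)
```

Here `chat` shows 2 rows but only 1 non-null score, so the two numbers disagree exactly where missingness lives.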
`nunique` tells you how many distinct values appear inside each group. It is useful when you want to know whether a category is repetitive or varied.
`value_counts` is the fastest way to inspect how common a category is. On a grouped Series, it can show the distribution inside each group.
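A small sketch of the grouped form, again with hypothetical column names:

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({
    "channel": ["chat", "chat", "chat", "email"],
    "issue_category": ["billing", "billing", "login", "billing"],
})

# Per-channel distribution of issue categories, as proportions.
dist = df.groupby("channel")["issue_category"].value_counts(normalize=True)
print(dist)
```

The result is indexed by (channel, issue_category), so each channel's proportions sum to 1.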
`pivot_table` and `crosstab` are useful when you want a two-way summary that reads more like a report than a raw groupby result.
`transform` attaches a group statistic back to every row. That is useful when you want a row-level comparison against the slice mean.
`filter` keeps or drops whole groups based on a rule. That is useful when a slice is too small to trust.
Minimal Example¶
summary = df.groupby("channel", dropna=False).agg(
review_rate=("needs_human_review", "mean"),
n=("needs_human_review", "size"),
)
Worked Pattern¶
slice_summary = (
df.groupby(["channel", "issue_category"], as_index=False, dropna=False)
.agg(
review_rate=("needs_human_review", "mean"),
n=("needs_human_review", "size"),
unique_agents=("agent_id", "nunique"),
)
.sort_values(["review_rate", "n"], ascending=[False, False])
)
That pattern is useful because it gives you:
- the rate you want to inspect
- the slice size you need to trust it
- a uniqueness check that can reveal whether a group is dominated by the same few entities
If you want a two-way report, `pivot_table` and `crosstab` are often easier to read:
rate_table = pd.pivot_table(
df,
values="needs_human_review",
index="channel",
columns="issue_category",
aggfunc="mean",
margins=True,
)
count_table = pd.crosstab(
df["channel"],
df["issue_category"],
margins=True,
)
Use `pivot_table` when you want an aggregated value in a matrix layout. Use `crosstab` when you want a frequency table or a compact two-way comparison.
Leave unobserved combinations as NaN when the table is showing rates. That means “no rows for this slice,” not “observed zero rate.”
`transform` is the row-level companion to group summaries:
df["channel_review_rate"] = df.groupby("channel")["needs_human_review"].transform("mean")
df["above_channel_average"] = df["needs_human_review"] > df["channel_review_rate"]
That pattern is useful when you want to compare each row against its own slice without losing row-level alignment.
`filter` is the cleanup tool for tiny groups:
large_slices = df.groupby("channel").filter(lambda g: len(g) >= 10)
That keeps only groups with enough rows to trust.
What To Inspect¶
- whether the highest rate is also supported by a decent count
- whether `size` and `count` tell the same story
- whether missing values are hiding inside a slice
- whether the group key is too broad or too narrow
- whether a second key changes the story completely
- whether `value_counts` or a cross-tab reveals a pattern the mean hides
- whether the table is easier to explain with named aggregation
If a slice looks suspicious, inspect its exact rows before trusting the summary. Group tables are a lens, not the proof itself.
Common Mistakes¶
- using `count` when you really wanted row counts from `size`
- sorting by rate and ignoring the sample size
- forgetting `dropna=False` when missing values are part of the story
- using `as_index=True` and then fighting the index when a plain table would be easier to read
- grouping by too many columns and making every slice tiny
- reading `pivot_table` totals without checking the underlying counts
`observed=True` also matters when grouping categorical columns: it limits the output to categories that actually appear in the data, which often makes the table easier to read.
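A small sketch of the difference, assuming a hypothetical categorical column with an unused `"sms"` category:

```python
import pandas as pd

# Hypothetical categorical with an unused "sms" category.
channel = pd.Categorical(["chat", "email"], categories=["chat", "email", "sms"])
df = pd.DataFrame({"channel": channel, "flag": [1, 0]})

# observed=False keeps a zero-row "sms" group; observed=True drops it.
all_cats = df.groupby("channel", observed=False).size()
seen_only = df.groupby("channel", observed=True).size()
print(all_cats, seen_only, sep="\n")
```

The zero-row group can be useful when you want to report "this slice exists but is empty," and noise otherwise.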
Failure Pattern¶
Trusting a dramatic slice rate without checking the count. A slice with `review_rate = 1.00` and `n = 1` is not strong evidence.
Another failure pattern is collapsing the data into a single pivot and stopping there. If the summary looks interesting, the next step is usually to inspect the rows behind the slice, not to make the table prettier.
Quick Tricks¶
- Use `as_index=False` when you want the result to read like a normal table.
- Use `dropna=False` when missing values should be treated as an explicit slice.
- Use `size` for row counts and `count` for non-null counts.
- Use `nunique` when you need to know whether a slice is being driven by one repeated value.
- Use `value_counts(normalize=True)` when proportions matter more than raw frequency.
- Use `transform` when you want to compare each row against its group.
Practice¶
- Compute a grouped summary by one categorical column and include both `size` and `count`.
- Compute a two-column slice summary and sort by rate and count.
- Mark all slices with fewer than three rows as low confidence.
- Rewrite the summary with named aggregation and compare the readability.
- Build a `pivot_table` version and explain whether it makes the pattern easier or harder to read.
- Build a `crosstab` version and explain whether the row/column layout changes your conclusion.
- Use `transform` to attach a slice mean back onto the original table.
- Pick one slice that looks strong and explain why its count is enough, or not enough, to trust it.
- Find one group where `size` and `count` differ and explain what missingness is doing.
- Use `filter` to remove tiny groups, then explain what changed.
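One possible sketch of the low-confidence exercise, using `transform` and hypothetical column names:

```python
import pandas as pd

# Hypothetical toy data: the "fax" slice has only two rows.
df = pd.DataFrame({
    "channel": ["chat"] * 5 + ["fax"] * 2,
    "needs_human_review": [1, 0, 0, 1, 0, 1, 1],
})

# Attach each row's slice size, then flag slices under three rows.
df["slice_n"] = df.groupby("channel")["channel"].transform("size")
df["low_confidence"] = df["slice_n"] < 3
print(df)
```

Unlike `filter`, this keeps the tiny rows in the table while labeling them, which is handy when the report should show the slice but warn about it.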
Runnable Example¶
Open the matching example in AI Academy and run it from the platform.
While reading the result, ask:
- which slice has the highest rate
- which slice has the best combination of rate and count
- which slice should be treated as a warning rather than a conclusion
- whether a second grouping key makes the story sharper or noisier
- whether `transform` would help you compare rows against their slice
Inspect which slices have both a high rate and enough count to be worth discussing.
Questions To Ask¶
- Is the highest-rate slice also one of the largest slices?
- Does the result change when you add a second grouping key?
- Are missing values being counted or silently dropped?
- Would `pivot_table` or `crosstab` make the report easier to read?
- Which group statistic would you want to attach back to each row with `transform`?
- Which slice should be filtered out because it is too small to trust?
- What would change if you normalized the counts into proportions?
Longer Connection¶
Continue with Python, NumPy, Pandas, Visualization for a longer inspection workflow.