Analysis gotchas: compare Legacy and Glean in GLAM #862

Merged
merged 7 commits into from
Feb 24, 2025
1 change: 1 addition & 0 deletions .spelling
@@ -370,6 +370,7 @@ Taskcluster
TBD
TCP
templated
timeframe
timeline
timelines
timestamp
41 changes: 41 additions & 0 deletions src/concepts/analysis_gotchas.md
@@ -300,3 +300,44 @@ A build id might be formatted in any way and contain the time or version control

Do not assume build ids are consistent across the products we ship. A build id format may vary between products, between channels of the same product, or over time within the same channel of the same product.
The build id format for Firefox Desktop has been very stable over time thus far, but even it can differ between platforms in some respin circumstances (e.g., if only one platform's builder failed).

## Comparing Legacy Telemetry and Glean Data in GLAM

### Official Recommendation

> **Do Not Compare Legacy Telemetry and Glean Data Directly in GLAM.**

- If you need to track long-term trends for a particular metric, treat the Legacy Telemetry timeframe and the Glean timeframe as **separate eras**.
- For in-depth analysis, rely on the Glean instrumentation once you have fully migrated, and use Legacy Telemetry only for historical reference.
- Recognize that both Legacy Telemetry and Glean “tell the same story” but from different angles and with different measurement methodologies.
- Both data sources remain valid and useful, but **side-by-side comparison is not recommended; if you do compare them, approach the results with caution**. Instead, use Legacy Telemetry data for historical context and Glean data for current and future trends.

#### If you still need side-by-side comparisons, expect significant discrepancies for the following reasons:

1. **Bucket Discrepancies (Histograms)**

- **Legacy Telemetry**: Fewer buckets; uses a fixed number of buckets that depends on the histogram type.
- **Glean**: More buckets; uses an algorithmically generated number of buckets that depends on the metric's distribution type.
- **Result**: The distributions and percentiles can look different in GLAM even when measuring the same underlying data because the histogram bounds and number of buckets do not match.

2. **Cross-Process vs. Per-Process Collection**

- **Legacy Telemetry**: Often collects data per process (e.g., main, content) and can send data differently depending on the process.
- **Glean**: Consolidates measurements across multiple processes.
- **Result**: Aggregated Glean data may appear larger or differently distributed compared to Legacy data, because it merges what Legacy would treat as separate process-specific measurements.

3. **Ping Differences ("baseline" and "metrics" Pings in Glean, "main" Ping in Legacy Telemetry)**

- **Legacy Telemetry**: Typically sends one primary ping type (e.g., the “main” ping) for most data.
- **Glean**: Splits data into multiple ping types (e.g., a “baseline” ping and a “metrics” ping).
- **Result**: The same metric can appear to have more frequent updates or different submission times in Glean if it is reported in multiple pings.

4. **Different Reporting Frequencies (Especially for Scalars)**

- **Legacy Telemetry**: Sends telemetry data [at distinct intervals or under certain conditions](https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/data/main-ping.html), usually per browsing session.
- **Glean**: Generally sends data [less often](https://mozilla.github.io/glean/book/user/pings/metrics.html#scheduling), usually once a day for the `metrics` ping.
- **Result**: Scalar comparisons (like sums or counts) often diverge because each system “batches” or “chunks” the data differently over time.
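
To make the bucket discrepancy from item 1 concrete, here is a minimal Python sketch. It is an illustration only: the two bucket layouts (`coarse_bounds` and `fine_bounds`) and the sample values are hypothetical and are not the actual Legacy Telemetry or Glean bucketing algorithms. The point is that the same raw samples yield different percentile estimates once they are stored in histograms with different bounds.

```python
import bisect
import math


def bucket_counts(samples, lower_bounds):
    """Count samples per bucket; each bucket is keyed by its lower bound."""
    counts = {bound: 0 for bound in lower_bounds}
    for value in samples:
        # Find the rightmost bucket whose lower bound is <= value.
        index = bisect.bisect_right(lower_bounds, value) - 1
        counts[lower_bounds[max(index, 0)]] += 1
    return counts


def percentile_from_buckets(counts, percentile):
    """Return the lower bound of the bucket that contains the percentile."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty histogram")
    threshold = total * percentile / 100
    running = 0
    for bound in sorted(counts):
        running += counts[bound]
        if running >= threshold:
            return bound
    return max(counts)


# Hypothetical coarse layout: a short, fixed list of bounds.
coarse_bounds = [0, 1, 10, 100, 1000]
# Hypothetical fine layout: bounds generated by a function
# (eight buckets per power of two, up to 1024).
fine_bounds = [0] + sorted({math.floor(2 ** (i / 8)) for i in range(81)})

# The same underlying measurements are recorded into both layouts.
samples = [3, 7, 12, 45, 60, 80, 150, 300, 700, 900]

coarse_median = percentile_from_buckets(bucket_counts(samples, coarse_bounds), 50)
fine_median = percentile_from_buckets(bucket_counts(samples, fine_bounds), 50)

print("median with coarse buckets:", coarse_median)
print("median with fine buckets:  ", fine_median)
```

With these made-up inputs the coarse layout reports a median of 10 and the fine layout reports 58, even though both histograms were built from identical samples.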

#### Impact on Analyses

- **Histogram Metrics**: Expect to see different bucket distributions, total counts, and percentile shapes.
- **Scalars**: Differences in sums, counts, and other simple accumulations are common. The magnitude of these discrepancies may vary depending on how often the ping is sent, how usage patterns differ, and whether data is merged across processes.
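
The effect of batching and process merging on scalars can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `events` log, its field layout, and the two grouping keys are hypothetical, and the sketch ignores real-world factors such as lost or delayed pings. It groups the same counts once per session and process (in the spirit of the Legacy Telemetry description above) and once per day across processes (in the spirit of the Glean description above).

```python
from collections import defaultdict

# Hypothetical event log: (day, session_id, process, count).
events = [
    ("2025-02-01", "s1", "main", 4),
    ("2025-02-01", "s1", "content", 6),
    ("2025-02-01", "s2", "main", 2),
    ("2025-02-02", "s3", "main", 5),
    ("2025-02-02", "s3", "content", 3),
]


def batch(events, key):
    """Sum the counts of all events that share the same batching key."""
    totals = defaultdict(int)
    for event in events:
        totals[key(event)] += event[3]
    return dict(totals)


# One value per (session, process): per-session, per-process batching.
per_session_process = batch(events, key=lambda e: (e[1], e[2]))
# One value per day, merged across processes: daily batching.
per_day = batch(events, key=lambda e: e[0])

print("per session and process:", sorted(per_session_process.values()))
print("per day, all processes: ", sorted(per_day.values()))
```

In this idealized case the grand total is the same under both groupings (20), but the reported values differ: five small values (2, 3, 4, 5, 6) versus two larger ones (8 and 12). Any view built on the reported values rather than on the grand total will therefore look different between the two systems.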