AI Data Copilot

Download the channel's data as a SQLite file, paste it into Claude or ChatGPT with any prompt below, and ask questions YouTube Studio can't answer. No SQL required.

Data through Apr 28, 2026

YouTube's reporting tends to lag 2–3 days behind real time. Newer activity may not yet appear.

This database was built at 2026-04-30T16:38:21.176Z. Daily aggregates typically reflect data through 2026-04-28.

schema v42— see changelog at the bottom of this page

What we tested

Conventional beliefs about YouTube — does each apply to this channel?

Conventional wisdom about YouTube is usually true on average. The interesting question is whether it's true on thischannel. Each card tests one belief against this channel's data and reports agrees, disagrees, inconclusive, or not currently testable.

Beliefdisagrees

“High click-through-rate drives video success more than high retention.”

On this channel: 0.11×· ctr over retention· agrees ≥ 1.5× — the belief disagrees here.

n = 19, high confidence.

How this is computed

Per video, scale CTR and retention to channel-relative units (each
value's distance from the channel's median, divided by the channel's
typical spread). Fit a linear regression of total views against both.
If the magnitude of the CTR coefficient is at least 1.5× the retention
coefficient, the data agrees that CTR dominates on this channel.

Source: MrBeast school of creator advice; widely contested.

Beliefdisagrees
“Videos 8–10 minutes long perform best for ad revenue and watch time.”
On this channel: 20 min· retention 29%· belief expects 10 min — the belief disagrees here.
n = 16, high confidence.
How this is computed
```
Bucket videos by duration (60/180/360/600/1200/1800/3600+ sec). Median
weighted CTR, AVD, retention pct, sub conversion per bucket. Agrees if
600s bucket has the highest median weighted retention pct.
```
Source: Creator-economy folklore; pre-2018 mid-roll-ad threshold.

Beliefagrees
“Publishing video N consumes impressions that would otherwise have gone to video N−1, slowing N−1's trajectory.”
On this channel: 0.50· agrees ≤ 0.70 — the belief agrees here.
n = 19, medium confidence.
How this is computed
```
For each video pair (N−1, N) where N−1 is still in days 1–7 of its
lifecycle when N publishes, compare N−1's view trajectory in the 3 days
after N's publish to N−1's trajectory in the 3 days before. Channel
median of (views_3d_after / views_3d_before) across eligible pairs.
```
Source: Common creator advice; cited across creator coaching channels.
Beliefagrees
“A new publish steals impression budget from the rest of the catalog.”
On this channel: 0.62 — the belief agrees here.
n = 20, high confidence.
How this is computed
```
When video N publishes, change in non-N videos' impression share over
the next 3 days. Agrees if median impression-share-of-others drops
≥15% in the post-publish window.
```
Source: Catalog-cannibalization folklore.
Beliefagrees
“Publishing too often cannibalizes per-video performance.”
On this channel: 0.50· agrees ≤ 0.70 — the belief agrees here.
n = 19, medium confidence.
How this is computed
```
Same calculation as #1.
```
Source: Aliased to #1 frequency_cannibalization (different phrasing of same belief).
Beliefagrees
“High retention drives video success more than high click-through-rate.”
On this channel: 9.32×· retention over ctr· agrees ≥ 1.5× — the belief agrees here.
n = 19, high confidence.
How this is computed
```
Same regression as #7; agrees if |β_retention| > 1.5 × |β_ctr|. Note:
can be `inconclusive` simultaneously with #7 if neither dominates.
```
Source: Sean Cannell / Think Media school; opposite of #7.
Beliefagrees
“Most subscribers come from a few breakout videos, not from many videos contributing equally.”
On this channel: 0.89· agrees ≥ 0.70 — the belief agrees here.
n = 21, high confidence.
How this is computed
```
Gini coefficient on subs_gained distribution across videos. Agrees if
Gini ≥0.7.
```
Source: Power-law folklore.

Beliefinconclusive
“Distribution happens in two phases — test (low impressions, narrow audience) then scale.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
Per video, fit a piecewise function to the impression-by-day curve.
Identify the 'elbow' where slope changes most. Agrees if ≥50% of
mature videos show a detectable elbow within days 3–14.
```
Source: Two-phase-algorithm folklore.
Beliefinconclusive
“Videos with average view percentage above 50% receive sustained impression delivery.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 17, low confidence.
How this is computed
```
Per video, compare impression delivery rate in days 8–14 against days 1–7.
Bucket by retention tier (<30%, 30–50%, 50%+). Agrees if 50%+ bucket has
impression-rate ratio ≥0.7 AND lower-retention buckets ≤0.4.
```
Source: TubeBuddy creator guides.
Beliefinconclusive
“Channel-page-driven views come from highly engaged viewers.”
On this channel: 26.0 — the belief inconclusive here.
n = 674, low confidence.
How this is computed
```
Compare AVD, retention pct, sub conversion, comment rate of Channel Page
traffic vs Suggested traffic. Agrees if all four are higher for Channel Page.
```
Source: Common assertion.
Beliefinconclusive
“As library grows, Channel Page share of views falls.”
On this channel: 14.0 — the belief inconclusive here.
n = 14, low confidence.
How this is computed
```
Per channel, Pearson correlation between days_since_first_video and
trailing-7d Channel Page view share. Agrees if r ≤−0.5.
```
Source: Library-growth folklore.
Beliefinconclusive
“Comments-per-view tracks audience resonance better than likes-per-view.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 19, high confidence.
How this is computed
```
For each engagement metric, Pearson correlation with sub conversion rate
at the video level. Agrees if comments/v has the highest correlation.
```
Source: Engagement-depth folklore.
Beliefinconclusive
“Comments are weighted more than likes by the algorithm.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 12, low confidence.
How this is computed
```
For videos with comparable view counts, regress impression-rate-after-day-7
on (comments per view) and (likes per view) separately. Agrees if
|β_comments| > 2 × |β_likes|.
```
Source: Various creator coaching.
Beliefinconclusive
“YouTube's algorithm tests a video primarily in its first 1–3 days, after which performance is largely set.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
For each video with ≥14 days tracked, what fraction of total lifetime
views came from days 1–3? Channel median across qualifying videos.
```
Source: VidIQ and TubeBuddy guides; widely repeated in creator literature.
Beliefinconclusive
“Once 30 days post-publish, a video's daily view rate stabilizes and varies little.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
For videos ≥60 days old, coefficient of variation of daily views in
days 30–60. Channel median. Agrees if median CoV ≤0.5.
```
Source: Common analytics-tool framing.
Beliefinconclusive
“Editing title or thumbnail post-publish kills momentum.”
On this channel: 0.00 — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
Same as #12, but agrees if post-change views stay ≤50% of pre-change
baseline for 14+ days. Tests for sustained damage rather than reset+recovery.
```
Source: Conventional wisdom (the 'no-fiddle' rule).
Beliefinconclusive
“Viewers from external sites have the highest engagement rates.”
On this channel: 1.00 — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
AVD, retention, sub conversion of External vs all other sources. Agrees
if External AVD ≥ p75 of cross-source AVD distribution. Sample-size flag
triggers when External views < 100 channel-wide.
```
Source: Common assertion.
Beliefinconclusive
“Days to first 100 subscribers is the slowest sub-acquisition leg.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 9, low confidence.
How this is computed
```
Days from 1 sub to 100 subs vs days from 100 to 200, 200 to 300, etc.
Channel-relative growth rates per sub-bucket. Agrees if days-per-sub
in 0–100 is ≥2× days-per-sub in 100–500.
```
Source: Beginner-creator coaching.
Beliefinconclusive
“A video's lifetime trajectory is largely determined by its first 24–48 hour performance.”
On this channel: n = 0· min 5 — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
Pearson correlation of day1_views vs total_views across all videos with
≥30 days tracked. Same correlation for day1_ctr vs lifetime_weighted_ctr.
Higher r = belief agrees on this channel.
```
Source: Closely related to #2; cited as 'velocity matters'.
Beliefinconclusive
“Longer videos accumulate more total watch minutes per viewer.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 19, high confidence.
How this is computed
```
Pearson correlation of duration_seconds and (total_watch_minutes / total_views).
Agrees if r ≥0.5.
```
Source: Logical inference (bounded by retention).
Beliefinconclusive
“Late-afternoon publishing optimizes day-0 reach.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 21, low confidence.
How this is computed
```
Median day-1 views by hour-of-publish. Identify mode. Agrees if mode
falls in 14:00–18:00 local time. Many channels lock to one hour, making
this inconclusive.
```
Source: Publishing-time folklore.
Beliefinconclusive
“Channels that post on a consistent cadence outperform those that don't.”
On this channel: 22.0 — the belief inconclusive here.
n = 20, low confidence.
How this is computed
```
Cadence regularity score = the typical day-to-day spread of
inter-publish gaps, divided by the mean gap (coefficient of
variation). Lower = more consistent. Correlate against
trailing-30-day view growth.
```
Source: Universal creator-coaching advice.
Beliefinconclusive
“Long gaps between uploads cause the algorithm to deprioritize the channel.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
For gaps of ≥5 days between publishes, compare the post-gap video's
day-1/day-3 performance against the channel's typical day-1/day-3.
```
Source: Common creator coaching.
Beliefinconclusive
“Once a channel has 30+ videos, key ratios (CTR, AVD, sub conversion) settle.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 21, low confidence.
How this is computed
```
Coefficient of variation of weekly weighted CTR and AVD across the
channel's history. Compare CoV in weeks 1–10 vs weeks 11+. Agrees if
both CoVs drop by ≥30% after the 30-video mark.
```
Source: Maturation folklore.
Beliefinconclusive
“Videos with growing Search-source share over time become long-term assets.”
On this channel: 15.0 — the belief inconclusive here.
n = 8, low confidence.
How this is computed
```
Per video, slope of weekly Search-source view share over the video's
lifetime. Channel-level: share of mature videos with positive slope.
```
Source: SEO-creator advice.
Beliefinconclusive
“Shorter videos accumulate more views than longer ones.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 19, high confidence.
How this is computed
```
Pearson correlation of duration_seconds and total_views. Agrees if r ≤−0.3.
```
Source: Pre-Shorts-era folklore.
Beliefinconclusive
“Renaming a video resets its testing cycle.”
On this channel: 0.00 — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
For videos with a title change, compare daily views/impressions/CTR for
the 7 days before vs 7 days after the change. Agrees if median ratio
drops below 0.5 then recovers above 1.0 within 14 days.
```
Source: Common YouTube growth-channel advice.
Beliefinconclusive
“Most videos peak in their first three days.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
Distribution of peak_views_day across mature videos. Agrees if ≥60% of
videos peaked on days 0–2.
```
Source: Pre-2020 algorithmic-period folklore.
Beliefinconclusive
“Saturday and Sunday uploads get less engagement than weekday uploads.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 6, high confidence.
How this is computed
```
Median weighted CTR, AVD, sub conversion for videos published on Sat/Sun
vs Mon–Fri. Agrees if all three are lower on weekend uploads. Inconclusive
with <10 weekend uploads.
```
Source: Day-of-week folklore.

Beliefnot currently testable
“Older channels become less dependent on algorithmic recommendations.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
algorithmic_dependency_index (1 − sticky_traffic_ratio) vs channel age.
Cross-channel cohort-level negative correlation at r ≤−0.4.
```
Source: Maturation-creator folklore.
Beliefnot currently testable
“Browse / Homepage impressions are weighted toward existing subscribers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Browse views / Browse impressions broken out by subscribed state. Agrees
if subscribed-state share of Browse views ≥50%.
```
Source: Common assertion; conflicts with documented YouTube guidance.
Beliefnot currently testable
“Browse pickups are more durable than Suggested-driven impressions.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
For videos with Browse-source rank-1 day, compare Browse impression
decay rate to Suggested decay rate over the next 14 days. Cross-channel
cohort baseline ideal. Agrees if Browse decay half-life > 1.5× Suggested.
```
Source: Pickup-pattern folklore.
Beliefnot currently testable
“Cards contribute roughly 1% of traffic.”
This belief depends on data not currently tracked: Awaiting Round 4 cards-source data ingestion.. It would activate when Future round.
How this is computed
```
card_views / total_views. Cards are usually grouped under Direct/Unknown
or a specific source.
```
Source: YouTube official help docs.
Beliefnot currently testable
“Neither click-rate nor retention alone matters; the product (or geometric mean) does.”
This belief depends on data not currently tracked: Awaiting Round 4 retention-curve-derived precision.. It would activate when Future round.
How this is computed
```
Define combined = sqrt(ctr_z × retention_z) per video. Regress total
views on combined vs on CTR and retention separately. Agrees if combined-
only regression has higher R² than either-alone.
```
Source: Combined-quality folklore.
Beliefnot currently testable
“Channels publishing daily need pre-recorded inventory; those without it burn out.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel only. Cadence break frequency over time for channels
classified as `daily`. Cohort-level: agrees if daily-cadence channels
with ≥30-video pre-publish queue have lower break rates than ad-hoc.
```
Source: Common creator advice.
Beliefnot currently testable
“Videos with high end-screen click-through have higher overall retention.”
This belief depends on data not currently tracked: Awaiting Round 4 cards/end-screen data ingestion.. It would activate when Future round.
How this is computed
```
Per video, end-screen view share of total views vs lifetime weighted
retention. Agrees if r ≥0.4. Requires Round 4 cards/end-screen data.
```
Source: End-screen optimization folklore.
Beliefnot currently testable
“End-screen impressions and clicks contribute 5–15% of traffic.”
This belief depends on data not currently tracked: Awaiting Round 4 retention/cards data ingestion.. It would activate when Future round.
How this is computed
```
end_screen_views / total_views channel-wide. Reads from per-source breakdown.
Note: requires retention curve / cards data from Round 4 to test fully.
```
Source: YouTube official help docs (creator-side guidance).
Beliefnot currently testable
“Performance trajectory of the first 10 videos predicts month-12 channel state.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel. Needs ≥12 months of data per channel and a consistent
'predict' definition.
```
Source: Long-arc folklore.
Beliefnot currently testable
“First video's day-30 performance correlates with month-3 channel trajectory.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel only. Within-channel, capture first_video_day30_views and
compare to channel month-3 cumulative views. Across cohort, Pearson
correlation.
```
Source: TenfoldGrowth-style coaching.
Beliefnot currently testable
“The 0–15-second retention drop is the strongest single predictor of overall performance.”
This belief depends on data not currently tracked: Awaiting Round 4 retention curve ingestion.. It would activate when Future round.
How this is computed
```
From video retention curve, retention_15s_pct. Pearson correlation with
total views and weighted_ctr_post_day_3. Agrees if r ≥0.7. Requires
retention curve data (Round 4).
```
Source: Hook-economy folklore.
Beliefnot currently testable
“Hour-1 view rate predicts the video's lifetime performance.”
This belief depends on data not currently tracked: YouTube Reporting API does not expose hourly granularity.. It would activate when Future round.
How this is computed
```
Requires hourly data not exposed by the YouTube Reporting API.
```
Source: Velocity-creator folklore.
Beliefnot currently testable
“Channels with narrow topical focus outgrow broad-topic channels.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel — needs topic similarity scoring (owner-mode titles) plus
growth rate. Score channels by topical entropy of titles. Correlate
with 30-day growth.
```
Source: Niche-down folklore.
Beliefnot currently testable
“Titles with numbers (5 Ways… / 3 Things…) perform better.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Median weighted CTR and median total views for title_has_number = 1 vs 0.
Agrees if both metrics ≥1.2× the non-numbered group.
```
Source: Listicle-era folklore.
Beliefnot currently testable
“Titles ending in ? outperform on click-through rate.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Median weighted CTR for videos where title_is_question = 1 vs 0.
Owner-mode (titles are coarsened in public).
```
Source: Title-optimization folklore.
Beliefnot currently testable
“Channels with more subs per view have more engaged audiences.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel: Pearson correlation between (current_subscribers / total_views)
and (sub conversion rate, comment rate, AVD). Agrees if all three ≥0.3.
```
Source: Engagement-quality folklore.
Beliefnot currently testable
“Subscribed-state viewers have higher retention than non-subscribers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
AVD per (video, subscribed_state). Agrees if subscribed AVD median is
≥1.3× non-subscribed AVD.
```
Source: Common assertion.
Beliefnot currently testable
“Day-0 traffic is dominated by subscribed-state viewers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Per-video, what % of day-0 views came from subscribed-state viewers?
Channel median. Reads from summary_video_subscribed (owner-mode only).
```
Source: Common assertion in creator coaching.
Beliefnot currently testable
“Suggested impressions reach mostly non-subscribed viewers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Suggested views per subscribed state. Agrees if non-subscribed share of
Suggested views ≥70%.
```
Source: Common assertion.
Beliefnot currently testable
“Channels accelerate after 1,000 subscribers.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel only. Compare 30-day view growth rate before vs after
1,000-sub milestone. Agrees if median post-1k growth is ≥1.5× pre-1k.
```
Source: Monetization-threshold folklore.
Beliefnot currently testable
“YouTube ranks by session length, not per-video metrics.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel; requires session inference. Approximation: total channel
watch minutes per active viewer per day. Marked low_confidence due to
proxy nature.
```
Source: Session-watch folklore (post-2018 algorithm framing).

Beliefnot currently testable
“Older channels become less dependent on algorithmic recommendations.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
algorithmic_dependency_index (1 − sticky_traffic_ratio) vs channel age.
Cross-channel cohort-level negative correlation at r ≤−0.4.
```
Source: Maturation-creator folklore.
Beliefinconclusive
“Distribution happens in two phases — test (low impressions, narrow audience) then scale.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
Per video, fit a piecewise function to the impression-by-day curve.
Identify the 'elbow' where slope changes most. Agrees if ≥50% of
mature videos show a detectable elbow within days 3–14.
```
Source: Two-phase-algorithm folklore.
Beliefinconclusive
“Videos with average view percentage above 50% receive sustained impression delivery.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 17, low confidence.
How this is computed
```
Per video, compare impression delivery rate in days 8–14 against days 1–7.
Bucket by retention tier (<30%, 30–50%, 50%+). Agrees if 50%+ bucket has
impression-rate ratio ≥0.7 AND lower-retention buckets ≤0.4.
```
Source: TubeBuddy creator guides.
Beliefnot currently testable
“Browse / Homepage impressions are weighted toward existing subscribers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Browse views / Browse impressions broken out by subscribed state. Agrees
if subscribed-state share of Browse views ≥50%.
```
Source: Common assertion; conflicts with documented YouTube guidance.
Beliefnot currently testable
“Browse pickups are more durable than Suggested-driven impressions.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
For videos with Browse-source rank-1 day, compare Browse impression
decay rate to Suggested decay rate over the next 14 days. Cross-channel
cohort baseline ideal. Agrees if Browse decay half-life > 1.5× Suggested.
```
Source: Pickup-pattern folklore.
Beliefnot currently testable
“Cards contribute roughly 1% of traffic.”
This belief depends on data not currently tracked: Awaiting Round 4 cards-source data ingestion.. It would activate when Future round.
How this is computed
```
card_views / total_views. Cards are usually grouped under Direct/Unknown
or a specific source.
```
Source: YouTube official help docs.
Beliefinconclusive
“Channel-page-driven views come from highly engaged viewers.”
On this channel: 26.0 — the belief inconclusive here.
n = 674, low confidence.
How this is computed
```
Compare AVD, retention pct, sub conversion, comment rate of Channel Page
traffic vs Suggested traffic. Agrees if all four are higher for Channel Page.
```
Source: Common assertion.
Beliefinconclusive
“As library grows, Channel Page share of views falls.”
On this channel: 14.0 — the belief inconclusive here.
n = 14, low confidence.
How this is computed
```
Per channel, Pearson correlation between days_since_first_video and
trailing-7d Channel Page view share. Agrees if r ≤−0.5.
```
Source: Library-growth folklore.
Beliefinconclusive
“Comments-per-view tracks audience resonance better than likes-per-view.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 19, high confidence.
How this is computed
```
For each engagement metric, Pearson correlation with sub conversion rate
at the video level. Agrees if comments/v has the highest correlation.
```
Source: Engagement-depth folklore.
Beliefinconclusive
“Comments are weighted more than likes by the algorithm.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 12, low confidence.
How this is computed
```
For videos with comparable view counts, regress impression-rate-after-day-7
on (comments per view) and (likes per view) separately. Agrees if
|β_comments| > 2 × |β_likes|.
```
Source: Various creator coaching.

Beliefdisagrees

“High click-through-rate drives video success more than high retention.”

On this channel: 0.11×· ctr over retention· agrees ≥ 1.5× — the belief disagrees here.

n = 19, high confidence.

How this is computed

Per video, scale CTR and retention to channel-relative units (each
value's distance from the channel's median, divided by the channel's
typical spread). Fit a linear regression of total views against both.
If the magnitude of the CTR coefficient is at least 1.5× the retention
coefficient, the data agrees that CTR dominates on this channel.

Source: MrBeast school of creator advice; widely contested.

Beliefnot currently testable
“Neither click-rate nor retention alone matters; the product (or geometric mean) does.”
This belief depends on data not currently tracked: Awaiting Round 4 retention-curve-derived precision.. It would activate when Future round.
How this is computed
```
Define combined = sqrt(ctr_z × retention_z) per video. Regress total
views on combined vs on CTR and retention separately. Agrees if combined-
only regression has higher R² than either-alone.
```
Source: Combined-quality folklore.
Beliefnot currently testable
“Channels publishing daily need pre-recorded inventory; those without it burn out.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel only. Cadence break frequency over time for channels
classified as `daily`. Cohort-level: agrees if daily-cadence channels
with ≥30-video pre-publish queue have lower break rates than ad-hoc.
```
Source: Common creator advice.
Beliefinconclusive
“YouTube's algorithm tests a video primarily in its first 1–3 days, after which performance is largely set.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
For each video with ≥14 days tracked, what fraction of total lifetime
views came from days 1–3? Channel median across qualifying videos.
```
Source: VidIQ and TubeBuddy guides; widely repeated in creator literature.
Beliefinconclusive
“Once 30 days post-publish, a video's daily view rate stabilizes and varies little.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
For videos ≥60 days old, coefficient of variation of daily views in
days 30–60. Channel median. Agrees if median CoV ≤0.5.
```
Source: Common analytics-tool framing.
Beliefinconclusive
“Editing title or thumbnail post-publish kills momentum.”
On this channel: 0.00 — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
Same as #12, but agrees if post-change views stay ≤50% of pre-change
baseline for 14+ days. Tests for sustained damage rather than reset+recovery.
```
Source: Conventional wisdom (the 'no-fiddle' rule).
Beliefnot currently testable
“Videos with high end-screen click-through have higher overall retention.”
This belief depends on data not currently tracked: Awaiting Round 4 cards/end-screen data ingestion.. It would activate when Future round.
How this is computed
```
Per video, end-screen view share of total views vs lifetime weighted
retention. Agrees if r ≥0.4. Requires Round 4 cards/end-screen data.
```
Source: End-screen optimization folklore.
Beliefnot currently testable
“End-screen impressions and clicks contribute 5–15% of traffic.”
This belief depends on data not currently tracked: Awaiting Round 4 retention/cards data ingestion.. It would activate when Future round.
How this is computed
```
end_screen_views / total_views channel-wide. Reads from per-source breakdown.
Note: requires retention curve / cards data from Round 4 to test fully.
```
Source: YouTube official help docs (creator-side guidance).
Beliefinconclusive
“Viewers from external sites have the highest engagement rates.”
On this channel: 1.00 — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
AVD, retention, sub conversion of External vs all other sources. Agrees
if External AVD ≥ p75 of cross-source AVD distribution. Sample-size flag
triggers when External views < 100 channel-wide.
```
Source: Common assertion.
Beliefinconclusive
“Days to first 100 subscribers is the slowest sub-acquisition leg.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 9, low confidence.
How this is computed
```
Days from 1 sub to 100 subs vs days from 100 to 200, 200 to 300, etc.
Channel-relative growth rates per sub-bucket. Agrees if days-per-sub
in 0–100 is ≥2× days-per-sub in 100–500.
```
Source: Beginner-creator coaching.
Beliefnot currently testable
“Performance trajectory of the first 10 videos predicts month-12 channel state.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel. Needs ≥12 months of data per channel and a consistent
'predict' definition.
```
Source: Long-arc folklore.
Beliefinconclusive
“A video's lifetime trajectory is largely determined by its first 24–48 hour performance.”
On this channel: n = 0· min 5 — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
Pearson correlation of day1_views vs total_views across all videos with
≥30 days tracked. Same correlation for day1_ctr vs lifetime_weighted_ctr.
Higher r = belief agrees on this channel.
```
Source: Closely related to #2; cited as 'velocity matters'.
Beliefnot currently testable
“First video's day-30 performance correlates with month-3 channel trajectory.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel only. Within-channel, capture first_video_day30_views and
compare to channel month-3 cumulative views. Across cohort, Pearson
correlation.
```
Source: TenfoldGrowth-style coaching.
Beliefagrees
“Publishing video N consumes impressions that would otherwise have gone to video N−1, slowing N−1's trajectory.”
On this channel: 0.50· agrees ≤ 0.70 — the belief agrees here.
n = 19, medium confidence.
How this is computed
```
For each video pair (N−1, N) where N−1 is still in days 1–7 of its
lifecycle when N publishes, compare N−1's view trajectory in the 3 days
after N's publish to N−1's trajectory in the 3 days before. Channel
median of (views_3d_after / views_3d_before) across eligible pairs.
```
Source: Common creator advice; cited across creator coaching channels.
Beliefnot currently testable
“The 0–15-second retention drop is the strongest single predictor of overall performance.”
This belief depends on data not currently tracked: Awaiting Round 4 retention curve ingestion.. It would activate when Future round.
How this is computed
```
From video retention curve, retention_15s_pct. Pearson correlation with
total views and weighted_ctr_post_day_3. Agrees if r ≥0.7. Requires
retention curve data (Round 4).
```
Source: Hook-economy folklore.
Beliefnot currently testable
“Hour-1 view rate predicts the video's lifetime performance.”
This belief depends on data not currently tracked: YouTube Reporting API does not expose hourly granularity.. It would activate when Future round.
How this is computed
```
Requires hourly data not exposed by the YouTube Reporting API.
```
Source: Velocity-creator folklore.
Beliefinconclusive
“Longer videos accumulate more total watch minutes per viewer.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 19, high confidence.
How this is computed
```
Pearson correlation of duration_seconds and (total_watch_minutes / total_views).
Agrees if r ≥0.5.
```
Source: Logical inference (bounded by retention).
Beliefdisagrees
“Videos 8–10 minutes long perform best for ad revenue and watch time.”
On this channel: 20 min· retention 29%· belief expects 10 min — the belief disagrees here.
n = 16, high confidence.
How this is computed
```
Bucket videos by duration (60/180/360/600/1200/1800/3600+ sec). Median
weighted CTR, AVD, retention pct, sub conversion per bucket. Agrees if
600s bucket has the highest median weighted retention pct.
```
Source: Creator-economy folklore; pre-2018 mid-roll-ad threshold.
Beliefagrees
“A new publish steals impression budget from the rest of the catalog.”
On this channel: 0.62 — the belief agrees here.
n = 20, high confidence.
How this is computed
```
When video N publishes, change in non-N videos' impression share over
the next 3 days. Agrees if median impression-share-of-others drops
≥15% in the post-publish window.
```
Source: Catalog-cannibalization folklore.
Beliefnot currently testable
“Channels with narrow topical focus outgrow broad-topic channels.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel — needs topic similarity scoring (owner-mode titles) plus
growth rate. Score channels by topical entropy of titles. Correlate
with 30-day growth.
```
Source: Niche-down folklore.
Beliefnot currently testable
“Titles with numbers (5 Ways… / 3 Things…) perform better.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Median weighted CTR and median total views for title_has_number = 1 vs 0.
Agrees if both metrics ≥1.2× the non-numbered group.
```
Source: Listicle-era folklore.
Beliefinconclusive
“Late-afternoon publishing optimizes day-0 reach.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 21, low confidence.
How this is computed
```
Median day-1 views by hour-of-publish. Identify mode. Agrees if mode
falls in 14:00–18:00 local time. Many channels lock to one hour, making
this inconclusive.
```
Source: Publishing-time folklore.
Beliefinconclusive
“Channels that post on a consistent cadence outperform those that don't.”
On this channel: 22.0 — the belief inconclusive here.
n = 20, low confidence.
How this is computed
```
Cadence regularity score = the typical day-to-day spread of
inter-publish gaps, divided by the mean gap (coefficient of
variation). Lower = more consistent. Correlate against
trailing-30-day view growth.
```
Source: Universal creator-coaching advice.
Beliefagrees
“Publishing too often cannibalizes per-video performance.”
On this channel: 0.50· agrees ≤ 0.70 — the belief agrees here.
n = 19, medium confidence.
How this is computed
```
Same calculation as #1.
```
Source: Aliased to #1 frequency_cannibalization (different phrasing of same belief).
Beliefinconclusive
“Long gaps between uploads cause the algorithm to deprioritize the channel.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
For gaps of ≥5 days between publishes, compare the post-gap video's
day-1/day-3 performance against the channel's typical day-1/day-3.
```
Source: Common creator coaching.
Beliefnot currently testable
“Titles ending in ? outperform on click-through rate.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Median weighted CTR for videos where title_is_question = 1 vs 0.
Owner-mode (titles are coarsened in public).
```
Source: Title-optimization folklore.
Beliefinconclusive
“Once a channel has 30+ videos, key ratios (CTR, AVD, sub conversion) settle.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 21, low confidence.
How this is computed
```
Coefficient of variation of weekly weighted CTR and AVD across the
channel's history. Compare CoV in weeks 1–10 vs weeks 11+. Agrees if
both CoVs drop by ≥30% after the 30-video mark.
```
Source: Maturation folklore.
Beliefagrees
“High retention drives video success more than high click-through-rate.”
On this channel: 9.32×· retention over ctr· agrees ≥ 1.5× — the belief agrees here.
n = 19, high confidence.
How this is computed
```
Same regression as #7; agrees if |β_retention| > 1.5 × |β_ctr|. Note:
can be `inconclusive` simultaneously with #7 if neither dominates.
```
Source: Sean Cannell / Think Media school; opposite of #7.
Beliefinconclusive
“Videos with growing Search-source share over time become long-term assets.”
On this channel: 15.0 — the belief inconclusive here.
n = 8, low confidence.
How this is computed
```
Per video, slope of weekly Search-source view share over the video's
lifetime. Channel-level: share of mature videos with positive slope.
```
Source: SEO-creator advice.
Beliefinconclusive
“Shorter videos accumulate more views than longer ones.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 19, high confidence.
How this is computed
```
Pearson correlation of duration_seconds and total_views. Agrees if r ≤−0.3.
```
Source: Pre-Shorts-era folklore.
Beliefagrees
“Most subscribers come from a few breakout videos, not from many videos contributing equally.”
On this channel: 0.89· agrees ≥ 0.70 — the belief agrees here.
n = 21, high confidence.
How this is computed
```
Gini coefficient on subs_gained distribution across videos. Agrees if
Gini ≥0.7.
```
Source: Power-law folklore.
Beliefnot currently testable
“Channels with more subs per view have more engaged audiences.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel: Pearson correlation between (current_subscribers / total_views)
and (sub conversion rate, comment rate, AVD). Agrees if all three ≥0.3.
```
Source: Engagement-quality folklore.
Beliefnot currently testable
“Subscribed-state viewers have higher retention than non-subscribers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
AVD per (video, subscribed_state). Agrees if subscribed AVD median is
≥1.3× non-subscribed AVD.
```
Source: Common assertion.
Beliefnot currently testable
“Day-0 traffic is dominated by subscribed-state viewers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Per-video, what % of day-0 views came from subscribed-state viewers?
Channel median. Reads from summary_video_subscribed (owner-mode only).
```
Source: Common assertion in creator coaching.
Beliefnot currently testable
“Suggested impressions reach mostly non-subscribed viewers.”
This belief depends on data not currently tracked: owner-mode-only data (subscribed-state, search queries, etc.). It would activate when the channel owner connects private analytics.
How this is computed
```
Suggested views per subscribed state. Agrees if non-subscribed share of
Suggested views ≥70%.
```
Source: Common assertion.
Beliefnot currently testable
“Channels accelerate after 1,000 subscribers.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel only. Compare 30-day view growth rate before vs after
1,000-sub milestone. Agrees if median post-1k growth is ≥1.5× pre-1k.
```
Source: Monetization-threshold folklore.
Beliefinconclusive
“Renaming a video resets its testing cycle.”
On this channel: 0.00 — the belief inconclusive here.
n = 0, low confidence.
How this is computed
```
For videos with a title change, compare daily views/impressions/CTR for
the 7 days before vs 7 days after the change. Agrees if median ratio
drops below 0.5 then recovers above 1.0 within 14 days.
```
Source: Common YouTube growth-channel advice.
Beliefinconclusive
“Most videos peak in their first three days.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 1, low confidence.
How this is computed
```
Distribution of peak_views_day across mature videos. Agrees if ≥60% of
videos peaked on days 0–2.
```
Source: Pre-2020 algorithmic-period folklore.
Beliefnot currently testable
“YouTube ranks by session length, not per-video metrics.”
This belief depends on data not currently tracked: cross-channel cohort data. It would activate when more channels contribute their data.
How this is computed
```
Cross-channel; requires session inference. Approximation: total channel
watch minutes per active viewer per day. Marked low_confidence due to
proxy nature.
```
Source: Session-watch folklore (post-2018 algorithm framing).
Beliefinconclusive
“Saturday and Sunday uploads get less engagement than weekday uploads.”
On this channel: sample too thin to read — the belief inconclusive here.
n = 6, high confidence.
How this is computed
```
Median weighted CTR, AVD, sub conversion for videos published on Sat/Sun
vs Mon–Fri. Agrees if all three are lower on weekend uploads. Inconclusive
with <10 weekend uploads.
```
Source: Day-of-week folklore.

Have a YouTube belief you want tested? or email it directly. Useful submissions get added to the canon — credit appears in the card's source line.

Honesty

What this dataset doesn’t say

A page in an atlas marked “survey conducted under ice cover” is not weakened by the footnote — it’s strengthened. The columns below name the conditions under which every other chart in this dashboard was drawn.

Reporting lag

YouTube reports two to three days behind.

Data flows reliably through 2026-04-28. The dashboard never charts past that date — it would be zeros dressed as truth. Reporting time zone is America/Los_Angeles; a “day” here is a calendar day in PT, not your local time.

Stub rows

7 videos have pre-publish stub rows.

When a video is scheduled, YouTube sometimes records reporting rows for the day before it published — usually a row of zero views and null impressions. Every chart and aggregate filters these out (the pre_publish_stub = 0guard in every summary CTE). These stubs stay in the database for provenance — they’re how we know YouTube did this on this channel.

Per-video pre-publish stub counts (top 5)
Video 9	1 stub days
Video 10	1 stub days
Video 11	1 stub days
Video 12	1 stub days
Video 13	1 stub days

Click-rate confidence

Where the click-rate sample is thin, the dashboard says so.

Click rate from fewer than 50 impressions is noise. From 50 to 249 it’s directional. From 250 to 999 it’s readable. From 1,000 or more it’s reliable. The four-segment dotted glyph after every click-rate cell encodes which tier — fewer dots means a thinner sample.

noise 3
low 4
medium 10
high 4

Watch-time confidence

The same tiering applies to average watch time.

Below 10 views, average watch is noise. 10 to 29 is directional. 30 to 99 is readable. 100 or more is reliable. The same four-segment glyph after every average-watch cell.

noise 9
low 8
medium 3
high 1

Source gaps

7% of impressions don't trace to a source.

YouTube’s per-source impression rows don’t always sum to the per-video impression total. The gap is real — some impressions come from sources YouTube doesn’t expose. We surface the gap rather than redistribute it.

989 unattributed of 14,526 total impressions.

Snapshot drift

Channel-level views match the sum of per-video views.

The channel snapshot (my-channel-snapshotsin the database) sometimes shows totals slightly higher than the sum of per-video reporting rows. This is YouTube’s own attribution gap; we chart it instead of hiding it.

How is this computed?

The timezone YouTube reporting uses for per-day rollups.: All dates in the public DB are calendar days in YouTube's reporting timezone (America/Los_Angeles, PT), not the visitor's local timezone. A "day" here is a calendar day in PT — a video published at 22:00 PT on Apr 27 will accumulate that day's reporting under the date 2026-04-27, even for viewers in UTC+12.
Flag (0/1) on every per-video reporting row indicating whether the row sits before the video's published_at.: Set to 1 when the row's date is earlier than SUBSTR(published_at, 1, 10). Every summary_* CTE that aggregates over per-video reporting must include WHERE pre_publish_stub = 0 so the seven-row stub artifact (NULL impressions, 0 views) doesn't poison aggregations. The flag itself stays in the DB for downstream provenance.
Sample-size confidence tier for the row's click-through rate; one of noise / low / medium / high.: Derived from the row's impressions sample size. Noise tier when impressions < 50; low when < 250; medium when < 1000; high otherwise. The thresholds are calibrated so that a noise-tier CTR is statistically meaningless (one or two clicks against a tiny denominator).
Sample-size confidence tier for the row's average view duration; one of noise / low / medium / high.: Derived from the row's view sample size — the denominator AVD averages over. Noise tier when views < 10 (one or two viewers' watch time, statistically meaningless); low when views < 30; medium when views < 100; high otherwise. Mirrors the V10 ctr_confidence pattern but keyed on views instead of impressions.
Per-date drift between the channel-level views snapshot and the sum of per-video views.: MAX(0, channel_daily.total_views - SUM(video_daily.views WHERE pre_publish_stub = 0)) per date. Reads >0 when the channel-snapshot total exceeds the sum of attributed per-video reporting (e.g. deleted videos, late attribution). Floored at zero so reporting overcounts (per-video sum exceeds channel snapshot — also possible during transient lag) don't render as negative drift.

Channel state · publishing cadence

The trailing 28-day publish strip plus the detected pattern. Descriptive only; the strip is the channel’s observed rhythm, not a recommendation about when to publish.

Publishing cadence appears after the first 28 days.

Channel state · view distribution

How unevenly views and subscribers are distributed across the catalog over time. A higher Gini means a few videos carry most of the recent views or subscribers; a lower Gini means the distribution is more even. Both shapes are common at different channel stages.

Questions you can ask your data

Start here

Consultant analysis — top 5 observations

The closest thing to a paid YouTube analyst — evidence-based, no fluff.

I've uploaded an anonymized YouTube analytics SQLite database from a real channel. Videos are labeled "Video 1", "Video 2", etc. Analyze all tables and give me your top 5 specific, evidence-based observations about this channel — things the creator might not have noticed. Include the actual numbers.

Diagnose growth

Is the channel growing, flat, or declining?

View velocity, spikes, and drops across the channel over time.

I've uploaded a YouTube analytics SQLite database. Query the video_daily table and analyze the daily trends. Is the channel's daily view count growing, flat, or declining? Are there any days with unusual spikes or drops? What's the view velocity (views per day) for the most recent videos vs. the earliest ones?

Is day-1 performance a predictor of lifetime views?

Whether early signals reliably forecast long-term outcomes.

I've uploaded a YouTube analytics SQLite database. Look at the summary_video table's day1_views, day1_impressions, and day1_ctr columns. Is there a relationship between day-1 performance and lifetime performance? Do videos that start strong continue strong, or is day-1 noise?

Understand the audience

Where are viewers actually finding these videos?

Browse-driven vs Suggested-driven vs Search — by video.

I've uploaded a YouTube analytics SQLite database. Query the traffic_daily and summary_traffic tables. For each video, show me the percentage of views from Browse/Homepage vs. Suggested Videos vs. YouTube Search. Are some videos "Browse videos" and others "Suggested videos"? What distinguishes them? Is there a pattern by publish date or video duration?

How does click rate behave over a video's lifetime?

Does CTR stay stable, or does it drop as YouTube expands the audience?

I've uploaded a YouTube analytics SQLite database. Query video_daily for CTR over time for each video. Does CTR typically start high and decline (as YouTube expands to less-targeted audiences), or does it stay stable? Which video had the most consistent CTR? Which had the biggest CTR drop? What does this tell us about YouTube's impression allocation strategy?

Inspect individual videos

Which videos have potential that hasn't been discovered yet?

Videos with above-average click rate but below-average views.

I've uploaded a YouTube analytics SQLite database. Query the summary_video table and find any videos where weighted_ctr is above the channel average but total_views is below average. These are videos that performed well on click rate but did not accumulate many views. What patterns do you see in their titles, durations, or traffic sources? What factors might explain the gap between click rate and views?

Which videos keep getting views — and which flashed and died?

Flash pattern vs evergreen across the catalog.

I've uploaded a YouTube analytics SQLite database. Query the video_daily table, filtering by days_since_publish. For each video, show me how views decay over time. Which videos had the steepest decay (flash pattern)? Which maintained views longest (evergreen pattern)? Plot the normalized decay curves if possible.

Which videos convert impressions into watch time most efficiently?

How much viewing time each thumbnail impression generates.

I've uploaded a YouTube analytics SQLite database. The summary_video table has a watch_min_per_impression column — this measures how much watch time each thumbnail impression generates. Which videos are most efficient at converting impressions into watch time? Is there a correlation between video length and this metric?

Which videos turn viewers into subscribers?

Subscriber conversion rate across the catalog.

I've uploaded a YouTube analytics SQLite database. Look at the subs_per_1k_views column in summary_video. Which videos are best at converting viewers into subscribers? Is it the high-view videos or the high-engagement ones? What does this suggest about which content to make more of?

Go deep

Complete channel audit

Comprehensive read across every table — what's moving, what's flat, what's surprising.

I've uploaded a YouTube analytics SQLite database with tables: summary_channel, summary_video, video_daily, traffic_daily, summary_traffic, summary_era, summary_channel_daily, summary_traffic_daily, summary_traffic_top_videos, channel_daily. This is a real, anonymized channel. Give me a comprehensive audit: what's working, what's not, what's surprising, and what questions would you ask the creator based on this data?

How does this channel compare to typical creator-stage benchmarks?

The AI compares against public creator-stage research it knows about — hedged.

I've uploaded a YouTube analytics SQLite database. Based on public creator-stage research you know about — typical views per video, CTR ranges, subscriber growth curves for channels at a similar stage — how does this channel's data compare? Where do the numbers look unremarkable for the stage, and where do they stand out? Note that benchmarks vary widely by niche, audience, and content type, so treat any comparison as a rough reference rather than a target.

Sample SQL queries (for SQL writers)

Copy any of these into a SQLite client (or paste them into your AI alongside the .db file). Each one exercises a column added this round.

Videos that were watched but not clicked

High retention, low click rate — quality signal without reach.

SELECT video_id, weighted_ctr, avg_retention_pct, total_views
FROM summary_video
WHERE ctr_quadrant = 'watched-not-clicked'
ORDER BY total_views DESC;

Channel's 28-day weighted click rate over time

Smoothed channel-wide click rate, NULL while the window is incomplete.

SELECT date, weighted_ctr_28d
FROM summary_channel_daily
WHERE weighted_ctr_28d IS NOT NULL
ORDER BY date;

How concentrated is each video's source mix?

source_hhi is the Herfindahl index over the per-video source shares (0..10000).

SELECT video_id, primary_traffic_source, source_hhi
FROM summary_video
WHERE source_hhi IS NOT NULL
ORDER BY source_hhi DESC;

Videos that took longest to get a Browse / Homepage impression

Days from publish to the first Browse impression on each video.

SELECT video_id, days_to_first_browse_imp
FROM summary_video
WHERE days_to_first_browse_imp IS NOT NULL
ORDER BY days_to_first_browse_imp DESC
LIMIT 10;

Channel's source diversity over time

Simpson 1 - Σ share² over the trailing-7d source share. Higher = more spread out.

SELECT date, source_diversity_score, top_video_share_7d
FROM summary_channel_daily
WHERE source_diversity_score IS NOT NULL
ORDER BY date DESC
LIMIT 30;

Quiet inventory — videos with no views in the prior 7 days

Track how many of the channel's published videos went view-less in each rolling window.

SELECT date, videos_with_zero_views_7d, quiet_inventory_pct
FROM summary_channel_daily
WHERE quiet_inventory_pct IS NOT NULL
ORDER BY date;

Avg Watch drift — videos whose recent average watch is longer than the first 3 days

Positive drift = recent viewers stay longer than the early audience did.

SELECT video_id, avg_avd_sec, avd_drift_sec
FROM summary_video
WHERE avd_drift_sec IS NOT NULL
ORDER BY avd_drift_sec DESC
LIMIT 10;

Per-video source-share daily — who sent views on which day

Long-format trailing-window source share. Useful for stacked-area charts.

SELECT date, traffic_source_id, share_7d, views_7d
FROM summary_channel_source_share_daily
WHERE share_7d IS NOT NULL
ORDER BY date DESC, share_7d DESC
LIMIT 50;

Which AI should I use?

These prompts work best with Claude, ChatGPT Plus, or Gemini Advanced— they need file upload and long-context reasoning. Free-tier models will often fail silently on the bigger queries (like "Complete channel audit"). If you have access to Claude Projects or custom GPTs, upload the .db once and keep asking questions.

Where each metric comes from

Subscriber columns come from two independent YouTube APIs. subs_total_eod is a daily snapshot from the Data API. subs_gained and subs_lost are deltas from the Analytics API. Drift between them is expected — they sample at different times of day and use different rounding rules. Each column is authoritative for its own purpose.

Field	API source
views, watch_time_minutes, avg_view_duration_sec, avg_view_pct, subs_gained, subs_lost, likes, comments, shares	YouTube Reporting API + Analytics API (reconciled)
impressions, ctr	YouTube Reporting API (reach reports)
engaged_views, dislikes	YouTube Analytics API only
Country, device_type, subscribed_status dimensions	YouTube Reporting API

Schema changelog (v5 – v42)

One row per schema version, newest first. Reading theschema_versiontable directly returns the same data.

v422026-04-29L-16 — summary_detector_activity table. New per-detector aggregation table populated by scripts/build-detector-activity.js after dashboard/scripts/generate-snapshots.ts has written the insight_snapshot history. One row per detector_name with total_fires, first_fired_date, last_fired_date, highest_significance, and the snapshot_date that carried that highest_significance. Empty at clean-build time (insight_snapshot has no rows yet); populated when refresh-public.js completes its finalize step. The /help/detectors master list page (L-17) reads this table via getDetectorActivity() to render fired vs not-yet-fired groupings.
v412026-04-28A-30 — Calc hygiene + metric_metadata + indexes. New metric_metadata table (metric_id, definition, source_tables, computation_doc, confidence_method, last_modified) populated from scripts/metric-metadata.yaml at build time; consumed by /data tile tooltips so "How is this computed?" reads from a single source. Index audit: the eight Round-3 indexes (idx_svcd_video_date, idx_svcd_country_date, idx_sccd_country_date, idx_mzd_metric_date, idx_wt_status_date, idx_wt_test_date, idx_vsd_video_date, idx_cal_country_date) all use CREATE INDEX IF NOT EXISTS for build-rebuild idempotence. Time-zone documentation: data_status gains a reporting_timezone row pointing at America/Los_Angeles (PT) so the dashboard's honesty surfaces read the timezone from one source. Schema-history consistency check (scripts/schema-history-check.js) extends the prior tail-equality assertion with three new failure modes — version skip / gap, non-monotonic order, and position-by-position mismatch with a hard-coded EXPECTED_SCHEMA_HISTORY_VERSIONS sequence — so future schema arithmetic regressions fail the build with precise diagnostics.
v402026-04-28A-25 — Honesty surface column. summary_channel_daily gains unattributed_channel_views INTEGER, populated post-insert as MAX(0, total_views - SUM(video_daily.views WHERE pre_publish_stub = 0)) per date. The column makes the per-date snapshot drift between channel-level views and the sum of per-video reporting rows readable directly from one column instead of recomputing every render. Powers the new <HonestyPanel> on /data (six sub-sections naming reporting lag, pre-publish stub rows, click-rate confidence distribution, AVD confidence distribution, unattributed impressions, and the channel-snapshot reconciliation chart) plus a one-line <HonestyDigest> in the Coda chapter on /. The digest reads "Data flows reliably through {cutoff}; per-source attribution is at {pct}%; click-rate sample tier today: {tier}." with a link to the full panel; the third clause omits when reconciliation is sparse. Observation-only voice; the surface is reference content, not a verdict.
v392026-04-28A-23 — AVD confidence tier columns. summary_video, summary_traffic, and video_daily each gain an avd_confidence TEXT column (one of noise / low / medium / high) derived from the row's view sample size: noise < 10, low < 30, medium < 100, high otherwise. Mirrors the V10 ctr_confidence pattern but keyed on views (the denominator AVD averages over) instead of impressions. Persisting the tier alongside the value lets every UI surface that renders an AVD value wrap it in <AvdConfidencePip> — strikethrough on noise, muted on low, full on medium / high — without recomputing the threshold per call site. Same A-23 ships StageChip + VideoMiniStack visual chrome on /videos rows + comparison-band reuse on per-video charts; those are presentation-only and do not bump the schema.
v382026-04-28A-19a — Wisdom canon foundation. Two new tables: wisdom_canon (50 rows loaded from scripts/wisdom-canon.yaml — one per conventional belief, with belief_text, calculation_doc, evidence_shape in {single-ratio, correlation, delta, share-distribution, count-tier, not-applicable}, and agrees_directionality in {positive, negative, neutral}) and wisdom_test (one verdict row per (test_id, snapshot_date) with status in {agrees, disagrees, inconclusive, not_currently_testable}, an optional inconclusive_reason that classifies into thin_sample / near_threshold, evidence_json, confidence, sample_size, prior_status, status_changed_today, and the not_yet_implemented flag the runner sets while the per-test impl files are still landing). New scripts/wisdom-runner.js loads the canon yaml on every build, populates wisdom_canon idempotently, and emits 50 wisdom_test rows: every requires_owner_mode / requires_cross_channel / not_currently_testable_reason entry short-circuits to status not_currently_testable; every testable entry with a per-test file at dashboard/src/lib/wisdom/tests/<test_id>.ts dispatches and writes a real verdict; every testable entry without an implementation file gets a placeholder row with not_yet_implemented = 1. Three anchor implementations land in this slice: frequency_cannibalization (single-ratio), first_24_48h_determines_fate (correlation), title_changes_reset_algo (delta). A-19b lands the remaining 26 testable implementations across multiple sessions; the architecture cannot regress because the runner already passes all 50 entries with the placeholder fallback.
v372026-04-28A-17 + A-18 — Cadence classifier, cadence break events, day-of-week trend (A-17), plus library Gini coefficient and Lorenz half-mass count (A-18). A-17 ships channel_cadence_daily (one row per date with a classified publishing pattern: daily, every_other_day, weekday_only, weekly_3x, weekly, irregular; conformance percentage over the trailing 28 days; days_since_break that resets on each divergence) and the cadence_break channel_timeline event emitted on each break day after the channel held the pattern for at least seven days. Two TEXT JSON columns on summary_channel: publish_gap_distribution (histogram of inter-publish gaps in days) and day_of_week_trend (per-weekday twelve-week slope of channel views). A-18 layers three new columns on summary_channel (gini_views, gini_subs, library_half_life_days — the third counts top videos sorted desc by views needed to reach half of total channel views, populated as a Lorenz half-mass count) plus two trailing 28-day rolling Gini columns on summary_channel_daily (gini_views_28d, gini_subs_28d). The Coda chapter renders a hedged "{M} of {N} videos" callout from gini_subs against the Lorenz curve; /data renders a distribution-evolution line chart of the two daily Gini series. Observation-only voice across all surfaces; both extremes (concentrated vs even) are legitimate channel patterns.
v362026-04-28A-16 — Days-to-N velocity columns. summary_video gains seven INTEGER columns: days_to_10_views, days_to_100_views, days_to_500_views, days_to_1000_views (cumulative-views milestones, populated by the same per-video walk that already produces days_to_50_views), plus days_to_first_comment, days_to_first_like, days_to_first_share (day offset where the cumulative engagement count first reaches one). All seven columns hold NULL when the milestone has not landed yet, mirroring the days_to_first_subscriber convention. summary_channel gains two INTEGER columns: median_days_to_100_views and median_days_to_1000_views — channel medians across videos that have crossed each milestone, NULL when fewer than one eligible video exists. The /videos/[id] discovery-unfolded strip extends to surface the new milestones inline alongside the existing days_to_first_* values; the Catalog chapter on the home page renders a hedged channel-median callout. Pure-data slice; observation-only voice.
v352026-04-28A-14 + A-15 — Engagement composite, velocity, two new detectors, plus subscriber granularity. A-14 ships engagement_weights (default likes=1, comments=5, shares=3) and three columns on summary_video: engagement_rate_composite (weighted-engagement-events / total_views; NULL when total_views = 0), engagement_velocity_24h (likes + comments + shares from the first 24 hours since publish), engagement_velocity_7d (the same sum over the first 7 days). Two new detectors: comment-outliers fires per-video when comments-per-100-views runs at least 3x the channel median (sprouting+); engagement-leads-views runs a cross-correlation between engagement spikes and view spikes per video at lag 3-7 days (growing+). Both route through emerging_signal via evidence.shape. The chip on /videos/[id] surfaces the 24h and 7d velocity values; the new <EngagementVelocityChart> renders them as a two-bar Recharts BarChart. A-15 layers subscriber granularity on top: summary_traffic gains subs_gained INTEGER (per-(video, source) total_subs_gained allocated proportionally by view share; YouTube does not break subs out by source so the value is an inference) and funnel_engaged_views INTEGER (SUM(engaged_views) from reporting_traffic_metric per (video, source)); summary_video gains days_to_third_subscriber and days_to_fifth_subscriber (extends the days_to_first_subscriber pattern, walks video_daily for the day cumulative subs_gained reaches 3 / 5). channel_timeline.event_type adds subscriber_event for milestone-only channel-wide subs (every 5th up to 100, every 10th past 100, every 25th past 500, every 100th past 1k) — replaces the prior milestone-typed subscriber rows so the Story dispatcher can target the new event type explicitly. New <SubFunnelChart> mounts on /traffic/[sourceId] rendering impressions → views → engaged → subs as a horizontal Recharts BarChart; new sub-rate-vs-view-rate detector fires when the trailing 28-day sub-rate slope and view-rate slope diverge (audience-quality-improving / volume-without-depth shapes routed through emerging_signal). Observation-only voice across both slices.
v342026-04-28A-12 + A-13 — Z-score machinery and step-change scaffolding. A-12 ships metric_zscore_daily (date, metric, value, baseline_mean, baseline_sd, zscore, sample_size; PK on (date, metric)) holding per-(date, metric) z-scores against a prior 28-day baseline for five tracked metrics: daily_views, weighted_ctr, weighted_avd, sub_gain_rate, total_impressions. Each row reads the 7-day trailing mean as the value, prior 28-day mean and sd as the baseline; baseline window must hold at least 28 non-null values or the z-score stays NULL. summary_channel_daily.anomaly_score_composite reads SUM(ABS(zscore)) per date so the Opening rail anomaly markers fire on multi-metric outlier days. The new metric-anomaly detector reads z > 2 outliers in the trailing 7 days and routes through the existing emerging_signal slot via evidence.shape. A-13 layers six new columns on summary_channel_daily: trailing 14-day OLS slopes for top_video_share, top3_video_share, source_diversity, quiet_inventory, and channel-level country HHI, plus recent_debut_trend_14d (relative delta between recent-14-day and prior-30-day mean day1_impressions across the catalog). A-13 also adds a CUSUM step-change scan over four channel-level rolling series (views, click rate, avg watch, subs gained) emitting step_change channel_timeline events when the maximally-separating split is at least 30% relative magnitude and sits at least 7 days from either series boundary. The new recent-debut-trend detector reads summary_video and surfaces the recent-vs-prior comparison as a single channel-scope insight at growing+ maturity. Observation-only voice across all surfaces.
v332026-04-28A-11 — Audience persistence and replacement, plus the emerging_audience event type. summary_channel_daily gains two REAL columns: audience_persistence_pct (share of last week's (source, country) buckets that also appeared the prior week) and audience_replacement_rate (share of last week's buckets that did not appear the prior week). Computed as a JS pass over reporting_traffic_metric on the source DB; one weekly comparison written per ISO week-ending row in summary_channel_daily, with NULL on dates that do not fall on a week boundary. A new emerging_audience event_type emits one channel_timeline row per (traffic_source_id, country_code) bucket that did not appear in the channel's first 14 days yet drew at least 50 views in the last 28 days; headline names the country, then the source. The Audience chapter mounts a two-bar gauge showing the persistence and replacement values with arrows reading the most recent two weekly comparisons; both states (high persistence vs high replacement) are legitimate channel patterns at different ages.
v322026-04-28A-10 — Sticky-traffic ratio and algorithmic dependency index. traffic_source gains an is_sticky 0/1 column (sources where the viewer chose to come back: Direct, Channel Page, Playlist, End Screen, link-clicks from off-platform, subscriber notification; everything else is algorithmic discovery). summary_video adds sticky_traffic_ratio (0..1 share of total_views from sticky sources, NULL when no traffic rows). summary_channel adds algorithmic_dependency_index (1.0 minus the channel-wide sticky share). summary_channel_daily adds two columns: per-day algorithmic_dependency_index and a trailing 28-day OLS slope of that index (units: index points per day, NULL when fewer than 14 daily values fall in the trailing window). The Audience chapter mounts a new <StickyTrafficSparkline> tile that renders the per-day index as a sparkline against a hedged qualifier; per-video pages render a small sticky-traffic chip. Both extremes (high and low ratio) are legitimate channel patterns; the value is observation, not verdict.
v312026-04-28A-09 — Channel temperature and country diversity. summary_channel_daily gains two new REAL columns: temperature_score (a 0..100 weighted blend of trailing CTR, AVD, sub-rate, source diversity, and library activity z-scores against the prior 90-day baseline; 50 reads as the channel's own normal, 60 as a warmer week than typical for this channel, 40 as a cooler week than typical) and country_diversity_score (Simpson form 1 - sum(share^2) over the per-date top-country shares from summary_channel_country_daily). Both columns require at least 35 days of baseline; below that they stay NULL. Adds two channel-scope detectors that share the emerging_signal slot via evidence.shape: algorithmic_thaw fires when channel-wide impressions ran at or above 2x their prior 14-day median for two or more consecutive days, and algorithmic_frost fires on the symmetric drop (impressions at or below 50 percent of the prior 14-day median for two or more days). Both maturity-gated to sprouting+. The 5th tile on the home page right-now strip renders the score, a 14-day sparkline, and a hedged qualifier sourced from EVOLUTION section 4.1 ("warmer than typical" / "your normal" / "cooler than typical") so the channel position is observation, not verdict.
v302026-04-28A-08 — Channel-scope co-movement events and anomaly composite scaffolding. Two new channel_timeline event types ship on default builds: multi_video_activation (a day where three or more videos each ran at or above 2x their prior 7-day rolling mean views) and cohort_lift (an ISO-week publish cohort whose median trailing-7-day views climbed at least 30 percent over the prior week while older cohorts stayed flat within plus-or-minus 10 percent). Two new detectors (multi-video-activation, cohort-lift) emit the same patterns as Insight objects routed through the existing emerging_signal slot via evidence.shape. summary_channel_daily gains the anomaly_score_composite REAL column; the column stays NULL on default builds until A-12 ships the metric_zscore_daily table that populates it (sum of absolute z-scores across tracked metrics per date). The Opening rail container exposes the data-testid="anomaly-marker-rail" hook so the smoke gate can catch a SQL-join-key class of regression at build time even before A-12 lands.
v292026-04-28A-05 + A-06 — Lifecycle daily classifier and stage-transition timeline. A-05 ships the new video_stage_daily table (per-(video, date), one of nine stages: pre_publish / debut / growth / peak / tail / quiet / re_emerged / zombie / dormant) populated by a deterministic JS pass over video_daily anchored on summary_video.peak_views_day. Eight new columns on summary_video persist the latest stage (lifecycle_stage), the per-stage day spans (days_in_growth, days_in_peak, days_in_tail), the time from peak to half-of-peak rolling views (tail_half_life_days), and three survives-day-N flags (survives_day_7 / 14 / 30) gated on cumulative views at day N reaching the channel-median cumulative views at the same day. A-06 layers a stage_transition channel_timeline event type on top — emitted from diff of consecutive video_stage_daily rows, filtered to meaningful transitions only (any transition INTO re_emerged / zombie / dormant, any transition OUT of quiet, plus peak -> tail) — and adds three detectors that read the persisted stages: zombie-impressions, dormant-video, and channel-level video-re-emergence. Eight columns plus the transition events let the catalog grid read current stage and trajectory health without touching video_daily; downstream slices (A-23 StageChip, A-27 story dispatcher, A-29 video timeline strip) read these directly.
v282026-04-28A-04 — Country composites on summary_video. Five derived columns (country_hhi, dominant_country_code, dominant_country_share, country_concentration_with_weak_retention, country_audience_match_score) populated by a post-summary pass over the coarsened summary_video_country rows. HHI mirrors source_hhi shape (sum of squared shares × 10000); the concentration boolean fires when one country holds >= 40% share of a video and that country watches >= 30% shorter than the video overall; the match score is the cosine similarity between this video distribution and the channel-wide well-retaining country distribution (countries weighted by views from videos retaining at or above the channel median). Pairs the country dimension with the existing source_hhi to characterize concentration on two independent axes; the geographic-concentration detector now reads the persisted boolean instead of recomputing per-insight.
v272026-04-28A-02 + A-03 — Daily country tables and country_activation_log. summary_video_country_daily (per-(video, country, date)) and summary_channel_country_daily (per-(date, country)) ship in default builds, populated from reporting_video_metric with cutoff and pre-publish guards. The daily resolution unlocks per-video country stream charts on /videos/[id] and the channel country river on /traffic; downstream slices read these tables for activation logs, audience-swap detection, and per-country sub-funnel cohorts. Long-tail rows below the 5% per-video share threshold collapse into the same "OT" bucket the lifetime tables use. A-03 layers the country_activation_log table (per-(video, country) first-active and first-meaningful >= 5 views dates) on top, plus a first_country_view channel_timeline event (one per country the channel has reached, significance 50) so the Story dispatcher can render geographic-reach milestones once A-27 wires it.
v262026-04-28A-01 — Country tables default-public. The two tables (summary_video_country, summary_channel_country_trailing30) now ship in default builds, with --coarsen-country-mix on so countries below 5% share collapse to "OT" (Other). The owner-mode flag still suppresses summary_video_subscribed plus summary_channel_device. The country dimension on its own does not deanonymize an English-language channel (top countries cluster around US/IN/GB/CA on essentially every channel); the prior owner-mode gate hid the most diagnostic dimension in the dataset.
v252026-04-27Reverts V23 (CLOSE-B-51) coarsening of video.published_at and summary_video.published_at to YYYY-MM-01. The collapse to first-of-month broke the cadence detector (every channel misclassified as "schema-ambiguous"), the per-date predicates in the dashboard, and the publish-date scatter (every dot landed on a single vertical line); the privacy gain was marginal because per-video age is already exposed via days_tracked, view trajectories, and the timeline. duration_seconds bucketing remains in place — it carries genuine de-anonymization risk (a precise integer-second runtime is rare enough to be fingerprintable) without the downstream collateral. The --show-precise-metadata flag still toggles the duration bucketing for owner-mode rebuilds.
v242026-04-27Count consistency across surfaces. summary_channel.total_public_videos now counts videos with published_at <= latest_data_date (the reporting cutoff) instead of joining through video_daily, so videos published right at/after the cutoff stop being silently dropped from the count. summary_channel.total_days_tracked now derives from (latest_data_date - first_video_date + 1) so the days span and the public-video count align by construction for daily posters. summary_channel_daily.cumulative_videos_live is recomputed from precise source-side publish dates instead of the noisy data-api channel_snapshot.video_count, and two new columns most_recent_video_id (TEXT) and days_since_last_publish (INTEGER) let the dashboard render per-date "videos live" + "Video N is X days old" without recomputing from coarsened video.published_at (CLOSE-B-51 silently broke that path).
v232026-04-27Coarsened publish_date and duration_seconds in default builds so the public DB cannot be cross-referenced to identify the channel: video.published_at and summary_video.published_at snap to YYYY-MM-01; video.duration_seconds and summary_video.duration_seconds snap to one of eight bucket lower bounds (0/60/180/360/600/1200/1800/3600). Coarsening runs as a final UPDATE pass after all derived columns (day3/day7 cohorts, days_to_first_*, peak_views_day, summary aggregates) are computed against precise values, so build-time math is unaffected. Owner-mode rebuilds with --show-precise-metadata skip the UPDATE and preserve full precision.
v222026-04-27Owner-mode tables (summary_video_country, summary_video_subscribed, summary_channel_country_trailing30, summary_channel_device) suppressed by default; pass --include-owner-tables to emit them. Stage-1 public deploys ship without these tables; the dashboard read helpers hard-assert NEXT_PUBLIC_OWNER_MODE=true as defense-in-depth. The legacy --show-channel-name flag is removed.
v212026-04-27Covering index idx_insight_snapshot_cat_date on insight_snapshot(insight_category, snapshot_date) — detectShapeChange and any future category-scoped snapshot lookup now use an index SEARCH instead of falling back to idx_snapshot_date and re-checking insight_category row by row.
v202026-04-27Covering index idx_traffic_daily_source_date on traffic_daily(traffic_source_id, date) — homepage Browse-impression activity card now uses an index SEARCH instead of a full traffic_daily scan (~91k rows on a 200-video fixture).
v192026-04-27Foreign-key constraints on junction tables (video_daily, traffic_daily, video_era, summary_video, summary_era, summary_traffic, summary_channel_traffic, summary_traffic_daily, summary_traffic_top_videos, source_activation_log, summary_traffic_daily_history, summary_channel_source_share_daily, summary_video_country, summary_video_subscribed) — orphan video_id / traffic_source_id rows now raise SQLite errors when a writer enables PRAGMA foreign_keys. The dashboard read-only client opens the DB with PRAGMA foreign_keys = ON for belt-and-suspenders enforcement.
v182026-04-27pre_publish_stub filter applied to source_activation_log and summary_traffic_daily_history build SELECTs — longitudinal storage tables now exclude pre-publish stub rows so trip events on /traffic/[sourceId] and the Spine no longer reflect zero-view pre-publish dates.
v172026-04-27Impression-weighted CTR formula (SUM(impressions * ctr) / SUM(impressions)) on summary_video.day3_ctr / day7_ctr and summary_traffic_daily_history.weighted_ctr — replaces the views/impressions shortcut so cohort and per-day source CTR match the canonical weighted_ctr definition.
v162026-04-26date-only indexes on video_daily and traffic_daily so date-range scans no longer fall back to a full table scan.
v152026-04-26source_diversity_score + top_video_share_7d + top3_video_share_7d + top_video_id_7d + videos_with_zero_views_7d + quiet_inventory_pct columns on summary_channel_daily.
v142026-04-26Rolling 7d / 28d columns (views, watch_min, impressions, weighted_ctr) on summary_channel_daily; new summary_channel_source_share_daily long-format table.
v132026-04-26ctr_quadrant column on summary_video (winner / clicked-not-watched / watched-not-clicked / quiet / noise) with median_video_ctr + median_video_retention on summary_channel as the crosshairs.
v122026-04-26imp_to_view_ratio + avd_drift_sec + source_hhi on summary_video; imp_to_view_ratio_median on summary_channel.
v112026-04-26Day-3 / Day-7 cohort columns + velocity (days_to_first_*) + peak_views_day/value + late_growth_pct on summary_video.
v102026-04-26ctr_confidence tier (noise / low / medium / high) + unattributed_impressions on video_daily, traffic_daily and summary_video.
v92026-04-26pre_publish_stub flag on video_daily / traffic_daily — flags rows where date < published_at so summary aggregations can exclude them.
v82026-04-26Owner-mode F-tier tables: summary_video_subscribed, summary_channel_country_trailing30, summary_channel_device.
v72026-04-25weighted_ctr column on summary_traffic (per-(video, source) impression-weighted CTR).
v62026-04-25Per-(video, country) views-only aggregation in summary_video_country.
v52026-04-25Longitudinal storage tables (source_activation_log, channel_phase_log, summary_traffic_daily_history) so daily history survives rebuilds.

The data is anonymized by default — video titles and IDs are stripped. Privacy-preserving but fully analyzable.

The dataset

Download .db

SQLite, anonymized by default. Open it in any SQL client, or upload it to Claude / ChatGPT / Gemini and ask the prompts on the left.

Download .db Download .csv

Tables: 35
Rows: 2,461
Identity: Anonymized

Largest tables, by row count:

insight_snapshot568
summary_video_country_daily256
traffic_daily168
summary_channel_country_daily153
video_daily138

What's in the .db

Tables your AI (or SQL client) can query directly.

summary_channel1 row
One row: channel-level totals, averages, imp_to_view_ratio_median, median_video_ctr and median_video_retention.
summary_video21 rows
Per-video lifetime stats: engagement_rate, subs_per_1k_views, watch_min_per_impression, day1/day3/day7 cohort columns, peak_views_day, late_growth_pct, imp_to_view_ratio, avd_drift_sec, source_hhi, ctr_quadrant, primary_traffic_source, days_to_first_*.
video_daily138 rows
Daily metrics per video with days_since_publish for cross-video comparison. Carries pre_publish_stub and ctr_confidence.
traffic_daily168 rows
Views, watch time, impressions and unattributed_impressions by traffic source per video per day. Carries pre_publish_stub and ctr_confidence.
summary_traffic56 rows
Lifetime traffic source breakdown per video, including impression-weighted CTR.
summary_channel_traffic9 rows
Per traffic source: total views, impressions, weighted CTR, % of channel, watch time per view.
summary_channel_daily20 rows
Daily channel totals plus rolling 7d/28d, source_diversity_score, top_video_share_7d, top3_video_share_7d, top_video_id_7d, videos_with_zero_views_7d, quiet_inventory_pct.
summary_channel_source_share_daily88 rows
Long-format trailing-7d source share per (date, traffic_source_id). Drives source-diversity and stacked-area visualisations.
summary_traffic_daily67 rows
Per source per day: views, impressions, weighted CTR, active video count.
summary_traffic_top_videos56 rows
Top 20 videos per traffic source.
summary_era42 rows
Performance during each title/thumbnail version.
channel_daily22 rows
Daily subscriber/view snapshots from YouTube Data API.
channel_timeline134 rows
Auto-generated growth story events: publishes, milestones, algorithmic signals.
data_status5 rows
What data is available from what date (helps interpret reporting lag).

What this data doesn't tell us

Every analytics dataset has gaps. Here's what isn't in this one.

Revenue and monetization
Revenue, CPM, and ad-breakdown aren't in the data. This channel isn't yet in the YouTube Partner Program, and those figures only arrive for channels that are.
Viewer demographics
Aggregate country, age, and gender breakdowns are empty. YouTube only returns them once audience size clears a minimum threshold; this channel is below that today.
Individual comments
The text of viewer comments isn't in the data. Counts are recorded; the anonymization pass keeps individual comments out of a public dataset.
Second-by-second retention
Per-second audience-retention curves aren't in this dataset. Only average retention percentage per video-day is stored; the pipeline doesn't fetch the full curve yet.
Hour-of-day patterns
Hour-of-day granularity isn't in the schema. Only daily aggregates are tracked; YouTube exposes hourly separately, and the pipeline doesn't pull it.
Search terms
The words viewers typed to find a video aren't in this dataset. YouTube exposes them through a Reporting API report the pipeline hasn't added.
External referrer URLs
The specific URLs that sent External traffic aren't surfaced — only the aggregate “External” category. The pipeline doesn't yet pull the per-referrer breakdown.