How it works

News Pulse combines signals from search, news, TV, and prediction markets to surface what's actually getting traction — across sources, not just one platform. Here's exactly how.

What we track

Every hour, News Pulse fetches live data from ten sources. Each measures a different kind of attention:

Google News — Top stories ranked by Google's algorithm. Reflects what news outlets are publishing and Google is surfacing — counted as algorithmic amplification, not active seeking.
Google Trends — Absolute search traffic for topics. Reflects what people are actively looking up — a fast signal for genuine public interest.
YouTube — Top trending videos in the News & Politics category, U.S. region. Scored by views gained in the last 24 hours — not total lifetime views — so viral-then-stale videos drop out naturally.
Wikipedia — Top-viewed articles from yesterday. When news breaks, people look it up immediately. Wikipedia traffic is one of the fastest signals that a story has genuine public interest.
Polymarket — Prediction market volume on news-related events. Money bet in the last 24 hours reflects how much uncertainty and interest a story is generating.
Kalshi — Regulated prediction market. Like Polymarket but federally licensed — volume reflects real financial conviction about how events will unfold.
NPR News — AP Wire and NPR editorial coverage. Represents what professional journalists have determined is worth covering.
NewsAPI — Coverage from hundreds of tracked news outlets — breadth of editorial coverage across the press.
The Guardian — Guardian editorial coverage. Adds international perspective to the editorial signal.
TV (CNN / Fox / MSNBC) — Broadcast news coverage. Contributes to editorial signal and reach data. TV headlines are too generic to use as topic labels, so TV never provides the headline — just the coverage signal.

Reddit, X/Twitter, and Facebook are not available. Reddit blocked all programmatic access in 2023, the X API starts at $100+/month, and Facebook's public data API was shut down in 2018. These are the biggest blind spots.

How stories are scored

Three layers combine to produce each topic's score.

Layer 1 — Traction score

The base score combines two things: what the public is doing, and what journalists are covering.

Engagement — each engagement source (Google Trends, YouTube, Google News, Wikipedia, Polymarket, Kalshi) scores a topic 0–100 based on signal strength. These scores are summed — not averaged. Two strongly engaged sources produce the highest possible base score.

Editorial reach — NPR, NewsAPI, Guardian, and TV contribute a reach-based base score at 40% weight. This means an important story with no public engagement yet still surfaces — just lower than stories with both signals present.
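As a rough illustration of the base score described above, here is a minimal Python sketch. The function name, the source keys, and the 0-100 range for editorial reach are assumptions made for the example, and the sum is left uncapped because the text does not state a ceiling. It is not News Pulse's actual code.

```python
# Hypothetical sketch of the Layer 1 base score. Names and input ranges are assumed.
EDITORIAL_WEIGHT = 0.40  # the "40% weight" mentioned in the text


def base_traction_score(engagement_scores, editorial_reach):
    """Sum per-source engagement scores (each 0-100) and add weighted editorial reach."""
    engagement_total = sum(engagement_scores.values())  # summed, not averaged
    return engagement_total + EDITORIAL_WEIGHT * editorial_reach


# A story with two engaged sources plus coverage, and a story with coverage only.
both_signals = base_traction_score({"google_trends": 80, "wikipedia": 70}, editorial_reach=50)
editorial_only = base_traction_score({}, editorial_reach=50)
print(both_signals, editorial_only)
```

The editorial-only story still gets a nonzero score, but it lands well below the story that has both kinds of signal.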

Google Trends, Wikipedia, and YouTube all use the same log scale: 100K views = ~52, 1M views = ~76, 10M views = 100. Because the scale is fixed rather than relative to the day's top item, signal strength is comparable across sources regardless of what else is trending that day.
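The three reference points above are 24 points apart per tenfold increase in views, so one formula that fits them is score = 24 * log10(views) - 68, clamped to 0-100. The sketch below uses that fit. The constants are inferred from the published numbers, not taken from the real implementation, which may round or floor differently.

```python
import math

# Constants inferred from the three reference points in the text (an assumption).
POINTS_PER_DECADE = 24.0  # 76 - 52 and 100 - 76
OFFSET = 68.0             # chosen so that 10M views maps to exactly 100


def log_scale_score(views):
    """Map a raw view count to a 0-100 score on a fixed log scale."""
    if views <= 0:
        return 0.0
    raw = POINTS_PER_DECADE * math.log10(views) - OFFSET
    return max(0.0, min(100.0, raw))  # clamp to the 0-100 range


for views in (100_000, 1_000_000, 10_000_000):
    print(views, round(log_scale_score(views)))
```

Under this fit, anything below roughly 700 views scores 0 and anything above 10M is capped at 100.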

Layer 2 — Synergy multiplier

When engagement AND editorial are both present, a synergy multiplier of up to +50% is applied — proportional to how strong the engagement already is.

Strong engagement + editorial → up to +50% boost. The story is everywhere.
Weak engagement + editorial → minimal boost (~5%). Out there, but not resonating yet.
Editorial only → no multiplier, but editorial reach still contributes a base score.
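The three cases above can be sketched as a single function. This is an illustration only: FULL_STRENGTH, the engagement total at which the boost reaches its maximum, is a hypothetical value picked so that a weak total of 20 gives roughly the 5% boost mentioned above. The text does not say how engagement strength is normalized.

```python
MAX_SYNERGY_BOOST = 0.50  # "up to +50%" from the text
FULL_STRENGTH = 200.0     # hypothetical engagement total that earns the full boost


def synergy_multiplier(engagement_total, has_editorial):
    """Return a multiplier between 1.0 and 1.5, proportional to engagement strength."""
    if engagement_total <= 0 or not has_editorial:
        return 1.0  # the multiplier needs both signals to be present
    strength = min(engagement_total / FULL_STRENGTH, 1.0)
    return 1.0 + MAX_SYNERGY_BOOST * strength


print(synergy_multiplier(200, True))   # strong engagement + editorial
print(synergy_multiplier(20, True))    # weak engagement + editorial
print(synergy_multiplier(0, True))     # editorial only
```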

Layer 3 — Sustained presence

Topics that have appeared on previous days get a +10% boost for each previous day, capped at +40% from day five onward. This rewards stories with staying power over one-off spikes without letting older stories crowd out breaking news permanently.
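A small sketch of the boost schedule, assuming "day five" means the topic has already appeared on four earlier days (so day one, the first appearance, gets no boost). The function name is hypothetical, and percentages are kept as whole numbers to avoid floating-point noise.

```python
BOOST_PCT_PER_PRIOR_DAY = 10  # +10% per previous day
MAX_BOOST_PCT = 40            # cap, reached on day five under this reading


def sustained_boost_pct(prior_days):
    """Percent boost for a topic seen on `prior_days` earlier days."""
    return min(BOOST_PCT_PER_PRIOR_DAY * max(prior_days, 0), MAX_BOOST_PCT)


# Day 1 is the first appearance, so it has zero prior days.
for day in range(1, 7):
    print("day", day, "boost", sustained_boost_pct(day - 1), "%")
```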

The traction bar

Every topic card shows a thin bar broken into four colored segments. Each represents a type of attention, proportional to its weighted contribution to the score:

Active seeking (yellow) — Wikipedia traffic and Google Trends searches. People went looking for this.
Algorithmic amplification (red) — YouTube trending and Google News ranking. Platforms surfaced it to audiences.
Conviction (green) — Polymarket and Kalshi volume. People put money on an outcome.
Editorial (blue) — NPR, NewsAPI, Guardian, and TV coverage. Weighted at 40% to reflect its supporting role in the traction model.

A story that's all blue is covered by journalists with no public engagement yet. A story that's all yellow and red is viral — the public found it before (or without) the press.
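One way to picture how the bar is divided: add up each segment's weighted contribution and normalize so the four shares sum to 1. The sketch below assumes per-source scores on a 0-100 scale and an editorial reach value on the same scale. The function and key names are made up for the example.

```python
# Segment membership follows the four groups listed above. Key names are assumed.
SEGMENTS = {
    "active_seeking": ("wikipedia", "google_trends"),
    "algorithmic_amplification": ("youtube", "google_news"),
    "conviction": ("polymarket", "kalshi"),
}
EDITORIAL_WEIGHT = 0.40


def traction_bar(source_scores, editorial_reach):
    """Return each segment's share of the bar as a fraction between 0 and 1."""
    weighted = {
        name: sum(source_scores.get(source, 0.0) for source in sources)
        for name, sources in SEGMENTS.items()
    }
    weighted["editorial"] = EDITORIAL_WEIGHT * editorial_reach
    total = sum(weighted.values())
    if total == 0:
        return {name: 0.0 for name in weighted}  # nothing to draw
    return {name: value / total for name, value in weighted.items()}


all_blue = traction_bar({}, editorial_reach=50)
viral = traction_bar({"wikipedia": 60, "youtube": 40}, editorial_reach=0)
print(all_blue)
print(viral)
```

The first call produces an all-editorial bar, the second a bar split between active seeking and algorithmic amplification with no editorial segment.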

What ⚡ and 📺 mean

⚡ Trending in search/social — not on TV or news
Strong public signal with no editorial coverage yet. The public noticed something journalism hasn't caught up to.
📺 In the news — not in our engagement signals
Editors are covering it but the public isn't searching or watching. Often policy stories, foreign affairs, or institutional news. These never appear in the main feed — they're surfaced separately so you can see what journalism is prioritizing that the public isn't amplifying.
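The two badges can be read as a simple rule over the two signal types. The sketch below is a guess at that rule: STRONG_ENGAGEMENT is a hypothetical cutoff for "strong public signal", since the text gives no threshold.

```python
STRONG_ENGAGEMENT = 100.0  # hypothetical cutoff, not stated in the text


def badge(engagement_total, editorial_reach):
    """Return the badge for a topic, or None when neither badge applies."""
    if editorial_reach <= 0 and engagement_total >= STRONG_ENGAGEMENT:
        return "⚡"  # strong public signal, no editorial coverage yet
    if editorial_reach > 0 and engagement_total <= 0:
        return "📺"  # covered by editors, no public engagement
    return None


print(badge(150, 0), badge(0, 40), badge(150, 40))
```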

How topics are grouped

Headlines are clustered automatically using named entity recognition (NER) and semantic embeddings. The goal is to group different sources covering the same story — not just stories that mention the same word.

Each headline is cleaned first: outlet attributions ("— NBC News", "- Al Jazeera") are stripped so they don't interfere with entity detection.
spaCy NER extracts named entities. Headlines with a specific, non-ubiquitous primary entity (a person, organization, or named event) are grouped by that entity key.
Geographic entities (countries, states, cities) route to the embedding pool instead — places are story settings, not subjects. "Brazil" would otherwise group a helicopter crash with a soccer match.
Headlines whose primary entities are too common to distinguish stories ("Trump," "U.S.," "AI," "White House") also go to the embedding pool, where semantic similarity decides grouping.
In the embedding pool, each headline is encoded as a 384-dimensional vector using a sentence transformer model. Headlines about the same event cluster together by meaning.
A co-reference pass merges clusters that reference each other's entities — so "U.S. strikes Iran" and "Iran responds to U.S." correctly land in one cluster.
Claude Haiku names each cluster with a 1–4 word anchor label ("Iran Strikes," "SpaceX Acquisition," "Knicks").
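The cleaning, routing, and embedding-pool steps above can be sketched without any external libraries. This is an illustration under several assumptions: the attribution regex is a heuristic, entities and their labels are passed in by hand rather than extracted by spaCy (GPE and LOC are spaCy's labels for places), tiny 3-dimensional vectors stand in for the 384-dimensional embeddings, and the 0.8 similarity threshold and greedy grouping are placeholders rather than the real clustering method.

```python
import math
import re

# Trailing " - Outlet" attribution. Dash variants are written as Unicode escapes.
ATTRIBUTION = re.compile(r"^(.*\S)\s+[\u2014\u2013\-]\s+\S.*$")
GEO_LABELS = {"GPE", "LOC"}                          # places are settings, not subjects
UBIQUITOUS = {"trump", "u.s.", "ai", "white house"}  # examples named in the text


def clean_headline(headline):
    """Strip a trailing outlet attribution such as ' - Al Jazeera'."""
    match = ATTRIBUTION.match(headline)
    return match.group(1) if match else headline


def route(primary_entity, label):
    """Return an entity key, or 'embedding_pool' for places and ubiquitous entities."""
    if primary_entity is None or label in GEO_LABELS:
        return "embedding_pool"
    key = primary_entity.lower()
    if key in UBIQUITOUS:
        return "embedding_pool"
    return "entity:" + key


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_pool(vectors, threshold=0.8):
    """Greedy grouping: join the first cluster whose first member is similar enough."""
    clusters = []  # each cluster is a list of indices into `vectors`
    for index, vector in enumerate(vectors):
        for members in clusters:
            if cosine(vectors[members[0]], vector) >= threshold:
                members.append(index)
                break
        else:
            clusters.append([index])
    return clusters


print(clean_headline("Talks stall - Al Jazeera"))
print(route("Brazil", "GPE"), route("SpaceX", "ORG"))
print(cluster_pool([(1, 0, 0), (0.9, 0.1, 0), (0, 1, 0)]))
```

Note that the regex would also trim a headline that legitimately ends with a spaced dash and a clause, so a production version would likely check the trailing text against a list of known outlets.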

How often it updates

The feed refreshes every hour automatically. The timestamp in the header shows when data was last fetched. The feed is cached for up to 1 hour. If you use the Refresh button within 10 minutes of loading, it will show "Just updated."

Yesterday's feed and the full archive are accessible via the footer.

FAQ

What is News Pulse?

News Pulse is a feed that shows what's getting traction across search, news, YouTube, TV, and prediction markets — combined into one ranked list. Instead of one platform's algorithm, you get a cross-source view of what's actually breaking through.

Who is this for?

Anyone who suspects their feed isn't showing them the full picture — journalists, researchers, and curious people who want a signal that isn't shaped by one platform's incentives.

Where does the data come from?

Ten live sources: Google Trends, Google News, YouTube, Wikipedia, NPR News, NewsAPI, The Guardian, TV broadcast coverage (CNN, Fox, MSNBC), Polymarket, and Kalshi. Each is fetched hourly and weighted by signal type.

How often does it update?

Every hour. Topics that have appeared on multiple days get a ranking boost over single-day spikes, so the feed reflects sustained attention, not a momentary blip.

Is this real-time?

It's hourly. Data is fetched and ranked once per hour, so the feed reflects what's been getting traction over the last several hours — not the last five minutes. Use the Refresh button to see the latest fetch.

Why are Facebook, Reddit, and TikTok missing?

Meta shut down its public data API in 2018. Reddit has blocked all programmatic access. TikTok has no public trending API. These are the biggest blind spots in the feed.

Are there ads?

No ads, ever. Links out to other platforms may serve their own ads, but News Pulse itself is ad-free.