How Recipeas chooses recipes

Recipeas crawls over 2.5 million recipes from chefs and food bloggers around the web. Not all of them are equally good. This page explains, in plain language and with live numbers, how we decide which to surface in browse and which to keep behind search.

Recipes accepted into the catalog: 1,880,582
Accept rate: 35% (3,526,233 discarded at ingest)
Recipes scored so far (rolling pass): 762,271
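The headline numbers can be cross-checked with a few lines of arithmetic. This is a small sketch that uses only the counts published above; the variable names are ours and purely illustrative.

```python
# Counts copied from the figures published on this page.
accepted = 1_880_582
discarded = 3_526_233
scored = 762_271

# Everything that reached the ingest check, accepted or not.
scraped = accepted + discarded

# Share of scraped recipes that were accepted into the catalog.
accept_rate = accepted / scraped

# Share of the accepted catalog that the rolling pass has scored so far.
scored_share = scored / accepted

print(f"{scraped:,} scraped, {accept_rate:.1%} accepted")
print(f"{scored_share:.1%} of accepted recipes scored so far")
```

The accept rate rounds to the 35% shown above.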

Two questions every recipe has to answer

Before a recipe even enters the catalog, our crawler checks: Does it have a real photo? Does it have at least 3 ingredients and 2 instructions? Does the title look like a recipe and not a blog header? About 35% of what we scrape clears that bar; the rest is discarded immediately.
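As a rough illustration, the ingest checks described above could be sketched like this. The `ScrapedRecipe` type and its field names are hypothetical, and the photo and title checks are reduced to boolean flags because this page does not say how they are detected.

```python
from dataclasses import dataclass, field


@dataclass
class ScrapedRecipe:
    # Hypothetical shape of a scraped recipe; not the real crawler schema.
    title: str
    ingredients: list[str] = field(default_factory=list)
    instructions: list[str] = field(default_factory=list)
    has_real_photo: bool = False           # assumed to be decided upstream
    title_looks_like_recipe: bool = True   # assumed to be decided upstream


def passes_ingest(recipe: ScrapedRecipe) -> bool:
    """Return True only if every ingest check from the text passes."""
    return (
        recipe.has_real_photo
        and len(recipe.ingredients) >= 3
        and len(recipe.instructions) >= 2
        and recipe.title_looks_like_recipe
    )


example = ScrapedRecipe(
    title="Lemon Garlic Chicken",
    ingredients=["chicken thighs", "lemon", "garlic"],
    instructions=["Marinate the chicken.", "Roast until done."],
    has_real_photo=True,
)
print(passes_ingest(example))
```

A recipe that fails any one check is rejected, which matches the all-or-nothing wording above.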

The accepted recipes then get a discovery score from 0 to 100, which decides whether they show up in the browse feed. Recipes with low scores stay searchable — you can always find them by name or ingredient — they just don't lead the browse feed.

How the discovery score works

It's a small, transparent formula. We're not trying to be clever; we're trying to be honest about what makes a recipe worth recommending.

discovery_score = 50 (baseline)
  + up to 25 if the host is a famous American food brand
  + up to 15 for how many ingredients we successfully canonicalized
  + up to 10 for a clean English title
  − up to 25 for photo problems (placeholder, logo, tiny thumb, broken)
  − up to 10 for ingredient lines that didn't parse cleanly
  − up to 10 for a corrupted (mojibake) title

show_in_feed = discovery_score ≥ 55
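The formula can be written out as a short sketch. One assumption is ours: the page gives only the cap for each term ("up to"), not how partial credit is awarded, so each signal here is a fraction between 0 and 1 that is scaled by its cap. The `Signals` type and its field names are hypothetical.

```python
from dataclasses import dataclass

BASELINE = 50
FEED_THRESHOLD = 55


@dataclass
class Signals:
    # Each value is a fraction in [0, 1]; how it is measured is an assumption.
    host_brand: float = 0.0        # bonus, capped at 25
    canonicalized: float = 0.0     # bonus, capped at 15
    clean_title: float = 0.0       # bonus, capped at 10
    photo_problems: float = 0.0    # penalty, capped at 25
    unparsed_lines: float = 0.0    # penalty, capped at 10
    mojibake_title: float = 0.0    # penalty, capped at 10


def _clamp01(x: float) -> float:
    """Keep a signal inside [0, 1] so no term can exceed its cap."""
    return max(0.0, min(1.0, x))


def discovery_score(s: Signals) -> float:
    score = float(BASELINE)
    score += 25 * _clamp01(s.host_brand)
    score += 15 * _clamp01(s.canonicalized)
    score += 10 * _clamp01(s.clean_title)
    score -= 25 * _clamp01(s.photo_problems)
    score -= 10 * _clamp01(s.unparsed_lines)
    score -= 10 * _clamp01(s.mojibake_title)
    return score


def show_in_feed(s: Signals) -> bool:
    return discovery_score(s) >= FEED_THRESHOLD


print(discovery_score(Signals()), show_in_feed(Signals()))
```

With these caps, a recipe with no signals at all sits at the baseline of 50, just under the feed threshold of 55, so it needs at least some positive signal to be shown. The caps as listed give a range of 5 (all penalties, no bonuses) to 100 (all bonuses, no penalties).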

A few specific notes about that formula:

Is it actually working? The data.

The whole point of the score is to put the bad-looking recipes in the hidden bucket and the good-looking ones in the shown bucket. The fastest test is to look at known quality signals — corrupted titles, ALL-CAPS shouting, broken image URLs — and compare the rates.

                              Hidden from browse    Shown in browse
Recipes                       616,647               145,624
Corrupted (mojibake) titles   0.6%                  0.0%
ALL-CAPS titles               8.9%                  0.5%
Suspicious image URLs         0.7%                  0.0%

Hidden recipes are still searchable but kept out of the feed; shown recipes lead the discovery feed.

If the score weren't doing anything useful, those percentages would be similar between the two columns. Today the gap is roughly 59× on mojibake — we're correctly funneling broken text away from the feed.
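To make the comparison concrete, here is a small sketch that turns a pair of bucket rates into a "times more common" gap. It uses only the rounded percentages published on this page; the function name `rate_gap` is ours.

```python
from typing import Optional


def rate_gap(hidden_pct: float, shown_pct: float) -> Optional[float]:
    """How many times more common a problem is in the hidden bucket.

    Returns None when the shown rate is zero, because the ratio is
    undefined at that point.
    """
    if shown_pct == 0:
        return None
    return hidden_pct / shown_pct


# (hidden %, shown %) pairs as published above, already rounded to one decimal.
published = {
    "Corrupted (mojibake) titles": (0.6, 0.0),
    "ALL-CAPS titles": (8.9, 0.5),
    "Suspicious image URLs": (0.7, 0.0),
}

for signal, (hidden, shown) in published.items():
    gap = rate_gap(hidden, shown)
    label = f"{gap:.1f}x" if gap is not None else "undefined at this rounding"
    print(f"{signal}: {label}")
```

From the rounded figures, only the ALL-CAPS gap can be computed (about 17.8×). A shown rate that rounds to 0.0% gives no usable ratio, so a figure like the 59× quoted for mojibake would need the underlying unrounded counts, which this page does not list.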

Which hosts lead each bucket

Top hosts in browse
www.allrecipes.com             8,913
www.food.com                   8,016
www.americastestkitchen.com    2,628
www.justapinch.com             1,683
jamiegeller.com                1,669
sunset.com                     1,668
www.greatbritishchefs.com      1,648
tasty.co                       1,567
taste.co.za                    1,544
www.bbcgoodfood.com            1,537

Top hosts in hidden
www.povarenok.ru               47,279
eatsmarter.de                  4,178
varecha.pravda.sk              2,742
migusto.migros.ch              2,643
pt.petitchef.com               2,353
www.kochbar.de                 2,147
dobruchut.aktuality.sk         2,092
www.cuisinelolo.fr             1,968
www.kotikokki.net              1,805
www.lecremedelacrumb.com       1,801

Languages in the catalog

We accept recipes from chefs and food bloggers around the world. The app auto-translates titles and ingredient lines into English by default; a one-tap toggle in the recipe view flips back to the original.

Language     Accepted recipes
en           1,342,358
(unknown)    418,692
ru           118,381
es           274
de           133
id           103
pt           73
zh-CN        72
it           54
hi-Latn      45

What we're still working on