Is Substack exaggerating its network effects?
The data tells the story writers want to hear... but is it true?
If you are a user of the internet (you are), then you probably know that there is a vast invisible apparatus keeping tabs on your movements online. Web trackers exist to answer specific questions (or potential future questions) that business owners or content producers might have about their users.
The data point commonly known as the “referral source” contains the answer to one of the most basic questions: how did this person find me?
If you’re a writer on Substack, then the platform offers some basic information out of the box about where your readers come from. By default, Substack emails you every single time someone new subscribes to your publication, and includes in that email an estimated source: whether it’s Twitter, Google, someone’s personal blog, or Substack itself.
I had always assumed that when Substack lists itself (more specifically, “Substack Network,” “Substack App,” or “substack.com”), rather than a specific Substack publication, it means that Substack’s own algorithms — its suggested publications, or newsfeed — are delivering me readers.
But my newsletter has a small enough audience that I often get clear external signals of where my new subscribers are coming from. For example, I may notice an immediate bump in readers after a piece that I wrote gains traction on Twitter. And the strange thing is, the external indicators don’t always line up with what Substack is telling me. This is especially true when Substack attributes the referral source to itself.
Until recently, this was only a hunch. But this week, I emailed some of the few hundred people who subscribed to my newsletter in the last 30 days, asking them to describe their path to subscribing in their own words. I then cross-referenced those responses with Substack’s own reporting. Among the 32 respondents whose estimated source Substack listed as itself (about half of all respondents), the results were decidedly incongruent. Only 5 credited the platform with helping them find me. The other 27, or 84% of this group, attributed their subscription to something or someone other than Substack.1
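For anyone who wants to repeat the exercise with their own subscribers, here is a minimal sketch of the tally. It assumes each reply has been hand-coded as either "substack" or "other" (the labels and the function name are my own placeholders), and it only looks at respondents whose Substack-reported source was Substack itself. The list is rebuilt from the two counts above, not from the individual replies.

```python
from collections import Counter


def attribution_summary(self_reported):
    """Return {label: (count, rounded percent)} for hand-coded survey replies.

    `self_reported` holds one label per respondent whose Substack-reported
    source was Substack itself: "substack" if they credited the platform,
    anything else if they credited a person, post, or outside site.
    """
    counts = Counter(
        "substack" if label == "substack" else "other" for label in self_reported
    )
    total = len(self_reported)
    return {
        key: (counts[key], round(100 * counts[key] / total))
        for key in ("substack", "other")
    }


# Placeholder data reconstructed from the counts above: 32 respondents, 5 crediting Substack.
replies = ["substack"] * 5 + ["other"] * 27
summary = attribution_summary(replies)
print(summary)  # {'substack': (5, 16), 'other': (27, 84)}
```

With a sample this small, a single reclassified reply moves the percentage by about three points, so treat the result as a rough signal.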
Some examples: many readers read a piece of mine that was published in One Thing, and subscribed after clicking through the link in my bio.
The author of the excellent Links I Would Gchat You If We Were Friends wrote about Escape the Algorithm in her list of favorite newsletters (it isn’t bragging if it’s for journalism!), and separately shared one of my pieces in a weekly link roundup. Another reader found me because they read Anil Dash’s piece in Rolling Stone that cited me as an example of the human, personal, creative internet (also journalism!) and then searched for my name and signed up.
Many (but not all) of these are Substack publications. Is it technically a lie that someone who subscribes on the written recommendation of another newsletter writer came from the Substack Network? No. But is it meaningfully true?
Substack regularly and boldly touts its network effects. Its landing page boasts to potential writers that “more than 40% of all new free subscriptions and around 20% of paid subscriptions to Substacks come from within our network.” The implication here is that you should join Substack because something about Substack’s technology will unlock more growth for you than other platforms. But the important thing to note is this: most of these readers would be finding me even if I took my newsletter somewhere else.2
Now, any product analytics are by definition an oversimplification. And when traffic is coming from an external website, identifying referral sources is a notoriously difficult science.
But when the call is coming from inside the house, as it were, sources are quite easy to track. At a technology company with the size and funding of Substack, not keeping track of where traffic is moving within Substack would amount to analytics malpractice. I can assure you that Substack’s internal systems know when a reader clicked from a specific post before subscribing to my publication. In some cases, for reasons I can’t explain, Substack does properly attribute a source to the name of a Substack publication. So Substack is making a choice to elide the specific source in favor of a generic description that conveniently reflects well on its service.
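To illustrate how little work the specific label takes, here is a rough sketch that turns a referrer URL into a source label. It is not Substack's actual attribution logic, which I have no visibility into. The function name and label wording are hypothetical, and it assumes publications live on `<name>.substack.com` subdomains, so custom domains are not handled.

```python
from urllib.parse import urlparse

SUBSTACK_SUFFIX = ".substack.com"


def label_referrer(referrer_url):
    """Map a referrer URL to a human-readable source label (illustrative only)."""
    host = (urlparse(referrer_url).hostname or "").lower()
    if host in ("substack.com", "www.substack.com"):
        # The platform's own pages: no specific publication to name.
        return "substack.com"
    if host.endswith(SUBSTACK_SUFFIX):
        # A publication subdomain: name it instead of a generic "network" label.
        publication = host[: -len(SUBSTACK_SUFFIX)]
        return f"Substack publication: {publication}"
    # Anything else is an external site; an empty host means no referrer was sent.
    return host or "unknown"


print(label_referrer("https://example-newsletter.substack.com/p/some-post"))
```

The subdomain `example-newsletter` is a made-up placeholder. A platform has far richer internal signals than a referrer header, which only strengthens the point that the specific source is knowable.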
The overall effect of looking at my Substack dashboard is “wow, Substack is really getting me an audience! If I leave Substack, half of my subscription channel will dry up.” It’s clear that this is the impression other writers, the media, and investors are getting as well, because they praise Substack about it all the time.
Again, it isn’t technically a lie to say that some of this traffic is coming from Substack, but it’s a bit like you telling a friend about my newsletter over FaceTime, and FaceTime taking the credit for growing my readership. The logical conclusion of my experiment is that my audience is built not on the power of Substack, but on the power of people.3
Of course, I am just one writer and this isn’t a rigorous scientific analysis. But if you’re afraid that you’ll stunt your growth if you leave Substack for a platform with better service or fewer Nazis, I encourage you not to take Substack’s word for it.
This data has been updated since original publishing to reflect a correction from a reader in the comments.
The only arguments I could see for crediting Substack here are:
They’ve created a platform that people trust, making people more likely to sign up for my newsletter when they see that it is a Substack publication (I doubt this has a huge effect)
If you are already subscribed to a Substack publication on a given device, the email address form field will be pre-filled, making it easier for you to subscribe. I do believe this would have a small effect, but Substack seems to be making an even stronger argument about the source for over 50% of my readers, as evidenced by the Network dashboard screenshot shared later in this piece.
Substack does offer some technology features that I believe can take some credit for subscriptions. For example, when you subscribe to one publication, Substack recommends other publications you might enjoy. For the purposes of my data, I counted these as accurate attribution of the referral source.
But even in this case, it would be easy and less misleading to be more specific! Which feature? Which publication?
I suppose you could probably argue that at some point the very existence of the Substack ecosystem caused a chain of events leading these subscribers to my door, but you could say the same about the wing flap of a Paleolithic butterfly.
Such a fascinating article! re: your first footnote, I was under the impression the recommended publications are ones that that specific owner of that Substack recommends, not recommendations algorithmically generated by Substack (source: a friend subscribed to my Substack in front of me and I saw the publications I recommended pop up in the drawer after she did so). If so, that just goes to advance your argument in a way (or it could be twisted in Substack’s favor — if they hadn’t been on Substack, they couldn’t have been recommended — implying other platforms don’t have this recommendation feature, which may be true).
As someone in the tech/startup world, 'gracefully' presenting analytics to paint a certain story is so absurdly common I'd be surprised if they didn't count everything they possibly could as 'from Substack'.
It's not just tech analytics - science, politics, books, all pretty typically use data to lie, which is so dangerous because data is seen as hard evidence.