We bootstrapped our open source Google Analytics alternative to $500k ARR (plausible.io)
206 points by tacon on Nov 5, 2021 | 68 comments



>If you hear about Plausible these days, it likely comes from one of our 4,802 paying customers. People who use and enjoy using Plausible help us spread the word to even more people.

>We have a $0 paid advertising budget and we don’t have an affiliate program either. We pretty much ignore all the best marketing practices.

This is very impressive, but I am not sure it generalizes to the average bootstrapped startup.


> ignore all the best marketing practices

Not sure how they're missing this, but they are not ignoring best practices. Content and SEO are fundamental marketing practices. This post and their entire blog are content and SEO. I guess they think it sounds cool to say "we don't care about marketing", but they are marketing the same way most bootstrappers would tell you to.


> I am not sure it generalizes

I guess the marketing best practices they ignore wouldn't be best practices if it did :)

Still, it's nice to see that the model can work. As a happy customer, I'm pleased to see that the model is sustainable. When a product decides to stick to word-of-mouth alone, incentives are very aligned for them to make the user experience good enough to tell other people about (as I have).


That was my experience with boot-strapping too (B2C). Successful boot-strapped companies basically need to create customer love and word-of-mouth. We didn't spend anything on marketing until around a million annual revenue.


+1 this, we bootstrapped at Dataquest and word of mouth played a big role


I don't know what else they did, but smartystreets.com did a great job with word of mouth.

They would find forums with questions about address correction, and provide great answers that DID NOT require using their software.

Then just put their site in the profile or at the end of the post in a tactful way.

I started using them based on the expertise in these posts.


Marketing can help, but if people don’t love your product enough to tell others about it, then it’s going to be hard to get traction.


Have you heard of Windows?


Calling it a Google Analytics alternative is quite a stretch.

It doesn't track the user journey, only entry and exit pages, and it has no A/B testing and no heatmaps for clicks and scrolls.

If it doesn't provide any actionable intel, you don't call that analytics, just a counter.

Check Countly's intro video on what analytics is. Though the base is open source, most of its analytics features are behind the enterprise edition.

https://youtu.be/sQCUNSzfEW8

For an open source solution, you might want to look at PostHog instead. (But paid plans exist for some features.)


When I logged into Google Analytics, I really only wanted to see:

- # of visitors / sessions over time

- where those people were coming from

After that, most of GA's features were wasted on me. Plausible has the two things above, so for me it's a valid alternative.


> We do miss out on getting featured in the tech media. TechCrunch published a story about our growth once (thanks to the VCs who were sharing a list of fastest growing open source startup) but otherwise, we get no coverage that VC-funded startups get.

This isn't because you're bootstrapped, it's likely because you don't have a PR team pitching on your behalf. That's how stories get placed.

Congrats!



One of those happy customers here -- I've always shied away from having any analytics, because my websites typically set no cookies and make no third-party requests. I'd prefer to be privacy-preserving than to see how little traffic my site gets.

Now with Plausible I still have no cookies or third-party requests on my websites, but I get to see numbers too.


> No third-party requests on my websites

I assume you are using the self-hosted version?



But this still sends third-party requests, only from your server rather than from the user's browser. This is not a privacy-friendly technique, but the opposite of it.


My take is that there are two factors at play here:

* Third party requests from the browser are problematic in general for technical and privacy reasons: it's impossible for the browser to know whether the request is privacy-preserving or not. And they need to be enabled in tooling like CSP. It's much easier not to have any.

* Third-party requests from my server don't give me any more information than I had before. I could store my logs and process them, or I can engage a third-party to aggregate my data for me. Plausible have no way to know that the requests I'm sending aren't entirely fictitious, and I can sleep slightly more soundly knowing that there's one less moving part that I need to maintain myself.


I might be projecting, but this is another company doing well enabled by tech like Clickhouse. Tinybird is another. Timeseries and event metrics are just so snappy, and the API and product (Clickhouse, that is) are enabling small teams to do crazy cool new stuff.

(We're a Plausible and Clickhouse user at my company)


The co-founders met each other on Twitter:

https://microfounder.com/blog/cofounder-in-marketing

They are remote co-founders, if you will.


Woah, the real question is what on earth kind of "marketing" did Marko activate? Astronomical switch for them. Would love to read more about the tactics.

This image blows my mind. Basically from 0 to 50,000 uniques in a month?! Congrats

https://microfounder.com/storage/posts/originals/sjswkbxufvl...


He shared his marketing strategy here:

https://www.starterstory.com/privacy-firendly-web-analytics-...

The answers are content marketing and spreading the word in niche tech communities.


I have been using plausible on my small personal site for about a year now. When I emailed their support to ask a question I got a helpful response from one of the founders within an hour. Good experience overall, and I appreciate their efforts to protect the privacy of visitors while still providing useful data for site owners.


Same, had an issue last week (which turned out to be my fault) and I also had generous help from the founders, and they were cheerfully gracious when I apologised for wasting their time. I have recommended them to several clients.


Besides resources like Indie Hackers, I highly recommend everyone here read Small Giants (https://www.amazon.com/Small-Giants-Companies-Instead-10th-A...) which focuses on companies that have outsized impact without feeling the pressure to grow.

They profile companies that have 2 to 2,000 (or so?) people, but most focus on having an opinionated take (over growth for its own sake).


I have been looking at Plausible and competitors like Fathom and others. It seems like there is limited room for innovation and differentiation when you almost can't store/track any data.

How can you stay competitive long-term?


That is a nice journey guys. I just need x10 to reach your ARR. Currently doing an AMA on IH about my bootstrapping project which made ~$50k revenue in less than a year. https://www.indiehackers.com/post/bootstrapped-my-saas-to-50...


Plausible is also great if you’re an agency or a SaaS hosting pages for customers and want to add analytics to your offering

you can add their script tag to the head and then use the API to get the stats for every site you track

they also give you the possibility to embed their dashboard for each site via iframe, which made integrating their product into ours a frictionless and easy experience
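
For anyone curious what the API side of that could look like, here is a minimal sketch in Python. The endpoint path (`/api/v1/stats/aggregate`), the `site_id`, `period` and `metrics` parameters and the bearer-token header are my recollection of Plausible's Stats API, so treat them as assumptions and check the current docs. `build_stats_request` is a hypothetical helper name, and the sketch only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_stats_request(site_id, api_key, period="30d",
                        metrics=("visitors", "pageviews"),
                        base_url="https://plausible.io"):
    """Build (but do not send) a GET request for one site's aggregate stats.

    The path and parameter names are assumptions based on Plausible's
    Stats API and may differ from what your account or version expects.
    """
    query = urlencode({
        "site_id": site_id,            # the domain as registered in Plausible
        "period": period,              # e.g. "30d" for the last 30 days
        "metrics": ",".join(metrics),  # comma-separated metric names
    })
    return Request(
        f"{base_url}/api/v1/stats/aggregate?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


# One request per customer site you host; send each with urllib.request.urlopen.
req = build_stats_request("customer.example", "YOUR_API_KEY")
print(req.full_url)
```

If you self-host, `base_url` would point at your own instance instead.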


Yeah 100% best integration. It lets you embed a secret iframe of your dashboard, light or dark, with a background you pick. It looks almost native everywhere


I'm very happy for you! Still a happy paying customer. I've recently upped my plan. =) One thing, could you please stop the "too many redirects" problems? It's annoying.. -_-


Congrats, really!

It’s ridiculous that the tech press (TechCrunch etc.) is always focused on funded startups or funding instead of bootstrapped startups, which in my opinion are way more interesting!


This is such an inspiring story: building a business on an open source competitor to Google Analytics with a tiny team and no funding is one heck of an achievement.

I'm a very happy (paying) user of the product, too.

I LOVE that the script is less than 2KB of JavaScript, and that their privacy design is aligned with my values.

I find their dashboard UI solves my needs just fine, whereas I still can't find things easily in the GA interface despite using it for over fifteen years!


I'm a big fan of bootstrapped. It's always nice to keep things our own for as long as possible.

I like how you stated that you're not going to change much and continue with the same growth vs forcing results with an investor (without one is plausible of course). I'm always keen on what if's to forcing things with interesting outcomes, lol. Well done and congrats on the rising success.


>we only obsess about our customers, their needs and about removing Google Analytics from the web.

LOL, nice.


I am a bit perplexed at claiming an ARR (Annual Recurring Revenue) with not even a year of profits.

Is it customary to use extrapolations/projections to measure ARR in the startup/business world? Even projections based on less than a year of data?


The definition of ARR is MRR * 12.


Very common, yes.


Kudos to the team, and the market fit for this is more and more the fact that it's NOT Google, and therefore at least somewhat trustworthy. More power to you guys and here's hoping for 10x ARR in the next 12 months.


What's stopping someone from using your code and starting another cloud hosted competitor that's identical to you? Is that against your license?

(Different from an entity downloading your software and self-hosting which you say is allowed)


They have a blog post about the open source license they chose: GNU Affero General Public License V3 (AGPLv3).

https://plausible.io/blog/open-source-licenses

From their blog post:

> If you used AGPL-licensed code in your web service in the cloud, you are required to open source it. It basically prevents corporations that never had any intention to contribute to open source from profiting from the open source work.

> It explicitly prohibits corporations from parasitically competing with an open source project. They won’t be able to take the code, make changes to it and sell it as a competing product without contributing those changes back to the original project. [emphasis mine]

The blog post above was discussed heavily on Hacker News back in Oct 2020: https://news.ycombinator.com/item?id=24763734


> It explicitly prohibits corporations from parasitically competing with an open source project.

Many projects are moving from an AGPL-like license to a proprietary license just to prevent this though. MongoDB, ElasticSearch, and just yesterday, Apollo Federation 2.


That's not entirely correct. Elastic and Mongo weren't using AGPL in the first place, they had MIT/Apache type licenses. Generally we're seeing open core/source startups moving to AGPL or proprietary source available licenses (Grafana, Sentry to name a few more).


MongoDB was using AGPL before they created and switched to SSPL [1]. You're right about ElasticSearch, they were Apache-2.0 before creating and switching to the SSPL + Elastic License dual license [2].

[1] https://en.wikipedia.org/wiki/MongoDB#Licensing

[2] https://en.wikipedia.org/wiki/Elasticsearch


The AGPL has nothing to do with contributing back.


It seems like bug fixes and features are delayed by 0-6 months in the open source version compared to the cloud hosted one [1].

> We have a free as in beer Plausible Analytics Self-Hosted solution. It’s exactly the same product as our Cloud solution with a less frequent release schedule (think of it as a long term support release).

> Bug fixes and new features are released to the cloud version several times per week. Features are battle-tested in the cloud which allows us to fix any bugs before the general self-hosted release. Every six months we combine all the changes into a new self-hosted release.

[1]: https://github.com/plausible/analytics#can-plausible-analyti...


Long time Plausible user here. They are the “good guys” of analytics.


I feel like ARR is a vanity metric compared to MRR.

In the SaaS industry MRR is the most commonly used metric.

ARR = 12 × MRR

500k ARR ≈ 41.7k MRR

ARR is just used in the title because it's a bigger number and makes it look more impressive.
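
To make the conversion concrete, here is the arithmetic as a tiny Python sketch. The function names are made up for illustration, and it assumes the plain "ARR is MRR times 12" definition used in this thread.

```python
def arr_from_mrr(mrr: float) -> float:
    """Annualise a monthly recurring revenue figure."""
    return mrr * 12


def mrr_from_arr(arr: float) -> float:
    """Convert an annual recurring revenue figure back to a monthly one."""
    return arr / 12


# $500k ARR expressed as MRR, rounded to the nearest dollar.
print(round(mrr_from_arr(500_000)))  # 41667
```

So it is the same number either way, just quoted on a different time scale.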


"To anonymize these datapoints, we run them through a hash function with a rotating salt.

`hash(daily_salt + website_domain + ip_address + user_agent)`"

Isn't this just PII with extra steps? OK it's at least better than the traditional approach. Keep in mind though that anonymizing is also a use of personal data in itself and requires a legal basis. https://www.insideprivacy.com/data-privacy/german-federal-co...
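
For reference, the quoted scheme could be sketched in Python roughly like this. This is not Plausible's actual code: SHA-256, the 16-byte salt, the in-memory salt store, the null-byte separators and the names `daily_salt` and `visitor_id` are all my assumptions, chosen only to illustrate the idea of a hash with a salt that rotates daily.

```python
import hashlib
import secrets
from datetime import date

# Hypothetical in-memory salt store holding only the current day's salt.
_salts: dict[date, bytes] = {}


def daily_salt(day: date) -> bytes:
    """Return the salt for `day`, creating it on first use.

    Older salts are discarded when a new day starts, so earlier
    identifiers can no longer be recomputed or linked to new ones.
    """
    if day not in _salts:
        _salts.clear()
        _salts[day] = secrets.token_bytes(16)
    return _salts[day]


def visitor_id(day: date, website_domain: str, ip_address: str,
               user_agent: str) -> str:
    """Hash the salt and the three datapoints into an opaque identifier."""
    h = hashlib.sha256()
    h.update(daily_salt(day))
    for part in (website_domain, ip_address, user_agent):
        h.update(b"\x00")  # separator so field boundaries stay unambiguous
        h.update(part.encode("utf-8"))
    return h.hexdigest()


today = date(2021, 11, 5)
same_a = visitor_id(today, "example.com", "203.0.113.7", "Mozilla/5.0")
same_b = visitor_id(today, "example.com", "203.0.113.7", "Mozilla/5.0")
print(same_a == same_b)  # True: stable within one day and one site
```

The same visitor gets the same identifier for a given site and day, and a different one the next day or on another site, which is what makes daily unique counts possible without a cookie.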


You seem to be confused about what PII is (which I think you meant from context). None of the listed pieces of information are PII, nor do they become PII in aggregation.

But even if they were, it would most likely be enough anyway if the salt isn't stored anywhere. An irreversible hash of data is enough anonymization.


You are right: PII. Sorry. So PII is information that enables someone to identify a person as a unique person. The homepage states: "This generates a random string of letters and numbers that is used to calculate unique visitor numbers for the day."

Where the definition of personal data is:

"(1) 'personal data' means any information relating to an identified or identifiable natural person ('data subject'); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;"

So if the listed information isn't PII (an IP address however is PII) then it would become PII if you can identify a unique visitor with it.

Am I wrong here? It sounds to me that this hash fits the definition of Art. 4


This is equivalent to replacing the IP address with a pseudonym that is rotated daily for each IP address.

Privacy improvements using crypto are somewhat marketing, but here the numbers show that they have a really good product, an impressive revenue model and a good marketing message so I think that's what we should look at.

Technically, at the end of the day, they store utm_source, they store the IP address (just in an encoded form with a salt + day added to it).

-> So yeah, you can be tracked, but in theory you will appear under a pseudonymised hash of your IP+UA.


At least in Germany the court ruled that an IP address is not PII

A unique identifier which lets them track a user everywhere isn't either if there is no way to match this id to a real name etc

They will likely still need to disclose this tracking (ianal), but the identifier used to track the visits isn't PII


My last info is that the ECJ ruled that IP addresses are PII. And as I quoted above it doesn't matter if information can be matched to a real name but it matters if you can single out one person e.g. a unique visitor.


you quoted something which i think you misunderstood.

the natural person is the link to a real name. it only becomes PII if it's somehow possible to link a real name or similar to this identifier. If this is impossible, it will never be PII, even if it identifies a single individual.


This would be a massive misunderstanding on my part. I don't believe I did. For example: Recital 30 explicitly mentions IP addresses[0] and that they might be assigned to a person. Art 4 states that a person can be directly identified or indirectly identified. My understanding is that you don't need any direct information about a person if you can single them out with indirect information. "However, a name is not always necessary. Had you not known Robert’s name, you could have still identified him through his proximity and some combination of physical factors, like height and hair color."[1]

I don't think I am wrong here. But I am willing to admit when I am wrong, where is my mistake?

[0] https://gdpr-info.eu/recitals/no-30/

[1] https://gdpr.eu/eu-gdpr-personal-data/?cn-reloaded=1


Yes, a name is strictly speaking unnecessary, but you need to actually identify a real person uniquely in the real world.

it's not PII as long as it's impossible to link it back to a unique identification from the real world such as a name, social security number or similar.

so my previous blanket statement of IP addresses not being PII is slightly exaggerated, they can be, but they rarely are.

most people access the internet either with dynamic ip addresses or through a corporation's internet provider. people that have a static ip address on a private landline and without a NAT are rare, which is why generally speaking, IP addresses aren't PII.

it does become PII if the person in question has a static ip address and is the only person using this connection.


Would you say that your username is PII? I sure can't identify you in the real world, but I can identify you as a real person. I would consider it PII even if I know nothing else about you other than that I am talking to the same person.


No, my username is not PII as there is no lookup table to my real identity.

If it was, ycombinator would have to adhere to much more stringent regulation wrt them.


Ok so this is the point where we differ. I read Art 4 as follows in this case: You are a natural person who can be indirectly identified by referring to an online identifier (your username). This identifier is therefore personal data. HN also has your IP and a mail address. So they have a much stronger link. Maybe they also have information about the sites you visited and the comments you made. If one would read all of this, maybe this someone would have a good profile about you and could identify you.


A natural person is a legal term so it's really not a matter of opinion... Believe it or not, usernames only become PII if the user chooses to use their real name or another piece of information which uniquely identifies them in real life


Ok I guess that we can't reach an understanding on this point. I don't think GDPR needs the real life link, you think it does.

I can understand your point and it was nice to have this conversation with you. I can't see how to convince you of my view and I don't see how you could convince me of yours. But I will keep your arguments in mind and will look for some more information on it. Thank you!


I should've realized this earlier but I think I understand now the reason for the confusion.

GDPR handles all kinds of data related to a person, so yes: a username is personal data wrt GDPR

personally identifiable information (PII) on the other hand is only a subset of data handled by the general data protection regulation (GDPR)

This might clear that up https://techgdpr.com/blog/difference-between-pii-and-persona...


Yes! I used PII and Personal Data synonymously and didn't realise that PII is a subset of personal data. I am glad you took the time to correct me. Man I feel stupid now...


Plausible is really well done. I’m happy to pay for it.


This is awesome, but I had no idea what ARR was. Perhaps it's not ideal to have that acronym in the article's title.


Annual recurring revenue. It is a term frequently used by the indie hacker community in addition to MRR for monthly recurring revenue.


It's also used by publicly traded software companies. It's not an obscure term, and it's easy to look up IMHO


I realized its meaning once I read the article, but I don't think abbreviations should be used in article titles. Things like APA style guides state you shouldn't use abbreviations in titles, it should be accessible to non-experts.


It's okay if you don't know and it's definitely okay to ask!

They do tell you what it means in the second paragraph, though.



