There are lots of rules and patterns in this implementation, but it’s worth bearing in mind that you can normally get a clean URL by looking at the <link rel=canonical> element.
Sites put this in because they want search engines to index a single clean URL rather than many tracking URLs, so it’s pretty reliable.
That works if you want a clean URL to share with others. But if you have instead been given a link, then not using built-in patterns means you would first need to fetch the page with the tracking parameters attached in order to reach the canonical URL.
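
For illustration, here is a minimal Python sketch of the canonical-link lookup, using only the standard library. It is not taken from the implementation being discussed; the names `CanonicalFinder` and `canonical_url` are made up for this example, and it assumes you already have the page's HTML in hand (which, as noted, means the request with the tracking parameters has already been made).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Hypothetical helper: records the href of the first <link rel=canonical>."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        # rel is a space-separated, case-insensitive list of tokens.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.href = attr["href"]


def canonical_url(html, page_url):
    """Return the canonical URL declared in html, or None if there is none.

    page_url is the URL the HTML was fetched from; it is only used to
    resolve a relative href.
    """
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    return urljoin(page_url, finder.href)


page = '<html><head><link rel=canonical href="https://example.com/article"></head></html>'
print(canonical_url(page, "https://example.com/article?utm_source=newsletter"))
# prints: https://example.com/article
```

If the page has no canonical link, `canonical_url` returns None, and you are back to needing pattern-based stripping anyway.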