Better anchor text distribution
improved
Related Searches is a Similar.ai product feature which helps improve internal linking for enterprise SEO.
Links, whether they be internal or external, are one of the most important signals to Google and other search engines. Links indicate which page is the best answer for a search intent.
Many different keywords can express the same search intent, so we cluster keywords accordingly. Total demand is the search volume of an intent.
We cluster keywords into sub-intents and show the total demand for each. One way to think about search intent is all the keywords for which a page might rank.
We look for sub-intents for which a page gets little traffic and which the page does not yet target. These opportunity intents can add incremental unbranded organic traffic.
When we link to a page, we generally use the main intent as the anchor text. Pages with more demand get more internal links. For a proportion of these links, we also use opportunity intents as anchor text.
We take the intent of a page's main keyword as the intent of the page. When we link, the anchor texts are distributed across the main keyword and the opportunity keywords.
The main intent is still the dominant way users think of a page, but this extra anchor text diversity increases the number of pages which rank for many higher-demand intents.
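As a rough illustration of this distribution, here is a minimal Python sketch. The function name, the `opportunity_share` parameter and its default value are assumptions made for the example; they are not taken from the Similar.ai product.

```python
from itertools import cycle, islice


def distribute_anchor_texts(main_intent, opportunity_intents, n_links,
                            opportunity_share=0.25):
    """Return one anchor text per internal link pointing at a page.

    Most links use the main intent. A fixed share (an assumed, tunable
    value) is spread round-robin across the opportunity intents.
    """
    n_opportunity = int(n_links * opportunity_share) if opportunity_intents else 0
    # The main intent stays dominant.
    anchors = [main_intent] * (n_links - n_opportunity)
    # The remaining links rotate through the opportunity intents.
    anchors += list(islice(cycle(opportunity_intents), n_opportunity))
    return anchors
```

With eight inlinks and a 25% share, six links would carry the main intent and two would carry opportunity intents.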
Expressing an intent as a page
new
Browse and search pages
Category pages for large sites are often split into browse and search pages (understand the difference between browse and search pages).
Search intent
An intent makes explicit all the implicit assumptions a user typically has about a set of queries and all the implicit expectations about the results. We express an intent as entities in our knowledge graph. For any search, browse or other category page, the Similar.ai platform understands the full intent.
Expressing an intent as a page
We've released new functionality that translates an intent into the site's structure and attributes, and uses a free-text keyword for whatever remains. This expresses an intent as a page on a site in the most idiomatic way possible.
Why is expressing an intent as a page useful?
Expressing an intent as a page is powerful for three use cases:
  • New pages are created at the right place in the site taxonomy and with the right refinement attributes
  • Existing search pages are redirected to the right place in the site taxonomy and with the right refinement attributes
  • Existing search pages which have the same intent as a browse page are redirected there
This means a site can have fewer category pages which cover more of its relevant search intents.
Examples
For instance,
  • seat panda 4x4 gets created as cars/uk/seat/panda/seat-panda-4x4
  • tamari enkellarsjes gets created as /kleding-dames/schoenen/q/tamari-enkellarsjes
  • audi a3 black edition in grey gets created as cars/uk/audi/a3/grey/audi+a3+black+edition+in+grey
  • canapés gris gets created as maison-meubles/canapes-salons/q/canapés+gris/
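To make the pattern in these examples concrete, here is a minimal Python sketch that joins taxonomy entities, refinement attributes and a free-text keyword into a path. The function and its parameters are hypothetical. Each site has its own URL conventions (separators, a `q/` segment, trailing slashes), so this reproduces only the two automotive examples and is not how the platform builds URLs.

```python
def intent_to_path(taxonomy, attributes, keyword, keyword_separator="+"):
    """Build a page path from the parts of an intent.

    taxonomy:   ordered taxonomy entities, e.g. ["cars", "uk", "audi", "a3"]
    attributes: refinement attributes, e.g. ["grey"]
    keyword:    free-text keyword used as the final segment
    """
    segments = list(taxonomy) + list(attributes)
    if keyword:
        # Join the keyword's words with the site's separator (assumed configurable).
        segments.append(keyword_separator.join(keyword.split()))
    return "/".join(segments)
```

For example, `intent_to_path(["cars", "uk", "audi", "a3"], ["grey"], "audi a3 black edition in grey")` yields the path shown in the third example.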
Content API
new
When you integrate with our content API, you supply either a path to a page or the entities in your taxonomy which lead there. The API returns the content for the page, ready to be rendered. This might be:
  • a list of FAQ questions and answers
  • the related searches anchor text and links
  • the breadcrumb anchor text and links
  • the title, H1 and description for the page.
An example call returning an error:
curl --location --request POST '<content-endpoint>' \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: secret-value' \
--data-raw '{
"keyword": "",
"location": "UK",
"category": "bars",
"attributes": {
"vehicle_make": "ford",
"vehicle_model": "focus"
}
}'
which would return
{"errors": [{"message": "category 'bars' is invalid", "code": "invalid-category"}]}
An example call with an OK response:
curl --location --request POST '<content-endpoint>' \
--header 'Content-Type: application/json' \
--header 'X-API-KEY: secret-value' \
--data-raw '{
"keyword": "",
"location": "UK",
"category": "cars",
"attributes": {
"vehicle_make": "ford",
"vehicle_model": "focus"
}
}'
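The same call could be prepared from Python along these lines. This is a sketch using only the standard library. The helper names are hypothetical, the endpoint is left as a parameter because it is not given here, and only the request fields and the error shape shown above are assumed.

```python
import json
import urllib.request


def build_content_request(endpoint, api_key, category, location,
                          attributes, keyword=""):
    """Prepare (but do not send) a POST request mirroring the curl examples."""
    body = {
        "keyword": keyword,
        "location": location,
        "category": category,
        "attributes": attributes,
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )


def error_messages(response_text):
    """Extract (code, message) pairs from an error response body."""
    payload = json.loads(response_text)
    return [(err["code"], err["message"]) for err in payload.get("errors", [])]
```

The prepared request can then be sent with `urllib.request.urlopen(request)`. How the API signals errors at the HTTP level is not stated here, so the caller should be ready to read the body in either case.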
Deduplication of pages by intent
improved
Deduplication is a key feature of Similar.ai. We list the clusters of pages which answer the same user need and, for each cluster of duplicates, choose the best page. When we integrate with our clients, they typically add a 301 redirect from each duplicate to its new canonical. In this way we get more 'juice from the squeeze': more traffic from fewer pages. Our goal is to have one page for each search intent. A search intent can be expressed by hundreds or thousands of keywords (check out our demo of categorising keywords into a search intent).
In this feature we made two updates:
  • We grouped pages with the same intent instead of listing which pages matched which keyword
  • We used our new machine learning classifiers to match pages and keywords to intents, and expressed these intents as entities in our knowledge graph.
There are also two big advantages:
  • We can find pages with completely different names, pages which miss out superfluous words, pages which use synonyms and pages with misspellings
  • Since a page often targets many keywords, it could previously appear under many keywords, even though it can only redirect to one canonical page. With grouping by intent, this can no longer happen.
For instance,
  • volkswagen mk1 golf for automotive in the UK:
image
  • chesterfield zetels for homeware in Belgium:
image
  • bmw x5 7-seater for automotive in the UK:
image
  • dames t-shirts for clothing in the Netherlands:
image
  • gucci riemen for clothing in the Netherlands:
image
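The grouping step can be pictured with a small Python sketch. It assumes each page already carries an intent (a tuple of knowledge-graph entities) and that the 'best' page is the one with the most traffic. Both the field names and that selection rule are assumptions made for the example, not the platform's actual criteria.

```python
from collections import defaultdict


def dedupe_by_intent(pages):
    """Map each duplicate URL to the canonical URL of its intent cluster.

    pages: list of dicts with 'url', 'intent' (a hashable tuple of
    entities) and 'traffic'. Each page has exactly one intent, so it
    ends up in exactly one cluster.
    """
    clusters = defaultdict(list)
    for page in pages:
        clusters[page["intent"]].append(page)

    redirects = {}
    for cluster in clusters.values():
        # Assumed rule: the page with the most traffic becomes canonical.
        canonical = max(cluster, key=lambda p: p["traffic"])
        for page in cluster:
            if page is not canonical:
                redirects[page["url"]] = canonical["url"]
    return redirects
```

The returned mapping corresponds to the 301 redirects a client would add, one per duplicate.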
Related Searches
new
Related searches adds links between category pages, such as search results pages or your main navigation taxonomy pages. Previously we were linking to relevant pages with demand. We've tweaked that algorithm: we maximise a combination of the following:
  • Important pages give more link equity to high potential pages: high demand pages get inlinks from important pages (a combination of traffic and demand)
  • Pages have the most relevant links: pages link to other pages with which they share the most intent
  • High demand pages get more links
The goal is to correlate internal page rank with demand. There are often trade-offs here. We visualise some of these. For instance, we show
  • A histogram of 'demand buckets', which target a set range of demand, and how many pages are in each:
image
  • The percentage of pages that receive each number of inlinks:
image
  • The average relevance by demand
image
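One simple way to picture the trade-off between relevance and demand is a weighted score per candidate link target. The Python sketch below is illustrative only: the Jaccard overlap used as 'shared intent', the weights and the field names are all assumptions, and it leaves out the importance of the linking page.

```python
def relevance(source_entities, target_entities):
    """Shared intent as Jaccard overlap between two sets of entities."""
    union = source_entities | target_entities
    return len(source_entities & target_entities) / len(union) if union else 0.0


def rank_link_targets(source, candidates, relevance_weight=0.6,
                      demand_weight=0.4, top_n=3):
    """Return the URLs of the best link targets for one source page."""
    # Normalise demand by the largest demand among the candidates.
    max_demand = max((c["demand"] for c in candidates), default=0) or 1

    def score(target):
        return (relevance_weight * relevance(source["entities"], target["entities"])
                + demand_weight * target["demand"] / max_demand)

    # A page never links to itself.
    others = [c for c in candidates if c["url"] != source["url"]]
    ranked = sorted(others, key=score, reverse=True)
    return [c["url"] for c in ranked[:top_n]]
```

Raising `demand_weight` pushes more links towards high-demand pages at the cost of average relevance, which is the kind of trade-off the charts above visualise.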
Pages to be
improved
Similar.ai's SEO deduplication and pages to hide features clean up duplicate search pages and pages which don't answer a need that search engine users have. This is all subtractive, as the first stage of SEO should be: we're helping you get more wood behind fewer arrows to concentrate your visibility. Our pages to create feature finds sought-after intents for which you don't have a page, but do have inventory and could rank. This expands your coverage so that your content can be found by more users.
In our Related Searches product we link from one search, browse or other category page to another. The pages to which we link are not your 'as is' pages, but your 'pages to be': a combination of:
  • Canonical pages from deduplication
  • Pages with sufficient demand
  • New pages which are going to drive incremental transactional unbranded traffic
We combine these to maximise demand across the minimal number of relevant pages without creating duplicate content.
By listing these out, we can tweak the criteria a page must meet to count as a page to be. For instance, you can set the minimum total demand for a page or the minimum number of listings a page must have, and Similar.ai can analyse the impact of these choices.
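A minimal Python sketch of such criteria follows. The field names (`duplicate_of`, `demand`, `listings`) and the default thresholds are hypothetical and only show how changing a threshold changes the resulting set of pages.

```python
def pages_to_be(pages, min_demand=100, min_listings=5):
    """Select the URLs that qualify as 'pages to be'.

    pages: list of dicts with 'url', 'demand', 'listings' and
    'duplicate_of' (None for canonical or new pages).
    """
    selected = []
    for page in pages:
        if page["duplicate_of"] is not None:
            continue  # duplicates are redirected to their canonical
        if page["demand"] < min_demand or page["listings"] < min_listings:
            continue  # not enough demand or inventory
        selected.append(page["url"])
    return selected
```

Running the same function with different thresholds and comparing the outputs gives a quick view of the impact of each choice.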
Car intent demo
new
We recently released a demo of our car model intent model (too many machine learning models mixing with other models!). Often when people search they don’t use the ‘correct’ words, they misspell them or they leave out important query terms, because they know what they mean. We translate keywords into an intent, which makes all the implicit assumptions explicit. The demo is for just one part of the intent model that focuses on matching car makes and models. We use this more generally to match all kinds of entities in our knowledge graphs for fashion, homeware, electronics, automotive, etc.
We use intents internally when we match keywords to each other (in related searches), keywords to pages or vice versa (when we create new pages or hide pages without demand), or pages with each other (in deduplication). For each of these, we express the underlying meaning as explicitly as possible as an intent.
If you’re not sure whether the model got the right result, you can ask the SERPs and Google it!
The car intent model handles
  • simple queries, like audi a3 2.0 tfsi dsg 2005 -> audi/a3 or bmw 118i m sport -> bmw/1-series
  • misspelt queries, like range river defendr -> land rover/defender
  • mixed queries, like mercedes jeep -> mercedes-benz/g-series, vw van -> volkswagen/transporter, ford estate -> ford/focus, or audi suv -> audi/q7
  • queries without car models: sometimes it will assume a default model, like abarth -> abarth/500, or only return the make
  • a little imagination! Sometimes it does weird things, like ford cougar -> ford/kuga. Ford did have a Cougar but the Kuga is a lot more popular, and the model assumes the user meant that. (Asking the SERPs gives the same answer.)
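To give a feel for the simplest of these cases, here is a toy Python sketch that resolves a query to make/model with fuzzy string matching from the standard library. It is not the machine learning model described above: the tiny lookup table, the cutoff and the function name are assumptions, and it only covers exact and lightly misspelt model names, not the mixed or default-model cases.

```python
from difflib import get_close_matches

# Hypothetical lookup table: model name -> make.
MODEL_TO_MAKE = {
    "a3": "audi",
    "defender": "land rover",
    "focus": "ford",
    "kuga": "ford",
}


def resolve_car_intent(query, cutoff=0.85):
    """Return 'make/model' for the first query token close to a known model."""
    for token in query.lower().split():
        match = get_close_matches(token, list(MODEL_TO_MAKE), n=1, cutoff=cutoff)
        if match:
            model = match[0]
            # The make is inferred from the model, so a misspelt make is ignored.
            return f"{MODEL_TO_MAKE[model]}/{model}"
    return None
```

Because the make is inferred from the matched model, a query such as range river defendr still resolves even though the make itself is garbled.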
Deduplication by content
improved
We used to dedupe only by intent. This would find the same intents expressed with different words or at different places on the site. However, it doesn't capture all duplication in search, browse or other category pages. You can also have different intents which have the same content.
To see why, imagine you have 10 red Volkswagen Golf 7 GTI 2018 listings and no other 2018 Golfs. You could have pages about 'golf 7 2018 for sale', 'buy vw gti 2018', 'used red golf 2018' or 'golf gti 2018' and, despite some small differences in H1, title and URL, all of the non-template content on each page would be identical. Google will have to put in some work to work out which page to rank.
Now, we identify all pages with the same listings, work out which page should be the canonical and pass that over to you for redirection, just as we do for pages with the same intent.
For instance:
  • king jumpsuits for clothing in the Netherlands:
image
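Grouping pages by their listings can be sketched in Python as below. The field names and the rule for picking the canonical (highest demand) are assumptions made for the example.

```python
from collections import defaultdict


def dedupe_by_listings(pages):
    """Map each duplicate URL to a canonical URL showing the same listings.

    pages: list of dicts with 'url', 'listing_ids' (iterable of IDs)
    and 'demand'. Pages count as duplicates when they show exactly the
    same set of listings, whatever their H1, title or URL.
    """
    groups = defaultdict(list)
    for page in pages:
        # frozenset ignores listing order and can be used as a dict key.
        groups[frozenset(page["listing_ids"])].append(page)

    redirects = {}
    for group in groups.values():
        # Assumed rule: the page with the most demand becomes canonical.
        canonical = max(group, key=lambda p: p["demand"])
        for page in group:
            if page is not canonical:
                redirects[page["url"]] = canonical["url"]
    return redirects
```

In the Golf example, all four pages would share one set of 10 listing IDs and collapse onto a single canonical.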
Google Search Console ranking 💯
improved
We now integrate Google Search Console to get you the most accurate ranking, impressions and click data for your site. Previously we got some of this data from SEMrush, but we are now able to show much fresher results.
⚡ Super-fast loading on all pages
improved
We figured you didn't have time to wait for your next top categories.
So we've made our pages load faster.