Google Ranking Factors
There’s a lot of information floating around these days about what makes a site rank higher in Google’s free, organic listings. Through our research, we’ve found the majority of this advice to be either partially wrong or completely wrong. The advice and products stemming from that information are often dangerous. And finally, the accurate “factors” remaining often aren’t factors that Google considers at all, but tactics and indirect correlation.
Despite all that, we actually know a lot for certain about the way Google ranks web sites. Real SEO knowledge doesn’t come from a random blogger, forum, or magic “get rich quick” scheme. The most trustworthy SEO knowledge only comes from three sources.
- Patent filings
- Direct statements from Google and/or their team
- Applying The Scientific Method
This resource is a complete guide to how Google ranks sites. We’ve included factors that are controversial or even outright myths, but created filters so you can sort out concepts that are less-substantiated. This resource is also limited to the factors that matter when ranking in Google’s primary web search for non-local requests. This is important to clarify since local SEO, image-only search, video search, and every other Google search engine each play by at least slightly different rules.
Positive On-Page Factors
On-page SEO describes factors that you are able to manipulate directly through the management of your own website. Positive factors are those which help you to rank better. Many of these factors may also be abused, to the point that they become negative factors. We will cover negative ranking factors later in this resource.
In broad terms, positive on-page ranking factors relate to establishing the subject matter of content, accessibility across various environments, and a positive user experience.
Keyword in URL
Keywords and phrases that appear in the page URL, outside of the domain name, aid in establishing relevance of a piece of content for a particular search query. Diminishing returns are apparently achieved as URLs become lengthier or as keywords are used more than once.
Source(s): Patent US 8489560 B1, Matt Cutts
Keywords Earlier in URL
The order in which keywords appear in a URL matters. It’s been theorized that keywords appearing earlier in a URL have more weight. At minimum, it’s been confirmed by Matt Cutts that “after about five” words, the weight of a keyword dwindles.
Source(s): Matt Cutts
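To make the two URL factors above concrete, here is a minimal sketch that pulls the words out of a URL path (ignoring the domain name) and reports where a keyword sits among them. The function names and the splitting rule (slashes, hyphens, and dots only) are our own assumptions for illustration, not anything Google has published.

```python
import re
from urllib.parse import urlparse


def url_path_words(url: str) -> list[str]:
    """Return lowercase words from the URL path, ignoring the domain name."""
    path = urlparse(url).path
    # Assumption: words are separated by slashes, hyphens, or dots only.
    return [w.lower() for w in re.split(r"[/\-.]+", path) if w]


def keyword_position(url: str, keyword: str):
    """Return the 1-based position of a keyword in the path, or None if absent."""
    words = url_path_words(url)
    keyword = keyword.lower()
    return words.index(keyword) + 1 if keyword in words else None


url = "https://seo-example.com/guides/google-ranking-factors"
print(url_path_words(url))
print(keyword_position(url, "ranking"))
```

A position past roughly five would be where, per the Matt Cutts statement cited above, a keyword's weight starts to dwindle.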
Keyword in Title Tag
Title tags define the title of a document or page on your site, and often appear in both the SERP and as snippets for social sharing. Titles should be no longer than 60-70 characters, depending on the characters used (Moz Tool). As with URLs, keywords closer to the beginning are widely theorized to have more weight.
Source(s): US 20070022110 A1
Keyword Density of Page
The percentage of times a keyword appears in text. Practicing SEOs once sculpted all content so that a single keyword/phrase appeared 5.5%-6% of the time. In the early-to-mid-2000s, this was very effective. Google has since improved so much with other types of content analysis that those tactics are scarcely relevant in 2015. Keyword Density, although referenced in Google patents, is almost certainly just a simplified concept within TF-IDF, which we’ll cover next.
Source(s): Patent US 20040083127 A1
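For reference, keyword density as SEOs traditionally measured it is simple arithmetic. The sketch below assumes one common definition (words belonging to occurrences of the phrase, divided by total words); the function name and tokenization are ours, and other tools count slightly differently.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Percentage of words in `text` that belong to occurrences of `phrase`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears word-for-word.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits * n / len(words)


sample = "seo tips for seo beginners who like seo a lot"
print(keyword_density(sample, "seo"))
```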
TF-IDF of Page
Think of TF-IDF, or Term Frequency-Inverse Document Frequency, like Keyword Density with context. TF-IDF weighs the density of keywords on a page against what is “normal” rather than just seeking out a flat, raw percentage. This serves to ignore words like “the” in computation and establishes how many times a literate human should probably mention a phrase like “Google Ranking Factors” in a single document that covers such a topic.
Source(s): Dan Gillick and Dave Orr, Patent US 7996379 B1
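The idea of weighing density against what is “normal” can be sketched with the textbook TF-IDF formula: term frequency multiplied by the log of (number of documents / documents containing the term). This is the classic information-retrieval formulation, used here as an assumption; Google's actual weighting is not public.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {term: score} mapping per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "the google ranking factors guide",
    "the best steak recipe",
    "the weather today",
]
scores = tf_idf(docs)
print(scores[0]["the"], scores[0]["google"])
```

Because “the” appears in every document, its score collapses to zero, while a term that is distinctive to one document keeps a positive weight.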
Key Phrase in Heading Tag (H1, H2, etc.)
Keywords in heading tags have strong weight in determining the relevant subject matter of a page. An H1 tag carries the most weight, H2 has less, and so forth. Heading tags also improve accessibility for screen readers, and clear, descriptive headings reduce bounce rates according to various studies.
Source(s): In The Plex, Penn State
Words with Noticeable Formatting
Keywords in bold, italic, underline, or larger fonts have more weight in determining the relevant subject matter of a page, but less weight than words appearing in a heading. This is confirmed by Matt Cutts, SEOs, and a patent that states: “matches in text that is of larger font or bolded or italicized may be weighted more than matches in normal text.”
Source(s): Matt Cutts, Patent US 8818982 B1
Keywords in Close Proximity
The closeness of words to one another implies association. To anyone that’s ever wielded the English language, this won’t come as a surprise. One paragraph about your SEO work in Chicago will thus do more to rank for “Chicago SEO” than two paragraphs, with one about SEO and one about Chicago.
Source(s): Patents: US 20020143758 A1, US 20080313202 A1
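One simple way to put a number on proximity is the smallest gap, in words, between two keywords. The sketch below is our own illustrative measure (the function name is hypothetical), not the method described in the patents.

```python
import re


def min_word_distance(text: str, first: str, second: str):
    """Smallest number of word positions between two keywords, or None."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    first, second = first.lower(), second.lower()
    last_first = last_second = None
    best = None
    for i, word in enumerate(words):
        if word == first:
            last_first = i
        elif word == second:
            last_second = i
        else:
            continue
        # Compare the most recent sighting of each keyword.
        if last_first is not None and last_second is not None:
            gap = abs(last_first - last_second)
            best = gap if best is None else min(best, gap)
    return best


print(min_word_distance("Our SEO work in Chicago", "seo", "chicago"))
print(min_word_distance(
    "We do SEO. We also love pizza. Our office is in Chicago.", "seo", "chicago"))
```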
Keyword in ALT Text
The ALT attribute of an image is used to describe that image to search engines and to users who are unable to display the image. This establishes relevance, especially for Image Search, while also improving accessibility.
Source(s): Matt Cutts
Exact Search Phrase Match
Although Google may return search results that contain only part of a search phrase as it appears on your page (or in some cases, none at all), a patent states that a higher Information Retrieval (IR) score is given for an exact match. Specifically, it states that “a document matching all of the terms of the search query may receive a higher score than a document matching one of the terms.”
Source(s): Patent US8818982 B1
Partial Search Phrase Match
A Google patent establishes that when a page contains an exact match of a search phrase, its perceived relevance to that query, dubbed the Information Retrieval (IR) score, is significantly higher. In the process, the patent confirms that you may still rank for certain search queries when a page contains a search phrase not exactly as it was entered into Google. This is further verified by just doing a lot of Googling.
Source(s): Patent US8818982 B1
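The patent language quoted above (a document matching all query terms may outscore one matching a single term) can be illustrated with a toy score: the fraction of distinct query terms found in a document. This is purely our own simplification for illustration; the real IR score is not public and certainly involves far more than this.

```python
import re


def term_match_score(query: str, document: str) -> float:
    """Fraction of distinct query terms that appear in the document (0.0 to 1.0)."""
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    doc_terms = set(re.findall(r"[a-z0-9]+", document.lower()))
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


query = "chicago seo agency"
print(term_match_score(query, "A Chicago SEO agency for small business"))
print(term_match_score(query, "General SEO tips"))
```

A page matching only some of the terms still gets a non-zero score here, which mirrors the point that partial matches can rank, just with less relevance credit.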
Keywords Higher on Page
There’s a natural trend in how we write English: earlier is usually more important. This applies to sentences, paragraphs, pages, HTML tags. Google seems to apply this everywhere as well, with content that appears earlier and more visibly being given more weight. This is, at very least, a function of the Page Layout algorithm, which gives a lot of preference to what appears above-the-fold on your site.
Source(s): Matt Cutts
Keyword Stemming
Keyword stemming is the practice of taking the root or ‘stem’ of a word and finding other words that share that stem (e.g. ‘stem-ming’, ‘stem-med’, etc.). Avoiding such variations, such as for the sake of a keyword density score, results in poor readability and has a negative impact. Google introduced stemming in 2003 with the Florida update.
Source(s): Matt Cutts
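To show what stemming does mechanically, here is a deliberately crude suffix-stripping sketch. It is not the Porter stemmer and not Google's implementation; the suffix list and the function name are assumptions chosen only to handle the ‘stem’ example above, and it will mangle many real words.

```python
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word: str) -> str:
    """Strip one common English suffix and undo a doubled final consonant."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # "stemm" -> "stem": drop a doubled trailing consonant.
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


for w in ("stem", "stems", "stemming", "stemmed"):
    print(w, "->", naive_stem(w))
```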
Internal Link Anchor Text
The anchor text of a link tells the user where that link leads. It’s an important component of navigation within your website, and when not abused, helps to establish the relevance of a particular piece of content over vague alternatives such as “click here”.
Source(s): Google’s SEO Starter Guide
Keyword is Domain Name
Also referred to as an Exact Match Domain or EMD. A powerful ranking bonus is attributed when a keyword exactly matches a domain and a search query meets Google’s definition of a “commercial query”. This was designed so that brands would rank for their own names, but was frequently exploited and as a result, made less-powerful in various circumstances.
Source(s): Patent EP 1661018 A2, US 8046350 B1
Keyword in Domain Name
A ranking bonus is attributed when a keyword or phrase exists within a domain name. The weight given seems to be less significant than when the domain name exactly matches that of a particular SEO query, but more significant than when a keyword appears later in the URL.
Source(s): Patent EP 1661018 A2
Keyword Density across Domain
Krishna Bharat identified a problem with PageRank when he introduced Hilltop: “a web-site that is authoritative in general may contain a page that matches a certain query but is not an authority on the topic of the query”. Hilltop improved search by looking at the relevance of entire sites, labeled “experts”. Since TF-IDF determines page-level relevance, we make a small assumption that Hilltop defines an “expert” domain using the same tools.
Source(s): Krishna Bharat, Patent US 7996379 B1
TF-IDF across Domain
Saying “Keyword Density” instead of “Term Frequency” in 2015 throws a lot of SEO specialists into a rage, despite being perfect synonyms. What’s important when talking about “Keyword Density” factors is again the latter half of TF-IDF: Inverse Document Frequency. Google throws out words like adverbs with TF-IDF and dynamically evaluates the natural density for topic. Metrics on “how much is natural” have apparently decreased over time.
Source(s): Dan Gillick and Dave Orr, Patent US 7996379 B1
Distribution of Page Authority
Typically, pages that are linked sitewide are given a large boost, pages linked from them get a lesser boost, and so forth. A similar effect is often seen from pages linked from the homepage, because this is commonly the most-linked page on most websites. Creating a site architecture to maximize this factor is commonly known as PageRank Sculpting.
Source(s): Patent US 6285999 B1
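The flow of authority through internal links can be sketched with a basic PageRank power iteration over a small, made-up site. The link graph, the damping value of 0.85 (the conventional textbook choice), and the iteration count are all assumptions for illustration; this is the classic published algorithm, not Google's current implementation.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Basic PageRank by power iteration over a {page: [linked pages]} graph."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank across all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: every page links back to the homepage.
site = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "post"],
    "post": ["home"],
}
for page, score in sorted(pagerank(site).items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the homepage, which every page links to, ends up with the highest score, pages linked from it come next, and the post linked only from the blog page trails behind.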
Old Domain
This is somewhat confusing since a brand new domain name may also receive a temporary boost. Older domains are given a little more trust, an effect which Matt Cutts emphasizes is pretty minor (while, in the process, acknowledging that it exists). Speculatively, this may be rewarding sites that have had a chance to prove themselves not a part of short-term black hat projects.
Source(s): Matt Cutts
New Domain
New domains may receive a temporary boost in rankings. In a patent discussing methods of determining fresh content, it’s stated “the date that a domain with which a document is registered may be used as an indication of the inception date of the document.” That said, the impact this actually has on one’s rankings is, according to Matt Cutts, relatively small. Speculatively, this may be intended to give a brand new site, or timely niche site, just enough chance to get off the ground.
Source(s): Patent US 7346839 B2, Matt Cutts
Hyphen-Separated URL Words
The ideal method of separating keywords in a URL is to use a hyphen. Underscores can work, but are not as reliable, as they can be confused with programming variables. Mashing words together in a URL is likely to cause words to not be seen as separate keywords, thus preventing any Keyword in URL bonus. Aside from these scenarios, just using a hyphen will not make a site rank higher.
Source(s): Matt Cutts
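The difference between the three separators is easy to see with an ordinary word tokenizer. In most regular expression engines, including Python's, the underscore counts as a “word” character, just as it does in programming variable names. The sketch below uses that behavior as a stand-in; it is an analogy, not a description of Google's actual tokenizer.

```python
import re


def slug_words(slug: str) -> list[str]:
    """Split a URL slug into words using the regex notion of a word character."""
    # \w matches letters, digits, and the underscore, so underscores do not split.
    return re.findall(r"\w+", slug)


print(slug_words("google-ranking-factors"))
print(slug_words("google_ranking_factors"))
print(slug_words("googlerankingfactors"))
```

Only the hyphenated slug yields three separate words; the other two come back as a single token.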
Keywords Earlier in Tag
An SEO theory called the first third rule emerged in the early 2000s. It noted that our language (sentences, titles, paragraphs, even entire web pages) is generally arranged in order of importance. Although this is not confirmed by Google, Northcutt’s word order experiments have more often than not indicated that this is a factor.
Source(s): Speculative
Long Domain Registration Term
Google directly states in the patent cited below that longer domain registration terms predict the legitimacy of a domain. Speculatively, those that engage in webspam understand that it’s a short-term, high volume game of burn/rinse/repeat and don’t purchase domains for longer than they need.
Source(s): Patent US 7346839 B2
Public Whois
Despite Google downplaying their ability to investigate Domain Registrant information, we know of a patent that discusses using Domain Registration Terms to single out webspam schemes. We’ve also seen Matt Cutts speak about private whois contributing to penalties, and encouraging visitors on his blog to report fake whois data. We believe that this is a wise “play it safe” card, despite only the lack of a public whois being confirmed as a (negative) factor.
Source(s): Patent US 7346839 B2, Matt Cutts
Use of HTTPS (SSL)
SSL was officially announced as a new positive ranking factor in 2014, regardless of whether the site processed user input. Gary Illyes downplayed the significance of SSL in 2015, calling it a tiebreaker. Although, for an algorithm based on the numeric scoring of billions of web pages, we’ve found that tiebreakers very often make all of the difference on competitive search queries.
Source(s): Google, Gary Illyes
Schema.org
With the advent of Schema.org, a joint project between Google, Yahoo!, Bing, and Yandex to understand logical data entities over keywords, we move further away from the traditional “10 blue links” style of search. Currently, use of Structured Data can improve rankings in a massive variety of scenarios. There are also theories that schema.org can improve traditional search rankings by catering to a ranking method known as entity salience.
Source(s): Schema.org, Matt Cutts
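As a small illustration of what Schema.org markup looks like in practice, the sketch below builds a JSON-LD snippet for an article. The `Article` type and the `headline`, `author`, and `datePublished` properties are real Schema.org vocabulary; the helper name and the sample values are made up for this example.

```python
import json


def article_json_ld(headline: str, author_name: str, date_published: str) -> str:
    """Return a JSON-LD <script> block describing an article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date, e.g. "2015-06-01"
    }
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


print(article_json_ld("Google Ranking Factors", "Jane Doe", "2015-06-01"))
```

The resulting block is placed in the page's HTML, where it describes the page as an entity rather than as a bag of keywords.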
Fresh Content
The full name of this one is technically “fresh content when query deserves freshness”. This term, Query Deserves Freshness (often shortened to QDF), refers to search queries that would benefit from more current content. This does not apply to every query, but it applies to quite a lot, especially those that are informational in nature. These SEO benefits are just one more reason that brand publishers tend to be very successful.
Source(s): Matt Cutts
Domain-wide Fresh Content
There is unconfirmed speculation that domain-wide performance is improved by maintaining fresh content. Speculatively, this means that overall the resource that Google is recommending is less “stale” and more accurate/relevant, especially if at least some significant portion of the information has been worth a little upkeep or supplementation by the owner.
Source(s): Patent US 8549014 B2, Speculation
Old Content
A Google patent states: “For some queries, older documents may be more favorable than newer ones.” It goes on to describe a scenario where a search result set may be re-ranked by the average age of documents in the retrieved results before being displayed.
Source(s): Patent US 8549014 B2
Domain-wide Old Content
Theoretically, for all we have heard about Query Deserves Freshness (QDF), which serves news-like content in a number of circumstances, there may also be some sort of “Query Deserves Oldness”. Considering that we’ve never been told about “QDO” by Google, it may be reasonable to conclude that older content is always preferred when QDF is not at play. Just like domain-wide freshness, however, we don’t have too much evidence to confirm a domain-wide seniority score.
Source(s): Speculation
Quality Outbound Links
Although it’s possible for outbound links to “leak PageRank”, web sites are not supposed to be dead ends. Google rewards authoritative outbound links to “good sites”. To quote the source: “parts of our system encourage links to good sites.”
Source(s): Matt Cutts
Relevant Outbound Links
Given that Google analyzes your inbound links for authority, relevance, and context, it seems reasonable to suggest that outbound links should be relevant as well as authoritative. This would likely relate to the Hilltop algorithm, simply in reverse to the manner that’s widely accepted for inbound links.
Source(s): Moz
Good Spelling and Grammar
This is a Bing ranking factor. Amit Singhal stated “these are the kinds of questions we ask” regarding spelling/grammar in Google’s definition of quality content. In 2011, Matt Cutts said that this was not a direct factor as of “a long time ago”, but also that rankings correlate with it anyway. Our agency’s findings have been that the first Panda update made this matter a lot. If nothing else, most content-related factors are clearly affected by spelling/grammar.
Source(s): Matt Cutts, Amit Singhal
Reading Level
We know that Google analyzes the reading level of content, since they created such a search filter for the results page (now removed). We also know that content mills, which Google is not fond of, are considered to be very basic, whereas academic writing is considered very advanced. What we don’t have, as of yet, is a concrete source or study that directly relates reading level to rankings.
Source(s): Correlation Study, Speculation
Rich Media
Rich media, on top of drawing more traffic from in-line image and video search, has long been considered a component of “high quality, unique content”. Video appeared to be the deciding factor with Panda 2.5. Northcutt’s work has also shown a positive correlation. Currently though, there’s no official, public source signing off on this factor.
Source(s): SEL on Panda 2.5
Subdirectories
Categorical Information Architecture has been an SEO discussion point for a long time, as it seems that Google analyzes topic coverage across entire sites. The exact ranking implications of this are unclear, but Google now refers to this as Structured Data, and at the very least, will use it to display breadcrumbs on the results page, therefore ranking more pages.
Source(s): Google Developers
Meta Keywords
Some SEOs claim that the meta keywords tag never mattered for SEO. That’s a myth. The notion that Google ranks meta keywords in 2015 is also a myth. Both of these facts were confirmed the same way – by placing a zero-competition, made-up word in a meta keywords tag, getting that page into the index, then searching that word. Remember though, that Google is not the only search engine, and could theoretically index countless other dynamic sites that benefit from this tag.
Source(s): Matt Cutts, Experiment Page
Mobile Friendliness
Mobile-friendly websites are given a significant ranking advantage. For now, the ranking implications of this appear to pertain only to users searching on mobile devices. This made its way into the mainstream SEO conversation and became more severe during the Mobilegeddon update in 2015, although experts had been speculating on this topic for nearly a decade prior.
Source(s): Various Studies
Meta Description
A good meta description functions as a search ad. Considering how many AdWords agencies exist almost entirely on A/B testing AdWords ads, the marketing value here can’t be overstated. Although keywords used in meta descriptions were once widely considered a direct ranking factor, Matt Cutts stated in 2009 that they no longer are.
Source(s): Matt Cutts
Google Analytics
Many have suggested that Google Analytics is or may become a Google ranking factor. All evidence at present, as well as very clear statements from Matt Cutts, indicate that any ranking benefits coming from Google Analytics, now or ever in the future, are an absolute myth. That said, it’s an amazingly powerful tool in the right marketer’s hands.
Source(s): Matt Cutts
Google Webmaster Tools
Just like Google Analytics, there are no confirmed ranking benefits to using Google Webmaster Tools in any way. Webmaster Tools is still useful in unearthing problems related to other ranking factors on this page, especially those related to manual penalties and certain crawler errors.
Source(s): Speculation
ccTLD in National Ranking
Country code TLDs such as .uk and .br are believed to carry with them a ranking bonus to searches from the same country, which is especially useful for internationalization. They should also perform far better in contrast to a ccTLD from another country.
Source(s): Speculation
XML Sitemaps
Sitemaps can be useful, though not required, for the purpose of getting more pages of your site into the Google index. The notion that an XML sitemap will improve rankings within Google is a myth. This comes straight from Google and is confirmed by various studies.
Source(s): Susan Moskwa & Trevor Foucher
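For readers who have never built one, an XML sitemap is just a list of URLs in a small, standardized format. The sketch below generates a minimal one with Python's standard library; the `urlset`, `url`, and `loc` elements and the namespace come from the sitemaps.org protocol, while the helper name and example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return a minimal XML sitemap listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    "https://example.com/",
    "https://example.com/google-ranking-factors",
]))
```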
Salience of Entities
As time goes on, Google seems to do more to analyze ideas and logical entities in preference to words and phrases. It analyzes how we say things in preference to exact search queries that appear on a page. This process, in simple terms, is what’s making it possible to search for “how to cook meat”, and be returned results for steak recipes that might not mention the word “meat” directly anywhere.
Source(s): Jesse Dunietz and Dan Gillick, Dan Gillick and Dave Orr, Patent US 20130132433 A1
Phrasing and Context
As keyword density is now virtually a non-factor, a basic understanding of Phrase-Based Indexing tells us that if you write about content thoroughly and elaborately, you stand a far better chance of ranking compared to writing generic content that just happens to drop a lot of keywords. A clear component of one Google patent describes this as the “identification of related phrases and clusters of related phrases”.
Source(s): Patent US 7536408 B2
Web Server Near Users
Google functions differently on many local queries, supplementing traditional results with Google Maps results, and potentially altered organic listings as well. The same is true for national and international searches. By hosting your site at least loosely near to your users, such as within the same country, you are likely to enjoy better rankings.
Source(s): Matt Cutts
Author Reputation
Authorship was an experiment that Google ran from 2011 to 2014, which thrived upon bloggers using the rel=”author” tag to establish the reputation of particular authors. Google directly confirmed both the creation and the demise of authorship. Eric Enge did a nice eulogy on the rise and fall of authorship on Search Engine Land.
Source(s): John Mueller
Using rel=”canonical”
The rel=”canonical” tag suggests the ideal URL for a page. This can avert duplicate content devaluations and penalties when multiple URLs might result in the same content. Our experience is that this is only a suggestion to Google and one that is often ignored. According to Google it does not directly improve rankings. Despite all of this, it’s a very good idea.
Source(s): Google
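When auditing a site, it helps to check which canonical URL each page actually declares. The sketch below reads the canonical link out of a page's HTML using Python's standard `html.parser` module; the class name and the sample markup are our own.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attributes.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonical


page = '<html><head><link rel="canonical" href="https://example.com/page"></head></html>'
print(find_canonical(page))
```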
Using rel=”author”
Using rel=”author” was once widespread SEO advice and hypothesized as a positive ranking factor, but Google’s use of this factor at all went away along with an entire practice known as Authorship. The notion that rel=”author” is beneficial for any reason whatsoever is now regarded as a myth.
Source(s): John Mueller
Using rel=”publisher”
Just like rel=”author”, using rel=”publisher” was once widespread SEO advice and also hypothesized as a positive ranking factor. And, just like rel=”author”, Google’s use of rel=”publisher” at all went away along with an entire practice known as Authorship.
Source(s): John Mueller
URL uses “www” Subdomain
A common misconception propagated by SEO bloggers suggests that a site may rank better if your URLs start with “www”. This originates from the idea that we often force all pages on a site to resolve at “www”. The reason that we actually do this is simply to avoid two URLs serving the same content, which would bring about a negative ranking factor.
Source(s): Speculation
Dedicated IP Address
Web server IP addresses can be useful for geo-targeting certain demographics. They can be negative ranking factors when they sit amidst a significant private webspam operation, or are used by the Hilltop algorithm to identify two sites as being from differing owners. But, the notion that just having a dedicated IP address provides a direct ranking advantage has been repeatedly debunked.
Source(s): Matt Cutts
Subdomain Usage
Subdomains (thing.yoursite.com) are often viewed as separate websites by Google, as compared to subfolders (yoursite.com/thing/), which are not. This has obvious implications with many other factors on this page. Matt Cutts called subfolders/subdomains “roughly equivalent” in 2012, confirming this now happens less often, but still happens. Panda recovery stories post-2012, such as HubPages’ migration from subfolders to subdomains, prove that it can still be a major factor.
Source(s): Matt Cutts, Matt McGee and Paul Edmondson
Number of Subdomains
The number of subdomains on a site appears to be the most significant factor in determining whether subdomains are each treated as their own sites (as occurs in nature with free web hosting services and hybrid hosting/social sites like HubPages), or just portions of a common site. Presumably, thousands of subdomains means that they don’t all belong to a single thematic site and are likely each websites in their own right.
Source(s): Speculation
Use AdSense
Although SEO paranoia seems to make this frequent advice, it’s directly denied by Google. We’ve also found no real evidence to support it, and have seen no noticeable effects when assisting with optimizations for media monetization, which is something that our agency frequently does. We’re therefore prepared to firmly declare this factor a myth.
Source(s): Matt Cutts
Keywords in HTML Comments
This is an early SEO theory that’s very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn’t.
Source(s): Experiment Page
Keywords in CSS/JavaScript Comments
Another twist on an early SEO theory that’s very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn’t.
Source(s): Experiment Page
Keywords in CLASSes, NAMEs, and IDs
Once again, we can debunk theories as to whether or not words in an odd place have any impact on search engines by putting a non-competitive phrase there and waiting. It’s not worth even speculating at what Google tells us or what’s in a patent. And again here, we can confirm that this factor is a myth, at least at the time of writing this.
Source(s): Experiment Page
Privacy Policy Usage
A single experience was posted on Webmaster World in 2012 which sprawled into a larger discussion: does having a Privacy Policy benefit rankings? For what it’s worth, 30% of Search Engine Roundtable-ers voted yes, and it does fit Google’s stated philosophies pretty well. Still, this is very theoretical.
Source(s): SER Discussion
Verifiable Address
A physical address is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to “NAP”) together. “Highly satisfying contact information” is also something that Google quality control auditors are instructed to seek out.
Source(s): Search Engine Land
Verifiable Phone Number
A phone number is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to “NAP”) together. “Highly satisfying contact information” is also something that Google quality control auditors are instructed to seek out.
Source(s): Search Engine Land
Accessible Contact Page
Theorized as a mark of legitimacy. It appears that this may have originated, or is at least best-supported, from a document called Google’s Quality Rater Guidelines. In this document, Google asks quality control auditors to search for “highly satisfying contact information.”
Source(s): Search Engine Land
Low Code-to-Content Ratio
This SEO theory seemed to become widespread in 2011, suggesting that more content and less code is good. Here’s what we know: 1.) Speed is a confirmed factor, 2.) Google’s own PageSpeed Insights tool really presses even a 5Kb reduction in payload size, 3.) Minor code mistakes can cause devaluations and penalties. So at minimum (and more likely) this is an issue of indirect correlation. But to add my own account to several others, I’ve more than once seen this seem to really matter.
Source(s): SitePoint Post, SEOChat Tool
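If you want to measure this on your own pages, one rough approach is to compare the length of the visible text to the length of the full HTML. The sketch below does that with the standard library; the exact definition (characters of visible text divided by characters of HTML, with script and style contents excluded) is our assumption, and tools such as the one cited above may calculate it differently.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text outside of <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html: str) -> str:
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Collapse runs of whitespace so indentation does not count as content.
    return " ".join(" ".join(extractor.parts).split())


def text_to_html_ratio(html: str) -> float:
    """Visible text length divided by total HTML length (0.0 for empty input)."""
    if not html:
        return 0.0
    return len(visible_text(html)) / len(html)


page = "<script>var x = 1;</script><p>Hi there</p>"
print(visible_text(page), round(text_to_html_ratio(page), 2))
```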
Meta Source Tag
The Meta Source Tag was created for Google News in 2010 to better-attribute sources. It comes in two forms: syndication-source (if syndicating a 3rd party) and original-source (you’re the source). In situations where content is syndicated, this may theoretically help avoid duplicate content penalties. If you’re the original-source, this tag is overridden by rel=”canonical” anyway.
Source(s): Eric Weigle
More Content Per Page
SerpIQ conducted an interesting correlation study comparing the length of content to top rankings, which decidedly favors content with 2,000-2,500 words. It’s not clear if this is an indirect function of other factors, such as these pages being better-liked and therefore drawing more links/shares, or growing popular by ranking for more, longer search query variations.
Source(s): SerpIQ
Meta Geo Tag
Unlike IP address and ccTLDs, Matt Cutts states that they “barely look at this tag, if at all”, although he did suggest that this tag might be considered if you were to use it on a gTLD site (such as “.com”), and attempt to restrict it to a country. So, while this is confirmed to be almost useless, it was suggested that Google does at least look at it and may consider it a factor very, very rarely for internationalization.
Source(s): Matt Cutts
Keywords Earlier in Display Title
More than a decade of studies and correlation research suggests that titles that begin with a keyword usually (but not always) rank better than titles ending in a keyword. It’s easy to test and usually confirms: earlier keywords are better. But our chosen source for this suggests more. Thumbtack.com conducted a study where title word order changed traffic by 20%-30%. Their best-performing titles didn’t begin with a keyword, but were altered (as Google sometimes does) to do so in Google’s results.
Source(s): Thumbtack Study
Keywords Earlier in Headings
Heading tags are another place where word order appears to really matter. Again, something known as the “first third rule” has been often thrown around on this topic – suggesting that words appearing earlier have more weight. Usually our findings have confirmed this, but regardless, it’s well-worth testing, especially in the H1 position.
Source(s): Speculation
Novel Content against Web
A Google Patent and this SEO’s working experience seem to indicate that Google devalues a lot more than just directly similar content. Google has literally patented methods for calling your content uninteresting. Once determining that a set of articles are related, this patent suggests various methods for determining which content is descriptive, unique, and/or weird (in a good way) when compared to others on the same topic.
Source(s): Patent US 8140449 B1, SEO by the Sea
Novel Content against Self
Google patents suggest that the genuine uniqueness/weirdness of content, as well as how elaborately that content speaks, determines something known as a “novelty score”. This is done by quantifying/qualifying “information nuggets” within text. We pretty much know only that Google’s methods for novelty scoring require comparing many individual documents. Considering that duplicate content is weighed both internally and externally, however, novelty scores likely are as well.
Source(s): Patent US 8140449 B1
Sitewide Average Novelty Score
Kumar and Bharat’s patent titled “Detecting novel document content” describes how single documents may be scored on how “novel” (that’s an adjective) they are. Assigning an average novelty score sitewide also appears to fit the narrative of other known sitewide factors such as sitewide thin content (Panda algorithm behavior) and sitewide expert relevance (Hilltop algorithm behavior).
Source(s): Patents US 8140449 B1, US 8825645 B1, Speculation
Quantity of Comments
We know from countless sources and even certain Webmaster Tools messages that Google can separate user-generated content and analyze it differently. One theory suggests that Google might look at quantities of comments on content to help rate content quality. At present, however, there is no clear evidence for this factor beyond maybe fitting an “if I were Google” narrative. Speculatively, it would also be one of the easiest factors to game.
Source(s): Speculation
Positive Sentiment in Comments
It’s theorized that Google looks at blog comment opinions to determine the quality of content. There is a patent and confirmation from Google that they score the sentiment expressed towards an entire site in product reviews. But according to Amit Singhal, they’re not able to apply this to content, because “if we demoted web pages that have negative comments against them, you might not be able to find information about many elected officials”.
Source(s): Amit Singhal, Patent US 7987188 B2
Negative On-Page Factors
Negative Ranking Factors are things you can do that harm your existing rankings. These factors fit into three categories: accessibility, devaluations, and penalties. Accessibility issues are just stumbling points for Googlebot that could prevent your site being crawled or analyzed properly. A devaluation is an indicator of a lower quality website and may prevent yours from getting ahead. A penalty is far more serious, and may have a devastating effect on your long-term performance in Google. Once again, on-page factors are those that are under your direct control as a part of the direct management of your website.
High Body Keyword Density
Keyword Stuffing penalties arise when abusing a once extremely effective tactic: sculpting Keyword Density to a high level. Our own experiments have shown that penalties can happen as early as 6% density, though TF-IDF (covered earlier) is likely at play and this is sensitive to topics, word types, and context.
Source(s): Matt Cutts, Remix
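As a rough illustration of what “density” means here, the sketch below computes the share of words on a page that match a single keyword. The tokenization rule and the function name `keyword_density` are our own assumptions for the example; this is not how Google measures anything, and the 6% figure above comes from our experiments, not from a published threshold.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Return the fraction of words in `text` equal to `keyword` (case-insensitive)."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return hits / len(words)

sample = "Cheap widgets here. Buy cheap widgets today, because cheap is good."
print(f"{keyword_density(sample, 'cheap'):.1%}")
```

A check like this only flags candidates for review. Topic, word type, and context still matter, as noted above.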
Keyword Dilution
This factor manifests itself from logic: if a higher Keyword Density or TF-IDF is positive, at some point, a total lack of frequency/density will decrease relevance. As Google has improved at understanding natural language, this may be better described as Subject Matter Dilution: writing content that wanders without any clear theme. The same basic concept is at play either way.
Source(s): Matt Cutts
Keyword-Dense Title Tag
Aside from a page as a whole, Keyword Stuffing penalties appear to be possible within the title tag. An ideal title tag should definitely be less than 60-70 characters and hopefully still provide enough value to function as a good search ad in Google’s results. At absolute minimum, there is no benefit in using the same keyword five times in the same tag.
Source(s): Matt Cutts
Exceedingly Long Title Tag
Aside from a page as a whole, Keyword Stuffing penalties appear to be possible within the title tag. An ideal title tag should definitely be less than 60-70 characters and hopefully still provide enough value to function as a good search ad in Google’s results. At absolute minimum, there is no benefit in using the same keyword five times in the same tag.
Source(s): Matt Cutts
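The two title tag checks above (length and repeated keywords) are simple to automate. This is a minimal sketch; the function name `audit_title` and the 60-character default are assumptions chosen for illustration, taken from the lower end of the 60-70 character range mentioned above.

```python
import re
from collections import Counter

MAX_TITLE_CHARS = 60  # assumption: lower end of the 60-70 character range

def audit_title(title: str, max_chars: int = MAX_TITLE_CHARS) -> dict:
    """Report title length and any words used more than once."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    counts = Counter(words)
    repeated = {word: n for word, n in counts.items() if n > 1}
    return {
        "length": len(title),
        "too_long": len(title) > max_chars,
        "repeated_words": repeated,
    }

print(audit_title("Widgets | Cheap Widgets | Best Widgets"))
```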
Keyword-Dense Header Tags
Heading Tags, such as H1, H2, H3, etc. can add additional weight to certain words. Those attempting to abuse this positive ranking factor will find that they can’t simply cram as many keywords as they can into these tags, even if the tags themselves grow to be no lengthier than usual. Keyword Stuffing penalties appear to be possible simply as a function of the total space within these tags.
Source(s): Matt Cutts
Header Tag (H1, H2, etc.) Overuse
As a general rule, if you want a concrete answer of whether or not an SEO penalty exists, try pushing a positive ranking factor well beyond what seems sane. One easily verified penalty involves placing your entire website in an H1 tag. Too lazy for that? Matt Cutts drops a less-than-subtle hint about too much text in an H1 in this source.
Source(s): Matt Cutts
URL Keyword Repetition
While there don’t seem to be any penalties associated with using a word in a URL multiple times, the value added from keyword repetition in a URL appears to be basically nothing. This can be verified very simply by placing a word in a URL five times instead of just once.
Source(s): Speculation
Exceedingly Long URLs
Matt Cutts notes that after about five words, the additional value behind words in a URL dwindles. It’s theorized and pretty replicable that this occurs in Google as well, although directly unconfirmed. Although they operate somewhat differently, Bing has also gone out of their way to confirm URL keyword stuffing is a penalty in their engine.
Source(s): Matt Cutts
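To audit URLs for both repetition and length, a short script can count the words in the path (the domain name is excluded). This is only a sketch: `audit_url` is a hypothetical helper, and the five-word default simply mirrors the “about five words” remark above rather than any confirmed cutoff.

```python
import re
from collections import Counter
from urllib.parse import urlparse

def url_path_words(url: str) -> list[str]:
    """Split the URL path (not the domain) into lowercase words."""
    path = urlparse(url).path.lower()
    return re.findall(r"[a-z0-9]+", path)

def audit_url(url: str, max_words: int = 5) -> dict:
    """Count path words and list the ones that repeat."""
    words = url_path_words(url)
    counts = Counter(words)
    return {
        "word_count": len(words),
        "over_limit": len(words) > max_words,
        "repeated": sorted(w for w, n in counts.items() if n > 1),
    }

print(audit_url("https://example.com/widgets/blue-widgets/buy-widgets-online"))
```

Note that file extensions such as “html” would be counted as words by this simple tokenizer.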
Keyword-Dense ALT Tags
Given that ALT tag text is not generally directly visible on the page, ALT tag keyword stuffing has been widely abused. A few descriptive words are fine and actually ideal, but doing more than this can invite penalties.
Source(s): Matt Cutts
Exceedingly Long ALT Tags
Given that ALT tag text is not generally directly visible on the page, ALT tag keyword stuffing has been widely abused. A few descriptive words are fine and actually ideal, but doing more than this can invite penalties.
Source(s): Matt Cutts
Long Internal Link Anchors
At minimum, really long internal anchor text will not bring along with it any additional value – a devaluation. In extreme circumstances, it appears possible to draw Keyword Stuffing webspam penalties from exceedingly lengthy anchor text.
Source(s): Speculation
High Ratio of Links to Text
It’s theorized that just having a site that’s all links and no substance is the mark of a low quality site. This fits the narrative of content quality and of not ranking pages that look too much like search results pages, but is not currently supported by a study as proof.
Source(s): Speculation
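One way to put a number on this idea is to compare the amount of text inside links with the total text on the page. The sketch below uses Python’s standard `html.parser`; the class and function names are ours, and the ratio itself is an illustrative metric, not a documented Google signal.

```python
from html.parser import HTMLParser

class LinkTextCounter(HTMLParser):
    """Count characters of text inside <a> tags versus all text."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0
        self.link_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1

    def handle_data(self, data):
        # Limitation: text inside <script> and <style> is counted too.
        chars = len(data.strip())
        self.total_chars += chars
        if self.anchor_depth > 0:
            self.link_chars += chars

def link_text_ratio(html: str) -> float:
    """Return link text characters divided by all text characters."""
    parser = LinkTextCounter()
    parser.feed(html)
    parser.close()
    if parser.total_chars == 0:
        return 0.0
    return parser.link_chars / parser.total_chars

print(link_text_ratio('<p>Some body text.</p><a href="/x">A link</a>'))
```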
Too Much “List-style” Writing
Matt Cutts has suggested that any style of writing that just lists a lot of keywords could also fit the description of keyword stuffing. Example: listing way too many things, words, wordings, ideas, notions, concepts, keywords, keyphrases, etc. is not a natural form of writing. Too much of this sort of thing will draw devaluations and possibly penalties.
Source(s): Matt Cutts
JavaScript-Hidden Content
Although Google recommends against putting text in JavaScript as it is unreadable by search engines, that does not mean that Google does not crawl JavaScript. In extreme instances where JavaScript may be used to cloak non-JavaScript on-page text, it may still be possible to receive a cloaking penalty.
Source(s): Google
CSS-Hidden Content
One of the first and most well-documented on-page SEO penalties: intentionally hiding text or links from users, especially for the sake of loading the page up with keywords that are just for Google, can invite a nasty penalty. Some leeway appears to be given in legitimate circumstances, like when using tabs or tooltips.
Source(s): Google
Foreground Matches Background
Another common issue that brings about cloaking penalties occurs when the foreground color matches the background color of certain content. Google may use their Page Layout algorithm for this to actually look at a page visually and prevent false positives. In our experience, this can still occur accidentally in a handful of scenarios.
Source(s): Google
Single Pixel Image Links
Once a popular webspam tactic for disguising hidden links, there’s no question that Google will treat “just really small links” as hidden links. This might be done by a 1px by 1px image or just really incredibly small text. If you’re attempting to fool Google using such methods, odds are that they’re going to catch you eventually.
Source(s): Google
Empty Link Anchors
Hidden Links, although often implemented differently than Hidden Text (by means such as empty anchor text), are also likely to invite cloaking penalties. This is dangerous territory and another once widespread webspam tactic, so be sure to double-check your code.
Source(s): Google
Copyright Violation
Publishing content in a manner that is in violation of the Digital Millennium Copyright Act (DMCA) or similar codes outside of the U.S. can lead to a severe penalty. Google attempts to analyze unattributed sources and unlicensed content automatically, but users can go so far as to report possible infringement for manual action to be taken.
Source(s): Google
Doorway Pages
Doorway Pages, or Gateway Pages, are masses of pages created to serve as search engine landing pages without providing value to the user. An example of this would be creating one product page for every city name in America, resulting in what’s known as spamdexing, or spamming Google’s index of pages.
Source(s): Google
Overuse Bold, Italic, or Other Emphasis
At minimum, if you place all the text on your site within a bold tag, for the reason that such text is often given additional weight compared to the rest of the page, you haven’t cracked some code that just makes your whole site rank better. This sort of activity fits Google’s frequent blanket description of “spammy activity”, and we have verified such penalties in our own non-public studies for clients.
Source(s): Matt Cutts
Broken Internal Links
Broken internal links make a site more difficult for search engines to index and more difficult for users to navigate. It’s a tell-tale sign of a low quality website. Make sure your internal links are never broken.
Source(s): Patent US 20080097977 A1, Google via SEL
Redirected Internal Links
The PageRank algorithm carries with it the usual decay when navigating redirects. This is an easy trap to fall into, especially when considering links to “www” vs. “non-www” portions of a site, or addresses with/without a trailing slash.
Source(s): Patent US 6285999 B1, Matt Cutts via SER
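The “www” vs. “non-www” trap can be spotted without making any requests by comparing each internal link’s host with the one you treat as canonical. This sketch assumes the site redirects the non-canonical host to the canonical one; `non_canonical_links` and its parameters are hypothetical names. Trailing-slash mismatches are not covered, since the canonical form depends on the server’s configuration.

```python
from urllib.parse import urljoin, urlparse

def non_canonical_links(page_url: str, hrefs: list[str], canonical_host: str) -> list[str]:
    """Return internal links whose host differs from `canonical_host`
    only by a leading "www." (these would likely pass through a redirect)."""
    bare = canonical_host.lower().removeprefix("www.")
    flagged = []
    for href in hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        host = urlparse(absolute).netloc.lower()
        same_site = host.removeprefix("www.") == bare
        if same_site and host != canonical_host.lower():
            flagged.append(absolute)
    return flagged

links = ["/b/", "https://example.com/c/", "https://other.com/"]
print(non_canonical_links("https://www.example.com/a/", links, "www.example.com"))
```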
Text in Images
Google has come a long way at analyzing images, but on the whole, it’s very unlikely that text that you present in rich media will be searchable in Google. There’s no direct devaluation or penalty when you put text in an image; it just prevents your site from having any chance to rank for these words.
Source(s): Matt Cutts
Text in Video
Just like with images, the words that you use in video can’t be reliably accessed by Google. If you are publishing video, it’s to your benefit to always publish a text transcript such that the content of your video is completely searchable. This is true regardless of rich media format, including HTML5, Flash, SilverLight, and others.
Source(s): Matt Cutts
Text in Rich Media
Google has come a long way at analyzing images, videos, and other formats of media such as Flash, but on the whole, it’s very unlikely that text that you present in rich media will be searchable in Google. There’s no devaluation or penalty here; it just prevents your site from having any chance to rank for these words.
Source(s): Matt Cutts
Frames/Iframes
In the past, search engines were entirely unable to crawl through content located in frames. Though they’ve overcome this weakness to an extent, frames do still present a stumbling point for search engine spiders. Google attempts to associate framed content with a single page, but it’s far from guaranteed that this will be processed correctly.
Source(s): Google
Dynamic Content
Dynamic content can create a number of challenges for search engine spiders to understand and rank. Using noindex and minimizing use of such content, especially where accessible by Google, is believed to result in a more positive overall user experience and likely to draw preferential treatment in rankings.
Source(s): Matt Cutts
Thin Content
Although it’s always been better to write more elaborate content that covers a topic thoroughly, the introduction of Navneet Panda’s “Panda” algorithm established a situation where content with basically nothing of unique value would be severely punished in Google. An industry-wide recognized case study on Dani Horowitz’s “DaniWeb” forum profile pages serves as an excellent example of Panda’s most basic effects.
Source(s): Google, DaniWeb Study
Domain-Wide Thin Content
For a very long time, Google has made an effort to understand the quality and unique value presented by your content. With the introduction of the Panda algorithm, this became an issue that was scored domain-wide, rather than on a page-by-page basis. As such, it’s now usually beneficial to improve the average quality of content in search engines, while using ‘noindex’ on pages that are doomed to be repetitive and uninteresting, such as blog “tag” pages and forum user profiles.
Source(s): Google
Too Many Ads
Pages with too many ads, especially above-the-fold, create a poor user experience and will be treated as such. Google appears to base this on an actual screenshot of the page. This is a function of the Page Layout algorithm, also briefly known as the Top Heavy Update.
Source(s): Google
Use of Pop-ups
Although Google’s Matt Cutts answered no to this question in 2010, Google’s John Mueller said yes in 2014. After weighing both responses and understanding the process behind the Page Layout algorithm, our tie-breaking ruling is also “yes”: using pop-ups can definitely harm your search rankings.
Source(s): Google
Duplicate Content (3rd Party)
Duplicate content that appears on another site can bring about a significant devaluation even when it’s not in violation of copyright guidelines and properly cites a source. This falls in line with a running theme: content that is genuinely more unique and special against a backdrop of the web as a whole will perform better.
Source(s): Google
Duplicate Content (Internal)
Similar to content duplicated from another source, any snippet of content that is duplicated within a page or even the site as a whole will endure a decrease in value. This is an extremely common issue and can creep up from anything ranging from too many indexed tag pages to www vs. non-www versions of the site to variables appended to URLs.
Source(s): Google
Linking to Penalized Sites
This was introduced as the “Bad Neighborhood” algorithm. To quote Matt Cutts: “Google trusts sites less when they link to spammy sites or bad neighborhoods”. Simple as that. Google has suggested using the rel="nofollow" attribute if you must link to such a site. To quote Matt again: “Using nofollow disassociates you with that neighborhood.”
Source(s): MC: Bad Neighbors, MC: Nofollow
Slow Website
Slower sites will not rank as well as faster sites. There are countless tools to assist in performance auditing for both server-side and client-side factors, and they should be used. This factor is executed with the target audience in mind, so seriously consider the geography, devices, and connection speeds of your audience.
Source(s): Google
Page NoIndex
If a page contains the meta tag for “robots” that carries a value “noindex”, Google will never place it in its index. If used on a page that you want to rank, it’s a bad thing. It can also be a good thing when removing pages that will never be good for Google users, elevating the average experience of visitors arriving from Google.
Source(s): Logic
Internal NoFollow
This can appear two ways: if a page contains the “robots” meta tag with the value “nofollow”, it will imply that the rel="nofollow" attribute is added to every link on the page. Or, it can be added to individual links. Either way, this is taken to mean “I don’t trust this”, “crawl no further”, and “do not give this PageRank”. Matt does not mince words here: just never “nofollow” your own site.
Source(s): Matt Cutts
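Both “noindex” and “nofollow” in the “robots” meta tag are easy to leave on a page by accident, so it can be worth scanning templates for them. The sketch below reads the tag with Python’s standard `html.parser`; the class and function names are our own, and it only looks at the meta tag, not at HTTP headers or individual link attributes.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for token in attr_map.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)

def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives

page = '<head><meta name="robots" content="noindex, nofollow"></head>'
print(sorted(robots_directives(page)))
```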
Disallow Robots
If your site has a file named robots.txt in the root directory with a “User-agent:” line naming either “*” or “Googlebot”, followed by a “Disallow: /” statement, your site will not be crawled. This will not remove your site from the index. But it will prevent any updating with fresh content, or positive ranking factors that surround age and freshness.
Source(s): Google
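Whether a given robots.txt blocks Googlebot can be checked offline with Python’s standard `urllib.robotparser`. This is a sketch using a made-up robots.txt and example URLs; Python’s parser follows the general robots exclusion rules and may differ from Googlebot’s own parser in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks Googlebot from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""

def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets Googlebot fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)

print(googlebot_allowed(ROBOTS_TXT, "https://example.com/page"))
```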
Poor Domain Reputation
Domain names maintain a reputation with Google over time. Even if a domain changes hands and you are now running an entirely different web site, it’s possible to suffer from webspam penalties incurred by the poor behavior of previous owners.
Source(s): Matt Cutts
IP Address Bad Neighborhood
While Matt Cutts has gone out of his way to debunk the long-standing practice of “SEO web hosting” on dedicated IP addresses serving any real benefit, this is contradicted by the notion that in rare cases, Google has penalized entire server IP ranges where they might be associated with a private network or bad neighborhood.
Source(s): Matt Cutts
Meta or JavaScript Redirects
A classic SEO penalty that isn’t too common anymore; Google recommends not using meta-refresh and/or JavaScript timed redirects. These confuse users, induce bounce rates, and are problematic for the same reasons as cloaking. Use a 301 (if permanent) or 302 (if temporary) redirect at the server level instead.
Source(s): Google
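Meta-refresh redirects tend to linger in old templates, and they can be found by scanning page source for the tag. The sketch below uses Python’s standard `html.parser` with names of our own choosing. It does not detect JavaScript timed redirects, which would require executing or analyzing scripts.

```python
from html.parser import HTMLParser

class MetaRefreshFinder(HTMLParser):
    """Collect the content of every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.refresh_values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("http-equiv", "").lower() == "refresh":
            self.refresh_values.append(attr_map.get("content", ""))

def find_meta_refresh(html: str) -> list:
    """Return the content values of all meta-refresh tags in `html`."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.refresh_values

print(find_meta_refresh('<meta http-equiv="refresh" content="5; url=/new">'))
```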
Text in JavaScript
While Google continues to improve at crawling JavaScript, there’s still a fair chance that Google will have trouble crawling content that’s printed using JavaScript, and further concern that Googlebot won’t fully understand the context of when it gets printed and to whom. While printing text with JavaScript won’t cause a penalty, it’s an undue risk and therefore a negative factor.
Source(s): Matt Cutts
Poor Uptime
Google can’t (re)index your site if they can’t reach it. Logic also would dictate that a site that’s unreliable also leads to a poor Google user experience. While one outage is unlikely to be devastating to your rankings, achieving reasonable uptime is important. One or two days should be fine. More than this will cause problems.
Source(s): Matt Cutts
Private Whois
While it’s often pointed out that Google can’t always access whois data from every registrar, Matt Cutts made it clear at PubCon 2006 that they were still looking at this data, and that private whois, when combined with other negative signals, may lead to a penalty.
Source(s): Matt Cutts
False Whois
Similar to private whois data, it’s been made clear that representatives from Google are aware of this common trick and treating it as a problem. If for no reason other than it being a violation of ICANN guidelines, and potentially allowing a domain hijacker to steal your domain via a dispute without you getting a say, don’t use fake information to register a domain.
Source(s): Matt Cutts
Penalized Registrant
If you subscribe to the notion that private and false whois records are bad, and take into account that Matt Cutts has discussed using this as a signal that identifies webspam, it stands to reason that a domain owner can be flagged and penalized across numerous sites. This is unconfirmed and purely speculative.
Source(s): Speculative
ccTLD in Global Ranking
ccTLDs are country-specific domain suffixes, such as .uk and .ca. They are the opposite of gTLDs, which are global. These are useful in executing international SEO, but can be equally problematic when attempting to rank outside of these countries. An exception to this rule is that a small number of ccTLDs, such as .co, have been widely used for other purposes and have been labeled by Google as “gccTLDs”.
Source(s): Google
Too Many Internal Links
Matt Cutts once stated that there was a hard limit of 100 links per page, which was later retracted to say “keep it at a reasonable number”. This was because Google once would not download more than 100K of a single page. That’s no longer true, but since every link divides your distribution of PageRank, this potential makes sense without any altered understanding of how Google works.
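The “every link divides your PageRank” point follows from the formula in the original PageRank paper, where a page passes a damped share of its score split evenly across its outgoing links. The sketch below shows only that arithmetic. The 0.85 damping value is the one suggested in the original paper, and Google’s current values and any later adjustments are not public.

```python
DAMPING = 0.85  # value suggested in the original PageRank paper; an assumption today

def pagerank_per_link(page_rank: float, link_count: int, damping: float = DAMPING) -> float:
    """Share of a page's PageRank passed through each link,
    under the simplified original formula: damping * PR / number_of_links."""
    if link_count <= 0:
        return 0.0
    return damping * page_rank / link_count

for links in (10, 100, 500):
    print(links, pagerank_per_link(1.0, links))
```

Under this simplified model, going from 10 links to 100 cuts the share passed by each link by a factor of ten.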
Too Many External Links
As a simple function of the PageRank algorithm, it’s possible to “leak PageRank” out from your domain. Note, however, that the negative factor here is “too many” external links. Linking out to a reasonable number of external sites is a positive ranking factor that’s confirmed by Mr. Cutts in the same source article for this factor.
Source(s): Matt Cutts
Invalid HTML/CSS
Matt Cutts has said no to this being a factor. Despite this, our experience has consistently indicated yes. Code likely doesn’t have to be perfect and this may be an indirect effect. But the negative effects of bad code are supported by logic as you consider other code-related factors (hint: there’s a code filter up top). Bad code can cause countless, potentially invisible issues including tag usage, page layout, and cloaking.
Source(s): Matt Cutts
Outbound Affiliate Links
Google has vocally taken action against affiliate sites that provide ‘no additional value’ in the past. It’s in the guidelines. There’s much SEO paranoia that surrounds hiding affiliate links using a 301 redirect in a directory blocked by robots.txt, although Google can view HTTP headers without navigating. A number of affiliate marketers have reported reasonably scientific case studies of penalties from too many affiliate links; therefore, we rate this as likely.
Source(s): Google, Affiliate Marketer’s Study
Parked Domain
A parked domain is a domain that does not yet have a real website on it, often sitting unused at a domain registrar with nothing but some machine-generated advertising. Today, such a domain fails to meet so many other ranking criteria that it probably wouldn’t have much success in Google anyway, although parked domains once had some. Google has repeatedly made it clear that they don’t want to rank parked domains of any kind.
Source(s): Google
Search Results Page
Generally speaking, Google wants users to land on content, not other pages that look like listings of potential content, like the Search Engine Results Page (SERP) that such a user just came from. If a page looks too much like a search results page, by functioning as just an assortment of more links, it’s likely to not rank as well. This may also apply to blog posts outranking tag/category pages.
Source(s): Matt Cutts
Automatically Generated Content
Machine-generated content that’s based upon a user’s search query will “absolutely be penalized” by Google and is considered a violation of the Google Webmaster Guidelines. There are a number of methods that could qualify which are detailed in the Guidelines. One exception to this rule appears to be machine-generated meta tags.
Source(s): Matt Cutts, Webmaster Guidelines
Too Many Footer Links
It’s been made very clear that links tucked into the footer of a site don’t carry the same weight as those in an editorial context. It’s also true that when Google first began speaking about their actions against paid link schemes, the practice of spamming site footers with dozens of paid external links was widespread, and therefore too many external footer links can draw that sort of penalty.
Source(s): Matt Cutts
Infected Site
Many website owners would be surprised to know that most compromised web servers are not defaced. Often, the offending party will actually go so far as to patch your security holes to protect their newfound property, without you ever knowing. This will then manifest itself in the form of malicious activity enacted on your behalf such as virus/malware distribution and further exploits, which Google takes very seriously.
Source(s): Webmaster Guidelines
Phishing Activity
If Google might have reason to confuse your site with a phishing scheme (such as one that aims to replicate another’s login page to steal information), prepare for a world of hurt. For the most part, Google simply uses a blanket description of “illegal activity” and “things that could hurt our users”, but in this interview, Matt specifically mentions their anti-phishing filter.
Source(s): Matt Cutts
Outdated Content
A Google patent exists surrounding stale content, which is identified in a variety of ways. One such method for defining stale content basically just surrounds being old. What is unclear is whether this factor harms rankings on all queries, or simply when a particular search query is associated with something Google refers to as Query Deserves Freshness (QDF), which means exactly what it sounds like.
Source(s): Patent US 20080097977 A1
Orphan Pages
Orphan pages, meaning pages of your site that are difficult or impossible to find using your internal link architecture, can be treated as Doorway Pages and act as a webspam signal. At minimum, such pages likely do not benefit from internal PageRank, and therefore have far less authority.
Source(s): Google Webmaster Central
Sexually Explicit Content
While Google does index and return X-rated content, it’s not available when their Safe Search feature is turned on, which is Google’s default state. It’s therefore reasonable to consider that unmoderated user-generated content or one-time content that inadvertently crosses a certain line may be blocked by the Safe Search filter.
Source(s): Google Safe Search
Selling Links
Matt Cutts presents a case study where the toolbar PageRank of a domain decreased from seven to three as a direct result of outbound paid links. As a violation of Google’s Webmaster Guidelines, it appears that directly selling links that pass PageRank can lead to penalties on both the on-page and off-page ends of a site.
Source(s): Matt Cutts
Subdomain Usage
Subdomains (thing.yoursite.com) are often viewed as separate websites by Google, as compared to subfolders (yoursite.com/thing/), which are not. This can be negative in a number of ways as it relates to other factors. One such scenario would involve a single, topical site with many subdomains, not benefiting from factors on this page that have “domain-wide” in their names.
Source(s): Matt McGee and Paul Edmondson
Number of Subdomains
The number of subdomains on a site appears to be the most significant factor in determining whether subdomains are each treated as their own sites. Using an extremely large number of subdomains, although not a terribly easy thing to do by mistake, could theoretically cause Google to treat one site like many sites, or many sites like one site.
Source(s): Speculation
HTTP Status Code 4XX/5XX on Page
If your web server returns pretty much anything other than a status code of 200 (OK) or 301/302 (redirect), it is implying that the appropriate content was not displayed. Note that this can happen even if you are able to view the intended content yourself in your browser. In cases where content is actually missing, it’s been clarified by Google that a 404 error is fine and actually expected.
Source(s): Speculation
Domain-wide Ratio of Error Pages
Presumably, the possibility for users to land on pages that return 4XX and 5XX HTTP errors is a sure mark of an overall low-quality website. We speculate this is a problem in addition to pages that are not indexed due to carrying such an HTTP header, and pages that include broken outbound links.
Source(s): Speculation
Code Errors on Page
Presumably, if a page is full of errors generated by PHP, Java, or other server-side language, it meets Google’s definitions of a poor user experience and a low quality site. At absolute minimum, error messages within the page text likely interfere with Google’s overall analysis of the text on the page.
Source(s): Speculation
Soft Error Pages
Google has repeatedly discouraged the use of “soft 404” pages or other soft error pages. These are basically error pages that still return HTTP code 200 in the document headers. Logically, this is difficult for Google to process correctly, and even though your users see an error page, Google may, at minimum, treat these as actual low-quality pages on your site, significantly lowering how the overall quality of your domain’s content is scored.
Source(s): Google
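A soft error page can be flagged by pairing the status code with a look at the page body. The sketch below is a heuristic of our own: the phrase list is hypothetical and would need tuning per site, and it says nothing about how Google itself detects soft 404s.

```python
# Hypothetical phrases that suggest an error page; tune these for your own site.
ERROR_PHRASES = ("page not found", "no longer available", "an error occurred")

def classify_response(status_code: int, body: str) -> str:
    """Label a response as a hard error, redirect, possible soft error, or ok."""
    if 400 <= status_code < 600:
        return "hard error"
    if status_code in (301, 302):
        return "redirect"
    if status_code == 200:
        text = body.lower()
        if any(phrase in text for phrase in ERROR_PHRASES):
            return "possible soft error"
        return "ok"
    return "other"

print(classify_response(200, "<h1>Page Not Found</h1>"))
print(classify_response(404, "<h1>Page Not Found</h1>"))
```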
Outbound Links
On some level, something known as “PageRank leakage” does exist: you only have so many “points” to distribute, and “points” that leave your site cannot circle immediately back. But Matt Cutts has confirmed that there are other controls that specifically reward some genuinely relevant and authoritative outbound links. Websites are meant to be intersections, not cul-de-sacs.
Source(s): Matt Cutts, Nicole V. Beard
HTTP Expires Headers
Setting “Expires” headers with your web server can control browser caching and improve performance. Unfortunately, depending on how they’re wielded, they can also cause problems with search indexing, by telling search engines that content will not be fresh again for potentially a long time. In all cases, they may tell Googlebot to go away for longer than desired, as their analysis seeks to emulate a real user experience.
Source(s): Moz Discussion
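To see how far into the future an “Expires” header points, the header’s HTTP date can be parsed with Python’s standard `email.utils`. This is a sketch with a made-up header value; what counts as “too long” is a judgment call we leave to the reader.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def days_until_expiry(expires_header: str, now: datetime) -> float:
    """Return the number of days between `now` and the Expires header date."""
    expires_at = parsedate_to_datetime(expires_header)
    return (expires_at - now).total_seconds() / 86400

# Hypothetical header value and reference time, both in UTC.
reference = datetime(2025, 12, 31, tzinfo=timezone.utc)
print(days_until_expiry("Thu, 01 Jan 2026 00:00:00 GMT", reference))
```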
Sitemap Priority
Many theorize that the “priority” attribute assigned to individual pages in an XML sitemap has an impact on crawling and ranking. Much like other signals that you might hand to Google via Webmaster Tools, it seems unlikely that some pages would really rank higher just because you asked, and the attribute is mainly useful as a signal to de-prioritize less important content.
Source(s): Sitemaps.org
Sitemap ChangeFreq
The ChangeFreq variable in an XML sitemap is intended to indicate how often the content changes. It’s theorized that Google may not re-crawl content faster than you tell it the content is changing. It’s unclear, however, whether Google actually follows this attribute; if they do, it seems that it would yield a similar result to adjusting the crawl speed in Google Webmaster Tools.
Source(s): Sitemaps.org
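To review what a sitemap currently declares for “priority” and “changefreq”, the file can be read with Python’s standard `xml.etree.ElementTree`. The sample sitemap and the function name `read_sitemap` are made up for illustration. The 0.5 default priority matches the default described by the Sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap for illustration.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/tags/misc</loc>
    <changefreq>yearly</changefreq>
    <priority>0.1</priority>
  </url>
</urlset>
"""

def read_sitemap(xml_text: str) -> list:
    """Return loc, changefreq, and priority for each <url> entry."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        entries.append({
            "loc": url.findtext("sm:loc", default="", namespaces=SITEMAP_NS),
            "changefreq": url.findtext("sm:changefreq", default=None, namespaces=SITEMAP_NS),
            "priority": float(url.findtext("sm:priority", default="0.5", namespaces=SITEMAP_NS)),
        })
    return entries

for entry in read_sitemap(SAMPLE_SITEMAP):
    print(entry)
```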
Keyword-Stuffed Meta Description
It’s theorized that, even though Google now tells us that they don’t use meta descriptions in web ranking, only for ads, it may still be possible to send webspam signals to Google if there’s an apparent attempt to abuse the tag.
Source(s): Speculation
Keyword-Stuffed Meta Keywords
Since 2009, Google has said that they don’t look at meta keywords at all. Despite this, the tag is still widely abused by people who don’t understand or believe that idea. It’s theorized that because of the latter fact, this tag may yet serve to send webspam signals to Google.
Source(s): Matt Cutts
Spammy User-Generated Content
Google should single out problems appearing in the user-generated portions of your site and issue very targeted penalties in such a context. This is one of few circumstances where a warning may appear in Google Webmaster Tools. We’re told these penalties are usually limited to certain pages. We’ve found that WordPress trackback spam appearing in a hidden DIV is one way that this penalty can creep up undetected.
Source(s): Matt Cutts
Foreign Language Non-Isolation
Obviously, if you write in a language that doesn’t belong to your target audience, almost no positive, on-page factors can work their charm. Matt Cutts admits that improperly isolated foreign language content can be a stumbling point both for search spiders and for users. To not interfere with positive ranking factors, Google needs to be able to interrelate content on the page as well as sections of a site.
Source(s): Matt Cutts
Auto-Translated Text
Using Babelfish or Google Translate to rapidly “internationalize” a site is a surprisingly frequent practice for something that Matt Cutts explicitly states is a violation of their Webmaster Guidelines. For those fluent in Google-speak, that usually means “it’s not just a devaluation, it’s a penalty, and probably a pretty bad one”. In a Google Webmaster video, Matt categorizes machine translations as “auto-generated content”.
Source(s): Matt Cutts
Missing Robots.txt
As of 2015, Google Webmaster Tools advises site owners to add a robots.txt file to their site when one is missing. This has led many to theorize that a missing robots.txt file is bad for rankings. We consider this odd, since Google Search’s John Mueller advises removing robots.txt entirely when Googlebot is entirely welcome. We chalk this myth up to department miscommunication.
Source(s): John Mueller via SER
Positive Off-Page Factors
Off-Page Factors describe events that take place somewhere other than on the site that you directly control and are trying to improve performance of in the rankings. This usually takes the form of backlinks from other sites. Positive Off-Page Factors generally relate to an attempt to understand honest, natural popularity, with a large emphasis on popularity achieved from more-trusted and influential sources.
Positive Off-Page Factors
Link Stability
Backlinks appear to gain value as they age. Speculatively, this may be because spam links get moderated and paid link schemes eventually expire. Therefore, backlinks existing for longer periods are worth more. This is also supported by a patent.
Source(s): Patent US 8549014 B2
Keyword Anchor Text
The anchor text used in an external link will help establish relevance of a page towards a search term. The target page does not need to contain this term to rank (see: Google Bombing).
Source(s): Patent US 8738643 B1
Links from Relevant Sites
Links from sites that cover similar material to yours are expected. Contrary to popular misconception and a number of highly destructive link building/unbuilding schemes, not every link to your site needs to come from a domain that’s only dedicated to a subject. This would appear very unnatural. But so would never being a part of industry specific discussions. This is a function of the Hilltop algorithm.
Source(s): Krishna Bharat
Partially-Related Anchor Text
When a backlink portfolio is earned naturally, as it’s supposed to be, not everybody links to a site the same way. Anchor text that includes portions of a keyword phrase, or a keyword phrase plus something else is expected. Google’s patents refer to this as “partially-related” anchor text, though SEOs more often call it “partial match”.
Source(s): Patent US 8738643 B1
Partially-Related ALT Text
Just like partial match anchor text, the ALT attribute of images is something that varies in nature and appears to still carry with it an increase in weight for phrases that contain certain words. This is unconfirmed by Google, but can be proven by very simple experimentation on very non-competitive queries, such as using made-up words. Google’s patents refer to this as “partially-related” anchor text, though SEOs more often call it “partial match”.
Source(s): Patent US 8738643 B1
Keyword Link Title
It was long theorized that the “title” attribute of a link might be treated similarly to anchor text, giving additional weight to certain words. At PubCon 2005, Google directly dispelled any such possibility, saying that not enough people used this attribute. Various real world studies appear to confirm that “title” is indeed not a factor.
Source(s): Ann Smarty via SEJ
Keyword ALT Text
Keywords used in the ALT attribute of an image are treated as anchor text. Short, genuinely descriptive ALT tags also improve overall accessibility and have an exceedingly strong impact on images appearing in-line with searches from Google Image Search.
Source(s): Patent US 8738643 B1, Matt Cutts
Context Surrounding Link
For quite some time, it’s been established that the text surrounding a link, in addition to the anchor text within, is considered in evaluating context. Support for this theory is reinforced by a patent and simple experimentation. Therefore, links in text are likely to provide more value than a stand-alone link that’s detached from context.
Source(s): Patent US 8577893, SEO By The Sea
Link From Site in Same Results
In the Google Patent “Ranking search results by reranking the results based on local inter-connectivity” (insert programmer joke about recursion), Google describes a process in which having a backlink from a site that already ranks for a certain search query can increase your own weight for that particular search query by more than it would otherwise.
Source(s): Patent US 6526440 B1
Click Through Rate on Query/Page
It’s been heavily theorized that Click Through Rate from the results page is a ranking factor. It’s a Bing ranking factor. Matt glossed over ranking implications in 2009. Repeatedly, Rand Fishkin has used Twitter to lead experiments which look surprisingly conclusive at confirming that CTR is a ranking factor.
Source(s): Moz Study, Patent US 9031929 B1
Click Through Rate on Domain
A patent by Navneet Panda (of the Panda algorithm) describes assigning site quality scores based on CTR for various searches. The title of this patent is literally “Site quality score”. It also speaks of branded search queries, followed by clicks as the primary method. Still, these factors, in addition to evidence for search query CTRs as a factor, seem to suggest that sitewide CTR may be a factor.
Source(s): Patent US 9031929 B1
Low Bounce Rate
It’s been theorized that Google looks at search user bounce rate as a ranking factor. Even without Google Analytics or Chrome data this could be easily measured in several ways. Matt Cutts says no, and that tracking how long users remain on a page would be “spammable and noisy”. Yet, SEO Black Hat and Rand Fishkin have run studies that indicate otherwise, and Bing’s Duane Forrester has clearly confirmed that Bing uses it; a factor that they call “dwell time”.
Query Deserves Freshness (QDF)
Google doesn’t rank every search query the same way. Certain search queries, especially those that are news-related, are especially sensitive to the freshness of content that they will publish (and may only rank content that is recent). Google’s term for this is Query Deserves Freshness (QDF).
Source(s): Matt Cutts, Amit Singhal
Query Deserves Sources (QDS)
A phrase that we’ve coined to cover a scenario described in Google’s Quality Rater Guidelines, used when humans conduct quality control on Google search results. This asks: “this is a topic where expertise and/or authoritative sources are important”. Presumably, this applies to all informational search queries (in contrast to transactional and navigational queries).
Source(s): Barry Schwartz
Query Deserves Oldness (QDO)
This is a phrase that we made up to describe a situation detailed in a Google patent. It’s specifically noted that: “For some queries, older documents may be more favorable than newer ones.” The patent then goes on to describe the process in which documents would be ranked by their age, as a function of the average age of results for that query.
Source(s): Patent US 8549014 B2
Query Deserves Diversity (QDD)
Certain search queries are ranked differently by Google. One theory is called Query Deserves Diversity, likely dependent on a concept called entity salience by attaching meaning to the same word with differing definitions. As a bit of a riff on the concept of Query Deserves Freshness, this would be similar to a Wikipedia disambiguation page, where the search query is vague and a variety of result types are needed at the top of the results. Unconfirmed, but easily replicated.
Source(s): Rand Fishkin
Safe Search
In certain circumstances where adult content may be involved, a site may or may not rank based completely on whether or not Safe Search is enabled in Google’s settings. By default, Safe Search is turned on.
Source(s): Google
Use AdWords
SEO paranoia seems to prevent this myth from dying. There are no credible studies that we have encountered that suggest AdWords will improve rankings in any way. AdWords influencing organic rankings runs counter to Google’s core philosophies, and nobody is more vigilant about speaking out against this myth than Google.
Source(s): Matt Cutts
Don’t Use AdWords
Just as using AdWords is allegedly a ranking factor in some very un-scientific circles, so is not using AdWords. The notion that AdWords can have any influence on Google’s organic rankings in any way, now or in the future, has been dispelled by Google maybe more aggressively than any other SEO myth.
Source(s): Matt Cutts
Chrome Page Bookmarks
Although directly denied by Matt Cutts, this was affirmed at the 2013 BrightonSEO conference during the ex-Googler fireside. It’s also suggested by a Google Patent, which states: “Search engine may then analyze over time a number of bookmarks/favorites to which a document is associated to determine the importance of the document.”
Source(s): Matt Cutts via SER, BrightonSEO Fireside, Patent US 20070088693
Chrome Site Traffic
Also denied by Google, the Patent “Document scoring based on traffic associated with a document” also touches on using browser traffic data for the purposes of ranking sites, stating: “information relating to traffic associated with a document over time may be used to generate (or alter) a score associated with the document.”
Source(s): Patent US 20070088693, Lifehacker Analysis
User Search History
It’s common to be served personalized search results based on your search history unless you have specifically disabled this feature in Google. As of 2009, signing into a Google account is not a requirement for being served results that are personalized based upon your recent search history.
Source(s): Brian Horling
Google Toolbar Activity
Just as Matt Cutts stated that Google Chrome data is not used in determining rankings in Google’s organic search results, the same was said for the Google Toolbar. Despite this, it’s widely reported by SEOs, which may relate to a Google Patent that directly discusses a method of doing exactly this via a browser plugin.
Source(s): Matt Cutts via SER, Patent US 20070088693
Low Alexa Score
While there are patents and speculation that suggest that Google could theoretically look at site traffic as a ranking factor, there’s absolutely no evidence to support that they are doing so using Alexa at present. In what documentation does exist, it’s suggested that they would do this using Chrome data, which by the way, they’ve totally cleared themselves to do.
Source(s): Patent US 20070088693
Total Branded Searches + Clicks
Navneet Panda’s patent titled “site quality score” describes a scenario where navigational brand searches in Google (such as “Northcutt contact page”) contribute to a domain-wide quality score. It states: “The score is determined from quantities indicating user actions of seeking out and preferring particular sites and the resources found in particular sites.”
Source(s): Patent US 9031929 B1
High Dwell Time (Long Clicks)
The “Site quality score” patent describes a scenario that rewards branded searches + clicks as a ranking factor. As a part of their methods, it also states: “Depending on the configuration of the system …. a click of at least a certain duration, or a click of at least a certain duration relative to a resource length, for example, may be treated by the system as being a user selection.” It’s also supported by several other sources and used by Bing and Yahoo.
Source(s): Patent US 9031929 B1, Bill Slawski
Submit Site to Google
Google has long had a tool that allowed you to submit your site to be crawled. A long-standing myth is that this provides any ranking benefits whatsoever. In fact, in cases where a site is not even in the index, it almost appears to be a placebo button. For your site to rank, on top of Google simply being aware of it, Google will need to instead find it using some worthwhile links.
Source(s): Google
Submit Sitemap Tool
It’s possible to submit an XML Sitemap to Google using Google Webmaster Tools. This does appear to get more pages into the index in some cases, but for similar reasons as the raw “submit site” concept is not ideal, neither is the “Submit Sitemap”. If Google couldn’t find them on its own, they’re likely doomed never to rank. And as Rand Fishkin points out, this tool stops many diagnostic processes cold.
Source(s): Rand Fishkin
International Targeting Tool
Google Webmaster Tools provides a tool for international targeting when it may not be done correctly otherwise, mainly for use with a generic TLD like “.com”, or “gccTLDs” like .co that were intended for a particular country, but widespread use has caused Google to treat them more generically. This can help with rankings in certain countries in certain situations.
Source(s): Google
Reconsideration Requests
Google’s reconsideration request tool is generally the answer to a manual action. This tool essentially petitions Google to have someone manually review a site to determine whether or not a manually placed penalty should be removed. Considering that manual actions make up an extremely small portion of negative ranking factors, this tool should rarely be necessary.
Source(s): Google
Links from ccTLDs in Target Country
Google uses Country Code Top Level Domains (ccTLDs) to establish that a site is relevant to a certain country. It’s widely accepted that backlinks from a particular country’s ccTLDs will improve Google rankings for such a country.
Source(s): Google
Links from IP Addresses in Target Region
Google has told us that having a server near to your target audience, on a broad, international scale, will improve rankings for that audience. It’s also known that a number of other factors serve to establish geographic relevance: proven by simply comparing results from Google.com and Google.co.uk. Therefore, it’s theorized that as with most things, Google analyzes those that link to your site using the same tools as they have confirmed are used to analyze your site.
Source(s): Matt Cutts
Negative Off-Page Factors
Negative Off-Page Factors are generally related to unnatural patterns of backlinks to your site, usually due to intentional link spam. Until the Penguin algorithm was introduced in 2012, the result of these factors was almost always a devaluation, rather than a penalty. That is, you could lose all, or nearly all, value obtained from linking practices that Google felt may be unnatural, but your site would not be harmed otherwise. While that’s still mostly true, Penguin introduced off-page penalties in a number of cases, which has opened the floodgates for malicious behavior from competing sites as a practice known as negative SEO or Google Bowling.
Negative Off-Page Factors
Excessive Cross-Site Linking
When owning multiple sites, it’s discouraged to inter-link them for the purpose of inflating your inbound link authority. Risk increases with the number of inter-linked domains. Common ownership may be detected by domain registrant, IP address, similarity of content, similarity of design, and rarely, identified and penalized as part of a manual action. Exception made for Internationalization or “when there’s a really good reason, for users, to do it”.
Source(s): Matt Cutts
Negative SEO (Google Bowling)
Negative SEO, historically dubbed “Google Bowling”, is the act of malicious link spam conducted on behalf of your site by a third party. This was once very difficult, since we lived in a world of off-page devaluations, rather than off-page penalties. If a devaluation were to occur, a competitor could only exaggerate existing schemes, causing value to be lost sooner or more assuredly. If off-page penalties exist, which they do, negative SEO is proven by logic alone.
Source(s): Matt Cutts
Paid Link Schemes
Links can’t be purchased directly from a website owner for the sole purpose of passing PageRank. Matt Cutts states that this is directly inspired by the FTC’s guidelines on paid endorsements. To phrase this another way, backlinks are viewed as endorsements, and genuine endorsements are supposed to happen without direct compensation.
Source(s): Google, Matt Cutts
Fresh Anchor Text
The age of anchor text used in a link, specifically anchor text that appears to be changing on another site, can signify a problem. Speculatively, this implies that the link is not actually from a third party and/or an active experiment in ranking manipulation.
Source(s): Patent US 8549014 B2
Unnatural Ratio of Anchor Text
To an extent, the anchor text used in links establishes relevance of the subject matter. As with every SEO tactic the community abused this to the point they were able, and controls were put in place for when the limits were pushed well beyond what occurs without manipulation. That threshold may be as simple as a flat 10% of a particular anchor text. This is a function of the Penguin algorithm.
Source(s): Penguin 1.0 Announcement, Moz Study
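A quick way to check a backlink profile against a threshold like this is to compute each anchor text's share of all links. The sketch below is only an illustration: the 10% cutoff is the speculative figure mentioned above, not a confirmed Google value, and the sample anchors, the `exempt` parameter for brand and URL anchors, and the function names are all hypothetical.

```python
from collections import Counter

def anchor_text_ratios(anchors):
    """Map each normalized anchor text to its share of all backlinks."""
    normalized = [a.strip().lower() for a in anchors]
    counts = Counter(normalized)
    total = len(normalized)
    return {text: count / total for text, count in counts.items()}

def over_threshold(anchors, threshold=0.10, exempt=()):
    """Return anchors whose share exceeds `threshold`, skipping exempt anchors.

    The 0.10 default is the speculative figure from the text, not a known limit.
    """
    exempt_set = {e.strip().lower() for e in exempt}
    ratios = anchor_text_ratios(anchors)
    return sorted(t for t, r in ratios.items() if r > threshold and t not in exempt_set)

# Hypothetical backlink profile: 20 links in total.
sample = (["Acme Widgets"] * 8 + ["acme.example"] * 5 + ["click here"] * 2
          + ["cheap blue widgets"] * 3 + ["widgets"] * 2)

# Brand-name and URL anchors are treated as natural and exempted here.
flagged = over_threshold(sample, exempt=["Acme Widgets", "acme.example"])
print(flagged)
```

In this sample only the keyword-rich anchor at 15% is flagged; the anchors sitting at exactly 10% are not, because the comparison is strictly greater than the threshold.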
Unnatural Ratio of Anchor Type
Just as the Moz study showed us a high ratio of one anchor, repeatedly reproduced on our work on Penguin-penalized sites, the same can be said for sites that use too much anchor text overall. Analyzing backlinks across popular brands shows high amounts of brand name anchor text, “click here” anchors, URL anchors, and banners. Pushing too far beyond the limits of what occurs naturally invites devaluations, and since Penguin, potential for penalties.
Source(s): Speculative
Unnatural Variety of Linking Sites
If you subscribe to the notion that Google is ultimately watching for natural trends, and you accept the studies done post-Penguin on sites that were severely penalized for carrying an anchor text greater than 10%, you may also subscribe to the notion that any type of unnatural ratio of off-page activity at scale can hurt you. Although no public case study is available at the time of writing this, we have repeatedly witnessed those practicing otherwise successful black hat SEO getting greedy, taking their scheme too far, and being penalized.
Source(s): Speculation
Webspam Footprints
A “footprint” is an off-page SEO term that describes virtually anything that Google might use to identify activity originating from a common source. This might be a forum username, a person’s name, a photo, a guest author biography snippet, some element of a WordPress theme that’s involved in a private blog network, or just about any subtle detail that relates the efforts of a webspam activity. Obviously, a footprint is not always bad, but if a site even slightly runs afoul of Google’s Webmaster Guidelines, footprints are often a factor that brings about penalties.
Source(s): Matt Cutts via SEL
Comment Spam
If you engage in blog comment spam – that is, commenting en masse in a repetitive, unnatural format – expect to see these links devalued or penalized as a link scheme. Especially if your commenting is machine-driven, with odd keyword anchor text, or leaving behind a footprint of irrelevant or repetitive content. Genuine commentary, on the other hand, is fine and actually encouraged. Mr. Cutts suggests using your real name in such circumstances for good measure.
Source(s): Matt Cutts
Forum Post Spam
Forum posts, like blog comments, are fine and actually good inbound marketing when they add to a conversation and are done for humans rather than search spiders. John Mueller confirms (amongst countless other sources that have appeared on this one over the years), that they are systematically monitoring for link schemes in the form of bulk forum spam.
Source(s): John Mueller via SER
Advertorials (Native Advertising)
Advertorial content, also known as Native Advertising, is systematically sought out by Google’s webspam team, and is considered a paid link. Links in advertorials should be disclosed and given the rel=”nofollow” attribute to avert potential for penalties. Presumably, this is a case where “nofollow” is definitely respected. Undisclosed advertorials can also get an entire publication delisted from Google News.
Source(s): Matt Cutts
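To audit a sponsored article for links that still pass PageRank, one option is to list every link that lacks a "nofollow" token in its rel attribute. The sketch below uses Python's standard `html.parser`; the class name, helper function, and sample markup are made up for illustration, and it checks only the rel attribute, not whether the sponsorship is disclosed.

```python
from html.parser import HTMLParser

class FollowedLinkFinder(HTMLParser):
    """Collect href values of <a> tags that lack a "nofollow" rel token."""

    def __init__(self):
        super().__init__()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # rel may hold several space-separated tokens, e.g. "nofollow sponsored".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            self.followed.append(href)

def followed_links(html):
    """Return the hrefs in `html` that are not marked nofollow."""
    finder = FollowedLinkFinder()
    finder.feed(html)
    return finder.followed

# Hypothetical advertorial snippet with one compliant and one followed link.
advertorial = """
<p>Sponsored: try <a href="https://sponsor.example/" rel="nofollow">Sponsor</a>
or <a href="https://other.example/offer">this offer</a>.</p>
"""
print(followed_links(advertorial))
```

Any URL this prints is a candidate for adding the attribute before the piece is published.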
Forum Signature & Profile Links
Google seems to decipher what links appear as forum signatures as compared to what links appear as a part of a natural discussion, which would be treated as editorial context and would likely actually be given PageRank. The same appears true for the popular webspam tactic of creating forum profiles. Executed en masse, it seems that both tactics eventually progress from adding very little granular value to an eventual webspam penalty.
Source(s): Google
Inbound Affiliate Links
Before we speculate: inbound affiliate links are often affected by PageRank decay via 301 redirects and duplicate content devaluations as the result of URL variables. There’s speculation that inbound affiliate links may be devalued for similar reasons as paid link penalties exist, intentionally or unintentionally. Matt Cutts has suggested using “nofollow” on outbound affiliate links “if you’re worried about paid links” but also has indicated that they’re “usually fine”.
Source(s): Matt Cutts, Speculation
Footer Links
It’s been made very clear that links tucked into the footer of a site don’t carry the same weight as those in an editorial context. This concept is supported by how the Page Layout algorithm works, but it also seems that links in the footer of a site are treated even worse than content that’s just below-the-fold, as Google has specifically spoken out against using too many on more than one occasion.
Source(s): Matt Cutts
Header, Sidebar Links
Like footer links, Google appears to single out links that appear in the header or sidebar of a site (whether or not they are static, sitewide links), defining this area as “boilerplate” in their patents. Specifically, the patent language states: “the article is indexed after the boilerplate has been removed; the resultant weighting may be more accurate since it relies relatively more heavily on non-boilerplate.”
Source(s): Patent US 8041713 B2
WordPress Sponsored Themes
On top of the low value that sitewide footer links now carry, it definitely seems that Google’s webspam team is well aware of the once powerful and now mostly useless tactic of producing WordPress themes with backlinks in them. Such efforts definitely leave an obvious spammy footprint, with similarities to the GWG widget example, and it’s clear that Google isn’t having it.
Source(s): Matt Cutts
Widget Links
This was once a pretty fun link scheme that, while mostly harmless and actually value-added for a lot of users, also failed to fit into a world where links are only used as genuine endorsements. While it seems that you can still distribute widgets in 2015, Google does request using “nofollow” on the links, and not applying anchor text. Google being Google, this also almost assuredly means bad things can happen if you don’t.
Source(s): Google
Author Biography Links
Every time a link building tactic becomes a little too easy to spam, Google devalues it. It’s not “dead”. But “guest posting” in 2010 was near identical to “article marketing” from 2005 for too many. These schemes resulted in the author biography section of blogs and articles being given less weight. Simple as that. And contrary to popular myth, brands are not punished for “for human” guest posts like New York Times Editorials and other genuinely authoritative media placements.
Source(s): Matt Cutts
Link Wheel (Pyramid / Tetrahedron)
You might study Larry Page’s paper on the PageRank algorithm and determine that you could interlink sites in a triangular/circular fashion to repeatedly pass PageRank to the same sites. Some PageRank decay exists, sure, but it’s very gradual. If this were still 2005, you’d have been rewarded for your brilliance. If you come across someone still selling link wheels, pyramids, or triangles in 2015, prepare for significant devaluations and possibly penalties.
Source(s): Matt Cutts via ClickZ
Article Directories
With how far Google has come with punishing domain-wide content scores with Panda and unnatural patterns of links with Penguin, it’s unclear if Google would even need to go out of their way to punish these sites anymore. It seems, however, that they do still single out these article directories as an issue as recently as a 2014 Matt Cutts video, so expect to see longer-term issues if using these methods.
Source(s): Matt Cutts
Generic Web Directories
Generic web directories were one of the earliest link schemes. Matt Cutts goes out of his way to state that they do penalize paid, generic directories as paid links if they are not exercising some editorial discretion. He cites Yahoo!’s paid directory as one that is actually alright. With any link, paid or not, there appears to be a theme: editorial discretion is good, complete free-for-all listings are bad.
Source(s): Matt Cutts via ClickZ
Reciprocal Links
Google has a tendency to devalue reciprocal links, more than the expected “leaking PageRank” type effect that might come from too many outbound links. As a very early link scheme, too many reciprocal links, or pages/sites that link to each other, are a very clear and obvious sign that most of your links were not earned and are not natural, editorial placements.
Source(s): Matt Cutts
Private Network (Link Farms)
Similar to how cross-linking your own sites can draw penalties, so can getting involved with large, private networks of just-for-SEO sites. Google has gotten very aggressive with these sites, taking down entire networks by manual action, with countless automated methods of analyzing their webspam footprints. In 2015 these methods are still widespread as short-term black hat schemes, but on a long enough timeline, every Private Network seems to be dealt with.
Source(s): Matt Cutts via ClickZ
Google Dance
This term describes a temporary shake-up that sometimes accompanies Google’s ~500 algorithm updates per year. Technically, these effects could be positive or negative, since it’s just rearranging rankings and someone has to go up for another to go down. But since a Google Dance is always unexpected, we’re classifying it as a negative.
Source(s): Danny Sullivan
Manual Action
In spite of every other ranking factor, Google’s webspam team will still occasionally take manual action against certain sites which can take half a year to a year to recover from after you’ve cleaned up the problems. Often, these penalties come with a notification in Google Webmaster Tools. For this reason, it’s critical to constantly look beyond the functionality of today and ask “what does Google want?” Learn Google’s philosophies and market your site in harmony.
Source(s): Matt Cutts
Poor Content Surrounding Links
Google looks at the quality of content surrounding backlinks to determine their quality, especially after the Panda and Penguin algorithms were put into production. The exact implications are unknown, but it’s not unreasonable to assume that Google’s methods for defining quality off-page are at least similar to how they are defining quality on-page.
Source(s): Patent US 8577893, SEO by the Sea
No Context Surrounding Links
If the context surrounding a link adds value, surely a lack of context must be bad, right? This factor is likely simply a devaluation, as on some level it is still something that occurs naturally in widespread fashion. Expect to receive less, but not a total lack of, value from backlinks that are not placed in an editorial context.
Source(s): Patent US 8577893
Ratio of Links Out of Context
It’s theorized that having too many backlinks without context surrounding them, beyond a certain volume, indicates a clear webspam footprint. This theory mixes three ideas: a Google Patent “Ranking based on reference contexts” establishes the context surrounding a link is a useful indicator of quality, the frequent discussion of webspam footprints by Matt Cutts, and the fact that some amount of no-context links is natural.
Source(s): Patent US 8577893
Irrelevant Content Surrounding Links
A Google patent titled “Ranking based on reference contexts” describes how Google may look at the words surrounding a link to determine what that link relates to. If writing is not focused and thematic, it will not take advantage of this. If surrounding content is irrelevant enough, this would be altogether bizarre against what occurs naturally, and penalties may be possible.
Source(s): Patent US 8577893
Rapid Gain of Links
To quote the Google Patent: “While a spiky rate of growth in the number of back links may be a factor used by search engine to score documents, it may also signal an attempt to spam search engine.” Rapid, spontaneous growth is highly likely to invite additional scrutiny from webspam filters, but appears more than fine when it comes from genuine editorial exposure or “going viral” without use of intentionally manipulative practices.
Source(s): Patent US 8521749 B2
Rapid Loss of Links
For nearly the same reason that a rapid gain of links can increase the scrutiny to a site’s backlink portfolio, a rapid loss of links is at least as justified a problem; probably more. As a simple logical exercise: webspam is often quickly moderated by site owners and paid links expire. The types of links that Google typically praises are the kind that tend to last.
Source(s): Patent US 8521749 B2
Sitewide Links
Sitewide links are not harmful in and of themselves, but tend to be devalued, such that they’re basically just treated as one link. Matt Cutts confirms that sitewide links do happen naturally, but are often also associated with webspam. Because of this, Google’s webspam team does manual reviews of sitewide links. Presumably, there’s some automated component to this process as well, and greater overall risk.
Source(s): Matt Cutts
Links from Irrelevant Sites
Google gives a bonus to inbound links from relevant sites since the Hilltop algorithm. A widespread SEO myth and a number of very dangerous “link unbuilding” and “disavowing” services have sprung up that suggest that links from irrelevant sites are inherently bad. While too many such links could result in an unnatural footprint, it would be at least as unnatural to only obtain links from sites that are exactly like your own.
Source(s): Analyzing link profiles of popular sites
Negative Page Link Velocity
A Google patent states: “By analyzing the change in the number or rate of increase/decrease of back links to a document (or page) over time, search engine may derive a valuable signal of how fresh the document is.” This indicates that a decreasing rate of inbound linkers could be damaging, especially (though not necessarily exclusively) on search queries noted to deserve only fresh content.
Source(s): Patent US 8521749 B2
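If you track backlink totals for a page over time, the "rate of increase/decrease" described in the patent can be approximated as the change between consecutive measurements. The sketch below is only one way to monitor this for yourself; the monthly figures, the three-period window, and the function names are assumptions made for illustration and are not drawn from the patent.

```python
def link_velocity(backlink_counts):
    """Period-over-period change in backlink count (positive means gaining links)."""
    return [later - earlier
            for earlier, later in zip(backlink_counts, backlink_counts[1:])]

def is_declining(backlink_counts, periods=3):
    """True if the backlink count fell in each of the last `periods` intervals."""
    recent = link_velocity(backlink_counts)[-periods:]
    return len(recent) == periods and all(change < 0 for change in recent)

# Hypothetical monthly backlink totals for one page.
monthly_counts = [120, 135, 150, 148, 140, 131]

velocity = link_velocity(monthly_counts)
declining = is_declining(monthly_counts)
print(velocity, declining)
```

Here the page gained links for two months and then lost them for three in a row, which is the kind of sustained negative trend this factor describes.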
Negative Domain Link Velocity
It’s speculated that if your site’s backlink portfolio is stagnant or losing links faster than you are gaining links over a long time horizon, something is wrong. This may be partially supported by a Google patent, which talks about a declining rate of inbound linkers to a particular document indicating a lack of freshness, combined with the mass of single page ranking factors confirmed to also apply domain-wide.
Source(s): Patent US 8521749 B2
Penalty by Redirect
John Mueller confirms via Google Hangout that organic search penalties can pass through a 301 redirected site. John’s statement makes it likely that this is a realistic factor. The odds of this actually occurring in the wild are probably far lower, unless you’re doing something like buying used domain names in an attempt to reclaim their inbound link value or trying to circumvent a manual action.
Source(s): John Mueller via SER
Disavowed Links
Google Webmaster Tools added a feature in 2012 that allows you to request that an inbound link be completely ignored. These effects are permanent, irreversible, and can be very damaging to your brand’s long-term search reputation if not used correctly. This should only be used as a mode of last-resort in response to manual action or legitimate webspam mistakes from your past.
Source(s): John Mueller via SER
Links from Penalized Sites
Google has long used the phrase “bad neighborhoods” to describe the interrelation of penalty-prone sites. If your site gets a link from a site that’s already penalized for any reason, you can bet that this draws additional scrutiny to your site, and that enough of this sort of activity can bring about penalties for your site as well.
Source(s): Matt Cutts
Chrome Blocked Sites
Google introduced a tool in 2011 that allowed users to block sites in Google search via Chrome. They stated “while we’re not currently using the domains people block as a signal in ranking, we’ll look at the data and see whether it would be useful”. Therefore, there’s no guarantee that this is an automated factor in rankings, but we’re also not about to believe that nobody on the webspam team is looking at this data.
Source(s): Amay Champaneria
Negative Sentiment
In 2010, Google told us that the sentiment expressed towards a brand, such as in reviews or the text surrounding links, is a ranking factor. Reviews were known to be a huge part of local or “Google Maps SEO” rankings before that. The implications of this are a little complex, but Moz’s Carson Ward did a great piece on it.
Source(s): Amit Singhal, Patent US 7987188 B2
Crawl Rate Modification
Google Webmaster Tools allows you to modify the rate at which your site is crawled by Google. It’s not really possible to speed up Googlebot, but it’s certainly possible to slow it down to zero. This can cause problems with indexing, which means problems for ranking, especially with regard to factors surrounding fresh content and editing.
Source(s): Google
International Targeting Tool
Google Webmaster Tools provides a tool for international targeting when it may not be done correctly otherwise. Theoretically, this tool could also cause harm if it is used to restrict your site’s appearance in search results to a particular region that does not encompass your entire desired market region.
Source(s): Google
Building Links
One myth that never seems to die is the idea that building links is bad. Google’s Matt Cutts has given us link building advice since the start, and in its purest form, link building is just traditional marketing adapted to the web. Link building runs counter to Google’s philosophies only when methods focus on search engines first. Links are marketing. Build links, but always build links for humans first.
Source(s): Matt Cutts via SEL
Link Building Services
Paying for a service that pursues links is not the same thing as paid links. An exception would exist if that service turns around and pays someone else for links that pass PageRank and are then published with zero editorial discretion. Brand-safe link building must be akin to the services of a publicist, where placement can’t be promised but there’s everything to be gained. Matt Cutts defines “editorial discretion” midway through his paid directories video.
Source(s): Matt Cutts
No Editorial Context
Matt Cutts tells us that all links should be published with editorial discretion. But not all links need to be placed in an editorial context – that is, within the middle of a story or article. It takes very little experimentation to see that higher quality links outside of editorial context, such as a local Chamber of Commerce membership page, help quite a lot with authority. It’s also plain to see that this would be an unnatural pattern.
Source(s): Julie Joyce
Microsites
It’s been suggested that there’s some penalty reserved for microsites: websites with an extremely narrow scope and not a lot of pages. Matt Cutts gives us Google’s stance: Microsites aren’t hunted and penalized by Google, they’re just usually not a very effective tactic as a part of a long-term strategy since sitewide ranking factors will remain weak. They’re also not very effective at exploiting Exact Match Domain bonuses using keyword-focused domains anymore.
Source(s): Matt Cutts
Click Manipulation
If you subscribe to the notion that Click Through Rate (CTR) is a positive factor, it’s reasonable to suggest that webspam controls exist here too. Rand Fishkin’s Twitter CTR experiments present evidence of a mass-clicked page rising from #6 to #1, dropping to #12, and then recovering its position, all within the course of a couple of days.
Source(s): Rand Fishkin
Brand Search Manipulation
Another theory is that if brand searches are a ranking factor, as patents suggest, webspam controls must also exist here to prevent abuse. Otherwise, this factor would be far too easy to manipulate.
Source(s): Patent US 9031929 B1
Illegal Activity Report
Google has a form that asks users to report any illegal activity occurring within their content. This page implies that any such content will be removed from any Google products, including Google Search. We have no reason to doubt them on this one, and don’t expect that anyone’s going to do an experiment on this factor anytime soon either.
Source(s): Google
DMCA Report
In addition to automated controls for detecting stolen content, un-cited sources, and potential copyright violations, Google also encourages users to send DMCA requests directly to Google. This almost certainly invokes the DMCA process within the United States, during which Google has no choice but to remove any offending content accessible on their domains.
Source(s): Webmaster Tools, DMCA Process
Low Dwell Time (Short Click)
A Google patent suggests seeking “a click of at least a certain duration, or a click of at least a certain duration relative to a resource length” on branded queries. Steven Levy’s “In The Plex” first-hand account of Google suggests that this is basically Google’s best measure of search result quality. Finally, Bing and Yahoo! both have suggested using dwell time, in some scope, as a ranking factor.
Source(s): Patent US 9031929 B1, Steven Levy (In The Plex), Bill Slawski
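To make the patent’s wording concrete, here is a minimal sketch of a “long click” test that accepts either a minimum duration or a minimum duration relative to the length of the page. Everything specific in it is an assumption for illustration: the `Click` type, the use of word count as “resource length”, and both threshold values are hypothetical, and nothing here reflects how Google actually measures or weights dwell time.

```python
from dataclasses import dataclass


@dataclass
class Click:
    dwell_seconds: float   # time before the searcher returned to the results
    resource_words: int    # page length, used here as "resource length"


# Hypothetical thresholds, chosen only to make the example readable.
MIN_DWELL_SECONDS = 30.0
MIN_SECONDS_PER_WORD = 0.05


def is_long_click(click: Click) -> bool:
    """True if the click meets the absolute OR the length-relative threshold."""
    if click.dwell_seconds >= MIN_DWELL_SECONDS:
        return True
    if click.resource_words > 0:
        return click.dwell_seconds / click.resource_words >= MIN_SECONDS_PER_WORD
    return False


def short_click_rate(clicks: list[Click]) -> float:
    """Fraction of clicks that fail the long-click test (0.0 for no clicks)."""
    if not clicks:
        return 0.0
    short = sum(1 for c in clicks if not is_long_click(c))
    return short / len(clicks)


if __name__ == "__main__":
    sample = [
        Click(dwell_seconds=45, resource_words=2000),  # long: absolute threshold
        Click(dwell_seconds=10, resource_words=150),   # long: short page, fully read
        Click(dwell_seconds=5, resource_words=1200),   # short click
    ]
    print(f"short click rate: {short_click_rate(sample):.2f}")
```

The relative threshold is what keeps a brief but complete visit to a short page from being counted the same as a quick bounce from a long one.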
High Task Completion Time
We have quite a bit of evidence that Click Through Rate and Dwell Time may be ranking factors, though not directly confirmed. We also know of a research paper co-authored by Google employee David Mease, which describes analyzing the overall time it takes a searcher to find a result that they’re happy with and responding with an “alternative experiential design”. Is it possible that automated A/B testing will “shake up” the weighting of factors based on how happy users appear with their results?
Source(s): David Mease