Google Ranking Factors

There’s a lot of information floating around these days about what makes a site rank higher in Google’s free, organic listings. Through our research, we’ve found the majority of this advice to be either partially wrong or completely wrong. The advice and products stemming from that information are often dangerous. And finally, the accurate “factors” remaining often aren’t factors that Google considers at all, but tactics and indirect correlation.

Despite all that, we actually know a lot for certain about the way Google ranks web sites. Real SEO knowledge doesn’t come from a random blogger, forum, or magic “get rich quick” scheme. The most trustworthy SEO knowledge only comes from three sources.

Patent filings
Direct statements from Google and/or their team
Applying The Scientific Method

This resource is a complete guide to how Google ranks sites. We’ve included factors that are controversial or even outright myths, but created filters so you can sort out concepts that are less-substantiated. This resource is also limited to what factors matter when ranking in Google’s primary web search for non-local requests. This is important to clarify since local SEO, image-only search, video search, and every other Google search engine plays by at least slightly different rules.

Positive On-Page Factors

On-page SEO describes factors that you are able to manipulate directly through the management of your own website. Positive factors are those which help you to rank better. Many of these factors may also be abused, to the point that they become negative factors. We will cover negative ranking factors later in this resource.

In broad terms, positive on-page ranking factors relate to establishing the subject matter of content, accessibility across various environments, and a positive user experience.

Keyword in URL

Keywords and phrases that appear in the page URL, outside of the domain name, aid in establishing relevance of a piece of content for a particular search query. Diminishing returns are apparently achieved as URLs become lengthier or as keywords are used more than once.

Source(s): Patent US 8489560 B1, Matt Cutts

Keywords Earlier in URL

The order in which keywords appear in a URL matters. It’s been theorized that keywords appearing earlier in a URL have more weight. At minimum, it’s been confirmed by Matt Cutts that “after about five” words, the weight of a keyword dwindles.

Source(s): Matt Cutts

Keyword in Title Tag

Title tags define the title of a document or page on your site, and often appear in both the SERP and as snippets for social sharing. Titles should be no longer than 60-70 characters, depending on the characters used (Moz Tool). As with URLs, keywords closer to the beginning are widely theorized to have more weight.

Source(s): US 20070022110 A1

Keyword Density of Page

The percentage of times a keyword appears in text. Practicing SEOs once sculpted all content so that a single keyword/phrase appeared 5.5%-6% of the time. In the early-to-mid-2000s, this was very effective. Google has since improved so much with other types of content analysis that those tactics are scarcely relevant in 2015. Keyword Density, although referenced in Google patents, is almost certainly just a simplified concept within TF-IDF, which we’ll cover next.

Source(s): Patent US 20040083127 A1
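As a rough illustration of the metric described above, here is a minimal sketch of a keyword density calculation. The function name and the exact definition (words belonging to the phrase divided by total words) are assumptions made for this example; nothing here is taken from a Google patent or tool.

```python
import re

def keyword_density(text: str, phrase: str) -> float:
    """Percentage of the words in `text` that belong to occurrences of `phrase`.

    Hypothetical helper: this is one common way SEOs define density,
    not a formula confirmed by Google.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase appears as a run of consecutive words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits * n / len(words)

print(keyword_density("SEO tips for SEO beginners", "seo"))  # 40.0
```

A flat percentage like this ignores how common a word is in general, which is the gap that TF-IDF addresses.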

TF-IDF of Page

Think of TF-IDF, or Term Frequency-Inverse Document Frequency, like Keyword Density with context. TF-IDF weighs the density of keywords on a page against what is “normal” rather than just seeking out a flat, raw percentage. This serves to ignore words like “the” in computation and establishes how many times a literate human should probably mention a phrase like “Google Ranking Factors” in a single document that covers such a topic.

Source(s): Dan Gillick and Dave Orr, Patent US 7996379 B1
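To make the "density with context" idea concrete, the sketch below uses the plain textbook form of TF-IDF: term frequency within one document multiplied by the log of how rare the term is across a corpus. The function name and the toy corpus are assumptions for illustration, and Google's actual weighting is not public, so treat this only as the general concept.

```python
import math
from collections import Counter

def tf_idf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    """Textbook TF-IDF for one term in one tokenized document.

    Hypothetical helper, assuming documents are already split into
    lowercase words. This is not Google's scoring function.
    """
    if not doc:
        return 0.0
    tf = Counter(doc)[term] / len(doc)            # how dense the term is here
    df = sum(1 for d in corpus if term in d)      # how many documents use it
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

corpus = [
    ["the", "google", "ranking", "factors"],
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
]
# "the" appears in every document, so its weight collapses to zero,
# while "google" appears in only one and keeps a positive weight.
print(tf_idf("the", corpus[0], corpus))
print(tf_idf("google", corpus[0], corpus))
```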

Key Phrase in Heading Tag (H1, H2, etc.)

Keywords in heading tags have strong weight in determining the relevant subject matter of a page. An H1 tag carries the most weight, H2 has less, and so forth. Heading tags also improve accessibility for screen readers, and clear, descriptive headings reduce bounce rates according to various studies.

Source(s): In The Plex, Penn State

Words with Noticeable Formatting

Keywords in bold, italic, underline, or larger fonts have more weight in determining the relevant subject matter of a page, but less weight than words appearing in a heading. This is confirmed by Matt Cutts, SEOs, and a patent that states: “matches in text that is of larger font or bolded or italicized may be weighted more than matches in normal text.”

Source(s): Matt Cutts, Patent US 8818982 B1

Keywords in Close Proximity

The closeness of words to one another implies association. To anyone that’s ever wielded the English language, this won’t come as a surprise. One paragraph about your SEO work in Chicago will thus do more to rank for “Chicago SEO” than two paragraphs, with one about SEO and one about Chicago.

Source(s): Patents: US 20020143758 A1, US 20080313202 A1
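One simple way to reason about proximity when auditing your own copy is to measure the smallest gap, in words, between two terms. The sketch below does that; the function name and the word-gap measure are assumptions for this example, and the cited patents do not necessarily measure proximity this way.

```python
import re
from typing import Optional

def min_word_distance(text: str, first: str, second: str) -> Optional[int]:
    """Smallest number of word positions separating two single-word terms.

    Hypothetical helper for illustration. Returns None when either term
    is missing from the text.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    if not pos_first or not pos_second:
        return None
    return min(abs(i - j) for i in pos_first for j in pos_second)

print(min_word_distance("We do SEO work in Chicago", "seo", "chicago"))  # 3
```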

Keyword in ALT Text

The ALT attribute of an image is used to describe that image to search engines and to users who are unable to display the image. This establishes relevance, especially for Image Search, while also improving accessibility.

Source(s): Matt Cutts

Exact Search Phrase Match

Although Google may return search results that contain only part of a search phrase as it appears on your page (or in some cases, none at all), a patent states that a higher Information Retrieval (IR) score is given for an exact match. Specifically, it states that “a document matching all of the terms of the search query may receive a higher score than a document matching one of the terms.”

Source(s): Patent US 8818982 B1

Partial Search Phrase Match

A Google patent establishes that when a page contains an exact match of a search phrase, that match significantly raises the page’s perceived relevance to the query, dubbed the Information Retrieval (IR) score. In the process, the patent confirms that you may still rank for certain search queries when a page contains a search phrase not exactly as it was entered into Google. This is further verified by just doing a lot of Googling.

Source(s): Patent US 8818982 B1

Keywords Higher on Page

There’s a natural trend in how we write English: earlier is usually more important. This applies to sentences, paragraphs, pages, and HTML tags. Google seems to apply this everywhere as well, with content that appears earlier and more visibly being given more weight. This is, at the very least, a function of the Page Layout algorithm, which gives a lot of preference to what appears above-the-fold on your site.

Source(s): Matt Cutts

Keyword Stemming

Keyword stemming is the practice of taking the root or ‘stem’ of a word and finding other words that share that stem (i.e. ‘stem-ming’, ‘stem-med’, etc.). Avoiding these natural variations, such as for the sake of a keyword density score, results in poor readability and has a negative impact. This was introduced in 2003 with the Florida update.

Source(s): Matt Cutts

Internal Link Anchor Text

The anchor text of a link tells the user where that link leads. It’s an important component of navigation within your website, and when not abused, helps to establish the relevance of a particular piece of content over vague alternatives such as “click here”.

Source(s): Google’s SEO Starter Guide

Keyword is Domain Name

Also referred to as an Exact Match Domain or EMD. A powerful ranking bonus is attributed when a keyword exactly matches a domain and a search query meets Google’s definition of a “commercial query”. This was designed so that brands would rank for their own names, but it was frequently exploited and, as a result, made less powerful in various circumstances.

Source(s): Patent EP 1661018 A2, US 8046350 B1

Keyword in Domain Name

A ranking bonus is attributed when a keyword or phrase exists within a domain name. The weight given seems to be less significant than when the domain name exactly matches a particular search query, but more significant than when a keyword appears later in the URL.

Source(s): Patent EP 1661018 A2

Keyword Density across Domain

Krishna Bharat identified a problem with PageRank when he introduced Hilltop: “a web-site that is authoritative in general may contain a page that matches a certain query but is not an authority on the topic of the query”. Hilltop improved search by looking at the relevance of entire sites, labeled “experts”. Since TF-IDF determines page-level relevance, we make a small assumption that Hilltop defines an “expert” domain using the same tools.

Source(s): Krishna Bharat, Patent US 7996379 B1

TF-IDF across Domain

Saying “Keyword Density” instead of “Term Frequency” in 2015 throws a lot of SEO specialists into a rage, despite the two being perfect synonyms. What’s important when talking about “Keyword Density” factors is again the latter half of TF-IDF: Inverse Document Frequency. Google throws out words like adverbs with TF-IDF and dynamically evaluates the natural density for a topic. Metrics on “how much is natural” have apparently decreased over time.

Source(s): Dan Gillick and Dave Orr, Patent US 7996379 B1

Distribution of Page Authority

Typically, pages that are linked sitewide are given a large boost, pages linked from them get a lesser boost, and so forth. A similar effect is often seen from pages linked from the homepage, because this is commonly the most-linked page on most websites. Creating a site architecture to maximize this factor is commonly known as PageRank Sculpting.

Source(s): Patent US 6285999 B1

Old Domain

This is somewhat confusing since a brand new domain name may also receive a temporary boost. Older domains are given a little more trust, which Matt Cutts emphasizes is pretty minor (while in the process, acknowledging that it exists). Speculatively, this may be rewarding sites that have had a chance to prove themselves not a part of short-term black hat projects.

Source(s): Matt Cutts

New Domain

New domains may receive a temporary boost in rankings. In a patent discussing methods of determining fresh content, it’s stated “the date that a domain with which a document is registered may be used as an indication of the inception date of the document.” That said, the impact this actually has on one’s rankings is, according to Matt Cutts, relatively small. Speculatively, this may be intended to give a brand new site, or timely niche site, just enough chance to get off the ground.

Source(s): Patent US 7346839 B2, Matt Cutts

Hyphen-Separated URL Words

The ideal method of separating keywords in a URL is to use a hyphen. Underscores can work, but are not as reliable, as they can be confused with programming variables. Mashing words together in a URL is likely to cause words to not be seen as separate keywords, thus preventing any Keyword in URL bonus. Aside from these scenarios, just using a hyphen will not make a site rank higher.

Source(s): Matt Cutts
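A minimal sketch of turning a page title into a hyphen-separated URL slug follows. The function name is an assumption for this example, and it handles only ASCII letters and digits; it shows the separator convention described above, not a guarantee of any ranking effect.

```python
import re

def slugify(title: str) -> str:
    """Lowercase a title and join its words with single hyphens.

    Hypothetical helper: spaces, underscores, and punctuation all
    collapse into one hyphen so each word stays a separate keyword.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

print(slugify("Google Ranking Factors"))     # google-ranking-factors
print(slugify("google_ranking  factors!"))   # google-ranking-factors
```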

Keywords Earlier in Tag

An SEO theory manifested itself in the early 2000s called the first third rule. It noted that the elements of our language (sentences, titles, paragraphs, even entire web pages) are generally arranged in order of importance. Although not confirmed by Google, Northcutt’s experience with word order experiments has more often than not indicated that this is a factor.

Source(s): Speculative

Long Domain Registration Term

Google directly states in this patent that longer domain registration terms predict the legitimacy of a domain. Speculatively, those that engage in webspam understand that it’s a short-term, high volume game of burn/rinse/repeat and don’t purchase domains for longer than they need.

Source(s): Patent US 7346839 B2

Public Whois

Despite Google downplaying their ability to investigate Domain Registrant information, we know of a patent that discusses using Domain Registration Terms to single out webspam schemes. We’ve also seen Matt Cutts speak about private whois contributing to penalties, and encouraging visitors on his blog to report fake whois data. We believe that this is a wise “play it safe” card, despite only a lack thereof being confirmed as a (negative) factor.

Source(s): Patent US 7346839 B2, Matt Cutts

Use of HTTPS (SSL)

SSL was officially announced as a new positive ranking factor in 2014, regardless of whether the site processed user input. Gary Illyes downplayed the significance of SSL in 2015, calling it a tiebreaker. Although, for an algorithm based on the numeric scoring of billions of web pages, we’ve found that tiebreakers very often make all of the difference on competitive search queries.

Source(s): Google, Gary Illyes

Schema.org

With the advent of Schema.org, a joint project between Google, Yahoo!, Bing, and Yandex to understand logical data entities over keywords, we move further away from the traditional “10 blue links” style of search. Currently, use of Structured Data can improve rankings in a massive variety of scenarios. There are also theories that schema.org can improve traditional search rankings by catering to a ranking method known as entity salience.

Source(s): Schema.org, Matt Cutts
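For readers who have not seen Structured Data before, the sketch below builds a small JSON-LD block using the Schema.org `Article` type with its `headline` and `datePublished` properties. The function name and the sample values are assumptions for illustration; which properties a given page needs depends on the content type, and the snippet makes no claim about ranking impact.

```python
import json

def article_json_ld(headline: str, date_published: str) -> str:
    """Return a <script> tag holding Schema.org Article markup as JSON-LD.

    Hypothetical helper with placeholder fields; real pages usually
    carry more properties (author, image, and so on).
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,  # ISO 8601 date string
    }
    # Escape "</" so a value can never close the script tag early.
    payload = json.dumps(data).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"

print(article_json_ld("Google Ranking Factors", "2015-01-01"))
```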

Fresh Content

The full name of this one is technically “fresh content when query deserves freshness”. This term, Query Deserves Freshness (often shortened to QDF), refers to search queries that would benefit from more current content. This does not apply to every query, but it applies to quite a lot, especially those that are informational in nature. These SEO benefits are just one more reason that brand publishers tend to be very successful.

Source(s): Matt Cutts

Domain-wide Fresh Content

There is unconfirmed speculation that domain-wide performance is improved by maintaining fresh content. Speculatively, this means that overall the resource that Google is recommending is less “stale” and more accurate/relevant, especially if at least some significant portion of the information has been worth a little upkeep or supplementation by the owner.

Source(s): Patent US 8549014 B2, Speculation

Old Content

A Google patent states: “For some queries, older documents may be more favorable than newer ones.” It goes on to describe a scenario where a search result set may be re-ranked by the average age of documents in the retrieved results before being displayed.

Source(s): Patent US 8549014 B2

Domain-wide Old Content

Theoretically, for all we have heard about Query Deserves Freshness (QDF), which serves news-like content in a number of circumstances, there may also be some sort of “Query Deserves Oldness”. Considering that we’ve never been told about “QDO” by Google, it may be reasonable to conclude that older content is always preferred when QDF is not at play. Just like domain-wide freshness, however, we don’t have too much evidence to confirm a domain-wide seniority score.

Source(s): Speculation

Quality Outbound Links

Although it’s possible for outbound links to “leak PageRank”, web sites are not supposed to be dead ends. Google rewards authoritative outbound links to “good sites”. To quote the source: “parts of our system encourage links to good sites.”

Source(s): Matt Cutts

Relevant Outbound Links

Given that Google analyzes your inbound links for authority, relevance, and context, it seems reasonable to suggest that outbound links should be relevant as well as authoritative. This would likely relate to the Hilltop algorithm, simply in reverse to the manner that’s widely accepted for inbound links.

Source(s): Moz

Good Spelling and Grammar

This is a Bing ranking factor. Amit Singhal stated “these are the kinds of questions we ask” regarding spelling/grammar in Google’s definition of quality content. Matt Cutts said in 2011 that the answer was no, as of “a long time ago”, but also that rankings correlate anyway. Our agency’s findings have been that the first Panda update made this matter a lot. If nothing else, most content-related factors are clearly affected by spelling/grammar.

Source(s): Matt Cutts, Amit Singhal

Reading Level

We know that Google analyzes the reading level of content, since they created such a search filter for the results page (now removed). We also know that content mills, which Google is not fond of, are considered to be very basic, whereas academic writing is very advanced. What we don’t have, as of yet, is a concrete source or study that directly relates reading level to rankings.

Source(s): Correlation Study, Speculation

Rich Media

Rich media, on top of drawing more traffic from in-line image and video search, has long been considered a component of “high quality, unique content”. Video appeared to be the deciding factor with Panda 2.5. Northcutt’s work has also shown a positive correlation. Currently though, there’s no official, public source signing off on this factor.

Source(s): SEL on Panda 2.5

Subdirectories

Categorical Information Architecture has been an SEO discussion point for a long time, as it seems that Google analyzes topic coverage across entire sites. The exact ranking implications of this are unclear, but Google now refers to this as Structured Data and, at the very least, will use it to display breadcrumbs on the results page, therefore ranking more pages.

Source(s): Google Developers

Meta Keywords

Some SEOs claim that the meta keywords tag never mattered for SEO. That’s a myth. The notion that Google ranks meta keywords in 2015 is also a myth. Both of these facts were confirmed the same way – by placing a zero-competition, made-up word in a meta keywords tag, getting that page into the index, then searching that word. Remember though, that Google is not the only search engine, and could theoretically index countless other dynamic sites that benefit from this tag.

Source(s): Matt Cutts, Experiment Page

Mobile Friendliness

Mobile-friendly websites are given a significant ranking advantage. For now, the ranking implications of this appear to pertain only to users searching on mobile devices. This made its way into the mainstream SEO conversation and became more severe during the Mobilegeddon update in 2015, although experts had been speculating on this topic for nearly a decade previously.

Source(s): Various Studies

Meta Description

A good meta description functions as a search ad. Considering how many AdWords agencies exist almost entirely on A/B testing AdWords ads, the marketing value here can’t be overstated. Although keywords used in meta descriptions were once widely considered a direct ranking factor, Matt Cutts stated in 2009 that they’re not now.

Source(s): Matt Cutts

Google Analytics

Many have suggested that Google Analytics is or may become a Google ranking factor. All evidence at present, as well as very clear statements from Matt Cutts, indicates that any ranking benefits coming from Google Analytics, now or ever in the future, are an absolute myth. That said, it’s an amazingly powerful tool in the right marketer’s hands.

Source(s): Matt Cutts

Google Webmaster Tools

Just like Google Analytics, there are no confirmed ranking benefits to using Google Webmaster Tools in any way. Webmaster Tools is still useful in unearthing problems related to other ranking factors on this page, especially those related to manual penalties and certain crawler errors.

Source(s): Speculation

ccTLD in National Ranking

Country code TLDs such as .uk and .br are believed to carry with them a ranking bonus to searches from the same country, which is especially useful for internationalization. They should also perform far better in contrast to a ccTLD from another country.

Source(s): Speculation

XML Sitemaps

Sitemaps can be useful, though not required, for the purpose of getting more pages of your site into the Google index. The notion that an XML sitemap will improve rankings within Google is a myth. This comes straight from Google and is confirmed by various studies.

Source(s): Susan Moskwa & Trevor Foucher
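For reference, a bare-bones XML sitemap can be produced with Python's standard library as sketched below. The function name and the example URLs are assumptions; the namespace is the one defined by the sitemaps.org protocol. Consistent with the entry above, this helps discovery and is not expected to change rankings.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls: list[str]) -> str:
    """Return a minimal sitemap document listing each URL in a <loc> element.

    Hypothetical helper: optional fields such as <lastmod> are left out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = address
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

print(build_sitemap(["https://example.com/", "https://example.com/about/"]))
```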

Salience of Entities

As time goes on, Google seems to do more to analyze ideas and logical entities in preference to words and phrases. It analyzes how we say things in preference to exact search queries that appear on a page. This process, in simple terms, is what’s making it possible to search for “how to cook meat”, and be returned results for steak recipes that might not mention the word “meat” directly anywhere.

Source(s): Jesse Dunietz and Dan Gillick, Dan Gillick and Dave Orr, Patent US 20130132433 A1

Phrasing and Context

As keyword density is now virtually a non-factor, a basic understanding of Phrase-Based Indexing tells us that if you write about content thoroughly and elaborately, you stand a far better chance of ranking compared to writing generic content that just happens to drop a lot of keywords. A clear component of one Google patent describes this as the “identification of related phrases and clusters of related phrases”.

Source(s): Patent US 7536408 B2

Web Server Near Users

Google functions differently on many local queries, supplementing traditional results with Google Maps results, and potentially altered organic listings as well. The same is true for national and international searches. By hosting your site at least loosely near to your users, such as within the same country, you are likely to enjoy better rankings.

Source(s): Matt Cutts

Author Reputation

Authorship was an experiment that Google ran from 2011 to 2014, which thrived upon bloggers using the rel="author" tag to establish the reputation of particular authors. Google directly confirmed both the creation and the demise of Authorship. Eric Enge did a nice eulogy on the rise and fall of Authorship on Search Engine Land.

Source(s): John Mueller

Using rel="canonical"

The rel="canonical" tag suggests the ideal URL for a page. This can avert duplicate content devaluations and penalties when multiple URLs might result in the same content. Our experience is that this is only a suggestion to Google and one that is often ignored. According to Google it does not directly improve rankings. Despite all of this, it’s a very good idea.

Source(s): Google
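When auditing a site, it can help to check which canonical URL each page actually declares. The sketch below reads the first `<link rel="canonical">` from an HTML string using Python's standard `html.parser` module. The class and function names are assumptions for this example, and it reports only what the page declares, not what Google ultimately chooses.

```python
from html.parser import HTMLParser
from typing import Optional

class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")

def find_canonical(html: str) -> Optional[str]:
    """Hypothetical helper: returns the declared canonical URL, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonical

page = '<html><head><link rel="canonical" href="https://example.com/page/"></head></html>'
print(find_canonical(page))  # https://example.com/page/
```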

Using rel="author"

Using rel="author" was once widespread SEO advice and hypothesized as a positive ranking factor, but Google’s use of this factor at all went away along with an entire practice known as Authorship. The notion that rel="author" is beneficial for any reason whatsoever is now regarded as a myth.

Source(s): John Mueller

Using rel="publisher"

Just like rel="author", using rel="publisher" was once widespread SEO advice and also hypothesized as a positive ranking factor. And, just like rel="author", Google’s use of rel="publisher" at all went away along with an entire practice known as Authorship.

Source(s): John Mueller

URL uses “www” Subdomain

A common misconception propagated by SEO bloggers suggests that a site may rank better if your URLs start with “www”. This originates from the idea that we often force all pages on a site to resolve at “www”. The reason that we actually do this is simply to avoid two URLs serving the same content, which would bring about a negative ranking factor.

Source(s): Speculation

Dedicated IP Address

Web server IP addresses can be useful for geo-targeting certain demographics. They can be negative ranking factors when they sit amidst a significant private webspam operation, or are used by the Hilltop algorithm to identify two sites as being from differing owners. But, the notion that just having a dedicated IP address provides a direct ranking advantage has been repeatedly debunked.

Source(s): Matt Cutts

Subdomain Usage

Subdomains (thing.yoursite.com) are often viewed as separate websites by Google, as compared to subfolders (yoursite.com/thing/), which are not. This has obvious implications for many other factors on this page. Matt Cutts called subfolders/subdomains “roughly equivalent” in 2012, confirming this now happens less often, but still happens. Panda recovery stories post-2012, such as the HubPages subfolder/subdomain migration, prove that it can still be a major factor.

Source(s): Matt Cutts, Matt McGee and Paul Edmondson

Number of Subdomains

The number of subdomains on a site appears to be the most significant factor in determining whether subdomains are each treated as their own sites (as occurs in nature with free web hosting services and hybrid hosting/social sites like HubPages), or just portions of a common site. Presumably, thousands of subdomains means that they don’t all belong to a single thematic site and are likely each websites in their own right.

Source(s): Speculation

Use AdSense

Although SEO paranoia seems to make this frequent advice, it’s directly denied by Google. We’ve also found no real evidence to support it, and have seen no noticeable effects when assisting with optimizations for media monetization, which is something that our agency frequently does. We’re therefore prepared to firmly declare this factor a myth.

Source(s): Matt Cutts

Keywords in HTML Comments

This is an early SEO theory that’s very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn’t.

Source(s): Experiment Page

Keywords in CSS/JavaScript Comments

Another twist on an early SEO theory that’s very easily debunked by a ten second experiment and a little patience. In the cited example, we place an extremely non-competitive made-up word in our source code, then link to it prominently so that it gets indexed. If that word appears in search, we have evidence that Google ranks by that word. In this case, it doesn’t.

Source(s): Experiment Page

Keywords in CLASSes, NAMEs, and IDs

Once again, we can debunk theories as to whether or not words in an odd place have any impact on search engines by putting a non-competitive phrase there and waiting. It’s not worth even speculating at what Google tells us or what’s in a patent. And again here, we can confirm that this factor is a myth, at least at the time of writing this.

Source(s): Experiment Page

Privacy Policy Usage

A single experience was posted on Webmaster World in 2012 which sprawled into a larger discussion: does having a Privacy Policy benefit rankings? For what it’s worth, 30% of Search Engine Roundtable-ers voted yes, and it does fit Google’s stated philosophies pretty well. Still, this is very theoretical.

Source(s): SER Discussion

Verifiable Address

A physical address is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to “NAP”) together. “Highly satisfying contact information” is also something that Google quality control auditors are instructed to seek out.

Source(s): Search Engine Land

Verifiable Phone Number

A phone number is theorized as a mark of legitimacy in standard search rankings. This is loosely supported by the notion that Google looks at citations for local SEO (also known as Google Maps SEO) as mentions of Name, Address, Phone (sometimes shortened to “NAP”) together. “Highly satisfying contact information” is also something that Google quality control auditors are instructed to seek out.

Source(s): Search Engine Land

Accessible Contact Page

Theorized as a mark of legitimacy. It appears that this may have originated, or is at least best-supported, from a document called Google’s Quality Rater Guidelines. In this document, Google asks quality control auditors to search for “highly satisfying contact information.”

Source(s): Search Engine Land

Low Code-to-Content Ratio

This SEO theory seemed to become widespread in 2011, suggesting that more content and less code is good. Here’s what we know: 1.) Speed is a confirmed factor, 2.) Google’s own PageSpeed Insights tool really presses even a 5Kb reduction in payload size, 3.) Minor code mistakes can cause devaluations and penalties. So at minimum (and more likely) this is an issue of indirect correlation. But to add my own account to several others, I’ve more than once seen this seem to really matter.

Source(s): SitePoint Post, SEOChat Tool
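To put a number on this for your own pages, one rough approach is to compare the length of the visible text to the length of the full HTML. The sketch below does that with Python's standard `html.parser`, skipping `<script>` and `<style>` contents. The names and the exact definition of the ratio are assumptions for this example, and tools such as the one cited may compute it differently.

```python
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    """Gathers text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def text_to_html_ratio(html: str) -> float:
    """Hypothetical metric: visible text characters divided by total characters."""
    if not html:
        return 0.0
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    # Collapse whitespace so indentation does not count as content.
    text = " ".join("".join(collector.chunks).split())
    return len(text) / len(html)

print(round(text_to_html_ratio("<p>hello</p>"), 3))  # 0.417
```

A higher value means more of the payload is readable content; a page dominated by markup and inline scripts scores lower.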

Meta Source Tag

The Meta Source Tag was created for Google News in 2010 to better attribute sources. It comes in two forms: syndication-source (if syndicating a 3rd party) and original-source (you’re the source). In situations where content is syndicated, this may theoretically help avoid duplicate content penalties. If you’re the original-source, this tag is overridden by rel="canonical" anyway.

Source(s): Eric Weigle

Meta Geo Tag

Unlike IP address and ccTLDs, Matt Cutts states that they “barely look at this tag, if at all”, although he did suggest that this tag might be considered if you were to use it on a gTLD site (such as “.com”), and attempt to restrict it to a country. So, while this is confirmed to be almost useless, it was suggested that Google does at least look at it and may consider it a factor very, very rarely for internationalization.

Source(s): Matt Cutts

Keywords Earlier in Display Title

More than a decade of studies and correlation research suggests that titles that begin with a keyword usually (but not always) rank better than titles ending in a keyword. It’s easy to test and usually confirms: earlier keywords are better. But our chosen source for this suggests more. Thumbtack.com conducted a study where title word order changed traffic by 20%-30%. Their best-performing titles didn’t begin with a keyword, but were altered (as Google sometimes does) to do so in Google’s results.

Source(s): Thumbtack Study

Keywords Earlier in Headings

Heading tags are another place where word order appears to really matter. Again, something known as the “first third rule” has often been thrown around on this topic, suggesting that words appearing earlier have more weight. Usually our findings have confirmed this, but regardless, it’s well worth testing, especially in the H1 position.

Source(s): Speculation

Novel Content against Web

A Google Patent and this SEO’s working experience seem to indicate that Google devalues a lot more than just directly similar content. Google has literally patented methods for calling your content uninteresting. After determining that a set of articles is related, this patent suggests various methods for determining which content is descriptive, unique, and/or weird (in a good way) when compared to others on the same topic.

Source(s): Patent US 8140449 B1, SEO by the Sea

Novel Content against Self

Google patents suggest that the genuine uniqueness/weirdness of content, as well as how elaborately that content speaks, determines something known as a “novelty score”. This is done by quantifying/qualifying “information nuggets” within text. We pretty much know only that Google’s methods for novelty scoring require comparing many individual documents. Considering that duplicate content is weighed both internally and externally, however, novelty scores likely are as well.

Source(s): Patent US 8140449 B1

Sitewide Average Novelty Score

Kumar and Bharat’s patent titled “Detecting novel document content” describes how single documents may be scored on how “novel” (that’s an adjective) they are. Assigning an average novelty score sitewide also appears to fit the narrative of other known sitewide factors such as sitewide thin content (Panda algorithm behavior) and sitewide expert relevance (Hilltop algorithm behavior).

Source(s): Patents US 8140449 B1, US 8825645 B1, Speculation

Quantity of Comments

We know from countless sources and even certain Webmaster Tools messages that Google can separate user-generated content and analyze it differently. One theory suggests that Google might look at quantities of comments on content to help rate content quality. At present, however, there is no clear evidence for this factor beyond maybe fitting an “if I were Google” narrative. Speculatively, it would also be one of the easiest factors to game.

Source(s): Speculation

Positive Sentiment in Comments

It’s theorized that Google looks at blog comment opinions to determine the quality of content. There is a patent and confirmation from Google that they score the sentiment expressed towards an entire site in product reviews. But according to Amit Singhal, they’re not able to apply this to content, because “if we demoted web pages that have negative comments against them, you might not be able to find information about many elected officials”.

Source(s): Amit Singhal, Patent US 7987188 B2

Negative On-Page Factors

Negative Ranking Factors are things you can do that harm your existing rankings. These factors fit into three categories: accessibility, devaluations, and penalties. Accessibility issues are just stumbling points for Googlebot that could prevent your site being crawled or analyzed properly. A devaluation is an indicator of a lower quality website and may prevent yours from getting ahead. A penalty is far more serious, and may have a devastating effect on your long-term performance in Google. Once again, on-page factors are those that are under your direct control as a part of the direct management of your website.

High Body Keyword Density

Keyword Stuffing penalties arise when abusing a once extremely effective tactic: sculpting Keyword Density to a high level. Our own experiments have shown that penalties can happen as early as 6% density, though TF-IDF (covered earlier) is likely at play and this is sensitive to topics, word types, and context.

Source(s): Matt Cutts, Remix
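
As a rough illustration of what “keyword density” measures, the sketch below counts how often a single-word keyword appears relative to all words in a passage. The function name `keyword_density` and the tokenizing regex are assumptions made for this example; it is not how Google computes anything, and it ignores TF-IDF entirely.

```python
import re
from collections import Counter


def keyword_density(text: str, keyword: str) -> float:
    """Return the share of words in `text` that equal `keyword`.

    Hypothetical helper: handles single-word keywords only and
    compares case-insensitively.
    """
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)


sample = "cheap shoes cheap shoes buy cheap shoes now"
print(f"{keyword_density(sample, 'cheap'):.1%}")  # 3 of 8 words
```

A page audit could compare this figure against the roughly 6% level where our own experiments saw penalties begin, keeping in mind that the real threshold varies by topic and context.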

Keyword Dilution

This factor manifests itself from logic: if a higher Keyword Density or TF-IDF is positive, at some point, a total lack of frequency/density will decrease relevance. As Google has improved at understanding natural language, this may be better described as Subject Matter Dilution: writing content that wanders without any clear theme. The same basic concept is at play either way.

Source(s): Matt Cutts

Keyword-Dense Title Tag

Aside from a page as a whole, Keyword Stuffing penalties appear to be possible within the title tag. An ideal title tag should definitely be less than 60-70 characters and hopefully still provide enough value to function as a good search ad in Google’s results. At absolute minimum, there is no benefit in using the same keyword five times in the same tag.

Source(s): Matt Cutts
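
The two title tag checks described above (overall length and repeated keywords) are easy to automate. The following is a minimal sketch; the function name `audit_title`, the 70-character default, and the “more than two repeats” rule are illustrative assumptions, not thresholds published by Google.

```python
import re
from collections import Counter


def audit_title(title: str, max_chars: int = 70, max_repeats: int = 2) -> list[str]:
    """Return a list of human-readable warnings for a title tag."""
    problems = []
    if len(title) > max_chars:
        problems.append(f"title is {len(title)} characters (limit {max_chars})")
    # Count each word, ignoring very short words such as "the" or "buy".
    words = re.findall(r"[a-z0-9]+", title.lower())
    for word, count in Counter(words).items():
        if len(word) > 3 and count > max_repeats:
            problems.append(f"'{word}' appears {count} times")
    return problems


print(audit_title("Cheap shoes, best shoes, buy shoes, shoes online"))
```

An empty list means neither check fired; it does not mean the title is a good search ad.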

Exceedingly Long Title Tag

Source(s): Matt Cutts

Keyword-Dense Header Tags

Heading Tags, such as H1, H2, H3, etc. can add additional weight to certain words. Those attempting to abuse this positive ranking factor will find that they can’t simply cram as many keywords as they can into these tags, even if the tags themselves grow to be no lengthier than usual. Keyword Stuffing penalties appear to be possible simply as a function of the total space within these tags.

Source(s): Matt Cutts

Header Tag (H1, H2, etc.) Overuse

As a general rule, if you want a concrete answer of whether or not an SEO penalty exists, try pushing a positive ranking factor well beyond what seems sane. One easily verified penalty involves placing your entire website in an H1 tag. Too lazy for that? Matt Cutts drops a less-than-subtle hint about too much text in an H1 in this source.

Source(s): Matt Cutts

URL Keyword Repetition

While there don’t seem to be any penalties associated with using a word in a URL multiple times, the value added from keyword repetition in a URL appears to be basically nothing. This can be verified very simply by placing a word in a URL five times instead of just once.

Source(s): Speculation

Exceedingly Long URLs

Matt Cutts notes that after about five words, the additional value behind words in a URL dwindles. It’s theorized and pretty replicable that this occurs in Google as well, although directly unconfirmed. Although they operate somewhat differently, Bing has also gone out of their way to confirm URL keyword stuffing is a penalty in their engine.

Source(s): Matt Cutts

Keyword-Dense ALT Tags

Given that ALT tag text is not generally directly visible on the page, ALT tag keyword stuffing has been widely abused. A few descriptive words are fine and actually ideal, but doing more than this can invite penalties.

Source(s): Matt Cutts

Exceedingly Long ALT Tags

Source(s): Matt Cutts

Long Internal Link Anchors

At minimum, really long internal anchor text will not bring along with it any additional value – a devaluation. In extreme circumstances, it appears possible to draw Keyword Stuffing webspam penalties from exceedingly lengthy anchor text.

Source(s): Speculation

High Ratio of Links to Text

It’s theorized that just having a site that’s all links and no substance is the mark of a low quality site. This fits the narrative of content quality, and of not ranking pages that look too much like search results pages, but is not currently supported by a study as proof.

Source(s): Speculation
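
For anyone who wants to test this theory on their own pages, one way to put a number on “all links and no substance” is to compare the characters inside anchor tags with all visible characters on the page. This is a sketch built on Python’s standard `html.parser`; the class and function names are made up for this example, and it naively counts script and style contents as text.

```python
from html.parser import HTMLParser


class LinkTextCounter(HTMLParser):
    """Tallies characters inside <a> tags versus all text characters."""

    def __init__(self):
        super().__init__()
        self.open_links = 0   # depth of currently open <a> tags
        self.link_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.open_links += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links:
            self.open_links -= 1

    def handle_data(self, data):
        length = len(data.strip())
        self.total_chars += length
        if self.open_links:
            self.link_chars += length


def link_text_ratio(html: str) -> float:
    """Return the fraction of text characters that sit inside links."""
    parser = LinkTextCounter()
    parser.feed(html)
    parser.close()
    if parser.total_chars == 0:
        return 0.0
    return parser.link_chars / parser.total_chars


print(link_text_ratio('<p>abcd <a href="/x">efgh</a></p>'))
```

A ratio near 1.0 describes a page that is almost entirely links. What ratio, if any, Google treats as a problem is unknown.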

Too Much “List-style” Writing

Matt Cutts has suggested that any style of writing that just lists a lot of keywords could also fit the description of keyword stuffing. Example: listing way too many things, words, wordings, ideas, notions, concepts, keywords, keyphrases, etc. is not a natural form of writing. Too much of this sort of thing will draw devaluations and possibly penalties.

Source(s): Matt Cutts

JavaScript-Hidden Content

Although Google recommends against putting text in JavaScript as it is unreadable by search engines, that does not mean that Google does not crawl JavaScript. In extreme instances where JavaScript may be used to cloak non-JavaScript on-page text, it may still be possible to receive a cloaking penalty.

Source(s): Google

CSS-Hidden Content

This is one of the first and most well-documented on-page SEO penalties: intentionally hiding text or links from users, especially for the sake of loading the page up with keywords that are just for Google, can invite a nasty penalty. Some leeway appears to be given in legitimate circumstances, such as when using tabs or tooltips.

Source(s): Google

Foreground Matches Background

Another common issue that brings about cloaking penalties occurs when the foreground color matches the background color of certain content. Google may use their Page Layout algorithm for this to actually look at a page visually and prevent false positives. In our experience, this can still occur accidentally in a handful of scenarios.

Source(s): Google

Single Pixel Image Links

Once a popular webspam tactic for disguising hidden links, there’s no question that Google will treat “just really small links” as hidden links. This might be done with a 1px by 1px image or just incredibly small text. If you’re attempting to fool Google using such methods, odds are that they’re going to catch you eventually.

Source(s): Google

Empty Link Anchors

Hidden Links, although often implemented differently than Hidden Text by means such as empty anchor text, are also likely to invite cloaking penalties. This is dangerous territory and another once widespread webspam tactic, so be sure to double-check your code.

Source(s): Google

Copyright Violation

Publishing content in a manner that is in violation of the Digital Millennium Copyright Act (DMCA) or similar codes outside of the U.S. can lead to a severe penalty. Google attempts to analyze unattributed sources and unlicensed content automatically, but users can go so far as to report possible infringement for manual action to be taken.

Source(s): Google

Doorway Pages

Doorway Pages, or Gateway Pages, are masses of pages created to serve as search engine landing pages without providing value to the user. An example of this would be creating one product page for every city name in America, resulting in what’s known as spamdexing, or spamming Google’s index of pages.

Source(s): Google

Overuse of Bold, Italic, or Other Emphasis

At minimum, if you place all the text on your site within a bold tag, for the reason that such text is often given additional weight compared to the rest of the page, you haven’t cracked some code that just makes your whole site rank better. This sort of activity fits Google’s frequent blanket description of “spammy activity”, and we have verified such penalties in our own non-public studies for clients.

Source(s): Matt Cutts

Broken Internal Links

Broken internal links make a site more difficult for search engines to index and more difficult for users to navigate. It’s a tell-tale sign of a low quality website. Make sure your internal links are never broken.

Source(s): Patent US 20080097977 A1, Google via SEL

Redirected Internal Links

The PageRank algorithm carries with it the usual decay when navigating redirects. This is an easy trap to fall into, especially when considering links to “www” vs. “non-www” portions of a site, or addresses with/without a trailing slash.

Source(s): Patent US 6285999 B1, Matt Cutts via SER
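
The “www” vs. “non-www” trap is straightforward to catch in a link audit. The sketch below flags internal links whose host differs from the site’s canonical host only by the “www.” prefix. It assumes the site redirects the non-canonical host to the canonical one; `likely_redirected` and the example hostnames are hypothetical.

```python
from urllib.parse import urlsplit


def _strip_www(host: str) -> str:
    """Drop a leading 'www.' so both host variants compare equal."""
    return host[4:] if host.startswith("www.") else host


def likely_redirected(link: str, canonical_host: str) -> bool:
    """True if `link` points at this site but uses the non-canonical host."""
    host = urlsplit(link).hostname or ""
    same_site = _strip_www(host) == _strip_www(canonical_host)
    return same_site and host != canonical_host


print(likely_redirected("http://example.com/about", "www.example.com"))
print(likely_redirected("http://www.example.com/about", "www.example.com"))
```

Trailing-slash mismatches need a similar check against whichever form your server treats as canonical.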

Text in Images

Google has come a long way at analyzing images, but on the whole, it’s very unlikely that text that you present in rich media will be searchable in Google. There’s no direct devaluation or penalty when you put text in an image; it just prevents your site from having any chance to rank for these words.

Source(s): Matt Cutts

Text in Video

Just like with images, the words that you use in video can’t be reliably accessed by Google. If you are publishing video, it’s to your benefit to always publish a text transcript so that the content of your video is completely searchable. This is true regardless of rich media format, including HTML5, Flash, Silverlight, and others.

Source(s): Matt Cutts

Text in Rich Media

Google has come a long way at analyzing images, videos, and other formats of media such as Flash, but on the whole, it’s very unlikely that text that you present in rich media will be searchable in Google. There’s no devaluation or penalty here.

Source(s): Matt Cutts

Frames/Iframes

In the past, search engines were entirely unable to crawl through content located in frames. Though they’ve overcome this weakness to an extent, frames do still present a stumbling point for search engine spiders. Google attempts to associate framed content with a single page, but it’s far from guaranteed that this will be processed correctly.

Source(s): Google

Dynamic Content

Dynamic content can create a number of challenges for search engine spiders to understand and rank. Using noindex and minimizing use of such content, especially where accessible by Google, is believed to result in a more positive overall user experience and likely to draw preferential treatment in rankings.

Source(s): Matt Cutts

Thin Content

Although it’s always been better to write more elaborate content that covers a topic thoroughly, the introduction of Navneet Panda’s “Panda” algorithm established a situation where content with basically nothing of unique value would be severely punished in Google. An industry-wide recognized case study on Dani Horowitz’s “DaniWeb” forum profile pages serves as an excellent example of Panda’s most basic effects.

Source(s): Google, DaniWeb Study

Domain-Wide Thin Content

For a very long time, Google has made an effort to understand the quality and unique value presented by your content. With the introduction of the Panda algorithm, this became an issue that was scored domain-wide, rather than on a page-by-page basis. As such, it’s now usually beneficial to improve the average quality of content in search engines, while using ‘noindex’ on pages that are doomed to be repetitive and uninteresting, such as blog “tag” pages and forum user profiles.

Source(s): Google

Too Many Ads

Pages with too many ads, especially above-the-fold, create a poor user experience and will be treated as such. Google appears to base this on an actual screenshot of the page. This is a function of the Page Layout algorithm, also briefly known as the Top Heavy Update.

Source(s): Google

Use of Pop-ups

Although Google’s Matt Cutts answered no to this question in 2010, Google’s John Mueller said yes in 2014. After weighing both responses and understanding the process behind the Page Layout algorithm, our tie-breaking ruling is also “yes”: using pop-ups can definitely harm your search rankings.

Source(s): Google

Duplicate Content (3rd Party)

Duplicate content that appears on another site can bring about a significant devaluation even when it’s not in violation of copyright guidelines and properly cites a source. This falls in line with a running theme: content that is genuinely more unique and special against a backdrop of the web as a whole will perform better.

Source(s): Google

Duplicate Content (Internal)

Similar to content duplicated from another source, any snippet of content that is duplicated within a page or even the site as a whole will endure a decrease in value. This is an extremely common issue and can crop up from anything ranging from too many indexed tag pages to www vs. non-www versions of the site to variables appended to URLs.

Source(s): Google

Linking to Penalized Sites

This was introduced as the “Bad Neighborhood” algorithm. To quote Matt Cutts: “Google trusts sites less when they link to spammy sites or bad neighborhoods”. Simple as that. Google has suggested using the rel=”nofollow” attribute if you must link to such a site. To quote Matt again: “Using nofollow disassociates you with that neighborhood.”

Source(s): MC: Bad Neighbors, MC: Nofollow

Slow Website

Slower sites will not rank as well as faster sites. There are countless tools to assist in performance auditing for both server-side and client-side factors, and they should be used. This factor is executed with the target audience in mind, so seriously consider the geography, devices, and connection speeds of your audience.

Source(s): Google

Page NoIndex

If a page contains the meta tag for “robots” that carries a value of “noindex”, Google will never place it in its index. If used on a page that you want to rank, it’s a bad thing. It can also be a good thing when removing pages that will never be good for Google users, elevating the average experience of visitors arriving from Google.

Source(s): Logic
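
Since an accidental “noindex” on a page you want to rank is an easy mistake to miss, it can be worth scanning templates for it. Below is a minimal sketch using Python’s standard `html.parser`; the class and function names are assumptions for this example, and it only inspects meta tags, not the `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        # "robots" addresses all crawlers; "googlebot" addresses Google only.
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set[str]:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print("noindex" in robots_directives(page))
```

The same set would also reveal a page-level “nofollow” directive.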

Internal NoFollow

This can appear two ways: if a page contains the “robots” meta tag with the value “nofollow”, it will imply that the rel=”nofollow” attribute is added to every link on the page. Or, it can be added to individual links. Either way, this is taken to mean “I don’t trust this”, “crawl no further”, and “do not give this PageRank”. Matt does not mince words here: just never “nofollow” your own site.

Source(s): Matt Cutts

Disallow Robots

If your site has a file named robots.txt in the root directory containing a “User-agent:” line that names either “*” or “Googlebot”, followed by a “Disallow: /” statement, your site will not be crawled. This will not remove your site from the index. But it will prevent any updating with fresh content, and cut you off from positive ranking factors that surround age and freshness.

Source(s): Google
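
A robots.txt rule that blocks crawling pairs a “User-agent” line with a “Disallow” line. To check whether a given file would shut Googlebot out, Python’s standard `urllib.robotparser` can evaluate the rules offline. This is a sketch: `googlebot_allowed` and the example URL are hypothetical, and Python’s parser is not guaranteed to match Googlebot’s own interpretation in every edge case.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if these robots.txt rules let Googlebot fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


blocks_everyone = "User-agent: *\nDisallow: /\n"
blocks_nothing = "User-agent: *\nDisallow:\n"

print(googlebot_allowed(blocks_everyone, "https://example.com/page"))
print(googlebot_allowed(blocks_nothing, "https://example.com/page"))
```

Running a check like this before deployment helps catch a development-only “Disallow: /” that was never removed.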

Poor Domain Reputation

Domain names maintain a reputation with Google over time. Even if a domain changes hands and you are now running an entirely different web site, it’s possible to suffer from webspam penalties incurred by the poor behavior of previous owners.

Source(s): Matt Cutts

IP Address Bad Neighborhood

While Matt Cutts has gone out of his way to debunk the long-standing practice of “SEO web hosting” on dedicated IP addresses serving any real benefit, this is contradicted by the notion that in rare cases, Google has penalized entire server IP ranges where they might be associated with a private network or bad neighborhood.

Source(s): Matt Cutts

Meta or JavaScript Redirects

A classic SEO penalty that isn’t too common anymore: Google recommends not using meta-refresh and/or JavaScript timed redirects. These confuse users, increase bounce rates, and are problematic for the same reasons as cloaking. Use a 301 (if permanent) or 302 (if temporary) redirect at the server level instead.

Source(s): Google

Text in JavaScript

While Google continues to improve at crawling JavaScript, there’s still a fair chance that Google will have trouble crawling content that’s printed using JavaScript, and further concern that Googlebot won’t fully understand the context of when it gets printed and to whom. While printing text with JavaScript won’t cause a penalty, it’s an undue risk and therefore a negative factor.

Source(s): Matt Cutts

Poor Uptime

Google can’t (re)index your site if they can’t reach it. Logic also would dictate that a site that’s unreliable also leads to a poor Google user experience. While one outage is unlikely to be devastating to your rankings, achieving reasonable uptime is important. One or two days should be fine. More than this will cause problems.

Source(s): Matt Cutts

Private Whois

While it’s often pointed out that Google can’t always access whois data from every registrar, Matt Cutts made it clear at PubCon 2006 that they were still looking at this data, and that private whois, when combined with other negative signals, may lead to a penalty.

Source(s): Matt Cutts

False Whois

Similar to private whois data, it’s been made clear that representatives from Google are aware of this common trick and treating it as a problem. If for no reason other than it being a violation of ICANN guidelines, and potentially allowing a domain hijacker to steal your domain via a dispute without you getting a say, don’t use fake information to register a domain.

Source(s): Matt Cutts

Penalized Registrant

If you subscribe to the notion that private and false whois records are bad, and take into account that Matt Cutts has discussed using this as a signal that identifies webspam, it stands to reason that a domain owner can be flagged and penalized across numerous sites. This is unconfirmed and purely speculative.

Source(s): Speculation

ccTLD in Global Ranking

ccTLDs are country-specific domain suffixes, such as .uk and .ca. They are the opposite of gTLDs, which are global. These are useful in executing international SEO, but can be equally problematic when attempting to rank outside of these countries. An exception to this rule is that a small number of ccTLDs have been widely used for other purposes such as .co, and have been labeled by Google as “gccTLDs”.

Source(s): Google

Too Many Internal Links

Matt Cutts once stated that there was a hard limit of 100 links per page, which was later retracted to say “keep it at a reasonable number”. This was because Google once would not download more than 100K of a single page. That’s no longer true, but since every link divides your distribution of PageRank, this potential makes sense without any altered understanding of how Google works.

Source(s): Matt (blog), Matt (video)

Too Many External Links

As a simple function of the PageRank algorithm, it’s possible to “leak PageRank” out from your domain. Note, however, that the negative factor here is “too many” external links. Linking out to a reasonable number of external sites is a positive ranking factor that’s confirmed by Mr. Cutts in the same source article for this factor.

Source(s): Matt Cutts

Invalid HTML/CSS

Matt Cutts has said no to this being a factor. Despite this, our experience has consistently indicated yes. Code likely doesn’t have to be perfect and this may be an indirect effect. But the negative effects of bad code are supported by logic as you consider other code-related factors (hint: there’s a code filter up top). Bad code can cause countless, potentially invisible issues including tag usage, page layout, and cloaking.

Source(s): Matt Cutts

Outbound Affiliate Links

Google has vocally taken action against affiliate sites that provide ‘no additional value’ in the past. It’s in the guidelines. There’s much SEO paranoia that surrounds hiding affiliate links using a 301 redirect in a directory blocked by robots.txt, although Google can view HTTP headers without navigating. A number of affiliate marketers have reported reasonably scientific case studies of penalties from too many affiliate links, so we rate this as likely.

Source(s): Google, Affiliate Marketer’s Study

Parked Domain

A parked domain is a domain that does not yet have a real website on it, often sitting unused at a domain registrar with nothing but some machine-generated advertising. These days, such a domain fails to meet so many other ranking criteria that it probably wouldn’t have much success in Google anyway. Parked domains once had some. But Google has repeatedly made it clear that they don’t want to rank parked domains of any kind.

Source(s): Google

Search Results Page

Generally speaking, Google wants users to land on content, not other pages that look like listings of potential content, like the Search Engine Results Page (SERP) that such a user just came from. If a page looks too much like a search results page, by functioning as just an assortment of more links, it’s likely to not rank as well. This may also apply to blog posts outranking tag/category pages.

Source(s): Matt Cutts

Automatically Generated Content

Machine-generated content that’s based upon a user’s search query will “absolutely be penalized” by Google and is considered a violation of the Google Webmaster Guidelines. There are a number of methods that could qualify, which are detailed in the Guidelines. One exception to this rule appears to be machine-generated meta tags.

Source(s): Matt Cutts, Webmaster Guidelines

Too Many Footer Links

It’s been made very clear that links tucked into the footer of a site don’t carry the same weight as those in an editorial context. It’s also true that when Google first began speaking about their actions against paid link schemes, the practice of spamming site footers with dozens of paid external links was widespread, and therefore too many external footer links can draw that sort of penalty.

Source(s): Matt Cutts

Infected Site

Many website owners would be surprised to know that most compromised web servers are not defaced. Often, the offending party will actually go so far as to patch your security holes to protect their newfound property, without you ever knowing. This will then manifest itself in the form of malicious activity enacted on your behalf such as virus/malware distribution and further exploits, which Google takes very seriously.

Source(s): Webmaster Guidelines

Phishing Activity

If Google might have reason to confuse your site with a phishing scheme (such as one that aims to replicate another’s login page to steal information), prepare for a world of hurt. For the most part, Google simply uses a blanket description of “illegal activity” and “things that could hurt our users”, but in this interview, Matt specifically mentions their anti-phishing filter.

Source(s): Matt Cutts

Outdated Content

A Google patent exists surrounding stale content, which is identified in a variety of ways. One such method for defining stale content basically just surrounds being old. What is unclear is whether this factor harms rankings on all queries, or simply when a particular search query is associated with something Google refers to as Query Deserves Freshness (QDF), which means exactly what it sounds like.

Source(s): Patent US 20080097977 A1

Orphan Pages

Orphan pages, meaning pages of your site that are difficult or impossible to find using your internal link architecture, can be treated as Doorway Pages and act as a webspam signal. At minimum, such pages likely do not benefit from internal PageRank, and therefore have far less authority.

Source(s): Google Webmaster Central

Sexually Explicit Content

While Google does index and return X-rated content, it’s not available when their Safe Search feature is turned on, which is Google’s default state. It’s therefore reasonable to consider that unmoderated user-generated content or one-time content that inadvertently crosses a certain line may be blocked by the Safe Search filter.

Source(s): Google Safe Search

Selling Links

Matt Cutts presents a case study where the toolbar PageRank of a domain decreased from seven to three as a direct result of outbound paid links. As a violation of Google’s Webmaster Guidelines, it appears that directly selling links that pass PageRank can lead to penalties on both the on-page and off-page ends of a site.

Source(s): Matt Cutts

Subdomain Usage

Subdomains (thing.yoursite.com) are often viewed as separate websites by Google, as compared to subfolders (yoursite.com/thing/), which are not. This can be negative in a number of ways as it relates to other factors. One such scenario would involve a single, topical site with many subdomains, not benefiting from factors on this page that have “domain-wide” in their names.

Source(s): Matt McGee and Paul Edmondson

Number of Subdomains

The number of subdomains on a site appears to be the most significant factor in determining whether subdomains are each treated as their own sites. Using an extremely large number of subdomains, although not a terribly easy thing to do by mistake, could theoretically cause Google to treat one site like many sites, or many sites like one site.

Source(s): Speculation

HTTP Status Code 4XX/5XX on Page

If your web server returns pretty much anything other than a status code of 200 (OK) or 301/302 (redirect), it is implying that the appropriate content was not displayed. Note that this can happen even if you are able to view the intended content yourself in your browser. In cases where content is actually missing, it’s been clarified by Google that a 404 error is fine and actually expected.

Source(s): Speculation

Domain-wide Ratio of Error Pages

Presumably, the possibility for users to land on pages that return 4XX and 5XX HTTP errors is a sure mark of an overall low-quality website. We speculate this is a problem in addition to pages that are not indexed due to carrying such an HTTP header, and pages that include broken outbound links.

Source(s): Speculation

Code Errors on Page

Presumably, if a page is full of errors generated by PHP, Java, or other server-side language, it meets Google’s definitions of a poor user experience and a low quality site. At absolute minimum, error messages within the page text likely interfere with Google’s overall analysis of the text on the page.

Source(s): Speculation

Soft Error Pages

Google has repeatedly discouraged the use of “soft 404” pages or other soft error pages. These are basically error pages that still return HTTP code 200 in the document headers. Logically, this is difficult for Google to process correctly, and even though your users see an error page, Google may, at minimum, treat these as actual low-quality pages on your site, significantly lowering how the overall quality of your domain’s content is scored.

Source(s): Google
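
A soft error page is, by definition, a mismatch between the status code and what the page says. A crawler-side audit can look for exactly that mismatch. The sketch below is a heuristic only; the function name and the phrase list are assumptions for this example, and a real audit would tune the phrases to the wording of the site’s own error templates.

```python
# Hypothetical phrases that commonly appear on error templates.
ERROR_PHRASES = ("page not found", "no longer available", "does not exist")


def looks_like_soft_404(status: int, body: str) -> bool:
    """True if a response claims success (200) but reads like an error page."""
    text = body.lower()
    return status == 200 and any(phrase in text for phrase in ERROR_PHRASES)


print(looks_like_soft_404(200, "<h1>Page Not Found</h1>"))
print(looks_like_soft_404(404, "<h1>Page Not Found</h1>"))
```

Pages flagged this way should be changed to return a real 404 (or 410) status.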

Outbound Links

On some level, something known as “PageRank leakage” does exist: you only have so many “points” to distribute, and “points” that leave your site cannot circle immediately back. But Matt Cutts has confirmed that there are other controls that specifically reward some genuinely relevant and authoritative outbound links. Websites are meant to be intersections, not cul-de-sacs.

Source(s): Matt Cutts, Nicole V. Beard

HTTP Expires Headers

Setting “Expires” headers with your web server can control browser caching and improve performance. Unfortunately, depending on how they’re wielded, they can also cause problems with search indexing, by telling search engines that content will not be fresh again for potentially a long time. In all cases, they may tell Googlebot to go away for longer than desired, as their analysis seeks to emulate a real user experience.

Source(s): Moz Discussion

Sitemap Priority

Many theorize that the “priority” value assigned to individual pages in an XML sitemap has an impact on crawling and ranking. Much like other signals that you might hand to Google via Webmaster Tools, it seems unlikely that some pages would really rank higher just because you asked; the value is mainly useful as a signal to de-prioritize less important content.

Source(s): Sitemaps.org

Sitemap ChangeFreq

The ChangeFreq variable in an XML sitemap is intended to indicate how often the content changes. It’s theorized that Google may not re-crawl content faster than you tell it is changing. It’s unclear however if Google actually follows this attribute or not, but if they do, it seems that it would yield a similar result as adjusting the crawl speed in Google Webmaster Tools.

Source(s): Sitemaps.org
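
For reference, “priority” and “changefreq” are optional child elements of each `<url>` entry in the Sitemaps.org protocol. The sketch below builds such a file with Python’s standard `xml.etree.ElementTree`; the function name `build_sitemap` and the example URLs are made up, and whether Google acts on either value remains unconfirmed, as discussed above.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> str:
    """Build sitemap XML from (loc, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        # The protocol expects a value between 0.0 and 1.0.
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    ("https://example.com/", "daily", 1.0),
    ("https://example.com/tags/misc", "yearly", 0.1),
]))
```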

Keyword-Stuffed Meta Description

It’s theorized that, even though Google now tells us that they don’t use meta descriptions in web ranking, only for ads, it may still be possible to send webspam signals to Google if there’s an apparent attempt to abuse the tag.

Source(s): Speculation

Keyword-Stuffed Meta Keywords

Since 2009, Google has said that they don’t look at meta keywords at all. Despite this, the tag is still widely abused by people who don’t understand or believe that idea. It’s theorized that because of the latter fact, this tag may yet serve to send webspam signals to Google.

Source(s): Matt Cutts

Spammy User-Generated Content

Google should single out problems appearing in the user-generated portions of your site and issue very targeted penalties in such a context. This is one of few circumstances where a warning may appear in Google Webmaster Tools. We’re told these penalties are usually limited to certain pages. We’ve found that WordPress trackback spam appearing in a hidden DIV is one way that this penalty can creep up undetected.

Source(s): Matt Cutts

Foreign Language Non-Isolation

Obviously, if you write in a language that doesn’t belong to your target audience, almost no positive, on-page factors can work their charm. Matt Cutts admits that improperly isolated foreign language content can be a stumbling point both for search spiders and for users. To not interfere with positive ranking factors, Google needs to be able to interrelate content on the page as well as sections of a site.

Source(s): Matt Cutts

Auto-Translated Text

Using Babelfish or Google Translate to rapidly “internationalize” a site is a surprisingly frequent practice for something that Matt Cutts explicitly states is a violation of their Webmaster Guidelines. For those fluent in Google-speak, that usually means “it’s not just a devaluation, it’s a penalty, and probably a pretty bad one”. In a Google Webmaster video, Matt categorizes machine translations as “auto-generated content”.

Source(s): Matt Cutts

Missing Robots.txt

As of 2015, Google Webmaster Tools advises site owners to add a robots.txt file to their site when one is missing. This has led many to theorize that a missing robots.txt file is bad for rankings. We consider this odd, since Google Search’s John Mueller advises removing robots.txt entirely when Googlebot is entirely welcome. We chalk this myth up to department miscommunication.

Source(s): John Mueller via SER

Positive Off-Page Factors

Off-Page Factors describe events that take place somewhere other than on the site that you directly control and are trying to improve performance of in the rankings. This usually takes the form of backlinks from other sites. Positive Off-Page Factors generally relate to an attempt to understand honest, natural popularity, with a large emphasis on popularity achieved from more-trusted and influential sources.

Authoritative Inbound Links to Page

Links from sites that themselves have a large number of inbound links are worth far more than links from sites without. The same is true for their inbound links, in determining the value of their link to you, and so on. In this way, links are like currency, with hypothetical value ranging from $0 to $1,000,000. This is a function of the PageRank algorithm.

Source(s): Larry Page
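
To make the “and so on” concrete, here is a minimal sketch of the iterative calculation in its textbook form, written in Python. The tiny link graph, the page names, and the iteration count are made up for illustration, and the 0.85 damping factor is the commonly cited value from the original PageRank paper. This is not Google’s production system, only the basic idea that a link’s value depends on the links pointing at the linking page.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages iteratively; `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current value evenly across its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Pages with no outbound links spread their value over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "hub" has three inbound links, "nobody" has none.
links = {
    "a": ["hub"], "b": ["hub"], "c": ["hub"],
    "hub": ["you"],
    "nobody": ["rival"],
    "you": [], "rival": [],
}
ranks = pagerank(links)
print(ranks["you"] > ranks["rival"])
```

In this toy graph, “you” and “rival” each have exactly one inbound link, but “you” scores higher because its link comes from a page that is itself well linked.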

Authoritative Inbound Links to Domain

PageRank obtained from external links is distributed throughout a domain in the form of internal PageRank. Domain names tend to gain authority as a whole: content published on an authority site will instantly rank far higher than content published on a domain with no real authority.

Source(s): Larry Page

Link Stability

Backlinks appear to gain value as they age. Speculatively, this may be because spam links get moderated and paid link schemes eventually expire. Therefore, backlinks existing for longer periods are worth more. This is also supported by a patent.

Source(s): Patent US 8549014 B2

Social Signals

This phrase, coined by Google, refers to ongoing experimentation with sharing and reputation on social media to further appraise the authority of a site. After Google launched Google+ and ended their firehose agreement with Twitter, Matt Cutts said this became less of a thing as they experimented with Google+ data. Recent studies still confirm that positive social reputation correlates, directly or indirectly, with better rankings.

Source(s): Matt Cutts, Moz Study

Keyword Anchor Text

The anchor text used in an external link will help establish relevance of a page towards a search term. The target page does not need to contain this term to rank (see: Google Bombing).

Source(s): Patent US 8738643 B1

Links from Relevant Sites

Links from sites that cover similar material to yours are expected. Contrary to popular misconception and a number of highly destructive link building/unbuilding schemes, not every link to your site needs to come from a domain that’s only dedicated to a subject. This would appear very unnatural. But so would never being a part of industry specific discussions. This is a function of the Hilltop algorithm.

Source(s): Krishna Bharat

Partially-Related Anchor Text

When a backlink portfolio is earned naturally, as it’s supposed to be, not everybody links to a site the same way. Anchor text that includes portions of a keyword phrase, or a keyword phrase plus something else is expected. Google’s patents refer to this as “partially-related” anchor text, though SEOs more often call it “partial match”.

Source(s): Patent US 8738643 B1

Partially-Related ALT Text

Just like partial match anchor text, the ALT attribute of images is something that varies in nature and appears to still carry with it an increase in weight for phrases that contain certain words. This is unconfirmed by Google, but can be proven by very simple experimentation on very non-competitive queries, such as using made-up words. Google’s patents refer to this as “partially-related” anchor text, though SEOs more often call it “partial match”.

Source(s): Patent US 8738643 B1

Keyword Link Title

It was long theorized that the “title” attribute of a link might be treated similarly to anchor text, giving additional weight to certain words. At PubCon 2005, Google directly dispelled any such possibility, saying that simply not enough people used this attribute. Various real world studies appear to confirm that “title” is indeed not a factor.

Source(s): Ann Smarty via SEJ

Keyword ALT Text

Keywords used in the ALT attribute of an image are treated as anchor text. Short, genuinely descriptive ALT tags also improve overall accessibility and have an exceedingly strong impact on images appearing in-line with searches from Google Image Search.

Source(s): Patent US 8738643 B1, Matt Cutts

Context Surrounding Link

For quite some time, it’s been established that the text surrounding a link, in addition to the anchor text within, is considered in evaluating context. Support for this theory is reinforced by a patent and simple experimentation. Therefore, links in text are likely to provide more value than a stand-alone link that’s detached from context.

Source(s): Patent US 8577893, SEO By The Sea

Brand Name Citation

A major factor in local SEO, or Google Maps SEO, is local citations: brand mentions with company name, address, phone, but no backlink. Rand from Moz noted a case study that he believed supported speculation that this was making its way into “traditional SEO” as well. This study was, however, debunked by several comments without rebuttal, so for now, we consider it a myth.

Source(s): Moz Study

Link From Site in Same Results

In the Google Patent “Ranking search results by reranking the results based on local inter-connectivity” (insert programmer joke about recursion), Google describes a process in which having a backlink from a site that already ranks for a certain search query can increase your own weight for that particular search query by more than it would otherwise.

Source(s): Patent US 6526440 B1

Links from Many “Class C” IP Ranges

In general, Google scores the authority, quality, and relevance of pages and domains that link to you, not IP addresses. The one exception involves the Hilltop algorithm; specifically section 2.1 of Krishna Bharat’s research paper, titled “Detecting Host Affiliation”. Sites that share the same /24 IP range, or first three octets of an IPv4 address (up to the C in A.B.C.D) are treated as having the same owner and disqualified from the Hilltop bonuses derived from the links of third party experts.

Source(s): Krishna Bharat
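
The /24 comparison described above is easy to sketch in Python with the standard `ipaddress` module. The function names and the sample addresses (taken from ranges reserved for documentation) are our own illustration, and this only shows the “first three octets” check, not how Hilltop itself is implemented.

```python
import ipaddress


def same_slash_24(ip_a, ip_b):
    """Return True when two IPv4 addresses share their first three octets."""
    # strict=False lets us build the /24 network from any host address in it.
    network = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    return ipaddress.ip_address(ip_b) in network


def distinct_slash_24_blocks(ips):
    """Collapse a list of linking IPs into the set of /24 blocks they occupy."""
    return {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in ips}


# Hypothetical IPs of three linking sites; the first two share A.B.C.
linking_ips = ["203.0.113.10", "203.0.113.200", "198.51.100.7"]
print(same_slash_24(linking_ips[0], linking_ips[1]))  # True
print(same_slash_24(linking_ips[0], linking_ips[2]))  # False
print(len(distinct_slash_24_blocks(linking_ips)))     # 2
```

Under the rule described above, the first two sites would count as one affiliated source rather than two independent experts.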

DMOZ Listing

Of all the websites on the Internet you could get a backlink from, one magical opportunity defies the laws that the rest abide by. That’s DMOZ: Directory Mozilla, The Open Directory Project, once the data feed for The Google Directory. It’s a political nightmare rife with corruption, but it’s human-edited, and when you finally get listed the effects are noticeable, even in 2015.

Source(s): Matt Cutts

Click Through Rate on Query/Page

It’s been heavily theorized that Click Through Rate from the results page is a ranking factor. It’s a Bing ranking factor. Matt glossed over ranking implications in 2009. Repeatedly, Rand Fishkin has used Twitter to lead experiments which look surprisingly conclusive at confirming that CTR is a ranking factor.

Source(s): Moz Study, Patent US 9031929 B1

Click Through Rate on Domain

A patent by Navneet Panda (of the Panda algorithm) describes assigning site quality scores based on CTR for various searches. The title of this patent is literally “Site quality score”. It also speaks of branded search queries, followed by clicks as the primary method. Still, these factors, in addition to evidence for search query CTRs as a factor, seem to suggest that sitewide CTR may be a factor.

Source(s): Patent US 9031929 B1

Backlinks from .EDU

A popular hustle that targets SEO newbies sells backlinks from .edu sites. The claim is that they carry more value. Matt Cutts dispelled this directly, stating “Google doesn’t treat .edu and .gov links differently”. While it’s true that these sites can have higher average authority from frequent, natural source citations, “buy guaranteed .edu links” schemes aren’t earned that way, and instead usually spam unsecured forums and blogs where authority is heavily diluted.

Source(s): Matt Cutts

Backlinks from .GOV

Just like .EDU backlinks, the notion that .GOV backlinks have a magical influence on Google beyond that of a normal gTLD with similar attributes is just not true. There has been further speculation that having some links from these sites may carry a “more natural balance”, but when reviewing statements, case studies, and the basic logic of looking at large brands without any of these types of links, this doesn’t appear to hold water.

Source(s): Matt Cutts

Positive Link Velocity

It’s speculated that if your site and its content are gaining backlinks faster than they’re losing them, you deserve a little more attention than incumbent big brands that otherwise have a somewhat unfair off-page SEO advantage. A Google patent confirms that they’re at least looking at this, stating: “By analyzing the change in the number or rate of increase/decrease of back links to a document (or page) over time, search engine may derive a valuable signal of how fresh the document is.”

Source(s): US 8521749 B2
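
As a rough illustration of the “rate of increase/decrease” the patent mentions, the sketch below computes net backlinks gained per day between snapshots. The function name, the snapshot format, and the numbers are hypothetical; the patent does not specify a formula, so treat this as one plausible way to measure the idea on your own backlink data.

```python
def link_velocity(snapshots):
    """Net backlinks gained per day between consecutive (day, count) snapshots."""
    rates = []
    for (day_0, count_0), (day_1, count_1) in zip(snapshots, snapshots[1:]):
        rates.append((count_1 - count_0) / (day_1 - day_0))
    return rates


# Hypothetical backlink counts taken every 30 days.
snapshots = [(0, 100), (30, 130), (60, 190), (90, 175)]
print(link_velocity(snapshots))  # [1.0, 2.0, -0.5]
```

A positive and rising rate corresponds to the “positive link velocity” described above, while the negative final value shows a period where links were lost faster than they were gained.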

Low Bounce Rate

It’s been theorized that Google looks at search user bounce rate as a ranking factor. Even without Google Analytics or Chrome data this could be easily measured in several ways. Matt Cutts says no, and that tracking how long users remain on a page would be “spammable and noisy”. Yet, SEO Black Hat and Rand Fishkin have run studies that indicate otherwise, and Bing’s Duane Forrester has clearly confirmed that Bing uses it; a factor that they call “dwell time”.

Source(s): Matt Cutts via SER, SEO Black Hat Study, Rand Fishkin Study

Natural Ratio of Deep Links

As a simple function of the PageRank algorithm, pages that are linked to directly receive more authority than pages that only benefit indirectly, such as through links to a site’s homepage. A reasonable percentage of direct inbound links beyond the homepage would be expected when extremely manipulative practices aren’t at play.

Source(s): Larry Page

Google+ Profile

Although a somewhat unpredictable factor, making use of Google+ can carry with it a variety of ranking benefits. Some speculate that Google+ could help with traditional rankings; we believe, however, that the only real benefits as of this writing manifest themselves in non-traditional ways. For examples of the rankings that may be achieved with Google+, see Dr. Pete Meyers’ 2013 MozCon presentation, Beyond 10 Blue Links.

Source(s): Dr. Pete’s Study

Twitter Followers

It’s been heavily theorized that a brand’s number of Twitter followers might be a direct ranking factor. Claims from Google, however, are to the contrary. While it’s true that a Twitter audience is an invaluable asset for nurturing a community of brand advocates, that manifests other benefits in the way of long-term drip marketing, word of mouth sales, and backlinks from your content, all evidence indicates that Google is not currently looking at this information.

Source(s): Matt Cutts

Twitter Sharing

According to Google, shares on social media are basically just treated like more backlinks, and there is no additional, direct benefit at the present that comes from content being shared on Twitter. In 2010, Google told Danny Sullivan “who you are on Twitter matters”. In 2014, Matt Cutts said “to the best of his knowledge”, nothing like this existed.

Source(s): Matt Cutts, Danny Sullivan

Facebook Likes

It’s been heavily theorized that a brand’s number of Facebook page “likes” might be a direct ranking factor. Claims from Google, however, are to the contrary. While it’s true that a Facebook audience is an invaluable asset for nurturing a community of brand advocates, that manifests other benefits in the way of long-term drip marketing, word of mouth sales, and backlinks from your content, all evidence indicates that Google is not currently looking at this information.

Source(s): Matt Cutts

Facebook Sharing

According to Google, shares on social media are basically just treated like more backlinks, and there is no additional, direct benefit at the present that comes from content being shared on Facebook. In 2010, Google told Danny Sullivan “who you are on Twitter matters”. In 2014, Matt Cutts said “to the best of his knowledge”, nothing like this existed.

Source(s): Matt Cutts, Danny Sullivan

Google+ Circles

For a short while, Google’s “Search, plus Your World” campaign enabled social search functionality that put personalized search results into overdrive. Massive ranking preference was given to documents/sites +1’d by your Circlers. It appears now that this was deemed a failed experiment. Although there’s no harm in handing Google more positive signals for the future, there’s also no evidence that Circles matter right now.

Source(s): Google

Google+ “+1’s”

Google began experimenting with the Google+ “+1” button most everywhere when it rolled out. We all thought it could impact rankings. Cyrus Shepard posted a correlation study on Moz that showed that sites that were popular enough to get more +1’s ranked better. A day later, Matt Cutts told us all of that was bogus and that +1s don’t directly impact rankings. For now, we call this “iffy”, bordering on “myth”.

Source(s): Moz correlation study, Matt Cutts via SER

Link from Older Domain

Microsoft filed a patent request in 2008 to treat backlinks from older domains with more weight. The breakdown was 100% value for 10+ years, 75% for 6-10 years, 50% for 3-6 years, 25% for 1-3 years, and 10% for less than one year. It’s theorized but unconfirmed that Google has a similar process.

Source(s): SEO by the Sea
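
The tiers above translate directly into a lookup, sketched here in Python. The function name is hypothetical, and because the published ranges overlap at their edges (a 6-year-old domain appears in two tiers), this sketch assumes the lower bound of each tier is inclusive. It mirrors the Microsoft breakdown as summarized above and says nothing about what Google does.

```python
def domain_age_weight(age_years):
    """Weight applied to a backlink based on the linking domain's age in years."""
    if age_years >= 10:
        return 1.00
    if age_years >= 6:   # assumption: exactly 6 years falls in the 6-10 tier
        return 0.75
    if age_years >= 3:   # assumption: exactly 3 years falls in the 3-6 tier
        return 0.50
    if age_years >= 1:
        return 0.25
    return 0.10


for age in (12, 7, 3, 0.5):
    print(age, domain_age_weight(age))
```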

Query Deserves Freshness (QDF)

Google doesn’t rank every search query the same way. Certain search queries, especially those that are news-related, are especially sensitive to the freshness of content that they will publish (and may only rank content that is recent). Google’s term for this is Query Deserves Freshness (QDF).

Source(s): Matt Cutts, Amit Singhal

Query Deserves Sources (QDS)

A phrase that we’ve coined to cover a scenario described in Google’s Quality Rater Guidelines, used when humans conduct quality control on Google search results. This asks: “this is a topic where expertise and/or authoritative sources are important”. Presumably, this applies to all informational search queries (in contrast to transactional and navigational queries).

Source(s): Barry Schwartz

Query Deserves Oldness (QDO)

This is a phrase that we made up to describe a situation detailed in a Google patent. It’s specifically noted that: “For some queries, older documents may be more favorable than newer ones.” The patent then goes on to describe the process in which documents would be ranked by their age, as a function of the average age of results for that query.

Source(s): Patent US 8549014 B2

Query Deserves Diversity (QDD)

Certain search queries are ranked differently by Google. One theory is called Query Deserves Diversity, likely dependent on a concept called entity salience by attaching meaning to the same word with differing definitions. As a bit of a riff on the concept of Query Deserves Freshness, this would be similar to a Wikipedia disambiguation page, where the search query is vague and a variety of result types are needed at the top of the results. Unconfirmed, but easily replicated.

Source(s): Rand Fishkin

Safe Search

In certain circumstances where adult content may be involved, a site may or may not rank based completely on whether or not Safe Search is enabled in Google’s settings. By default, Safe Search is turned on.

Source(s): Google

Use AdWords

SEO paranoia seems to prevent this myth from dying. There are no credible studies that we have encountered that suggest AdWords will improve rankings in any way. AdWords influencing organic rankings runs counter to Google’s core philosophies, and nobody is more vigilant about speaking out against this myth than Google.

Source(s): Matt Cutts

Don’t Use AdWords

Just as using AdWords is allegedly a ranking factor in some very un-scientific circles, so is not using AdWords. The notion that AdWords can have any influence on Google’s organic rankings in any way, now or in the future, has been dispelled by Google maybe more aggressively than any other SEO myth.

Source(s): Matt Cutts

Chrome Page Bookmarks

Although directly denied by Matt Cutts, this was affirmed at the 2013 BrightonSEO conference during the ex-Googler fireside. It’s also suggested by a Google Patent, which states: “Search engine may then analyze over time a number of bookmarks/favorites to which a document is associated to determine the importance of the document.”

Source(s): Matt Cutts via SER, BrightonSEO Fireside, Patent US 20070088693

Chrome Site Traffic

Also denied by Google, the Patent “Document scoring based on traffic associated with a document” also touches on using browser traffic data for the purposes of ranking sites, stating: “information relating to traffic associated with a document over time may be used to generate (or alter) a score associated with the document.”

Source(s): Patent US 20070088693, Lifehacker Analysis

User Search History

It’s common to be served personalized search results based on your search history unless you have specifically disabled this feature in Google. As of 2009, signing into a Google account is not a requirement for being served results that are personalized based upon your recent search history.

Source(s): Brian Horling

Google Toolbar Activity

Just as Matt Cutts stated that Google Chrome data is not used in determining rankings in Google’s organic search results, the same was said for the Google Toolbar. Despite this, it’s widely reported by SEOs, which may relate to a Google Patent that directly discusses a method of doing exactly this via a browser plugin.

Source(s): Matt Cutts via SER, Patent US 20070088693

Low Alexa Score

While there are patents and speculation that suggest that Google could theoretically look at site traffic as a ranking factor, there’s absolutely no evidence to support that they are doing so using Alexa at present. In what documentation does exist, it’s suggested that they would do this using Chrome data, which by the way, they’ve totally cleared themselves to do.

Source(s): Patent US 20070088693

High MozRank/MozTrust Score

The “toolbar PageRank” scores that we see do not match the actual PageRank data that Google Search uses. That data is often wildly inaccurate these days, and that’s led many to defer to MozRank. Despite this, Google has always been in the business of computing the value of links on their own, and while Moz data might correlate, it’s not related to rankings in any way. The same goes for any other third party metrics like Majestic or Ahrefs.

Source(s): Speculation

Total Branded Searches + Clicks

Navneet Panda’s patent titled “site quality score” describes a scenario where navigational brand searches in Google (such as “Northcutt contact page”) contribute to a domain-wide quality score. It states: “The score is determined from quantities indicating user actions of seeking out and preferring particular sites and the resources found in particular sites.”

Source(s): Patent US 9031929 B1

High Dwell Time (Long Clicks)

The “Site quality score” patent describes a scenario that rewards branded searches + clicks as a ranking factor. As a part of their methods, it also states: “Depending on the configuration of the system …. a click of at least a certain duration, or a click of at least a certain duration relative to a resource length, for example, may be treated by the system as being a user selection.” It’s also supported by several other sources and used by Bing and Yahoo.

Source(s): Patent US 9031929 B1, Bill Slawski

Submit Site to Google

Google has long had a tool that allowed you to submit your site to be crawled. A long-standing myth is that this provides any ranking benefits whatsoever. In fact, in cases where a site is not even in the index, it almost appears to be a placebo button. For your site to rank, on top of simply being aware of it, Google will instead need to find it through some worthwhile links.

Source(s): Google

Submit Sitemap Tool

It’s possible to submit an XML Sitemap to Google using Google Webmaster Tools. This does appear to get more pages into the index in some cases, but for the same reasons that the raw “submit site” concept is not ideal, neither is “Submit Sitemap”. If Google couldn’t find the pages on its own, they’re likely doomed never to rank. And as Rand Fishkin points out, this tool stops many diagnostic processes cold.

Source(s): Rand Fishkin

International Targeting Tool

Google Webmaster Tools provides a tool for international targeting when it may not be done correctly otherwise, mainly for use with a generic TLD like “.com”, or “gccTLDs” like .co that were intended for a particular country, but widespread use has caused Google to treat them more generically. This can help with rankings in certain countries in certain situations.

Source(s): Google

Reconsideration Requests

Google’s reconsideration request tool is generally the answer to a manual action. This tool essentially petitions Google to have someone manually review a site to determine whether or not a manually placed penalty should be removed. Considering that manual actions make up an extremely small portion of negative ranking factors, this tool should rarely be necessary.

Source(s): Google

Google+ Local Verified Address

It’s often theorized that a Google+ Local page, in which businesses verify their address using a postcard for listing in Google Maps, is a ranking factor in Google’s primary web search results. While true that this is a significant ranking factor for Google Maps searches, and when the local listings box is imposed in-line with traditional Google search results, we’ve found no evidence to support this theory.

Source(s): Speculation

Links from ccTLDs in Target Country

Google uses Country Code Top Level Domains (ccTLDs) to establish that a site is relevant to a certain country. It’s widely accepted that backlinks from a particular country’s ccTLDs will improve Google rankings for such a country.

Source(s): Google

Links from IP Addresses in Target Region

Google has told us that having a server near to your target audience, on a broad, international scale, will improve rankings for that audience. It’s also known that a number of other factors serve to establish geographic relevance: proven by simply comparing results from Google.com and Google.co.uk. Therefore, it’s theorized that as with most things, Google analyzes those that link to your site using the same tools as they have confirmed are used to analyze your site.

Source(s): Matt Cutts

Negative Off-Page Factors

Negative Off-Page Factors are generally related to unnatural patterns of backlinks to your site, usually due to intentional link spam. Until the Penguin algorithm was introduced in 2012, the result of these factors was almost always a devaluation, rather than a penalty. That is, you could lose all, or nearly all, value obtained from linking practices that Google felt may be unnatural, but your site would not be harmed otherwise. While that’s still mostly true, Penguin introduced off-page penalties in a number of cases, which has opened the floodgates for malicious behavior from competing sites as a practice known as negative SEO or Google Bowling.

Excessive Cross-Site Linking

When owning multiple sites, it’s discouraged to inter-link them for the purpose of inflating your inbound link authority. Risk increases with the number of inter-linked domains. Common ownership may be detected by domain registrant, IP address, similarity of content, similarity of design, and rarely, identified and penalized as part of a manual action. Exception made for Internationalization or “when there’s a really good reason, for users, to do it”.

Source(s): Matt Cutts

Negative SEO (Google Bowling)

Negative SEO, historically dubbed “Google Bowling”, is malicious link spam conducted on behalf of your site by a third party. This was once very difficult, since we lived in a world of off-page devaluations, rather than off-page penalties. If a devaluation were to occur, a competitor could only exaggerate existing schemes, causing value to be lost sooner or more assuredly. If off-page penalties exist, which they do, negative SEO is proven by logic alone.

Source(s): Matt Cutts

Paid Link Schemes

Links can’t be purchased directly from a website owner for the sole purpose of passing PageRank. Matt Cutts states that this is directly inspired by the FTC’s guidelines on paid endorsements. To phrase this another way, backlinks are viewed as endorsements, and genuine endorsements are supposed to happen without direct compensation.

Source(s): Google, Matt Cutts

Fresh Anchor Text

The age of anchor text used in a link, specifically anchor text that appears to be changing on another site, can signify a problem. Speculatively, this implies that the link is not actually from a third party and/or an active experiment in ranking manipulation.

Source(s): Patent US 8549014 B2

Diluted Page Authority

As a function of the PageRank algorithm, every link on a page divides the overall authority that is passed to the pages that are linked. For example, one page with one link may pass a hypothetical PageRank value of 1.0, whereas an identical page with 1,000 outbound links would pass 0.001.

Source(s): Matt Cutts, Larry Page
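
The arithmetic in the example above can be written out as a one-line division. This Python sketch uses a hypothetical function name and ignores damping and everything else in the full algorithm; it only shows how a page’s passable value is split evenly across its outbound links.

```python
def authority_per_link(page_value, outbound_links):
    """Split a page's passable value evenly across its outbound links."""
    if outbound_links <= 0:
        raise ValueError("a page must have at least one outbound link to pass value")
    return page_value / outbound_links


print(authority_per_link(1.0, 1))     # 1.0
print(authority_per_link(1.0, 1000))  # 0.001
```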

Diluted Domain Authority

For nearly the same reason that diluted page authority is possible, it’s possible for an entire domain to dilute outbound PageRank. For this reason, sites that are more choosy about who they link to, relative to who links to them, are valuable, while sites functioning as complete free-for-all link farms have a value near zero.

Source(s): Matt Cutts, Larry Page

Unnatural Ratio of Anchor Text

To an extent, the anchor text used in links establishes relevance of the subject matter. As with every SEO tactic, the community abused this as far as they were able, and controls were put in place for when the limits were pushed well beyond what occurs without manipulation. That threshold may be as simple as a flat 10% of a particular anchor text. This is a function of the Penguin algorithm.

Source(s): Penguin 1.0 Announcement, Moz Study
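
If you want to check a backlink profile against the flat 10% figure mentioned above, a sketch like the following would do it in Python. The function names, the sample anchors, and the choice to lowercase anchors before counting are our own assumptions, and the 10% threshold is only the speculative value from the text. This applies the same threshold to every anchor; a real audit would likely treat brand and URL anchors separately.

```python
from collections import Counter


def anchor_text_shares(anchors):
    """Map each normalized anchor text to its share of all inbound links."""
    counts = Counter(anchor.strip().lower() for anchor in anchors)
    total = sum(counts.values())
    return {text: count / total for text, count in counts.items()}


def anchors_over_threshold(anchors, threshold=0.10):
    """List anchor texts whose share of links exceeds the threshold."""
    shares = anchor_text_shares(anchors)
    return sorted(text for text, share in shares.items() if share > threshold)


# Hypothetical profile of 20 inbound links.
anchors = (
    ["cheap blue widgets"] * 6
    + ["Acme"] * 2
    + ["acme.example"] * 2
    + ["click here", "this article", "Acme widgets review", "source", "here",
       "Acme Inc.", "www.acme.example", "read more", "widget guide", "homepage"]
)
print(anchors_over_threshold(anchors))  # ['cheap blue widgets']
```

Here the keyword anchor accounts for 30% of links and is flagged, while anchors at exactly 10% or below are not.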

Unnatural Ratio of Anchor Type

Just as the Moz study showed the risk of a high ratio of one anchor, a finding we have repeatedly reproduced in our work on Penguin-penalized sites, the same can be said for sites that use too much anchor text overall. Analyzing backlinks across popular brands shows high amounts of brand name anchor text, “click here” anchors, URL anchors, and banners. Pushing too far beyond the limits of what occurs naturally invites devaluations, and since Penguin, potential for penalties.

Source(s): Speculative

Unnatural Variety of Linking Sites

If you subscribe to the notion that Google is ultimately watching for natural trends, and you accept the studies done post-Penguin on sites that were severely penalized for carrying a single anchor text in more than 10% of their links, you may also subscribe to the notion that any type of unnatural ratio of off-page activity at scale can hurt you. Although no public case study is available at the time of writing this, we have repeatedly witnessed those practicing otherwise successful black hat SEO getting greedy, taking their scheme too far, and being penalized.

Source(s): Speculation

Webspam Footprints

A “footprint” is an off-page SEO term that describes virtually anything that Google might use to identify activity originating from a common source. This might be a forum username, a person’s name, a photo, a guest author biography snippet, some element of a WordPress theme that’s involved in a private blog network, or just about any subtle detail that relates the efforts of a webspam activity. Obviously, a footprint is not always bad, but if a site even slightly runs afoul of Google’s Webmaster Guidelines, footprints are often a factor that brings about penalties.

Source(s): Matt Cutts via SEL

Comment Spam

If you engage in blog comment spam – that is, commenting en masse in a repetitive, unnatural format – expect to see these links devalued or penalized as a link scheme, especially if your commenting is machine-driven, uses odd keyword anchor text, or leaves behind a footprint of irrelevant or repetitive content. Genuine commentary, on the other hand, is fine and actually encouraged. Mr. Cutts suggests using your real name in such circumstances for good measure.

Source(s): Matt Cutts

Forum Post Spam

Forum posts, like blog comments, are fine and actually good inbound marketing when they add to a conversation and are done for humans rather than search spiders. John Mueller confirms (amongst countless other sources that have appeared on this one over the years) that Google is systematically monitoring for link schemes in the form of bulk forum spam.

Source(s): John Mueller via SER

Advertorials (Native Advertising)

Advertorial content, also known as Native Advertising, is systematically sought out by Google’s webspam team, and is considered a paid link. Links in advertorials should be disclosed and given the rel="nofollow" attribute to avert potential for penalties. Presumably, this is a case where “nofollow” is definitely respected. Undisclosed advertorials can also get an entire publication delisted from Google News.

Source(s): Matt Cutts
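
For publishers who want to audit advertorial pages for links missing the nofollow attribute, the following Python sketch uses the standard library’s `html.parser`. The class name, function name, and sample markup are hypothetical, and it only checks the `rel` attribute of `<a>` tags; it makes no claim about how Google itself evaluates these links.

```python
from html.parser import HTMLParser


class FollowedLinkFinder(HTMLParser):
    """Collect hrefs of <a> tags whose rel attribute lacks "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # rel may hold several space-separated tokens, or be absent entirely.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            self.followed.append(href)


def followed_links(html):
    """Return the links in an HTML fragment that would pass as followed."""
    finder = FollowedLinkFinder()
    finder.feed(html)
    return finder.followed


# Hypothetical advertorial snippet with one compliant and one risky link.
sample = (
    '<p>Sponsored by <a href="https://sponsor.example/" rel="nofollow">Sponsor</a>. '
    'See the <a href="https://advertiser.example/deal">deal</a>.</p>'
)
print(followed_links(sample))  # ['https://advertiser.example/deal']
```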

Forum Signature & Profile Links

Google seems to decipher what links appear as forum signatures as compared to what links appear as a part of a natural discussion, which would be treated as editorial context and would likely actually be given PageRank. The same appears true for the popular webspam tactic of creating forum profiles. Executed en masse, it seems that both tactics eventually progress from adding very little granular value to an eventual webspam penalty.

Source(s): Google

Inbound Affiliate Links

Before we speculate: inbound affiliate links are often affected by PageRank decay via 301 redirects and duplicate content devaluations as the result of URL variables. There’s speculation that inbound affiliate links may be devalued for similar reasons as paid link penalties exist, intentionally or unintentionally. Matt Cutts has suggested using “nofollow” on outbound affiliate links “if you’re worried about paid links” but also has indicated that they’re “usually fine”.

Source(s): Matt Cutts, Speculation

Footer Links

It’s been made very clear that links tucked into the footer of a site don’t carry the same weight as those in an editorial context. This concept is supported by how the Page Layout algorithm works, but it also seems that links in the footer of a site are treated even worse than content that’s just below-the-fold, as Google has specifically spoken out against using too many on more than one occasion.

Source(s): Matt Cutts

Header, Sidebar Links

Like footer links, Google appears to single out links that appear in the header or sidebar of a site (whether or not they are static, sitewide links), defining this area as “boilerplate” in their patents. Specifically, the patent language states: “the article is indexed after the boilerplate has been removed; the resultant weighting may be more accurate since it relies relatively more heavily on non-boilerplate.”

Source(s): Patent US 8041713 B2

WordPress Sponsored Themes

On top of the low value that sitewide footer links now carry, it definitely seems that Google’s webspam team is well aware of the once powerful and now mostly useless tactic of producing WordPress themes with backlinks in them. Such efforts definitely leave an obvious spammy footprint, with similarities to the GWG widget example, and it’s clear that Google isn’t having it.

Source(s): Matt Cutts

Widget Links

This was once a pretty fun link scheme that, while mostly harmless and actually value-added for a lot of users, also failed to fit into a world where links are only used as genuine endorsements. While it seems that you can still distribute widgets in 2015, Google does request using “nofollow” on the links, and not applying anchor text. Google being Google, this also almost assuredly means bad things can happen if you don’t.

Source(s): Google

Author Biography Links

Every time a link building tactic becomes a little too easy to spam, Google devalues it. It’s not “dead”. But “guest posting” in 2010 was near identical to “article marketing” from 2005 for too many. These schemes resulted in the author biography section of blogs and articles being given less weight. Simple as that. And contrary to popular myth, brands are not punished for “for human” guest posts like New York Times Editorials and other genuinely authoritative media placements.

Source(s): Matt Cutts

Link Wheel (Pyramid / Tetrahedron)

You might study Larry Page’s paper on the PageRank algorithm and determine that you could interlink sites in a triangular/circular fashion to repeatedly pass PageRank to the same sites. Some PageRank decay exists, sure, but it’s very gradual. If this were still 2005, you’d have been rewarded for your brilliance. If you come across someone still selling link wheels, pyramids, or triangles in 2015, prepare for significant devaluations and possibly penalties.

Source(s): Matt Cutts via ClickZ
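
To see why the link wheel idea looked attractive on paper, here is a toy sketch of the simplified PageRank formula from the early published literature, PR(p) = (1 - d) + d × Σ PR(q) / outlinks(q), with the commonly cited damping factor d = 0.85. The `pagerank` function and the three-site graphs are illustrative assumptions; this is not how Google scores links today, and it ignores every webspam control described in this resource.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate the simplified, un-normalized PageRank formula.

    links maps each page to the list of pages it links out to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    scores = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_scores = {}
        for page in pages:
            # Each linking page splits its score evenly across its outlinks.
            inbound = sum(
                scores[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_scores[page] = (1 - damping) + damping * inbound
        scores = new_scores
    return scores


# Three hypothetical sites with no links between them.
isolated = pagerank({"a": [], "b": [], "c": []})

# The same three sites wired into a ring: a -> b -> c -> a.
wheel = pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})

for site in ("a", "b", "c"):
    print(site, round(isolated[site], 3), round(wheel[site], 3))
```

In this toy model the score passed around the ring keeps coming back to each member, reduced only by the damping factor on every hop, so the ring members settle at a much higher score than the isolated sites. That recirculation is exactly the kind of pattern the paragraph above says is now devalued or penalized.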

Article Directories

With how far Google has come with punishing domain-wide content scores with Panda and unnatural patterns of links with Penguin, it’s unclear if Google would even need to go out of their way to punish these sites anymore. It seems, however, that they still single out article directories as an issue, as recently as a 2014 Matt Cutts video, so expect to see longer-term issues if using these methods.

Source(s): Matt Cutts

Generic Web Directories

Generic web directories were one of the earliest link schemes. Matt Cutts goes out of his way to state that they do penalize paid, generic directories as paid links if they are not exercising some editorial discretion. He cites Yahoo!’s paid directory as one that is actually alright. With any link, paid or not, there appears to be a theme: editorial discretion is good, complete free-for-all listings are bad.

Source(s): Matt Cutts via ClickZ

Reciprocal Links

Google has a tendency to devalue reciprocal links, beyond the expected “leaking PageRank” type of effect that might come from too many outbound links. As a very early link scheme, too many reciprocal links, or pages/sites that link to each other, are a very clear and obvious sign that most of your links were not earned and are not natural, editorial placements.

Source(s): Matt Cutts

Private Network (Link Farms)

Similar to how cross-linking your own sites can draw penalties, so can getting involved with large, private networks of just-for-SEO sites. Google has gotten very aggressive with these sites, taking down entire networks by manual action, with countless automated methods of analyzing their webspam footprints. In 2015 these methods are still widespread as short-term black hat schemes, but on a long enough timeline, every Private Network seems to be dealt with.

Source(s): Matt Cutts via ClickZ

Google Dance

This term describes a temporary shake-up that sometimes accompanies Google’s ~500 algorithm updates per year. Technically, these effects could be positive or negative, since it’s just rearranging rankings and someone has to go up for another to go down. But since a Google Dance is always unexpected, we’re classifying it as a negative.

Source(s): Danny Sullivan

Manual Action

In spite of every other ranking factor, Google’s webspam team will still occasionally take manual action against certain sites, which can take half a year to a year to recover from after you’ve cleaned up the problems. Often, these penalties come with a notification in Google Webmaster Tools. For this reason, it’s critical to constantly look beyond the functionality of today and ask “what does Google want?” Learn Google’s philosophies and market your site in harmony.

Source(s): Matt Cutts

Poor Content Surrounding Links

Google looks at the quality of content surrounding backlinks to determine their quality, especially after the Panda and Penguin algorithms were put into production. The exact implications are unknown, but it’s not unreasonable to assume that Google’s methods for defining quality off-page are at least similar to how they define quality on-page.

Source(s): Patent US 8577893, SEO by the Sea

No Context Surrounding Links

If the context surrounding a link adds value, surely a lack of context must be bad, right? This factor is likely simply a devaluation, as on some level it is still something that occurs naturally in widespread fashion. Expect to receive less, but not a total lack of, value from backlinks that are not placed in an editorial context.

Source(s): Patent US 8577893

Ratio of Links Out of Context

It’s theorized that having too many backlinks without context surrounding them, beyond a certain volume, indicates a clear webspam footprint. This theory mixes three ideas: a Google Patent “Ranking based on reference contexts” establishes the context surrounding a link is a useful indicator of quality, the frequent discussion of webspam footprints by Matt Cutts, and the fact that some amount of no-context links is natural.

Source(s): Patent US 8577893

Irrelevant Content Surrounding Links

A Google patent titled “Ranking based on reference contexts” describes how Google may look at the words surrounding a link to determine what that link relates to. If writing is not focused and thematic, it will not take advantage of this. If surrounding content is irrelevant enough, this would look altogether bizarre compared with what occurs naturally, and penalties may be possible.

Source(s): Patent US 8577893

Rapid Gain of Links

To quote the Google Patent: “While a spiky rate of growth in the number of back links may be a factor used by search engine to score documents, it may also signal an attempt to spam search engine.” Rapid, spontaneous growth is highly likely to invite additional scrutiny from webspam filters, but appears more than fine when it comes from genuine editorial exposure or “going viral” without use of intentionally manipulative practices.

Source(s): Patent US 8521749 B2
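
If you want to monitor your own backlink reports for the "spiky" growth the patent mentions, a minimal sketch might compare each period's new links against a trailing baseline. Everything here is an assumption made for illustration: the `spiky_periods` helper, the three-period window, the 3x multiplier, and the sample counts. The patent does not publish thresholds, and this is not Google's method.

```python
from statistics import median


def spiky_periods(new_links_per_period, window=3, factor=3.0):
    """Return indexes of periods whose new-link count exceeds
    `factor` times the median of the previous `window` periods."""
    flagged = []
    for i in range(window, len(new_links_per_period)):
        baseline = median(new_links_per_period[i - window:i])
        if baseline > 0 and new_links_per_period[i] > factor * baseline:
            flagged.append(i)
    return flagged


# Hypothetical count of newly discovered backlinks per month.
monthly_new_links = [12, 15, 11, 14, 90, 16, 13]

print(spiky_periods(monthly_new_links))
```

A flagged period is only a prompt to check where the links came from. As noted above, a spike from genuine editorial exposure appears to be fine.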

Rapid Loss of Links

For nearly the same reason that a rapid gain of links can increase the scrutiny of a site’s backlink portfolio, a rapid loss of links is at least as justified a problem; probably more. As a simple logical exercise: webspam is often quickly moderated by site owners and paid links expire. The types of links that Google typically praises are the kind that tend to last.

Source(s): Patent US 8521749 B2

Sitewide Links

Sitewide links are not harmful in and of themselves, but tend to be devalued, such that they’re basically just treated as one link. Matt Cutts confirms that sitewide links do happen naturally, but are often also associated with webspam. Because of this, Google’s webspam team does manual reviews of sitewide links. Presumably, there’s some automated component to this process as well, and greater overall risk.

Source(s): Matt Cutts

Links from Irrelevant Sites

Google gives a bonus to inbound links from relevant sites since the Hilltop algorithm. A widespread SEO myth and a number of very dangerous “link unbuilding” and “disavowing” services have sprung up that suggest that links from irrelevant sites are inherently bad. While too many such links could result in an unnatural footprint, it would be at least as unnatural to only obtain links from sites that are exactly like your own.

Source(s): Analyzing link profiles of popular sites

Negative Page Link Velocity

A Google patent states: “By analyzing the change in the number or rate of increase/decrease of back links to a document (or page) over time, search engine may derive a valuable signal of how fresh the document is.” This indicates that a decreasing rate of inbound linkers could be damaging, especially (though not necessarily exclusively) on search queries noted to deserve only fresh content.

Source(s): Patent US 8521749 B2
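
The "rate of increase/decrease" in that quote can be approximated from your own data as the period-over-period change in total backlinks. The sketch below is one hedged way to do that; the `link_velocity` and `is_declining` helpers, the three-period lookback, and the sample totals are hypothetical and only illustrate the arithmetic.

```python
def link_velocity(backlink_totals):
    """Period-over-period change in total backlinks (positive = net gain)."""
    return [
        later - earlier
        for earlier, later in zip(backlink_totals, backlink_totals[1:])
    ]


def is_declining(backlink_totals, lookback=3):
    """True if the page lost links in each of the last `lookback` periods."""
    recent = link_velocity(backlink_totals)[-lookback:]
    return len(recent) == lookback and all(change < 0 for change in recent)


# Hypothetical total backlinks to one page, sampled monthly.
totals = [200, 230, 250, 255, 240, 220, 190]

print(link_velocity(totals))
print(is_declining(totals))
```

A run of negative values means the page is losing links faster than it gains them, which is the pattern this factor describes.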

Negative Domain Link Velocity

It’s speculated that if your site’s backlink portfolio is stagnant or losing links faster than you are gaining links over a long time horizon, something is wrong. This may be partially supported by a Google patent, which talks about a declining rate of inbound linkers to a particular document indicating a lack of freshness, combined with the mass of single page ranking factors confirmed to also apply domain-wide.

Source(s): Patent US 8521749 B2

Penalty by Redirect

John Mueller confirms via Google Hangout that organic search penalties can pass through a 301 redirected site, so this is likely a realistic factor. The likelihood of it actually occurring in the wild is probably far lower, unless you’re doing something like buying used domain names in an attempt to reclaim their inbound link value or trying to circumvent a manual action.

Source(s): John Mueller via SER

Disavowed Links

Google Webmaster Tools added a feature in 2012 that allows you to request that an inbound link be completely ignored. These effects are permanent, irreversible, and can be very damaging to your brand’s long-term search reputation if not used correctly. This should only be used as a mode of last-resort in response to manual action or legitimate webspam mistakes from your past.

Source(s): John Mueller via SER

Links from Penalized Sites

Google has long used the phrase “bad neighborhoods” to describe the interrelation of penalty-prone sites. If your site gets a link from a site that’s already penalized for any reason, you can bet that this draws additional scrutiny to your site, and that enough of this sort of activity can bring about penalties for your site as well.

Source(s): Matt Cutts

Chrome Blocked Sites

Google introduced a tool in 2011 that allowed users to block sites in Google search via Chrome. They stated “while we’re not currently using the domains people block as a signal in ranking, we’ll look at the data and see whether it would be useful”. Therefore, there’s no guarantee that this is an automated factor in rankings, but we’re also not about to believe that nobody on the webspam team is looking at this data.

Source(s): Amay Champaneria

Negative Sentiment

In 2010, Google told us that the sentiment expressed towards a brand, such as in reviews or the text surrounding links, is a ranking factor. Reviews were known to be a huge part of local or “Google Maps SEO” rankings before that. The implications of this are a little complex, but Moz’s Carson Ward did a great piece on it.

Source(s): Amit Singhal, Patent US 7987188 B2

Crawl Rate Modification

Google Webmaster Tools allows you to modify the rate at which your site is crawled by Google. It’s not really possible to speed up Googlebot, but it’s certainly possible to slow it down to zero. This can cause problems with indexing, which means problems for ranking, especially with regard to factors surrounding fresh content and editing.

Source(s): Google

International Targeting Tool

Google Webmaster Tools provides a tool for international targeting when it may not be done correctly otherwise. Theoretically, this tool could also cause harm if it is used to restrict your site’s appearance in search results to a particular region that does not encompass your entire desired market region.

Source(s): Google

Building Links

One myth that never seems to die surrounds the idea that building links is bad. Google’s Matt Cutts has given us link building advice since the start, and in its purest form, link building is just traditional marketing adapted to the web. Link building runs counter to Google’s philosophies only when methods focus on search engines first. Links are marketing. Build links, but always build links for humans first.

Source(s): Matt Cutts via SEL

Link Building Services

Paying for a service that pursues links is not the same thing as paid links. An exception would exist, though, if that service turns around and pays someone else for links that pass PageRank and are then published with zero editorial discretion. Brand-safe link building must be akin to the services of a publicist, where placement can’t be promised but there’s everything to be gained. Matt Cutts defines “editorial discretion” midway through his paid directories video.

Source(s): Matt Cutts

No Editorial Context

Matt Cutts tells us that all links should be published with editorial discretion. But not all links need to be placed in an editorial context – that is, within the middle of a story or article. It takes very little experimentation to see that higher quality links outside of editorial context, such as a local Chamber of Commerce membership page, help quite a lot with authority. It’s also plain to see that this would be an unnatural pattern.

Source(s): Julie Joyce

Microsites

It’s been suggested that there’s some penalty reserved for microsites: websites with an extremely narrow scope and not a lot of pages. Matt Cutts gives us Google’s stance: Microsites aren’t hunted and penalized by Google, they’re just usually not a very effective tactic as a part of a long-term strategy since sitewide ranking factors will remain weak. They’re also not very effective at exploiting Exact Match Domain bonuses using keyword-focused domains anymore.

Source(s): Matt Cutts

Click Manipulation

If you subscribe to the notion that Click Through Rate (CTR) is a positive factor, it’s reasonable to suggest that webspam controls exist here too. Rand Fishkin’s Twitter CTR experiments present evidence of a mass-clicked page rising from #6 to #1, then dropping to #12, before restoring its position, all within the course of a couple of days.

Source(s): Rand Fishkin

Brand Search Manipulation

Another theory is that if brand searches are a ranking factor, as patents suggest, webspam controls must also exist here to prevent abuse. Otherwise, this factor would be far too easy to manipulate.

Source(s): Patent US 9031929 B1

Illegal Activity Report

Google has a form that asks users to report any illegal activity occurring within their content. This page implies that any such content will be removed from any Google products, including Google Search. We have no reason to doubt them on this one, and don’t expect that anyone’s going to do an experiment on this factor anytime soon either.

Source(s): Google

DMCA Report

In addition to automated controls for detecting stolen content, un-cited sources, and potential copyright violations, Google also encourages users to send DMCA requests directly to Google. This almost certainly invokes the DMCA process within the United States, during which Google has no choice but to remove any offending content accessible on their domains.

Source(s): Webmaster Tools, DMCA Process

Low Dwell Time (Short Click)

A Google patent suggests seeking “a click of at least a certain duration, or a click of at least a certain duration relative to a resource length” on branded queries. Steven Levy’s “In The Plex” first-hand account of Google suggests that this is basically Google’s best measure of search result quality. Finally, Bing and Yahoo! both have suggested using dwell time, in some scope, as a ranking factor.

Source(s): Patent US 9031929 B1, Steven Levy (In The Plex), Bill Slawski
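
The two conditions in the patent quote, a click of at least a certain duration or of a certain duration relative to resource length, can be sketched against your own analytics data as follows. The `Click` record, the `is_long_click` helper, and both thresholds (30 seconds, 3 seconds per 100 words) are assumptions made up for this example; the patent does not disclose actual values.

```python
from dataclasses import dataclass


@dataclass
class Click:
    dwell_seconds: float  # time between the click and the return to results
    resource_words: int   # rough length of the page that was visited


def is_long_click(click, min_seconds=30.0, min_seconds_per_100_words=3.0):
    """Treat a click as 'long' if it passes either the absolute duration
    test or the duration-relative-to-length test."""
    if click.dwell_seconds >= min_seconds:
        return True
    if click.resource_words <= 0:
        return False
    seconds_per_100_words = click.dwell_seconds / (click.resource_words / 100)
    return seconds_per_100_words >= min_seconds_per_100_words


# Hypothetical visits to three different pages.
clicks = [Click(45, 2000), Click(10, 200), Click(5, 1500)]

print([is_long_click(c) for c in clicks])
```

The relative test is what lets a short visit to a short page still count as a satisfied click, while a five-second bounce from a long article does not.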

High Task Completion Time

We have quite a bit of evidence that Click Through Rate and Dwell Time may be ranking factors, though not directly confirmed. We also know of a research paper co-authored by Google employee David Mease, which describes analyzing the overall time it takes a searcher to find a result that they’re happy with and responding with an “alternative experiential design”. Is it possible that automated A/B testing will “shake up” the weighting of factors based on how happy users appear with their results?

Source(s): David Mease
