Duplicate Content Basics, Issues and Remedies
Online marketers and content creators can run into several issues within SEO and digital marketing, and one of them is duplicate content. It isn’t technically a penalty, but it can still affect your site in a few different ways. Duplicate content matters to search engines and site owners alike, and it has a few common causes worth understanding.
At SEO Werkz, a major part of our SEO services includes helping steer you clear of issues like duplicate content. From our on-site optimization services to additional elements like content creation, social media marketing and more, we have years of experience that have allowed us to identify the pitfalls in each of these important areas, then help our clients avoid them. What exactly is duplicate content, why is it such a big negative, and how does it happen? We’ll go over all of these factors here, plus some important information on how to deal with duplicate content if you run into it.
Duplicate Content Basics and Definition
As its name indicates, duplicate content refers to content that appears on the web in more than a single location. By “location” in this situation, we mean a unique URL – duplicate content within the same URL is considered normal and not a problem.
As we noted above, duplicate content does not technically qualify as a penalty within Google’s SEO guidelines. However, it can have a negative impact on your rankings in several ways, including a simple concept: It makes it harder for a search engine to “decide” which version of similar (or identical) content is more relevant to a user’s query, and often will cause none of them to receive prominent placement on search results.
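To make the definition above concrete, here is a minimal Python sketch that groups URLs serving the same text. The page data, the example.com URLs and the helper names are all hypothetical, and real search engines use far more sophisticated similarity checks than an exact hash match.

```python
import hashlib
from collections import defaultdict


def normalize_text(text: str) -> str:
    """Collapse whitespace and lowercase so trivial formatting differences are ignored."""
    return " ".join(text.split()).lower()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Return groups of URLs whose normalized body text is identical."""
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalize_text(body).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL count as duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages: two URLs serve the same text, one is unique.
pages = {
    "https://example.com/widgets": "Blue widgets for sale.",
    "https://example.com/widgets?print=1": "Blue  widgets for SALE.",
    "https://example.com/about": "About our company.",
}
duplicate_groups = find_duplicates(pages)
```

Here the two widget URLs end up in one group because they are two locations for the same content, while the about page stands alone.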
Why It’s Bad
As we also touched on above, duplicate content is an issue for both search engines and the people who run the affected sites. Search engines face three problems. First, they don’t know which version of the content to include in their index. Second, they don’t know whether to direct link metrics like trust, authority, link equity and anchor text to one page or keep them split across multiple versions. Finally, they don’t know which version to rank for a given query, so some versions may rank well while others don’t, or none of the versions may rank well at all.
On the other side of the coin are site owners, who may deal with significant rankings decreases and drops in traffic when duplicate content is present. Search engines will almost never display more than one version of the same content and will instead choose between them, which dilutes the visibility of every one of the duplicates. Link equity is diluted even further for a simple reason: other sites also have to choose between the duplicates when they link. Rather than 100% of inbound links pointing to a single piece of content, the links are spread across several versions, and the link equity is spread out with them. Inbound links are a prominent factor in search rankings, so this kind of diminished link equity may have a major negative impact.
How Duplicate Content is Created
To be clear, we’re well aware that duplicate content isn’t something most site owners or content creators generate on purpose. It’s usually created accidentally, and it’s likely more common than you may think, because there are several ways it can come about if you aren’t careful. Here are the three most common:
- Variations in URL: Some URL parameters may lead to duplicate content problems, particularly those used for click tracking or various forms of analytics code. Some of these issues are created by the parameters themselves, and others by the order in which the parameters are placed in the URL. Session IDs can also play a role: when each visitor is assigned a different session ID that is stored in the URL, the same page ends up reachable at many different addresses. In still other cases, a site offers printer-friendly versions of its pages, and duplicate content becomes a problem when multiple versions of the same page are indexed. For all of these reasons, we often recommend avoiding URL parameters and alternate URL versions where possible, as there are alternative ways of getting this information into your pages.
- HTTP vs HTTPS, www vs non-www: Some sites are reachable both with and without the well-known “www” prefix, or at both http:// and https:// addresses. Ideally, only one of these versions is live and the others point to it. But if the versions are separate and all visible to search engines, each one is effectively a duplicate of the others, and you’ll have issues.
- Scraped or copied content: In other cases, content from blog posts, info pages, editorials and other areas will be scraped or even directly copied by other sites. This is a particular issue for e-commerce sites: when many retailers sell the same popular products and reuse the same product information, identical content ends up on competitor sites as well as your own.
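The URL-based causes above can be illustrated with a short Python sketch that collapses several variants of an address into one form. It is only a sketch under stated assumptions: the parameter names in `IGNORED_PARAMS`, the preference for https and the “www” prefix, and the example.com URLs are all hypothetical choices rather than recommendations for any particular site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names for tracking, session and print parameters; adjust to your own site.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "print"}


def canonical_url(url: str) -> str:
    """Map URL variants of one page to a single https + www form."""
    parts = urlsplit(url)
    # Assumption: the preferred version of the site uses the "www" prefix.
    # Note that .hostname drops any explicit port, which is fine for this sketch.
    host = (parts.hostname or "").lower()
    if not host.startswith("www."):
        host = "www." + host
    # Drop parameters that do not change the content, then sort the rest
    # so that parameter order no longer produces distinct URLs.
    query_pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    query_pairs.sort()
    return urlunsplit(("https", host, parts.path or "/", urlencode(query_pairs), ""))


variant_a = canonical_url("http://example.com/shoes?size=9&color=red&utm_source=news")
variant_b = canonical_url("https://www.example.com/shoes?color=red&size=9&sessionid=abc123")
```

Both variants reduce to the same address, which shows how many distinct URLs can hide a single page.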
Remedying Duplicate Content
The primary goal in fixing duplicate content concerns for your site is helping the search engine identify the “correct” version. This involves the canonicalization of content, which can be done in a few ways:
- 301 redirect: The most common method, this involves setting up a 301 redirect from duplicate pages to the original.
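- Rel=canonical: This method tells the search engine to treat the page as a copy of a specified URL, so that links and ranking signals are credited to that URL.
- Preferred domain/parameter handling: Google Search Console has historically let you set the preferred domain for your site and specify how Googlebot should treat different URL parameters. Google has since retired both of these settings, so 301 redirects and rel=canonical are the more dependable options.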
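As a rough way to check the rel=canonical remedy on your own pages, the Python sketch below reads a page’s HTML and reports the canonical URL it declares, if any. The class and function names and the sample markup are hypothetical, and the sketch only inspects the tag. It does not fetch pages or verify that the declared URL is the right one.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html: str):
    """Return the declared canonical URL, or None if the page declares none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for a printer-friendly duplicate of a product page.
sample_page = (
    '<html><head><link rel="canonical" href="https://www.example.com/shoes">'
    "</head><body>Printer-friendly shoes page</body></html>"
)
declared = find_canonical(sample_page)
```

A page that returns None here declares no canonical version, which makes it a candidate for either a rel=canonical tag or a 301 redirect to the original.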
For more on duplicate content and how to deal with it, or to learn about any of our SEO, PPC, web design or other online marketing services, speak to the staff at SEO Werkz today.