#AskAnSEO: On Indexing, Canonicals, and H Tag Use by @jennyhalasz
Want to ask Jenny an SEO question for her bi-weekly column? Fill out our form or use #AskAnSEO on social media.
We have another round of great questions this month, so let’s get to them!
One question I can’t seem to find substantial answers to is: Are there any implications to SEO for web pages that skip H2s and go directly from an H1 to an H3? Thank you for any insights and advice.
–Michelle K. via Facebook
Probably not. The H1 is generally known as the element on the page that carries the most weight, but this is based on old-school rules of HTML. The H1, or Heading 1, was technically used as the heading or “title” of the document. Typically, that heading would also be the “title” of the page (as in the HTML title), which is why that staid “advice” about never making them the same is so asinine (but I digress). In the old rules of HTML, before the advent of CSS, the H1 would be the largest text on the page, the H2 the second largest, the H3 the next size down, and so on. Search engines therefore traditionally placed higher weight on the largest text on the page.
Enter SEOs and their penchant for breaking the rules: some figured out that they could code anything as an H tag, then use CSS to control the size and position of the text. Search engines still used this classic signal, but they became wary of it, and over time it carried less and less weight.
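As a sketch of that trick (hypothetical markup, not from any real site): CSS can make a lower-level heading dwarf the H1, which is exactly why visual size stopped being a trustworthy signal.

```html
<!-- Hypothetical example: CSS overriding the "natural" heading sizes -->
<style>
  h1 { font-size: 14px; }  /* the "most important" tag, rendered small */
  h3 { font-size: 40px; }  /* a lower-level tag, rendered as the biggest text on the page */
</style>
<h1>Ethernet Patch Cables</h1>
<h3>Huge Promotional Banner Text</h3>
```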
I still recommend that clients try to have a single H1 that uses their core keyword and at least one H2 that does, because why not do it? It helps to structure your page visually and provides important clues about the topicality of the page. But Bing (through Duane Forrester) went on record stating that they only look at the H1 and the first H2 in terms of assigning additional weight, and Google has indicated (although not said as clearly) that they use a similar strategy.
However, with the advent and increasing popularity of HTML5, which uses H tags to structure content, this old school recommendation carries less weight than it used to.
If I were you, I’d use the H1 and the H2, but it’s not a deal-breaker. Make sure the topic of the page is clear, that the text is presented well, and that it renders well on mobile, and you’ll have handled three things that carry a lot more weight than the H tag.
What is the best way to approach a hard keyword? I work for a company that manufactures fiber optic patch cables and Ethernet cables, both 10/10 keywords.
– Johann T. via Twitter
How much time do you have? I jest, but this is the crux of SEO. My best advice that will fit in less than a dissertation is to find your niche. Google’s goal is to provide a variety of results on keywords like this that are short tail, and you have to meet one of those needs to be shown. A keyword like “Ethernet cables” could mean anything from “buy ethernet cables” to “how do ethernet cables work” to “does an ethernet cable make a good dog leash”. So do a search on Google and spend some time breaking down the types of results that are there.
- Purchase-oriented results: These are overrun by sites like Amazon and Best Buy. You probably won’t be able to compete for this.
- Information-oriented results: Things like “what is an ethernet cable” and “how do you use ethernet cables” – mostly dominated by Wikipedia-type sites, but you can compete.
- Comparison results: Notably absent. Are all ethernet cables the same? Could you find an in-road here by writing about why they are different, or how they’re made, or details about how the technology works? Could you provide some unbiased reviews, or discuss how to know when to use a CAT 5 vs. a CAT 5e for example?
It’s still unlikely you’re going to overtake the big guys for a short tail keyword like this, but by focusing on the things that are missing, you can begin to create a niche for yourself. I notice there are no video results on this search – maybe that’s an opportunity. Look at the “People also ask” box for ideas too.
The bottom line is that you’ll never be able to compete with the big guys without huge budgets and resources. So change the playing field instead to something you’re better equipped to compete on.
Can’t figure out why Google Search Console keeps showing that it has indexed 0 pages of one of my client websites when all the other signals (e.g., a “site:myclientsite.com” search) show several pages of results. The robots.txt file is not blocking anything, and I’ve submitted XML sitemaps that seem to be working correctly. Don’t know what else to try. Any ideas? Thank you.
–Marvin K., Salem, Massachusetts
According to Maile Ohye of Google (SMX Advanced, 2015), the site: search is “not always accurate.” It’s designed to be an advanced search utility, not an exhaustive look at everything they have in their database. Often, I’ll find pages in there that have been 301’d to a new location (and it shows the original page) or pages that have been noindexed. Sometimes I’ll even find old (very old) 404s or 410s. The bottom line is that the utility is not completely reliable, so don’t put too much stock in it – only use it as a litmus test.
The bigger question is: Why is your Search Console showing 0 pages indexed? Let’s take the usual suspects first:
- Your account is verified, or you wouldn’t be able to see any data.
- Presumably, your account has been open more than two weeks (sometimes it can take that long for Google to catch up).
- It sounds like Google is finding your XML sitemap since you said you’ve tested it and it’s working.
So the next step is:
Check to make sure you are looking at the exact same canonical version of the site. For example, if your site resolves to https://www.site.com, a Search Console property for http://www.site.com, http://site.com, or https://site.com will not return any indexed pages.
If that checks out too (the canonical mismatch is the most common problem), then:
- Check to see if the site is getting any search traffic in the “search analytics” page of Search Console.
- Test a few specific pages from your sitemap – if they show up both in Google search and in the search analytics report, you can be pretty sure you’re dealing with a bug in GSC. In that case, contact GSC through the webmaster forum and ask them to look into it.
I have a website that is a year old, but no matter what I try (social signals, backlinks, search engine submission, etc.), it’s not indexing in Google. What can be the problem? Thanks
–Nikos A., Limassol, Cyprus
Try the most obvious thing first: make sure you are not blocking Google with a robots.txt file or a meta robots “noindex” tag. If you have Google Search Console, the easiest way to check this is with the Fetch and Render tool. If you don’t have GSC, get it (it’s free); in the meantime, try a publicly available tool like Rex Swain’s HTTP Viewer.
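For reference, the two blockers mentioned above look like this (generic examples, not anyone’s actual files):

```
# robots.txt — this directive blocks ALL crawlers from the entire site
User-agent: *
Disallow: /
```

```html
<!-- In the page's <head>: this tag tells search engines not to index the page -->
<meta name="robots" content="noindex">
```

Either one of these, left over from a development site, is enough to keep a whole site out of the index.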
If all that’s OK, move on to these items:
- Look at your website in a text browser like Lynx, or do Fetch and Render in Google Search Console. Is there anything there? Keep in mind there are still some technologies (like Flash) that are not indexable. If your website has no content in HTML, it’s unlikely it will be indexed.
- Double check that you’re not indexed at all. Do a search in Google for “site:domain.com” where you replace domain.com with your domain. Still nothing? Sometimes people think that just because they get no search traffic, they aren’t indexed. That’s not always the case.
- Make sure you’re searching on your country domain. I noticed in your original question that your email address is a .gr (Greece). Make sure you do the above search on Google.gr. It’s not that you can’t show up in Google.com, just that it’s likely you will show up on Google.gr first.
Finally, check your log files (you can ask your hosting company for a couple of days’ or weeks’ worth, depending on your traffic level). Is Googlebot visiting your site? If you’re not seeing it at all, try doing a manual submission.
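Once you have a log file in hand, a quick way to check is to grep for the Googlebot user agent (the log lines below are fabricated for illustration):

```shell
# Write two fabricated access-log lines, then count the Googlebot hits.
cat > /tmp/access_sample.log <<'EOF'
66.249.66.1 - - [10/Dec/2016:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
203.0.113.5 - - [10/Dec/2016:10:01:00 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0"
EOF
grep -c "Googlebot" /tmp/access_sample.log   # prints 1
```

Keep in mind a user agent can be spoofed, so a hit claiming to be Googlebot isn’t proof by itself, but zero hits over a reasonable window is a strong signal that Google isn’t crawling you.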
If you are seeing Googlebot visiting the site but are still not indexed, that’s a bad sign. Look at the Wayback machine to see if maybe your domain was used for spam before you bought it. It is possible (although unlikely) that your domain is actually blacklisted. If that’s the case, post in the webmaster forum and ask for help as a first step.
If you have built some decent backlinks, have some social traffic, provide something other than what everyone can find everywhere else (that last one is important), and have more than a single-page site, it would be highly unlikely that you would not be indexed at all unless something was more seriously wrong. Nine times out of ten, there is some basic error that has been overlooked, like the ones above.
I help run a travel website, and we have some pages with travel options that are similar in most of their itineraries, which means that their main content is the same with few variations. What would be better? Keeping the similar pages noindex in order to make sure they are not regarded as duplicate content, or letting them have a canonical to the “original”/best-selling option?
What would be the reason for using canonicals over noindex for similar pages? And vice versa?
–T. Boesen, Viby J, Denmark
Canonical would definitely be better in this case. Noindex means that search engines are not allowed to add these pages to their index, which means that they would not be able to offer that page for a very specific search. Let’s imagine two pages that are as you describe:
If the Hoober Bloob Highway page is most popular and you choose to noindex the Road to Kalamazoo page, that means that for a very specific search like “sneech watching on the road to kalamazoo”, only the Hoober Bloob Highway page has the opportunity to show up. This, of course, is not a good user experience and is not likely to make a sale. In fact, it probably won’t show up at all, since it’s not specific to Kalamazoo; Google will choose to show one of your competitors that are more specific instead. By contrast, if you make the HBH page the canonical, Google can still decide to show the RTK page if it’s more appropriate for the search, but it doesn’t create a duplicate content problem to have both.
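Mechanically, the canonical is just a link element in the head of the alternate page. Using the itinerary pages from the example above (with made-up URLs), the Road to Kalamazoo page would carry something like:

```html
<!-- On the Road to Kalamazoo page, pointing at the Hoober Bloob Highway page -->
<!-- (domain and paths are hypothetical) -->
<link rel="canonical" href="https://www.example-travel.com/tours/hoober-bloob-highway">
```

Unlike a noindex directive, this is a hint rather than a command, which is exactly why Google can still choose to show the Road to Kalamazoo page when it’s the better match for the query.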
That’s all for now. Keep those questions coming!
Featured Image: Image by Paulo Bobita
In-post Images: Screenshots by author. Taken December 2016.
Source: Search Engine Journal