Search Friendly Development
Good morning, Friends. It’s time for Day 2 and I’m coming to you live from the brand new Developer Track. The Developer Track is awesome because it comes complete with a fancy waterfront view. I’d also mention the yummy bagel I’m eating but I think Michael VanDeMar is going to come and kick me if I do. He’s over my bagel stories.
Vanessa Fox will moderate as speakers Nathan Buggia (Microsoft), Maile Ohye (Google) and Sharad Verma (Yahoo) get us started.
Up first is Nathan Buggia. He says he rewrote his entire presentation last night based on what people were talking about on Day 1.
Microsoft is working on lots of big, hard problems. Stuff like:
- Affiliate tracking
- Session management
- Rich Internet applications
- Duplicate content
- Understanding analytics
- Error management
Advanced search engine optimization is analytics. That’s what differentiates it from regular search engine optimization. It means you’re at a larger company with more resources (um, not necessarily). Implement things in a logical order. See what the impact is on your customers and the engines and decide if that’s the right thing to go forward with. Do not implement something because you heard someone on a panel say it was a good idea. PageRank sculpting is a good example of that. Everything on the Web is an opportunity cost.
Nathan says to watch out for complexity. If you build cloaking or situational redirects into your Web site, you can add a lot of complexity to your site. It becomes hard to notice if you have problems on your site because stuff is hidden from even you. You want the simplest architecture you can have. Microsoft says cloaking isn’t all bad, but it’s never the first, second or third solution they recommend.
All Web sites have the same first problems. The first problem is accessibility. That’s where people should start. Can a crawler get to your Web site? Are they hitting 404s? Do you use Flash or Silverlight and are they monopolizing the user experience? Take a look at canonicalization. Are you dividing all your PageRank and reputation?
Search engines are always changing. Someone can come up on the stage and claim they have the big new tactic for search engine optimization and then that may change in a year. What is consistent are the Webmaster Guidelines. Those are things that in spirit all the search engines agree with. If you go to Google’s Webmaster Guidelines and adhere to the spirit of them, then you’re working with the search engines instead of against them.
Nathan gives us an example and uses Nike.com. Nike is a brilliant company. There are few companies that can do the type of branding that they did with Just Do It.
When you go to Nike.com you see the Flash loading. Then you select language, region, etc. Then you get another loading screen because they’re going to play a full minute video. It takes eight seconds to get to that video. Maybe people don’t have eight seconds. Maybe they only have one second. The second run experience is 3 seconds because of the cookie Nike puts on your computer. The cookie resets every day. If you are blind or have ADHD, you have a really bad experience on that site.
The site also isn’t great for search. He shows us the HTML behind the page. There’s no Title tag. There’s nothing. It’s just a Flash application. Basically they’re cloaking. The site is also really complicated. Nike has over 2 million pages on their Web site and they’re cloaking for a lot of them. He shows what the Nike SERP description was for a few days after their cloaking broke. It was a user error.
Every investment you make is another investment that you can’t make. If you’re investing all in cloaking, there are other people out there NOT investing in those things. If you type in [Lebron James shoes], Nike doesn’t come up.
Advanced search engine optimization is not spam.
Search engine optimization does equal good Web design.
Design for your customers, be smart about robots and you’ll enjoy long-lasting success.
Sharad Verma is up.
Sharad says he loves his job. This is an opportunity to serve his customers. When he’s not working he loves to travel. Last week he was in Machu Picchu, Peru. He’s giving us a bit of a history lesson and telling us how he took trains and buses on his journey. I’m not sure where this is going but it will tie together soon. Oh, I get it. The moral of the story is that Machu Picchu is accessible and easily discovered. I see what he did there.
As a site owner you’re serving both your users and robots. You need to design your site so you’re not alienating either of them. There are three cranks behind the box – crawling, indexing and ranking. You have control over all three, but more control over crawling.
How do Spiders Crawl Your Web site?
They start with the URL, download the Web page, extract links from the Web page and then follow more links. Sometimes they find invisible links or sometimes they see links but decide not to crawl the content. That could be because the links are excluded in your robots.txt or because they’re duplicate links.
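For the developers in the room, here is a minimal Python sketch of the crawl loop Sharad describes: start with a URL, download the page, extract the links, follow the new ones, and skip anything excluded by robots.txt or already seen. It is an illustration only, not how any engine's spider actually works. The in-memory PAGES dictionary (standing in for the Web), the ExampleBot user agent and the robots.txt rules are all made up.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, robots_lines, agent="ExampleBot"):
    """Breadth-first crawl from seed; fetch(url) returns HTML or None."""
    robots = RobotFileParser()
    robots.parse(robots_lines)
    queue, seen, crawled = deque([seed]), {seed}, []
    while queue:
        url = queue.popleft()
        if not robots.can_fetch(agent, url):
            continue  # link was seen, but robots.txt excludes it
        html = fetch(url)
        if html is None:
            continue  # nothing to download at this URL
        crawled.append(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if link not in seen:  # duplicate links are not queued twice
                seen.add(link)
                queue.append(link)
    return crawled


# Hypothetical four-page site used in place of real HTTP requests.
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/private/x">X</a>',
    "http://example.com/a": '<a href="/">Home</a> <a href="/b">B</a>',
    "http://example.com/b": "<p>No links here</p>",
    "http://example.com/private/x": "<p>Hidden</p>",
}
ROBOTS = ["User-agent: *", "Disallow: /private/"]

order = crawl("http://example.com/", PAGES.get, ROBOTS)
print(order)
```

The /private/x page is discovered but never downloaded, and the link back to the home page is ignored the second time around, which are the two "seen but not crawled" cases mentioned above.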
Search engines find your content via the organic inclusion from crawling. All you have to do as a site owner is put up your site, get links, and let the crawlers in. They’ll do the magic. If you’re not satisfied with what they’re crawling, then you can supplement that with feeds.
Roadblocks of Organic Crawl
Flash: Make sure your site can be read by a robot. If you’re using Flash, make sure you’re offering up alternative navigation.
Dynamic URLs: Difficult to read, lead to duplicate content, waste crawl bandwidth, split the link juice and are less likely to be crawled and indexed.
- Create user friendly, human readable URLs
- 301 redirect dynamic URLs to static versions
- Limit the number of parameters
- Rewrite dynamic URLs through Yahoo! Site Explorer
He asks how many people use Site Explorer and their Dynamic URL Feature. Log in and authenticate your Web site. It allows you to remove parameters from URLs.
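If you want to do the same kind of parameter cleanup on your own side, a rough Python sketch might look like the one below. This is not the Site Explorer feature itself, just the general idea of removing parameters that don't change the content. The parameter names in IGNORED_PARAMS are assumptions; swap in whatever your site actually appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change what the page shows.
IGNORED_PARAMS = {"sid", "sessionid", "ref"}


def strip_params(url, ignored=IGNORED_PARAMS):
    """Return url without ignored parameters, remaining ones sorted."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    kept.sort()  # stable order, so ?a=1&b=2 and ?b=2&a=1 become one URL
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


print(strip_params("http://example.com/shoes?size=10&sid=1&color=red"))
```

Sorting the surviving parameters is a design choice of this sketch, made so that two URLs differing only in parameter order collapse to the same string.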
Consequences of duplicate content: Less effective crawl, less likely to attract links from duplicate pages.
Solutions to duplicate content: 301 redirect duplicate content to the canonical version, disallow duplicate content in robots.txt.
Other Best Practices:
- Flatten your folder structure
- Redirect old pages to the corresponding new pages with 301/302
- Use keywords in URLs
- Use sub-domains ONLY when appropriate
- Remove the file extension from the URL if you can
- Consistently use canonical URLs for internal linking
- Promote your critical content close to the home page
You can also get your content included through feed based crawling. You can provide feeds through their Sitemaps Protocol to tell the crawler where to find all the pages on your site, especially your deep content. Sharad recommends using all the metadata supported by Sitemaps Protocol.
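As a rough illustration of what such a feed looks like, here is a small Python sketch that writes a Sitemap using the optional metadata fields the Sitemaps Protocol defines (lastmod, changefreq and priority) alongside the required loc. The URLs, dates and values are made up, and a real site would generate the entries from its own page inventory.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
FIELDS = ("loc", "lastmod", "changefreq", "priority")


def build_sitemap(entries):
    """Build a Sitemap XML string from a list of dicts keyed by FIELDS."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for field in FIELDS:
            if field in entry:  # loc is required, the rest are optional
                ET.SubElement(url, field).text = str(entry[field])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical entries, including one deep page.
sitemap_xml = build_sitemap([
    {"loc": "http://example.com/", "changefreq": "daily", "priority": 1.0},
    {
        "loc": "http://example.com/catalog/shoes/running/item-42",
        "lastmod": "2008-06-04",
        "changefreq": "monthly",
        "priority": 0.5,
    },
])
print(sitemap_xml)
```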
Do not exclude your CSS content in the Robots Exclusion Protocol because the engines want to see the layout of your page.
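One way to catch this before the engines do is to run your own robots.txt rules against a list of URLs you care about. The Python sketch below uses the standard library's robots.txt parser. The rules, the URLs and the ExampleBot user agent are invented for the example, with the stylesheet folder blocked on purpose to show the mistake.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_lines, urls, agent="ExampleBot"):
    """Return the subset of urls that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Hypothetical rules: /print/ holds duplicate pages, /css/ is the mistake.
ROBOTS_RULES = [
    "User-agent: *",
    "Disallow: /print/",
    "Disallow: /css/",
]
URLS_TO_CHECK = [
    "http://example.com/css/site.css",
    "http://example.com/print/shoes",
    "http://example.com/shoes",
]

blocked = blocked_urls(ROBOTS_RULES, URLS_TO_CHECK)
print(blocked)
```

Here the printer-friendly duplicate is blocked as intended, but the stylesheet shows up in the blocked list too, which is the case Sharad is warning about.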
Search engines want your content. Break down those accessibility barriers and let them do their job.
Maile Ohye is up last.
Google wants to help users create better sites. If you have better sites, we all have a better Internet. Aw. She’s going to tell us how to enhance your site at every stage of the pipeline. Maile talks like an infomercial.
Consider progressive enhancement. This means you don’t just begin with Flash. You start with static HTML and then add the “fancy bonuses” like Flash and AJAX later. Then the fancy stuff becomes a complement to your Web site instead of your entire site.
She looks at a page/site that’s rich in media with HTML content and navigation – the Dramatic Chipmunk video on YouTube. The video is in Flash, but there’s descriptive content on the page (title, description, user generated content in the comments) and HTML navigation.
Consider sIFR for Flash
With Flash off, it displays the regular text. With Flash on, you get the Flash.
If you do that, the text must match the content viewed by Flash-enabled users. It must be accessible to screen readers and search engines.
Consider Hijax for AJAX
Google Webmaster Central
Webmaster Tools: They give crawl errors if you verify your site. In crawl errors, be sure that what you see is what you expect. They’ll show URLs blocked by robots.txt, so make sure that’s what you want. They’ll also tell you about timeout errors and unreachable links. Use it to verify your link structure and that all your links are findable.
Promote your quality content. Set preferred domain to www or non-www. You don’t want to run two versions of your Web site. [As a note, this doesn't always fix the problem. Be consistent in your linking and don't rely on Google to do your work for you.--Susan]
To reduce duplicate content, keep URLs as clean as possible, internally link to your preferred version, and store visitor information in cookies, then 301 to the canonical version.
Use a cookie to set the affiliate ID and trackingID values.
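Here is one way that cookie-then-301 idea could be sketched in Python. It is a guess at the general shape, not Google's recommended implementation. The parameter names affiliateID and trackingID are taken from the slide wording and are assumptions about what a real URL would carry, and the function only computes the response status and headers rather than plugging into any particular Web framework.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names for the affiliate and tracking values.
TRACKING_PARAMS = ("affiliateID", "trackingID")


def canonical_redirect(url):
    """Return (301, headers) moving tracking values into cookies, or None."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    tracked = [(k, v) for k, v in params if k in TRACKING_PARAMS]
    if not tracked:
        return None  # URL is already clean, serve it normally
    kept = [(k, v) for k, v in params if k not in TRACKING_PARAMS]

    cookie = SimpleCookie()
    for key, value in tracked:
        cookie[key] = value
        cookie[key]["path"] = "/"  # make the value visible site-wide

    location = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
    headers = [("Location", location)]
    headers += [("Set-Cookie", m.OutputString()) for m in cookie.values()]
    return 301, headers


print(canonical_redirect("http://example.com/shoes?affiliateID=42&color=red"))
```

The visitor still gets credited to the affiliate through the cookie, while crawlers and people linking to the page only ever see the one clean URL.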
Proper Use of Response Codes
Use 301s for permanent redirects.
A 301 signals search engines to transfer properties like link popularity to the target URL. This applies to situations like moving a site to a new domain and modifying the URL structure.
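For the URL-restructuring case, a bare-bones Python WSGI sketch of a permanent redirect table might look like this. The old and new paths in MOVED are hypothetical, and a real site would more likely handle this in its Web server configuration than in application code.

```python
from http import HTTPStatus

# Hypothetical map of retired paths to their permanent new homes.
MOVED = {
    "/products.php": "http://www.example.com/products",
    "/old-shoes.html": "http://www.example.com/shoes",
}


def app(environ, start_response):
    """WSGI app answering 301 for moved paths and 200 for everything else."""
    target = MOVED.get(environ.get("PATH_INFO", "/"))
    if target is not None:
        status = HTTPStatus.MOVED_PERMANENTLY  # 301, not a temporary 302
        start_response(
            f"{status.value} {status.phrase}", [("Location", target)]
        )
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# Call the app directly with a stub start_response to see the result.
responses = []
body = app(
    {"PATH_INFO": "/old-shoes.html"},
    lambda status, headers: responses.append((status, headers)),
)
print(responses[0])
```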
Anatomy of a Search Result
Create a unique, informative title. It acts as an informative signal of the URL’s contents to a search engine and user. You don’t want your title to say “Untitled”. She talks about how Webmaster Tools can help you locate Title tag issues.
Snippets: Provide the user more content about each search result. The quality of your snippet can impact your click-through.
Influence snippets with Meta Description. Meta Descriptions can be utilized by Google in search results. Meta keywords are of low priority.
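You can also run a quick check of your own before Webmaster Tools flags anything. The Python sketch below looks for the two problems Maile calls out, a missing or “Untitled” title and a missing Meta Description. The wording of the issue messages and the sample pages are invented for the example, and it is nowhere near a full HTML audit.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Records the <title> text and the meta description of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a list of title and description problems found in html."""
    parser = HeadAudit()
    parser.feed(html)
    issues = []
    title = parser.title.strip()
    if not title or title.lower() == "untitled":
        issues.append("missing or placeholder title")
    if not parser.description:
        issues.append("missing meta description")
    return issues


print(audit("<html><head><title>Untitled</title></head></html>"))
```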
Final thoughts from Maile:
- Verify crawl errors are as expected
- Create descriptive titles and consider adding useful meta descriptions
- Submit Sitemaps for your canonical URLs
- View Webmaster Central blog posts