Perspectives

Volume 5, Issue 2 June 23, 2010


The Cost of Typosquatting

An Examination of Its Impact on the Top 250 Most Popular Websites

Executive Summary

It’s often acknowledged that typosquatting is rampant in the domain name space, but few understand the actual impact of this practice on brand owners.

In order to quantify the economic impact of typosquatting, FairWinds Partners analyzed domain names that are typographical variations of the top 250 most highly trafficked websites. The study concludes that typosquatting costs the brand owners associated with those sites $364 million and 448 million impressions per year in aggregate due to unnecessary advertising costs, lost sales, and poor user experiences (note: not a calculation of broader cybersquatting).

Our findings demonstrate that not all typographical errors are created equal; the kind of typo does matter. Some typos can do over $1M in annual damage to a brand and, in turn, can offer savvy brand owners significant returns on investment if that lost value is recovered. However, this study also shows that brands do not need to attempt to enforce their rights to the majority of infringing domains. In fact, most examples of typosquatting (typographical error based cybersquatting) do not impact a brand’s bottom line, since most domains do little to actually harm the rightful brand owners and would do little to benefit their online strategy. Prioritizing which domains to pursue is the key to reducing losses associated with this form of abuse and recovering the lost value. The analysis and findings presented by FairWinds in this study can help guide any brand’s search for the needles in the typosquatting haystack.

Foreword

As FairWinds reported in a past study, Internet users frequently use Direct Navigation, which is the practice of typing keywords appended with an extension such as dot-COM directly into an Internet browser’s address bar to locate websites. The prevalence of this navigation technique enables the widespread success of cybersquatting and typosquatting in particular, as these practices feed on traffic targeting well-known website addresses.

Typosquatting refers to the practice by which individuals seek to monetize or otherwise benefit from traffic generated by spelling or keystroke mistakes made by direct navigators attempting to reach specific websites. While our previous typosquatting study yielded important information about just what kinds of typos are valuable to both brand owners and the cybersquatters who prey on them, we have conducted a follow-up study in order to determine just how much traffic and money brand owners are losing as a result of this infringing practice.

There are, of course, other types of domain name infringement that this whitepaper does not take into account, most notably combosquatting. Cybersquatters who practice combosquatting register combinations of brand names (or typos of brand names) and generic keywords (e.g., IBMsoftware.com). This type of abuse is also very harmful but fell outside the scope of this particular study.

We have chosen to restrict our examination to typosquatting since it has finite boundaries to work within. For example, there are only so many permutations of the correct spelling of a domain name and we can easily find which of those are registered, who registered them, how they are used in commerce, and how frequently users type them into the address bar. This cannot be done with combosquatting as easily since there are infinite brand name and generic keyword combinations and permutations of spellings therein.

We hope that by conducting a thorough examination of the typo domains of the 250 most trafficked target domain names that fall within our study’s parameters, we will find out more about the scope and scale of typosquatting and cybersquatting in general.

FINDINGS

By calculating the losses experienced by the top 250 most frequently visited dot-COM sites that fell within the parameters of our study, we estimate that, in aggregate, typosquatting costs the brands associated with these sites $285 million per year due to unnecessary advertising costs, lost sales, and poor user experiences.

KEY RESULTS

  • The top 255 most highly-trafficked sites in our study lose $265,180,586 to typosquatters each year
  • Brands lose an additional $19,986,288 as a result of misusing the typo domains in our study that they do own
  • 4 brands lose over 25 million visitors per year to typosquatted sites
  • 10 brands lose between 5 and 11 million visitors per year to typosquatted sites
  • 4 brands experienced an annual dollar impact of over $5 million
  • 33 brands experienced an annual dollar impact between $1 million and $5 million
  • Instead of spending money to reclaim all typosquatted names, brands can prioritize reclaims to pursue those domains that will produce a return on investment in a pre-determined time frame

METHODOLOGY

1. Compiling a List of Typosquatted Sites

To begin gathering data, we first looked at Quantcast’s [1] list of most highly trafficked websites. Starting with the most highly trafficked site, we measured each domain name against a set of criteria for inclusion in our study. The first 250 domain names that met our criteria became our base data set. The criteria for inclusion were as follows:

  • The domain name had to be a dot-COM domain. Domain Tools [2], the typo spinning software we used to generate the typos for the study, offers all results in dot-COM/NET/ORG/BIZ/INFO/US. As one would expect to find in such a study, the majority of registered typographical variations, 74 percent in total, fell under the dot-COM extension.
  • The domain name had to be at least six characters in length. Requiring that domain names in a data set contain a minimum of six characters helps to decrease the chance that a typo of a target domain is a different correctly spelled domain.

Based on Internet user behavior, we know that there are instances where direct navigators will remove hyphens from brand names when turning that brand name into a domain name. For example, Internet users searching for the Merriam-Webster Dictionary online may type in merriamwebster.com rather than merriam-webster.com. Many Internet users will likewise add hyphens to the domain name if the brand itself contains or once contained hyphens. For example, while Wal-Mart Stores Inc.’s most frequently communicated domain name is walmart.com and the company has recently removed the hyphen from its brand, many will still type in wal-mart.com. We identified five of these domain names on our list and included their hyphenated or unhyphenated counterparts in the study as well. As a result, our list of 250 became a list of 255. Once we settled on this list of 255 names, we recorded each registered typo of these domains across the more common gTLDs—.COM, .ORG, .BIZ, .INFO, .NET—and .US. This produced an initial data set consisting of 32,836 registered domains.
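The study does not describe how the typo-spinning tool builds its candidate list. As a rough illustration only, the Python sketch below generates three common single-keystroke variants (a dropped character, a doubled character, and two swapped neighbors) and expands them across the six extensions named above. The function names and the choice of typo classes are assumptions made for this sketch, not the Domain Tools algorithm.

```python
TLDS = ("com", "net", "org", "biz", "info", "us")  # extensions covered by the study


def typo_labels(label):
    """Return single-edit typo variants of a domain label (illustrative only)."""
    variants = set()
    for i in range(len(label)):
        # Omission: drop the character at position i.
        variants.add(label[:i] + label[i + 1:])
        # Duplication: type the character at position i twice.
        variants.add(label[:i] + label[i] + label[i:])
        # Transposition: swap the characters at positions i and i + 1.
        if i + 1 < len(label):
            variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])
    variants.discard(label)  # swapping identical neighbors recreates the original
    variants.discard("")
    return variants


def candidate_domains(domain):
    """Expand every typo of the label across the study's six extensions."""
    label, _, _tld = domain.rpartition(".")
    return sorted(f"{v}.{tld}" for v in typo_labels(label) for tld in TLDS)
```

Single-edit rules of this kind produce typo domains such as tysrus.com (from toysrus.com) and tiicketmaster.com (from ticketmaster.com); a substitution such as hotmaol.com would need an additional rule.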

2. Projecting Traffic to Each Website

Using FairWinds’ proprietary traffic calculation method, we determined the annual traffic numbers for each of these domain names. 

3. Examining Website Content

We recorded the registration data for these domains and based on the registrant and registrar, labeled them as follows:

  • Domains owned by the same brand that owned the target domain were labeled “Brand Owner” domains. These domains are legitimately owned and should either not factor into an analysis of the cost of typosquatting (if the brand is using the domain name to point to relevant content) or should be evaluated as a different type of harm that the brand has inflicted upon itself (if the domain is owned by the brand owner but not being used properly).
  • Domains owned by a brand owner other than the owner of the target domain—most likely due to the fact that a typo of the target domain is another brand—were labeled as “Other Legitimate Owner” domains. Requiring that the target domains have a minimum of six characters reduces these occurrences, but it does not eliminate them entirely.
  • Finally, all of those names that did not belong to a brand owner fell under the umbrella of “Potential Squatter” domains. “Potential Squatter” domain names fall into one of two categories—either the identifying information regarding the owner is hidden or the information regarding the owner does not belong to a brand that the infringing domain targets. Under these circumstances, we know with certainty that the domain is not owned by the brand, but we do not know whether the domain is being used for a legitimate purpose (such as an opinion site) or for cybersquatting.

Once examined, this group of potential squatter domains—just over 28,000 domains, or about 85 percent of our original data set—would provide us with information on the losses incurred by brand owners as a result of typosquatted domains.

Each Potential Squatter domain has a target domain—the target domain is derived from the proper spelling of the brand. Each Potential Squatter domain also has a Potential Squatter behind it—the person who registered the infringing domain. In order to determine the content hosted on each of these domain names (from the data set of 28,000 names), we examined the content of 20 percent of the domains owned by each Potential Squatter for each target domain. These domains were chosen randomly, and the content of each domain was labeled as one of the following:

  • Pay-Per-Click (PPC): PPC websites display a collection of sponsored links, usually pertaining to the keywords contained within the domain. A domain name that contains typos of a brand could resolve to a PPC page that may contain links to that brand, links to the brand’s competitors, and links to related sponsored advertisements.
  • Affiliates: Some brands offer affiliate programs, which allow third-party website owners to post the brands’ links and banners on their site or to send traffic to their site directly through domain redirects; in return, the owner of the site that is hosting the link receives a commission for every click-through that results in a purchase, sign-up, etc. While it is usually in breach of an affiliate program agreement, some cybersquatters plug into affiliate programs by using brand typo domains.
  • Does Not Resolve (DNR): These domain names did not resolve to any content at the time of our review. It is possible that the site was simply down temporarily or that these sites continuously exist without content.
  • Infringing Content: These domain names resolve to content similar to that of the target brand, such as “whitepages.net” resolving to a third-party phone directory site.
  • Other: This category captures domains used for a variety of purposes, such as hosting content for contests, blogs about a brand or product, or a registrar/hosting provider’s “coming soon” information.

After initially examining the list of 28,000 and marking domain names housed on Domain Name Servers (DNS) known for hosting PPC sites as “PPC sites”, there were still thousands of domain names to be examined. So, we looked to see if there were any patterns in the DNS that hosted these domains. We examined 20 percent of the total domains housed in each remaining DNS. If every domain in that 20 percent sample on a particular server resolved to only one type of site (PPC, Affiliate, etc.), the entire group of domains from our data set that were housed on that server was labeled as that type. Using this process, we were still unable to classify 8,000 of the 28,000 Potential Squatter domains. These 8,000 domains were therefore examined further.
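The server-level labeling rule described above can be sketched as follows. This is a minimal illustration, assuming a caller-supplied `inspect` function that returns the content type of a single domain; the function name, the data shapes, and the fixed random seed are hypothetical and not part of the study.

```python
import random
from collections import defaultdict


def label_by_name_server(domains, inspect, sample_share=0.20, seed=0):
    """Label whole name-server groups from a sample of each group.

    domains: iterable of (domain, name_server) pairs.
    inspect: callable returning a content label for one domain, e.g. "PPC".
    Returns (labels, unclassified): a dict of domain -> label, and the list
    of domains whose server sample showed more than one content type.
    """
    rng = random.Random(seed)
    by_server = defaultdict(list)
    for domain, server in domains:
        by_server[server].append(domain)

    labels, unclassified = {}, []
    for group in by_server.values():
        # Examine roughly 20 percent of the group, and at least one domain.
        k = max(1, round(len(group) * sample_share))
        seen = {inspect(d) for d in rng.sample(group, k)}
        if len(seen) == 1:
            # The sample is uniform, so the whole group takes that label.
            label = seen.pop()
            for d in group:
                labels[d] = label
        else:
            unclassified.extend(group)
    return labels, unclassified
```

Domains returned in `unclassified` correspond to the group that the study goes on to examine further.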

The content of 20 percent of these remaining 8,000 domains was analyzed by first determining which of the 8,000 domains had significant quantifiable traffic. Ten percent of these 8,000, or 800 domains, received detectable traffic. We then took a random sample of 800 domains from the remaining 7,200 that did not receive detectable traffic. Based on the percentages of PPCs, Affiliates, DNRs and Others found in this 20 percent sample set, we projected the percentages of PPCs, Affiliates, DNRs and Others found in the entire population of the 8,000 originally unlabeled domains.
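One way to project sample percentages onto the full set of 8,000 domains is sketched below. The study does not say whether the two groups (domains with and without detectable traffic) were weighted by their size, so the weighting here is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def project_shares(strata):
    """Project content-type shares from per-group samples to the population.

    strata: list of (population_size, sampled_labels) pairs, one per group.
    Each group's sample shares are scaled by that group's population size
    (an assumed weighting) and then divided by the overall population.
    """
    total = sum(size for size, _ in strata)
    projected = Counter()
    for size, labels in strata:
        for label, n in Counter(labels).items():
            projected[label] += size * n / len(labels)
    return {label: count / total for label, count in projected.items()}
```

With the study's group sizes, the call would take the form `project_shares([(800, labels_with_traffic), (7200, labels_without_traffic)])`, where each list holds the 800 labels recorded for that group.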

After these calculations, we determined that 23,374, or 84 percent, of Potential Squatter domains resolve to PPC sites. Affiliate domains account for 5 percent of the Potential Squatter domains, while 6 percent did not resolve, 3 percent hosted “Other” content, and 2 percent resolved to infringing content.

Graph 1

4. Determining Costs Incurred as a Result of Each of These Content Categories

In order to determine costs incurred as a result of each of these content categories, one must understand and establish several things. First, one must understand brand owner spending on online advertising.

With the critical importance of digital marketing, brands spend significant portions of their online marketing dollars to ensure that their online audience can easily navigate to the relevant content that they expect to find. This includes money spent on search engine optimization (SEO) and sponsored advertising. An SEO strategy for a website is designed to improve the placement – or rank – of that website on search engine results pages through search engine organic/natural relevance scoring algorithms. Another aspect of search engine marketing is improving the visibility of one’s website on search engine results pages through paid placement or “sponsored” advertising.

Screenshot 1

When a brand is engaged in paid search, the company will not only have its links ranked at the top or on the right side of search engine results, but will also see its links posted on PPC and other websites unless it has opted out of the search engine’s extended/syndicated ad network. When an Internet user clicks on a sponsored link located on the search engine or another site, the brand paying for that sponsored advertisement will pay a click fee (the cost-per-click or CPC). When these links are displayed on sites with a domain containing the brand or a typo of a brand, the user who entered that domain name in the address bar was clearly intending to go directly to the brand’s own website. By paying a click fee for an advertisement on a site that contains its brand, the company is paying for traffic that is rightfully theirs in the first place. It is important to point out that PPC sites are rarely found on page one of a search engine’s results, so click-through from search engines to a PPC site is highly improbable. Therefore, the traffic figures presented throughout this whitepaper almost exclusively represent type-in or Direct Navigation visitor traffic.

The second key to determining the costs incurred through each of these content categories is understanding how Internet users navigate and seek content.

PPC sites

We have found through past research that roughly 25 percent of Internet users who land on a PPC site will click on one of the links presented. Furthermore, we estimate that about three-fourths of this 25 percent (or 18 percent of the total Internet users who landed on the PPC site) end up clicking on a link for the targeted brand while the remaining one-fourth (or 7 percent of the total) will click on the link of a competitor. When an Internet user is reminded of a competing brand and clicks on a competitor’s link, diversion occurs.

Advertising Costs

For those brand owners who invest in paid search, we employed the following two formulas to calculate losses attributed to the advertising costs of Potential Squatter PPC sites:

1- (The percentage of users likely to click on the target brand’s link) x (Traffic) x (Avg. cost per click)
2- (The percentage of users likely to click on a competitor’s link) x (Traffic) x (Avg. cost per click)

As we have mentioned, about 25 percent of Internet users who land on a PPC site will click on one of the links presented. Three-fourths of this 25 percent (or 18 percent of the total Internet users who have landed on the PPC site) will end up clicking on a link for the targeted brand, while the remaining one-fourth (or 7 percent of the total) will click on the link of a competitor. The traffic component in this formula varies from domain name to domain name. As for the average cost per click of the brand’s own link, FairWinds leveraged Google’s average cost per click (CPC) for a keyword derived from a brand name. The root of each correctly spelled domain (the portion of a domain before the first dot) was converted into keywords (for example, verizonwireless.com became Verizon Wireless) and then individually run through the Google Adwords Traffic Estimator to determine its individual CPC value. The use of Google’s tool presents a transparent and replicable approach from a reputable, official source. CPCs from third parties such as SpyFu were not incorporated because the source of their data was not transparent.

The Google Estimator revealed 18 domains whose root term(s) did not have a CPC value, due to the fact that no one is currently bidding for those root terms. These 18 domains were removed from the final computation. Next, we calculated the ratio of the sum of all typosquatted domain traffic for each correctly spelled website and the total amount of typosquatted traffic of all the domains in the study. Every keyword term(s)’ CPC was multiplied by the ratio of the sum of traffic to its typosquatted domains and the sum of traffic to all typosquatted domains in the study. When these values are summed they yield a weighted CPC of $2.03.
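The traffic weighting described above amounts to a weighted average. A minimal sketch follows; the function name and the sample figures in the usage note are hypothetical, since the per-brand CPC values and traffic totals are not published in this paper.

```python
def weighted_cpc(brands):
    """Traffic-weighted average cost per click.

    brands: list of (cpc_dollars, typo_traffic) pairs, one per target
    domain, where typo_traffic is the summed traffic to that domain's typos.
    Each CPC is multiplied by the brand's share of all typosquatted traffic,
    and the products are summed.
    """
    total_traffic = sum(traffic for _, traffic in brands)
    return sum(cpc * traffic / total_traffic for cpc, traffic in brands)
```

For instance, `weighted_cpc([(1.00, 3000), (3.00, 1000)])` gives 1.50, because the brand with the lower CPC accounts for three-fourths of the traffic.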

Based on these numbers, the two formulas become:

1- 18% x (Annual traffic per domain) x $2.03 = Advertising costs for the target brand
2- 7% x (Annual traffic per domain) x $2.03 = Advertising costs for the target brand’s competitor

Those brand owners that do not invest in paid search will not be featured in links on PPC sites; as a result, the 25 percent of visitors who are likely to click on a sponsored link are guaranteed to click on the link of a competitor. The typo domains of a brand that does not pay for search generate unintended advertising costs for the brand’s competitors, which is calculated by:

25% x (Annual traffic per domain) x $2.03 = Advertising costs for the target brand’s competitor

The total cost of unintended advertising that can be attributed to the Potential Squatter PPC sites in our study is $148,578,020 per year.
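The advertising-cost formulas above can be expressed per domain as in the sketch below. The percentages and the $2.03 weighted CPC come from the study; the function and constant names, and the traffic figure in the usage note, are illustrative assumptions.

```python
WEIGHTED_CPC = 2.03            # dollars per click, from the study
BRAND_CLICK_SHARE = 0.18       # visitors who click the target brand's link
COMPETITOR_CLICK_SHARE = 0.07  # visitors who click a competitor's link
ANY_CLICK_SHARE = 0.25         # visitors who click any sponsored link


def advertising_costs(annual_traffic, brand_buys_paid_search):
    """Return (cost to target brand, cost to competitors) in dollars per year."""
    if brand_buys_paid_search:
        brand = BRAND_CLICK_SHARE * annual_traffic * WEIGHTED_CPC
        competitor = COMPETITOR_CLICK_SHARE * annual_traffic * WEIGHTED_CPC
    else:
        # The brand has no sponsored links, so every click goes to a competitor.
        brand = 0.0
        competitor = ANY_CLICK_SHARE * annual_traffic * WEIGHTED_CPC
    return brand, competitor
```

For a hypothetical typo domain receiving 100,000 visits a year, this yields about $36,540 for the target brand and $14,210 for its competitors when the brand buys paid search, and about $50,750 for competitors when it does not.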

Lost Sales

The costs associated with lost sales must be added into the total amount of money lost to Potential Squatter PPC sites. These lost sales occur when an Internet user clicks through to the website of the brand’s competitor. This is calculated by the following formula:

For brands that invest in paid search:
(Average conversion rate of a sale x 7%) x (Average order size) x (Annual traffic)

For brands that do not invest in paid search:
(Average conversion rate of a sale x 25%) x (Average order size) x (Annual traffic)
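As a sketch, the two lost-sales formulas differ only in the share of visitors assumed to reach a competitor. The conversion rate and order size used in the usage note are placeholders, because the averages derived from the Internet Retailer data are not stated in this paper; the function and constant names are likewise hypothetical.

```python
COMPETITOR_SHARE_PAID = 0.07    # brand invests in paid search
COMPETITOR_SHARE_UNPAID = 0.25  # brand does not invest in paid search


def lost_sales(annual_traffic, conversion_rate, avg_order_size, brand_buys_paid_search):
    """Annual dollar value of sales diverted to competitors for one domain."""
    share = COMPETITOR_SHARE_PAID if brand_buys_paid_search else COMPETITOR_SHARE_UNPAID
    return conversion_rate * share * avg_order_size * annual_traffic
```

With placeholder inputs of 100,000 visits, a 3 percent conversion rate, and a $100 average order, the result is about $21,000 per year for a brand that buys paid search and about $75,000 for one that does not.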

We calculated the average conversion rate by looking at Internet Retailer’s Top 500 e-retailers. Ranking them in order of conversion rate, we eliminated the top five percent and bottom five percent to remove any outliers and then averaged the remaining 90 percent. We calculated the average order size in the same manner: we ranked the Top 500 e-retailers by average order size, eliminated the top and bottom five percent, and then calculated the average of the remaining 90 percent.
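The outlier removal described here is a trimmed mean. A minimal sketch, assuming the values are available as a plain list (the function name is ours):

```python
def trimmed_mean(values, trim_percent=5):
    """Average the values left after dropping the top and bottom trim_percent."""
    ordered = sorted(values)
    cut = len(ordered) * trim_percent // 100  # items dropped from each end
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)
```

For a list of 500 values, 25 are dropped from each end and the remaining 450 (90 percent) are averaged.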

Using this formula for each of these domain names, we found that the combined cost of lost sales attributed to PPC sites in our data set adds up to $32,855,146 per year.

Lost Impressions

Based on Interactive Advertising Bureau (IAB) statistics, we estimate that an Internet impression (the monetized psychological value of reaching the intended online destination) is worth roughly 5 to 10 cents for owners of global brands. For this study, we conservatively estimated that an impression is worth 2 cents per visitor. The Potential Squatter PPC sites in our data set garner 268,646,311 visitors per year; as a result, the annual cost of lost impressions to these sites amounts to $5,855,292.

Finally, to determine the total losses incurred through Potential Squatter PPC sites in our data set, we added up the calculated costs of unnecessary advertising, lost sales and lost impressions. The total losses for Potential Squatter PPC sites add up to $187,288,458 per year.

Affiliate sites

Internet users have no reason to realize that they have landed on a cybersquatter’s site if they see the domain name resolve to expected content. As a result, they behave just as they would on the intended target site.

Losses from Affiliate sites

On June 17, 2009, FairWinds released a study on affiliate fraud. One example in the study demonstrated how affiliate fraud could earn cybersquatters (and cost brand owners) as much as 5.6 times the fees that a PPC site would on that same domain name. However, to be conservative in our calculations, we are basing our estimates on the assumption that an affiliate site brings in three times more for the cybersquatter than a PPC site would on that same domain.

Therefore, to calculate the cost of affiliate fraud sites to brand owners, we used the following formula:
(Traffic) x (Average Cost Per Click) x (3) = Commission paid by brand owners to cybersquatters

The total costs of Potential Squatter Affiliate domains (which accounted for 5 percent of all Potential Squatter domains) added up to $33,010,689 per year.

DNR, Content Infringement and Other sites

An Internet user who does not happen upon a site with any relevant content after the first attempt at Direct Navigation will likely make one of three choices: realize their error and type in the domain again, feel they did not make an error and assume the site is down, or revert to using a search engine to find the desired content. Just as was the case with behavior towards sponsored links, three-fourths of these Internet users will then choose the target brand from amongst the search rankings that the search engine provides, while one-fourth will choose a competitor.

Now that we have established the principles behind our assumptions and calculations, we can determine the losses incurred by brand owners from the Potentially Squatted sites in our data.

Lost Sales

These DNR, Content Infringement, and “Other” sites will result in lost sales, which can be calculated in the same way. This is because, as we discussed earlier, an Internet user who does not happen upon a site with any relevant content after his first attempt at Direct Navigation will likely then use a search engine to find his desired content. Three-fourths of these Internet users will then choose the target brand from amongst the search rankings, while one-fourth will choose a competitor. For the three-fourths of visitors who choose the target brand, that brand will incur no losses. For the one-fourth who choose a competitor, however, the target brand will suffer lost sales. With this in mind, we employed the following formula to calculate lost sales:

(Average conversion rate of a sale x 25%) x (Average order size) x (Annual traffic)

Formula for calculation of lost sales.

We use 25 percent since we predict a user who finds no content will then visit a search engine to find content and will choose a competitor 25 percent of the time.

Lost Impressions

Using the same formula that we used to calculate the cost of lost impressions attributed to the Potential Squatter PPC sites in our study, we added up 2 cents per visitor for the 127,998,753 annual visitors arriving at the Potential Squatter DNR, Content Infringement, and “Other” sites in our study. The total cost of lost impressions for these Potential Squatter sites is $3,011,976.

The total cost of both lost sales and lost impressions resulting from the Potential Squatter DNR, Content Infringement and “Other” sites in our data set amounts to $44,881,439.

Adding up these three Potential Squatter sections (losses from PPC sites, losses through Affiliate programs, and the losses that come from Other, Infringing Content and DNR sites), the total ongoing cost of the Potential Squatter domains is $265,180,586 per year.
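The component figures reported in this section can be added up to confirm the PPC subtotal and the overall total. The variable names below are ours; every figure is taken from the text.

```python
# Annual losses from Potential Squatter PPC sites, in dollars.
ppc = {
    "unintended_advertising": 148_578_020,
    "lost_sales": 32_855_146,
    "lost_impressions": 5_855_292,
}
ppc_total = sum(ppc.values())

affiliate_total = 33_010_689
dnr_infringing_other_total = 44_881_439

potential_squatter_total = ppc_total + affiliate_total + dnr_infringing_other_total

print(f"PPC sites:                ${ppc_total:,}")
print(f"All Potential Squatters:  ${potential_squatter_total:,}")
```

The two printed values are $187,288,458 and $265,180,586, matching the totals reported for PPC sites and for all Potential Squatter domains.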

Annual Cost of Each Resolves Type for Potential Squatter Domains

Graph 2

Monetary Losses to Sites Owned by Brand Owners

While infringing sites are certainly the lion’s share of the problem, brand owners are also losing value and revenue by not properly using the sites that they do own. Out of the 4,794 Brand Owner domains that we identified, 1,348 were typos of the target domain that had traffic and, if used improperly, could negatively impact a brand owner’s bottom line. Out of these 1,348 domains, 313 did not resolve (DNR), 15 were PPC sites, one resolved to a search page, and 1,020 resolved correctly.

Graph 3

When combining the lost impressions of all of these domains and the lost sales that e-retailers suffer, these misused PPC and DNR names cost brand owners an additional $19,986,288 per year. This brings the total cost of typosquatting up to $285,166,873.

Discussion

Based on the findings of this study, the task of enforcing the domain name space and pursuing infringements may seem daunting. However, the good news is that much of the harm and loss of revenue is concentrated in only a few, highly valuable infringements for each target brand. A proper domain name enforcement strategy provides guidance regarding how to prioritize these infringements according to which are the most harmful and when to pursue them.

For example, out of the 28,000 Potential Squatter domains, only 4,632 (or 16.5 percent) garner traffic.

Domains Belonging to Potential Cybersquatters

Graph 4

Also, consider the following graph, which shows the 25 target domains that have the most highly trafficked typosquatted domains:

Target Domains with Highest Squatter Traffic

Graph 5

Target Domains with >5M Squatted Traffic

Graph 6

Target Domains with >3M Squatted Traffic

Graph 7

To provide an example of the distribution of traffic within the typos of these target domains, let’s consider the popular social networking site Facebook. Traffic data for Facebook can be found in the second bar from the left in Graph 6. All of the 239 Potential Squatter registered typo domain names of facebook.com garner an aggregate of over 48 million visitors per year; however, further analysis shows that 60 percent of this diverted traffic is going to just seven of those domain names. Clearly, Facebook does not need to pursue all 239 registered infringements to drastically reduce the impact of typosquatting on its brand. It is all about determining which domains are the most valuable based on the volume and quality of traffic and then pursuing those domains accordingly.
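Finding the handful of domains that carry most of the diverted traffic is a simple ranking exercise. The sketch below is illustrative only: the function name is an assumption, and the per-domain traffic figures for facebook.com typos are not published in this paper.

```python
def domains_covering(traffic_by_domain, share=0.60):
    """Return the highest-traffic domains that together reach the given share.

    traffic_by_domain: dict of domain name -> annual visitors.
    Domains are taken in descending order of traffic until their combined
    visitors reach share * total.
    """
    total = sum(traffic_by_domain.values())
    covered, picked = 0, []
    ranked = sorted(traffic_by_domain.items(), key=lambda kv: kv[1], reverse=True)
    for name, visits in ranked:
        if covered >= share * total:
            break
        picked.append(name)
        covered += visits
    return picked
```

Applied to a brand’s full list of registered typos, the returned list is the short set of names worth evaluating first for recovery.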

It is also important, as we have pointed out, for brand owners to appropriately use the domain names that they do own. Domain names that garner traffic should point to the most relevant content in order to provide Internet users with the best possible online experience.

The study showed some interesting infringements that engaged in affiliate fraud, which FairWinds wrote about in 2009. The domain names tysrus.com and toysryus.com (both typos of ToysRUs.com) redirect to content featuring toys and games on Amazon.com. The domain name wwwpandora.com (a typo of www.pandora.com, which allows visitors to stream music) similarly redirects users to music content on Amazon.com.

All too often, though, brand owners allow their domain names to go unused or misused. We found that typo domains such as tiicketmaster.com, youtube.org, hotmaol.com and walgreens.org were all owned by the appropriate brand owner yet at the time of this study still pointed to PPC sites.

The Cost of Recovering Potential Squatter Domain Names

Beyond the ongoing cost of lost sales, unnecessary advertising, lost impressions and commissions to affiliate fraudsters, brand owners spend significant amounts of time and money recovering typosquatted domains by filing UDRP complaints. Brand owners can file UDRP complaints against a registrant who has registered one or more domains that contain infringements on their brand.

There were domain names in our data set for which the registrant information was readily available—for these it was simple to determine how many UDRPs would need to be filed by each brand owner. The domains that infringed on the same brand and were registered to the same owner could be filed as one dispute.

However, the registrant information for many of the domain names in our data set was “protected” or hidden from view. In order to project how many UDRPs would need to be filed in this set, we divided the number of domains in this set by 2.1, the average number of domains per registrant that we found in the set of domains with readily available data.

All told, we estimated that to reclaim all of the 28,000 Potentially Squatted domain names, 13,185 UDRP complaints would need to be filed. Luckily, brand owners do not need to spend the time and money to reclaim all existing infringements; doing so is neither feasible nor necessary.

The average cost associated with hiring a law firm to reclaim a name is about $6,000, which includes the UDRP fees, legal fees and (minimal) registration fees. In order to reclaim the 28,000 Potentially Squatted domain names from an estimated 13,185 registrants, brand owners would need to spend $79,110,000.

Added to the cost of lost impressions, lost sales, fraudulent commissions and unnecessary advertising, the recovery of all of these names brings the total cost of Potential Squatter domain names to $364,276,873.
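The recovery-cost arithmetic can be reproduced as follows. The sketch assumes the 2.1 domains-per-registrant average is used to convert a count of hidden-registrant domains into an estimated number of complaints; the split between known and hidden registrants is not published, so `estimate_complaints` is shown with hypothetical inputs only.

```python
COST_PER_COMPLAINT = 6_000          # dollars, average cost cited in the study
ONGOING_ANNUAL_LOSS = 285_166_873   # dollars, total cost of typosquatting above


def estimate_complaints(known_registrant_groups, hidden_domains, domains_per_registrant=2.1):
    """Complaints needed: one per known registrant group, plus a projection
    for domains whose registrant is hidden (assumed conversion)."""
    return known_registrant_groups + round(hidden_domains / domains_per_registrant)


def recovery_cost(complaints):
    """Total cost in dollars of filing the given number of UDRP complaints."""
    return complaints * COST_PER_COMPLAINT


total_with_recovery = ONGOING_ANNUAL_LOSS + recovery_cost(13_185)
print(f"${recovery_cost(13_185):,} to recover, ${total_with_recovery:,} in total")
```

The printed figures are $79,110,000 and $364,276,873, in line with the amounts reported above.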

Conclusion

Through the research conducted for this paper, we sought to closely examine a segment of the typosquatting problem. However, typosquatting is just a small part of the overall cybersquatting problem that exists today. We hope that this whitepaper will encourage brand owners to reexamine their domain name strategy, proactively register certain key typo domains for their business, and reactively pursue the most valuable infringements. Though typosquatting is a serious issue, not all typosquatting domains are equally harmful to a brand or equally worth pursuing. Brand owners should have a domain name strategy that determines the worth of a typo domain through components such as the quantity and quality of the traffic it receives, the type of infringement, and the content it currently hosts.

While it would cost $79 million to recover all of the typosquatted domain names in our data set, spending 22 percent of that cost would recover the 10 percent of typosquatting names that would produce a return on investment in less than a year. By spending five percent of the total cost, the 2.5 percent of names that would produce a return on investment in less than a month could be recovered. Finally, for less than one percent of the total cost, those names that could produce a return on investment in less than a day can be reclaimed.
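A payback calculation of this kind can be sketched per domain. For simplicity the sketch charges the full $6,000 average recovery cost to each domain, although a single complaint can cover several domains held by one registrant; that simplification, the function names, and the sample losses in the usage note are assumptions.

```python
RECOVERY_COST = 6_000  # dollars per reclaimed name (simplifying assumption)


def days_to_recover(annual_loss):
    """Days until a domain's avoided losses equal the cost of reclaiming it."""
    if annual_loss <= 0:
        return float("inf")
    return RECOVERY_COST / (annual_loss / 365)


def prioritize(annual_loss_by_domain, max_days=365):
    """Names that pay back within max_days, fastest payback first."""
    payback = {name: days_to_recover(loss) for name, loss in annual_loss_by_domain.items()}
    return sorted((n for n, d in payback.items() if d <= max_days), key=payback.get)
```

A hypothetical domain costing its brand $2,190,000 a year would pay back in one day, one costing $12,000 a year in about six months, and one costing $600 a year would not pay back for a decade, so only the first two would make the one-year cut.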

Number of Domains and Days to Return Recovery Investment

Graph 8

With the right metrics and guidance, it is possible to identify opportunities with a high return on investment and thus pinpoint where brands should be focusing their reclaim efforts. However, the widespread practice of typosquatting generally clouds the ability of brands to cost-effectively address harms. We hope that this whitepaper will also encourage legislators to reexamine the regulation of the domain name space and strengthen the Anticybersquatting Consumer Protection Act (ACPA). Typosquatting is a persistent and pervasive problem that requires a greater deterrent than the one that exists today. Furthermore, typosquatting is not confined by national boundaries. To truly impact typosquatting and cybersquatting in general, there needs to be a multinational treaty that sanctions an international crackdown on the practice.