Unveiling the Secrets: The Google Leak That’s Shaking the SEO World

Key Takeaways

  • Google’s internal documentation was leaked to the public, revealing more than 14,000 ranking features.
  • The leaked documents detail the data Google collects and processes on websites, including descriptions of various system functions, explanatory diagrams, and charts covering multiple search-related areas such as index organization, content evaluation, and ranking algorithms.
  • The fundamentals of SEO haven’t necessarily changed, but we now have more data points about what matters.

Internal Google documentation describing more than 14,000 ranking features has been leaked to the public. While Google is presumably working to contain the fallout, the SEO community is buzzing with curiosity. The leak offers an unprecedented look at the inner workings of Google’s search algorithm, and practitioners are eager to dig into the details.

The Source of the Leak

The story behind the leak is perhaps as intriguing as the content itself. A few weeks ago, an anonymous source reached out to Rand Fishkin, the co-founder of Moz and the creator of the Domain Authority metric. Despite being out of the SEO game for six years and now running SparkToro, Fishkin remains a significant figure in the industry. The source, driven by frustration with Google’s perceived dishonesty, claimed to have access to internal search documents and sought to expose the truth. 

On May 24, Rand Fishkin confirmed the authenticity of the documents after a video call with the anonymous source. With further validation from former Google employees, it became clear that the leak was genuine. 

Since then, Erfan Azimi has revealed himself as the anonymous leaker. On May 28, he published a 13-minute video on the subject, adding another layer to this already complex story. 

What’s Inside the Leaked Documents?

The leaked documents detail the data Google collects and processes on websites. They include descriptions of various system functions, explanatory diagrams, and charts covering multiple search-related areas such as index organization, content evaluation, and ranking algorithms. 

Despite the extensive details, the documents do not indicate the importance of each parameter within the algorithm, and some parameters are labeled as deprecated. However, their mere presence provides invaluable insights.

Contradictions and Revelations

The fundamentals of SEO haven’t necessarily changed, but we now have more data points about what matters. The leaked information also appears to contradict some of Google’s official statements, including the following:

  • Domain Authority: Google has denied using Domain Authority as a ranking factor, yet the documents mention a “siteAuthority” parameter that could influence site rankings, which makes that denial harder to take at face value.
  • Google sandbox: Officially, Google claims there is no sandbox for new websites, but the documents indicate a “hostAge” attribute used to sandbox fresh spam during serving time.
  • User data from Chrome: Contrary to Google’s statements, the documents show user data from Chrome is indeed used for search-related purposes, such as generating the “Sitelinks” SERP feature. 

The documents also contain a wealth of other fascinating details: 

  • NavBoost and PageRank: Insights into how these factors impact rankings.
  • Content evaluation: The role of authors, links, and criteria that lower a site’s trustworthiness.
  • Panda algorithm: Details on how Panda assesses content quality using embeddings.
  • Big brands vs. small sites: Evidence of a bias toward larger brands over smaller sites. This makes being strategic with SEO much more important for smaller brands. 

Special Whitelists and Biases

One of the most controversial revelations is the use of special whitelists for topics like COVID-19, tourism, and politics. During elections, Google allegedly uses whitelists to promote or demote certain sites to prevent misinformation. This raises significant concerns about bias and manipulation in search results. 

What Google Has to Say

In light of the leak, a Google spokesperson told Gizmodo: “We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information. We’ve shared extensive information about how Search works and the types of factors that our systems weigh while also working to protect the integrity of our results from manipulation.” 

However, Google will not confirm or deny the accuracy of these leaked documents. 

The Aftermath and Future Implications

Fishkin and SEO expert Mike King have only scratched the surface of these documents, so the SEO community will likely be analyzing this data for months to come. The leak has already sparked intense debate and speculation about Google’s transparency and the true nature of its search algorithms.

“Without the weighting of specific ranking factors in the algorithm, the leak is largely an insight into how Google collects user behavior and not an actionable document,” says Grant Effinger, a strategist manager at Intero Digital. “We also see that Google has not been the most honest with what it has shared with the public regarding its ranking factors and SEO advice — something its current anti-trust trial already revealed.” 

Effinger goes on to explain the future implications of these newly leaked insights:

“Will massive, well-known, respected brands have a leg up in search? Yes, as they have for years. Do user behavior and on-site actions have an impact on ranking? Sure. Are these insurmountable advantages? Hardly. With our tactics, experience, and expertise, we have helped private organizations outrank high-authority government websites for relevant keywords. We’ve gotten smaller businesses to outrank larger competitors for their own brand name. These advantages exist, yes, but fates are not sealed. Weapons don’t win wars; tactics do. And we have winning tactics.” 
