And the award goes to…Megan Fox! Best known for her heroics against the Decepticons in the movie Transformers, Megan Fox had the most copied celebrity image on the Web out of all the images of the women on Maxim’s “2007 Hot 100” and FHM’s “100 Sexiest Women 2007”, despite her official ranking of 18th and 65th on those lists, respectively.
Here’s what we did:
First, we found lists of the hottest women in 2007 from FHM and Maxim, and lists of the hottest men in 2007 from People’s “Hottest Bachelors 2007” and People’s “Sexiest Man Alive 2007”.
Next, we used Attributor’s image monitoring platform to scan the web for copies of the images.
Finally, we reviewed the results, tallied the number of times each image was copied, painfully sorted through thousands of pictures of beautiful women, and categorized the type of site doing the copying.
The Results:
The top five most copied female celebrity images on the Web, with the FHM and Maxim rankings shown in parentheses, are:
Megan Fox (65, 18)
Jessica Alba (1, 2)
Rihanna (Not Ranked, 8)
Halle Berry (16, 55)
Lindsay Lohan (41, 1)
Let’s not forget the men. Matt Damon will undoubtedly be pleased that he led the list of the most copied male celebrity images across both the People “Sexiest Man Alive 2007” and “Hottest Bachelors 2007” lists. His married status did not dampen Web users’ enthusiasm for his photo at all. Bachelor Matthew McConaughey, the cover photo for the “Hottest Bachelors” list, placed a respectable second with his well-copied image.
The top five most copied male celebrity images from both People lists with official magazine rankings in parentheses are:
Matt Damon
Matthew McConaughey
Patrick Dempsey
James McAvoy
Jake Gyllenhaal
And where are these images being used most? You guessed it: gossip sites. They represent 36% of all sites found, as publishing your own gossip site appears to be the new black. Here’s a breakdown of the sites where we found the images:
Gossip sites: 36%
Movie sites: 15%
Fan sites: 7%
Recognized domains that appear to be licensing the images: 7%
Splogs: 2%
Other (personal homepages/blogs, non-English sites): 33%
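For readers who want to run a similar tally, here is a minimal sketch of how reviewed sites could be rolled up into a percentage breakdown like the one above. The function name and the sample labels are made up for illustration; they are not the study's data or Attributor's tooling.

```python
from collections import Counter

def category_breakdown(site_categories):
    """Return {category: whole-number percentage} for a list of category labels."""
    counts = Counter(site_categories)
    total = sum(counts.values())
    # most_common() orders categories from most to least frequent.
    return {cat: round(100 * n / total) for cat, n in counts.most_common()}

# Hypothetical labels assigned during manual review (not the study's data).
sample = ["gossip"] * 9 + ["movie"] * 4 + ["fan"] * 2 + ["other"] * 5
print(category_breakdown(sample))
```

Rounded percentages may not always sum to exactly 100, so a real report might keep one decimal place.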
January 3, 2008 at 12:56 pm
| Filed under Video
Every month, a group suggests a new thesis to enable online content proliferation and strike a balance between the needs of consumers and publishers alike. Predictably, the output is a set of guidelines or a call for new guidelines, as the Center for Social Media reported today. Though well-intentioned, more guidelines are not the answer; instead, participatory media will thrive through a community that is empowered by full visibility of online video re-use and publisher web distribution policies.
Determining ‘Fair Use’ is a tough, complex problem – an issue that has caused many media companies and individuals to shy away from embracing the Internet as a distribution channel. “Recut, Reframe and Recycle”, the report from the Center for Social Media at American University, examines user-generated video content and classifies usage into nine common practices that appear to be ‘Fair Use’.
Media companies and artists like Lane Hartwell have long since thrown up their hands in trying to determine which instances of re-use to allow among the thousands of copies that appear on the Internet each week. The barriers are substantial:
The tools to locate re-use of images or video content are limited.
Reviewing each and every copy found is burdensome.
Contacting each site to pursue a licensing deal isn’t feasible, especially without some type of filter to identify which ones can result in a new revenue source.
Visibility is the answer, and, by this, I don’t just mean a long list of the sites re-using videos across the Internet, sorted by monthly visitor traffic. This won’t help with the nine common classes of ‘Fair Use’ introduced today and will bury most publishers under an avalanche of work.
Publishers of all sizes, and video producers specifically, should be able to classify each video as “Promotional” or “Premium”, assigning each a set of parameters that specify the maximum duration it can be shown, the branding and link requirements, and any ad-sharing splits.
With contextual, web-wide visibility of re-use, publishers of all sizes can post their distribution policies for the community to embrace. For example, any mashup shorter than 30 seconds can be greenlighted as long as a link is provided, or full copying of premium videos can be enabled as long as 40% of all ad revenue is directed to the publisher’s AdSense account.
The complexities and possibilities of what can be created are endless, but all seem like a giant step forward.
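To make the idea concrete, here is a rough sketch of how a posted distribution policy might be checked against one instance of re-use. Everything in it (the `Policy` and `Reuse` classes, the field names, the `is_permitted` function) is a hypothetical illustration under the two example policies mentioned above, not an existing product feature.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Policy:
    """A publisher's re-use terms for one class of video (hypothetical schema)."""
    max_seconds: Optional[float]  # None means full copying is allowed
    link_required: bool
    min_ad_share: float           # fraction of ad revenue owed to the publisher

@dataclass
class Reuse:
    """One observed instance of re-use on another site (hypothetical schema)."""
    seconds_used: float
    links_back: bool
    ad_share: float

def is_permitted(policy: Policy, reuse: Reuse) -> bool:
    """Return True only if the re-use satisfies every term of the policy."""
    if policy.max_seconds is not None and reuse.seconds_used >= policy.max_seconds:
        return False
    if policy.link_required and not reuse.links_back:
        return False
    return reuse.ad_share >= policy.min_ad_share

# The two example policies from the text, expressed in this assumed schema.
PROMOTIONAL = Policy(max_seconds=30, link_required=True, min_ad_share=0.0)
PREMIUM = Policy(max_seconds=None, link_required=False, min_ad_share=0.40)

print(is_permitted(PROMOTIONAL, Reuse(seconds_used=25, links_back=True, ad_share=0.0)))
print(is_permitted(PREMIUM, Reuse(seconds_used=600, links_back=False, ad_share=0.25)))
```

A real policy would need more parameters, such as the branding requirements, but the shape of the check would likely stay the same.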
For many, the last option is clearly best. Your content leaves your site but you continue to get paid for it, either through ad share or a license fee . . . regardless of where it appears or which ad network is monetizing it.
Unfortunately, not everyone can afford a sales force, and, frankly, the market for your content may not be as established as Reuters’. Or maybe they are only using a portion of your content.
But wait, there is another option – securing a link back to your content for each instance of re-use.
Why are links important? Google, Yahoo and MSN all base their search results on the number of inbound links to your site. If you aren’t paying attention to the number of links you receive, you’re probably not ranking highly in the search engines and you’re definitely losing out on traffic and customers.
How much is a link worth? It depends, but probably more than you think. According to Mesa, Ariz.-based Text Link Brokers, clients pay between $15 and $1,000 a month for a single link and $600,000 for a full-service link-building campaign.
And guess what: if your content is similar to the categories we’ve analyzed, we’re not talking about a single link or a handful of links. It varies by category, but in most categories we’re finding 10+ copies of every article that we track. In a search economy dominated by Google, this represents a tremendous traffic-building opportunity.
Do search results impact traffic for other media types? Google’s announcement of universal search results – showing images, videos and text in a single result set – has already been embraced by Yahoo and indicates that search will become an increasingly important channel for traffic across all media types.
So, I submit to those who know their content is being copied and care about it – there are new ways to act upon reuse so you can capture value for your content. It starts with links and, based on our product roadmap, will lead to even more direct monetization.
Super distribution, cut and paste web, widget economy . . . a collection of buzz words that fuel the conference circuit, yet each term describes a well-documented fact — consumers are interacting with content where they want to, not where you tell them to.
Sumner Redstone called this out in a recent Forbes article: “We are now in a fragmented search economy, which means we need to extend our content beyond our own destination sites so consumers can reach it more easily … The content mountain has officially relocated.”
Or, maybe the mountain has blown up.
So how do you put your content mountain back together?
The first step is to find all the pieces. Where does your content exist across the web? How much is being copied and discussed in the blogosphere? In which social networks is it being copied?
Next you need to classify each piece so you can treat each piece correctly. Key questions include: Which sites copied most or all of your content? How many have ads on them? How much traffic are these sites receiving? Which ones appear higher in search engine rankings than your original?
You’ll be surprised by what you find – in many cases, we’re finding a copy rate of more than 10x; that is, the average article is being copied over ten times.
Now comes the fun and challenging part, deciding how to re-build your content mountain. We’ll give you the following tools:
You can reap huge benefits just by asking each copying site to credit you with a link back to your site. Your marketing team will favor this approach as links equate to increasing your rank in Google and driving more traffic back to your site. Search Engine Optimization is still a black art to many, but one fact is well documented: To get highly ranked in Google, you need to make your site ‘important’ in Google’s eyes and, to do that, your site must have good inbound links - as many as possible.
Perhaps the best sales lead of all is a highly trafficked commercial site that consistently copies your content. Given how easy ad networks have made it to share the proceeds, incremental revenue can be an email or phone call away.
Lastly, you could decide that you want to prevent the scores of sites copying your content from sharing with others. Attributor supports this scenario with an efficient take-down notice process – notices that extend to the search engines and ad networks as well as the host site.
While Attributor can provide you with the map where your content resides and several tools with which to act, the blueprint for putting it back together is up to you. One thing is certain – you need a way to generate value for each content piece that exists off your site.
Web advertising networks – which include those run by Google, Yahoo and MSN – do a great job presenting advertisements that are highly relevant to the content on any web page. The question is, how often does the revenue from those ads actually reach the folks who create and own the content?
Our recent study on music lyrics illustrates the magnitude of this issue very well. First some background – last April, Yahoo Music partnered with Gracenote and became the first site to publish “official” song lyrics. USA Today reported that Yahoo shares with the copyright holders the revenue from the ads that will be displayed alongside the lyrics. Just last week, MTV and AOL announced that they would also promote official lyrics on their web sites.
Why so much attention to song lyrics? It all comes down to Search. According to an Ask.com study, the term “song lyrics” was the 6th most popular search query last year.
What we did:
Loaded lyrics from the following 14 songs into Attributor in mid-September: Umbrella (Rihanna), Before He Cheats (Carrie Underwood), Big Girls Don’t Cry (Fergie), Bleed It Out (Linkin Park), Beautiful Girls (Sean Kingston), You Can’t Stop the Beat (Hairspray Soundtrack), Can’t Tell Me Nothing (Kanye West), The Pretender (Foo Fighters), Stronger (Kanye West), Shawty (Plies), I Get Money (50 Cent), Let It Go (Keyshia Cole), Ayo Technology (50 Cent) and Good Life (Kanye West).
The service then scanned billions of pages across the web to find copies of the songs.
For each song, we compared the search engine ranking of Yahoo Music’s “official” version with the copies on the Google and Yahoo search engines.
What we found:
1,524 nearly exact copies across 300 different sites
57% of the copies had ads on the pages
None of the copies contained links back to the official version at Yahoo Music
100% of Google searches ranked the copying site higher than the official version when searching with terms of the form “song + lyrics” (e.g., “stronger lyrics”)
81% of Yahoo Searches ranked the copying site higher than the official version using the same search terms
To view the entire study and find out how much more Kanye West’s new album was copied than 50 Cent’s new release, please download the .pdf.
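The ranking comparison in the study can be approximated with a few lines of code. The sketch below assumes you already have each search's results as a list of URLs in rank order, plus a set of hosts known to carry copies; the function names and the example hosts are hypothetical, and this is not the tooling used in the study.

```python
from urllib.parse import urlparse

def copy_outranks_official(result_urls, official_host, copy_hosts):
    """True if a known copying host appears before the first official result."""
    for url in result_urls:  # results are assumed to be in rank order
        host = urlparse(url).hostname
        if host == official_host:
            return False
        if host in copy_hosts:
            return True
    return False  # neither the official site nor a known copy was found

def share_outranked(searches, official_host, copy_hosts):
    """Fraction of searches in which a copy outranks the official version."""
    hits = sum(copy_outranks_official(r, official_host, copy_hosts) for r in searches)
    return hits / len(searches)

# Made-up result lists for two searches (placeholder hosts only).
searches = [
    ["https://copy.example.com/a", "https://official.example.org/a"],
    ["https://official.example.org/b", "https://copy.example.com/b"],
]
print(share_outranked(searches, "official.example.org", {"copy.example.com"}))
```

Matching on the exact hostname is a simplification; a real comparison would probably also need to handle subdomains and confirm that each result page actually contains the copied lyrics.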
So what can newspapers, magazines and writers do to capture full value for their original content? The first step is understanding how and where your content is being copied. With this information, you can decide how to act through Attributor:
Request a link back to your original, improving your search engine ranking.
Ask the site to deposit a percentage of the revenue they make from your content into your AdSense account.
Send a formal DMCA takedown notice - we will ensure that it gets taken down from search engines.
Last week, Eileen Naughton, Google’s director of media platforms, told the American Magazine Conference, “Don’t fear Google”. With Google’s AdSense revenues surpassing $5 billion a year, “Fear” is the wrong term. How about making Google and the other search engines accountable?
One of my favorite parts of my job involves hypothesizing about which types or genres of content might be copied frequently and then testing my theories live.
Diving into recipe copying felt very natural for me – I am an aspiring chef who has taken more amateur cooking classes than I’d like to admit. I love to try out new recipes on my unsuspecting family and—unlike my mother’s generation—don’t have to pore through unwieldy, batter-splattered cookbooks to find a new way to make chicken.
What we did:
Loaded 37,000 publicly available recipes from Epicurious.com, Allrecipes.com and RachaelRayMag.com.
Let Attributor scan for matches or copies of the recipes.
Reviewed the matches and used Attributor’s % copied sorting feature to eliminate those that seemed to be derivatives, rather than copies of the original.
Took a random sampling of the recipes and plugged the recipe titles into Google search.
What we found:
Over 10,000 copies of the recipes were spread over 3,000 different sites.
Most were almost word-for-word copies. Across all matches, the average % copied was over 70%.
57% of the sites with copied recipes had ads on their pages.
60% of the sites with copied recipes failed to link back to the original recipe site.
For over 50% of the recipes we put into Google search, the copied recipe had a higher search rank than the original.
What it means:
Recipe sites are definitely losing out on traffic. Using a conservative methodology that excludes the search engine impact of copies outranking the original, we estimate recipe sites are losing ~13% of their monthly traffic to recipe copying.
Will the recipe sites want to send DMCA takedown notices to the 3,000 sites copying their content? Probably not.
Might they want to ask each copying site to link back to their original recipe site? I think so, especially when the recipe sites realize the impact these links would have on their search engine rankings. Links are a critical part of search engine rankings, and anyone who publishes original content needs to understand that.
For those keeping track, Google announced in April they were very close to launching a new filtering system, dubbed “Claim Your Content”. The system would give content owners automated tools to identify copyrighted material for removal. Later, in July, an attorney representing Google said they were planning to roll out Claim Your Content in September.
The industry relaxed a bit. Bloggers rejoiced. Lawyers started to look for new sources of litigation.
It was a great step for the online content economy – at last, the industry would have the transparency and accountability required to support the motivations of those who create and publish valuable content.
Today is the first of October, and still no word from Mountain View. And while everyone waits, some point out that Google continues to profit from sites with unauthorized copies of original content.
Is Google delaying the launch to milk even more out of its immensely profitable search engine? I doubt it. A better explanation for the delay might be the realization of the major challenges involved in getting this right.
You might ask, what’s hard about anything for Google? Here’s what I think are the six reasons it is particularly difficult for Google to do this right:
1. Removal across Google’s main index. Everyone focuses on YouTube’s responsibilities as a hosting site, but since Google is the world’s leading search engine, shouldn’t Google also scan and remove instances of pages with that video in the broader Google index, even when hosted on another site? That’s not an unreasonable expectation, particularly if – as at least one analyst believes – Google is already applying digital fingerprinting to content to improve their web indexing and eliminate duplication.
2. Removal from AdSense network. Google’s AdSense is the fuel that makes much of the online economy go. So if Google removes a video from YouTube but it shows up on another site that has AdSense ads, why shouldn’t the owner expect that Google would remove those ads – as Google’s own policy promises?
3. Removal across the Web. Google has a commanding lead with YouTube, but there are hundreds of thousands of sites that host or embed videos. Without a Web-wide solution, publishers will have no visibility into content popping up on the latest social network, blog or hosting site. Unless Google can make those content claims count across the Web, no individual site has much incentive to go legit, since they know this just gives the edge to less conscientious competitors.
4. A solution for all media types. Video may get all the press, but text is still the Internet’s navigational currency. The text on your site powers your ads and search rank, but text content also supports splogs and useless made-for-AdSense pages. Images make the Web worth viewing, yet nine out of 10 Web images may involve infringement. Can Google expect to bring all types of media together in universal search results, but only let you claim your content when it is a video?
5. Available to publishers of all sizes. I’m sure Google gets this one – they practically invented the “long tail” and realize that success is not just about satisfying Viacom and Disney. But that means that any publisher large or small must be able to stake their claim and have it count.
6. Independent and unbiased. For publishers to feel confident in a content claiming system, they must believe that it works without conflict of interest. Since Google controls and monetizes most search results and puts ads across more content pages than any other ad network, what extra steps must Google take to gain the confidence of publishers that their claims will always count, and that questions of fair use will be resolved objectively?
A long list for sure, but nobody said it would be easy. There’s no doubt Google believes in the potential of the online content economy – that’s why they paid $1.65 billion for YouTube. The question is: Given Google’s unique role in the content economy, are they really in a position to make it work?
What do you think is a must-have for Claim Your Content? Any predictions for its eventual roll-out date?
As a product manager for Del Monte Foods, I found “Place” or distribution the most frustrating of the 4 P’s - it took millions of dollars and several months to get my product on the grocery store shelf. And once it was there, I had limited visibility into what happened once it left the warehouse - forcing me to make major decisions based on fuzzy data points.
If you are publishing content, the Internet has eliminated the barriers to get in front of users. However, you lack insight into what happens to your content after it leaves your site, and, more importantly, you lose control over monetization.
Steve Rubel describes it as the “Cut and Paste Web” and offers three strategies for thriving. Increasingly, we’re finding that publishers view their copied content as an opportunity to extend their brand and drive traffic back to their site. On the other hand, when controlled distribution is mission critical, it’s easy to empathize with publishers who view copied content as a problem.
Whether you view copied content as a problem or an opportunity, the Internet is too important a distribution channel in which to be blind. This is where Attributor can help. We’re excited to be enabling leaders like Reuters and the Associated Press to forge ahead with their digital strategies, and we’d like to help you too.
September 14, 2007 at 3:23 pm
| Filed under General
I’m constantly creating lists. Mostly, I follow the to-do list format to reassure myself that I am having a productive day. Other times, I use the classic Pro/Con grid, but somehow this always makes the decision more complicated.
Consider this my personal attempt to cut through the always changing, often confusing online content economy with a Top 10 List.
#10 Your content is being copied all over the place. Whether you are publishing a book or writing song lyrics, your work is appearing in social networks, blogs and web sites. Don’t you want to know where else your content is going and how it’s being used? With this knowledge, you’ll make better decisions about what kind of content you produce in the first place.
#9 It’s time to level the playing field. The big search engines and ad networks (like Google) thrive on indexing and monetizing other people’s content, and as a result, they have a lot of information — like how readers discover your content and what marketers are willing to pay to advertise on it. Maybe it’s time for a level playing field where you have your own source of information on where your content is being used, where it is being indexed and when the ad networks are making money on it?
#8 You can focus on the matches that matter. You don’t have time to sift through thousands of matches per day to find 10 truly actionable ones. You need to be able to focus on the most interesting matches –as defined by you. Maybe you only want to see matches that have ads on the sites or matches on sites above a monthly traffic threshold. Perhaps you are interested in those that aren’t linking back to you or those who have posted an exact duplicate of your most valuable content.
#7 It’s yours, dammit. You invested time and probably money creating your content. Why should someone else get the credit and the traffic?
#6 DMCA takedown notices aren’t the only option. As a content publisher, the law is on your side, and sometimes there’s no alternative to a DMCA takedown notice. But maybe those should be the exception instead of the rule. There is a middle ground where both parties can share in the benefits – either through a link back to the original source or revenue share.
#5 The Internet isn’t slowing down. It’s no surprise that more consumers are choosing to consume more and more of their content online. The online channel is too important to not have visibility.
#4 Fewer lawsuits. By taking the subjectivity out of the Fair Use debate, it’s easier to reach a negotiated outcome, share revenue and avoid costly litigation.
#3 No more holding back. With web-wide visibility, you no longer need to erect barriers to view your content online or hold back your highest quality work. You know your readers probably prefer full-text RSS feeds to partial-text, and now you can give them what they want.
#2 You’ll be smarter. Now you can answer questions like ‘How much commercial value does my content have off my site’, ‘Which of my content is spreading the furthest within social networks’ or ‘What licensed content clicks back the most’.
#1 You want more traffic. You will be amazed by how many sites use your content without providing a link back to your site. If you can find the sites that are copying and streamline the process of requesting links, you can drive direct traffic and secure better rankings in Google results.
Got another reason? Please share it in the comments.
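For the technically curious, the filtering described in reason #8 could look something like the sketch below. The `Match` fields and the `matches_that_matter` function are made up for illustration and are not Attributor's actual schema or API.

```python
from dataclasses import dataclass

@dataclass
class Match:
    """One detected copy (hypothetical fields, not Attributor's schema)."""
    url: str
    has_ads: bool
    monthly_visitors: int
    links_back: bool
    percent_copied: float  # 0 to 100

def matches_that_matter(matches, ads_only=False, min_visitors=0,
                        unlinked_only=False, min_percent_copied=0.0):
    """Keep only the matches that pass every filter the publisher turned on."""
    kept = []
    for m in matches:
        if ads_only and not m.has_ads:
            continue
        if m.monthly_visitors < min_visitors:
            continue
        if unlinked_only and m.links_back:
            continue
        if m.percent_copied < min_percent_copied:
            continue
        kept.append(m)
    return kept

# Made-up sample data for illustration only.
sample = [
    Match("https://a.example/post", True, 50_000, False, 95.0),
    Match("https://b.example/post", False, 200, True, 20.0),
    Match("https://c.example/post", True, 1_000, True, 100.0),
]
for m in matches_that_matter(sample, ads_only=True, unlinked_only=True):
    print(m.url)
```

Each filter is off by default, so a publisher only narrows the list along the dimensions they care about.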
A website called Defend Fair Use just launched, alleging that large media and content companies are misrepresenting consumer rights under copyright law. This initiative is led by the Computer & Communications Industry Association, a nonprofit backed by Google, Microsoft and Yahoo, among others.
While we welcome more discussion among these players about the contours of consumers’ rights and copyright law, it’s ironic that the same companies alleging exaggerated copyright notices are profiting from duplicate content.
“Big Content” and “Big Technology” are clearly trying to spin the issue. To clarify, let me break down the four factors of Fair Use and show where Attributor can provide objective metrics to guide Fair Use determination . . . without boring you to death.
Factor 1: The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.
Detectable. While Attributor won’t identify if the usage is transformative, we automatically detect if the page on which reuse occurs has advertising present. As evidenced by recent moves by the New York Times, advertising is clearly driving the online content economy, making commercial use an increasingly important factor.
Also, you can learn a lot about the purpose and character of a use by whether or not attribution is provided, which in the online world, amounts to links from the copy to the original - we report back on attribution for every match we find.
Factor 2: The nature of the copyrighted work.
Not Detectable. Sorry, we can’t determine whether your content is fiction or non-fiction, but we’ll add it as a feature request!
Factor 3: The amount and substantiality of the portion used in relation to the copyrighted work as a whole.
Detectable. This is a fancy way of saying that the less of your content that is taken, the more likely it qualifies as Fair Use. For each match, we report back on the percentage of the original content that has been reused.
Factor 4: The effect of the use upon the potential market for or value of the copyrighted work.
Detectable. Not only will we indicate if ads are present on the reusing site, but we will also provide the amount of monthly traffic for the site. We’re also adding functionality that will help you understand the impact of content reuse on your ranking in search engines. As noted in our Harry Potter research, much of content reuse is occurring on sites that appear higher in search engine rankings than the original content owner. This can have a major impact on the relative market value of the original work.
Attributor won’t remove all the emotion from the room in copyright discussions, but it will provide an objective means to evaluate Fair Use disputes and (hopefully) result in less litigation and less posturing between “Big Content” and “Big Technology”.
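As a rough illustration of how the measurements above line up with the four factors, here is a small sketch that groups per-match metrics under the factor each one informs. The function and key names are hypothetical, and the code only organizes evidence; it does not decide whether a use is fair.

```python
def fair_use_signals(has_ads, links_back, percent_copied, monthly_visitors):
    """Group per-match measurements under the Fair Use factor they inform."""
    return {
        # Factor 1: commercial nature (ads) and attribution (link back).
        "factor_1_purpose_and_character": {
            "commercial": has_ads,
            "attributed": links_back,
        },
        # Factor 2: described above as not detectable.
        "factor_2_nature_of_work": None,
        # Factor 3: how much of the original was reused.
        "factor_3_amount_used": {"percent_copied": percent_copied},
        # Factor 4: ads plus audience size as a proxy for market effect.
        "factor_4_market_effect": {
            "has_ads": has_ads,
            "monthly_visitors": monthly_visitors,
        },
    }

# Made-up values for a single match.
report = fair_use_signals(has_ads=True, links_back=False,
                          percent_copied=85.0, monthly_visitors=120_000)
for factor, signals in report.items():
    print(factor, signals)
```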