Category Archives: zemanta

When ReadWriteWeb mentions your Twitter account

Yesterday two of the Twitter accounts I control were mentioned in the ReadWriteWeb article 50+ Semantic Web Pros to Follow on Twitter: @zemanta (my company’s account) and @gandalfar (my personal account).

This let me measure how many new followers I would get as a consequence of the story. Today TweetDeck showed me this for my personal account:

TweetDeck followers counter

(On quiet days, when I’m at home, I normally get one or two new followers a day.)

What’s more interesting is that most of these ~50 people also followed the other account, as if they had added everyone on the list.

Does anyone else have experience with such a rush of new followers?


How to get most out of the conference (tip #1)


Going to conferences is an exciting activity: you get to spend a few days in a completely different environment, among peers, listening to exciting people you often look up to. But most trade-show conferences today are not organized like academic conferences, where you take home a full binder of proceedings that lets you study the presented material in some depth and reference it later.

There is a very simple trick for keeping up with the ideas of the smart person on stage: look up their page and add their blog to your Google Reader. Today’s conference superstars often blog in great detail about the things they present, post slideshows and videos of their presentations at other events, and link to other people in their field with like-minded ideas. If you feel really attached to them, you can also start following them on Twitter or FriendFeed.

The best part is that it’s free and easy to incorporate into your everyday routine, since you most likely already drink your morning coffee while reading last night’s feeds.


Visualizing books using Zemanta and Wordle

Most of my readers are by now probably aware of Wordle, a Java applet that produces neat visualizations of tags. Given that Zemanta recently released an early alpha preview of its API, I was looking for a fun project to showcase some of it.

So for this experiment I’m going to try to visualize some popular classic books, using text files from Project Gutenberg. Technical details are at the bottom of the post.

Jane Austen – Pride and Prejudice

Pride and Prejudice through words and tags
Pride and Prejudice through words
Pride and Prejudice through tags

Herman Melville – Moby-Dick

Moby Dick through words
Moby Dick through tags

George Orwell – 1984

1984 through words
1984 through tags

Technical details

The whole process is done using a simple Python script. The script reads in the text file, breaks it into chunks of 360 words, which is roughly one A5 page, and then sends each chunk to the Zemanta API. It repeats this process for the first 30 thousand words of the book. The limit is arbitrary; I just didn’t want to run the script for too long.
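For anyone who wants to try something similar, here is a minimal sketch of the chunking loop described above. It is not the original script: the function names are made up for illustration, and `suggest_tags` is a hypothetical placeholder for whatever call sends one chunk to the Zemanta API and returns a list of tags.

```python
from typing import Callable, Iterator, List

CHUNK_WORDS = 360    # roughly one A5 page, as described in the post
MAX_WORDS = 30_000   # arbitrary cap, as described in the post


def chunk_words(text: str,
                chunk_size: int = CHUNK_WORDS,
                max_words: int = MAX_WORDS) -> Iterator[str]:
    """Yield consecutive chunks of at most chunk_size words from the start of text."""
    words = text.split()[:max_words]
    for start in range(0, len(words), chunk_size):
        yield " ".join(words[start:start + chunk_size])


def collect_tags(text: str,
                 suggest_tags: Callable[[str], List[str]]) -> List[str]:
    """Run every chunk through suggest_tags and gather all returned tags.

    suggest_tags is a hypothetical stand-in for the real API call;
    it takes one chunk of text and returns a list of tag strings.
    """
    tags: List[str] = []
    for chunk in chunk_words(text):
        tags.extend(suggest_tags(chunk))
    return tags
```

Repeated tags are deliberately kept in the result, since Wordle sizes each word by how often it appears in the pasted text.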

Afterward, I manually pasted the text into Wordle and played with the random function and other details until I liked the image. You can also take a look at my full Wordle gallery.

Lessons learned

I’m especially happy with how 1984 turned out. For this kind of visualization it’s important to choose the source carefully, because that gives more powerful results. I’ll probably continue experimenting with the 1984 text.

Zemanta Pixie

SemanticCamp London, 16-17th 2008

Source: Wikipedia

I am going to present the colours of Zemanta this weekend at SemanticCamp in London. Besides showing the latest secret technology we are developing, I am also hoping that we can have some interesting debates about the usability aspects of the whole thing.

Lessons learned at Barcamp Klagenfurt

On Saturday I visited BarCamp – Senza Confini 2008, a user-generated conference organized in Klagenfurt. Besides giving a presentation on Zemanta, I also took some interesting notes at some of the presentations, which I’m sharing below.

Barcamp time sheet

In the first session, “Time & ideas for blogging”, we heard from the authors of the So Isses blog. They write a number of blogs, and their recommendation is to find a personal routine for blogging. For example, they like to blog in the morning while having their coffee, just writing down ideas as WordPress drafts and then working on them more thoroughly later. They also rely heavily on the “publish in future” feature, which lets them schedule posts for upcoming days. Right now they are already writing articles for mid-March!

Mathias, on the other hand, talked about Zipf’s power law and the long tail story. His proposal is that when you’re analyzing data you plot it on a log/log scale, and if the chart comes out linear you can start investigating whether the data follows these principles. He also presented some statistics about “the bend”, the point at which you start getting exponential growth in your page views. For a lot of online content this is at about 50 users. Check out his presentation.
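The log/log check is easy to try on your own numbers. The sketch below is my own illustration, not something from the talk: it ranks the values from largest to smallest, fits a straight line through the points (log rank, log value) with ordinary least squares, and reports how well the line fits. The function name and the choice of rank as the x axis are assumptions.

```python
import math
from typing import Sequence, Tuple


def loglog_fit(values: Sequence[float]) -> Tuple[float, float, float]:
    """Fit a least-squares line through (log rank, log value).

    Returns (slope, intercept, r_squared). Values are ranked from
    largest to smallest, so rank 1 is the most popular item.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive to take logarithms")

    ranked = sorted(values, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # If all values are equal the line is flat and fits perfectly.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared
```

An `r_squared` close to 1 only says the chart looks roughly linear on a log/log scale. As in the talk, that is a cue to start investigating, not proof that the data follows a power law.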

Max talked about blogs and their connectivity. There are about 800 new German blogs per day. Some services, like blogfever.de and blogrush.com, scan the content that you are reading and then locate content similar to your blog.

He tried to design his own widget using the Yahoo term extractor. The problem with this approach is that when you take your tags and enter them into Google Blog Search, you get only one article: your own. Then you remove some of the tags and suddenly you get 10k articles. The problem is the same for Google AdWords, where 95% of tags do not match closely enough. You get either too few or too many results.
He also created a cool project called Blogsvision.com, which plots blogs and the connections between them on Google Maps. His experience is that six degrees of separation does not hold for blogs. Some blogs are totally separated from the others, on their own islands.
