Showing a story of Kiberpipa’s intranet opensource project

Kiberpipa is a large organization with lots of volunteers. This means that whatever you do, you’ll have organizational problems and you’ll see technology as a way to solve them, to a certain degree of course. A few years ago Boštjan and I saw this as an opportunity to reinvent the wheel and write our own groupware software. This is how the intranet project (yes, a terrible name from a branding perspective) was born almost 5 years ago.

Since then a number of people have picked up the project and used it to improve their Django skills as well as to help Kiberpipa get a bit more organized by way of technology. While learning my way around some video editing software, I threw together a video of the commits made to intranet’s code repository. The project used to generate the frames is code_swarm, an open source Java-based tool. I’d encourage you to try it out and run it on your own source repositories.

This is the story of the following video. Please watch it in ‘full screen’ for the best experience:

Kiberpipa Intranet Codeswarm (2006 – 2010) from Jure Cuhalev on Vimeo.

Visualizing Slovenian IT tax spending

My latest released project focuses on visualizing Slovenian IT tax spending (139 million euros). The idea is to take otherwise meaningless numbers and display them visually in a way that tells a story of who is spending how much and on what. The data set comes directly from the government as a semi-clean XLS file. The technique I decided on is a treemap, which represents the data as boxes whose sizes are relative to each other.
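To make the idea of box sizes relative to each other concrete, here is a minimal Python sketch of the simplest treemap layout, slice-and-dice. It is only an illustration and not the code behind the visualization; the function name, the institution labels, and the amounts are all made up.

```python
def slice_and_dice(items, x, y, width, height, vertical=True):
    """Split a rectangle into strips whose areas are proportional to each value.

    items: list of (label, value) pairs with positive values.
    Returns a list of (label, x, y, width, height) tuples.
    """
    total = sum(value for _, value in items)
    rects = []
    offset = 0.0
    for label, value in items:
        share = value / total
        if vertical:
            # Strips run left to right; each takes its share of the width.
            rects.append((label, x + offset, y, width * share, height))
            offset += width * share
        else:
            # Strips run top to bottom; each takes its share of the height.
            rects.append((label, x, y + offset, width, height * share))
            offset += height * share
    return rects


# Made-up amounts, purely for illustration (not taken from the government file).
example = [("Institution A", 60.0), ("Institution B", 30.0), ("Institution C", 10.0)]
layout = slice_and_dice(example, 0, 0, 100, 50)
for rect in layout:
    print(rect)
```

Real treemap libraries usually apply this recursively and alternate the split direction per level, or use a squarified variant to avoid very thin boxes.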

Give it a try for yourself:

Launch the interactive Slovenian IT tax visualization (in Slovenian)

While the visualization itself is nice, there are two points that you have to be careful about when releasing such visualizations to the public:

Transparency of data and data transformations

In my case, the data set came directly from the government. In order to make sure that everyone can check my calculations I’ve included links to their file as well as provided a local copy in case their version changes or disappears.

You’re losing and reinterpreting data with every visualization. That is why it’s important to also include the transformation scripts, so that others can check your work and possibly build on top of it, or at least make sure that you didn’t do anything tricky with the data.

I’ve opted for a GitHub repository where I’ve pushed all the associated files: http://github.com/gandalfar/itproracun. It’s a bit chaotic, but it should be pretty self-explanatory to any Python and JavaScript developer.
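For readers who have not written such a transformation script before, the following Python sketch shows the general shape of one: flat spreadsheet rows grouped into a nested tree that a treemap library can consume. It is not the script from the repository. The row layout, the `build_tree` name, and the sample rows are hypothetical, and the `$area` key reflects my understanding of the JavaScript InfoVis Toolkit's treemap JSON format, so check it against the toolkit's documentation.

```python
import json
from collections import defaultdict


def build_tree(rows, root_name="IT spending"):
    """Group (institution, category, amount) rows into a nested tree."""
    totals = defaultdict(lambda: defaultdict(float))
    for institution, category, amount in rows:
        totals[institution][category] += amount

    children = []
    for institution, categories in sorted(totals.items()):
        leaves = [
            {
                "id": f"{institution}/{category}",
                "name": category,
                "data": {"$area": amount},  # assumed key name, see note above
                "children": [],
            }
            for category, amount in sorted(categories.items())
        ]
        children.append({
            "id": institution,
            "name": institution,
            "data": {"$area": sum(categories.values())},
            "children": leaves,
        })
    return {"id": "root", "name": root_name, "data": {}, "children": children}


# Hypothetical rows; the amounts are placeholders, not real figures.
sample_rows = [
    ("Police", "hardware", 5.0),
    ("Police", "software", 1.0),
    ("Tax Office", "software", 4.0),
    ("Police", "hardware", 2.0),
]
tree = build_tree(sample_rows)
print(json.dumps(tree, indent=2))
```

Publishing a script like this next to the source file lets anyone rerun the aggregation and compare the totals with what the visualization shows.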

Telling the story
Every data visualization is trying to tell a story. It might not be obvious to the visualization author but it helps to identify this early in the process.

I started with just a simple breakdown based on the institutions:

It’s very noisy and it’s hard to compare different institutions to each other. Initial comments on this were that it wasn’t shiny enough. After cleaning up the interface, I came to the following revision:

It’s much cleaner, and it basically showed that I needed to find an angle on this data. I decided to focus on the ratio between software and services vs. hardware and network equipment. The final version now tells a story of how the police spend a lot of their IT money on network and hardware equipment, while the Tax Office spends much more on software and services.

The agenda of this last version of the visualization should be clear to anyone who takes a few moments to study it.
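The ratio behind that story is simple arithmetic. As a rough Python sketch (the category keys and the numbers below are placeholders I made up, not values from the data set):

```python
def software_share(spending):
    """Return the fraction of IT spending that went to software and services.

    spending: dict mapping a category name to an amount.
    """
    soft = spending.get("software", 0.0) + spending.get("services", 0.0)
    hard = spending.get("hardware", 0.0) + spending.get("network", 0.0)
    total = soft + hard
    return soft / total if total else 0.0


# Placeholder amounts for a hypothetical institution.
example_spending = {"software": 3.0, "services": 1.0, "hardware": 4.0, "network": 2.0}
share = software_share(example_spending)
print(f"software and services: {share:.0%}")
```

A single number per institution like this is easy to map to a color, which is one way to make the difference visible at a glance.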

Other lessons learned

A visualization toolkit should be powerful on one hand, but offer first results without too much work on the other. The JavaScript InfoVis Toolkit does this job very well. There are some interesting tidbits that are not entirely clear from the documentation, but they become obvious once you start thinking about how the rendering works.

The biggest time sink is parsing and cleaning up the data. Don’t expect the .xls file to make any sense from a programmatic point of view, even though it mostly looks fine when viewed manually. Small parsing errors, moved cells and strange line breaks made parsing this data the biggest challenge.
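As an illustration of the kind of cleanup this involves, here is a small Python sketch of two helpers: one that collapses stray line breaks inside a cell, and one that parses amounts. Both function names are hypothetical, and the assumption that amounts use European formatting ("." for thousands, "," for decimals) is mine; adjust it to whatever the file actually contains.

```python
import re


def clean_text(cell):
    """Collapse line breaks and repeated whitespace inside a cell value."""
    return re.sub(r"\s+", " ", str(cell)).strip()


def parse_amount(cell):
    """Parse an amount such as '1.234,56' or ' 1 234,56 EUR' into a float.

    Returns None when the cell holds no recognisable number.
    """
    if isinstance(cell, (int, float)):
        # Spreadsheet readers often hand back numeric cells as numbers already.
        return float(cell)
    text = re.sub(r"[^\d,.\-]", "", str(cell))
    if not re.search(r"\d", text):
        return None
    # Assumption: '.' groups thousands and ',' marks decimals.
    text = text.replace(".", "").replace(",", ".")
    try:
        return float(text)
    except ValueError:
        return None


print(clean_text("Ministry of\n  Finance "))
print(parse_amount("1.234,56"))
print(parse_amount("n/a"))
```

Returning `None` instead of raising makes it easy to collect the rows that failed to parse and inspect them by hand.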

Big thanks go to the Slo-Tech community and my brother, who gave valuable feedback during development.

I hope you’ve enjoyed this visualization. Let me know in the comments what other points of view you’d like to see, as well as your ideas on how to improve it further.

10 Innovative online visualization tools

Here is a quick reference chart of online visualization tools. It’s here mostly for my own reference, and I plan to update it as I discover and test new ones, which are being created on an almost daily basis.

In no particular order:

  1. http://verifiable.com/ – Turn any set of numbers into an explanatory picture with Verifiable.com
  2. http://manyeyes.alphaworks.ibm.com/manyeyes/ – Shared visualization and discovery
  3. http://www.swivel.com/ – Swivel’s mission is to make data useful so people share insights, make great decisions and improve lives.
  4. http://www.icharts.net/ – iCharts is a web services company that is creating a new market for online publishing and transactions around public and private charts.
  5. http://timetric.com/ – Making data useful
  6. http://www.trackngraph.com/ – The easy way to track and graph information
  7. http://widgenie.com/ – The all powerful data visualizer
  8. http://www.trendrr.com/ – Track, compare and share data, free. Identify trends across social graphs and networks, realize the potential of p2p, track engagement metrics, look at what is really happening, real time.
  9. http://www.wordle.net/ – Beautiful word clouds
  10. http://www.gapminder.org/upload-data/motion-chart/ – Gapminder/Google motion charts

Did I miss any? Leave a comment and I’ll update the post.
