With the election of the new Slovenian prime minister, we also got the formal release of a coalition agreement. Since it’s a 72-page document, I was wondering which keywords would stand out. Here is the result:
![Pogodba za Slovenijo 2012 - 2015 - word cloud (top 80 words)](https://www.jurecuhalev.com/blog/wp-content/uploads/2012/01/sds-80-rc1-550x363.png)
While we’re at it, we can also take a look at the coalition agreement that Pozitivna Slovenija prepared. Running it through the same process, we get:
![Koalicijska pogodba - Pozitivna Slovenija - 2012](https://www.jurecuhalev.com/blog/wp-content/uploads/2012/01/ps-80-rc1-550x358.png)
A few words on how to reproduce this:
- Grab your favorite OCR software and convert the scanned PDF into a .docx file
- From Word, save it as a .txt file
- Lemmatize the words so that inflected forms of the same word are counted together
- Remove stop-words (in this case mostly: ministrstvo*, vlada, slovenija*, ...)
- Drop the resulting text into wordle.net
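
The lemmatize, filter and count steps above can be sketched roughly in Python. This is only an illustration, not the process used for the images: `TOY_LEMMAS` is a hypothetical lookup table standing in for a real lemmatizer, the function names are made up for this example, and the trailing `*` in a stop-word is assumed to mean "any ending". Counting the top 80 words is what the word cloud tool does on its own, so it is included here only to show what ends up in the picture.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical stand-in for a real lemmatizer: inflected form -> lemma.
TOY_LEMMAS = {
    "vlade": "vlada",
    "vlado": "vlada",
    "reforme": "reforma",
    "reformo": "reforma",
}

# Stop-word patterns from the list above; a trailing * is assumed
# to match any ending.
STOP_PATTERNS = ["ministrstvo*", "vlada", "slovenija*"]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def lemmatize(token):
    """Return the lemma if the toy table knows the token, else the token."""
    return TOY_LEMMAS.get(token, token)


def is_stop_word(word):
    """Check the word against every stop-word pattern."""
    return any(fnmatchcase(word, pattern) for pattern in STOP_PATTERNS)


def top_words(text, n=80):
    """Count lemmas that are not stop-words and return the n most common."""
    lemmas = (lemmatize(token) for token in tokenize(text))
    counts = Counter(word for word in lemmas if not is_stop_word(word))
    return counts.most_common(n)


if __name__ == "__main__":
    sample = "Vlada reforme reformo ministrstvo ministrstvom davki reforma"
    for word, count in top_words(sample):
        print(word, count)
```

On the sample sentence, the three forms of "reforma" collapse into one entry, while "vlada" and both "ministrstvo" forms are dropped.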
If it’s not a secret, I’m curious which software you used for the lemmatization?
Probably: http://lemmatise.ijs.si/Software
Thanks, Domen. That’s the one.