WordPress Analytics Engine


I think I’m obsessed with numbers. They give me a feeling of control. Page views, trends, visitor counts and more. Not measuring things makes me sad. If we have historical data, we can check if our changes worked. Are certain topics more popular? Which stories are more popular on Twitter compared to Facebook? There is an infinite number of questions.

I’d like to build a better analytics engine for us. I’ll explain my constraints and how I’d approach it. My primary hope is that someone will say “just use X”. If that fails, I can still build it.

Problem definition

We have multiple WordPress installations with overlapping authors. They create blog posts that are shared to Facebook and Twitter. Each post can include multiple embeds that the authors produce: SoundCloud, YouTube and similar. For each blog, we can also query the Google Analytics API and get stats on sessions, page views and time on site.

There are two primary limitations of these external data sources. First, we’re rate limited, so we can only query them about once a day per post URL. Second, we mostly get aggregated data.

Since we would query these external APIs about once a day and only get aggregate totals from them, we have to compute the daily values ourselves. Facebook, for example, only gives us the total number of likes, so we need to subtract the previous day’s value. This way we get the number of likes for that day.
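The subtraction step can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `daily_deltas`, the dictionary layout and the sample numbers are made up, and a real version would read the snapshots from wherever the daily API results are stored.

```python
from datetime import date


def daily_deltas(snapshots):
    """Turn cumulative totals into per-day increases.

    `snapshots` maps a date to the running total reported on that day.
    The first snapshot has no predecessor, so it yields no delta.
    """
    days = sorted(snapshots)
    deltas = {}
    for previous, current in zip(days, days[1:]):
        # Totals can shrink (for example when likes are withdrawn),
        # so a negative delta is kept rather than clamped to zero.
        deltas[current] = snapshots[current] - snapshots[previous]
    return deltas


# Hypothetical like totals for one post URL, fetched once a day.
likes = {
    date(2015, 7, 1): 120,
    date(2015, 7, 2): 150,
    date(2015, 7, 3): 145,
}
print(daily_deltas(likes))
```

If a daily fetch is missed, the next delta covers the whole gap, so a real implementation would probably want to store the fetch date next to each value.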

Potential

Having all the information in one place would allow us to do a couple of things:

  • Weekly reports for authors – sending them encouragement on how their stories did
  • Information for content editors on what got the most attention that week
  • Identify old content that suddenly got interest
  • Get information on the success of embedded content (SoundCloud, YouTube)
  • Develop customised indicators – authors with the most viewed YouTube videos

Potential Solution?

While researching this topic, I found a software stack that almost fits: Logstash with Kibana, usually run together with Elasticsearch as the ELK stack. Logstash collects and processes incoming data, Elasticsearch stores and indexes it, and Kibana supports displaying the data in many different ways.

The other approach would be to just code it in any web framework. But that seems like a huge duplication of work.

Technical Questions

Would the ELK stack work? Can Logstash provide an input filter that will automatically normalise data for me? Is it the right technology stack at all?

Is there anything that solves this in a much better way?

Content Questions

Is it worth building this at all? Is it a good idea to attach numbers to (journalists’) work? Did I miss any questions that would be worth exploring?

Would you use such a service?

CC Balazs Sprenc - https://flic.kr/p/7ref1f

Ljubljana Tech Community – We have a problem

Let’s be honest: our tech community is slowing down. We’re organizing fewer meetups than in previous years. Without them, we’re losing the culture of sharing. While everyone focused on entrepreneurship and startups, we took engineers and developers for granted. This means that we now have fewer senior developers available than a few years ago. The good ones are not working for local companies anymore. So who trains the novice developers?

This will be a problem for many different groups of people.

Developers need to talk to other developers

We’re not challenging each other with new approaches and ideas. Reading on the internet about some new tech is not enough. You want to have a discussion with a person who has just implemented it. Building on the experience of your peers is the best way to kickstart a project.

Technical leads need success stories

Nobody wants to be the first to try something new. Seeing smaller success stories makes it easier to start your own. You can also contribute to a small niche of consulting and training. If your company starts to invest in one technology, more people will join you. That way you’ll have access to extra talent if you need it.

Entrepreneurs need their developers to be able to teach

Great tech companies brag about their solutions. Not the CEOs, but their developers. Every chance they have, they’ll talk about hard problems and how to solve them. In a world where we keep producing CRUD applications, it’s hard to find novelty.

But that only happens if the company encourages them. First they have to see their CTOs and lead developers give talks, and their company sponsor events and host them at its offices. Once you have good role models in place, it’s much easier for everyone. The role models have to encourage the younger ones and give them space, so they can practice. It also means that they recognize that learning goes both ways.

This has an added side effect: it makes it easier to recruit new developers.

OK, enough problems – what are the solutions?

The easy way: send everyone to conferences abroad. There are many excellent conferences around Europe, and many of us are there. But it’s often the same people. The same people that already give talks and organize events. It’s also likely that they’re not working for a local company. Distributed companies are not picky about where their developers live.

It’s also rather expensive.

So that leaves us with the hard solution. I checked the stats of a few different meetup groups. Most of them seem to do 2 to 3 meetups per year. The speakers come from a small group of developers.

That’s troubling, because it leads to burnout and lack of new ideas. There is only so much you can learn in a year.

To make this work, we need to inspect the building blocks of a good community. We already have shared interests in different technologies. We wouldn’t be visiting these events otherwise.

What we need to figure out

1. How do we get more people to organize events? Can we encourage a small group of people to take care of a certain domain? That way the pressure is not on the individual.

2. How do we find new IT companies? The Slovenian IT sector is larger than the 5 startups we see everywhere. It’s good to see them paying it forward, but what about the others? If you look at all the companies that are hiring, why are they not present? Why don’t we see CTOs from marketing agencies talk and sponsor?

3. What kind of support do they need? Is it a problem to find a space, make announcements or something else? Can we build a network of individuals that can help out? That way we don’t need to negotiate about the space every single time.

4. What else? Is it something simpler, like just having a set schedule and following it?

When I wrote the description of the local WebCamp, it was simple. We’re doing this event because that’s the kind of community we want. I believe having an active meetup scene follows the same ideals.

But it’s an issue that is bigger than any individual. We need to recognize that it’s a problem and start doing something. Not just organizing events, but getting more allies on our side. We need to teach business and marketing professionals the importance of a tech community.

We need to make sure that we’re building the community that we want.

Jeff Kubina - https://flic.kr/p/eACKE

How to Super Charge your WordPress with Microservices

Confession: I’m amazed at the authoring experience that WordPress provides. Users are productive from the first hour and the UI doesn’t get in their way. The story is similar for developers, if you stay inside the world of theme development. Try to do anything more complex, with tests, and it gets complicated. That is where (micro)service oriented architecture can help us.

At the Open Education Consortium, we use a mix of all the different approaches.

1. Let WordPress call external APIs


At OEC, we provide Course search functionality. While I could build it inside WordPress, it wouldn’t be optimal. I’ve developed it in Django with REST Framework. We get API documentation for free and our main WordPress site uses the same calls as everyone else.

To call an external API, you’ll need to hook into the template_redirect and init actions and the query_vars filter.

This way, you can format output inside your WordPress theme.

2. Let JavaScript call external APIs


This is not strictly WordPress, but it solves the other half of the problems you might have. If you don’t need to have output visible to Search Engines, you can render it with JavaScript. In one example, we’re showing a map of where our members are. We have a separate portal, where members can edit their contact information.

This portal exposes a GeoJSON API. On the WordPress side, the Leaflet library calls the GeoJSON endpoint and displays the result on the map. The data is always up to date and we’re using the best tool for the job.
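To give an idea of what such an endpoint returns, here is a minimal Python sketch that turns member records into a GeoJSON FeatureCollection. The function name `members_to_geojson`, the record fields (`name`, `lat`, `lon`) and the sample member are assumptions for illustration, not the actual portal code.

```python
import json


def members_to_geojson(members):
    """Build a GeoJSON FeatureCollection from member records."""
    features = []
    for member in members:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [member["lon"], member["lat"]],
            },
            "properties": {"name": member["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical member record; the field names are assumptions.
members = [{"name": "Example University", "lat": 46.05, "lon": 14.51}]
print(json.dumps(members_to_geojson(members), indent=2))
```

A web view would serialise this dictionary as the JSON response, and the map layer on the WordPress side can consume it without knowing anything else about the portal.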

3. Embed your WordPress output


For our Directory feature, I needed a flexible membership system. I ended up writing it in Django with Solr. Django provides powerful form editing that I didn’t want to replicate inside WordPress.

What I ended up doing is exposing an API from WordPress that returns the header and footer HTML. That way Django can just include them as 3rd party content. It also simplifies keeping the themes in sync.
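The Django side of this idea could look roughly like the sketch below. Everything here is an assumption made for illustration: the class name `ThemeFragments`, the fragment names `header` and `footer`, the cache lifetime, and the `fake_fetch` function that stands in for an HTTP request to the WordPress endpoint. The caching itself is also my addition, so that not every page view triggers a call to WordPress.

```python
import time


class ThemeFragments:
    """Caches the header and footer HTML served by the WordPress site."""

    def __init__(self, fetch, ttl_seconds=300):
        # `fetch` takes a fragment name ("header" or "footer") and returns HTML.
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._cache = {}

    def get(self, name):
        cached = self._cache.get(name)
        now = time.monotonic()
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        html = self._fetch(name)
        self._cache[name] = (now, html)
        return html

    def wrap(self, body_html):
        """Place locally rendered HTML between the WordPress header and footer."""
        return self.get("header") + body_html + self.get("footer")


# Stand-in for an HTTP call to the (hypothetical) WordPress endpoint.
def fake_fetch(name):
    return {"header": "<header>OEC</header>", "footer": "<footer>OEC</footer>"}[name]


fragments = ThemeFragments(fake_fetch)
print(fragments.wrap("<main>Member directory</main>"))
```

Passing the fetch function in from outside keeps the wrapper testable without a running WordPress site.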

I could do it using iframes, but then I would have problems with URLs and the size of the iframe content.

There is also a possibility for a future upgrade. Because we’re running on the same domain, we can read cookies from WordPress. This would allow us to check if a user is logged into the WP Dashboard and elevate their privileges inside the Django app.

Conclusion

Modern web applications are moving away from monolithic, one-size-fits-all solutions. While it’s common to see such infrastructure in the backend, it’s time we started embracing it on user-facing sites too. I wish some of the infrastructure inside WordPress would make it easier, but I guess that’s an opportunity for a new library.

PolyConf 2015

Notes from PolyConf 2015 – Conference for developers who code in multiple languages

As a developer, I feel that I work inside my technology bubble. I keep using the same tools. Because of this, it takes a lot of effort to learn new development paradigms and languages.
PolyConf was a small conference, with fewer than 300 people, in beautiful Poznan, Poland. The breadth of content, participants and organisation made it stand out.

I’ve made some notes on the impressions that the talks and discussions left on me.

Types and Immutable Data Structures

Most of the talks were building on top of functional programming paradigms. There are clear benefits to that approach as we move towards multi-core, distributed computing. I can’t wait to see some of these concepts in upcoming Python 3 releases.

In this regard, Facebook’s Flow provides an alternative way to get types in JavaScript. This has benefits over rewriting your code in TypeScript or ClojureScript.

Language for Every Problem

As we’ve seen in the Python community, change is hard. That is why development is happening in new languages. Some decide to build a whole ecosystem, while others compile to a common VM or language.

Two of the presented languages, Crystal and Elixir, were both influenced by Ruby. I would like to understand what it is about Ruby syntax that makes it a good basis for language development.

New languages also come with new frameworks. Phoenix for Elixir is one of them. It is a scalable web framework that uses streams to keep processes separated and gain speed benefits. A year ago, I would have been skeptical about using a new web framework. But with microservices and Single Page Apps, it looks like there is an opportunity to experiment. We’ve offloaded most of the representation and business logic to clients.

Lots of Practical Advice

One of the benefits of being a polyglot is that you can use official libraries. A lot of good things are happening on the JVM stack that we can’t just ignore.

Beware of defaults – they are usually not optimal for your use case.

The main benefit of knowing lots of languages is being able to borrow different concepts. If you want to explore different concepts, consider:

Data Science

Jupyter Notebook with Pandas looks like the new default for starting with data analysis. Continuum Analytics are sponsoring a powerful ecosystem. I’ve learned that there is a library called PySpark that introduces out-of-core computing support. This allows Pandas-style analysis to scale even better.
The demo of the Julia language showed the power of implicit types and the great things they can do for scientific computing. Worth checking out, especially now that Jupyter supports it.

But wait, there’s more

  • Racket – A programmable programming language
  • miniKanren – embedded Domain Specific Language for logic programming.
  • Emojilisp – because, why not.

Conclusion

I went to the conference without any expectations. For the most part I assumed it would be way too complicated for me. But I met a group of passionate people who were happy to explain in simple words why they love the languages they work in.

We should figure out how to bring such levels of discussion and participation to our other events.

PolyConf 2016 is already on my conference list for the next year.