Roman’s Meaningful work in Bioinformatics

In Meaningful work interviews I talk to people about their area of work and expertise to better understand what they do and why it matters to them.


Roman Luštrik shares his story of how he went from being a veterinary technician to graduate school and a research position at a university, and then to the private sector, where he is helping cure cancer with statistics, biology, and data analytics.

What’s your educational background?

At the high school level, I trained as a veterinary technician, since I always loved working with animals. Afterward, I wanted to study to become a veterinarian at the university level, but couldn’t meet the necessary entry requirements. Thanks to a string of lucky coincidences, I ended up studying Biology at the university level two years later. Strictly speaking, I graduated in Biology and also got my Ph.D. at the same institution.

How did you get your previous job?

While I was writing my undergraduate thesis, I started doing more data analysis and learning how to code in the R language. I realized that I really enjoyed it and that there were people around me interested in similar subjects. I started co-organizing meetups to discuss this area with like-minded people. That’s where my previous boss approached me and offered me a job at the Biology department, doing research in theoretical biology. In practice, that also included programming simulations and robots, as well as IT assistance inside the department. I stayed with them for the next eight years.

What prompted you to change your job?

During the whole time, I was on various types of contracts, depending on funding and how we collaborated. As my last 2-year contract was coming to an end, they didn’t seem to be in a hurry to extend it again. At the same time, I was approached by my current manager, who pitched their company and invited me to join them. I passed all the interviews, and soon after that, I started working as a bioinformatician.

Did you have any concerns before switching?

Not really. By that time, I was already unhappy with how things were being done at the University. There was a lot of bureaucracy and many excuses for why things couldn’t be done or processes couldn’t be changed. As an example, buying a simple lab thermometer was a week-long process with a lot of paperwork. It was tiring, and looking back, I had already wished that I could change my job. So the new opportunity was a perfect chance to try something new. I was especially excited about new ways of working and team collaboration. I was also never concerned about not finding a new job if this one didn’t pan out. I have a history of doing random jobs, so I was confident I could find something else to do fairly quickly.

How was the initial experience for you in your current job?

It’s a completely different type of work. I had to learn how to use many new tools, I need to write more code, and it’s a new field of work: molecular biology. I really disliked this field of biology during my studies. Interesting aside: this is not the first time this has happened. I also tried to avoid mathematics during my studies but ended up doing my Ph.D. in statistics.

It turned out that in the new job, I got what I wanted. Working in this team is a completely different experience. It’s like working with a friend who is constantly making sure that you have everything you need to be successful at your work.

Can you explain in simplified terms what your company does?

We build tools to analyze very specific biology datasets in the field of immuno-oncology. We’re helping our customers figure out different tricks to convince the immune system to attack cancer in the human body. It’s a truly multi-disciplinary field, connecting genetics and immunology as far as the biology part is concerned. There are also a lot of other roles involved in data science, visualization, and the tools that support this analysis work.

Who do you think would really thrive in your field of work?

Some educational backgrounds are a natural fit for this work: physicists, biochemists, biotechnologists, microbiologists. At the same time, it’s a very complex field, and it’s important to have many different experts who can work with and support each other. We also employ people with a background in mathematics or computer science. Of course, there are a lot of other people in non-science roles who make sure that the business is running so that we can do our work. There’s also no real reason that somebody with a non-science education couldn’t work in our research team. As long as they are interested in this field and want to learn, there are a lot of opportunities for them to make an impact.

What’s your advice for people who are thinking of making a change?

If you’re already thinking about it – then you should do it sooner rather than later. If you’re unhappy but not yet thinking about making a change – then you should make one too. In general, try to build social support and have a financial emergency fund. When you’re ready to make a switch, take your time to think about what you want to do next.

What I learned from talking with Roman

There are many opportunities for people who are interested in a wide range of things and are not afraid to ask questions. With this in mind, if you find that you’re bored in your current position, that’s a good opportunity to use that to find something new.

Conditional Cache Mixin for Django DRF

For a project I’m doing, I’m looking at adding support for a conditional header that bypasses the cache when an X-No-Cache header is present. In my case, this allows an external system to bypass the cache when certain conditions are met.

I’ve modified code from Django REST Framework extensions to allow for such behaviour. There might be a better way to do it, but at the moment, the flow of the code is clear to me. It requires drf-extensions, as it’s just an additional mixin that offloads the work to the cache_response decorator.

from rest_framework_extensions.cache.decorators import cache_response
from rest_framework_extensions.settings import extensions_api_settings


class BaseCacheResponseMixin(object):
    object_cache_key_func = extensions_api_settings.DEFAULT_OBJECT_CACHE_KEY_FUNC
    list_cache_key_func = extensions_api_settings.DEFAULT_LIST_CACHE_KEY_FUNC


class ConditionalListCacheResponseMixin(BaseCacheResponseMixin):
    @cache_response(key_func="list_cache_key_func")
    def _cached_list(self, request, *args, **kwargs):
        return super().list(request, *args, **kwargs)

    def list(self, request, *args, **kwargs):
        if request.META.get("HTTP_X_NO_CACHE") == "1":
            return super().list(request, *args, **kwargs)
        else:
            return self._cached_list(request, *args, **kwargs)


class ConditionalRetrieveCacheResponseMixin(BaseCacheResponseMixin):
    @cache_response(key_func="object_cache_key_func")
    def _cached_retrieve(self, request, *args, **kwargs):
        return super().retrieve(request, *args, **kwargs)

    def retrieve(self, request, *args, **kwargs):
        if request.META.get("HTTP_X_NO_CACHE") == "1":
            return super().retrieve(request, *args, **kwargs)
        else:
            return self._cached_retrieve(request, *args, **kwargs)


class ConditionalCacheResponseMixin(
    ConditionalRetrieveCacheResponseMixin, ConditionalListCacheResponseMixin
):
    pass
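The core of the pattern is simple: the public method inspects a request flag and routes either to a cached helper or straight to the uncached parent implementation via the MRO. To sanity-check that dispatch outside Django, here is a minimal, framework-free Python sketch; the classes and the dictionary cache below are invented for illustration and are not part of drf-extensions:

```python
# Framework-free sketch of the conditional dispatch used by the mixins
# above. All names here are illustrative, not drf-extensions APIs.

class PlainView:
    calls = 0  # counts how often the "expensive" list() actually runs

    def list(self, request):
        PlainView.calls += 1
        return f"fresh #{PlainView.calls}"


class ConditionalCacheMixin:
    _cache = {}  # stand-in for the real cache backend

    def _cached_list(self, request):
        # Cache the parent's response under a fixed key.
        if "list" not in self._cache:
            self._cache["list"] = super().list(request)
        return self._cache["list"]

    def list(self, request):
        # X-No-Cache: 1 bypasses the cache entirely.
        if request.get("HTTP_X_NO_CACHE") == "1":
            return super().list(request)
        return self._cached_list(request)


class View(ConditionalCacheMixin, PlainView):
    pass


v = View()
print(v.list({}))                        # fresh #1 (computed, then cached)
print(v.list({}))                        # fresh #1 (served from the cache)
print(v.list({"HTTP_X_NO_CACHE": "1"}))  # fresh #2 (cache bypassed)
```

The method resolution order does the heavy lifting: `super().list()` inside the mixin reaches the real implementation, just as it reaches DRF's generic view methods in the mixins above.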

Automated form testing in WordPress

I’m developing a Gravity Forms based workflow that is initially handled by a long form. Once the form gets submitted, it creates a custom post in the backend so that the editorial team can review it.

Throughout the years, I’ve discovered that the most annoying and time-consuming part of such development is filling in the form every time you make a change. This is especially important as there are a number of behind-the-scenes actions that trigger only on form submit. I decided to automate it this time.

My approach uses two major components, Cypress Recorder and Cypress itself:

Step 1: Record steps

I used Cypress Recorder to fill in the form and got a result that looked something like this:

describe('Event Form Submission', () => {
  it('Fills in a form', () => {
    cy.visit('http://example.test/event-submit-test/');
    cy.get('#input_2_4').click();
    cy.get('#input_2_4').type('Example Event Title');
    cy.get('#input_2_5').click();
    cy.get('#input_2_5').type('Example Event Description');
    cy.get('#input_2_6').click({force: true});
    cy.get('#input_2_6').type('June 20 - 25, 2022');
    cy.get('#input_2_16').click({force: true});
    cy.get('#input_2_16').type('https://www.example.com');
    cy.get('#input_2_7').click({force: true});
    cy.get('#input_2_7').type('Frankfurt, Germany');
    cy.get('#input_2_8').click({force: true});
    cy.get('#input_2_8').type('{backspace}');
    cy.get('#input_2_8').type('Test Research Limited');

    // .. more of similar to above ..

    cy.get('#gform_submit_button_2').click({force: true});
    cy.url().should('contains', 'http://example.test/event-submit-test/');
  })
})

Step 2: Configure Cypress and install add-ons

By default, the Recorder doesn’t handle scrolling the window, and that makes Cypress unhappy. So I had to add {force: true} to some click() calls to move the window down.

My form also works with file uploads and has a TinyMCE input textarea, so I had to install two add-ons: cypress-file-upload and @foreachbe/cypress-tinymce. They’re both very simple to use:

// file upload
cy.get('#input_2_17').attachFile('sample-picture.jpg');

// tinyMCE modification
cy.setTinyMceContent('input_2_11', 'This is the new content');

Step 3: Enjoy automated form filling while developing backend code

Lessons learned

I originally asked my question on Facebook in a local developers group. I got some good tips that led me down this path. Instead of trying to do something with Chrome extensions, I finally took the time to learn an end-to-end testing framework in a non-React environment. My previous experience with such tools was Selenium, and I’m happy to see that the whole experience was much better this time.

Taking time to set up my dev workflow saved me many hours, and I’m happy that I didn’t blindly start with development.

Lessons from organizing OE Global 2020

Open Education Global Conference is a yearly conference for the Open Education community that a small non-profit, OE Global, co-organizes with a local host institution. For 2020, we planned to hold the conference in mid-November in Taipei, Taiwan. But then Covid happened, and we decided to make it a virtual event.

We had ~125 online sessions, plus an additional 50 posters and presentations that were delivered asynchronously. It happened over 5 days (Mon–Fri) and across all time zones (from Taiwan, through Europe, and across the Americas).

In this blog post, I’ll share some lessons that I learned, mostly about technology and related processes. As it was a big undertaking with more than 20 people involved on the organizing side, I could focus only on my part of the responsibilities.

Some technology works well

Presentation side

For our real-time presentations, we used Zoom in normal ‘meeting’ mode. With attendees joining from everywhere in the world, I rarely saw any problems. People know how to use Zoom, and they have a good enough camera and microphone. We also used a machine-based auto-captioning service by rev.com to generate real-time subtitles. It’s about 90% accurate, and it turned out to be quite a good addition to the presentations.

What was most surprising was how cheap this part of the stack was. Zoom is about 15 USD/month, and rev.com is an additional 20 USD. I’ve looked into their competition (mostly Google Meet and Jitsi), and nobody could beat them on features or pricing. Google doesn’t even seem to offer a competing solution (I can’t just easily buy 5 seats for one month).

I also evaluated Jitsi and BigBlueButton, but they both required a lot of server infrastructure for a worse end-user result. The reason is that when you’re running a global conference where your participants are connecting from different continents, you have to think about peering servers, network latency, and how to globally distribute your servers. This isn’t easy or cheap, so we’d have to go with a commercial vendor or develop this expertise in-house.

Discussion platform and a shared space

For the online discussion platform, we chose Discourse. It turned out to be a good decision, but the flexibility and complexity of the platform made it somewhat challenging at times. Through the planning process, we managed to configure it to our liking and import all the data and users with a series of Python scripts and a custom Discourse plugin that took care of presenting the program.

Developing plugins for Discourse is both a pleasure and a very frustrating experience. It’s a mixture of (an older version of) Ember.js on the frontend and Ruby on Rails on the backend. The issue is that you’re writing a plugin against the mostly undocumented API of Discourse. So in the end, you just have to start reading the Discourse source and be a bit creative in how you approach your code. I think it’s worth it, but next time I’ll have to invest more time in learning Rails so I can figure out better integrations.

I considered going with Sched, as we’ve used it in the past, but I decided that it wasn’t flexible enough for our needs. I also looked into building it on top of WordPress, but I didn’t find any community solutions that seemed like they would work for us.

Before the conference

Since we had a lot of registrations at 25–50 USD per ticket, I decided to go with Stripe, WordPress, and Gravity Forms with the Stripe add-on. For the most part, it worked very well, and Stripe made the whole credit card integration easy. After working with Eventbrite in the past, I’m happy with the choice, as it gave us much more flexibility at lower costs. (We also had the WP + Gravity Forms stack pre-built from previous years.)

For our Call for Presentations, we used EasyChair. It’s a kind of specialized software that is mostly used in the space of organizing academic conferences. I’m not a big fan, but at our scale and number of reviewers, all other options seemed worse or much more expensive. We get about 200 presentation proposals that we distribute to 50 reviewers.

Archiving the conference

We recorded most of the sessions in Zoom, and the recordings were available in Zoom within minutes after a session ended. We then transferred these recordings to YouTube and pasted links back into our Discourse.

I’m not entirely happy with using YouTube as a video hosting solution. It’s a definite trade-off in terms of privacy and control. At the same time, using any other vendor is significantly more expensive. Hosting it ourselves would also be a costly and non-trivial thing to do. This is something that I’d like to revisit for next year to figure out if there are any better options.

Some technology is less friendly

Building the schedule

Building a schedule for 130 sessions across 5 days, multiple tracks, and basically all time zones ends up being a very complex problem. I wish there were better software than Google Spreadsheets that we could use for this. I know there are schedule solvers out there, but I have yet to find one that would make sense for us (at a sensible price point).

Connecting all the technologies

Our back-office stack is Google Workspace, so a lot of Google Spreadsheets, mixed with Slack, Mailgun, and Python scripts. What I found is that it takes a lot of effort to build good workflows where input in one system (e.g. a registration form) triggers a welcome email from another system. If I were building a Software as a Service (SaaS) system, I would make it part of the onboarding process. But with the conference, the system is more of a drip-email campaign. At some point, you release information to everyone, but if a person registers after that date, you then need to both back-fill the old information and send them the new one.
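As a rough illustration of that back-fill problem, here is a hypothetical Python sketch (the announcements, dates, and function name are all invented) that computes which past releases a late registrant has missed:

```python
# Hypothetical back-fill helper: anyone who registers after an
# announcement has already gone out should still receive it.
from datetime import date

announcements = [
    {"subject": "Welcome pack", "sent_on": date(2020, 10, 1)},
    {"subject": "Schedule published", "sent_on": date(2020, 10, 20)},
    {"subject": "Joining instructions", "sent_on": date(2020, 11, 10)},
]

def backfill_for(registered_on):
    """Return the subjects of announcements a new registrant missed."""
    return [a["subject"] for a in announcements if a["sent_on"] <= registered_on]

print(backfill_for(date(2020, 11, 1)))
# ['Welcome pack', 'Schedule published']
```

The real version would then hand those subjects to the mailer; the point is only that registration date, not a fixed drip schedule, has to drive what each person receives.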

Onboarding suddenly matters a lot

The challenge with moving from an in-person conference to online-only is that your existing workflows assume an in-person experience, but your attendees expect an online experience. Things that you can improvise in person (e.g. writing a new name tag with a pen) become a support issue in an online context. If attendees for some reason didn’t do something correctly online (or their spam filter blocked an email), you don’t have a good way to instantly fix it. It becomes a game of emails and troubleshooting to figure out what the core issue is.

To make this better next time, I’ll try to sketch out all parts of the onboarding process and their potential issues. I guess we’ll also want to track churn and similar concepts from traditional SaaS businesses.

Hallway track still needs work

There are just fewer opportunities to connect with other attendees and ask good questions. We tried a lot of things – different discussion threads in Discourse, drop-in Zoom sessions, interactive tools. They kind of work, but they still don’t replicate the serendipity of meeting new people and listening in on different group conversations during coffee breaks.

I’m hopeful we’ll figure out how to do this in the next few years. It might require a completely new way of thinking and organizing our time.

It’s now harder to be at a conference

Being at the conference is usually associated with a deep dive into a field with intense connection and learning over a few days. When you’re attending from your home office, there’s this tension of trying to still do your work, be with family and also try to follow the conference. I think that trying to do all of these things at once just isn’t possible. I hope we can figure out how to help people take this time for their professional development, without feeling guilty about not attending to some other things.

Overall

Most of all, I’m just excited and surprised how well it all worked. We’ve managed to bring our conference to people and communities that could never afford (or be allowed to) travel to our in-person event. I’m excited at what this means for the future of our field.

Getting Tailwind CSS to work with Roots Sage 9 theme

I’m really enjoying how easy Sage makes WordPress theme development. It’s very different in the beginning, but it soon feels a lot more like working in Django instead of WordPress.

At the same time, I’ve also been trying to use Tailwind for this project. To make it work in production, you need to configure a few more settings for PurgeCSS that the official instructions don’t cover. The trick is that you need to define a TailwindExtractor that doesn’t strip out md:underline, hover:underline, and similar prefixed CSS classes.

Notice that I also scan a few external packages, so that PurgeCSS doesn’t strip their CSS rules.

// webpack.config.optimize.js

'use strict'; // eslint-disable-line

const { default: ImageminPlugin } = require('imagemin-webpack-plugin');
const imageminMozjpeg = require('imagemin-mozjpeg');
const UglifyJsPlugin = require('uglifyjs-webpack-plugin');
const glob = require('glob-all');
const PurgecssPlugin = require('purgecss-webpack-plugin');

const config = require('./config');

class TailwindExtractor {
  static extract(content) {
    return content.match(/[A-Za-z0-9-_:\/]+/g) || [];
  }
}

module.exports = {
  plugins: [
    new ImageminPlugin({
      optipng: { optimizationLevel: 7 },
      gifsicle: { optimizationLevel: 3 },
      pngquant: { quality: '65-90', speed: 4 },
      svgo: {
        plugins: [
          { removeUnknownsAndDefaults: false },
          { cleanupIDs: false },
          { removeViewBox: false },
        ],
      },
      plugins: [imageminMozjpeg({ quality: 75 })],
      disable: (config.enabled.watcher),
    }),
    new UglifyJsPlugin({
      uglifyOptions: {
        ecma: 5,
        compress: {
          warnings: true,
          drop_console: true,
        },
      },
    }),
    new PurgecssPlugin({
      paths: glob.sync([
        'app/**/*.php',
        'resources/views/**/*.php',
        'resources/assets/scripts/**/*.js',
        'node_modules/vex-js/dist/js/*.js',
        'node_modules/mapbox-gl/dist/*.js',
        'node_modules/slick-carousel/slick/slick.js',
      ]),
      extractors: [
        {
          extractor: TailwindExtractor,
          extensions: ["html", "js", "php"],
        },
      ],
      whitelist: [
      ],
    }),
  ],
};
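To see why this extractor preserves prefixed utility classes, the same character class can be exercised in any regex engine. Here's a quick illustration in Python (the sample markup is made up); the key point is that the colon is inside the class, so variant-prefixed tokens survive as a single match:

```python
import re

# Same character class as TailwindExtractor above: letters, digits,
# hyphens, underscores, colons, and slashes. Because ":" is included,
# "hover:bg-blue-500" is kept as one token instead of being split.
pattern = re.compile(r"[A-Za-z0-9-_:/]+")

sample = '<a class="md:underline hover:bg-blue-500 btn">'
print(pattern.findall(sample))
# ['a', 'class', 'md:underline', 'hover:bg-blue-500', 'btn']
```

A default word-based extractor would break those tokens at the colon, so PurgeCSS would never see a selector matching `hover:bg-blue-500` in your templates and would purge it from the production CSS.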