ISMB/ECCB 2017 conference, Day 3

The third day at the conference.

Posted by Federico Tomasi on Mon 24 July 2017

Another day, another round of fresh and exciting ideas. Today it was a bit difficult to follow every talk (remember, I did not have a weekend!). To take it easy, I attended the BioVis track in the morning. I guessed that talks about data visualisation would not demand too much attention. Of course I was wrong.

(Biological) data visualisation

They went into a lot of detail but, as the name of the track suggests, they had nice figures. First came the keynote by Boudewijn Lelieveld, "Visual analytics for spatial transcriptomics: from single cell to tissue and back". The opening question was: what if Netflix used PCA? Well, it might suggest the movie Jaws to a family. For biomedical data, too, PCA is not that great, except for capturing, for example, macroscopic differences. Interestingly, most of the talks focused instead on some variation of the t-SNE algorithm, and it actually seems to work well in such use cases, in particular for brain transcriptomics (as explained in this talk, with its hierarchical evolution H-SNE).

Brain transcriptomics was also the topic of the following talks: Marwan Abdellah's "Reconstruction and visualization of large-scale volumetric models of neocortical circuits for physically-plausible in silico optical studies" and Sjoerd M. H. Huisman's "BrainScope: interactive visual exploration of the spatial and temporal human brain transcriptome". The goal is to reconstruct the brain network with watertight meshes, in order to have realistic simulations of what happens inside the brain.

Here is a collection of photos from the BioVis track.

Deep networks with string data

After that, the last talks of the morning were a bit difficult to follow, so let's get straight to the afternoon, when I switched to the HiTSeq track. The website of course is not as nice as the BioVis one, but the topics were equally interesting, maybe a little more machine learning oriented. One interesting talk was Xu Min's "Chromatin accessibility prediction via convolutional long short-term memory networks with k-mer embedding". Deep networks! I hadn't seen any up to now. Here the networks were used with a string embedding, computed after extracting $k$-mers from the sequences. This is similar to string kernel methods, with the difference that, well, this is not a kernel, and you can plug it into supervised learning with an LSTM. As for the implementation, it can be done with Keras or Theano. So basically the learning is done in two steps:

  1. unsupervised training of $k$-mer embedding;
  2. supervised learning with LSTM.
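To make the input to step 1 a bit more concrete, here is a minimal sketch of how a sequence can be split into $k$-mers and turned into integer ids, which is the kind of input an embedding layer usually expects. This is not the code from the talk: the function names, the toy sequences and the choice of reserving id 0 for padding and unseen $k$-mers are all my own assumptions.

```python
from collections import Counter


def extract_kmers(sequence, k, stride=1):
    """Split a sequence into overlapping k-mers (substrings of length k)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def build_vocabulary(sequences, k):
    """Map each distinct k-mer to an integer id; 0 is reserved for padding."""
    counts = Counter(kmer for seq in sequences for kmer in extract_kmers(seq, k))
    # Sort by descending frequency, then alphabetically, for deterministic ids.
    ordered = sorted(counts, key=lambda kmer: (-counts[kmer], kmer))
    return {kmer: idx for idx, kmer in enumerate(ordered, start=1)}


def encode(sequence, vocabulary, k):
    """Turn a sequence into a list of k-mer ids."""
    # Unseen k-mers map to 0, the same id as padding (a simplification).
    return [vocabulary.get(kmer, 0) for kmer in extract_kmers(sequence, k)]


# Toy DNA strings, made up for illustration only.
sequences = ["ACGTAC", "CGTACG"]
vocab = build_vocabulary(sequences, k=3)
print(extract_kmers("ACGTAC", 3))
print(encode("ACGTAC", vocab, k=3))
```

The resulting id lists are what one would feed first to the (unsupervised) embedding training and then, once each id has a vector, to the supervised LSTM of step 2.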

Throughout the talk I had the suspicion that he was reading the whole time. Nevertheless, it was a nice example of using deep networks with string data. Read more here.

Unfortunately, the weather was very bad, and I did not have time to take any photos anyway. Bad day.