Bioinformatics Seminar

The bioinformatics network aims to promote bioinformatics research at Kiel University and its partner institutes by supplying a framework for an inter-disciplinary scientific exchange and an inter-faculty education program.

Monthly seminars provide a meeting point for scientists and students and are open to everyone who is working on or interested in bioinformatics. Topics at the monthly seminars cover all areas of bioinformatics, including theory, method development, and data analysis in the fields of genomics, transcriptomics, metagenomics, population genomics, systems biology, and biostatistics. Seminars are announced here and also by email to the bioinf mailing list.

Interested people can subscribe to the email list here.

The seminar takes place at ZMB, Am Botanischen Garten 11, room 4.03.

Next seminar:

8. 11. 2019 (Friday) 15:00



Upcoming seminars:

6. 12. 2019 N.N.
10. 1. 2020 N.N.
7. 2. 2020 N.N.
27. 3. 2020 N.N.
8. 5. 2020 N.N.
5. 6. 2020 N.N.


7. 6. 2019 (Friday) 15:00 Lucas Moitinho-Silva, Institute of Clinical Molecular Biology, CAU

Exploration of microbial abundance patterns in sponge microbiomes

Animal-microbe symbiosis research requires a broad range of techniques from different fields of science. In my talk, I will explore the dichotomy between high microbial abundance (HMA) and low microbial abundance (LMA) sponges with ecological and machine learning methods. I will present the analysis of microbiomes of about 170 sponge species (~ 1800 samples), for which data are publicly available as part of the Sponge Microbiome Project. In the final part of my talk, I will show how I am transferring these approaches to my current projects on the human microbiome.

10. 5. 2019 (Friday) 15:00 Silke Szymczak, Institute of Clinical Molecular Biology, CAU

Looking into the black box of random forests

Machine learning methods and in particular random forests are promising approaches for classification and regression based on omics data sets. I will first describe in layman's terms how the random forest algorithm works and how a prediction model can be built. However, these complex models are not easy to interpret and one strategy for better understanding is variable selection, i.e. the identification of variables that are important for prediction.
In the second part of my talk I will then present our novel method called surrogate minimal depth (SMD). It is based on the structure of the decision trees in the forest and additionally takes into account relationships between variables. In simulation studies we showed that correlation patterns can be reconstructed and that SMD is more powerful than existing variable selection methods. Thus, SMD is a promising approach to get more insight into the complex interplay of predictor variables and outcomes in a high dimensional data setting.

8. 3. 2019 (Friday) 15:00 Martin Jahn, GEOMAR Kiel

Implications of the virome on marine sponge holobionts

Phages are increasingly recognized as important members of host-associated microbial communities. While phage-bacteria interactions have been studied for more than a century, comparatively little is known about how phages interact with their animal hosts. Marine sponges, which are associated with stable, highly complex and specific microbial communities, are an attractive model for studying host-microbe interactions in a natural environment. As filter-feeding animals, sponges pump up to 24,000 litres of seawater through their system per day, exposing them to high amounts of external viruses. High exposure to phages, a major bacteriolytic element, raises questions about how microbiome homeostasis can be maintained. Moreover, the diversity and function of residual phages on the sponge microbial community and their distribution in the animal's landscape are largely unexplored.

Therefore, I recently investigated 36 DNA/RNA viromes of four Mediterranean sponge species and nearby seawater references using viral metagenomics. In this seminar, I will walk you through the steps from sampling design to taxonomic and functional analysis and discuss the methodological aspects. Finally, I will highlight possibilities for connecting sequencing with supplementary approaches such as microscopy and functional assays, which should be widely applicable to other systems.

8. 2. 2019 (Friday) 15:00 Ribana Roscher, Institute of Geodesy and Geoinformation (IGG), Universität Bonn

Machine Learning for Earth Remote Sensing

Remote sensing observations play an important role in the geo- and bioscientific community, since they enable various applications to accurately monitor the Earth and its changes, on a microscopic level as well as from space. Besides the challenge of dealing with large amounts of data and limited class label information, current and future challenges comprise the definition and integration of prior and domain knowledge, the learning of sophisticated features and the fusion of multiple sensor data. This talk will cover several remote sensing applications addressed in my group, with a focus on deep learning methods, and I will present my vision of future methods to learn better models of complex geo- and biophysical processes and phenomena.

18. 1. 2019 (Friday) 15:00 Ana Filipa Moutinho, Max Planck Institute for Evolutionary Biology, Plön

The genomic and structural drivers of protein adaptive evolution

The frequency and nature of adaptive mutations is a long-standing focus of the study of molecular evolution. Here, we address the impact of structural architecture among protein coding regions on the rate of adaptive mutations. We used population genetics to study molecular evolution on a fine scale by analysing the impact of genetic variants in the different conformations of protein structure. With this, we aimed to understand how protein biophysics and coding sequence evolution influence fitness and adaptation. By using Drosophila melanogaster and Arabidopsis thaliana population genomics data, we fitted models of distribution of fitness effects and estimated the rate of adaptive amino-acid substitutions both at the protein and amino-acid residue scale, across different categories of protein function, chaperone affinity, protein-protein interactions, intrinsic protein disorder and structural motifs. We found that most of the adaptive mutations occur at the surface of proteins and that gene age strongly influences the rate of adaptation. Moreover, we observe that the functional class of proteins also plays a role in adaptation, with genes encoding for processes of protein regulation and signaling pathways exhibiting the highest values for the rate of adaptive substitutions. We therefore propose that the rate of adaptive mutations in proteins is driven by new inter-molecular interactions, both at the intra-organism, within protein networks, and at the inter-organism level, through the coevolution with pathogens, and/or by the acquisition of new biochemical activities.

7. 12. 2018 (Friday) 15:00 Mario Stanke, Institut für Mathematik und Informatik, Universität Greifswald

Comparative Genome Annotation

The talk will treat the clade annotation problem: many genomes of different species or strains of a clade are given, and the clade is so narrow that large parts of these genomes can be aligned, e.g. the clade of murine species.
Instead of annotating each genome one-by-one, we develop methods that simultaneously annotate all genomes, thereby exploiting evidence from selection and introducing a coupling of the previously independent sequential labeling problems in order to increase accuracy and consistency of the structural genome annotations. I will present ongoing efforts to improve the AUGUSTUS gene prediction tool.

9. 11. 2018 (Friday) 15:00 Johannes Zimmermann, Institute of Experimental Medicine, CAU Kiel

Fishing with metabolic networks - crafting, catching, curating

Metabolic networks are repositories of knowledge about the metabolic processes that occur in an organism. They are successfully used to examine various phenomena at an integrative pathway level rather than at the gene level only. In my talk, I want to give an introduction to metabolic network theory, focusing on the construction, analysis and curation of such networks. I will also discuss my own contributions and, finally, the application of metabolic networks to community modeling in light of the metaorganism paradigm.

1. 10. 2018 (Monday) 15:00 Dan Graur, University of Houston

Something Old, Something New, Something Borrowed, Something Blue: Applying the Concept of Mutational Load to Genomic Sequences to Determine an Upper Limit on the Functional Fraction of the Human Genome

For the human population to maintain a constant size from generation to generation, an increase in fecundity must compensate for the reduction in the mean fitness of the population caused by deleterious mutations. The required increase depends on the deleterious-mutation rate and the number of sites in the genome that are functional. These dependencies and the fact that there exists a maximum tolerable replacement-level fertility (e.g., humans cannot have 100 children) allow us to estimate an upper limit for the fraction of the human genome that can be functional. By estimating the fraction of deleterious mutations out of all mutations in known functional regions, we conclude that the fraction of the human genome that can be functional cannot exceed 25%, and is almost certainly much lower.

15. 6. 2018 (Friday) 15:00 Andreas Tauch, de.NBI Administration Office - ELIXIR Germany, Bielefeld University, Bielefeld, Germany

Bioinformatics in Germany: toward a national-level infrastructure

The German Network for Bioinformatics Infrastructure (de.NBI) is a national initiative funded by the Federal Ministry of Education and Research (BMBF). The mission of the de.NBI initiative is (i) to provide high-quality bioinformatics services to users in basic and applied life sciences research from academia, industry and biomedicine; (ii) to offer bioinformatics training to users in Germany and Europe through a wide range of workshops and courses; and (iii) to foster the cooperation of the German bioinformatics community with international network structures. The infrastructure network was launched by the BMBF in March 2015 and, after two national calls, now includes 40 service projects operated by 30 project partners that are organized in eight service centers. Scientists from Kiel University and from the Fritz Lipmann Institute Jena joined the de.NBI network as associated partners in 2017. The staff of de.NBI further develops and maintains almost 100 bioinformatics services for the human, plant and microbial research fields and provides comprehensive training courses to support users with different expertise levels in bioinformatics. The network is currently expanding its activities to the European level, as the de.NBI consortium was assigned by the BMBF to establish and run the German node of ELIXIR, the European life-sciences Infrastructure for biological Information. Like de.NBI on the national level, ELIXIR-DE is coordinated from Bielefeld University and includes over twenty partner institutes across Germany.

20. 4. 2018 (Friday) 15:00 Eli Levy Karin, Max-Planck Institute for Biophysical Chemistry, Göttingen, Germany

Statistical techniques in molecular evolution // Tools to explore eukaryotic metagenomics

My PhD focused on developing computational and statistical methods in the field of molecular evolution. In my talk I will give a brief overview of my work, which dealt with various aspects of sequence analysis. I will then present in greater detail one of the projects, TraitRateProp. TraitRateProp is a probabilistic method that allows testing whether the rate of sequence evolution is associated with changes in a binary phenotypic character trait. The method further allows the detection of specific sequence sites whose evolutionary rate is most noticeably affected following the character transition, suggesting a shift in functional/structural constraints. TraitRateProp was first evaluated in simulations and then applied to study the evolutionary process of plastid plant genomes upon a transition to a heterotrophic lifestyle.
Finally, I will present my current work on developing and applying computational tools for the analysis of eukaryotic metagenomics data. Metagenomics is revolutionizing the study of microbes and their fundamental roles in biological, geological, and chemical processes on earth. Despite the important roles eukaryotes play in most environments, they have received little research attention, due to their lower abundance in samples and to the complexity of their gene and genome architectures. To date, we generally cannot reliably predict eukaryotic genes in metagenomics sequences. However, being able to analyze eukaryotic metagenomics data is of great importance to numerous scientific fields, including biotechnology and medicine, ecology and evolution. In my study, I work on developing computational tools for the high-throughput discovery of eukaryotic gene sequences in metagenomics data and for their functional annotation.

23. 3. 2018 (Friday) 15:00 Christoph Kaleta, Institute of Experimental Medicine, Kiel University

Transcriptomic alterations during ageing reflect the shift from cancer to degenerative diseases in the elderly

Disease epidemiology during ageing shows a transition from cancer to degenerative chronic disorders as dominant contributors to mortality in the old. Nevertheless, it has remained unclear to what extent molecular signatures of ageing reflect this phenomenon. Here we report on the identification of a conserved transcriptomic signature of ageing based on gene expression data from four vertebrate species across four tissues. We find that ageing-associated transcriptomic changes follow trajectories similar to the transcriptional alterations observed in degenerative ageing diseases but are in opposite direction to the transcriptomic alterations observed in cancer. We confirm the existence of a similar antagonism on the genomic level, where a majority of shared risk alleles that increase the risk of cancer decrease the risk of chronic degenerative disorders and vice versa. These results reveal a fundamental trade-off between cancer and degenerative ageing diseases that sheds light on the pronounced shift in their epidemiology during ageing.

23. 2. 2018 (Friday) 15:00 Tim Lachnit, Zoological Institute, Kiel University

Viruses, the neglected part of metaorganisms

Eukaryotic organisms are associated and have co-evolved with a complex bacterial community. Together, host and bacteria form a synergistic relationship. Disturbance of the homeostasis between the host and its associated partners may contribute to disease development. While research in recent decades has focused on bacteria-host interactions, viruses have been disregarded, although they are the most abundant entity in the world, outnumbering bacterial cells, and are one of the key regulators of bacterial communities, killing 20-40% of all bacterial cells each day. In this seminar I’ll introduce you to the viral world living in association with diverse organisms from different habitats, ranging from marine algae, sponges and freshwater polyps to fecal samples of mice. On the basis of these examples I’ll emphasize different methodological problems, including viral isolation, library preparation and sequence data analysis, that challenge viral research and have to be taken into account when working with viruses.

8. 12. 2017 (Friday) 15:00 Bernhard Haubold, Max Planck Institute for Evolutionary Biology, Plön

Sequence Complexity and Gene Function in the Human Genome

Genome sequences vary locally in their complexity due to duplication events in their evolutionary past. As a result, there is long-standing interest in elucidating the relationship between sequence complexity and the function of the encoded genes. However, measuring local sequence complexity is problematic as most metrics have no bounds that coincide with a known minimum for completely ordered sequences, and a maximum for random sequences. An exception to this rule is our complexity measure CM, which is bounded by 0 and an expectation of 1. This measure is robust to variation in GC-content, and can be computed efficiently. We have implemented CM in our program macle for MAtch CompLExity. Macle takes as input a genome sequence in FASTA format for indexing. In the case of the complete human genome, indexing takes 3.5 h using 128 GB RAM. Given the resulting index, macle computes CM in sliding windows of arbitrary width across the entire genome in roughly 19 s using 25 GB RAM.
To investigate the relationship between sequence complexity and gene function, we determined which genes were enriched in regions of a given complexity. We found that high complexity regions were strongly enriched for regulatory genes active in development. In contrast, low complexity regions were enriched for genes involved in immunity. We end by speculating on the role of the few unannotated regions of high complexity found.

10. 11. 2017 (Friday) 15:00 Tobias Marschall, Max Planck Institute for Informatics, Saarbrücken

Structural Genomic Variation and Horizontal Gene Transfer

Structural variation (SV) is of key importance for the evolution of genomes across the tree of life. This talk presents a tour of methodological developments for SV calling, genotyping, and haplotyping. First, I will explain methods (Clever, Mate-Clever) we developed and applied in the frame of the Genome of the Netherlands (GoNL) project, which sequenced 250 Dutch families, and highlight some of the results of this study. Second, I will venture into the world of bacterial genomics and show how lessons learned from detecting human structural variation can be applied to design a tool (Daisy) to detect recent horizontal gene transfer. Third, I will discuss the impact of technological developments for detecting SVs, using the data produced by the Human Genome Structural Variation Consortium (HGSVC) as an example. The HGSVC sequenced nine human genomes each on seven different platforms (Illumina paired ends, Tru-seq synthetic long reads, jumping libraries, 10X Genomics, PacBio, BioNano optical maps, Strand-seq). In the frame of this project, we particularly explored the abilities of these technologies to resolve haplotypes by employing our WhatsHap method, which I will briefly explain. As a result, the HGSVC has produced a map of haplotype-specific structural variation that highlights SVs as substantially more prevalent in humans than was previously appreciated.

16. 10. 2017 (Monday) 16:00 Itzhak Mizrahi, Ben-Gurion University, Israel

Insights into the rumen microbiome

The mammalian gut microbiota is essential in shaping many of its host's functional attributes. Relationships between gut bacterial communities and their mammalian hosts have been shown in recent years to play an important role in the well-being and proper function of their hosts. A classic example of these relationships is found in the bovine digestive tract in a compartment termed the rumen. The rumen microbiota is necessary for the proper physiological development of the rumen and for the animal’s ability to digest and convert plant mass into basic food products, making it highly significant to humans. In my lecture I will discuss our recent findings regarding this ecosystem's development, and interaction with the host.

14. 7. 2017 (Friday) 15:00 Christian Woehle, Institute of Microbiology, CAU Kiel

Tracing back the evolution of the eukaryotic redox proteome

The redox-sensitive proteome (RSP) consists of protein thiols that undergo redox reactions, playing an important role in coordinating cellular processes. Here, we applied a large-scale phylogenomics approach to map the evolutionary origins of the eukaryotic RSP. Based on a current-day snapshot of the diatom Phaeodactylum tricornutum, we inferred ancestral sequence states and traced the evolution of the RSP stepwise back to the origin of eukaryotes. Our results show that the majority of P. tricornutum redox-sensitive cysteines (76%) is specific to eukaryotes, yet these are encoded in genes that are mostly of prokaryotic origin (57%). Furthermore, we find a threefold enrichment in redox-sensitive cysteines in genes that were gained by endosymbiotic gene transfer during the primary plastid acquisition. The secondary endosymbiosis event coincides with frequent introduction of reactive cysteines into existing proteins. While the plastid acquisition imposed an increase in the production of reactive oxygen species, our results suggest that it was accompanied by a significant expansion of the RSP, providing redox regulatory networks with the ability to cope with fluctuating environmental conditions.

9. 6. 2017 (Friday) 15:00 Giorgio Gonnella, Center for Bioinformatics, University of Hamburg

GFApy: a convenient and extensible Python library for handling sequence graphs

The Graphical Fragment Assembly formats 1 and 2 (GFA1 and GFA2) are recently defined formats for representing sequence graphs, such as assembly graphs (de Bruijn and string graphs), sequence variation graphs and gene splicing graphs. The formats are adopted by several software tools, including sequence assemblers, read mappers, variant analysis tools and interactive visualization tools.
We present a scripting language library for handling GFA files in Python (GFApy). The library allows the user to conveniently parse, edit and write GFA files. Complex operations, such as the separation of the implicit instances of repeats and the merging of linear paths, are also supported. Furthermore, the library is easily extensible: we show an example of how to define custom record types for metagenomic analysis.
GFApy is the first library which allows for convenient handling of GFA files using Python and the first publicly available implementation in any language fully supporting the GFA2 specification.

12.5.2017 (Friday) 15:00 Fernando Tria, Institute of Microbiology, CAU Kiel

Phylogenetic rooting using minimal ancestor deviation

Ancestor-descendent relations play a cardinal role in evolutionary theory. Those relations are determined by rooting phylogenetic trees. Existing rooting methods are hampered by evolutionary rate heterogeneity or the unavailability of auxiliary phylogenetic information. We present a novel rooting approach, the minimal ancestor deviation (MAD) method, which embraces heterotachy by utilizing all pairwise topological and metric information in unrooted trees. We demonstrate the method in comparison to existing rooting methods by the analysis of phylogenies from eukaryotes and prokaryotes. MAD correctly recovers the known root of eukaryotes and uncovers evidence for cyanobacteria origins in the ocean. MAD is more robust and consistent than existing methods, provides measures of the root inference quality, and is applicable to any tree with branch lengths. 

10.3.2017 (Friday) 15:00 Malte Rühlemann, IKMB Kiel

Genome-wide association studies of the human gut microbiota

The human gut is the habitat of billions of microorganisms belonging to a multitude of different taxonomic groups with a huge functional repertoire. The gene content of the gut bacteria exceeds that of the human host more than a hundredfold and plays an important role in the digestion of food, modulation of immune functions, and colonisation of pathogens. While changes in the intestinal microbiota have been linked to a variety of different diseases, the question of what factors shape and influence the variation seen in a "normal" and "healthy" microbiota is still largely unanswered.
Using uni- and multivariate statistical frameworks adapted to a genome-wide association study setting in two cohorts, comprising a total of ~ 1,800 individuals from Northern Germany, we investigated the influence of host-genetic variation on core members of the gut microbiota, as well as on the overall beta-diversity of the community. Results show an overlap with previously known candidate genes for host-microbe interactions from functional studies, loci shared with association studies of inflammatory disorders, and new candidate genes that shed light on the mechanisms by which the host genome influences the bugs in our guts.

10.2.2017 (Friday) 15:00 Axel Wedemeyer, Institute of Informatics, CAU Kiel

Filtering reads for De Novo Assembly

These days, sequencing projects often produce huge data sets. Especially for single-cell projects it is necessary to sequence with a very high mean coverage in order to make sure that all parts of the sample DNA get covered by the reads produced. This leads to datasets with large amounts of redundant data. Metagenomic data sets often show a high coverage for abundant species and a low one for rare species.

For a de novo assembly, the assembler has to reconstruct the genetic information from these data sets alone, a puzzle with sometimes billions of pieces. This is a demanding task, particularly with regard to the amount of RAM needed. Common assemblers like metaSPAdes or ALLPATHS-LG regularly need more than the 250 GB of memory a common server has these days.

But is all the data necessary to solve the problem? The basic idea of our work is to filter out redundant reads in order to reduce the memory and time requirements of the assembly process. The decision whether to keep or discard a certain read is based on a probabilistic counting scheme for the k-mers (substrings of reads of length k) seen so far and on the phred score. While this method has been shown to be very effective on single-cell and transcriptomic data sets, we are currently working on adapting it to metagenomic data sets.
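
The filtering idea described above can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the count-min sketch as the probabilistic counter, the median-count rule, the mean-phred cutoff, and all names and thresholds (CountMinSketch, filter_reads, k, max_count, min_mean_phred) are assumptions chosen for illustration only.

```python
import hashlib
from statistics import median


class CountMinSketch:
    """Approximate counter (assumed here): may overestimate, never underestimates."""

    def __init__(self, width=2**16, depth=4):
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, item):
        # One independent hash per row, obtained by salting with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, col in self._slots(item):
            self.tables[row][col] += 1

    def estimate(self, item):
        return min(self.tables[row][col] for row, col in self._slots(item))


def kmers(seq, k):
    """All substrings of length k, in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def filter_reads(reads, k=5, max_count=3, min_mean_phred=20):
    """Keep a read only if its quality is acceptable and its k-mers are still rare.

    reads: iterable of (sequence, list_of_phred_scores) pairs.
    All thresholds are hypothetical defaults for this sketch.
    """
    sketch = CountMinSketch()
    kept = []
    for seq, quals in reads:
        if len(seq) < k or sum(quals) / len(quals) < min_mean_phred:
            continue  # too short or too low in quality
        mers = kmers(seq, k)
        if median(sketch.estimate(m) for m in mers) >= max_count:
            continue  # k-mers already seen often enough: treat read as redundant
        for m in mers:
            sketch.add(m)
        kept.append((seq, quals))
    return kept


if __name__ == "__main__":
    example = [("ACGTTGCAAG", [30] * 10)] * 10 + [("TTTTTCCCCC", [30] * 10)]
    for seq, _ in filter_reads(example):
        print(seq)
```

Because the counter uses fixed-size tables, memory stays constant regardless of how many distinct k-mers are seen, at the price of occasional overestimates that can cause a non-redundant read to be discarded.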

13.1.2017 (Friday) 15:00 Beate Slaby, GEOMAR Kiel

Metagenomic binning of a marine sponge microbiome reveals unity in defense but metabolic specialization

Marine sponges are ancient metazoans that are populated by distinct and highly diverse microbial communities. In order to obtain deeper insights into the functional gene repertoire of the Mediterranean sponge Aplysina aerophoba, we combined Illumina short-read and PacBio long-read sequencing followed by un-targeted metagenomic binning. We identified a total of 37 high-quality bins representing 11 bacterial phyla and 2 candidate phyla. Statistical comparison of symbiont genomes with selected reference genomes revealed a significant enrichment of genes related to bacterial defense (restriction-modification systems, toxin-antitoxin systems) as well as genes involved in host colonization and extracellular matrix utilization in sponge symbionts. A within-symbionts genome comparison revealed a nutritional specialization of at least two symbiont guilds, where one appears to metabolize carnitine and the other sulfated polysaccharides, both of which are abundant molecules in the sponge extracellular matrix. A third guild of symbionts may be viewed as nutritional generalists that perform largely the same metabolic pathways but lack such extraordinary numbers of the relevant genes. This study characterizes the genomic repertoire of sponge symbionts at an unprecedented resolution and it provides greater insights into the molecular mechanisms underlying microbial-sponge symbiosis.

9.12.2016 (Friday) 15:00 Matthias Merker, Molecular and Experimental Mycobacteriology, Research Center Borstel

Evolution of multidrug-resistant tuberculosis strains in Eastern Europe

Bacterial factors favoring the unprecedented multidrug-resistant tuberculosis (MDR-TB) epidemic in Eastern Europe remain unclear. We analyzed whole genome sequences from 1,436 clinical MDR Mycobacterium tuberculosis complex (MTBC) strains from different Eastern European settings. The vast majority (70%) of M/XDR-TB infections were caused by three closely related MTBC strain types. Bayesian coalescent analysis revealed that particular MTBC clones with patterns of low fitness cost resistance mutations (first- and second-line drugs) in combination with compensatory mutations existed prior to the introduction of standardized TB treatment in the late 1990s. The dominance of particularly fit and highly resistant strains further challenges the application of standard treatment regimens, including the new short MDR-TB regimen, and highlights the need for universal, rapid and comprehensive drug susceptibility testing, especially in high-burden settings.

11.11.2016 (Friday) 15:00 Silvio Waschina, Institute of Experimental Medicine, CAU Kiel

Costs and necessity of the Black Queen: the impact of metabolic trade-offs on the evolution of microbial community structure and dynamics

Microbial cells often exchange costly produced metabolites with neighbouring cells within their communities - creating a vast network of interdependencies where cooccurring organisms perform complementary metabolic functions. The Black Queen hypothesis aims to explain the evolution of such dependencies through the loss of metabolic functions by a sub-group of cells while the function is retained by coexisting cells that share the function’s essential product. To test this hypothesis requires knowledge of (i) the fitness consequences of metabolic gene loss as well as (ii) the costs that are associated with the biosynthesis of exchanged metabolites. Both quantities, however, usually remain elusive.

Here we addressed this issue using data mining approaches and constraint-based modelling of bacterial metabolism. The computational estimates and predictions were complemented with laboratory experiments on Escherichia coli and Acinetobacter baylyi. The results suggest that loss of conditionally essential biosynthetic functions is highly prevalent in natural bacterial populations. This rampant loss of anabolic functions can be explained by selective advantages of biosynthetic gene loss in the presence of the focal metabolites. In addition, epistatic interactions frequently affected fitness after losing multiple genes. We also identified a carbon source-dependent trade-off between the production costs of different classes of amino acids. Such biochemical trade-offs are known to play a crucial role in the ecology and evolution of microorganisms because coexisting lineages can mutually save metabolic costs by specialising in the production of different essential metabolites. Taken together, our observations demonstrate potential molecular causes underlying the evolution of metabolic interdependency and complementarity within microbial communities.

14.10.2016 (Friday) 15:00 Astrid Dempfle, Institute of Medical Informatics and Statistics, CAU Kiel

Statistical aspects of gene-environment interaction

The concept of gene-environment interaction is relevant both in the etiology of complex diseases and in personalized treatment. Statistical aspects in the identification or utilization of such interactions will be highlighted, in particular relating to study design and statistical analysis for disease gene identification or pharmacogenetic clinical trials.

15.7.2016 (Friday) 15:00 Elisabeth Kaltenegger, Botanical Institute, CAU

Interference between paralogues at the protein level affects the dynamics after gene duplication

A common feature of proteins is their assembly into homomeric structures to act as functional units. Usually, the subunits are derived from a single genetic locus. When such a gene is duplicated, the gene products are suggested initially to cross-interact when co-expressed thus resulting in the phenomenon of paralogue interference. In this talk, I will present a case study of protein evolution in which paralogue interference after duplication might have facilitated neofunctionalization of one duplicate. I will also explore further possible ways of how paralogue interference can shape the fate of a duplicated gene and present further illustrative examples. One important outcome is a prolonged time window in which both copies remain under selection increasing the chance to accumulate mutations and to develop new properties. Thereby, paralogue interference can mediate the co-evolution of duplicates.

13.5.2016 (Friday) 15:00 Frederic Bertels, Max Planck Institute for Evolutionary Biology, Plön

Parallel evolution in a long term experiment with HIV-1

One of the most intriguing puzzles in biology is the degree to which evolution is repeatable. The repeatability of evolution or parallel evolution has been studied in a variety of model systems, but has rarely been investigated with clinically relevant viruses. To investigate parallel evolution of HIV-1, we passaged two replicate HIV-1 populations for almost one year in each of two human T-cell lines. For each of the four replicate lines, we determined the genetic composition of the viral population at nine time points by sequencing the entire genome. Mutations that were carried by the majority of the virus population showed an extreme degree of parallel evolution. In one of our evolutionary lines, all 19 majority mutations also occur in another line but appear in a different order. This repeatable pattern of HIV-1 evolution is indicative of a predictable process, which is maximally inconsistent with evolutionary neutrality.

15.4.2016 (Friday) 15:00 Marc Hoeppner, IKMB

Workflow systems in bioinformatics

Within just a few years, the steadily decreasing cost of next-generation sequencing has turned biology into one of the most data-intensive research disciplines in the world. While this age of "big data" promises exciting new insights, it also threatens to outpace our ability to make sense of the flood of information and handle it efficiently. Here, one particular challenge is the use of high-performance compute infrastructures and the detailed record keeping (data provenance) necessary for good scientific practice. In this presentation, I will discuss the challenges of big data and how dedicated workflow systems can help accelerate bioinformatics, including some hands-on examples to show that the adoption of such purpose-built solutions does not need to be complicated.

11.3.2016 (Friday) 15:00 Transcriptomics Symposium

  • Rainer Kiko (GEOMAR):
    Accounting for differential RNA yield in RNA-Seq: a copepod example
  • Christian Wohle (IFAM, CAU):
    De-novo RNA-Seq analysis of non-model organisms: a foraminifera example
  • Wentao Yang (Zool. Inst., CAU):
    ABSSeq: a new RNA-Seq analysis method based on modeling absolute expression differences

12.2.2016 (Friday) 15:00 Dirk Fleischer, Kiel Marine Science

Start smart - Data capturing at the point of origin

15.1.2016 (Friday) 15:00 Tobias Lenz, Max Planck Institute for Evolutionary Biology, Plön

Evolutionary genomics of an optimal adaptive immune response

11.12.2015 (Friday) 15:00 Steffen Möller, University of Rostock

eQTL: intertwining disease decomposition and drug repositioning

Expression QTL (eQTL) further annotate disease-associated genetic loci with co-observed changes in the transcriptome. With drugs selected to compensate for the disturbance caused at single loci, one may derive a recipe for a drug cocktail for a genotyped patient with a multifactorial disease. This presentation reviews resources available today and emergent algorithms, exemplified on murine data for experimental autoimmune encephalomyelitis, a mouse model for neuroinflammation.

13.11.2015 (Friday) 15:00 Wilhelm Hasselbring, Dept. Computer Science, CAU

Workflows for Scientific Data Processing and Publication


In this presentation, I'll present three related topics: (1) Our PubFlow approach to automate publication workflows for scientific data. The PubFlow workflow management system employs established technology. We integrate institutional repository systems and world data centers (in marine science). PubFlow collects provenance data automatically via our monitoring framework Kieker. In our evaluation in marine science, we collaborate with the GEOMAR Helmholtz Centre for Ocean Research Kiel. (2) Data processing in genomics: I'll briefly sketch bioinformatics tools such as Bioconductor and Galaxy, and indicate how these tools may be combined with advanced data-analysis systems for Internet-scale data processing such as MapReduce/Hadoop, including our own tools ExplorViz and TeeTime. (3) For good scientific practice, it is important that research results may be properly checked by reviewers and possibly repeated and extended by other researchers. I'll discuss publishing code, in addition to data.

Short bio:

Prof. Dr. Wilhelm (Willi) Hasselbring is professor of Software Engineering at Kiel University. In the competence cluster Software Systems Engineering (KoSSE), he coordinates technology transfer projects with (local) industry. In the excellence cluster Future Ocean, he is principal investigator and co-coordinator of the research area Ocean Observations.

9.10.2015 (Friday) 15:00 Anne Kupczok, IFAM CAU

Studying genetic heterogeneity within microbial populations using high-resolution metagenomics

We analyze a microbial symbiont community inhabiting Bathymodiolus mussels. The pattern of genetic variation among symbiotic populations is used to distinguish among modes of symbiont transmission. To this end, a high-resolution metagenomics approach is applied to a data set of multiple mussels. By cross-assembly and binning into bacterial species, we find one highly abundant and one less abundant symbiont. Single-nucleotide polymorphisms (SNPs) are analyzed to quantify the genetic variation and population structure of the abundant species. We find that host-specific SNPs are rather rare but population structure is present among the samples. We hypothesize that the observed pattern is caused either by geographic isolation or by selection during symbiont uptake into the host and symbiont maintenance over time.


9.7.2015 (Thursday) 15:00 David Ellinghaus, IKMB Kiel

A systematic cross-disease study of five chronic inflammatory diseases

11.6.2015 (Thursday) 16:00 Corrina Breusing, GEOMAR

Population connectivity and dispersal of vent mussels from the Mid-Atlantic Ridge

30.4.2015 (Thursday) 15:00 Elie Jami, IKMB Kiel

Characterization of the bovine rumen microbiome from birth to adulthood and its potential effect on host physiology

27.3.2015 (Friday) 15:00 Fabian Kloetzl, Max Planck Institute for Evolutionary Biology, Plön

Efficient Estimation of Evolutionary Distances

26.2.2015 (Thursday) 16:00 Oscar Puebla, GEOMAR, Kiel

Genomic atolls of differentiation in coral reef fishes (Hypoplectrus spp)

30.1.2015 (Friday) 15:00 Prof. Dr. Christoph Kaleta, Institute of Experimental Medicine, CAU Kiel

Microbial survival in challenging environments - Be quick or be social

18.12.2014 (Thursday) 16:00 Dr. Ben Krause-Kyora, IKMB CAU, Kiel

Microbial genomics from ancient DNA

5.12.2014 (Friday) 15:00 Dr. Ingram Iaccarino, Institute of Human Genetics UKSH, Kiel

Identification of novel downstream players in MYC-induced cellular transformation

30.10.2014 (Thursday) 16:00 Dr. Julien Y. Dutheil, MPI Plön

The evolution of primates X chromosome and Human-Chimp speciation.

26.9.2014 (Friday) 15:00 Dr. Giddy Landan, Institute of General Microbiology, CAU, Kiel

Origins of major archaeal clades correspond to gene acquisitions from bacteria.


The 13 higher taxonomical groups of archaea unexpectedly correspond to 2,264 group-specific gene acquisitions from bacteria. Interdomain gene transfer is highly asymmetric: transfers from bacteria to archaea are 11-fold more frequent than vice versa.

Gene transfers identified at major evolutionary transitions among archaea specifically implicate gene acquisitions for metabolic functions from bacteria as key innovations in the origin of higher archaeal taxa.

New methodology:

Comparison of sets of trees without a reference phylogeny.

Tree compatibility measures tuned to detect a non-vertical, LGT, signal.

Statistics for integration of layered data with small per-layer samples.


267,568 protein-coding genes from 134 sequenced archaeal genomes, in the context of their homologs from 1,847 reference bacterial genomes.

2014, June 6th, 15:00: Dr. Silke Szymczak, Institute of Clinical Molecular Biology, CAU

Comparison of variable selection methods in random forests for genomic data sets

2014, May 16th: Prof. Dr. Anand Srivastav, Institute of Informatics, CAU

Streaming Algorithms for Big Data Problems in Bioinformatics

2014, April 27th-30th: SMBE Satellite meeting on Reticulated Microbial Evolution

2014, March 18th, 15:00: Prof. David Bryant, University of Otago, New Zealand

Phylogenetic analysis of species radiations using SNPs and AFLPs. (Bio/Bioinformatics/Genetics)

Technological wonders such as next generation sequencing mean that we can now, in principle, obtain SNP (single nucleotide polymorphism) data from multiple individuals in multiple species. This promises enormous benefits for population genetic and phylogenetic analysis, particularly of closely related or poorly resolved species. My interest is in how to analyse these data effectively and responsibly. We have developed an algorithm which estimates species trees, divergence times, and population sizes from independent (binary) markers such as well-spaced SNPs. The method is based on coalescent theory (like the BEAST software), though it uses mathematical trickery to avoid having to consider all the possible gene trees. As a "full likelihood" method, it should be more accurate than alternative FST-based approaches. I'll talk about our experiences applying this method to AFLP data from alpine plants, and some recent discoveries about the usefulness (or uselessness) of SNP data for estimating population sizes.

2014, March 14th, 15:00: Dr. Till Bayer, GEOMAR

16S metagenomic analysis of the coral microbiome

2014, January 31st, 15:00: Dr. Johann-Mattis List, Forschungszentrum Deutscher Sprachatlas, Philipps Universität Marburg

Using bioinformatics to study the lateral component of language evolution

Ever since August Schleicher (1821-1868) first proposed the idea that language history is best visualized “bei dem Bilde eines sich verästelnden Baumes” (by the image of a branching tree), this view has been controversially discussed by linguists, leading to various opposing theories, ranging from wave-like evolutionary scenarios to early network proposals. The reluctance of many scholars to accept the tree as the natural metaphor for language evolution was due to conflicting signals in linguistic data: many resemblances would simply not point to a unique tree. In the last two decades, historical linguistics has been experiencing a “quantitative revolution” and many automatic approaches from evolutionary biology have been applied to linguistic data. Given the important role that language contact and lexical borrowing play during language history, it is surprising that the majority of the new automatic approaches in historical linguistics assume a strict “eukaryotic framework” for language evolution and focus only on the reconstruction of language trees. I will argue that a “prokaryotic framework” for language evolution – based on biological network approaches that help to distinguish vertical from lateral processes during genome evolution – offers a fruitful alternative to current linguistic “dendrophilia” and provides more comprehensive insights into the complexities of language evolution.

2014, January 10th, 15:00: Prof. Dr. Bernhard Haubold, MPI Plön

Alignment-Free Tools for Genome Comparison

Whole genome sequencing has become routine. However, comparing whole genomes by alignment remains challenging. I therefore present three fast computer programs for comparing unaligned genomes. All three are based on calculating the lengths of exact matches between pairs of genomes. This quantity can be looked up efficiently by indexing sequences. I explain how we combine genome indexing with mathematical modeling to construct programs for estimating pairwise substitution rates, finding closest local homologues, and detecting recombination.
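
As a rough illustration of the quantity mentioned in the abstract, the sketch below computes, for every position in one sequence, the length of the longest exact match starting there that occurs anywhere in a second sequence. It is a naive, index-free version written only for clarity; the function name match_lengths and the toy sequences are assumptions made for this example and are not taken from the programs presented in the talk, which rely on sequence indexing for speed.

```python
def match_lengths(query: str, subject: str) -> list[int]:
    """For each position i in query, return the length of the longest
    prefix of query[i:] that occurs as a substring of subject.

    Naive illustration only: a real tool would use an index
    (e.g. a suffix array) instead of repeated substring searches.
    """
    lengths = []
    for i in range(len(query)):
        length = 0
        # Extend the match by one character while it still occurs in subject.
        while i + length < len(query) and query[i:i + length + 1] in subject:
            length += 1
        lengths.append(length)
    return lengths


if __name__ == "__main__":
    # Toy sequences (hypothetical); similar sequences tend to give longer matches.
    print(match_lengths("ACGT", "TACGA"))
```

Intuitively, the more closely related two sequences are, the longer these exact matches tend to be, which is why match lengths can serve as raw material for distance estimates.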

2013, November 29th, 15:00: Launch Symposium II

Dr. Steffen Möller, Institut für Neuro- und Bioinformatik, Universität zu Lübeck: Computational Biology @ Dermatology in Lübeck

Dr. Volkmar Sauerland, Institut für Informatik, CAU:
Introduction: Research Group Discrete Optimization

Dr. Abhishek Kumar, AG Kempken CAU:
Marine Fungi: Application of Next-generation genome and RNA-Seq based methods for the exploration of cancer drug and other antibacterial natural compounds.

Prof. Dr. Tal Dagan, IFAM, CAU:
Introduction: Genomic Microbiology Group
Institute of Microbiology

2013, November 8th, 15:00: Launch Symposium I

Tal Dagan, Genomic Microbiology, IFAM CAU: Introduction

Andre Franke, CAU: Institute of Clinical Molecular Biology

Georg Hemmrich, CAU: bioinformatics resources

Ingo Thomsen, IKMB: Git and Git Server

Bernhard Haubold, MPI Plön: Alignment-Free Phylogeny Reconstruction

Ke Xiao, Plant Breeding Institute, CAU: Identification of bolting genes of sugar beet by whole genome resequencing