Phase Space Invaders (ψ)
With the convergence of data, computing power, and new methods, computational biology is at its most exciting moment. At PSI, we're asking the leading researchers in the field to discover where we're headed, and which exciting pathways will take us there. Whether you're just thinking of starting your research career or have been computing stuff for decades, come and join the conversation!
Episode 16 - Janusz Bujnicki: Structural modeling, RNA modifications, and advising policy-makers on science
In the sixteenth episode, Janusz Bujnicki tells me about his early switch to bioinformatics, a stroke of serendipity that defined his future career, and how he later managed to reincorporate both biophysics and experimental biology into his research agenda. We talk about the current state of the field of RNA structural prediction, and how we need to bring together physics and data to tackle the ever more complex biological questions that show up on the horizon. Then, we switch gears to discuss the fascinating world of science advisors for policy-making, or to what extent our scientific knowledge can be employed to shape long-term policies, including the blurry borders between science, philosophy and politics. I think it's a laudable aspiration for many of us to understand how we can have a tangible impact on the real world while keeping our humility in check.
Welcome to the Phase Space Invaders podcast, where we explore the future of computational biology and biophysics by interviewing researchers working on exciting, transformative ideas. The last guest of season two is Janusz Bujnicki, who serves as the head of the Laboratory of Bioinformatics and Protein Engineering at the International Institute of Molecular and Cell Biology in Warsaw. Janusz is known for combining his knowledge of bioinformatics, structural biology, biophysics, and experimental molecular biology to integrate data in ways that help solve real-life research questions down the line, from structure prediction to molecular design to basic research. He was also the reviewer of my PhD thesis, but due to our busy schedules, I hadn't met him in person until last month, when we finally managed to sit down for a chat in Warsaw. So I ask Janusz about his early switch to bioinformatics, a stroke of serendipity that very much defined his future career, and how he later managed to incorporate both biophysics and experimental biology into his research agenda. We talk about the current state of the field of RNA structure prediction, and how we need to bring together physics and data to tackle the ever more complex biological questions that show up on the horizon. Then we switch gears to discuss the fascinating world of science advisors for policymaking, or to what extent our scientific knowledge can be employed to shape long-term policies, including the blurry borders between science, philosophy, and politics. I think it's a laudable aspiration for many of us to understand how we can have a tangible impact on the real world while keeping our humility in check. With that in mind, let's go.
Milosz:Janusz Bujnicki, welcome to the podcast.
Janusz Bujnicki:Hello.
Milosz:So I remember you said you began your career in bioinformatics as a sort of side project, right? At a time when the whole field was just getting started. And thanks to that, you could also start your own research group very early on. And over these years, bioinformatics, structural biology, and simulations have slowly become increasingly intermingled, so to say, so that today it is next to impossible to do one without at least touching all the others. So I wanted to ask first how this evolution, this convergence, played out in your personal career to the point where you are now.
Janusz Bujnicki:Oh, wow. This is a very important part of my life, how it all started. When I was finishing my undergraduate studies, it was the time when the first bacterial genomes were sequenced. I mean, the first genomes were sequenced in general, first eukaryotes, first bacteria, and databases had only begun to be populated with genomic sequences and with the sequences of the predicted proteins. And this was the time of the emergence of the powerful search engines for biological sequences, such as PSI-BLAST. I had developed an interest in sequence comparison already during my master's thesis research, and this was the very first time when I was analyzing sequences and generating alignments. At that time, I was actually editing sequence alignments in a text editor because of the absence of proper tools, and running the first modeling programs to generate predictive models of protein structures. And I continued this. I mean, when I took my first shot at the PhD, this was molecular biology research. So I was trained as an experimental biologist. And I was so fascinated by bioinformatics that I essentially continued doing that kind of research in my free time, in the evenings, in the afternoons, on my laptop. And it turned out to be successful.
Milosz:Right. And so which year was that? Uh,
Janusz Bujnicki:Uh, it was about the turn of the century, actually. So...
Milosz:Right. So...
Janusz Bujnicki:...and then I transitioned to become a bioinformatician in about the year 2000, yes.
Milosz:Right. And so, as I said, you then started incorporating all sorts of structural insights, right? So now your lab features a cryo-EM facility and quite a few simulation approaches to integrate this bioinformatics knowledge into structural research.
Janusz Bujnicki:I mean, my roots are in experimental research. And in my early days of scientific research, I learned to be a bioinformatician. This was basically self-training. I knew the basics of programming, but I never really became a programmer. I became more like a super user of various bioinformatics programs, especially for sequence comparisons, detection of remote homology, and modeling of structures. And I specialized in the field related to enzymes acting on DNA, in particular restriction-modification systems, which act on DNA and evolve very quickly. So in order to find relationships between them, one couldn't use just very simple approaches that detect very high sequence similarity. Instead, I figured out how to use various tricks to find relationships between proteins that are extremely diverged. Basically, my area of interest combines experimental molecular biology with structural biology, bioinformatics and sequence comparisons, and evolutionary biology. And after I had a chance to start my own research group in 2002, originally it was purely computational. I had very minimal resources, but then with time and with new grants that I acquired, I went back to my roots and reestablished an experimental activity in my lab. I no longer work in the wet lab. Probably my people would never let me hold a pipette again; they would have to have me restrained, probably. But yes, absolutely, we do combine experimental research in molecular biology and in structural biology with the development of new methods, so I do have programmers in the group, and with data analysis. And here the analysis of data ranges from sequence comparisons and evolutionary studies to modeling, to simulations, and to a sort of computational biophysics.
So not just getting a structure for a molecule, but generating and exploring the entire conformational landscape and understanding how the molecule behaves in time and under different conditions. And this is what we do not only with simulations, but also with experiments. And cryo-EM is a very good method to analyze not just a single conformation, but to study the different conformations of the molecule.
Milosz:Right, we are at this point where we're starting to realize, I think, that we will have to integrate those data-first and physics-first approaches, right? Because people try to do both of them separately, and they both have their shortcomings. And so how do you see this moment of integration?
Janusz Bujnicki:Um, I think it's a great moment because of the emergence of deep learning methods, which are kind of agnostic about the history of research. These methods basically take the data from whatever source, so if there's good data coming from physics-based calculations and data coming from experiments, which are based on physical approaches but are agnostic to physical formulas (we observe something and we present the data), then maybe we can learn something from it. What is extremely important is to have representative data sets of good quality, and this is not always the case. I have seen too many situations where computational researchers would take just the kind of data that is available, and the data on many different things is often so bad that there's just no way a sort of brute-force deep learning method, without very careful data cleaning, would generate something that is truly useful, with physical and chemical and biological relevance. But I do believe that this approach can help us a lot. And it can also help scientists break down the walls of silos. I remember from the early days of my adventure with molecular modeling that there were two different schools of thought. One was physics-based modeling. The other was more like comparative modeling, looking at the evolution and so on. And very often people with roots in the physics-based approach would totally ignore the evolutionary modeling. It's like, if you don't have a good physical model under your modeling engine, then this is surely worthless; it cannot be good. And then, of course, came real-life tests, which showed that purely physics-based approaches were almost worthless in predicting 3D structures of proteins, at least compared to the other methods that were based mostly on evolutionary principles, sequence comparisons, and so on. So, I mean, this is no longer the case.
Obviously, people who are doing physics-based modeling now do acknowledge what can be done with comparative modeling methods and the like, and vice versa. But there are also many other silos, and people coming from different fields and different schools of thought are often dug deep into their theory and their history of generating or analyzing a kind of data in a certain way. And now with the deep learning approaches, new people can come, look at your data from a completely different angle, and do something with your data that you may find blasphemous, but then it can actually lead to something useful. But it is very important to understand what good data means.
Milosz:Yeah, I sympathize with that. I feel like a lot of the hype in deep learning actually becomes irrelevant exactly because of the quality of data. So people say, oh, we're going to solve everything, but then realize how hard it is to integrate different streams of experimental inputs, right? That's why a lot of people try to curate their own datasets that are standardized and so on. And that's where the new developments will come from, perhaps. Great. And then there's also the question of going beyond just simple sequence-based models. As you alluded to in our previous conversation, there is the question of, say, RNA modifications, right, and how that affects the structure and the functional landscape. And, again, your group has published MODOMICS, the database of RNA modifications, and you're working on incorporating this into structure prediction, right? Can you tell us more about this?
Janusz Bujnicki:As I told you earlier, my first focus was on enzymes acting on DNA, including DNA methyltransferases, which are enzymes that place a methyl group on different positions in DNA, affecting not so much the DNA structure, maybe a little bit, but more so DNA-protein interactions. And then I discovered the world of RNA modifications, which is very rich in different chemical moieties. So, not only methylation, but many other types of chemical changes, which can be combined together into hypermodifications. So now there are significantly more than 150 types of modifications that are relatively common. There are also some additional modifications that so far have been observed only in single organisms or in very exotic species of RNAs. And sometimes it is not clear what their role is. Some of the chemical structures may actually be artifacts of RNA preparation. Some of these chemical modifications may actually be the results of damage. So, yeah, there's an interesting connection between chemistry and biology through structure here. I'm interested in RNA structure and the effect of RNA modifications on the structure, but I'm also interested in how the modifications affect the interactions of RNA with other molecules. And now we're coming back to the point that we discussed a moment ago. It's a combination of a data-driven approach and a knowledge- and physics-based approach. In our modeling efforts, the methods we develop are to a large extent based on statistical potentials. So essentially we analyze the database of existing RNA structures, extract certain features and their frequencies, and then we create a computational model in which we compare the frequency of observed values of a feature with the one expected by chance. And if a given feature emerges in a way that is similar to what has been observed earlier, then it gets a positive score.
But when a certain conformation or interaction appears that has rarely been observed, then it usually gets a penalty. This is a very big simplification of what we do. In order for this approach to work, we need proper representation of a given feature. If a feature is very rare in the data, then it's very hard to assess its distribution in reality. And with RNA modifications, this is the case. There are not so many non-redundant RNA structures solved in the presence of modifications. Ideally you would like to have at least a few hundred cases of RNAs in which we have the structure solved with and without the modification. We are nowhere near having this kind of data set. So in order to generate some statistics, we need to resort to computational simulations and use more physics-based approaches. So, to fill in the gaps in the experimental data, we are now using physics-based simulations to generate statistics for the statistical model, and this process can, of course, be refined and iteratively improved. And the results of predictions have to be checked against the experimental data, to see whether we are going in the right direction or not.
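The statistical-potential idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's actual method: the feature labels ("stacked", "unstacked"), the toy counts, the pseudocount, and the function names `statistical_potential` and `score_model` are all hypothetical. The sketch compares how often a feature value is observed in known structures with how often it would be expected by chance, and turns the ratio into a log-odds score, so that frequently observed values score positive and rarely observed ones are penalized.

```python
import math
from collections import Counter


def statistical_potential(observed, reference, pseudocount=1.0):
    """Log-odds score per feature value: log(P_observed / P_expected_by_chance).

    observed  -- feature values extracted from known structures (hypothetical data)
    reference -- feature values expected by chance (hypothetical reference state)
    The pseudocount keeps very rare values from producing infinite scores.
    """
    values = set(observed) | set(reference)
    obs_counts = Counter(observed)
    ref_counts = Counter(reference)
    obs_total = len(observed) + pseudocount * len(values)
    ref_total = len(reference) + pseudocount * len(values)
    potential = {}
    for value in values:
        p_obs = (obs_counts[value] + pseudocount) / obs_total
        p_ref = (ref_counts[value] + pseudocount) / ref_total
        potential[value] = math.log(p_obs / p_ref)
    return potential


def score_model(features, potential, unseen_penalty=-1.0):
    """Sum the scores of all features in a candidate model.

    Feature values never seen in either data set get a fixed penalty (an assumption).
    """
    return sum(potential.get(f, unseen_penalty) for f in features)


# Toy data: a base-pair feature seen mostly "stacked" in known structures,
# while the chance reference has both values equally often.
observed = ["stacked"] * 80 + ["unstacked"] * 20
reference = ["stacked"] * 50 + ["unstacked"] * 50

potential = statistical_potential(observed, reference)
good_model = score_model(["stacked", "stacked"], potential)
poor_model = score_model(["stacked", "unstacked"], potential)
```

With only a handful of observations for a feature, the pseudocount dominates the estimate, which is one way to see why sparsely observed features such as modified residues are hard to score from experimental structures alone.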
Milosz:Right, that definitely sounds exciting. Also, the model that you use has been validated in the RNA CASP competition, right? A lot of people know about the protein CASP, but I think relatively few know about the RNA branch of the competition. How many editions have been held so far?
Janusz Bujnicki:So now I think it's the 16th edition of CASP, which happens essentially every two years. So it's been going for a very long time. We participated in CASP, I mean, not in the very first few editions, but from around 2000 to, I think, 2010 we participated quite a few times, with successes. So I was quite successful in predicting protein 3D structures. At that time, there were just not enough RNA structures to present as targets. Now there are more, in particular thanks to new cryo-EM developments. And since the previous CASP, CASP15, the organizers decided to expand to include RNA targets as well. And yes, we did quite well. We had a third ex aequo position in the ranking, and actually the four top-ranking groups were all using conventional methods, not deep learning methods. The deep learning methods at that time were not as good as the conventional approaches, but maybe they will be catching up now. So we are now combining the use of both deep learning and conventional approaches, for instance statistical potentials. But I have to mention that our methodology has been tested not only in CASP, in the branch for RNA, but for a much longer period of time in another community-wide experiment called RNA-Puzzles, which has been organized by Eric Westhof from Strasbourg. And this competition has been organized in a different way. So not like one prediction season over a couple of months in the summer of one year, and then nothing, but more on a rolling basis. So whenever new structures were released, they were presented to the modelers, and in RNA-Puzzles there was more effort on comparison of approaches and learning what is working and what is not working, and less on a competition about, you know, who has won. So we never had such an obsession about the rankings in RNA-Puzzles.
I mean, of course, for every target we did have a separate ranking of predictions, and it did vary a lot, and we know which of the groups did well, whose predictions featured frequently among the good predictions, and we are among those few groups. But there is nothing like a specific ranking of RNA-Puzzles, because we think it's actually a little bit counterproductive. I prefer collaboration to competition, really.
Milosz:I see. Yeah, I think, as you say, with the refinement of cryo-EM protocols for RNA, all the deep learning methods will get more data to work with, and maybe that's going to be the thing that pushes them towards, or primes them for, success eventually. Yeah, I can also recommend here the CaspRNA special interest group channel on YouTube that your former
Janusz Bujnicki:PhD
Milosz:student
Janusz Bujnicki:Yeah. Student, PhD
Milosz:started. Yeah, great talks there about the latest methods in RNA structure prediction, experimental processing, and so on. Switching to the other topic, you mentioned your experience of being a scientific counselor in policymaking at the European level, right? Can you tell us more about how this works? Like, how scientific expertise can be translated into actual political decisions? Because this is probably something that's on many people's minds, but I guess few people know how it works on the inside.
Janusz Bujnicki:Oh yeah.
Milosz:Oh yes, so we're switching gears now, and moving into a completely different
Janusz Bujnicki:world, directly related to what we have been talking about so far. Yes, I mean, I have been a science advisor. I think in this world most people use the word advisor rather than counselor. And the worlds of science and policy meet in various different ways. Most people are probably aware of science policy, which is basically defining policy for science: how to fund science, how to fund grants, how to support scientific careers, how to evaluate institutions, how to plan scientific development in a strategic way, and so on. There is science diplomacy, which is basically the use of science in foreign relations, for instance constructing special international programs to support international collaboration and international affairs, building international infrastructure so people from different countries can meet, and also maintaining channels of diplomacy even if the political situation is tense, so that scientists still continue talking to each other. And if, in international relations, you want to continue talking to your adversary, then very frequently scientists can be this diplomatic channel. But what I have been focusing on is something different again: explaining to policymakers what science can provide in terms of both knowledge and recommendations on a given policy-related topic. I have to distinguish politics from policymaking. I mean policymaking as developing rules and policies for a long time, rather than making quick political decisions and, you know, confronting other politicians just for voting over certain issues. So it's more like strategic policy advice, more for the long term. This works in such a way that if policymakers have their science advisors, they can ask them to contribute to the policy problems that the policymakers are facing.
And very frequently these are very broad problems like, you know, climate change, or the issue of the oceans with microplastics, or emissions of CO2 from cars, or the problem of sustainability of food production and consumption, or the pandemic, and so on. And obviously, and I think this was showcased by the coronavirus pandemic, science is only one of the many things that the policymakers have to take into account, because in many cases facts may matter less at a given moment than what people believe in. And the policymakers have to take it into account, because it's not only about what is there, what the scientists can provide, what the data are telling us, but also how people would react to the data and what kind of decisions people will make. So, in this context, if scientists want to be advisors, if they want to advise policymakers, they have to understand that science is only one of many, many different things that have to come together in order for the policymakers to make good decisions, good in the sense that they will actually serve the purpose: if the policy is meant to have a certain effect, then if we take science into account, this effect will be reinforced. So the policy will work as intended. This was, for me at least, a lesson in humility, in understanding how science is important, but also how other things are important as well in policymaking. And also that in order to provide scientific advice, we always need scientists from different disciplines, so that we have different perspectives, and people with different perspectives have to work together to figure out a comprehensive solution. In particular, if we just focus on advice from the area of physics and computer science and ignore the input from social sciences and humanities, such input can be useful only in very specialized technical situations.
But if we are talking about broader societal issues, there is no chance that specialized scientific advice is going to be really useful. So this is why politicians don't have an easy time talking to scientists. And likewise, scientists don't have an easy time talking to policymakers, because there are many different things. But there are mechanisms that work, and I had the privilege to be a member of the Group of Chief Scientific Advisors for the European Commission. We were doing exactly this. As a group of chief scientific advisors, we were consulting the scientific world, the researchers working in academia and various other institutions and in scientific networks and societies. And then we were translating what science has to say on a given topic to policymakers, and also making sure that we answer the kind of question that scientists can actually answer, without going into the territory where it's not the world of science, but of other things that the policymakers have to take into account.
Milosz:Do you have a concrete example of what kind of questions you are faced with and what kind of answer, like what is the breadth of the answer that you can provide as a panel?
Janusz Bujnicki:Um, sure. So one of the tasks I had quite early in my tenure in the Group of Chief Scientific Advisors was to lead the preparation of an explanatory note on gene editing versus genetically modified organisms. The question was formulated in a bit more technical way. It concerned applications in agriculture, not all the gene editing that exists. So, editing of plants, animals, and microorganisms in the context of agriculture. And the question was essentially: given the scientific data, and given the legal context of how we define in European law what a genetically modified organism is, is the process of gene editing the same as the process of generating genetically modified organisms? And here, the very important issues that arose were, for instance, that we need to distinguish between the final outcome, the product of a process, and the process itself. If these two are mixed up, then there are very serious problems. And in legal terms these things are mixed together. Scientifically, there is an issue because we could have exactly the same genetic material, a cell that is identical on the genetic level, which could be a product of many different processes. These different processes are currently regulated by law in very different ways, which, from the scientific point of view, in many scientific contexts, doesn't make sense. And another issue is the definition of what is natural, which is in the law on genetically modified organisms. How do we define what natural means? Especially given that most of the current agricultural products have been very heavily modified throughout the history of agriculture, and they are completely unlike the original animals and plants that humans encountered.
So what is natural can actually mean different things to different people. For many of us, a thing that is natural is something that we remember from our childhood. But it's not necessarily the real product of nature that would be here with us if humans had not created our civilization and our industry and everything. So there were many different issues where there were questions that science cannot really answer in a straightforward way, because science is not the right tool to say what is natural in all contexts. I mean, science can provide some input into the definition of what natural is: if we assume that we consider natural to be this, this, and this, then there is this consequence. But if these scientific assumptions are not fulfilled in a given context, then the answer may be different. Another thing, in the context of natural processes: at the time when the law for genetically modified organisms was created, the knowledge about genetic sequences was very small. As I mentioned at the very beginning of our discussion, when I started my scientific career, there were only a few genomes sequenced, only a handful of genomes and some proteins in the databases. And we didn't really know the extent of the genetic exchange between different organisms. So there are cases of so-called horizontal gene transfer, where a piece of DNA that was present in one organism can end up in another organism just because the DNA was released from one cell and then somehow entered another organism randomly and got incorporated into the other organism's genome, even between totally different organisms, bacteria and plants, humans and animals, and so on. It does happen, it does happen naturally. It does happen even without human intervention.
So now the question is, okay, should we consider such a process, which according to biology exists naturally, natural? And if we replicate it with existing enzymes, for instance with the CRISPR-Cas methodology, which has emerged naturally, so if we use naturally existing tools to replicate naturally existing processes, is this natural or not? Okay. So some of these questions are not the type of questions that we as scientists can answer in the whole context of policy, but we do provide answers on some aspects of it. And I think this explanation was essentially showing what the similarities are and what the differences are. And it was useful for policymakers. The commissioner for science, Carlos Moedas, was quite happy with the explanations, because at the end of the process he could understand, and then tell other policymakers, that gene-edited products are not the same as the products of genetic modification. I mean, they can be to some extent, but one cannot put an equals sign between the two. And it was easier for the policymakers to understand what the potential similarities could be. I could go on and on and on about this process, but we have limited time. So let me stop here.
Milosz:Right. Anyway, very timely and very impactful questions, right? And bleeding into philosophy. I think we had some conversations on the podcast about the, let's say, social obligations of a scientist. I think it's very important to consider these, right? In which ways we can pay society back for supporting science, in which way we have obligations to educate the public or educate policymakers or, you know, affect how things get funded and so on. I like this angle of the conversation. Um, okay. Anyway, thank you so much for your expertise and your time here, Janusz Bujnicki. Again, it was great to have you on the podcast.
Janusz Bujnicki:Thank you very much. And I hope we'll talk again.
Milosz:Sure. Definitely. Have a great day.
Janusz Bujnicki:Thank you. Bye bye.
Thank you for listening. See you in the next episode of Phase Space Invaders.