As my interest in the field of computational biology is mainly in the computer science aspects, I have not focused on one single biological concept. Rather I have been involved in an array of different problems, the common ground being the attempt to develop non-heuristic algorithms and examine complexities of the problems. Most of the work has been done in collaboration with Christian N. S. Pedersen. Some of the papers listed below have also been published as technical reports in the BRICS report series. Unless deviating significantly from the otherwise published paper these technical reports have been left out of the listing. They can, however, be obtained from the BRICS publications home page.
For papers I have presented at conferences I have made available the overheads from the presentation in as well postscript as LaTeX. The purpose of this, of course, is to make it easy for other people to scavenge for any illustrations or other material they might find useful. It should be mentioned that most of the illustrations are hardcoded in LaTeX (usually using the picture environment and packages such as PSTricks). If you experience any problems with the LaTeX code feel free to contact me (note: some of the files might complain about one or more logo graphic files not being available during compilation; you can either ignore this, or remove the lines where the logos are included from the code). If you are so unfortunate as to be on a computer system where a PostScript viewer is not standard, and that system happens to be a MacIntosh or Windows system that you have some say over, I highly recommend downloading and installing GSView.
We have developed an approximation algorithm for the structure prediction problem in the 2-dimensional HP model that always performs at least as well as the Hart/Istrail 1/4-approximation algorithm. Though we have not been able to prove that it has an approximation factor larger than 1/4 we found a correspondence to a neat problem &emdash; a problem we call the circle problem &emdash; that might aid the analysis of the performance. The work is described in
Prediction of Protein Structures Using Simple Exact Models, Rune B. Lyngsø and Christian N. S. Pedersen.
We have discovered some problems with transforming solutions to the circle problem to solutions to the structure prediction problem so currently we only know for sure that the circle problem can yield an upper bound on the approximation strength of our algorithm. We believe that a new method for the transformation can be developed but as it involves handling a host of special cases this work is currently filed under later. A short summary of our current status was published in
Protein folding in the 2D HP model, Rune B. Lyngsø and Christian N. S. Pedersen, accepted for short presentation at JOBIM'00.
While working on incorporating some types of pesudoknots into the classic dynamic programming algorithm for RNA secondary structure prediction we discovered that with currently used energy functions this algorithm actually took time O(n4) and not as usually claimed O(n3). This inspired us to develop a O(n3) algorithm for RNA secondary structure prediction based on currently used energy functions. The algorithm and associated program is presented in
Fast evaluation of internal loops in RNA secondary structure prediction, Rune B. Lyngsø, Michael Zuker and Christian N. S. Pedersen, published in Bioinformatics, vol. 15(6), pp. 440–445,
and in
Internal loops in RNA secondary structure prediction, Rune B. Lyngsø, Michael Zuker and Christian N. S. Pedersen, presented (overheads available in postscript and LaTeX) at RECOMB'99,
we use it to examine the reasonability of previously used cutoff limits for internal loop size. The two articles have been merged in
An improved algorithm for RNA secondary structure prediction, Rune Lyngsø, Michael Zuker and Christian N. S. Pedersen, published as BRICS report RS-99-15.
The algorithm has been implemented in a C program called ZUKER (a recursive acronym for ZUKER &emdash Unlimited Ken Energy-based RNA-folding).
The work on incorporating some types of pseudoknots in the classic dynamic programming algorithm appears in
Pseudoknots in RNA Secondary Structures, Rune B. Lyngsø and Christian N. S. Pedersen, presented at RECOMB'00,
together with an NP-completeness proof for predicting secondary structures with general pseudoknot interactions in a simple nearest-neighbour model. An extended version of this paper,
RNA Pseudoknot Prediction in Energy Based Models, Rune B. Lyngsø and Christian N. S. Pedersen, published in Journal of Computational Biology, vol. 7(3/4), pp. 409–428,
provides an NP-completeness proof using more realistic energy rules and a more thorough elucidation of the ideas of the proof. Furthermore, the algorithm allowing certain types of structures with pseudoknots has been replaced by a mini-review of three other algorithms for rigorous, energy based pseudoknot prediction, as it came to our knowledge that Tatsuya Akutsu has developed an algorithm superior to ours. A paper by Ieong et al. has lead to a further investigation of the computational complexity of pseudoknot prediction in simpler stacking based models. These results are presented in
Complexity of Pseudoknot Prediction in Simple Models, Rune B. Lyngsø, accepted for the 31st International Colloquium on Automata, Languages and Programming.
On a slightly different note I have also been involved in some work on predicting the common secondary structure of a set of unaligned RNA sequences. This work is presented in
A Mini-Greedy Algorithm for Faster Structural RNA Stem-Loop Search, Jan Gorodkin, Rune B. Lyngsø and Gary D. Stormo, presented at the Twelfth International Conference on Genome Informatics.
We have improved the Jotun Hein algorithm for combined DNA- and protein-level alignment from O(n4) to O(n2). This work is presented in
Comparison of coding DNA, Christian N. S. Pedersen, Rune B. Lyngsø and Jotun Hein, presented at CPM'98.
We have extended the methods for detecting (maximal) pairs in a sequence to allow for constraints on the distance between pairs, constraints that can depend on the size of the substring of the pair. This work was first presented in
Finding maximal pairs with bounded gap, Gerth Stølting Brodal, Rune B. Lyngsø, Christian N. S. Pedersen and Jens Stoye, presented at CPM'99.
An extended version has been published as
Finding maximal pairs with bounded gaps, Gerth Stølting Brodal, Rune B. Lyngsø, Christian N. S. Pedersen and Jens Stoye, Journal of Discrete Algorithms, vol. 1(1), pp. 77–103.
The string statistics problem is the problem of given a string s build a data structure that allows us to determine the number of non-overlapping occurences of any string t in s in time bounded by the length of t. This problem has previously been addressed by Alberto Apostolico and Franco P. Preparata that presented an O(n (log n)2) solution. We present an O(n log n) algorithm in
Solving the String Statistics Problem in Time O(n log n), Gerth Stølting Brodal, Rune B. Lyngsø, Anna Östlin and Christian N. S. Pedersen, presented at ICALP'02.
I am currently working on extensions of our previous work on (efficiently computable) measures on hidden Markov models. This work was presented in
Metrics and similarity measures for hidden Markov models, Rune B. Lyngsø, Christian N. S. Pedersen and Henrik Nielsen, presented (overheads available in postscript and LaTeX) at ISMB'99,
and
Measures on hidden Markov models, Rune B. Lyngsø, Christian N. S. Pedersen and Henrik Nielsen, published as BRICS report RS-99-6.
The methods described for comparing left-right models have been implemented in hmmcomp.
Peter Bro Miltersen asked us whether our methods for computing distances between the probability distributions of two hidden Markov models could be modified to handle other distance measures. More specifically whether our method for computing the L2 distance could be modified to compute the L1 distance. Looking into this we discovered some hardness results for the L1 and L∞ distances as well as finding the string a hidden Markov model is most likely to generate. These results are presented in
Complexity of Comparing Hidden Markov Models, Rune B. Lyngsø and Christian N. S. Pedersen, presented at ISAAC'01.
An extended version of this paper is published as The consensus string problem and the complexity of comparing hidden Markov models, Rune B. Lyngsø and Christian N. S. Pedersen, Journal of Computer and System Sciences, vol. 65(3), pp. 545–569.
When I moved in in an office with Arun Jagota he asked about my previous work. I told him about the comparison of hidden Markov models, and he suggested looking at extensions to e.g. stochastic context-free grammars. This lead to a generalisation of the CYK algorithm for parsing a sequence by a stochastic context-free grammar to allow parsing of a hidden Markov model. This work is presented in
Comparing a Hidden Markov Model and a Stochastic Context-Free Grammar, Arun Jagota, Rune B. Lyngsø, and Christian N. S. Pedersen, presented (overheads available in postscript and LaTeX) at WABI'01.
The version published in the proceedings of WABI'01 contains some errors corrected in the version available for download above.
This work, as well as the work on comparing hidden Markov models, were presented at a poster (available in postscript and LaTeX) at the 2002 meeting of the Program in Mathematics and Molecular Biology, Modeling Across the Scales: Atoms to Organisms.
We have developed a method for normalising microarray data. When compared to robust loess fit normalisation in our experiments so far, it seems never to perform significantly worse, and in situations with extreme fractions of differentially expressed genes and bias between up and down regulation to be clearly superior. It runs significantly faster, even on medium sized data sets, easily allowing normalisation of data set with hundred of thousands of data points. This work is not published, but the code developed for the experiments is available as the IStVaN library.
A friend of mine has been in charge of the electronic registration of finds at an archaeological excavation. He wanted to know whether there was a method for computing the volume of a soil block. There is. We implemented a simple O(n2) method. This work is described in
Nemi, Loc. Santa Maria, and the Application of Computer Technologies in Field Excavations II: The Database, Søren F. Andersen and Rune B. Lyngsø.
One thing I regret about this work, though, is that it didn't land me a trip to Nemi, Italy.
Some years ago we were asked by two people working in breeding research about a problem that is a slight variation on the (bipartite) b-matching problem. Our solution is presented in
The polygamic marriage problem, Peer Berg, Bernt Guldbrandtsen, Rune B. Lyngsø and Christian N. S. Pedersen.
Currently we are waiting for the breeding researchers to finish their experiments with the code based on the solution.