Dr. Radoje (Rade) Drmanac, chief scientific officer and co-founder of Complete Genomics, is a research scientist and inventor in the field of human genome sequencing, including techniques such as DNA sequencing-by-hybridization (SBH), genomic micro- and nanoarrays, combinatorial probe ligation, and the long fragment read (LFR) process for accurate whole genome sequencing and haplotyping from 10 human cells. In 1994, he co-founded Hyseq (later Nuvelo) where, as chief scientific officer, he led the effort to discover and patent thousands of genes which formed the basis of Nuvelo’s drug development pipeline. Prior to Hyseq, Rade was a group leader at Argonne National Labs from 1991 to 1994 as part of the Department of Energy’s Human Genome Project. He completed his postdoctoral studies in 1990 in Hans Lehrach's group at the Imperial Cancer Research Fund in London. He earned his Ph.D. in molecular biology from Belgrade University for the conception and pioneering development of SBH technology; he also received his B.S. and M.S. degrees in molecular biology there.
Throughout the development of gene sequencing, he has been both a witness to and a founder of this industry. He said it is exciting that MGI is such a young company and that this new field is also just beginning. In the interview, he talked about the development of the industry, his eyes lighting up, full of confidence.
Q1. When did you first become interested in sequencing technology?
A long time ago, when I was doing my master's degree. I was cloning one gene and thinking that it is not fun to go gene by gene; it takes a year just to clone and sequence one gene. So I thought there must be a better way: if we sequence a whole genome, we have all the genes, and we don’t need to do it gene by gene.
Then, in 1987, I got the idea of sequencing by hybridization. That was, we can say, the first massively parallel sequencing method. It was one of the new ways to sequence genomes routinely and to obtain large amounts of sequence data using DNA arrays. It switches from electrophoresis of one piece of DNA to massively parallel sequencing of millions of DNA segments on an array. In sequencing by hybridization, the sequence readout was by short oligonucleotide probes hybridized to DNA. But fundamentally, that was massively parallel sequencing using microarrays. The initial array solution used micron-sized beads with emulsion PCR to generate a clonal DNA cluster on each bead. After emulsion PCR, the beads were put on a surface and hybridized with labeled probes to read the sequence. So 1987-1989 was the period when, I think, high-throughput genomics and large-scale sequencing began.
Q2. What kind of advancements will sequencing technology bring to biological research? And what difference will it make to our daily life?
That is a very good question. Technologies need time to develop and then to have an impact on medicine, improving health and many other aspects of our lives. What is interesting about genetic information is that our genome affects all aspects of our life. It is not destiny, but it is a very sophisticated program needed to define how our tissues work. The good thing about the genetic program is that we carry our genes in ourselves. You know, the sooner we read the program (and of course we need more studies to understand it), the earlier we can predict the future of our body and mind. With that type of thinking, we can extend the use of sequencing to additional aspects of monitoring health. So the first goal is to read our genes, our inherited genes. That defines our predisposition, good and bad: predisposition to amazing abilities, and sometimes predisposition to some terrible diseases, and everything in between.
But the same technology, the high-throughput sequencing that MGI is continuously developing, can also enable health monitoring. We can do many tests, and the more affordable sequencing becomes, the deeper and more comprehensive the tests we can do. Cell-free DNA can tell us if we have some dangerous mutations; deep analyses of our gut and other microbes can tell us whether we have a healthy microbiome; single-cell RNA sequencing of immune cells can tell us how the immune system is functioning or if something is happening to it. Because, in addition to our predisposition, our genetic program is an open program, we have to frequently monitor the state of our genetic program in our tissues.
We need to monitor health. That is the thing we couldn’t do effectively before; we could do simple things like blood pressure or sugar in the blood. But it is molecular monitoring that can tell us the real state of our tissues. It is the new and more precise monitoring that deep sequencing allows. What would that allow? When we monitor health and see some changes early on, you know, we can prevent diseases, as much as we can.
There are more and more assays, including one in an exciting recent paper, that use high-throughput sequencing. One of the new methods is to do DNA barcoding so that you can monitor every molecule in the cell and the distance between these molecules; the barcodes diffuse and link at a rate that reflects the distance. It is really amazing. They call it DNA microscopy. You cannot easily do regular microscopy at that scale, at nanometer distances, but with barcoding you can do that. I am excited, you know; some people, like George Church, developed in situ sequencing and some other in situ monitoring. But this DNA microscopy is an interesting one because we first do the barcoding, and then we do regular sequencing, not in situ sequencing. So there are many new technologies, many different deep sequencing tests, in addition to whole-genome sequencing. And whole-genome sequencing is also progressing. It used to be that a genome with 90% of variants detected was considered a good genome, and now we have our PCR-free True Genome sequencing method that reduced the number of errors from hundreds of thousands to a few hundred. The improvements are really amazing; we are going toward a perfect genome using our technologies. With single-tube long fragment read (stLFR) for de novo assembly of individual human genomes, we don’t need the genome reference any more.
In addition, the de novo assemblies can make maximally complete and accurate genomes. This is needed, because when you sequence your inherited genome, it better be accurate. Any errors can lead to wrong predictions, and due to incomplete detections we can miss some important predispositions. So we are very excited about these technologies we have to get to the perfect genomes and all these other monitoring tests, using high-throughput massively parallel sequencing on our DNA nanoball arrays. It took 30 years or more to get to this stage, but the exciting thing is that it is still the beginning of using genomics in healthcare. For another 30 years, we can see big growth of MGI developing ever better technology, more accurate, more affordable, and a lot more applications. It is really exciting – MGI is such a young company. It is also the beginning of this new field.
Q3. What do you think of sequencing technology's future development?
Yeah, good topic to discuss. There is really amazing progress already, if you think about where we started, with capillary or even gel sequencing of a few samples at a time, and then sequencing a whole genome for a billion dollars at the beginning of this century. Then in 2010 Complete Genomics published in Science our first whole genome sequencing, where our material cost was only five thousand dollars. At that time there were a couple of other companies, such as Solexa, that had a genome sequencing cost of fifty thousand dollars. So this was a breakthrough, a tenfold reduction in the cost of sequencing. And since then we are below one thousand dollars, and at BGI close to five hundred dollars. So that's another tenfold cost reduction. So from millions of dollars per genome to fifty thousand to five thousand to five hundred. And the price will inevitably continue to go down. I firmly believe MGI will be the first company to provide a hundred-dollar genome.
Why is the hundred-dollar genome important? It is not that a personal genome sequence is not worth more than a hundred dollars. You know, if you think about sequencing a person’s genome and using it for life, in my opinion it is worth tens of thousands of dollars to one hundred thousand dollars. But the hundred-dollar genome is a matter of affordability. Many people cannot afford to pay a thousand dollars for their critically important genetic information. Another aspect of lower sequencing cost is that it enables big databases.
So if we have a low-cost genome, you know, we can then assemble this enormous database. Sometimes we think that millions of people is enough. I think it'll be billions. You know, such an enormous database will enable reverse engineering and full understanding of all aspects of our genetic program, our human program. It’s a very complex program, but we know that we can understand it and reverse engineer it, and know exactly what it means if you have a given gene variant. So to be able to do that type of large-scale project, we need an extremely large omics database, and then to include artificial intelligence.
The future of sequencing technology is definitely to further reduce cost and increase throughput and quality. And our T7 instrument is, as you know, a perfect example of progress toward that goal. Our DNBSEQ is one of the best massively parallel technologies because the DNA nanoballs are so small, about two hundred nanometers. So we can have a higher-density array, with more information on the same surface. The spacing on T7 is about seven hundred nanometers. On T10 it is already five hundred nanometers. And we can go down to four hundred and maybe even a smaller distance. That allows us to get more information with the same volume of reagent on the same chip surface. A terabase of sequence becomes less and less expensive. In addition, because our DNBs pack 300 copies into just a 200 nm ball, they are 5x brighter relative to background than PCR clusters, enabling highly cost-effective and accurate sequencers.
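As a rough illustration of why tighter spacing matters, the sketch below computes how many spots fit per square millimeter at the spacings mentioned above. It assumes an idealized square grid with one spot per cell, which is an assumption made for illustration and not a description of the actual array layout; the function name is hypothetical.

```python
def spots_per_mm2(pitch_nm: float) -> float:
    """Spots per square millimeter on an idealized square grid.

    Assumes one spot per (pitch x pitch) cell; real arrays may differ.
    """
    nm_per_mm = 1_000_000
    spots_per_side = nm_per_mm / pitch_nm
    return spots_per_side ** 2


if __name__ == "__main__":
    baseline = spots_per_mm2(700)
    for pitch in (700, 500, 400):
        density = spots_per_mm2(pitch)
        print(f"{pitch} nm pitch: {density:,.0f} spots/mm^2 "
              f"({density / baseline:.2f}x the 700 nm density)")
```

Under this assumption, density scales with the inverse square of the spacing: going from 700 nm to 500 nm roughly doubles it (about 1.96x), and 400 nm roughly triples it (about 3.06x).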
The other dimension of future development is sequence quality. What we realized is that the technology we use now, which uses labeled nucleotides, has a limitation: the polymerase has some difficulty incorporating them. It's quite good because we optimized the enzyme, but nevertheless there is still sometimes a conflict between the label and the polymerase, and especially with removing the fluorescent dyes. There are a few atoms left as a scar on the base, so we don't have a natural base anymore. So what we proposed a couple of years ago is a new way to do sequencing, a new chemistry, where we use natural bases: you don't put any label on them, no base modifications.
We call this new technology CoolNGS or CoolMPS, for cool massively parallel sequencing. I actually prefer CoolMPS. "Cool" is because we don't use labeled (hot) nucleotides, just unlabeled (cold) nucleotides. When I described this chemistry to George Church and other scientists, they were so excited that it's possible to use unlabeled nucleotides and that we have specifically developed antibodies that recognize natural bases without any label. When I would explain that, they would say, "That's cool." I said that's why we call it CoolMPS.
This is an example of the future of MPS. We already have a demonstration that the accuracy is higher and the read length will be longer than with standard chemistry. So, after improving accuracy and providing the most accurate sequencing at low cost and high throughput, what is left is developing a myriad of applications.
We have many advantages because we don't use PCR in making DNA nanoballs. We use rolling circle replication. In rolling circle replication, the Phi29 polymerase, with its strong strand displacement, makes hundreds of copies, always by copying the original circle, the original template. So we don't make a copy of a copy, and because of that, this is the only massively parallel sequencing that doesn't make clonal errors. Only clonal errors are detectable. You know, there is maybe one out of three hundred copies in a DNB that has an error, but that doesn't matter; we detect the correct signal from the other 299 copies.
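The difference between copying the original and copying a copy can be sketched with a simplified model. The code below assumes idealized PCR in which every molecule doubles each cycle, so an error made at a given cycle is inherited by all descendants of that molecule, and compares it with rolling circle replication, where an erroneous copy is never itself copied. Both functions are hypothetical illustrations, not a model of any specific instrument.

```python
def pcr_error_fraction(error_cycle: int) -> float:
    """Fraction of final copies carrying an error introduced at a given cycle.

    Idealized PCR starting from one molecule: after `error_cycle` cycles
    there are 2**error_cycle molecules, one of which is the new erroneous
    copy. All of its descendants inherit the error, so the fraction stays
    at 1 / 2**error_cycle in every later cycle.
    """
    return 1 / 2 ** error_cycle


def rcr_error_fraction(total_copies: int, erroneous_copies: int = 1) -> float:
    """Fraction of copies carrying an error in rolling circle replication.

    Every copy is made from the original circle, so an error in one copy
    is not passed on to any other copy.
    """
    return erroneous_copies / total_copies


if __name__ == "__main__":
    for cycle in (1, 2, 3):
        print(f"PCR error at cycle {cycle}: "
              f"{pcr_error_fraction(cycle):.1%} of the cluster")
    print(f"RCR, 1 bad copy in 300: {rcr_error_fraction(300):.2%} of the DNB")
```

In this model, an early PCR error can dominate the signal (half of the molecules if it happens in the first cycle), while a single bad copy among 300 in a nanoball contributes well under one percent.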
So, now that we have error-free arrays, if we add PCR-free libraries, for which we have already developed a process and have kits for making PCR-free WGS libraries, then, for the first time, we have true PCR-free whole genome sequencing.
What is going to happen next is that many other applications, such as RNA-seq or microbiome, will convert to a PCR-free process. We can do this because our efficiency of DNB loading on the arrays is so high. We can convert eighty percent of molecules into circles, then ninety-five percent of circles into DNBs, and then ninety percent of nanoballs can be arrayed. This is not possible with PCR clusters, where some 90% of molecules are not converted into clusters. So, practically, there is a very small loss of starting molecules; practically, almost fifty percent of the initial molecules are sequenced. Because of that, we don't need PCR. We don't need to amplify sample DNA. We just take natural molecules, make circles and DNBs, and sequence them.
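The step efficiencies quoted above can be chained with a short calculation. The sketch below multiplies the three quoted steps and compares the result with the roughly 10% conversion quoted for PCR clusters. The three itemized steps alone multiply to about 68%; the "almost fifty percent" figure presumably also reflects losses that are not itemized in the interview, which is an assumption here. The dictionary and function names are hypothetical.

```python
from math import prod

# Step efficiencies as quoted in the interview.
DNB_STEP_EFFICIENCY = {
    "molecules -> circles": 0.80,
    "circles -> DNBs": 0.95,
    "DNBs -> arrayed": 0.90,
}

# "Some 90% of molecules are not converted into clusters."
PCR_CLUSTER_CONVERSION = 0.10


def overall_yield(step_efficiency: dict) -> float:
    """Fraction of starting molecules that survive every listed step."""
    return prod(step_efficiency.values())


if __name__ == "__main__":
    dnb_yield = overall_yield(DNB_STEP_EFFICIENCY)
    print(f"DNB workflow, itemized steps only: {dnb_yield:.1%}")
    print(f"PCR clusters (as quoted): {PCR_CLUSTER_CONVERSION:.0%}")
    print(f"Ratio: {dnb_yield / PCR_CLUSTER_CONVERSION:.1f}x")
```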
So I see a conversion to no PCR in all of these different libraries. PCR-free libraries sequenced on the DNBSEQ platform do not need molecular barcodes (UMIs), which are used primarily to avoid PCR library and PCR-cluster errors. Every read becomes informative and usable; we do not need to find several reads with the same UMI. Furthermore, DNB making has no molecular mechanism of index switching, which allows us to pool hundreds of indexed samples into one sequencing reaction.
We also recently developed stLFR technology. LFR, for "long fragment read", is a powerful barcoding technology we proposed several years ago. The idea is to use massively parallel reads in a new way, because we can sequence billions of them at low cost and they are accurate, but they are relatively short. You know, even five-hundred-base reads are still short relative to the length of the genome. So what this technology does is to use initial long fragments, like a hundred kilobases or two hundred kilobases, and to add the same barcode to every sequencing template, every sub-fragment from that long molecule.
We call this process co-barcoding, because all of the templates from one long fragment get the same barcode. These sub-fragments are co-barcoded. So what we developed recently is stLFR technology, where for the first time we have a practically unlimited number of barcodes; we create three billion of these barcodes on microbeads. We then take a hundred million out of the three billion and put them in a single tube with ten million long genomic DNA fragments from one sample.
Almost every long fragment will get its own unique barcode because we have more barcodes than long fragments: a hundred million barcodes for ten million fragments. So for the first time, we can have unique co-barcoding. This was never achieved before; stLFR is the best technology for this type of barcoding. It's much simpler too, with no need for oil emulsion or special instruments. It's also really inexpensive. The cost is practically like making a regular library, slightly higher. But what this unique barcoding allows us to do for the first time is really efficient haplotype-phased de novo assembly of personal genomes. It enables a "perfect" genome for all. Now we can also sequence full genomes of hundreds of new species, fish or bird or other species, because it no longer takes a year of different mate-pair library preparations.
Making a library is just one single-tube reaction. And then you can do de novo assembly. So you can see the sort of applications enabled by efficient stLFR barcoding. More efficient sequencing on DNB arrays with better CoolMPS sequencing chemistry will altogether give us better quality, lower cost, and higher throughput. So in the end, we will have unlimited sequencing. DNA sequencing will not limit the development of precision healthcare. The limiting factors will be something else: delivery, doctor's visits, sample collection, or something like that. Not sequencing; we'll provide unlimited sequencing.
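The effect of having ten times more barcodes than fragments can be estimated with a simple occupancy model. The sketch below assumes each long fragment independently lands on a uniformly random barcoded bead; this is a simplifying assumption for illustration, not a description of the actual stLFR chemistry, and the function name is hypothetical.

```python
import math


def fraction_uniquely_barcoded(num_fragments: int, num_barcodes: int) -> float:
    """Expected fraction of fragments that share their barcode with no other.

    Assumes each fragment independently picks one of `num_barcodes`
    barcodes uniformly at random. A fragment is unique when none of the
    other (num_fragments - 1) fragments picked the same barcode.
    """
    # log1p keeps precision when 1 / num_barcodes is very small.
    log_p = (num_fragments - 1) * math.log1p(-1 / num_barcodes)
    return math.exp(log_p)


if __name__ == "__main__":
    fragments = 10_000_000
    for barcodes in (10_000_000, 100_000_000, 1_000_000_000):
        unique = fraction_uniquely_barcoded(fragments, barcodes)
        print(f"{barcodes:>13,} barcodes: {unique:.1%} of fragments unique")
```

Under this model, a hundred million barcodes for ten million fragments leaves about 90% of fragments with a barcode of their own, and the fraction rises further as the barcode pool grows relative to the number of fragments.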
Q4. What has been your most memorable moment at MGI, and do you have any wishes for MGI?
It’s very difficult to pick the most exciting one. There are so many. Whenever I talk about MGI, there is a feeling that we can do these wonderful things; we can do genome and trans-omics for all. This goal and business approach is one of the most exciting things for me about BGI and MGI. It is not about how to quickly make money and retire. It is really about how to improve the lives of millions of people. As Wang Jian has told us many times, if we can define an important need to solve, forget about the difficulties of getting funding or profit; just focus on developing it. The profit will come naturally. If we have a good product, if we solve an important need, the profit will come.
So, among many exciting moments, one is hearing MGI talk for two hours about everything from the atom to DNA to your health. It is exciting to see the passion and our commitment to improving health. We are in the biological age: after the industrial age and the information age, now it's the DNA age.
In terms of concrete achievements, it was really exciting when we presented BGISEQ-500 at the ICG conference, just having the first DNBSEQ instrument developed at BGI. It was so exciting to see Xun like a rock star on the stage, with lights flashing. We then developed MGISEQ-2000, and now we have DNBSEQ-T7. Liu Jian made a great announcement at last year's ICG.
Other exciting moments come when we participate in AGBT, ESHG, and other conferences and have a lot of people coming to us. It is so exciting to see the solutions we bring. We are solving so many fundamental problems. It is not just one solution; it is a whole new platform with many different choices for different applications, and with different business approaches for working together and implementing these exciting health-monitoring applications.
It is exciting to see that our users and customers are excited. Because we are working for new moms and new kids. I have three grandkids. They were born with NIFTY tests, which we did not have 5 or 10 years ago. It is really exciting to see these developments.
There will be more exciting moments in the years to come. CoolMPS is one of those growing excitements, when people hear about it and how it solves the limitations of current sequencing chemistry, enabling more accurate and more affordable sequencing. The excitement never ends.
Grow, grow, grow. Let’s achieve this wonderful vision. BGI is now 20 years old. It is really amazing how the company grew from a humble beginning into the biggest genomics company in the world. I’ve heard that Wang Jian slept by the sequencers in the lab. He would wake up at midnight to start a new sequencing run. That was how the company started.
I came from Serbia, I went to the US, and now I have come to China. I have experience of very different world regions. It is exciting to see the whole world integrating and developing economically. It used to be just a few developed countries in the West. I hope we can bring our sequencing for health monitoring to India, my Serbia, and Africa, as we are already doing. There is no reason why all people should not have their genome sequence. I even declared that every kid deserves genome sequencing. Every kid really deserves their genome sequence, because that would help them live better. It is not to be decided by the government, parents, or any other party. They should just have it as a natural right.
Go BGI, go MGI. We have lots of things to do. It is very exciting to work in this field. Genomics-based precision healthcare is coming. We will live 100 healthy years, or even 150 years. For Wang Jian and me, we need to reverse aging. You guys are so young. You have more chances to take advantage of future precision medicine to keep you young longer. That’s the most rewarding aspect of our work.