Host: Benjamin Thompson
Welcome back to the Nature Podcast. This week, the AI that designs computer chips…
Host: Noah Baker
And zooming in on the 3D structure of DNA. I’m Noah Baker.
Host: Benjamin Thompson
And I’m Benjamin Thompson.
[Jingle]
Interviewer: Benjamin Thompson
Whatever device you’re using to listen to this podcast on right now, assuming it’s not a wax cylinder, I think it’s safe to say that it’s being controlled by a microchip. Often the size of your thumbnail or smaller, these chips are fabulously complicated and they’re only getting more so as technology advances. On each one are millions or perhaps billions of tiny components and squeezing them all on is a headache for the people who design them. But that might be beginning to change. This week in Nature, a team from Google reports a new way to do this using an AI. To find out more, I spoke to Anna Goldie, one of the researchers involved, and began by asking why designing chips is so difficult.
Interviewee: Anna Goldie
This is a very complex process and partly it’s because there are billions of components and you’re trying to fit them into a very small area but also because of the sheer number of constraints that you’re attempting to optimise for. The area has to be as small as possible because that relates to the cost of manufacturing. And then there’s the heat profile, so if a certain configuration would overheat, it also can’t be manufactured. And then there’s how much power would be consumed by this particular layout which is highly related to the cost of deploying data centres and the environmental impact of the chip. And so, given all of these different constraints, humans are spending up to 8-9 months just to generate a layout that fulfils all the constraints.
Interviewer: Benjamin Thompson
So, humans clearly at the moment have a big role to play in designing computer chips. People have been trying to sort of automate this process for a really, really long time.
Interviewee: Anna Goldie
Yeah.
Interviewer: Benjamin Thompson
But haven’t necessarily been successful, right?
Interviewee: Anna Goldie
That’s correct. And of course, chip floorplanning is just part of the chip design process but just chip floorplanning on its own, there’s been five decades of research attempting to basically surpass humans.
Interviewer: Benjamin Thompson
That’s where your work comes in this week. You’ve come up with a rather different method to do it and it seems to be a lot quicker. What’s the kind of headline for what you’ve achieved?
Interviewee: Anna Goldie
Okay, so, in far less time than human experts, our deep reinforcement learning method can generate layouts that fulfil all of these constraints and are comparable or even superhuman in their quality.
Interviewer: Benjamin Thompson
Well, let’s talk about that then. So, you’ve developed this machine learning algorithm and I think, from what you said in your paper, it treats this a bit like a game. Each of these different components is like a piece, like a chess piece I suppose, and then it tries to figure out where they can all go to win the game, and sort of the game is to design a computer chip. How does it go about doing this?
Interviewee: Anna Goldie
The deep reinforcement learning agent starts out with this blank canvas, this empty board, and it places each of the components of this chip one at a time onto the canvas, and at the very end it gets a score – a reward based on how well it did at placing. And it uses that reward to sort of punish itself or reward itself for having done well, update all the weights of its neural network model and then give it another go.
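The place-score-update loop Goldie describes could be sketched, in very simplified form, as follows. This is purely an illustrative toy, not Google’s actual method (which uses a deep neural network and far richer objectives): the 4x4 grid, the component names, and the wirelength-style score are all invented for the example.

```python
import math
import random

GRID = 4                                    # toy 4x4 canvas; real chips have vastly more sites
COMPONENTS = ["cpu", "cache", "io", "alu"]  # illustrative stand-ins for chip blocks

# Per-component preference weight for each cell: the "policy" the agent updates.
prefs = {c: [1.0] * (GRID * GRID) for c in COMPONENTS}

def place_all(rng):
    """One episode: place each component onto a free cell, sampled by preference."""
    layout, free = {}, set(range(GRID * GRID))
    for c in COMPONENTS:
        cells = sorted(free)
        cell = rng.choices(cells, weights=[prefs[c][i] for i in cells])[0]
        layout[c] = cell
        free.remove(cell)
    return layout

def reward(layout):
    """Toy score: negative total pairwise Manhattan distance (shorter 'wires' = better)."""
    pos = [divmod(i, GRID) for i in layout.values()]
    return -sum(abs(r1 - r2) + abs(c1 - c2)
                for k, (r1, c1) in enumerate(pos)
                for (r2, c2) in pos[k + 1:])

def train(episodes=300, lr=0.2, seed=0):
    rng, baseline, best = random.Random(seed), 0.0, -math.inf
    for _ in range(episodes):
        layout = place_all(rng)
        r = reward(layout)
        best = max(best, r)
        # REINFORCE-style update: boost cells used in better-than-average layouts,
        # dampen cells used in worse ones -- the "punish or reward itself" step.
        advantage = r - baseline
        baseline += 0.1 * (r - baseline)        # running average as a baseline
        for c, cell in layout.items():
            prefs[c][cell] = min(max(prefs[c][cell] * math.exp(lr * advantage),
                                     1e-9), 1e9)
    return best

best = train()
print(best)
```

On this toy canvas the best possible reward is -8 (the four blocks packed into a 2x2 square) and the worst is -24 (blocks at the four corners), so the returned score always lands in that range and improves as the preferences concentrate on compact placements.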
Interviewer: Benjamin Thompson
And that score then is based on things like how hot the chip gets or how efficiently it runs, and if the scores are high enough the AI thinks it has won – what you call its win condition. And yet a win condition looks different depending on the type of chip you’re trying to design.
Interviewee: Anna Goldie
So, in some sense, each chip is like its own game because it has its own set of pieces and actually its own win conditions. Like if there’s some chip that’s being deployed in the data centre then really what matters is maybe power consumption, but if you’re deploying a chip in, say, a self-driving car then maybe latency matters a lot more, otherwise you won’t detect that pedestrian quickly enough. So, in some sense, making our algorithm generalise across these different contexts was a much bigger challenge than just having an algorithm that would work for one specific chip.
Interviewer: Benjamin Thompson
And so, of course, you could see where the machine learning algorithm, where the AI, placed the different components on the chip. Did it do it in a strange way, in a weird way? Did it throw up any surprises compared to how a human would go about designing a chip?
Interviewee: Anna Goldie
Oh, absolutely. The placements that this RL agent generates are very strange, very alien, to humans. Our physical designers, when they first saw these layouts – they’re kind of these rounded, organic-looking, curved placements – thought there’s no way that this is going to be high quality. They almost didn’t even want to evaluate them. They were worried that maybe these wouldn’t even be manufacturable, but that was also not the case.
Interviewer: Benjamin Thompson
So then, this is not just an on-paper exercise. You’ve actually taken some of these designs and put them into manufacture. Maybe you could tell us a bit about that?
Interviewee: Anna Goldie
This Google Brain research team that I was part of, we are deeply embedded in the TPU team – the TPU being the tensor processing unit, Google’s latest AI accelerator chip, which is designed to help AI algorithms run more quickly and power-efficiently – and a number of our placements were actually used in Google’s latest TPU, which was physically manufactured in January of this year, which is super exciting for us.
Interviewer: Benjamin Thompson
So, an AI has designed an AI chip. That’s kind of full circle, I suppose.
Interviewee: Anna Goldie
Yeah, exactly, and that was our vision with this work. We wanted to kind of close this loop. Now that machine learning has become so capable – and that’s all thanks to advances in hardware and systems – can we use AI to design better systems and hardware to run the AI algorithms of the future?
Interviewer: Benjamin Thompson
What do the chip-makers that you’ve spoken to, the human chip-makers, think about this way of doing things?
Interviewee: Anna Goldie
I mean, of course it’s a bit scary, right? I mean, the placements look weird and also it’s a bit threatening in some sense to have this AI algorithm coming and doing part of your job. But to be honest, this is the part that the chip designers maybe least enjoy doing. So, I’ve heard a number of people suggest that they were kind of happy for this to be done by an algorithm at this point and maybe they can shift their attention to other parts of the process that will allow for further performance gains.
Host: Benjamin Thompson
That was Anna Goldie from Google Brain. We’ll pop a link to her paper and a Nature editorial on the subject of microchip design in the show notes.
Host: Noah Baker
Coming up in the show, a new technique to identify physical interactions between distant sections of a genome. But before that, it’s time for the Research Highlights, read by Dan Fox.
[Jingle]
Dan Fox
Have you ever wondered how physically fit you could get if you really applied yourself to an exercise regime? Previously, the only way to find out would be to pull on your gym kit and get started. But now, scientists have identified a group of biomarkers linked to intrinsic cardiovascular fitness and another linked to fitness gained from training. Researchers sampled blood from more than 650 sedentary adults. The team also measured participants’ maximum oxygen intake before and after a 20-week exercise programme. Analysis of some 5,000 blood proteins identified 147 linked with baseline oxygen-intake levels and 102 linked with improvements to oxygen intake after the exercise programme. The authors say that with better knowledge, the proteins that indicate the blood’s oxygen-carrying capacity could serve as biomarkers for a person’s fitness and future health risks. Jog over to Nature Metabolism to read that research in full.
[Jingle]
Dan Fox
The composition of a marine creature’s teeth has inspired the creation of new inks for 3D-printing strong, lightweight materials. The coast-dwelling mollusc Cryptochiton stelleri has been dubbed the wandering meatloaf because of its large, oval, reddish-brown body, which can reach more than 30 centimetres in length. However, the mollusc’s modest exterior conceals several dozen rows of sharp teeth, which are among the hardest organic objects known in nature. The mollusc feeds by scraping them along rocks covered in algae. Researchers analysed these teeth with a range of advanced imaging techniques and unexpectedly detected nanoparticles of santabarbaraite, an iron mineral previously only seen in rocks. The researchers suggest that these particles could toughen the teeth without adding much weight. The team then designed 3D-printing inks inspired by this composition. They used these inks to make strong, lightweight materials that vary in hardness and stiffness and might find applications in fields such as soft robotics. Get your teeth into that research in Proceedings of the National Academy of Sciences of the United States of America.
[Jingle]
Host: Noah Baker
Next up, a new technique is allowing researchers to uncover more information than ever before about how our genes are controlled by getting up close and personal with DNA. Here’s reporter Ali Jennings.
Interviewer: Ali Jennings
Most of the cells in your body contain a copy of your entire genome. Strings of DNA made up of roughly 3 billion base pairs in total. Within your genome are around 20,000 genes that code for the body’s proteins. But no cell needs to produce all of those proteins all of the time. This means turning genes on when they’re needed and off when they aren’t. Key to this process are short stretches of DNA called enhancers and promoters.
Interviewee: James Davies
Promoters are the sequences right next to the gene itself which become activated for gene expression, and enhancers are these sequences that can be hundreds of thousands of base pairs away, and enhancers need to physically contact the promoter in order to turn the gene on. But we don’t really understand exactly what causes the 3D contacts between the enhancer and promoter.
Interviewer: Ali Jennings
This is James Davies from the University of Oxford in the UK. This week in Nature, James and his colleagues have a paper out about a new technique they’ve been working on that will let them see how these enhancers and promoters come together in much finer detail than has been achieved before, which could have important implications for understanding how genes function and how they can be involved in disease.
Interviewee: James Davies
Previously, we were limited to resolutions of between 500 and 1,000 base pairs, and what we’ve been developing is methods for determining this at increasing resolution, and our latest iteration gets us down to nearly a single base pair, so you can make out which elements are contacting, which ones aren’t, and the relative strengths of those contacts.
Interviewer: Ali Jennings
In order for an enhancer to contact a promoter, the DNA needs to fold into a complex 3D shape. To work out what this shape is, in the past, researchers have used a method called chromosome conformation capture. But this method doesn’t give a perfect picture of the exact place that these interactions happen in the DNA. James and his team have combined a number of techniques to refine the chromosome conformation capture method. Now, they can see in much finer detail how promoters and enhancers interact, nearly down to the level of individual base pairs, and that has revealed new information about the cellular machinery that brings enhancers and promoters together.
Interviewee: James Davies
So, one of the key things that we’ve found is that active enhancers and promoters are bound by this protein called NIPBL, which loads this other protein called cohesin onto the DNA, and cohesin is a ring structure.
Interviewer: Ali Jennings
The cohesin ring is placed on top of the strand of DNA between the enhancer and the promoter. Then the DNA is pushed up through the ring, forming a little loop. As more and more DNA is pushed through the cohesin ring, the loop gets bigger and bigger and the enhancer is pulled closer to the promoter. But this can’t go on forever.
Interviewee: James Davies
The other thing that we think is important is that the ring actually gets blocked by this other protein called CTCF, and it gets stabilised there and so you end up with these loops being formed.
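The extrusion-and-blocking mechanism James describes can be captured in a toy one-dimensional model: cohesin loads at one position and reels DNA through its ring in both directions until each side hits a CTCF site. The positions and CTCF sites below are entirely hypothetical, chosen only to illustrate the idea.

```python
def extrude(load_pos, ctcf_sites, genome_len):
    """Return the (left, right) anchors of the final loop.

    Cohesin loads at load_pos and grows the loop outward in both
    directions until each side is blocked by a CTCF site (or a
    chromosome end).
    """
    left = right = load_pos
    while left > 0 and left not in ctcf_sites:
        left -= 1          # reel in DNA from the left, growing the loop
    while right < genome_len - 1 and right not in ctcf_sites:
        right += 1         # and from the right, until blocked by CTCF
    return left, right

ctcf = {120, 480}                      # hypothetical CTCF positions
print(extrude(300, ctcf, 1000))        # → (120, 480)

# Loading cohesin somewhere else yields a different loop -- a crude
# analogue of the cell-type-specific loading James mentions later.
print(extrude(500, {120, 480, 900}, 1000))  # → (480, 900)
```

The second call shows how the same CTCF landscape plus a different loading site gives a different final loop, and hence a different 3D conformation.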
Interviewer: Ali Jennings
James and his team found that the structure of these loops, along with molecules called transcription factors, holds enhancers and promoters in the right configuration to activate a gene properly and also prevents other genes from being inappropriately activated. Surprisingly, James also found that this mechanism is used differently in different kinds of cells.
Interviewee: James Davies
So, in some cell types, the cohesin gets loaded in one place and in other cell types it will get loaded in a different place, and that leads to cell-type-specific changes in the 3D conformation of the DNA.
Interviewer: Ali Jennings
These different 3D conformations could explain how even the same genes can end up being expressed differently in different kinds of cells. James’ new technique has allowed him to see for the first time how gene expression machinery fits together at the level of single base pairs of DNA. So, what does he hope to do with it now?
Interviewee: James Davies
We’re really interested in why different genes look so different with this technique. So, some genes appear to be controlled by enhancers that are extremely close to the promoter whereas others tend to be controlled by enhancers that are huge distances away, and it’s an open question as to why that occurs.
Interviewer: Ali Jennings
And James thinks that there are many other things that this technology could be used for.
Interviewee: James Davies
So, one of the real problems is that most of the variation in the human genome that’s linked with disease lies in the non-coding genome.
Interviewer: Ali Jennings
‘Non-coding’ refers to parts of the genome that don’t code specifically for genes but could still contain promoters and enhancers.
Interviewee: James Davies
So, what we are hoping to do with this technology is to use it to decipher how these variants cause disease. So, for example, it’s a single base pair difference between people that have twice the risk of getting very sick with COVID versus people who don’t, and it lies in an enhancer, and what we can use this technology to do is to identify which gene that variation is affecting, and it does this really very precisely.
Host: Noah Baker
That was James Davies. You can find a link to his paper over in the show notes.
Host: Benjamin Thompson
Now, it’s time for the Briefing chat, where we talk about a couple of our favourite stories from the Nature Briefing. Noah, why don’t I go first this time. I’ve got some good news this week. The human genome has been sequenced.
Host: Noah Baker
So, that’s something that happened a while ago. I’m pretty sure there was a whole deal about that and we did a big anniversary special recently. That’s not news, right?
Host: Benjamin Thompson
Yes, alright, fair enough. It’s not the year 2000 thankfully – I had a terrible haircut then. It is very much the year 2021. But what I’ll say is, when that sequence came out back in 2000, that was very much the first draft and it was missing about 15% of the genome, so I understand, and a 2013 version missed about 8%. So, there were still a lot of gaps in what we knew about the actual sequence.
Host: Noah Baker
And so, is that just a case of the technology wasn’t good enough to be able to get them all or we needed to just find more samples? Why were there gaps? And I’m assuming that now there are no gaps?
Host: Benjamin Thompson
Very astute of you, Noah, you’re absolutely right. So, a team has announced that they have sequenced the entire genome. This is a big collaboration of scientists, and I will say this is a preprint article – it hasn’t been peer reviewed yet. It was reported in Stat. And you’re right, it was a technological thing. So, the way that sequencing tends to work is that a genome is chopped up into little bits and then a computer tries to put all those puzzle pieces together, working out where the overlaps are to build up the entire length of DNA. But that’s quite problematic when there are really, really long stretches of one base, or of very similar sequence – so A, A, A, A repeated for hundreds of bases – because it’s really hard for the computer to work out where the overlaps are. So, sequencing has really, really struggled with this kind of final few percent, but it turns out that the team have now managed to do it and, as I say, they’ve reported it very recently.
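The repeat problem Benjamin describes can be seen in a toy overlap-assembly sketch. The "genome", read lengths, and step sizes below are invented purely for illustration and bear no relation to the actual preprint’s methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

# Unique sequence overlaps unambiguously:
print(overlap("GATTAC", "TTACAA"))  # → 4

# A toy genome with a long single-base run, like the A,A,A,... stretches described.
genome = "GATTACA" + "A" * 15 + "CCGTGACT"

def shred(seq, read_len, step):
    """Chop the genome into overlapping reads, as a sequencer effectively does."""
    return [seq[i:i + read_len] for i in range(0, len(seq) - read_len + 1, step)]

# Short reads: every read taken from inside the run is identical ("AAAAAA"),
# so an assembler cannot tell where a read belongs or how long the run is.
short_reads = shred(genome, 6, 3)
ambiguous = [r for r in short_reads if set(r) == {"A"}]
print(len(ambiguous), set(ambiguous))

# Long reads can span the whole run, anchoring it between its unique flanks --
# the advantage of the long-read technologies mentioned above.
long_reads = shred(genome, 28, 2)
spanning = [r for r in long_reads if "GATTACA" in r and "CCGTG" in r]
print(len(spanning))
```

With the short reads, four distinct positions in the run all produce the same read, so their mutual overlaps are maximal at every shift and the run’s true length is unrecoverable; a single long read containing both unique flanks resolves it.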
Host: Noah Baker
And has there been a change to the methodology they’re using or is this just a case of brute forcing for such a long time that eventually they’ve found all those gaps and filled them in?
Host: Benjamin Thompson
Well, the technology has come a long way since 2000 as you might imagine, and a lot of this work has involved the technological advances of two companies, and what their technologies are able to do is read very, very long sequences of DNA in one go and that makes it so much easier to sort of overlap where these things are. And that’s revealed a little bit more about the human genome. The number of base pairs has gone up 4.5% on the 2013 version, but what’s interesting is the protein-coding region has only gone up 0.4%, so not a huge amount. And you may wonder why this is important but, I think as we heard earlier with Ali’s report on the structure of DNA, these non-coding regions are super important to learn about how genes are regulated or maybe how disease occurs, so just because it isn’t a gene doesn’t mean it isn’t important.
Host: Noah Baker
Yes, absolutely. I mean, there’s the huge ENCODE consortium that’s been running for quite some time as sort of the follow-up to the Human Genome Project, which has been looking at non-coding regions and trying to work out what impact they have and how they work, I suppose, on the functioning of the body. Since the initial Human Genome Project came out, that kind of resource has been vital to scientists, to clinicians, to clinical researchers all over the world to study disease and to study the way the body works. What is this addition then, aside from being neat and lovely and tied up in a bow, now that we’ve got the whole genome? Is this likely to actually be useful for scientists?
Host: Benjamin Thompson
Yeah, I mean, there’s a couple of things going on here. I mean, one of these sections that wasn’t known much about is this area of a chromosome called a centromere and, if you remember your school lessons, that’s super important in terms of cell division, and not really much was understood about the sequence in this area. And this current work is, I think, going to be used as a reference genome, and I think future work from a different large consortium of scientists is going to do full sequences of other people and compare and contrast and look for these small differences that, as we’ve heard, could be really, really important.
Host: Noah Baker
That’s amazing. So, now that the full genome has been sequenced, what’s next? Are genomicists done now?
Host: Benjamin Thompson
Not quite, I’m afraid to say. So, apparently, there’s 0.3% of this sequence that still needs to be checked. That sounds like a small amount but when you’re talking about the length of a genome, small is massive, right? But also what’s interesting is that this sequence doesn’t include a Y chromosome. So, at some point we’ll need a complete sequence of that too to really understand what is going on in its entirety.
Host: Noah Baker
And so, the complete human genome sequence that’s been announced now is actually still not quite complete. We’ve still got a little bit further to go but big achievement nonetheless.
Host: Benjamin Thompson
Oh, absolutely. I think ‘complete’ in inverted commas but, yes, a huge achievement, you’re absolutely right. But let’s move on, Noah. What have you got for me in this week’s Briefing chat?
Host: Noah Baker
I have got the opportunity to sort of indulge myself in one of my favourite things in all of planetary science, which is the acronyms that space missions use, which are brilliant and wonderful. So, last week, NASA announced the next two missions that are part of its Discovery program, which are both missions to Venus.
Host: Benjamin Thompson
So, Venus then, I guess it’s one of our neighbouring planets in the Solar System, a little bit closer to the Sun, well a lot closer to the Sun than we are, but it couldn’t be more different to Earth, right?
Host: Noah Baker
Well, in many ways, yes, and in many ways, no, actually. It’s the second closest planet to the Sun, so it’s one step closer, as you say, than Earth. But it’s remarkably similar to Earth in terms of its mass and its size and its composition. It’s a rocky planet with a core and it has an atmosphere, but that’s about where the similarities end. Venus is also kind of described as Earth’s twisted sister. It looks very similar but it’s also hellish. It’s the hottest planet in the Solar System. The pressure on the surface of Venus is 90 times higher than on Earth because of this incredibly thick atmosphere made up of carbon dioxide, with sulfuric acid rain and storms. There are massive volcanoes all over Venus. It’s completely and utterly inhospitable, but in many other ways it’s very similar to Earth, and so it’s raised this kind of question: is Venus the post-apocalyptic Earth? Could Venus at some time in its past have been much more like Earth, where it might actually even have been temperate, maybe even habitable? Is this what Earth could look like many, many millennia from now? And I think that’s one of the reasons that scientists are interested to go and study it.
Host: Benjamin Thompson
So, tell me about these missions then. What are they going to do specifically? Presumably they are satellites that are going to do some orbiting or something like that?
Host: Noah Baker
Right, great, so this is my opportunity to talk about my acronyms, which I love. So, there are two. The first one is called DAVINCI+, which stands for ‘Deep Atmosphere Venus Investigation of Noble gases, Chemistry, and Imaging’, and the second is called VERITAS, which stands for ‘Venus Emissivity, Radio Science, InSAR, Topography, and Spectroscopy’. Brilliant again.
Host: Benjamin Thompson
Easy for you to say.
Host: Noah Baker
Exactly. So, DAVINCI+ is aimed specifically at understanding more about the atmosphere of Venus, so this thick carbon dioxide atmosphere and then the sulfuric acid clouds that are lower down. Part of the reason that Venus is so incredibly hot is because of this thick layer of greenhouse gases, and they want to understand how this works. So, the aim of DAVINCI+ is to travel down into the atmosphere, where there’s this really high pressure, to take samples and understand a little bit more about how that works. VERITAS, on the other hand, is going to orbit Venus, as you say, so that one’s going to continue studying the surface of Venus, looking at the topography. It’s going to be scanning and mapping the surface with a view to reconstructing some of the geological history of Venus, to understand how volcanoes could have played a role in its development, for example.
Host: Benjamin Thompson
Is it unusual to commission two missions to the same planet? Like there’s a bunch of the rest of the Solar System that we know even less about, I suppose, so why Venus and why now?
Host: Noah Baker
Yeah, that’s an interesting question. So, NASA has three kind of categories of mission that it sends. The first is called Discovery, there’s also one called New Frontiers, and then there’s the flagship category. So, flagships are things like Perseverance and Curiosity – they cost billions. New Frontiers missions are a bit cheaper – they’re capped at US$850 million. And then the Discovery missions, which is what these two both are, are cheaper still – capped at a mere US$450 million. And people essentially submit proposals for the various missions, and these two missions to Venus have been proposed for quite some time and have been beaten out by missions to asteroids in the past, which has got some planetary scientists a bit peeved. But this time round, they’ve both been chosen. They’ve kind of been in the pool for a while and they beat out the other two finalists, which were to go to one of Neptune’s moons, Triton, and one of Jupiter’s moons, Io, which are both also quite high up on planetary scientists’ lists to study. The reason, I guess, in this case that these two have been chosen is that many scientists consider Venus to be a rather understudied planet. The last NASA mission that went to Venus was the Magellan mission, but that ended back in 1994, and so there’s been a real appetite to get back to Venus to understand more about it, because I think people want to try to learn more about the history and possible future of Earth based on what’s been happening on Venus.
Host: Benjamin Thompson
I mean, you say ‘soon’ then, Noah. When can we expect some pictures or some information about what’s going on from Venus?
Host: Noah Baker
So, ‘soon’ in planetary science terms is often not as soon as we think. So, at the moment, all that’s happened is that the missions have been announced, and there are projections that they may be launching around 2030, so you could expect some kind of feedback maybe a year or two after that. Maybe we’ll get some pictures. Maybe we’ll get some samples. Maybe we’ll possibly even find some hints of life in the atmosphere of Venus, which is something that people have been speculating about for some time. There are these kind of streaks that you can see in the atmosphere of Venus, even from Earth, which absorb ultraviolet radiation, and so there has been the hypothesis that within these streaks there could be microbial life absorbing UV radiation high up in the atmosphere, at a place where the density is lower and the temperature is more similar to Earth’s – where it could be a little bit more habitable, for example.
Host: Benjamin Thompson
Wonderful. Well, I hope you’ll join me in nine years to tell me the results of that and we’ll maybe talk about another version of the sequence of the human genome. But until then, let’s call it for this week’s Briefing chat. Thank you so much, Noah. And listeners, if you’re interested in more stories like this but delivered directly to your inbox then make sure you sign up for the Nature Briefing. And of course, we’ll put a link in the show notes where you can do so.
Host: Noah Baker
That’s all for the podcast this week. Drop us a line at any time on email – podcast@nature.com – or Twitter – @NaturePodcast. I’m Noah Baker.
Host: Benjamin Thompson
And I’m Benjamin Thompson. Thanks for listening.